narizaka

Tool to make high quality text to speech (tts) corpus from audio + text books.

  • 1.2.4
  • PyPI


How it works

First, it transcribes the audio with Whisper ASR, saving all word-level timestamps. Then it aligns this transcription with the original text. If the distance between a transcribed segment and the original text is very small, the segment is considered a match and added to the dataset.
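The alignment step can be illustrated with a minimal sketch. This is not narizaka's actual implementation: the function names, the `(word, start, end)` tuple layout, the `max_distance` threshold, and the use of `difflib.SequenceMatcher` in place of whatever distance measure the tool uses are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(word):
    """Lowercase a word and strip punctuation so ASR and book tokens compare equal."""
    return re.sub(r"[^\w']+", "", word.lower())


def align(words, sentences, max_distance=0.1):
    """Match book sentences against timestamped ASR words.

    words: list of (text, start_seconds, end_seconds) tuples (hypothetical layout).
    sentences: list of sentences from the original text, in reading order.
    Returns (sentence, start, end) for sentences whose distance is small enough.
    """
    asr_tokens = [normalize(text) for text, _, _ in words]

    # Flatten the book into one token list, remembering each sentence's token range.
    ref_tokens, spans = [], []
    for sentence in sentences:
        tokens = [t for t in map(normalize, sentence.split()) if t]
        spans.append((sentence, len(ref_tokens), len(ref_tokens) + len(tokens)))
        ref_tokens.extend(tokens)

    # Map each book token index to the ASR word index it was matched with.
    ref_to_asr = {}
    matcher = SequenceMatcher(None, ref_tokens, asr_tokens, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            ref_to_asr[block.a + offset] = block.b + offset

    dataset = []
    for sentence, begin, end in spans:
        hits = [ref_to_asr[i] for i in range(begin, end) if i in ref_to_asr]
        if end == begin or not hits:
            continue
        # Distance here is the share of sentence tokens with no ASR counterpart.
        distance = 1.0 - len(hits) / (end - begin)
        if distance <= max_distance:
            dataset.append((sentence, words[hits[0]][1], words[hits[-1]][2]))
    return dataset
```

Sentences that the transcription does not reproduce closely enough are simply skipped, which trades corpus size for quality.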

Installation

First, install the required system dependencies:

On Debian-based Linux:

sudo apt install ffmpeg pandoc

On macOS:

brew install ffmpeg pandoc libmagic
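To confirm that the command-line tools are visible on your PATH, a small check like the following may help. It is only a sketch: the `missing_tools` helper is hypothetical and not part of narizaka, and it does not cover libmagic, which is a library rather than an executable.

```python
import shutil

# Executables named in the installation commands above.
REQUIRED_TOOLS = ("ffmpeg", "pandoc")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing system dependencies:", ", ".join(missing))
    else:
        print("ffmpeg and pandoc are available.")
```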

Then you can install narizaka:

pip install narizaka

or if you want to use the latest development version:

pip install git+https://github.com/patriotyk/narizaka.git

Also if you plan to modify sources:

git clone https://github.com/patriotyk/narizaka.git
pip install -e narizaka/

The -e flag installs the package in editable mode: changes you make to the source files in the cloned directory take effect the next time you run the narizaka command.

Every tagged commit on the main branch automatically builds an image and pushes it to Docker Hub, so you can also pull the image:

docker pull patriotyk/narizaka:latest

How to use

The application accepts as input a directory that contains audio data. The audio can be a folder or subfolder of audio files, or just a single audio file, and there should also be one text file that represents this audio. The text file can be any document that pandoc accepts. Example:

narizaka test_data/farshrutka 

Or

narizaka test_data

to process all books.

The repository contains a test_data directory with two books, each with audio and text, that you can use for testing.
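Before running narizaka on your own data, it can be useful to check that each book directory has the expected layout: some audio plus exactly one text document. The sketch below is a hypothetical pre-flight check, not part of narizaka. The extension lists are assumptions and may not match the formats the tool actually supports.

```python
from pathlib import Path

# Assumed extensions, for illustration only.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a"}
TEXT_EXTENSIONS = {".txt", ".epub", ".docx", ".fb2", ".html", ".md"}


def looks_like_book(directory):
    """True if the directory tree holds at least one audio file and exactly one text file."""
    files = [p for p in Path(directory).rglob("*") if p.is_file()]
    audio = [p for p in files if p.suffix.lower() in AUDIO_EXTENSIONS]
    texts = [p for p in files if p.suffix.lower() in TEXT_EXTENSIONS]
    return len(audio) >= 1 and len(texts) == 1


def find_books(root):
    """Return the immediate subdirectories of `root` that look like a book, sorted by path."""
    return sorted(d for d in Path(root).iterdir() if d.is_dir() and looks_like_book(d))
```

For example, `find_books("test_data")` would list the subdirectories that pass this check, each of which could then be passed to the narizaka command.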
