@bbc/stt-align-node

Stt-align-node

See "The alignment problem" in the docs for more background on the problem this module sets out to address.

Originally developed as a Node version of the Python stt-align by Chris Baume (BBC R&D).

Setup - development

git clone git@github.com:bbc/stt-align-node.git
cd stt-align-node
npm install

Setup - in production

npm install @bbc/stt-align-node

Usage

Besides realigning STT results with accurate text, this module can also be used to perform related operations in the same domain, such as benchmarking STT.

Function | Description | Type
alignSTT | Realign STT json with accurate text, by transposing words from accurate text to timecodes of STT. | json
diffsList | Return a diff json of STT vs accurate text | json
diffsListAsHtml | Return a diff of STT vs accurate text as HTML | html
diffsCount | Return a diff of STT vs accurate text as HTML | json
calculateWordDuration | Return a diff of STT vs accurate text as HTML | Number

See the README in the example-usage folder, as well as the code examples, for more.
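To make the benchmarking use case concrete, here is a minimal sketch of one common way to score an STT transcript against accurate text: word error rate, computed from a word-level edit distance. This is not this module's API or its diff implementation; `wordErrorRate` and its tokenizer are hypothetical names introduced for illustration, and the sketch assumes a non-empty reference text.

```javascript
// Hypothetical helper, not part of @bbc/stt-align-node.
// Word error rate = (substitutions + insertions + deletions) / reference word count.
function wordErrorRate(referenceText, sttText) {
  // Lowercase, drop punctuation, split on whitespace.
  const tokenize = (s) =>
    s.toLowerCase().replace(/[^\p{L}\p{N}\s]/gu, '').split(/\s+/).filter(Boolean);
  const ref = tokenize(referenceText);
  const hyp = tokenize(sttText);

  // prev[j] holds the edit distance between the reference words seen so far
  // and the first j STT words.
  let prev = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // reference word missing from STT
        curr[j - 1] + 1,    // extra word in STT
        prev[j - 1] + cost  // match or substitution
      );
    }
    prev = curr;
  }
  // Assumes ref is non-empty.
  return prev[hyp.length] / ref.length;
}

console.log(wordErrorRate('the quick brown fox', 'the quick brown box')); // 0.25
```

A lower value means the STT output is closer to the accurate text; 0 means the two are identical after normalization.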


System Architecture

Node version of stt-align by Chris Baume - R&D.

Pseudocode overview of alignSTT:

  • Input and output as described in the example usage:

    • Accurate base text transcription, as a string.
    • Array of word objects from the STT service's transcription.
  • Align words

    • Normalize words by removing capitalization and punctuation and converting numbers to words.

    • Generate a list of words from the base text and a list of words from the STT transcript.

      • Get opcodes using difflib to compare the two arrays.
      • For equal matches, add the matched segment of STT word objects to the results array at the base text index position.
      • Then iterate over the results array to replace the text of the STT word objects with the words from the base text.
    • Interpolate missing words

      • Calculate missing timecodes.
      • First optimization:
        • use neighboring words to do a first pass at setting missing start and end times when present.
      • Then interpolate the remaining missing word timings using the interpolation library 'everpolate'.
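The steps above can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the module's implementation: a longest-common-subsequence match stands in for the difflib opcodes, an even spread across each gap stands in for everpolate, number-to-word conversion is left out, and the word object fields `text`, `start` and `end` as well as the function names are assumed for the example.

```javascript
// Sketch only: plain-JS stand-ins for difflib opcodes and everpolate.

// Lowercase and strip punctuation so "Hello," and "hello" compare equal.
// (Converting numbers to words is omitted in this sketch.)
const normalize = (word) => word.toLowerCase().replace(/[^\p{L}\p{N}]/gu, '');

// Longest-common-subsequence match between two word arrays. Returns
// [baseIndex, sttIndex] pairs, playing the role of the "equal" opcodes.
function matchWords(base, stt) {
  const n = base.length;
  const m = stt.length;
  const lcs = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = base[i] === stt[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const pairs = [];
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (base[i] === stt[j]) { pairs.push([i, j]); i++; j++; }
    else if (lcs[i + 1][j] >= lcs[i][j + 1]) i++;
    else j++;
  }
  return pairs;
}

// Spread untimed words evenly across the gap between two timed neighbors.
// Words before the first or after the last timed word are left untimed.
function interpolate(words) {
  const known = [];
  words.forEach((w, idx) => { if (w.start !== undefined) known.push(idx); });
  for (let k = 0; k + 1 < known.length; k++) {
    const a = known[k];
    const b = known[k + 1];
    const gap = b - a - 1;
    if (gap === 0) continue;
    const step = (words[b].start - words[a].end) / gap;
    for (let g = 1; g <= gap; g++) {
      words[a + g].start = words[a].end + step * (g - 1);
      words[a + g].end = words[a].end + step * g;
    }
  }
  return words;
}

// Hypothetical end-to-end helper: base text words with STT timings.
function alignSketch(baseText, sttWords) {
  const baseWords = baseText.split(/\s+/).filter(Boolean);
  const pairs = matchWords(
    baseWords.map(normalize),
    sttWords.map((w) => normalize(w.text))
  );
  // Every base word gets a slot; matched slots take the STT timings.
  const result = baseWords.map((text) => ({ text }));
  for (const [baseIndex, sttIndex] of pairs) {
    result[baseIndex].start = sttWords[sttIndex].start;
    result[baseIndex].end = sttWords[sttIndex].end;
  }
  return interpolate(result);
}

const aligned = alignSketch('Hello, big wide world!', [
  { text: 'hello', start: 0, end: 1 },
  { text: 'world', start: 3, end: 4 },
]);
console.log(aligned);
// "big" is timed 1 to 2 and "wide" 2 to 3, filling the gap between the matches.
```

The output keeps the wording and punctuation of the accurate text while reusing the STT timings wherever the two word lists agree.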

Development env

  • node 10
  • npm 6.1.0

Build

npm run build

Bundles the code with React into a ./build folder.

Build demo

npm run build:demo

The demo is in the docs folder.

Publish demo to github pages

npm run deploy:ghpages

Tests

npm run test:watch
  • add more tests

Deployment

Deploy to npm

npm run publish:public


Package last updated on 23 Feb 2022
