@marswave/coli - npm Package Compare versions

Comparing versions 0.0.15 and 0.0.16

+12

distribution/source/deprecations.d.ts

		/**
		* Direct audio playback depends on platform-specific tools (afplay on macOS,
		* ffplay on Linux) and is not supported on Windows. Users should use -o <file>
		* to save the audio file and play it with their preferred player.
		*/
		export declare const deprecationDirectPlayback = "COLI_DEP001";
		/**
		* Passing a file path string to `runAsr` is deprecated. Non-WAV formats
		* require a system-installed ffmpeg, and WAV reading ties the API to the
		* filesystem. Pass an `AudioData` object (`{ sampleRate, samples }`) instead.
		*/
		export declare const deprecationAsrFilePath = "COLI_DEP002";

+12

distribution/source/deprecations.js

		/**
		* Direct audio playback depends on platform-specific tools (afplay on macOS,
		* ffplay on Linux) and is not supported on Windows. Users should use -o <file>
		* to save the audio file and play it with their preferred player.
		*/
		export const deprecationDirectPlayback = 'COLI_DEP001';
		/**
		* Passing a file path string to `runAsr` is deprecated. Non-WAV formats
		* require a system-installed ffmpeg, and WAV reading ties the API to the
		* filesystem. Pass an `AudioData` object (`{ sampleRate, samples }`) instead.
		*/
		export const deprecationAsrFilePath = 'COLI_DEP002';

+17

docs/deprecations.md

		# Deprecated APIs

		Coli uses deprecation codes (`COLI_DEPxxx`) to communicate deprecated features. When a deprecated feature is used, a `DeprecationWarning` is emitted via `process.emitWarning`. You can suppress these warnings with the `--no-deprecation` Node.js flag.

		## List of deprecations

		### COLI_DEP001: Direct audio playback

		- Affected: `runCloudTts()` without `output` option, `coli cloud-tts` without `-o`

		Direct audio playback relies on platform-specific tools (`afplay` on macOS, `ffplay` on Linux) and is not supported on Windows. Use the `-o <file>` option to save the audio file and play it with your preferred player instead.

		### COLI_DEP002: File path input for ASR

		- Affected: `runAsr()` with a file path string

		Passing a file path string to `runAsr()` is deprecated. Non-WAV formats require a system-installed `ffmpeg` for conversion, which adds an external dependency. Pass an `AudioData` object (`{ sampleRate: number, samples: Float32Array }`) instead. The caller is responsible for reading and decoding the audio file. The `coli asr` CLI handles conversion internally and is not affected by this deprecation.
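For callers who want to act on these codes rather than suppress them wholesale with `--no-deprecation`, a listener on Node's `warning` event can filter by `code`. The sketch below is not part of `@marswave/coli`: `isColiDeprecation` and `seenCodes` are hypothetical names, and it assumes the warnings carry the `name` and `code` fields that `process.emitWarning(message, { type, code })` sets.

```javascript
// Hypothetical helper (not part of @marswave/coli): true for warnings that
// look like Coli deprecations, i.e. a DeprecationWarning with a COLI_DEPxxx code.
function isColiDeprecation(warning) {
  return (
    warning.name === 'DeprecationWarning' &&
    typeof warning.code === 'string' &&
    /^COLI_DEP\d{3}$/.test(warning.code)
  );
}

// Collect each Coli deprecation code once so it can be logged or asserted on.
const seenCodes = new Set();

process.on('warning', (warning) => {
  if (isColiDeprecation(warning)) {
    seenCodes.add(warning.code);
  }
});
```

Node delivers `warning` events asynchronously, so `seenCodes` is only populated after the synchronous code that triggered the deprecation has finished.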

+0

-1

distribution/source/_api/listenhub-openapi.js

		@@ -34,3 +34,2 @@ import ky from 'ky';
		 async tts(options) {
		-// eslint-disable-next-line @typescript-eslint/await-thenable
		 const response = await this.api.post('v1/tts', { json: options });
		@@ -37,0 +36,0 @@ if (!response.body)

+28

-8

distribution/source/asr/_cli.js

		 import { Buffer } from 'node:buffer';
		+import fs from 'node:fs';
		+import path from 'node:path';
		 import process from 'node:process';
		-import { runAsr } from './asr.js';
		+import { convertToWav, readWave, runAsr, } from './asr.js';
		 import { ensureModels, ensureVadModel } from './models.js';
		@@ -24,8 +26,27 @@ import { streamAsr } from './stream-asr.js';
		 await ensureModels([model]);
		-await runAsr(file, {
		-json: options.json,
		-model,
		-// eslint-disable-next-line @typescript-eslint/no-unsafe-type-assertion
		-language: options.language,
		-});
		+const resolvedPath = path.resolve(file);
		+const ext = path.extname(resolvedPath).toLowerCase();
		+let wavPath;
		+let needsCleanup = false;
		+if (ext === '.wav') {
		+wavPath = resolvedPath;
		+}
		+else {
		+wavPath = await convertToWav(resolvedPath);
		+needsCleanup = true;
		+}
		+try {
		+const input = readWave(wavPath);
		+await runAsr(input, {
		+json: options.json,
		+model,
		+// eslint-disable-next-line @typescript-eslint/no-unsafe-type-assertion
		+language: options.language,
		+});
		+}
		+finally {
		+if (needsCleanup && fs.existsSync(wavPath)) {
		+fs.unlinkSync(wavPath);
		+}
		+}
		 });
		@@ -49,3 +70,2 @@ program
		 async function* stdinAudio() {
		-// eslint-disable-next-line @typescript-eslint/await-thenable
		 for await (const chunk of process.stdin) {
		@@ -52,0 +72,0 @@ // eslint-disable-next-line @typescript-eslint/no-unsafe-type-assertion

+1

-1

distribution/source/asr/_index.d.ts

		export { ensureModels, ensureVadModel, getModelPath, getVadModelPath, modelDisplayNames, } from './models.js';
		-export { runAsr, type AsrOptions, type SenseVoiceLanguage } from './asr.js';
		+export { convertToWav, readWave, runAsr, type AsrOptions, type AudioData, type SenseVoiceLanguage, } from './asr.js';
		export { streamAsr, type AsrStreamResult, type StreamAsrOptions, type VadOptions, } from './stream-asr.js';

+1

-1

distribution/source/asr/_index.js

		export { ensureModels, ensureVadModel, getModelPath, getVadModelPath, modelDisplayNames, } from './models.js';
		-export { runAsr } from './asr.js';
		+export { convertToWav, readWave, runAsr, } from './asr.js';
		export { streamAsr, } from './stream-asr.js';

+7

-1

distribution/source/asr/asr.d.ts

		@@ -0,1 +1,3 @@
		+export declare function readWave(filename: string): AudioData;
		+export declare function convertToWav(inputPath: string): Promise<string>;
		 type ModelName = 'whisper' | 'sensevoice';
		@@ -8,3 +10,7 @@ export type SenseVoiceLanguage = 'auto' | 'zh' | 'en' | 'ja' | 'ko' | 'yue';
		 };
		-export declare function runAsr(filePath: string, options: AsrOptions): Promise<void>;
		+export type AudioData = {
		+sampleRate: number;
		+samples: Float32Array;
		+};
		+export declare function runAsr(input: string | AudioData, options: AsrOptions): Promise<void>;
		 export {};

+26

-16

distribution/source/asr/asr.js

		@@ -5,3 +5,5 @@ import fs from 'node:fs';
		 import path from 'node:path';
		+import process from 'node:process';
		 import { execa } from 'execa';
		+import { deprecationAsrFilePath } from '../deprecations.js';
		 import { getModelPath, modelDisplayNames } from './models.js';
		@@ -16,6 +18,8 @@ const require = createRequire(import.meta.url);
		 }
		-async function convertToWav(inputPath) {
		+export function readWave(filename) {
		+return sherpaOnnx().readWave(filename);
		+}
		+export async function convertToWav(inputPath) {
		 const outputPath = path.join(os.tmpdir(), `coli-${Date.now()}.wav`);
		 try {
		-// eslint-disable-next-line @typescript-eslint/await-thenable
		 await execa('ffmpeg', [
		@@ -76,22 +80,28 @@ '-i',
		 }
		-export async function runAsr(filePath, options) {
		-const resolvedPath = path.resolve(filePath);
		-if (!fs.existsSync(resolvedPath)) {
		-throw new Error(`File not found: ${resolvedPath}`);
		-}
		-const ext = path.extname(resolvedPath).toLowerCase();
		+export async function runAsr(input, options) {
		+let wave;
		+let needsCleanup = false;
		 let wavPath;
		-let needsCleanup = false;
		-if (ext === '.wav') {
		-wavPath = resolvedPath;
		+if (typeof input === 'string') {
		+process.emitWarning('Passing a file path to runAsr() is deprecated. Pass an AudioData object ({ sampleRate, samples }) instead.', { type: 'DeprecationWarning', code: deprecationAsrFilePath });
		+const resolvedPath = path.resolve(input);
		+if (!fs.existsSync(resolvedPath)) {
		+throw new Error(`File not found: ${resolvedPath}`);
		+}
		+const ext = path.extname(resolvedPath).toLowerCase();
		+if (ext === '.wav') {
		+wavPath = resolvedPath;
		+}
		+else {
		+wavPath = await convertToWav(resolvedPath);
		+needsCleanup = true;
		+}
		+wave = sherpaOnnx().readWave(wavPath);
		 }
		 else {
		-wavPath = await convertToWav(resolvedPath);
		-needsCleanup = true;
		+wave = input;
		 }
		 try {
		-const onnx = sherpaOnnx();
		 const recognizer = createRecognizer(options.model, options.language);
		 const stream = recognizer.createStream();
		-const wave = onnx.readWave(wavPath);
		 stream.acceptWaveform({ sampleRate: wave.sampleRate, samples: wave.samples });
		@@ -117,3 +127,3 @@ recognizer.decode(stream);
		 finally {
		-if (needsCleanup && fs.existsSync(wavPath)) {
		+if (needsCleanup && wavPath && fs.existsSync(wavPath)) {
		 fs.unlinkSync(wavPath);
		@@ -120,0 +130,0 @@ }
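In the lines shown, `runAsr` branches only on `typeof input === 'string'`; any other value is used as `AudioData` as-is. A caller-side guard can catch malformed input before it reaches the recognizer. `isAudioData` below is a hypothetical helper, not an export of `@marswave/coli`, and the positive-sample-rate check is an assumption rather than something the diff enforces.

```javascript
// Hypothetical guard (not part of @marswave/coli): checks that a value has the
// { sampleRate: number, samples: Float32Array } shape declared in asr.d.ts.
function isAudioData(value) {
  return (
    typeof value === 'object' &&
    value !== null &&
    Number.isFinite(value.sampleRate) &&
    value.sampleRate > 0 && // assumption: a usable sample rate is positive
    value.samples instanceof Float32Array
  );
}
```

A caller could check `isAudioData(input)` before calling `runAsr(input, options)` and throw a descriptive error when it returns `false`.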

+1

-2

distribution/source/asr/models.js

		@@ -64,4 +64,3 @@ import fs from 'node:fs';
		 console.log(' Extracting...');
		-// eslint-disable-next-line @typescript-eslint/await-thenable
		-await execa('tar', ['xjf', tarPath, '-C', modelsDirectory]);
		+await execa('tar', ['xf', tarPath, '-C', modelsDirectory]);
		 fs.unlinkSync(tarPath);
		@@ -68,0 +67,0 @@ console.log(` ${dirName} ready.\n`);

+0

-2

distribution/source/asr/stream-asr.js

		@@ -83,3 +83,2 @@ import { createRequire } from 'node:module';
		 }
		-// eslint-disable-next-line @typescript-eslint/await-thenable
		 for await (const chunk of audio) {
		@@ -113,3 +112,2 @@ const combined = new Float32Array(pending.length + chunk.length);
		 let lastText = '';
		-// eslint-disable-next-line @typescript-eslint/await-thenable
		 for await (const chunk of audio) {
		@@ -116,0 +114,0 @@ buffers.push(chunk);

+15

-2

distribution/source/cloud-tts/cloud-tts.js

		@@ -5,5 +5,7 @@ import { Buffer } from 'node:buffer';
		 import path from 'node:path';
		+import process from 'node:process';
		 import { Writable } from 'node:stream';
		 import { execa } from 'execa';
		 import { ListenHubApi } from '../_api/listenhub-openapi.js';
		+import { deprecationDirectPlayback } from '../deprecations.js';
		 export async function listSpeakers(options) {
		@@ -45,2 +47,6 @@ const api = new ListenHubApi({
		 }
		+process.emitWarning('Direct audio playback is deprecated. Use -o <file> to save the audio file instead.', { type: 'DeprecationWarning', code: deprecationDirectPlayback });
		+if (process.platform === 'win32') {
		+throw new Error('Direct audio playback is not supported on Windows. Use -o <file> to save the audio file.');
		+}
		 const mp3Path = path.join(os.tmpdir(), `coli-cloud-tts-${Date.now()}.mp3`);
		@@ -50,4 +56,11 @@ const audio = await collectStream(stream);
		 try {
		-// eslint-disable-next-line @typescript-eslint/await-thenable
		-await execa('afplay', [mp3Path]);
		+await (process.platform === 'darwin'
		+? execa('afplay', [mp3Path])
		+: execa('ffplay', [
		+'-nodisp',
		+'-autoexit',
		+'-loglevel',
		+'quiet',
		+mp3Path,
		+]));
		 }
		@@ -54,0 +67,0 @@ finally {
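The platform handling in the hunks above amounts to a small decision table: Windows is rejected, macOS uses `afplay`, and every other platform falls through to `ffplay`. The sketch below restates that as a pure function so the mapping is easy to test. `selectPlayer` is a hypothetical name that does not exist in the package; the actual code calls `execa` inline.

```javascript
// Hypothetical restatement of the playback branch (not part of @marswave/coli).
// Returns the command and arguments that the diff passes to execa.
function selectPlayer(platform, filePath) {
  if (platform === 'win32') {
    throw new Error(
      'Direct audio playback is not supported on Windows. Use -o <file> to save the audio file.',
    );
  }

  if (platform === 'darwin') {
    return {command: 'afplay', args: [filePath]};
  }

  // Any platform other than win32 and darwin falls through to ffplay,
  // mirroring the ternary in the diff.
  return {
    command: 'ffplay',
    args: ['-nodisp', '-autoexit', '-loglevel', 'quiet', filePath],
  };
}
```

With this shape, a caller would pass `process.platform` and the temporary MP3 path, then hand `command` and `args` to a process runner.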

+8

-0

distribution/source/tts/tts.js

		@@ -0,6 +1,14 @@
		+import process from 'node:process';
		 import { getVoices as macGetVoices, say } from 'mac-say';
		+function assertMacOs() {
		+if (process.platform !== 'darwin') {
		+throw new Error('Local TTS is only supported on macOS. Use the cloud-tts command instead.');
		+}
		+}
		 export async function getVoices() {
		+assertMacOs();
		 return macGetVoices();
		 }
		 export async function runTts(text, options = {}) {
		+assertMacOs();
		 await say(text, {
		@@ -7,0 +15,0 @@ voice: options.voice,

+29

-26

docs/asr.md

		@@ -7,12 +7,4 @@ # ASR (Automatic Speech Recognition)

		-- [ffmpeg](https://ffmpeg.org/) (for non-WAV audio formats like `.m4a`, `.mp3`, etc.)
		+No external dependencies are required for WAV files. Non-WAV format support via the CLI is deprecated and requires [ffmpeg](https://ffmpeg.org/) (see [COLI_DEP002](deprecations.md#coli_dep002-file-path-input-for-asr)).

		-```sh
		-# macOS
		-brew install ffmpeg
		-
		-# Debian / Ubuntu
		-sudo apt install ffmpeg
		-```
		-
		 ## CLI
		@@ -22,6 +14,6 @@
		 # Plain text output
		-coli asr recording.m4a
		+coli asr recording.wav

		 # JSON output
		-coli asr -j recording.m4a
		+coli asr -j recording.wav

		@@ -32,3 +24,3 @@ # Select model
		 # Specify language (sensevoice only)
		-coli asr --language zh recording.m4a
		+coli asr --language zh recording.wav
		 ```
		@@ -99,7 +91,22 @@

		-### `runAsr(filePath, options)`
		+### `readWave(filename)`

		-Run speech recognition on an audio file. Results are printed to stdout.
		+Read a WAV file and return an `AudioData` object. Use this to load WAV files for `runAsr`.

		+```js
		+import {ensureModels, readWave, runAsr} from '@marswave/coli';
		+
		+await ensureModels();
		+
		+const audio = readWave('/path/to/recording.wav');
		+await runAsr(audio, {json: false, model: 'sensevoice'});
		+```
		+
		+### `runAsr(input, options)`
		+
		+Run speech recognition on audio data. Results are printed to stdout.
		+
		+The `input` parameter accepts either an `AudioData` object (recommended) or a file path string (deprecated).
		+
		 ```js
		 import {ensureModels, runAsr} from '@marswave/coli';
		@@ -109,14 +116,10 @@

		-// Plain text output
		+// Recommended: pass AudioData directly
		+await runAsr(
		+{sampleRate: 16000, samples: myFloat32Array},
		+{json: false, model: 'sensevoice'},
		+);
		+
		+// Deprecated: file path input (requires ffmpeg for non-WAV formats)
		 await runAsr('recording.m4a', {json: false, model: 'sensevoice'});
		-
		-// JSON output
		-await runAsr('recording.m4a', {json: true, model: 'whisper'});
		-
		-// Force Chinese language (sensevoice only)
		-await runAsr('recording.m4a', {
		-json: false,
		-model: 'sensevoice',
		-language: 'zh',
		-});
		 ```
		@@ -250,2 +253,2 @@

		-WAV files are passed directly to the recognizer. All other formats (m4a, mp3, ogg, flac, etc.) are automatically converted to 16 kHz mono WAV via ffmpeg.
		+The CLI accepts WAV files directly. For the programmatic API, use `readWave()` to load WAV files into an `AudioData` object, or provide your own `AudioData` from any source. Non-WAV file path input is deprecated (see [COLI_DEP002](deprecations.md#coli_dep002-file-path-input-for-asr)).
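As one way to "provide your own `AudioData`", the sketch below converts signed 16-bit mono PCM into the `{ sampleRate, samples }` shape. `pcm16ToAudioData` is a hypothetical helper, not an export of `@marswave/coli`. It assumes the recognizer expects mono samples normalized to the range [-1, 1); the diff does not state the expected range, so check this against the underlying recognizer before relying on it.

```javascript
// Hypothetical helper (not part of @marswave/coli): converts signed 16-bit
// mono PCM samples into an AudioData-shaped object.
// Assumption: the recognizer wants floats normalized to [-1, 1).
function pcm16ToAudioData(pcm, sampleRate) {
  const samples = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    // Int16 spans -32768..32767, so dividing by 32768 maps it into [-1, 1).
    samples[i] = pcm[i] / 32768;
  }

  return {sampleRate, samples};
}
```

The result could then be passed as the first argument of `runAsr`, for example `pcm16ToAudioData(myInt16Array, 16000)`, where `myInt16Array` is whatever decoded PCM the caller already holds.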

+1

-1

package.json

		{
		"name": "@marswave/coli",
		"private": false,
		-"version": "0.0.15",
		+"version": "0.0.16",
		"description": "A CLI for the Cola",
		@@ -6,0 +6,0 @@ "repository": "marswaveai/coli",

+1

-0

README.md

		@@ -59,2 +59,3 @@ # coli
		- [ListenHub OpenAPI](docs/listenhub-openapi.md) — ListenHub OpenAPI client
		+- [Deprecations](docs/deprecations.md) — Deprecated APIs

		@@ -61,0 +62,0 @@ ## License
