web-speech-cognitive-services
Comparing version 5.0.2-master.d388ae9 to 5.0.2-master.e462e9e
@@ -9,2 +9,10 @@ # Changelog | ||
### Fixed | ||
- Speech recognition: Removed extraneous finalized `result` event in continuous mode, by [@compulim](https://github.com/compulim), in PR [#79](https://github.com/compulim/web-speech-cognitive-services/pull/79) | ||
### Added | ||
- Speech recognition: New `looseEvents` option, default is `false`. When enabled, we no longer follow the event order observed in browsers and instead send the finalized `result` event as early as possible. This does not break conformance to W3C specifications. By [@compulim](https://github.com/compulim), in PR [#79](https://github.com/compulim/web-speech-cognitive-services/pull/79) | ||
## [5.0.1] - 2019-10-25 | ||
@@ -11,0 +19,0 @@ |
@@ -46,3 +46,3 @@ "use strict"; | ||
var VERSION = "5.0.2-master.d388ae9"; | ||
var VERSION = "5.0.2-master.e462e9e"; | ||
@@ -49,0 +49,0 @@ function buildSpeechResult(transcript, confidence, isFinal) { |
@@ -66,4 +66,4 @@ "use strict"; | ||
meta.setAttribute('name', 'web-speech-cognitive-services'); | ||
meta.setAttribute('content', "version=".concat("5.0.2-master.d388ae9")); | ||
meta.setAttribute('content', "version=".concat("5.0.2-master.e462e9e")); | ||
document.head.appendChild(meta); | ||
//# sourceMappingURL=SpeechServices.js.map |
@@ -140,2 +140,4 @@ "use strict"; | ||
enableTelemetry = _ref3$enableTelemetry === void 0 ? true : _ref3$enableTelemetry, | ||
looseEvent = _ref3.looseEvent, | ||
looseEvents = _ref3.looseEvents, | ||
referenceGrammars = _ref3.referenceGrammars, | ||
@@ -157,2 +159,7 @@ _ref3$region = _ref3.region, | ||
if (typeof looseEvent !== 'undefined') { | ||
console.warn('web-speech-cognitive-services: The option "looseEvent" should be named as "looseEvents".'); | ||
looseEvents = looseEvent; | ||
} | ||
var _onAudibleChunk; | ||
@@ -618,11 +625,24 @@ | ||
})); | ||
} // If it is continuous, we just sent the finalized results. So we don't need to send it again after "audioend" event. | ||
if (_this3.continuous && recognizable) { | ||
finalEvent = null; | ||
} else { | ||
finalEvent = { | ||
results: finalizedResults, | ||
type: 'result' | ||
}; | ||
} | ||
finalEvent = { | ||
results: finalizedResults, | ||
type: 'result' | ||
}; | ||
if (!_this3.continuous) { | ||
recognizer.stopContinuousRecognitionAsync(); | ||
} // If event order can be loosened, we can send the recognized event as soon as we receive it. | ||
// 1. If it is not recognizable (no-speech), we should send an "error" event just before "end" event. We will not loosen "error" events. | ||
if (looseEvents && finalEvent && recognizable) { | ||
_this3.dispatchEvent(new SpeechRecognitionEvent(finalEvent.type, finalEvent)); | ||
finalEvent = null; | ||
} | ||
@@ -629,0 +649,0 @@ } else if (recognizing) { |
{ | ||
"name": "web-speech-cognitive-services", | ||
"version": "5.0.2-master.d388ae9", | ||
"version": "5.0.2-master.e462e9e", | ||
"description": "Polyfill Web Speech API with Cognitive Services Speech-to-Text service", | ||
@@ -68,2 +68,3 @@ "keywords": [ | ||
"microsoft-speech-browser-sdk": "^0.0.12", | ||
"prettier": "^1.19.1", | ||
"rimraf": "^2.6.3" | ||
@@ -70,0 +71,0 @@ }, |
@@ -5,4 +5,2 @@ # web-speech-cognitive-services | ||
> This scaffold is provided by [`react-component-template`](https://github.com/compulim/react-component-template/). | ||
[![npm version](https://badge.fury.io/js/web-speech-cognitive-services.svg)](https://badge.fury.io/js/web-speech-cognitive-services) [![Build Status](https://travis-ci.org/compulim/web-speech-cognitive-services.svg?branch=master)](https://travis-ci.org/compulim/web-speech-cognitive-services) | ||
@@ -137,5 +135,16 @@ | ||
<td><code>undefined</code></td> | ||
<td>Pass-through option to enable or disable telemetry for Speech SDK recognizer as <a href="https://github.com/Microsoft/cognitive-services-speech-sdk-js#data--telemetry">outlined in Speech SDK</a>. This adapter does not collect any telemetry.<br /><br />By default, Speech SDK will collect telemetry unless this is set to <code>false</code>.</td> | ||
<td> | ||
Pass-through option to enable or disable telemetry for Speech SDK recognizer as <a href="https://github.com/Microsoft/cognitive-services-speech-sdk-js#data--telemetry">outlined in Speech SDK</a>. This adapter does not collect any telemetry.<br /><br />By default, Speech SDK will collect telemetry unless this is set to <code>false</code>. | ||
</td> | ||
</tr> | ||
<tr> | ||
<td><code>looseEvents: boolean</code></td> | ||
<td><code>false</code></td> | ||
<td> | ||
Specifies whether the event order should strictly follow observed browser behavior (<code>false</code>) or be loosened (<code>true</code>). Regardless of this option, the package will continue to <a href="https://wicg.github.io/speech-api/#eventdef-speechrecognition-result">conform to W3C specifications</a>. | ||
<br /><br /> | ||
You can read more about this option in the <a href="#event-order">event order section</a>. | ||
</td> | ||
</tr> | ||
<tr> | ||
<td><code>ponyfill.AudioContext: <a href="https://developer.mozilla.org/en-US/docs/Web/API/AudioContext">AudioContext</a></code></td> | ||
@@ -151,3 +160,5 @@ <td><code>window.AudioContext ||</code><br /><code>window.webkitAudioContext</code></td> | ||
<td><code>undefined</code></td> | ||
<td>Reference grammar IDs to send for speech recognition.</td> | ||
<td> | ||
Reference grammar IDs to send for speech recognition. | ||
</td> | ||
</tr> | ||
@@ -177,3 +188,3 @@ <tr> | ||
<td><code>speechSynthesisOutputFormat: string</code></td> | ||
<td><code>audio-24khz-160kbitrate-mono-mp3</code></td> | ||
<td><code>"audio-24khz-160kbitrate-mono-mp3"</code></td> | ||
<td>Audio format for speech synthesis. Please refer to <a href="https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-text-to-speech#audio-outputs">this article</a> for list of supported formats.</td> | ||
@@ -434,2 +445,38 @@ </tr> | ||
## Event order | ||
According to [W3C specifications](https://wicg.github.io/speech-api/#eventdef-speechrecognition-result), the `result` event can be fired at any time after the `audiostart` event. | ||
In continuous mode, the finalized `result` event is sent as early as possible. In non-continuous mode, however, we observed that browsers send the finalized `result` event just before `end`, after `audioend`, instead of as early as possible. | ||
By default, we follow the event order observed in browsers (a.k.a. strict event order). For speech recognition in non-continuous mode with interim results, the observed event order is: | ||
1. `start` | ||
1. `audiostart` | ||
1. `soundstart` | ||
1. `speechstart` | ||
1. `result` (these are interim results, with `isFinal` property set to `false`) | ||
1. `speechend` | ||
1. `soundend` | ||
1. `audioend` | ||
1. `result` (with `isFinal` property set to `true`) | ||
1. `end` | ||
You can loosen the event order by setting `looseEvents` to `true`. For the same scenario, the event order becomes: | ||
1. `start` | ||
1. `audiostart` | ||
1. `soundstart` | ||
1. `speechstart` | ||
1. `result` (these are interim results, with `isFinal` property set to `false`) | ||
1. `result` (with `isFinal` property set to `true`) | ||
1. `speechend` | ||
1. `soundend` | ||
1. `audioend` | ||
1. `end` | ||
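For `error` events (abort, `"no-speech"`, or other errors), we always send them just before the last `end` event, regardless of the `looseEvents` option. | ||
In some cases, loosening the event order may improve perceived recognition performance, because the finalized `result` event is dispatched without waiting for `audioend`. This does not break conformance to the W3C specification. | ||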
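
The two orderings above can be summarized in a small sketch. It assumes the non-continuous, interim-results scenario described in this section. `expectedEventOrder` is a hypothetical helper written only for illustration and is not part of this package. The labels `result (interim)` and `result (final)` stand for `result` events with `isFinal` set to `false` and `true`, and the single interim entry stands for any number of interim results.

```javascript
// Hypothetical helper, not part of web-speech-cognitive-services.
// Returns the event order described above for non-continuous recognition
// with interim results, depending on the `looseEvents` option.
function expectedEventOrder({ looseEvents = false } = {}) {
  // Events fired while the user is still speaking.
  const whileSpeaking = ['start', 'audiostart', 'soundstart', 'speechstart', 'result (interim)'];

  // Events fired as audio capture winds down.
  const audioTeardown = ['speechend', 'soundend', 'audioend'];

  const finalResult = 'result (final)';

  return looseEvents
    ? // Loosened: the finalized result is sent as early as possible.
      [...whileSpeaking, finalResult, ...audioTeardown, 'end']
    : // Strict (default): the finalized result is sent after "audioend".
      [...whileSpeaking, ...audioTeardown, finalResult, 'end'];
}

// Print both orderings side by side for comparison.
console.log(expectedEventOrder().join(' > '));
console.log(expectedEventOrder({ looseEvents: true }).join(' > '));
```

A list like this can serve as a fixture when checking that the events recorded from a recognition session arrive in the order your application expects.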
# Test matrix | ||
@@ -436,0 +483,0 @@ |