ssml-document
Follow the W3C SSML (Speech Synthesis Markup Language) specification to build SSML documents that meet the requirements of all major voice service providers.
遵循 W3C SSML(语音合成标记语言)规范,构建符合大部分语音/数字人服务提供商要求的 SSML 文档。
Currently supports the generation of SSML documents for these service providers:
Microsoft Azure / Aliyun / Tencent Cloud / Google Cloud / Amazon AWS / GuijiAI / Tencent YunXiaoWei / XMOV / Microsoft XiaoBing / Sensetime / Eta ...
当前支持生成这些服务商的SSML文档:
微软云 / 阿里云 / 腾讯云 / 谷歌云 / 亚马逊云 / 火山云 / 硅基智能 / 腾讯云小微 / XMOV魔珐科技 / 微软小冰 / 商汤科技 / 艾塔 ...
Every month we check the development documents of these service providers to ensure that the library is not out of date.
每月我们会检查这些服务商的开发文档,以确保本库不会过时。
Initialization
npm i ssml-document --save
Features
- Support the construction of SSML that meets the requirements of mainstream voice service providers 支持构建符合主流语音服务商要求的SSML
- Supports parsing and construction of aggregated SSML (intermediate state) 支持聚合SSML(中间态)的解析和构建
- Supports almost all SSML document tags 支持几乎全部的SSML文档标签
- Support for selectively constructing or discarding tags and attributes based on service providers 支持根据服务商选择性构建或丢弃标签与属性
- Support expression compilation to make content change dynamically 支持JavaScript表达式求值使内容动态变化
- The library supports the following SSML tags and corresponding elements 该库支持以下SSML标签和对应的元素
- speak - Document
- voice - Voice
- prosody - Prosody
- p - Paragraph
- s - Sentence
- w - Word
- break - Break
- phoneme - Phoneme
- say-as - SayAs
- sub - Subsitute
- audio - Audio
- background-audio - BackgroundAudio
- express-as - ExpressAs
- emotion - Emotion
- effect - Effect
- emphasis - Emphasis
- lang - Language
- mark - Mark
- bookmark - Bookmark
- seq - Sequential
- par - Parallel
- lexicon - Lexicon
- auto-breaths - AutoBreaths
- silence - Silence
Basic Usage
Build microsoft-azure SSML:
构建微软云语音SSML:
const { Document, ServiceProvider } = require("ssml-document");
const doc = new Document();
const ssml = doc
.voice("zh-CN-XiaoxiaoNeural")
.prosody({ rate: 1.2, pitch: 1.1 })
.say("Hello World")
.pause(500)
.say("GO GO GO")
.sayAs("123456", { interpretAs: "digits" })
.up()
.up()
.render({
pretty: true,
provider: ServiceProvider.Microsoft
});
console.log(ssml);
Build aliyun SSML:
构建阿里云语音SSML:
const { Document, ServiceProvider } = require("ssml-document");
const doc = new Document();
const ssml = doc
.voice("aixia")
.prosody({ rate: 1.2, pitch: 1.1 })
.say("我的身高")
.phoneme("长", {
alphabet: "py",
ph: "zhǎng"
})
.say("高了")
.s("欢迎来到")
.sub("W3C", "万维网")
.up()
.up()
.up()
.render({ pretty: true, provider: ServiceProvider.Aliyun });
console.log(ssml);
Build aggregated SSML for storage or transport:
构建用于存储或传输的聚合SSML:
const { Document } = require("ssml-document");
const doc = new Document();
const ssml = doc
.voice("aixia")
.say("Hello World")
.break(2000)
.say("Bye")
.up()
.render({ pretty: true });
console.log(ssml);
Support for marking expected voice service providers in aggregated SSML:
支持在聚合SSML中标记期望的语音服务商:
const { Document } = require("ssml-document");
const doc = new Document({ provider: ServiceProvider.Aliyun });
const ssml = doc
.voice("aida")
.say("Hello World")
.up()
.render({ pretty: true });
console.log(ssml);
const doc2 = Document.parse(ssml);
console.log(doc2.provider);
const ssml2 = doc2.render({ provider: doc2.provider });
console.log(ssml2);
Get speech rate, pitch, volume and speaker parameters.
获取语音语速、语调、音量以及发音人参数
const { Document } = require("ssml-document");
const doc = new Document();
doc.voice("aixia")
.prosody({ rate: 1.2, pitch: 1.1 })
.say("Example Text");
console.log(doc.rate, doc.pitch, doc.volume, doc.speaker);
Building SSML based on elements:
基于元素构建SSML:
const { Document, ServiceProvider, elements } = require("ssml-document");
const { Prosody, Paragraph } = elements;
const doc = new Document();
const prosody = new Prosody({ rate: 1.2, pitch: 1.1 });
const p = new Paragraph();
p.appendChild("Hello World");
prosody.appendChild(p);
doc.appendChild(prosody);
const ssml = doc.render({ provider: ServiceProvider.Amazon });
console.log(ssml);
Setting the compile attribute for tags enables JavaScript syntax expression evaluation:
给标签设置compile属性启用JavaScript语法表达式求值:
const { Document } = require("ssml-document");
const aggregationSSML = '<?xml version="1.0"?>\
<speak compile="true" version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">\
<voice name="aixia">\
Master, {{[\"today is a beautiful day\",\"have a good day from now\",\"the weather is not bad today\"][parseInt(Math.random() * 3)]}}.\
<break time="500"/>\
The current time is {{time}}.\
</voice>\
</speak>';
const date = new Date();
const doc = Document.parse(aggregationSSML, {
dataset: {
time: date.getHours() + ":" + date.getMinutes()
}
});
const ssml = doc.render({
provider: doc.provider,
pretty: true
});
console.log(ssml);
Setting for/if/elif/else attributes for tags can also implement loops and conditional divergence.
给标签设置 for/if/elif/else 属性还能够实现循环和条件分歧。
const { Document } = require("ssml-document");
const aggregationSSML = '<?xml version="1.0"?>\
<speak compile="true" version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">\
<voice name="aixia">\
<s if="{{Math.random() > 0.5}}" for="{{s1}}">{{item}}<break time="100"></s>\
<s else="true" for="{{s2}}">{{item}}<break time="100"></s>\
</voice>\
</speak>';
const doc = Document.parse(aggregationSSML, {
dataset: {
s1: ["Oh", "my", "god"],
s2: ["Holy", "crap"]
}
});
const ssml = doc.render({
provider: doc.provider,
pretty: true
});
console.log(ssml);
Elements
Document
Tag: speak
Support: w3c / microsoft / aliyun / tencent / google / amazon
Voice
Tag: voice
Support: w3c / microsoft / google
Prosody
Tag: prosody
Support: w3c / microsoft / google / amazon
Paragraph
Tag: p
Support: w3c / microsoft / aliyun / google / amazon
Sentence
Tag: s
Support: w3c / microsoft / aliyun / google / amazon
Word
Tag: w
Support: w3c / aliyun / google / amazon
Break
Tag: break
Support: w3c / microsoft / aliyun / tencent / google / amazon / yunXiaoWei
Phoneme
Tag: phoneme
Support: w3c / microsoft / aliyun / tencent / google / amazon / yunXiaoWei
SayAs
Tag: say-as
Support: w3c / microsoft / aliyun / tencent / google / amazon
Subsitute
Tag: sub
Support: w3c / aliyun / tencent / google / amazon
Audio
Tag: audio
Support: w3c / microsoft / aliyun* / google
BackgroundAudio
Tag: background-audio / backgroundAudio / backgroundaudio
Support: microsoft / aliyun*
ExpressAs
Tag: express-as
Support: microsoft / aliyun* / amazon*
Emotion
Tag: emotion
Support: microsoft* / aliyun / amazon*
Effect
Tag: effect
Support: aliyun* / amazon
Emphasis
Tag: emphasis
Support: w3c / microsoft / google / amazon
Language
Tag: lang
Support: w3c / microsoft / google
Mark
Tag: mark
Support: w3c / google / amazon
Bookmark
Tag: bookmark
Support: microsoft
Sequential
Tag: seq
Support: microsoft
Parallel
Tag: par
Support: w3c / google
Sequential
Tag: seq
Support: w3c / google
Lexicon
Tag: lexicon
Support: w3c / microsoft
AutoBreaths
Tag: auto-breaths
Support: amazon
Silence
Tag: silence
Support: microsoft