@jackdbd/eleventy-plugin-text-to-speech
![Snyk Vulnerabilities for npm package](https://img.shields.io/snyk/vulnerabilities/npm/@jackdbd%2Feleventy-plugin-text-to-speech)
Eleventy plugin that synthesizes any text you want, on any page of your Eleventy site, using the Google Cloud Text-to-Speech API. You can either self-host the audio assets this plugin generates, or host them on Cloud Storage.
:warning: The Cloud Text-to-Speech API has a limit of 5000 characters per request.
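If a text you want to synthesize is longer than the limit, one option is to split it before sending it to the API. The sketch below is a hypothetical helper (`chunkText` is not part of this plugin, and nothing here says whether the plugin splits long texts itself). It assumes the limit can be approximated with JavaScript string length, which may differ from how the API counts characters.

```javascript
// Hypothetical helper, not part of the plugin: split a long text into
// chunks that each stay within a maximum number of characters.
const MAX_CHARS = 5000

function chunkText(text, maxChars = MAX_CHARS) {
  const chunks = []
  let rest = text
  while (rest.length > maxChars) {
    // Prefer to cut at the last space inside the window, so words stay intact.
    let cut = rest.lastIndexOf(' ', maxChars)
    // No usable space found: fall back to a hard cut.
    if (cut <= 0) cut = maxChars
    chunks.push(rest.slice(0, cut))
    rest = rest.slice(cut)
  }
  if (rest.length > 0) chunks.push(rest)
  return chunks
}
```

Joining the chunks with an empty string gives back the original text, so no characters are lost by the split.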
Installation
npm install --save-dev @jackdbd/eleventy-plugin-text-to-speech
Preliminary Operations
Enable the Text-to-Speech API
Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:
gcloud services enable texttospeech.googleapis.com
Set up authentication via a service account
This plugin uses the official Node.js client library for the Text-to-Speech API. In order to authenticate to any Google Cloud API you will need some kind of credentials. At the moment this plugin supports only authentication via a service account JSON key.
First, create a service account that can use the Text-to-Speech API. You can also reuse an existing service account if you want. You only need the service account itself; there is no need to configure any IAM permissions.
gcloud iam service-accounts create sa-text-to-speech-user \
--display-name "Text-to-Speech user SA"
Second, download the JSON key of this service account and store it somewhere safe. Do not track this file in git.
Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)
Create a Cloud Storage bucket in your desired location. Enable uniform bucket-level access and use the nearline storage class.
gsutil mb \
-p $GCP_PROJECT_ID \
-l $CLOUD_STORAGE_LOCATION \
-c nearline \
-b on \
gs://bkt-eleventy-plugin-text-to-speech-audio-files
If you want, you can check that uniform bucket-level access is enabled using this command:
gsutil uniformbucketlevelaccess get \
gs://bkt-eleventy-plugin-text-to-speech-audio-files
Make the bucket's objects publicly readable (otherwise people will not be able to listen to or download the audio files):
gsutil iam ch allUsers:objectViewer \
gs://bkt-eleventy-plugin-text-to-speech-audio-files
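Once the objects are publicly readable, each one can be fetched from `https://storage.googleapis.com/<bucket>/<object>`. The sketch below builds such a URL; `publicObjectUrl` and the object name are hypothetical and only serve as an illustration.

```javascript
// Hypothetical helper: build the public URL of an object stored in a bucket.
function publicObjectUrl(bucketName, objectName) {
  // Encode each path segment, but keep the '/' separators as they are.
  const encoded = objectName.split('/').map(encodeURIComponent).join('/')
  return `https://storage.googleapis.com/${bucketName}/${encoded}`
}

console.log(
  publicObjectUrl(
    'bkt-eleventy-plugin-text-to-speech-audio-files',
    'posts/hello world.mp3' // made-up object name
  )
)
// prints https://storage.googleapis.com/bkt-eleventy-plugin-text-to-speech-audio-files/posts/hello%20world.mp3
```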
Usage
Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the environment variable CF_PAGES_URL.
Self-hosting the generated audio assets
If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')
module.exports = function (eleventyConfig) {
  eleventyConfig.addPlugin(tts, {
    audioHost: process.env.CF_PAGES_URL
      ? new URL(`${process.env.CF_PAGES_URL}/assets/audio`)
      : new URL('http://localhost:8090/assets/audio')
  })
}
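If you prefer to keep the URL logic out of the plugin options, the conditional above can be moved into a small function. This is only a sketch: `selfHostedAudioHost` is a hypothetical name, and stripping a trailing slash from CF_PAGES_URL is an extra precaution that the original snippet does not take.

```javascript
// Hypothetical helper: compute the audio host URL from the environment.
function selfHostedAudioHost(env = process.env) {
  // Fall back to the local dev server when CF_PAGES_URL is not set.
  const base = (env.CF_PAGES_URL || 'http://localhost:8090').replace(/\/+$/, '')
  return new URL(`${base}/assets/audio`)
}

console.log(selfHostedAudioHost({ CF_PAGES_URL: 'https://example.pages.dev/' }).href)
// prints https://example.pages.dev/assets/audio
```

The returned value is a `URL` instance, so it can be passed as `audioHost` in the same way as in the snippet above.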
Hosting the generated audio assets on Cloud Storage
If you want to host the audio assets on a Cloud Storage bucket and configure the rules that determine which texts are converted into speech, you could register the plugin using something like this:
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')
module.exports = function (eleventyConfig) {
  eleventyConfig.addPlugin(tts, {
    audioHost: {
      bucketName: 'some-bucket-containing-publicly-readable-files'
    },
    rules: [
      {
        regex: new RegExp('posts\\/.*\\.html$'),
        cssSelectors: ['h1']
      },
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    voice: 'en-GB-Wavenet-C'
  })
}
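To get a feel for the two regular expressions in these rules, you can test them against a few paths. The sketch below assumes that each `regex` is tested against the output path of a page (this is an assumption, not something stated here), and the paths themselves are made up.

```javascript
// The two rules from the configuration above.
const rules = [
  { regex: new RegExp('posts\\/.*\\.html$'), cssSelectors: ['h1'] },
  {
    regex: new RegExp('^((?!404).)*\\.html$'),
    xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
  }
]

// Hypothetical helper: return the indexes of the rules whose regex matches.
function matchingRuleIndexes(outputPath) {
  return rules
    .map((rule, index) => (rule.regex.test(outputPath) ? index : -1))
    .filter((index) => index !== -1)
}

console.log(matchingRuleIndexes('_site/posts/hello/index.html')) // prints [ 0, 1 ]
console.log(matchingRuleIndexes('_site/about/index.html')) // prints [ 1 ]
console.log(matchingRuleIndexes('_site/404.html')) // prints []
```

The second regex uses a negative lookahead, so it matches any `.html` path that does not contain `404` anywhere.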
Multiple hosts
If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:
- self-host some audio assets, and host on a Cloud Storage bucket some other assets
- host all audio assets on Cloud Storage, but host some on one bucket, and some others on a different bucket.
Have a look at the Eleventy configuration of the demo-site in this monorepo.
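As a rough sketch of the first scenario, the function below registers the plugin twice: once for self-hosted assets and once for a Cloud Storage bucket. Several things here are assumptions: `registerAudioHosts` is a hypothetical helper that receives the plugin as an argument, the rules are borrowed from the earlier example, and giving each registration its own `collectionName` and `transformName` is a guess at how to avoid name clashes, not a documented requirement.

```javascript
// Hypothetical helper: register the plugin once per audio host.
// `tts` is the plugin, e.g. the `plugin` export of this package.
function registerAudioHosts(eleventyConfig, tts) {
  // Headings of blog posts: self-hosted.
  eleventyConfig.addPlugin(tts, {
    audioHost: new URL('http://localhost:8090/assets/audio'),
    rules: [{ regex: new RegExp('posts\\/.*\\.html$'), cssSelectors: ['h1'] }],
    // Assumption: distinct names keep the two registrations apart.
    collectionName: 'audio-items-self-hosted',
    transformName: 'inject-audio-tags-self-hosted'
  })

  // Selected paragraphs on all other pages: hosted on Cloud Storage.
  eleventyConfig.addPlugin(tts, {
    audioHost: { bucketName: 'some-bucket-containing-publicly-readable-files' },
    rules: [
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    collectionName: 'audio-items-cloud-storage',
    transformName: 'inject-audio-tags-cloud-storage'
  })
}
```

In an Eleventy configuration file you would call `registerAudioHosts(eleventyConfig, tts)` from the exported function.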
Configuration
Required parameters
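Parameter | Explanation |
--- | --- |
audioHost | Where to host the generated audio assets: either a URL (self-hosting) or an object with a bucketName (Cloud Storage). Each audio host should have a matching writer responsible for writing/uploading the assets to the host. |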
Options
Option | Default | Explanation |
--- | --- | --- |
audioEncodings | ['OGG_OPUS', 'MP3'] | List of audio encodings to use when generating audio assets from text matches. |
audioInnerHTML | see src/dom.ts | Function used to generate the innerHTML of the <audio> tag injected into the page for each text match. |
cacheExpiration | 365d | Expiration for the 11ty AssetCache. |
collectionName | audio-items | Name of the 11ty collection created by this plugin. |
keyFilename | process.env.GOOGLE_APPLICATION_CREDENTIALS | Credentials for the Cloud Text-to-Speech API (and for the Cloud Storage API if you don't set it in audioHost). |
rules | see src/constants.ts | Rules that determine which texts to convert into speech. |
transformName | inject-audio-tags-into-html | Name of the 11ty transform created by this plugin. |
voice | en-US-Standard-J | Voice to use when generating audio assets from text matches. See the list of voices supported by the Text-to-Speech API; pricing may differ between voices. |
:warning: Don't forget to set either keyFilename or the GOOGLE_APPLICATION_CREDENTIALS environment variable on your build server.
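One way to catch a missing key early is to check for it at the top of your Eleventy configuration, before registering the plugin. The sketch below is a hypothetical guard (`resolveKeyFilename` is not part of this plugin); it only mirrors the documented default, where keyFilename falls back to GOOGLE_APPLICATION_CREDENTIALS.

```javascript
const fs = require('node:fs')

// Hypothetical guard: find the service account key, or fail the build early.
function resolveKeyFilename(options = {}, env = process.env) {
  // An explicit option wins; otherwise fall back to the environment variable.
  const keyFilename = options.keyFilename || env.GOOGLE_APPLICATION_CREDENTIALS
  if (!keyFilename) {
    throw new Error('Set keyFilename or GOOGLE_APPLICATION_CREDENTIALS')
  }
  if (!fs.existsSync(keyFilename)) {
    throw new Error(`Service account key not found: ${keyFilename}`)
  }
  return keyFilename
}
```

Calling `resolveKeyFilename()` with no arguments checks only the environment variable, which is probably what you want on a build server.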
Tip: check what I did in the Eleventy configuration file for the demo-site of this monorepo.
Debug
This plugin uses the debug library for logging. You can control what's logged using the DEBUG environment variable. For example, if you set your environment variables in a .envrc file, you could do:
export DEBUG=eleventy-plugin-text-to-speech/*
export DEBUG=eleventy-plugin-text-to-speech/dom,eleventy-plugin-text-to-speech/writers
export DEBUG=eleventy-plugin-text-to-speech/*,-eleventy-plugin-text-to-speech/dom,-eleventy-plugin-text-to-speech/transforms
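The DEBUG value is a comma-separated list of namespace patterns, where `*` is a wildcard and a leading `-` excludes a namespace. The sketch below is a simplified re-implementation of that matching, written only to illustrate the three lines above; it is not the debug library's actual code, and `isEnabled` is a hypothetical name.

```javascript
// Turn one pattern (with optional '*' wildcards) into an anchored RegExp.
function patternToRegExp(pattern) {
  const escapeRe = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
  return new RegExp('^' + pattern.split('*').map(escapeRe).join('.*') + '$')
}

// Simplified illustration: is `namespace` logged for a given DEBUG value?
function isEnabled(namespace, debugValue) {
  const patterns = debugValue.split(',').map((s) => s.trim()).filter(Boolean)
  let enabled = false
  for (const pattern of patterns) {
    if (pattern.startsWith('-')) {
      // Exclusions always win, regardless of their position in the list.
      if (patternToRegExp(pattern.slice(1)).test(namespace)) return false
    } else if (patternToRegExp(pattern).test(namespace)) {
      enabled = true
    }
  }
  return enabled
}

const value =
  'eleventy-plugin-text-to-speech/*,-eleventy-plugin-text-to-speech/dom,-eleventy-plugin-text-to-speech/transforms'
console.log(isEnabled('eleventy-plugin-text-to-speech/writers', value)) // prints true
console.log(isEnabled('eleventy-plugin-text-to-speech/dom', value)) // prints false
```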
Credits
I got the idea for this plugin while reading the code of eleventy-plugin-text-to-speech by Larry Hudson, a plugin with the same name. There are a few differences between the two plugins; the main one is that this plugin uses the Google Cloud Text-to-Speech API, while Larry's plugin uses the Microsoft Azure Speech SDK.