ebt-deepl
The ebt-deepl repository is a JavaScript library
for Early Buddhist Texts (EBT) translation using DeepL.
Translation is difficult. It is difficult because definitions
are fluid as they are adapted to communicate societal needs
and desires. And when societal needs and desires change
quickly, as they are today, the translation also has to
deal with the communication gaps between living generations
within a single society.
For an internet world, that translation problem is global.
Translating texts takes time. Translation takes years.
And in those long years, a new generation will be born,
a new generation that will grow up to create their own
terms and catch phrases, unknowing and unwilling
to learn the words for the needs and desires of
older generations.
Yet in that stormy flood of change, there are invariants.
The invariants that remain do so regardless of the names
with which they are painted.
Praised, reviled, ignored, remembered or forgotten,
those invariants stand true through the generations,
through the ages.
And, if we are to understand each other,
we have to start with those invariants.
Indeed, reliable communication relies on shared,
verifiable truths. The security of the internet
itself relies on shared verifiable truths.
For the internet, those truths are SSL certificates.
But to understand each other meaningfully,
we need to talk about meaningful truths.
What is meaningful?
Well, judging from the internet, there appears to
be a lot of meaningless drama in the world.
There is a lot of suffering.
And if suffering is a problem,
then if there were an end to suffering, perhaps
that might be something good to talk about.
Indeed, the internet devotes a lot of bandwidth
to various purported means of ending suffering.
We have pop-up ads and we have math videos.
Oddly, pop-up ads claim to end suffering but
rarely do, whereas math videos exclude
suffering from discourse since suffering is
not a computable axiom.
Interestingly, thousands of years ago, someone
did talk about noble truths. In fact, that
very notable person did call them Noble Truths
and declared them to be self-verifiable.
And thousands of years later, people are still
finding those Noble Truths meaningful.
DN1:1.3.2: “It’s incredible, reverends, it’s amazing how the diverse convictions of sentient beings have been clearly comprehended by the Blessed One, who knows and sees, the perfected one, the fully awakened Buddha.”
One of the most fascinating things about these
Noble Truths is that they have survived
and flourished with repeated translation.
They have remained meaningful through generations
and ages. In other words, the Four Noble Truths
might be a remarkable basis for translation
and communication about meaningful topics.
So although we could talk about things like
painting ourselves to look young,
we might instead
wish to talk about ending suffering
reliably and verifiably.
But wait. We need more than four truths
for translation. Bummer.
Well, math found a solution to a paucity of truth.
It's called the proof. Math has few axioms and many
proofs that extend those axioms in useful ways
that mathematicians discuss endlessly.
Can the Four Noble Truths be extended in the same
way? Well, the EBTs do claim so:
MN80:16.5: Let a sensible person come—neither devious nor deceitful, a person of integrity. I teach and instruct them.
MN80:16.6: Practicing as instructed they will soon know and see for themselves,
It would appear then that the Early Buddhist Texts
might prove to be a consistent, reliable, and verifiable
candidate for a contemporary translation basis.
And why does that matter?
Well, if we take a consistent,
reliable, and verifiable basis for contemporary translation
and feed it to a machine translator,
we might be able to understand each other better
and talk about finding meaningful solutions
to meaningful things...
DeepL
DeepL is a state-of-the-art
automated translator.
DeepL has a glossary feature
that supports the customization of vocabulary
for selected purposes.
And the glossary feature is remarkably well-suited
for representing a large translation basis.
DeepL allows for up to 5000 glossary entries,
which provides a customizable basis of up to a solid B2 level.
And with DeepL customized this way,
we could talk together,
if we wanted to, about
everyday things and about meaningful things.
iti75:5.2: It’s when some person gives to everyone—whether ascetics and brahmins, paupers, vagrants, nomads, or beggars—such things as food, drink, clothing, vehicles; garlands, perfumes, and makeup; and bed, house, and lighting.
The EBT-DeepL project
The EBT-DeepL project is still in a research phase,
investigating how to create
a reliable translation basis using the EBTs.
In particular, we have found it quite useful to
rely on the semantic web of meaning
discovered in curated EBT examples.
The EBT examples guide EBT readers through
modern translations of the Pali Early Buddhist Texts
found on SuttaCentral.
What the EBT examples reveal is the remarkable attention to
consistency in the authoring of the prose EBTs.
Terms are repeated throughout the EBTs
and used quite consistently with remarkable nuance,
in contexts both broad and deep.
Searching for EBT examples reveals
an amazingly rich semantic web that has
assisted and guided ongoing human translation efforts
over the ages.
Indeed, the process of translation itself
fosters additional discussions which
reveal deeper connections within the EBTs.
Because of this, we rely on the EBT examples
to guide our efforts in building
the EBT-DeepL glossaries.
Briefly, our process is simple:
- Choose an example
- Choose a short document with that example
- Compare the DeepL translation with the human EBT translation
- Update the DeepL glossaries with the guidance of human EBT translators
- Publicly provide DeepL translations under the MIT License, thereby allowing human EBT translators to refine their own translations.
- Repeat with another example
Finally, we are very grateful to Bhante Sujato
and the entire SuttaCentral community
for their joint work in continuously translating
the Early Buddhist Texts for all to read.
Create DeepL account
- Log in to your DeepL account and use the top-right menu to open your Account settings
- Choose the Account tab to show your authentication key
- Copy the authentication key
Clone and test repository
git clone https://github.com/sc-voice/ebt-deepl.git
cd ebt-deepl
npm install
mkdir local
echo YOUR_AUTHENTICATION_KEY > local/deepl.auth
npm run test
Translate a sutta
EBT-DeepL translates from two sources having consistent and extensive Pali EBT coverage.
The first source by default is Bhante Sujato's EN translations.
The second source by default is Ayya Sabbamitta's DE translations.
DeepL translations will be provided for both translation sources.
A human reference is also useful--by default the reference
author will be a Bilara author having
existing consistent, segmented EBT translations.
For example, the reference for PT is laera-quaresma
and the reference for FR is noeismet.
References are not translated by DeepL--they are simply shown
in the output to aid in verification.
The Pali MS segmented text is also shown in the output
for an absolute reference of comparison.
To translate a sutta to Portuguese, try this:
scripts/translate.mjs -dl pt an3.49
For a description of the options to translate.mjs:
scripts/translate.mjs --help
Glossaries and Transforms
Glossaries and transforms are located in src/glossary:
- Glossary files have the form ebt_SRCLANG_DSTLANG.kvg
- Transform files have the form transform_SRCLANG.json
Glossary
Glossary files are turned into DeepL Glossaries using the DeepL API.
An uploaded glossary always replaces the previous version,
so you'll always have a current glossary for your translations.
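The replace-on-upload behavior can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual code: `replaceGlossary` and `makeFakeClient` are hypothetical names, the client is an in-memory stand-in with methods loosely modeled on a typical DeepL client (`listGlossaries`, `deleteGlossary`, `createGlossary`), and the glossary entries are placeholders.

```javascript
// Hypothetical in-memory stand-in for a DeepL client, for illustration only.
function makeFakeClient() {
  const store = new Map(); // glossaryId -> glossary info
  let nextId = 1;
  return {
    async listGlossaries() {
      return [...store.values()];
    },
    async deleteGlossary(glossaryId) {
      store.delete(glossaryId);
    },
    async createGlossary(name, srcLang, dstLang, entries) {
      const info = { glossaryId: `g${nextId++}`, name, srcLang, dstLang, entries };
      store.set(info.glossaryId, info);
      return info;
    },
  };
}

// Assumed approach: delete every glossary with the same name, then create the new one.
async function replaceGlossary(client, name, srcLang, dstLang, entries) {
  const existing = await client.listGlossaries();
  for (const glossary of existing) {
    if (glossary.name === name) {
      await client.deleteGlossary(glossary.glossaryId);
    }
  }
  return client.createGlossary(name, srcLang, dstLang, entries);
}

async function demo() {
  const client = makeFakeClient();
  // Placeholder entries; real glossaries come from the .kvg files.
  await replaceGlossary(client, 'ebt_en_pt', 'en', 'pt', { SRC_TERM: 'DST_TERM_V1' });
  await replaceGlossary(client, 'ebt_en_pt', 'en', 'pt', { SRC_TERM: 'DST_TERM_V2' });
  const all = await client.listGlossaries();
  console.log(all.length, all[0].entries.SRC_TERM); // prints: 1 DST_TERM_V2
}

demo();
```

Keying the replacement on the glossary name means a second upload never leaves a stale copy behind, which matches the "always current" behavior described above.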
Transform files
Transform files are handled only by EBT-DeepL.
Transform files are written in JSON:
{
  "Mönch oder eine Nonne": "Moench",
  "SRC_PATTERN": "REPLACEMENT"
}
Each entry in a transform file consists of a key/value pair.
The key will be converted into a
JavaScript regular expression pattern.
The value is the replacement.
If a transform file exists for a source language,
all source text read by EBT-DeepL will be
transformed by replacing text matching
the regular expressions with their corresponding replacements.
A transform is useful when DeepL glossaries fail.
For example, DeepL ignores "Mönch" glossary entries,
but happily accepts and converts "Moench" glossary entries.
A transform is a useful hack that can steer DeepL when it feels stubborn.
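The transform step described above might look roughly like the following. This is a minimal sketch rather than EBT-DeepL's actual implementation: `applyTransforms` is a hypothetical name, and the global (`g`) flag is an assumption made so that every match in the source text is replaced.

```javascript
// Hypothetical helper: apply a transform table (as parsed from a
// transform_SRCLANG.json file) to a piece of source text.
function applyTransforms(text, transforms) {
  let result = text;
  for (const [pattern, replacement] of Object.entries(transforms)) {
    // Each key is compiled as a regular expression.
    // The "g" flag is an assumption: replace all matches, not just the first.
    result = result.replace(new RegExp(pattern, 'g'), replacement);
  }
  return result;
}

const transforms = { 'Mönch oder eine Nonne': 'Moench' };
const source = 'Wenn ein Mönch oder eine Nonne dies tut';
console.log(applyTransforms(source, transforms));
// prints: Wenn ein Moench dies tut
```

Because the keys are regular expressions, characters such as `.` or `(` in a key act as pattern syntax and would need escaping to match literally.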