llmtranslate

PyPI version: 0.7.20 · License: MIT

The llmtranslate library is a Python tool that leverages LangChain's ChatModels 🦜🔗 to translate text between languages and recognize the language of a given text. This library provides both synchronous and asynchronous translators to accommodate various use cases.

📖 Documentation: llm-translate.com

Installation

To ensure proper dependency management, install the required libraries in the following order:

  1. Install the LangChain chat integration for your preferred LLM. For example:

     pip install langchain-openai

     For other LLM integrations, refer to the LangChain ChatModels documentation.

  2. Install the llmtranslate library:

     pip install llmtranslate

Note for Poetry users

If you’re using Poetry, you must set a Python requirement capped below 4.0 to match langchain-openai (which requires <4.0). So, if your pyproject.toml has:

python = ">=3.12"

replace it with:

python = "^3.12"

or:

python = ">=3.12,<4.0"

Otherwise, Poetry sees Python 4.x as valid, which conflicts with langchain-openai’s <4.0 requirement.
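Taken together, a hypothetical dependency section of pyproject.toml that satisfies this constraint might look like the following (the version specifiers here are illustrative, not prescriptive):

```toml
[tool.poetry.dependencies]
python = ">=3.12,<4.0"
llmtranslate = "^0.7.20"
langchain-openai = "^0.2"
```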

Example Using OpenAI's GPT

Using Translator (Synchronous)

from llmtranslate import Translator
from langchain_openai import ChatOpenAI

# Initialize the LLM and Translator
llm = ChatOpenAI(model_name="gpt-4o", openai_api_key="your_openai_api_key")
translator = Translator(llm=llm)

# Detect the language of the text
text_language = translator.get_text_language("Hi how are you?")
if text_language:
    print(text_language.ISO_639_1_code) # Output: en
    print(text_language.ISO_639_2_code) # Output: eng
    print(text_language.ISO_639_3_code) # Output: eng
    print(text_language.language_name) # Output: English

# Translate text
translated_text = translator.translate("Bonjour tout le monde", "English")
print(translated_text)  # Output: Hello everyone

Using AsyncTranslator (Asynchronous)

This example demonstrates how to use the AsyncTranslator class to asynchronously detect the language of a text and translate it into another language. AsyncTranslator is designed for asynchronous operation: it splits the input text into manageable chunks and issues multiple concurrent requests to your language model.

Key Configuration Options

  • max_translation_chunk_length: Sets the maximum length for each text chunk that is translated.
  • max_translation_chunk_length_multilang: Sets the maximum length for text chunks when dealing with multiple languages.
  • max_concurrent_llm_calls: Limits the number of concurrent calls made to the underlying language model.
  • max_concurrent_time_period: (New) Limits the maximum number of concurrent LLM calls within a specific time period (in seconds). For instance, if you want to allow up to 50 requests per minute, set:
    • max_concurrent_llm_calls to 50
    • max_concurrent_time_period to 60

If you only need to limit the number of concurrent calls without a time constraint, you can leave max_concurrent_time_period at its default value (typically 0).
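To build intuition for how max_concurrent_llm_calls and max_concurrent_time_period interact, here is a minimal standalone sketch of a time-windowed limiter in plain asyncio. It is purely illustrative: the WindowedLimiter class and its internals are hypothetical, not llmtranslate's actual implementation. It only demonstrates the "at most N calls per T seconds" behavior that the two parameters describe.

```python
import asyncio
import time

class WindowedLimiter:
    """Allow at most max_calls entries per `period` seconds.

    Hypothetical sketch for illustration, not llmtranslate's internals.
    """

    def __init__(self, max_calls: int, period: float) -> None:
        self.max_calls = max_calls
        self.period = period
        self._stamps: list[float] = []
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        while True:
            async with self._lock:
                now = time.monotonic()
                # Forget calls that have aged out of the window
                self._stamps = [t for t in self._stamps if now - t < self.period]
                if len(self._stamps) < self.max_calls:
                    self._stamps.append(now)
                    return
                # Sleep until the oldest call leaves the window, then retry
                wait = self.period - (now - self._stamps[0])
            await asyncio.sleep(wait)

async def main() -> list[int]:
    limiter = WindowedLimiter(max_calls=3, period=0.2)  # 3 "calls" per 0.2 s
    finished: list[int] = []

    async def fake_llm_call(i: int) -> None:
        await limiter.acquire()  # gate each request through the limiter
        finished.append(i)

    await asyncio.gather(*(fake_llm_call(i) for i in range(6)))
    return finished

results = asyncio.run(main())
print(sorted(results))  # [0, 1, 2, 3, 4, 5]
```

With these numbers, the first three tasks pass immediately and the remaining three wait roughly one window before completing, which is the same shape of behavior as setting max_concurrent_llm_calls=50 with max_concurrent_time_period=60.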

Code Example

import asyncio
from llmtranslate import AsyncTranslator
from langchain_openai import ChatOpenAI

# Initialize the language model (LLM) with your API key
llm = ChatOpenAI(model_name="gpt-4o", openai_api_key="your_openai_api_key")

async def translate_text():
    # Initialize AsyncTranslator with appropriate parameters
    translator = AsyncTranslator(
        llm=llm,
        max_translation_chunk_length=100,           # Maximum chunk length for translation
        max_translation_chunk_length_multilang=50,    # Maximum chunk length for multi-language texts
        max_concurrent_llm_calls=10,                  # Maximum number of concurrent LLM calls
        # To enforce a rate limit (e.g., 50 calls per minute), uncomment and adjust the following:
        # max_concurrent_llm_calls=50,
        # max_concurrent_time_period=60,
    )
    
    # Prepare tasks: one to detect language and one to translate the text
    tasks = [
        translator.get_text_language("Hi how are you?"),
        translator.translate("Hi how are you?", "Spanish")
    ]
    
    # Run the tasks concurrently
    results = await asyncio.gather(*tasks)
    
    # Retrieve and display detected language information
    text_language = results[0]
    if text_language:
        print(text_language.ISO_639_1_code)  # Expected Output: en
        print(text_language.ISO_639_2_code)  # Expected Output: eng
        print(text_language.ISO_639_3_code)  # Expected Output: eng
        print(text_language.language_name)   # Expected Output: English

    # Display the translated text
    print(results[1])  # Expected Output: Hola, ¿cómo estás?

# Run the asynchronous translation process
asyncio.run(translate_text())

Summary

  • Asynchronous Execution: Uses asyncio to concurrently run language detection and translation tasks.
  • Language Detection: The get_text_language function returns language details (e.g., ISO 639-1, ISO 639-2, ISO 639-3 codes and language name).
  • Translation: The translate function converts the input text to the desired target language.
  • Concurrency Control:
    • Without Time Limit: Simply set max_concurrent_llm_calls to limit concurrent calls.
    • With Time Limit: Specify both max_concurrent_llm_calls and max_concurrent_time_period to control the number of calls within a specific time window (e.g., 50 calls per 60 seconds).

This example should help you understand how to integrate and configure all available parameters in the AsyncTranslator, including the new time-based rate limiting feature.

Key Parameters

max_translation_chunk_length

  • Description: Defines the maximum length (in characters) of a text chunk to be translated in a single call when the text is in one language.
  • Recommendation: If translations are not accurate, try reducing this value, as weaker LLMs struggle with long chunks of text.

max_translation_chunk_length_multilang

  • Description: Defines the maximum length (in characters) of a text chunk when the text contains multiple languages.
  • Recommendation: Reduce this value for better accuracy with multi-language inputs.
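As a rough illustration of what a character-based chunk limit means in practice, the following hypothetical splitter (not the library's actual chunking logic, which may split differently) breaks text into whitespace-separated chunks of at most max_len characters:

```python
def chunk_text(text: str, max_len: int) -> list[str]:
    """Greedy whitespace splitter: each chunk holds as many whole words
    as fit within max_len characters. Illustrative only."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_len:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single word longer than max_len becomes its own chunk
            current = word
    if current:
        chunks.append(current)
    return chunks

print(chunk_text("one two three four five six", max_len=10))
# ['one two', 'three four', 'five six']
```

Lowering max_len means each LLM call sees a shorter, easier input, at the cost of more calls overall; that trade-off is what the two chunk-length parameters control.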

max_concurrent_llm_calls

  • Description: Limits the number of concurrent calls made to the LLM.
  • Default: 10
  • Usage: Adjust this parameter to align with your LLM provider's concurrency limits.

Other Supported LLM Examples

You can find more examples of LangChain chat models in the LangChain ChatModels documentation.

Anthropic's Claude

To create an instance of an Anthropic-based chat LLM:

  pip install langchain-anthropic
  pip install llmtranslate

from langchain_anthropic import ChatAnthropic
from llmtranslate import Translator

llm = ChatAnthropic(model="claude-2", anthropic_api_key="your_anthropic_api_key")
translator = Translator(llm=llm)

Mistral via API

To create an instance of a Chat LLM using the Mistral API:

  pip install -qU langchain-mistralai
  pip install llmtranslate

from langchain_mistralai import ChatMistralAI
from llmtranslate import Translator

llm = ChatMistralAI(mistral_api_key="your_mistral_api_key")
translator = Translator(llm=llm)

Supported Languages

llmtranslate supports all languages supported by GPT-4o. For a complete list of language codes, you can visit the ISO 639-1 website.

Here is a table showing which languages are supported by gpt-4o and gpt-4o-mini:

| Language Name    | Language Code | Supported by gpt-4o | Supported by gpt-4o-mini |
|------------------|---------------|---------------------|--------------------------|
| English          | en            | Yes                 | Yes                      |
| Mandarin Chinese | zh            | Yes                 | Yes                      |
| Hindi            | hi            | Yes                 | Yes                      |
| Spanish          | es            | Yes                 | Yes                      |
| French           | fr            | Yes                 | Yes                      |
| German           | de            | Yes                 | Yes                      |
| Russian          | ru            | Yes                 | Yes                      |
| Arabic           | ar            | Yes                 | Yes                      |
| Italian          | it            | Yes                 | Yes                      |
| Korean           | ko            | Yes                 | Yes                      |
| Punjabi          | pa            | Yes                 | Yes                      |
| Bengali          | bn            | Yes                 | Yes                      |
| Portuguese       | pt            | Yes                 | Yes                      |
| Indonesian       | id            | Yes                 | Yes                      |
| Urdu             | ur            | Yes                 | Yes                      |
| Persian (Farsi)  | fa            | Yes                 | Yes                      |
| Vietnamese       | vi            | Yes                 | Yes                      |
| Polish           | pl            | Yes                 | Yes                      |
| Samoan           | sm            | Yes                 | Yes                      |
| Thai             | th            | Yes                 | Yes                      |
| Ukrainian        | uk            | Yes                 | Yes                      |
| Turkish          | tr            | Yes                 | Yes                      |
| Maori            | mi            | No                  | No                       |
| Norwegian        | no            | Yes                 | Yes                      |
| Dutch            | nl            | Yes                 | Yes                      |
| Greek            | el            | Yes                 | Yes                      |
| Romanian         | ro            | Yes                 | Yes                      |
| Swahili          | sw            | Yes                 | Yes                      |
| Hungarian        | hu            | Yes                 | Yes                      |
| Hebrew           | he            | Yes                 | Yes                      |
| Swedish          | sv            | Yes                 | Yes                      |
| Czech            | cs            | Yes                 | Yes                      |
| Finnish          | fi            | Yes                 | Yes                      |
| Amharic          | am            | No                  | No                       |
| Tagalog          | tl            | Yes                 | Yes                      |
| Burmese          | my            | Yes                 | Yes                      |
| Tamil            | ta            | Yes                 | Yes                      |
| Kannada          | kn            | Yes                 | Yes                      |
| Pashto           | ps            | Yes                 | Yes                      |
| Yoruba           | yo            | Yes                 | Yes                      |
| Malay            | ms            | Yes                 | Yes                      |
| Haitian Creole   | ht            | Yes                 | Yes                      |
| Nepali           | ne            | Yes                 | Yes                      |
| Sinhala          | si            | Yes                 | Yes                      |
| Catalan          | ca            | Yes                 | Yes                      |
| Malagasy         | mg            | Yes                 | Yes                      |
| Latvian          | lv            | Yes                 | Yes                      |
| Lithuanian       | lt            | Yes                 | Yes                      |
| Estonian         | et            | Yes                 | Yes                      |
| Somali           | so            | Yes                 | Yes                      |
| Tigrinya         | ti            | No                  | No                       |
| Breton           | br            | No                  | No                       |
| Fijian           | fj            | Yes                 | No                       |
| Maltese          | mt            | Yes                 | Yes                      |
| Corsican         | co            | Yes                 | Yes                      |
| Luxembourgish    | lb            | Yes                 | Yes                      |
| Occitan          | oc            | Yes                 | Yes                      |
| Welsh            | cy            | Yes                 | Yes                      |
| Albanian         | sq            | Yes                 | Yes                      |
| Macedonian       | mk            | Yes                 | Yes                      |
| Icelandic        | is            | Yes                 | Yes                      |
| Slovenian        | sl            | Yes                 | Yes                      |
| Galician         | gl            | Yes                 | Yes                      |
| Basque           | eu            | Yes                 | Yes                      |
| Azerbaijani      | az            | Yes                 | Yes                      |
| Uzbek            | uz            | Yes                 | Yes                      |
| Kazakh           | kk            | Yes                 | Yes                      |
| Mongolian        | mn            | Yes                 | Yes                      |
| Tibetan          | bo            | No                  | No                       |
| Khmer            | km            | Yes                 | No                       |
| Lao              | lo            | Yes                 | Yes                      |
| Telugu           | te            | Yes                 | Yes                      |
| Marathi          | mr            | Yes                 | Yes                      |
| Chichewa         | ny            | Yes                 | Yes                      |
| Esperanto        | eo            | Yes                 | Yes                      |
| Kurdish          | ku            | No                  | No                       |
| Tajik            | tg            | Yes                 | Yes                      |
| Xhosa            | xh            | Yes                 | No                       |
| Yiddish          | yi            | Yes                 | Yes                      |
| Zulu             | zu            | Yes                 | Yes                      |
| Sundanese        | su            | Yes                 | Yes                      |
| Tatar            | tt            | Yes                 | Yes                      |
| Quechua          | qu            | No                  | No                       |
| Uighur           | ug            | No                  | No                       |
| Wolof            | wo            | No                  | No                       |
| Tswana           | tn            | Yes                 | Yes                      |

Additional Resources

Authors

License

llmtranslate is licensed under the MIT License. See the LICENSE file for more details.
