mistral-common 1.5.1 · PyPI · Maintainers: 2
Mistral Common

What is it?

mistral-common is a set of tools to help you work with Mistral models.

Our first release contains tokenization. Our tokenizers go beyond the usual text <-> tokens conversion, adding parsing of tools and structured conversations. We also release the validation and normalization code that is used in our API.

We are releasing three versions of our tokenizer, each powering a different set of models.

Open Model                   Tokenizer
Mistral 7B Instruct v0.1     v1
Mistral 7B Instruct v0.2     v1
Mistral 7B Instruct v0.3     v3
Mixtral 8x7B Instruct v0.1   v1
Mixtral 8x22B Instruct v0.1  v3
Mixtral 8x22B Instruct v0.3  v3
Codestral 22B v0.1           v3
Codestral Mamba 7B v0.1      v3
Mathstral 7B v0.1            v3
Nemo 12B 2407                v3 - Tekken
Large 123B 2407              v3
Endpoint Model         Tokenizer
mistral-embed          v1
open-mistral-7b        v3
open-mixtral-8x7b      v1
open-mixtral-8x22b     v3
mistral-small-latest   v2
mistral-large-latest   v3
codestral-22b          v3
open-codestral-mamba   v3
open-mistral-nemo      v3 - Tekken
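As a plain-Python illustration of the endpoint table above, the model-to-tokenizer mapping can be written as a dictionary. Note this dict is our own sketch mirroring the table, not an object exported by mistral-common (the library resolves the tokenizer version internally via MistralTokenizer.from_model):

```python
# Sketch only: a hand-written mapping that mirrors the endpoint-model table.
# This is illustrative and is NOT part of the mistral-common API.
ENDPOINT_TOKENIZER_VERSION = {
    "mistral-embed": "v1",
    "open-mistral-7b": "v3",
    "open-mixtral-8x7b": "v1",
    "open-mixtral-8x22b": "v3",
    "mistral-small-latest": "v2",
    "mistral-large-latest": "v3",
    "codestral-22b": "v3",
    "open-codestral-mamba": "v3",
    "open-mistral-nemo": "v3 - Tekken",
}

def tokenizer_version(model: str) -> str:
    """Return the tokenizer version for a known endpoint model name."""
    try:
        return ENDPOINT_TOKENIZER_VERSION[model]
    except KeyError:
        raise ValueError(f"Unknown model: {model}") from None
```

For example, `tokenizer_version("codestral-22b")` returns `"v3"`, matching the table row above.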

Installation

pip

You can install mistral-common via pip:

pip install mistral-common

From Source

Alternatively, you can install from source. This repo uses poetry to manage dependencies and virtual environments.

You can install poetry with:

pip install poetry

poetry will set up a virtual environment and install the dependencies when you run:

poetry install

Examples

# Import needed packages:
from mistral_common.protocol.instruct.messages import (
    UserMessage,
)
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.protocol.instruct.tool_calls import (
    Function,
    Tool,
)
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

# Load Mistral tokenizer

model_name = "open-mixtral-8x22b"

tokenizer = MistralTokenizer.from_model(model_name)

# Tokenize a list of messages
tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        tools=[
            Tool(
                function=Function(
                    name="get_current_weather",
                    description="Get the current weather",
                    parameters={
                        "type": "object",
                        "properties": {
                            "location": {
                                "type": "string",
                                "description": "The city and state, e.g. San Francisco, CA",
                            },
                            "format": {
                                "type": "string",
                                "enum": ["celsius", "fahrenheit"],
                                "description": "The temperature unit to use. Infer this from the users location.",
                            },
                        },
                        "required": ["location", "format"],
                    },
                )
            )
        ],
        messages=[
            UserMessage(content="What's the weather like today in Paris"),
        ],
        model=model_name,
    )
)
tokens, text = tokenized.tokens, tokenized.text

# Count the number of tokens
print(len(tokens))
