transformers-rb

  • 0.1.6
  • Rubygems
Transformers.rb

🙂 State-of-the-art transformers for Ruby

For fast inference, check out Informers 🔥

Installation

First, install Torch.rb.

Then add this line to your application’s Gemfile:

gem "transformers-rb"

Getting Started

Models

Embedding

Sparse embedding

Reranking

sentence-transformers/all-MiniLM-L6-v2

Docs

sentences = ["This is an example sentence", "Each sentence is converted"]

model = Transformers.pipeline("embedding", "sentence-transformers/all-MiniLM-L6-v2")
embeddings = model.(sentences)

sentence-transformers/multi-qa-MiniLM-L6-cos-v1

Docs

query = "How many people live in London?"
docs = ["Around 9 Million people live in London", "London is known for its financial district"]

model = Transformers.pipeline("embedding", "sentence-transformers/multi-qa-MiniLM-L6-cos-v1")
query_embedding = model.(query)
doc_embeddings = model.(docs)
scores = doc_embeddings.map { |e| e.zip(query_embedding).sum { |d, q| d * q } }
doc_score_pairs = docs.zip(scores).sort_by { |d, s| -s }
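
The scoring step above is a plain dot product between each document embedding and the query embedding, followed by a descending sort. As a minimal sketch that runs without downloading a model, the same logic can be factored into two helpers. The names `dot` and `rank` and the three-dimensional vectors are made up for illustration; real embeddings come from the pipeline and have several hundred dimensions.

```ruby
# Dot product of two equal-length vectors.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Pair each document with its score against the query, highest score first.
def rank(docs, doc_embeddings, query_embedding)
  docs.zip(doc_embeddings)
      .map { |doc, emb| [doc, dot(emb, query_embedding)] }
      .sort_by { |_, score| -score }
end

docs = ["Around 9 Million people live in London", "London is known for its financial district"]

# Hypothetical stand-ins for model output, not real embeddings.
doc_embeddings = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
query_embedding = [1.0, 0.0, 0.0]

ranked = rank(docs, doc_embeddings, query_embedding)
ranked.each { |doc, score| puts format("%.2f  %s", score, doc) }
```

A dot product equals cosine similarity only when both vectors have unit length, so this ranking assumes the embeddings are normalized.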

sentence-transformers/all-mpnet-base-v2

Docs

sentences = ["This is an example sentence", "Each sentence is converted"]

model = Transformers.pipeline("embedding", "sentence-transformers/all-mpnet-base-v2")
embeddings = model.(sentences)

sentence-transformers/paraphrase-MiniLM-L6-v2

Docs

sentences = ["This is an example sentence", "Each sentence is converted"]

model = Transformers.pipeline("embedding", "sentence-transformers/paraphrase-MiniLM-L6-v2")
embeddings = model.(sentences)

mixedbread-ai/mxbai-embed-large-v1

Docs

query_prefix = "Represent this sentence for searching relevant passages: "

input = [
  "The dog is barking",
  "The cat is purring",
  query_prefix + "puppy"
]

model = Transformers.pipeline("embedding", "mixedbread-ai/mxbai-embed-large-v1")
embeddings = model.(input)
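
In the example above, only the search query carries the prefix; the documents are passed through unchanged. A small helper can make that asymmetry explicit. This is a sketch under that assumption, and `build_input` is a hypothetical name, not part of the library.

```ruby
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

# Returns the documents unchanged, followed by the queries with the prefix applied.
def build_input(docs, queries, query_prefix: QUERY_PREFIX)
  docs + queries.map { |q| query_prefix + q }
end

input = build_input(["The dog is barking", "The cat is purring"], ["puppy"])
puts input
```

The resulting array can be passed to the pipeline exactly like the hand-written `input` above.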

thenlper/gte-small

Docs

sentences = ["That is a happy person", "That is a very happy person"]

model = Transformers.pipeline("embedding", "thenlper/gte-small")
embeddings = model.(sentences)
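
A common next step with sentence embeddings like these is to compare them by cosine similarity. The helper below is a minimal pure-Ruby sketch; `cosine_similarity` is a name chosen here rather than a library method, and the short vectors are placeholders for real embeddings.

```ruby
# Cosine similarity: dot product divided by the product of the vector lengths.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  dot / (norm_a * norm_b)
end

# Placeholder vectors: b points the same way as a, c is orthogonal to a.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]
c = [-3.0, 0.0, 1.0]

puts cosine_similarity(a, b)
puts cosine_similarity(a, c)
```

With real output, this would be called as `cosine_similarity(embeddings[0], embeddings[1])`, assuming the pipeline returns one array of floats per sentence.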

intfloat/e5-base-v2

Docs

doc_prefix = "passage: "
query_prefix = "query: "

input = [
  doc_prefix + "Ruby is a programming language created by Matz",
  query_prefix + "Ruby creator"
]

model = Transformers.pipeline("embedding", "intfloat/e5-base-v2")
embeddings = model.(input)

BAAI/bge-base-en-v1.5

Docs

query_prefix = "Represent this sentence for searching relevant passages: "

input = [
  "The dog is barking",
  "The cat is purring",
  query_prefix + "puppy"
]

model = Transformers.pipeline("embedding", "BAAI/bge-base-en-v1.5")
embeddings = model.(input)

Snowflake/snowflake-arctic-embed-m-v1.5

Docs

query_prefix = "Represent this sentence for searching relevant passages: "

input = [
  "The dog is barking",
  "The cat is purring",
  query_prefix + "puppy"
]

model = Transformers.pipeline("embedding", "Snowflake/snowflake-arctic-embed-m-v1.5")
embeddings = model.(input, pooling: "cls")
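
The `pooling: "cls"` option selects how per-token vectors are reduced to one vector per input. As a rough illustration of the usual meaning of the two common strategies, CLS pooling keeps the first token's vector, while mean pooling averages the vectors of non-padding tokens. The sketch below works on plain arrays with made-up numbers; the library operates on tensors, and its internals may differ.

```ruby
# token_vectors holds one vector per token for a single input.
# The first row stands in for the [CLS] token.
def cls_pool(token_vectors)
  token_vectors.first
end

# Average only the tokens whose attention mask is 1 (padding is skipped).
def mean_pool(token_vectors, attention_mask)
  kept = token_vectors.zip(attention_mask).select { |_, m| m == 1 }.map(&:first)
  kept.transpose.map { |column| column.sum / kept.length }
end

# Hypothetical values: two real tokens and one padding token.
token_vectors = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
attention_mask = [1, 1, 0]

p cls_pool(token_vectors)                   # => [1.0, 0.0]
p mean_pool(token_vectors, attention_mask)  # => [2.0, 1.0]
```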

opensearch-project/opensearch-neural-sparse-encoding-v1

Docs

docs = ["The dog is barking", "The cat is purring", "The bear is growling"]

model_id = "opensearch-project/opensearch-neural-sparse-encoding-v1"
model = Transformers::AutoModelForMaskedLM.from_pretrained(model_id)
tokenizer = Transformers::AutoTokenizer.from_pretrained(model_id)
special_token_ids = tokenizer.special_tokens_map.map { |_, token| tokenizer.vocab[token] }

feature = tokenizer.(docs, padding: true, truncation: true, return_tensors: "pt", return_token_type_ids: false)
output = model.(**feature)[0]

values, _ = Torch.max(output * feature[:attention_mask].unsqueeze(-1), dim: 1)
values = Torch.log(1 + Torch.relu(values))
values[0.., special_token_ids] = 0
embeddings = values.to_a
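
Each row of `embeddings` has one weight per vocabulary entry, and most of those weights are zero. For indexing or inspection, it is often more convenient to keep only the non-zero entries as a token-to-weight hash. The sketch below assumes an array that maps token ids to token strings; `sparse_terms`, the four-entry vocabulary, and the weights are illustrative stand-ins, not library output.

```ruby
# Convert a dense, vocabulary-sized row into { token => weight } for non-zero weights.
def sparse_terms(values, id_to_token)
  values.each_with_index
        .select { |weight, _| weight > 0 }
        .to_h { |weight, id| [id_to_token[id], weight] }
end

# Hypothetical toy vocabulary and one row of weights.
id_to_token = ["[CLS]", "dog", "bark", "cat"]
row = [0.0, 1.2, 0.7, 0.0]

terms = sparse_terms(row, id_to_token)
p terms  # => {"dog"=>1.2, "bark"=>0.7}
```

With the real tokenizer, the id-to-token lookup could be built by inverting `tokenizer.vocab`, assuming it is a hash from token to id as its use in the example above suggests.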

mixedbread-ai/mxbai-rerank-base-v1

Docs

query = "How many people live in London?"
docs = ["Around 9 Million people live in London", "London is known for its financial district"]

model = Transformers.pipeline("reranking", "mixedbread-ai/mxbai-rerank-base-v1")
result = model.(query, docs)

BAAI/bge-reranker-base

Docs

query = "How many people live in London?"
docs = ["Around 9 Million people live in London", "London is known for its financial district"]

model = Transformers.pipeline("reranking", "BAAI/bge-reranker-base")
result = model.(query, docs)

Pipelines

Text

Embedding

embed = Transformers.pipeline("embedding")
embed.("We are very happy to show you the 🤗 Transformers library.")

Reranking

rerank = Transformers.pipeline("reranking")
rerank.("Who created Ruby?", ["Matz created Ruby", "Another doc"])

Named-entity recognition

ner = Transformers.pipeline("ner")
ner.("Ruby is a programming language created by Matz")

Sentiment analysis

classifier = Transformers.pipeline("sentiment-analysis")
classifier.("We are very happy to show you the 🤗 Transformers library.")

Question answering

qa = Transformers.pipeline("question-answering")
qa.(question: "Who invented Ruby?", context: "Ruby is a programming language created by Matz")

Feature extraction

extractor = Transformers.pipeline("feature-extraction")
extractor.("We are very happy to show you the 🤗 Transformers library.")

Vision

Image classification

classifier = Transformers.pipeline("image-classification")
classifier.("image.jpg")

Image feature extraction

extractor = Transformers.pipeline("image-feature-extraction")
extractor.("image.jpg")

API

This library follows the Transformers Python API. The following model architectures are currently supported:

  • BERT
  • DeBERTa-v2
  • DistilBERT
  • MPNet
  • ViT
  • XLM-RoBERTa

History

View the changelog

Contributing

Everyone is encouraged to help improve this project.

To get started with development:

git clone https://github.com/ankane/transformers-ruby.git
cd transformers-ruby
bundle install
bundle exec rake download:files
bundle exec rake test

Package last updated on 29 Dec 2024