🤗 Models | 📊 Datasets | 📃 Paper
triple-encoders is a library for contextualizing distributed Sentence Transformers representations. At inference, triple encoders can be used for retrieval-based sequence modeling via sequential modular late interaction: representations are encoded separately, and the contextualization is weightless (no additional parameters are applied at inference).
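The idea can be illustrated with a minimal sketch (a hypothetical illustration with random vectors, not the library's internal code): each utterance is encoded independently, and context states are formed by simply averaging the precomputed embeddings, so no extra weights are needed at inference.

```python
import numpy as np

def encode(utterances):
    """Stand-in for a Sentence Transformers encoder: returns one
    unit-normalized vector per utterance (random here for illustration)."""
    rng = np.random.default_rng(0)
    embs = rng.normal(size=(len(utterances), 8))
    return embs / np.linalg.norm(embs, axis=1, keepdims=True)

def contextualize(history_embs):
    """Parameter-free ('weightless') contextualization: combine the
    independently encoded history vectors by averaging them."""
    return history_embs.mean(axis=0)

history = encode(["Hi!", "Hey, how are you?"])
candidates = encode(["I am fine.", "The weather is nice."])

context = contextualize(history)
# Score candidates by cosine similarity with the contextualized state.
scores = candidates @ context / np.linalg.norm(context)
best = int(np.argmax(scores))
```

Because each utterance is encoded only once and combined by averaging, candidate embeddings can be precomputed and indexed, which is what makes the late-interaction retrieval cheap.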
triple-encoders can be used with any Sentence Transformers model. This means you can model multilingual sequences simply by training on a multilingual model checkpoint. You can install triple-encoders via pip:
pip install triple-encoders
Note that triple-encoders requires Python 3.6 or higher.
Our experiments for sequence modeling and short-term planning conducted in the paper can be found in the notebooks folder. The hyperparameters that we used for training are the default parameters in the trainer.py file.
We provide an example of how to use triple-encoders for conversational sequence modeling (response selection) with two dialog speakers. If you want to use triple-encoders for other sequence modeling tasks, you can use the TripleEncodersForSequenceModeling class.
from triple_encoders.TripleEncodersForConversationalSequenceModeling import TripleEncodersForConversationalSequenceModeling

# path to a trained triple-encoders checkpoint
triple_path = ''

# load model
model = TripleEncodersForConversationalSequenceModeling(triple_path)
# load candidates for response selection
candidates = ['I am doing great too!', 'Where did you go?', 'ACL is an interesting conference']
# load candidates and store index
model.load_candidates_from_strings(candidates, output_directory_candidates_dump='output/path/to/save/candidates')
# create a sequence
sequence = model.contextualize_sequence(["Hi!", "Hey, how are you?"], k_last_rows=2)
# model sequence (compute scores for candidates)
sequence = model.sequence_modeling(sequence)
# retrieve utterance from dialog partner
new_utterance = "I'm fine, thanks. How are you?"
# pass it to the model with dialog_partner=True
sequence = model.contextualize_utterance(new_utterance, sequence, dialog_partner=True)
# model sequence (compute scores for candidates)
sequence = model.sequence_modeling(sequence)
# retrieve candidates to provide a response
response = model.retrieve_candidates(sequence, 3)
response
#(['I am doing great too!','Where did you go?', 'ACL is an interesting conference'],
# tensor([0.4944, 0.2392, 0.0483]))
Evaluation:
from datasets import load_dataset
dataset = load_dataset("daily_dialog")
test = dataset['test']['dialog']
df = model.evaluate_seq_dataset(test, k_last_rows=2)
df
# pandas dataframe with the average rank for each history length
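For reference, the average-rank metric reported per history length can be sketched as follows (a hypothetical illustration of the metric, not the library's implementation): for each turn, the ground-truth next utterance is ranked among all candidates by score, and ranks are averaged.

```python
import numpy as np

def rank_of_true(scores, true_idx):
    """1-based rank of the ground-truth candidate under descending scores
    (lower is better; rank 1 means the model's top choice was correct)."""
    order = np.argsort(-np.asarray(scores))
    return int(np.where(order == true_idx)[0][0]) + 1

# Toy example: three candidates, ground truth is index 1 with the best score.
scores = [0.2, 0.9, 0.5]
rank = rank_of_true(scores, 1)  # -> 1
```

Averaging such ranks over all test turns with the same history length yields one row of the returned dataframe.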
Short-term planning enables you to re-rank candidate replies from LLMs to reach a goal utterance over multiple turns.
from triple_encoders.TripleEncodersForSTP import TripleEncodersForSTP
model = TripleEncodersForSTP(triple_path)
context = ['Hey, how are you ?',
'I am good, how about you ?',
'I am good too.']
candidates = ['Want to eat something out ?',
'Want to go for a walk ?']
goal = ' I am hungry.'
result = model.short_term_planning(candidates, goal, context)
result
# 'Want to eat something out ?'
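Conceptually, short-term planning re-ranks candidate replies by how well they steer the dialog toward the goal utterance. A hedged sketch with generic embeddings (the function name and the plain cosine scoring rule are illustrative assumptions, not the paper's exact curved scoring):

```python
import numpy as np

def pick_toward_goal(cand_embs, goal_emb):
    """Choose the candidate whose embedding is closest (cosine) to the goal."""
    cand = cand_embs / np.linalg.norm(cand_embs, axis=1, keepdims=True)
    goal = goal_emb / np.linalg.norm(goal_emb)
    return int(np.argmax(cand @ goal))

# Toy 2-D embeddings: candidate 0 points roughly in the goal's direction.
cands = np.array([[1.0, 0.1], [0.0, 1.0]])
goal = np.array([0.9, 0.2])
choice = pick_toward_goal(cands, goal)  # -> 0
```

In the library, the candidates would typically come from an LLM's sampled replies, and the scoring additionally conditions on the dialog context.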
from datasets import load_dataset
from triple_encoders.TripleEncodersForSTP import TripleEncodersForSTP
dataset = load_dataset("daily_dialog")
test = dataset['test']['dialog']
model = TripleEncodersForSTP(triple_path, llm_model_name_or_path='your favorite large language model')
df = model.evaluate_stp_dataset(test)
# pandas dataframe with the average rank and Hits@k for each history length, goal_distance
You can train your own triple encoders with Contextualized Curved Contrastive Learning (C3L) using our trainer.
The hyperparameters that we used for training are the default parameters in the trainer.py file.
Note that we pre-trained our best model with Curved Contrastive Learning (CCL) (from imaginaryNLP) before training with C3L.
from triple_encoders.trainer import TripleEncoderTrainer
from datasets import load_dataset
dataset = load_dataset("daily_dialog")
trainer = TripleEncoderTrainer(base_model_name_or_path='your-base-model',  # placeholder: any Sentence Transformers checkpoint
                               batch_size=48,
                               observation_window=5,
                               speaker_token=True,  # used for conversational sequence modeling
                               num_epochs=3,
                               warmup_steps=10000)
trainer.generate_datasets(
dataset["train"]["dialog"],
dataset["validation"]["dialog"],
dataset["test"]["dialog"],
)
trainer.train("output/path/to/save/model")
If you use triple-encoders in your research, please cite the following paper:
@article{anonymous,
  title={Triple Encoders: Representations That Fire Together, Wire Together},
  author={Justus-Jonas Erker and Florian Mai and Nils Reimers and Gerasimos Spanakis and Iryna Gurevych},
  journal={arXiv},
  year={2024}
}
Contact person: Justus-Jonas Erker, justus-jonas.erker@tu-darmstadt.de
https://www.ukp.tu-darmstadt.de/
Don't hesitate to send us an e-mail or report an issue if something is broken (and it shouldn't be) or if you have further questions. This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
triple-encoders is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.
This package is based upon imaginaryNLP and Sentence Transformers.