adaptor
 Metadata-Version: 2.1
 Name: adaptor
-Version: 0.2.2
+Version: 0.2.3
 Summary: Adaptor: Objective-centric Adaptation Framework for Language Models.
@@ -21,3 +21,3 @@ Home-page: https://github.com/gaussalgo/adaptor
-# Adapt𝒪r: Objective-centric Adaptation library
+# Adaptor: Objective-centric Adaptation library
@@ -38,3 +38,3 @@ [](https://github.com/gaussalgo/adaptor/actions)
 - [Benefits of Task and Domain Adaptation](#benefits-of-task-and-domain-adaptation)
-- [How Can Adapt𝒪r Help](#how-can-adaptor-help)
+- [How Can Adaptor Help](#how-can-adaptor-help)
 - [Usage](#usage)
@@ -45,3 +45,3 @@ - [Install](#usage)
 - [How to Contribute](CONTRIBUTING.md)
-- [Cite](#citing-adapt𝒪r)
+- [Cite](#citing-adaptor)
 </details>
@@ -156,2 +156,3 @@
 ```
+**Try this example** on real data, in [`tutorials/adapted_named_entity_recognition.ipynb`](tutorials/adapted_named_entity_recognition.ipynb)
@@ -165,3 +166,3 @@ #### Adapted Machine Translation
 ```python
-# 1. pick the models - randomly pre-initialize the appropriate heads
+# 1. pick the base model
 lang_module = LangModule("Helsinki-NLP/opus-mt-en-de")
@@ -204,8 +205,41 @@
 ```
-**Try this example** with training resources resolution from OPUS in `examples/machine_translation/train_wiki_adapt_bible.py`
-#### More examples
-You can find a few more exaples in [tutorials](tutorials), but contributions are welcome :) (see *[CONTRIBUTING.md](CONTRIBUTING.md)*)
+**Try this example** on real data, in [`tutorials/unsupervised_machine_translation.ipynb`](tutorials/unsupervised_machine_translation.ipynb)
+#### Single-objective training
+It also makes sense to use the comfort of Adaptor's high-level interface in simple use cases
+where a single objective is enough.
+```python
+# 1. pick the base model
+lang_module = LangModule(test_base_models["sequence_classification"])
+# 2. pick any objective - note that all objectives share an almost identical interface
+classification = SequenceClassification(lang_module=lang_module,
+                                        texts_or_path="tests/mock_data/supervised_texts.txt",
+                                        labels_or_path="tests/mock_data/supervised_texts_sequence_labels.txt",
+                                        batch_size=1)
+# 3. the schedule choice does not matter in single-objective training
+schedule = SequentialSchedule(objectives=[classification], args=training_arguments)
+# 4. train using Adapter
+adapter = Adapter(lang_module=lang_module, schedule=schedule, args=training_arguments)
+adapter.train()
+# 5. save the trained lang_module
+adapter.save_model("output_model")
+# 6. reload and use it like any other Hugging Face model
+classifier = AutoModelForSequenceClassification.from_pretrained("output_model/SequenceClassification")
+tokenizer = AutoTokenizer.from_pretrained("output_model/SequenceClassification")
+inputs = tokenizer("A piece of text to classify.", return_tensors="pt")
+output = classifier(**inputs)
+output_label_id = output.logits.argmax(-1)[0].item()
+print("Your new model predicted class: %s" % classifier.config.id2label[output_label_id])
+```
+**Try this example** on real data in [`tutorials/simple_sequence_classification.ipynb`](tutorials/simple_sequence_classification.ipynb)
+### More examples
+You can find more examples in [tutorials](tutorials). Your contributions are welcome :) (see *[CONTRIBUTING.md](CONTRIBUTING.md)*)
 ### Motivation for objective-centric training
@@ -235,3 +269,3 @@
-## Citing Adapt𝒪r
+## Citing Adaptor
@@ -238,0 +272,0 @@ If you use Adaptor in your research, please cite it as follows.
 torch>=1.7
-transformers<=4.19.1,>=4.10.2
+transformers<=4.30.2
 sentencepiece
+accelerate>=0.20.1
@@ -5,0 +6,0 @@ [examples]
@@ -82,3 +82,3 @@ import logging
-    def log(self, logs: List[Dict[str, float]]) -> None:
+    def log(self, logs: Dict[str, float]) -> None:
         is_eval_log = any(self.eval_metrics_prefix in log_key for log_key in logs)
@@ -85,0 +85,0 @@ extended_logs = self.schedule.objectives_log(split="eval" if is_eval_log else "train")
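The corrected override receives a single flat dict of scalar logs, and the `any(...)` expression above scans its keys for the evaluation prefix (iterating a dict yields its keys). A minimal sketch of that detection logic; the prefix value and the log dicts below are illustrative stand-ins, not Adaptor's actual data:

```python
# Sketch of the eval-log detection used in the hunk above.
# `eval_metrics_prefix` and the sample logs are hypothetical values.

def is_eval_log(logs: dict, eval_metrics_prefix: str = "eval") -> bool:
    # Iterating a dict yields its keys, so this scans the log names.
    return any(eval_metrics_prefix in log_key for log_key in logs)

train_logs = {"loss": 0.71, "learning_rate": 5e-5}
eval_logs = {"eval_loss": 0.64, "eval_BLEU": 21.3}

print(is_eval_log(train_logs))  # False
print(is_eval_log(eval_logs))   # True
```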
@@ -147,3 +147,3 @@ import logging
-    def forward(self, **inputs) -> torch.LongTensor:
+    def forward(self, return_loss: bool = True, **inputs) -> torch.LongTensor:
         """
@@ -155,3 +155,3 @@ Performs forward pass over the head identified by the sample's `oid`.
         try:
-            selected_head_model = self.trainable_models[str(inputs["oid"])]
+            selected_head_model = self.trainable_models[str(inputs["oid"].item())]
         except KeyError:
@@ -158,0 +158,0 @@ raise ValueError("Requesting inference with the objective having no registered head."
@@ -145,3 +145,3 @@ import abc
         loss_history = self.loss_history[split][-self.max_samples_per_log[split]:]
-        mean_loss = sum(loss_history) / len(loss_history) if len(loss_history) else 0
+        mean_loss = sum(loss_history) / len(loss_history) if len(loss_history) else float("inf")
         self.evaluations_history[split]["loss"].append(mean_loss)
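The hunk above changes the mean loss reported for an empty history from `0` to `float("inf")`. With a `0` default, an objective that has not logged any loss yet looks as if it had already reached a perfect loss, so the first real evaluation appears to be a regression. A simplified illustration; the `improved` check below is a hypothetical stand-in for a convergence test, not Adaptor's actual implementation:

```python
# Why the empty-history default matters for convergence checks.
# `improved` is a hypothetical best-so-far test, not Adaptor's code.

def mean_loss(history: list, default: float) -> float:
    # Mirrors the changed line: mean of the window, or `default` when empty.
    return sum(history) / len(history) if history else default

def improved(evaluations: list) -> bool:
    # Did the latest evaluation beat every earlier one?
    return evaluations[-1] < min(evaluations[:-1])

# With a 0 default, the first real loss (0.9) looks like a regression:
evals_zero = [mean_loss([], 0), mean_loss([0.9], 0)]
print(improved(evals_zero))  # False

# With float("inf"), any real loss counts as an improvement:
evals_inf = [mean_loss([], float("inf")), mean_loss([0.9], float("inf"))]
print(improved(evals_inf))   # True
```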
@@ -207,3 +207,3 @@
         logger.warning("Objective `%s` convergence metric `%s` did not improve for %s eval steps. History: %s" %
-                       (self, stopping_evaluator, patience, last_n))
+                       (self, stopping_evaluator, patience, self.evaluations_history["eval"][stopping_evaluator]))
@@ -304,3 +304,3 @@ return passed_patience_evals and did_not_improve
     def _add_oid(sample: Union[BatchEncoding, Dict[str, torch.LongTensor]]) -> Dict[str, torch.LongTensor]:
-        sample["oid"] = id(self)
+        sample["oid"] = torch.tensor(id(self))
        return sample
@@ -307,0 +307,0 @@
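`_add_oid` now stores the objective id as a tensor (so it survives default batch collation), and the matching lookups call `.item()` before building the registry key: stringifying a tensor yields `"tensor(N)"`, which can never match a key built from the plain integer. A sketch of the mismatch using a tiny stand-in class, since the exact behavior depends on `torch.Tensor.__str__`; `ScalarTensor` and the registry below are illustrative, not Adaptor's actual objects:

```python
# Why `.item()` is needed before keying the head registry by oid.
# `ScalarTensor` is a minimal stand-in for a 0-d torch tensor, assuming
# str(torch.tensor(N)) renders as "tensor(N)".

class ScalarTensor:
    def __init__(self, value: int):
        self.value = value
    def item(self) -> int:           # like torch.Tensor.item()
        return self.value
    def __str__(self) -> str:        # like str(torch.tensor(N))
        return "tensor(%d)" % self.value

oid = 140353162809616                  # a hypothetical id() of an objective
trainable_models = {str(oid): "head"}  # registry keyed by the plain int's str

wrapped = ScalarTensor(oid)            # what the collated batch delivers
print(str(wrapped) in trainable_models)         # False: "tensor(...)" misses
print(str(wrapped.item()) in trainable_models)  # True: matches the int key
```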
@@ -26,3 +26,3 @@ import abc
     objectives: Dict[str, Dict[int, Objective]]
-    objectives_outputs_queue: List[Tuple[str, int]]
+    objectives_outputs_queue: List[Tuple[str, torch.LongTensor]]
     converged_objectives: List[Objective]
@@ -181,3 +181,3 @@ should_stop: bool
         # the objective loss arrives aggregated into a single item
-        loss = self.objectives[split][oid].compute_loss(logit_outputs, labels, inputs, split)
+        loss = self.objectives[split][oid.item()].compute_loss(logit_outputs, labels, inputs, split)
@@ -184,0 +184,0 @@ return loss
@@ -1,2 +1,2 @@
-# Adapt𝒪r: Objective-centric Adaptation library
+# Adaptor: Objective-centric Adaptation library
@@ -17,3 +17,3 @@ [](https://github.com/gaussalgo/adaptor/actions)
 - [Benefits of Task and Domain Adaptation](#benefits-of-task-and-domain-adaptation)
-- [How Can Adapt𝒪r Help](#how-can-adaptor-help)
+- [How Can Adaptor Help](#how-can-adaptor-help)
 - [Usage](#usage)
@@ -24,3 +24,3 @@ - [Install](#usage)
 - [How to Contribute](CONTRIBUTING.md)
-- [Cite](#citing-adapt𝒪r)
+- [Cite](#citing-adaptor)
 </details>
@@ -135,2 +135,3 @@
 ```
+**Try this example** on real data, in [`tutorials/adapted_named_entity_recognition.ipynb`](tutorials/adapted_named_entity_recognition.ipynb)
@@ -144,3 +145,3 @@ #### Adapted Machine Translation
 ```python
-# 1. pick the models - randomly pre-initialize the appropriate heads
+# 1. pick the base model
 lang_module = LangModule("Helsinki-NLP/opus-mt-en-de")
@@ -183,8 +184,41 @@
 ```
-**Try this example** with training resources resolution from OPUS in `examples/machine_translation/train_wiki_adapt_bible.py`
-#### More examples
-You can find a few more exaples in [tutorials](tutorials), but contributions are welcome :) (see *[CONTRIBUTING.md](CONTRIBUTING.md)*)
+**Try this example** on real data, in [`tutorials/unsupervised_machine_translation.ipynb`](tutorials/unsupervised_machine_translation.ipynb)
+#### Single-objective training
+It also makes sense to use the comfort of Adaptor's high-level interface in simple use cases
+where a single objective is enough.
+```python
+# 1. pick the base model
+lang_module = LangModule(test_base_models["sequence_classification"])
+# 2. pick any objective - note that all objectives share an almost identical interface
+classification = SequenceClassification(lang_module=lang_module,
+                                        texts_or_path="tests/mock_data/supervised_texts.txt",
+                                        labels_or_path="tests/mock_data/supervised_texts_sequence_labels.txt",
+                                        batch_size=1)
+# 3. the schedule choice does not matter in single-objective training
+schedule = SequentialSchedule(objectives=[classification], args=training_arguments)
+# 4. train using Adapter
+adapter = Adapter(lang_module=lang_module, schedule=schedule, args=training_arguments)
+adapter.train()
+# 5. save the trained lang_module
+adapter.save_model("output_model")
+# 6. reload and use it like any other Hugging Face model
+classifier = AutoModelForSequenceClassification.from_pretrained("output_model/SequenceClassification")
+tokenizer = AutoTokenizer.from_pretrained("output_model/SequenceClassification")
+inputs = tokenizer("A piece of text to classify.", return_tensors="pt")
+output = classifier(**inputs)
+output_label_id = output.logits.argmax(-1)[0].item()
+print("Your new model predicted class: %s" % classifier.config.id2label[output_label_id])
+```
+**Try this example** on real data in [`tutorials/simple_sequence_classification.ipynb`](tutorials/simple_sequence_classification.ipynb)
+### More examples
+You can find more examples in [tutorials](tutorials). Your contributions are welcome :) (see *[CONTRIBUTING.md](CONTRIBUTING.md)*)
 ### Motivation for objective-centric training
@@ -214,3 +248,3 @@
-## Citing Adapt𝒪r
+## Citing Adaptor
@@ -217,0 +251,0 @@ If you use Adaptor in your research, please cite it as follows.
 [metadata]
-description-file = README.md
+description_file = README.md
 license_files = LICENSE
@@ -4,0 +4,0 @@
@@ -12,3 +12,3 @@ #!/usr/bin/env python
     name="adaptor",
-    version='0.2.2',
+    version='0.2.3',
     description="Adaptor: Objective-centric Adaptation Framework for Language Models.",
@@ -34,4 +34,5 @@ long_description_content_type="text/markdown",
         "torch>=1.7",
-        "transformers>=4.10.2,<=4.19.1",  # upper-closed on 4.19.1 for now, due to minor bug in eval loss logging
+        "transformers<=4.30.2",  # TODO upper-closed on 4.30.2: Problem with returning empty batches
         "sentencepiece",
+        "accelerate>=0.20.1"
     ],
@@ -38,0 +39,0 @@ test_require=[