sillytavern-transformers - npm Package Compare versions

Comparing version 2.14.6 to 2.17.0

{
"name": "sillytavern-transformers",
"version": "2.14.6",
"version": "2.17.0",
"description": "State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!",

@@ -42,4 +42,4 @@ "main": "./src/transformers.js",

"onnxruntime-web": "1.14.0",
"@huggingface/jinja": "^0.1.0",
"jimp": "^0.22.10"
"jimp": "^0.22.10",
"@huggingface/jinja": "^0.2.2"
},

@@ -54,3 +54,3 @@ "optionalDependencies": {

"jest-environment-node": "^29.5.0",
"jsdoc-to-markdown": "^8.0.0",
"jsdoc-to-markdown": "^8.0.1",
"typescript": "^5.2.2",

@@ -64,3 +64,9 @@ "wavefile": "^11.0.0",

"semver": "^7.5.4",
"protobufjs": "^7.2.4"
"jimp": {
"phin": "3.7.1"
},
"parse-bmfont-xml": {
"xml2js": "^0.5.0"
},
"protobufjs": "^7.2.6"
},

@@ -67,0 +73,0 @@ "files": [

@@ -104,3 +104,3 @@

<script type="module">
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.14.2';
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.17.0';
</script>
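For context, here is a minimal sketch (not part of the diff) of what can run once `pipeline` is imported from the updated CDN build shown above; the model is left to the library's default for the task:

```js
// Minimal sketch, assuming this runs inside the <script type="module"> block above.
// Build a sentiment-analysis pipeline and run it entirely in the browser.
const classifier = await pipeline('sentiment-analysis');
const output = await classifier('Transformers.js runs great in the browser!');
console.log(output); // e.g. [{ label: 'POSITIVE', score: 0.99... }]
```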

@@ -138,3 +138,3 @@ ```

By default, Transformers.js uses [hosted pretrained models](https://huggingface.co/models?library=transformers.js) and [precompiled WASM binaries](https://cdn.jsdelivr.net/npm/@xenova/transformers@2.14.2/dist/), which should work out-of-the-box. You can customize this as follows:
By default, Transformers.js uses [hosted pretrained models](https://huggingface.co/models?library=transformers.js) and [precompiled WASM binaries](https://cdn.jsdelivr.net/npm/@xenova/transformers@2.17.0/dist/), which should work out-of-the-box. You can customize this as follows:
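The customization options referenced by that sentence are not included in this diff; as an illustrative sketch, these are the kinds of settings the library's `env` object exposes (`allowRemoteModels`, `localModelPath`, and `backends.onnx.wasm.wasmPaths` are documented options; the paths below are placeholders):

```js
import { env } from '@xenova/transformers';

// Placeholder paths, for illustration only.
env.allowRemoteModels = false;               // don't fetch models from the Hugging Face Hub
env.localModelPath = '/models/';             // serve converted models from your own host
env.backends.onnx.wasm.wasmPaths = '/dist/'; // self-host the ONNX Runtime .wasm files
```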

@@ -203,3 +203,2 @@

|--------------------------|----|-------------|------------|
| [Conversational](https://huggingface.co/tasks/conversational) | `conversational` | Generating conversational text that is relevant, coherent and knowledgeable given a prompt. | ❌ |
| [Fill-Mask](https://huggingface.co/tasks/fill-mask) | `fill-mask` | Masking some of the words in a sentence and predicting which words should replace those masks. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.FillMaskPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=fill-mask&library=transformers.js) |

@@ -216,2 +215,3 @@ | [Question Answering](https://huggingface.co/tasks/question-answering) | `question-answering` | Retrieve the answer to a question from a given text. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.QuestionAnsweringPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=question-answering&library=transformers.js) |

| [Zero-Shot Classification](https://huggingface.co/tasks/zero-shot-classification) | `zero-shot-classification` | Classifying text into classes that are unseen during training. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ZeroShotClassificationPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=zero-shot-classification&library=transformers.js) |
| [Feature Extraction](https://huggingface.co/tasks/feature-extraction) | `feature-extraction` | Transforming raw data into numerical features that can be processed while preserving the information in the original dataset. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.FeatureExtractionPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers.js) |

@@ -230,2 +230,3 @@ #### Vision

| [Unconditional Image Generation](https://huggingface.co/tasks/unconditional-image-generation) | n/a | Generating images with no condition in any context (like a prompt text or another image). | ❌ |
| [Image Feature Extraction](https://huggingface.co/tasks/image-feature-extraction) | `image-feature-extraction` | Transforming raw data into numerical features that can be processed while preserving the information in the original image. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ImageFeatureExtractionPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=image-feature-extraction&library=transformers.js) |

@@ -255,3 +256,2 @@ #### Audio

| [Document Question Answering](https://huggingface.co/tasks/document-question-answering) | `document-question-answering` | Answering questions on document images. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.DocumentQuestionAnsweringPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=document-question-answering&library=transformers.js) |
| [Feature Extraction](https://huggingface.co/tasks/feature-extraction) | `feature-extraction` | Transforming raw data into numerical features that can be processed while preserving the information in the original dataset. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.FeatureExtractionPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=feature-extraction&library=transformers.js) |
| [Image-to-Text](https://huggingface.co/tasks/image-to-text) | `image-to-text` | Output text from a given image. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ImageToTextPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=image-to-text&library=transformers.js) |

@@ -303,2 +303,3 @@ | [Text-to-Image](https://huggingface.co/tasks/text-to-image) | `text-to-image` | Generates images from input text. | ❌ |

1. **[DPT](https://huggingface.co/docs/transformers/master/model_doc/dpt)** (from Intel Labs) released with the paper [Vision Transformers for Dense Prediction](https://arxiv.org/abs/2103.13413) by René Ranftl, Alexey Bochkovskiy, Vladlen Koltun.
1. **[EfficientNet](https://huggingface.co/docs/transformers/model_doc/efficientnet)** (from Google Brain) released with the paper [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946) by Mingxing Tan, Quoc V. Le.
1. **[ELECTRA](https://huggingface.co/docs/transformers/model_doc/electra)** (from Google Research/Stanford University) released with the paper [ELECTRA: Pre-training text encoders as discriminators rather than generators](https://arxiv.org/abs/2003.10555) by Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning.

@@ -334,3 +335,5 @@ 1. **[ESM](https://huggingface.co/docs/transformers/model_doc/esm)** (from Meta AI) are transformer protein language models. **ESM-1b** was released with the paper [Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences](https://www.pnas.org/content/118/15/e2016239118) by Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. **ESM-1v** was released with the paper [Language models enable zero-shot prediction of the effects of mutations on protein function](https://doi.org/10.1101/2021.07.09.450648) by Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu and Alexander Rives. **ESM-2 and ESMFold** were released with the paper [Language models of protein sequences at the scale of evolution enable accurate structure prediction](https://doi.org/10.1101/2022.07.20.500902) by Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, Alexander Rives.

1. **[OWL-ViT](https://huggingface.co/docs/transformers/model_doc/owlvit)** (from Google AI) released with the paper [Simple Open-Vocabulary Object Detection with Vision Transformers](https://arxiv.org/abs/2205.06230) by Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, and Neil Houlsby.
1. **[OWLv2](https://huggingface.co/docs/transformers/model_doc/owlv2)** (from Google AI) released with the paper [Scaling Open-Vocabulary Object Detection](https://arxiv.org/abs/2306.09683) by Matthias Minderer, Alexey Gritsenko, Neil Houlsby.
1. **[Phi](https://huggingface.co/docs/transformers/main/model_doc/phi)** (from Microsoft) released with the papers - [Textbooks Are All You Need](https://arxiv.org/abs/2306.11644) by Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee and Yuanzhi Li, [Textbooks Are All You Need II: phi-1.5 technical report](https://arxiv.org/abs/2309.05463) by Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar and Yin Tat Lee.
1. **[Qwen2](https://huggingface.co/docs/transformers/model_doc/qwen2)** (from the Qwen team, Alibaba Group) released with the paper [Qwen Technical Report](https://arxiv.org/abs/2309.16609) by Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou and Tianhang Zhu.
1. **[ResNet](https://huggingface.co/docs/transformers/model_doc/resnet)** (from Microsoft Research) released with the paper [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.

@@ -344,2 +347,4 @@ 1. **[RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta)** (from Facebook), released together with the paper [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.

1. **[SqueezeBERT](https://huggingface.co/docs/transformers/model_doc/squeezebert)** (from Berkeley) released with the paper [SqueezeBERT: What can computer vision teach NLP about efficient neural networks?](https://arxiv.org/abs/2006.11316) by Forrest N. Iandola, Albert E. Shaw, Ravi Krishna, and Kurt W. Keutzer.
1. **[StableLm](https://huggingface.co/docs/transformers/model_doc/stablelm)** (from Stability AI) released with the paper [StableLM 3B 4E1T (Technical Report)](https://stability.wandb.io/stability-llm/stable-lm/reports/StableLM-3B-4E1T--VmlldzoyMjU4?accessToken=u3zujipenkx5g7rtcj9qojjgxpconyjktjkli2po09nffrffdhhchq045vp0wyfo) by Jonathan Tow, Marco Bellagente, Dakota Mahan, Carlos Riquelme Ruiz, Duy Phung, Maksym Zhuravinskyi, Nathan Cooper, Nikhil Pinnaparaju, Reshinth Adithyan, and James Baicoianu.
1. **[Starcoder2](https://huggingface.co/docs/transformers/main/model_doc/starcoder2)** (from BigCode team) released with the paper [StarCoder 2 and The Stack v2: The Next Generation](https://arxiv.org/abs/2402.19173) by Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman Jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, and Harm de Vries.
1. **[Swin Transformer](https://huggingface.co/docs/transformers/model_doc/swin)** (from Microsoft) released with the paper [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/abs/2103.14030) by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo.

@@ -351,2 +356,4 @@ 1. **[Swin2SR](https://huggingface.co/docs/transformers/model_doc/swin2sr)** (from University of Würzburg) released with the paper [Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration](https://arxiv.org/abs/2209.11345) by Marcos V. Conde, Ui-Jin Choi, Maxime Burchi, Radu Timofte.

1. **[TrOCR](https://huggingface.co/docs/transformers/model_doc/trocr)** (from Microsoft), released together with the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.
1. **[UniSpeech](https://huggingface.co/docs/transformers/model_doc/unispeech)** (from Microsoft Research) released with the paper [UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data](https://arxiv.org/abs/2101.07597) by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.
1. **[UniSpeechSat](https://huggingface.co/docs/transformers/model_doc/unispeech-sat)** (from Microsoft Research) released with the paper [UNISPEECH-SAT: UNIVERSAL SPEECH REPRESENTATION LEARNING WITH SPEAKER AWARE PRE-TRAINING](https://arxiv.org/abs/2110.05752) by Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu.
1. **[Vision Transformer (ViT)](https://huggingface.co/docs/transformers/model_doc/vit)** (from Google AI) released with the paper [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929) by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.

@@ -353,0 +360,0 @@ 1. **[ViTMatte](https://huggingface.co/docs/transformers/model_doc/vitmatte)** (from HUST-VL) released with the paper [ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers](https://arxiv.org/abs/2305.15272) by Jingfeng Yao, Xinggang Wang, Shusheng Yang, Baoyuan Wang.

@@ -32,3 +32,3 @@ /**

const VERSION = '2.14.2';
const VERSION = '2.17.0';

@@ -57,11 +57,12 @@ // Check if various APIs are available (depends on environment)

// Set path to wasm files. This is needed when running in a web worker.
// https://onnxruntime.ai/docs/api/js/interfaces/Env.WebAssemblyFlags.html#wasmPaths
// We use remote wasm files by default to make it easier for newer users.
// In practice, users should probably self-host the necessary .wasm files.
onnx_env.wasm.wasmPaths = RUNNING_LOCALLY
? path.join(__dirname, '/dist/')
: `https://cdn.jsdelivr.net/npm/@xenova/transformers@${VERSION}/dist/`;
if (onnx_env?.wasm) {
// Set path to wasm files. This is needed when running in a web worker.
// https://onnxruntime.ai/docs/api/js/interfaces/Env.WebAssemblyFlags.html#wasmPaths
// We use remote wasm files by default to make it easier for newer users.
// In practice, users should probably self-host the necessary .wasm files.
onnx_env.wasm.wasmPaths = RUNNING_LOCALLY
? path.join(__dirname, '/dist/')
: `https://cdn.jsdelivr.net/npm/@xenova/transformers@${VERSION}/dist/`;
}
/**

@@ -68,0 +69,0 @@ * Global variable used to control execution. This provides users a simple way to configure Transformers.js.

@@ -14,2 +14,3 @@

import Jimp from 'jimp';
import { Tensor } from './tensor.js';

@@ -174,3 +175,3 @@ // Will be empty (or not used) if running in browser or web-worker

* Helper method to create a new Image from a tensor
* @param {import('./tensor.js').Tensor} tensor
* @param {Tensor} tensor
*/

@@ -507,2 +508,19 @@ static fromTensor(tensor, channel_format = 'CHW') {

toTensor(channel_format = 'CHW') {
let tensor = new Tensor(
'uint8',
new Uint8Array(this.data),
[this.height, this.width, this.channels]
);
if (channel_format === 'HWC') {
// Do nothing
} else if (channel_format === 'CHW') { // hwc -> chw
tensor = tensor.permute(2, 0, 1);
} else {
throw new Error(`Unsupported channel format: ${channel_format}`);
}
return tensor;
}
toCanvas() {
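A hedged usage sketch of the new `RawImage#toTensor` helper added above, round-tripping through the existing `fromTensor`; the image URL is a placeholder:

```js
import { RawImage } from '@xenova/transformers';

// Placeholder URL; any image source accepted by RawImage.read works here.
const image = await RawImage.read('https://example.com/cat.jpg');
const chw = image.toTensor('CHW');           // dims: [channels, height, width]
const hwc = image.toTensor('HWC');           // dims: [height, width, channels]
const roundTrip = RawImage.fromTensor(chw);  // fromTensor defaults to 'CHW'
```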

@@ -509,0 +527,0 @@ if (!BROWSER_ENV) {

@@ -91,3 +91,3 @@

/**
* Helper method to transpose a `AnyTypedArray` directly
* Helper method to permute a `AnyTypedArray` directly
* @template {AnyTypedArray} T

@@ -97,6 +97,6 @@ * @param {T} array

* @param {number[]} axes
* @returns {[T, number[]]} The transposed array and the new shape.
* @returns {[T, number[]]} The permuted array and the new shape.
*/
export function transpose_data(array, dims, axes) {
// Calculate the new shape of the transposed array
export function permute_data(array, dims, axes) {
// Calculate the new shape of the permuted array
// and the stride of the original array

@@ -115,7 +115,7 @@ const shape = new Array(axes.length);

// Create the transposed array with the new shape
// Create the permuted array with the new shape
// @ts-ignore
const transposedData = new array.constructor(array.length);
const permutedData = new array.constructor(array.length);
// Transpose the original array to the new array
// Permute the original array to the new array
for (let i = 0; i < array.length; ++i) {

@@ -127,6 +127,6 @@ let newIndex = 0;

}
transposedData[newIndex] = array[i];
permutedData[newIndex] = array[i];
}
return [transposedData, shape];
return [permutedData, shape];
}
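As a quick sketch of the renamed helper (assuming it is imported from `./maths.js` as in the library's own modules), permuting a flat 2×3 array with axes `[1, 0]` is an ordinary matrix transpose:

```js
import { permute_data } from './maths.js';

const data = new Float32Array([1, 2, 3, 4, 5, 6]);           // logical shape [2, 3]
const [permuted, shape] = permute_data(data, [2, 3], [1, 0]);
// permuted -> Float32Array [1, 4, 2, 5, 3, 6], shape -> [3, 2]
```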

@@ -181,3 +181,7 @@

export function dot(arr1, arr2) {
return arr1.reduce((acc, val, i) => acc + val * arr2[i], 0);
let result = 0;
for (let i = 0; i < arr1.length; ++i) {
result += arr1[i] * arr2[i];
}
return result;
}
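The loop-based rewrite above behaves the same as the old `reduce` version; a one-line check:

```js
dot([1, 2, 3], [4, 5, 6]); // 1*4 + 2*5 + 3*6 = 32
```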

@@ -960,1 +964,15 @@

}
/**
* Helper function to round a number to the nearest integer, with ties rounded to the nearest even number.
* Also known as "bankers' rounding". This is the default rounding mode in python. For example:
* 1.5 rounds to 2 and 2.5 rounds to 2.
*
* @param {number} x The number to round
* @returns {number} The rounded number
*/
export function bankers_round(x) {
const r = Math.round(x);
const br = Math.abs(x) % 1 === 0.5 ? (r % 2 === 0 ? r : r - 1) : r;
return br;
}
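A few worked values for the new helper, following its own docstring:

```js
bankers_round(1.5); // 2  (tie -> nearest even)
bankers_round(2.5); // 2  (tie -> nearest even; Math.round would give 3)
bankers_round(3.5); // 4
bankers_round(2.4); // 2  (no tie -> ordinary rounding)
```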

@@ -14,3 +14,3 @@ /**

interpolate_data,
transpose_data
permute_data
} from './maths.js';

@@ -313,13 +313,15 @@

/**
* Return a transposed version of this Tensor, according to the provided dimensions.
* @param {...number} dims Dimensions to transpose.
* @returns {Tensor} The transposed tensor.
* Return a permuted version of this Tensor, according to the provided dimensions.
* @param {...number} dims Dimensions to permute.
* @returns {Tensor} The permuted tensor.
*/
permute(...dims) {
return permute(this, dims);
}
// TODO: implement transpose. For now (backwards compatibility), it's just an alias for permute()
transpose(...dims) {
return transpose(this, dims);
return this.permute(...dims);
}
// TODO: rename transpose to permute
// TODO: implement transpose
// TODO add .max() and .min() methods
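A hedged sketch of the renamed method (assuming `Tensor` is imported from `./tensor.js`): `transpose()` now simply forwards to `permute()`:

```js
import { Tensor } from './tensor.js';

const t = new Tensor('float32', new Float32Array([1, 2, 3, 4, 5, 6]), [2, 3]);
const a = t.permute(1, 0);   // dims: [3, 2]
const b = t.transpose(1, 0); // same result via the backwards-compatible alias
```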

@@ -685,10 +687,10 @@

/**
* Transposes a tensor according to the provided axes.
* @param {any} tensor The input tensor to transpose.
* @param {Array} axes The axes to transpose the tensor along.
* @returns {Tensor} The transposed tensor.
* Permutes a tensor according to the provided axes.
* @param {any} tensor The input tensor to permute.
* @param {Array} axes The axes to permute the tensor along.
* @returns {Tensor} The permuted tensor.
*/
export function transpose(tensor, axes) {
const [transposedData, shape] = transpose_data(tensor.data, tensor.dims, axes);
return new Tensor(tensor.type, transposedData, shape);
export function permute(tensor, axes) {
const [permutedData, shape] = permute_data(tensor.data, tensor.dims, axes);
return new Tensor(tensor.type, permutedData, shape);
}

@@ -769,2 +771,38 @@

/**
* Apply Layer Normalization for last certain number of dimensions.
* @param {Tensor} input The input tensor
* @param {number[]} normalized_shape input shape from an expected input of size
* @param {Object} options The options for the layer normalization
* @param {number} [options.eps=1e-5] A value added to the denominator for numerical stability.
* @returns {Tensor} The normalized tensor.
*/
export function layer_norm(input, normalized_shape, {
eps = 1e-5,
} = {}) {
if (input.dims.length !== 2) {
throw new Error('`layer_norm` currently only supports 2D input.');
}
const [batchSize, featureDim] = input.dims;
if (normalized_shape.length !== 1 && normalized_shape[0] !== featureDim) {
throw new Error('`normalized_shape` must be a 1D array with shape `[input.dims[1]]`.');
}
const [std, mean] = std_mean(input, 1, 0, true);
// @ts-ignore
const returnedData = new input.data.constructor(input.data.length);
for (let i = 0; i < batchSize; ++i) {
const offset = i * featureDim;
for (let j = 0; j < featureDim; ++j) {
const offset2 = offset + j;
returnedData[offset2] = (input.data[offset2] - mean.data[i]) / (std.data[i] + eps);
}
}
return new Tensor(input.type, returnedData, input.dims);
}
/**
* Helper function to calculate new dimensions when performing a squeeze operation.
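A brief, hedged usage sketch of the new `layer_norm` helper (2D input only, per the check above); `Tensor` is constructed as in the earlier sketch:

```js
// Normalize each of the 2 rows of a [2, 3] tensor to ~zero mean and unit variance.
const x = new Tensor('float32', new Float32Array([1, 2, 3, 4, 5, 6]), [2, 3]);
const normed = layer_norm(x, [3]); // returns a tensor with the same dims as x
```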

@@ -1162,1 +1200,45 @@ * @param {number[]} dims The dimensions of the tensor.

}
/**
* Quantizes the embeddings tensor to binary or unsigned binary precision.
* @param {Tensor} tensor The tensor to quantize.
* @param {'binary'|'ubinary'} precision The precision to use for quantization.
* @returns {Tensor} The quantized tensor.
*/
export function quantize_embeddings(tensor, precision) {
if (tensor.dims.length !== 2) {
throw new Error("The tensor must have 2 dimensions");
}
if (tensor.dims.at(-1) % 8 !== 0) {
throw new Error("The last dimension of the tensor must be a multiple of 8");
}
if (!['binary', 'ubinary'].includes(precision)) {
throw new Error("The precision must be either 'binary' or 'ubinary'");
}
const signed = precision === 'binary';
const dtype = signed ? 'int8' : 'uint8';
// Create a typed array to store the packed bits
const cls = signed ? Int8Array : Uint8Array;
const inputData = tensor.data;
const outputData = new cls(inputData.length / 8);
// Iterate over each number in the array
for (let i = 0; i < inputData.length; ++i) {
// Determine if the number is greater than 0
const bit = inputData[i] > 0 ? 1 : 0;
// Calculate the index in the typed array and the position within the byte
const arrayIndex = Math.floor(i / 8);
const bitPosition = i % 8;
// Pack the bit into the typed array
outputData[arrayIndex] |= bit << (7 - bitPosition);
if (signed && bitPosition === 0) {
outputData[arrayIndex] -= 128;
}
};
return new Tensor(dtype, outputData, [tensor.dims[0], tensor.dims[1] / 8]);
}
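A hedged sketch of the new `quantize_embeddings` helper: a `[1, 8]` float embedding packs into a single byte, with bit *i* set wherever the value is positive:

```js
const emb = new Tensor('float32',
    new Float32Array([0.5, -0.1, 0.2, -0.3, 0.7, -0.9, 0.1, -0.4]), [1, 8]);

const packed = quantize_embeddings(emb, 'ubinary');
// packed.dims -> [1, 1]; packed.data[0] === 0b10101010 === 170
```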

@@ -22,3 +22,3 @@ export namespace env {

declare const __dirname: any;
declare const VERSION: "2.14.2";
declare const VERSION: "2.17.0";
declare const localModelPath: any;

@@ -25,0 +25,0 @@ declare const FS_AVAILABLE: boolean;

@@ -42,3 +42,3 @@ declare const FeatureExtractor_base: new () => {

* @param {number} config.resample What method to use for resampling.
* @param {number} config.size The size to resize the image to.
* @param {number|Object} config.size The size to resize the image to.
*/

@@ -53,3 +53,3 @@ constructor(config: {

resample: number;
size: number;
size: number | any;
});

@@ -94,3 +94,3 @@ image_mean: any;

* @param {Float32Array} pixelData The pixel data to pad.
* @param {number[]} imgDims The dimensions of the image.
* @param {number[]} imgDims The dimensions of the image (height, width, channels).
* @param {{width:number; height:number}|number} padSize The dimensions of the padded image.

@@ -181,8 +181,8 @@ * @param {Object} options The options for padding.

}
export class DPTImageProcessor extends ImageFeatureExtractor {
export class DPTFeatureExtractor extends ImageFeatureExtractor {
}
export class DPTImageProcessor extends DPTFeatureExtractor {
}
export class BitImageProcessor extends ImageFeatureExtractor {
}
export class DPTFeatureExtractor extends ImageFeatureExtractor {
}
export class GLPNFeatureExtractor extends ImageFeatureExtractor {

@@ -210,2 +210,6 @@ }

}
export class EfficientNetImageProcessor extends ImageFeatureExtractor {
constructor(config: any);
include_top: any;
}
export class MobileViTFeatureExtractor extends ImageFeatureExtractor {

@@ -230,2 +234,4 @@ }

}
export class Owlv2ImageProcessor extends OwlViTFeatureExtractor {
}
export class DeiTFeatureExtractor extends ImageFeatureExtractor {

@@ -672,2 +678,3 @@ }

static FEATURE_EXTRACTOR_CLASS_MAPPING: {
ImageFeatureExtractor: typeof ImageFeatureExtractor;
WhisperFeatureExtractor: typeof WhisperFeatureExtractor;

@@ -677,2 +684,3 @@ ViTFeatureExtractor: typeof ViTFeatureExtractor;

OwlViTFeatureExtractor: typeof OwlViTFeatureExtractor;
Owlv2ImageProcessor: typeof Owlv2ImageProcessor;
CLIPFeatureExtractor: typeof CLIPFeatureExtractor;

@@ -694,2 +702,3 @@ ChineseCLIPFeatureExtractor: typeof ChineseCLIPFeatureExtractor;

NougatImageProcessor: typeof NougatImageProcessor;
EfficientNetImageProcessor: typeof EfficientNetImageProcessor;
ViTImageProcessor: typeof ViTImageProcessor;

@@ -696,0 +705,0 @@ VitMatteImageProcessor: typeof VitMatteImageProcessor;

@@ -67,2 +67,7 @@ declare const TokenizerModel_base: new () => {

};
/**
* @typedef {Object} Message
* @property {string} role The role of the message (e.g., "user" or "assistant" or "system").
* @property {string} content The content of the message.
*/
export class PreTrainedTokenizer extends PreTrainedTokenizer_base {

@@ -233,7 +238,2 @@ /**

/**
* @typedef {Object} Message
* @property {string} role The role of the message (e.g., "user" or "assistant" or "system").
* @property {string} content The content of the message.
*/
/**
* Converts a list of message objects with `"role"` and `"content"` keys to a list of token

@@ -280,14 +280,6 @@ * ids. This method is intended for use with chat models, and will read the tokenizer's chat_template attribute to

* @param {boolean} [options.return_tensor=true] Whether to return the output as a Tensor or an Array. Has no effect if tokenize is false.
* @param {Object} [options.tokenizer_kwargs={}] Additional options to pass to the tokenizer.
* @returns {string | Tensor | number[]| number[][]} The tokenized output.
*/
apply_chat_template(conversation: {
/**
* The role of the message (e.g., "user" or "assistant" or "system").
*/
role: string;
/**
* The content of the message.
*/
content: string;
}[], { chat_template, add_generation_prompt, tokenize, padding, truncation, max_length, return_tensor, }?: {
apply_chat_template(conversation: Message[], { chat_template, add_generation_prompt, tokenize, padding, truncation, max_length, return_tensor, tokenizer_kwargs, ...kwargs }?: {
chat_template?: string;

@@ -300,2 +292,3 @@ add_generation_prompt?: boolean;

return_tensor?: boolean;
tokenizer_kwargs?: any;
}): string | Tensor | number[] | number[][];
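A hedged usage sketch of the updated signature (the `Message[]` type is declared further below); `tokenizer` is assumed to be a `PreTrainedTokenizer` loaded elsewhere, e.g. via `AutoTokenizer.from_pretrained`:

```js
// `tokenizer` is assumed to be an already-loaded PreTrainedTokenizer.
const messages = [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' },
];

const prompt = tokenizer.apply_chat_template(messages, {
    tokenize: false,             // return the rendered string rather than token ids
    add_generation_prompt: true, // append the assistant turn prefix from the chat template
});
```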

@@ -384,2 +377,8 @@ }

}
export class Qwen2Tokenizer extends PreTrainedTokenizer {
}
export class GemmaTokenizer extends PreTrainedTokenizer {
}
export class Grok1Tokenizer extends PreTrainedTokenizer {
}
/**

@@ -575,2 +574,4 @@ * The NllbTokenizer class is used to tokenize text for NLLB ("No Language Left Behind") models.

}
export class CohereTokenizer extends PreTrainedTokenizer {
}
/**

@@ -625,2 +626,6 @@ * Helper class which is used to instantiate pretrained tokenizers with the `from_pretrained` function.

VitsTokenizer: typeof VitsTokenizer;
Qwen2Tokenizer: typeof Qwen2Tokenizer;
GemmaTokenizer: typeof GemmaTokenizer;
Grok1Tokenizer: typeof Grok1Tokenizer;
CohereTokenizer: typeof CohereTokenizer;
PreTrainedTokenizer: typeof PreTrainedTokenizer;

@@ -702,2 +707,12 @@ };

};
export type Message = {
/**
* The role of the message (e.g., "user" or "assistant" or "system").
*/
role: string;
/**
* The content of the message.
*/
content: string;
};
declare const Normalizer_base: new () => {

@@ -704,0 +719,0 @@ (...args: any[]): any;

@@ -33,5 +33,5 @@ export class RawImage {

* Helper method to create a new Image from a tensor
* @param {import('./tensor.js').Tensor} tensor
* @param {Tensor} tensor
*/
static fromTensor(tensor: import('./tensor.js').Tensor, channel_format?: string): RawImage;
static fromTensor(tensor: Tensor, channel_format?: string): RawImage;
/**

@@ -84,2 +84,3 @@ * Create a new `RawImage` object.

toBlob(type?: string, quality?: number): Promise<any>;
toTensor(channel_format?: string): Tensor;
toCanvas(): any;

@@ -117,2 +118,3 @@ /**

}
import { Tensor } from './tensor.js';
//# sourceMappingURL=image.d.ts.map

@@ -19,3 +19,3 @@ /**

/**
* Helper method to transpose a `AnyTypedArray` directly
* Helper method to permute a `AnyTypedArray` directly
* @template {AnyTypedArray} T

@@ -25,5 +25,5 @@ * @param {T} array

* @param {number[]} axes
* @returns {[T, number[]]} The transposed array and the new shape.
* @returns {[T, number[]]} The permuted array and the new shape.
*/
export function transpose_data<T extends AnyTypedArray>(array: T, dims: number[], axes: number[]): [T, number[]];
export function permute_data<T extends AnyTypedArray>(array: T, dims: number[], axes: number[]): [T, number[]];
/**

@@ -98,2 +98,11 @@ * Compute the softmax of an array of numbers.

export function round(num: number, decimals: number): number;
/**
* Helper function to round a number to the nearest integer, with ties rounded to the nearest even number.
* Also known as "bankers' rounding". This is the default rounding mode in python. For example:
* 1.5 rounds to 2 and 2.5 rounds to 2.
*
* @param {number} x The number to round
* @returns {number} The rounded number
*/
export function bankers_round(x: number): number;
export class FFT {

@@ -100,0 +109,0 @@ constructor(fft_length: any);

/**
* Transposes a tensor according to the provided axes.
* @param {any} tensor The input tensor to transpose.
* @param {Array} axes The axes to transpose the tensor along.
* @returns {Tensor} The transposed tensor.
* Permutes a tensor according to the provided axes.
* @param {any} tensor The input tensor to permute.
* @param {Array} axes The axes to permute the tensor along.
* @returns {Tensor} The permuted tensor.
*/
export function transpose(tensor: any, axes: any[]): Tensor;
export function permute(tensor: any, axes: any[]): Tensor;
/**

@@ -25,2 +25,13 @@ * Interpolates an Tensor to the given size.

/**
* Apply Layer Normalization for last certain number of dimensions.
* @param {Tensor} input The input tensor
* @param {number[]} normalized_shape input shape from an expected input of size
* @param {Object} options The options for the layer normalization
* @param {number} [options.eps=1e-5] A value added to the denominator for numerical stability.
* @returns {Tensor} The normalized tensor.
*/
export function layer_norm(input: Tensor, normalized_shape: number[], { eps, }?: {
eps?: number;
}): Tensor;
/**
* Concatenates an array of tensors along a specified dimension.

@@ -75,2 +86,9 @@ * @param {Tensor[]} tensors The array of tensors to concatenate.

export function ones_like(tensor: Tensor): Tensor;
/**
* Quantizes the embeddings tensor to binary or unsigned binary precision.
* @param {Tensor} tensor The tensor to quantize.
* @param {'binary'|'ubinary'} precision The precision to use for quantization.
* @returns {Tensor} The quantized tensor.
*/
export function quantize_embeddings(tensor: Tensor, precision: 'binary' | 'ubinary'): Tensor;
export class Tensor {

@@ -157,7 +175,8 @@ /**

/**
* Return a transposed version of this Tensor, according to the provided dimensions.
* @param {...number} dims Dimensions to transpose.
* @returns {Tensor} The transposed tensor.
* Return a permuted version of this Tensor, according to the provided dimensions.
* @param {...number} dims Dimensions to permute.
* @returns {Tensor} The permuted tensor.
*/
transpose(...dims: number[]): Tensor;
permute(...dims: number[]): Tensor;
transpose(...dims: any[]): Tensor;
/**

@@ -164,0 +183,0 @@ * Returns the sum of each row of the input tensor in the given dimension dim.

Diffs of the remaining changed files are not shown (too big to display or not supported by the viewer).