@xenova/transformers - npm Package Compare versions

Comparing version 2.8.0 to 2.9.0

package.json
{
"name": "@xenova/transformers",
"version": "2.8.0",
"version": "2.9.0",
"description": "State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!",

@@ -13,3 +13,3 @@ "main": "./src/transformers.js",

"generate-tests": "python -m tests.generate_tests",
"test": "node --experimental-vm-modules node_modules/jest/bin/jest.js --verbose",
"test": "node --experimental-vm-modules node_modules/jest/bin/jest.js --verbose --maxConcurrency 1",
"readme": "python ./docs/scripts/build_readme.py",

@@ -16,0 +16,0 @@ "docs-api": "node ./docs/scripts/generate.js",

@@ -18,6 +18,9 @@

<a href="https://www.npmjs.com/package/@xenova/transformers">
<img alt="Downloads" src="https://img.shields.io/npm/dw/@xenova/transformers">
<img alt="NPM Downloads" src="https://img.shields.io/npm/dw/@xenova/transformers">
</a>
<a href="https://www.jsdelivr.com/package/npm/@xenova/transformers">
<img alt="jsDelivr Hits" src="https://img.shields.io/jsdelivr/npm/hw/@xenova/transformers">
</a>
<a href="https://github.com/xenova/transformers.js/blob/main/LICENSE">
<img alt="License" src="https://img.shields.io/github/license/xenova/transformers.js">
<img alt="License" src="https://img.shields.io/github/license/xenova/transformers.js?color=blue">
</a>

@@ -102,3 +105,3 @@ <a href="https://huggingface.co/docs/transformers.js/index">

<script type="module">
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.8.0';
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.9.0';
</script>
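For orientation, a minimal usage sketch of the ES-module import shown above; the task name, input string, and logged output are illustrative rather than part of this diff:

```javascript
// Runs directly in the browser as a module script; no bundler or server required.
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.9.0';

// Illustrative task: create a sentiment-analysis pipeline with its default model.
const classifier = await pipeline('sentiment-analysis');

const output = await classifier('I love transformers!');
console.log(output); // e.g. [{ label: 'POSITIVE', score: 0.99... }]
```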

@@ -133,3 +136,3 @@ ```

By default, Transformers.js uses [hosted pretrained models](https://huggingface.co/models?library=transformers.js) and [precompiled WASM binaries](https://cdn.jsdelivr.net/npm/@xenova/transformers@2.8.0/dist/), which should work out-of-the-box. You can customize this as follows:
By default, Transformers.js uses [hosted pretrained models](https://huggingface.co/models?library=transformers.js) and [precompiled WASM binaries](https://cdn.jsdelivr.net/npm/@xenova/transformers@2.9.0/dist/), which should work out-of-the-box. You can customize this as follows:
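A minimal sketch of that kind of customization, assuming the `env` settings object exported by the package (the field names below are the commonly documented ones and may differ slightly between versions):

```javascript
import { env } from '@xenova/transformers';

// Serve models from your own host instead of the Hugging Face Hub (assumed field names).
env.allowRemoteModels = false;
env.localModelPath = '/path/to/models/';

// Point the ONNX Runtime backend at self-hosted WASM binaries instead of the jsDelivr CDN.
env.backends.onnx.wasm.wasmPaths = '/path/to/dist/';
```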

@@ -213,3 +216,3 @@

|--------------------------|----|-------------|------------|
| [Depth Estimation](https://huggingface.co/tasks/depth-estimation) | `depth-estimation` | Predicting the depth of objects present in an image. | ❌ |
| [Depth Estimation](https://huggingface.co/tasks/depth-estimation) | `depth-estimation` | Predicting the depth of objects present in an image. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.DepthEstimationPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=depth-estimation&library=transformers.js) |
| [Image Classification](https://huggingface.co/tasks/image-classification) | `image-classification` | Assigning a label or class to an entire image. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ImageClassificationPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=image-classification&library=transformers.js) |

@@ -251,2 +254,3 @@ | [Image Segmentation](https://huggingface.co/tasks/image-segmentation) | `image-segmentation` | Divides an image into segments where each pixel is mapped to an object. This task has multiple variants such as instance segmentation, panoptic segmentation and semantic segmentation. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ImageSegmentationPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=image-segmentation&library=transformers.js) |

| [Zero-Shot Image Classification](https://huggingface.co/tasks/zero-shot-image-classification) | `zero-shot-image-classification` | Classifying images into classes that are unseen during training. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ZeroShotImageClassificationPipeline)<br>[(models)](https://huggingface.co/models?pipeline_tag=zero-shot-image-classification&library=transformers.js) |
| [Zero-Shot Object Detection](https://huggingface.co/tasks/zero-shot-object-detection) | `zero-shot-object-detection` | Identify objects of classes that are unseen during training. | ✅ [(docs)](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.ZeroShotObjectDetectionPipeline)<br>[(models)](https://huggingface.co/models?other=zero-shot-object-detection&library=transformers.js) |

@@ -281,4 +285,6 @@

1. **[Donut](https://huggingface.co/docs/transformers/model_doc/donut)** (from NAVER), released together with the paper [OCR-free Document Understanding Transformer](https://arxiv.org/abs/2111.15664) by Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park.
1. **[DPT](https://huggingface.co/docs/transformers/master/model_doc/dpt)** (from Intel Labs) released with the paper [Vision Transformers for Dense Prediction](https://arxiv.org/abs/2103.13413) by René Ranftl, Alexey Bochkovskiy, Vladlen Koltun.
1. **[Falcon](https://huggingface.co/docs/transformers/model_doc/falcon)** (from Technology Innovation Institute) by Almazrouei, Ebtesam and Alobeidli, Hamza and Alshamsi, Abdulaziz and Cappelli, Alessandro and Cojocaru, Ruxandra and Debbah, Merouane and Goffinet, Etienne and Heslow, Daniel and Launay, Julien and Malartic, Quentin and Noune, Badreddine and Pannier, Baptiste and Penedo, Guilherme.
1. **[FLAN-T5](https://huggingface.co/docs/transformers/model_doc/flan-t5)** (from Google AI) released in the repository [google-research/t5x](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints) by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei
1. **[GLPN](https://huggingface.co/docs/transformers/model_doc/glpn)** (from KAIST) released with the paper [Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth](https://arxiv.org/abs/2201.07436) by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
1. **[GPT Neo](https://huggingface.co/docs/transformers/model_doc/gpt_neo)** (from EleutherAI) released in the repository [EleutherAI/gpt-neo](https://github.com/EleutherAI/gpt-neo) by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.

@@ -305,3 +311,5 @@ 1. **[GPT NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)** (from EleutherAI) released with the paper [GPT-NeoX-20B: An Open-Source Autoregressive Language Model](https://arxiv.org/abs/2204.06745) by Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach

1. **[NLLB](https://huggingface.co/docs/transformers/model_doc/nllb)** (from Meta) released with the paper [No Language Left Behind: Scaling Human-Centered Machine Translation](https://arxiv.org/abs/2207.04672) by the NLLB team.
1. **[Nougat](https://huggingface.co/docs/transformers/model_doc/nougat)** (from Meta AI) released with the paper [Nougat: Neural Optical Understanding for Academic Documents](https://arxiv.org/abs/2308.13418) by Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic.
1. **[OPT](https://huggingface.co/docs/transformers/master/model_doc/opt)** (from Meta AI) released with the paper [OPT: Open Pre-trained Transformer Language Models](https://arxiv.org/abs/2205.01068) by Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen et al.
1. **[OWL-ViT](https://huggingface.co/docs/transformers/model_doc/owlvit)** (from Google AI) released with the paper [Simple Open-Vocabulary Object Detection with Vision Transformers](https://arxiv.org/abs/2205.06230) by Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, and Neil Houlsby.
1. **[ResNet](https://huggingface.co/docs/transformers/model_doc/resnet)** (from Microsoft Research) released with the paper [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.

@@ -308,0 +316,0 @@ 1. **[RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta)** (from Facebook), released together with the paper [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692) by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.

@@ -24,3 +24,3 @@ /**

/** @type {module} The ONNX runtime module. */
/** @type {import('onnxruntime-web')} The ONNX runtime module. */
export let ONNX;

@@ -27,0 +27,0 @@

@@ -32,3 +32,3 @@ /**

const VERSION = '2.8.0';
const VERSION = '2.9.0';

@@ -35,0 +35,0 @@ // Check if various APIs are available (depends on environment)

@@ -187,1 +187,16 @@

}
/**
* Helper function to convert list [xmin, ymin, xmax, ymax] into object { "xmin": xmin, ... }
* @param {number[]} box The bounding box as a list.
* @param {boolean} asInteger Whether to cast to integers.
* @returns {Object} The bounding box as an object.
*/
export function get_bounding_box(box, asInteger) {
if (asInteger) {
box = box.map(x => x | 0);
}
const [xmin, ymin, xmax, ymax] = box;
return { xmin, ymin, xmax, ymax };
}
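An illustrative call of the new helper (the coordinates are arbitrary):

```javascript
// Box given as [xmin, ymin, xmax, ymax], matching the destructuring above.
const box = [12.6, 34.2, 180.9, 240.1];

get_bounding_box(box, false); // { xmin: 12.6, ymin: 34.2, xmax: 180.9, ymax: 240.1 }
get_bounding_box(box, true);  // { xmin: 12, ymin: 34, xmax: 180, ymax: 240 }
```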

@@ -494,2 +494,45 @@

export class NoBadWordsLogitsProcessor extends LogitsProcessor {
/**
* Create a `NoBadWordsLogitsProcessor`.
* @param {number[][]} bad_words_ids List of list of token ids that are not allowed to be generated.
* @param {number|number[]} eos_token_id The id of the *end-of-sequence* token. Optionally, use a list to set multiple *end-of-sequence* tokens.
*/
constructor(bad_words_ids, eos_token_id) {
super();
this.bad_words_ids = bad_words_ids;
this.eos_token_id = Array.isArray(eos_token_id) ? eos_token_id : [eos_token_id];
}
/**
* Apply logit processor.
* @param {Array} input_ids The input IDs.
* @param {Object} logits The logits.
* @returns {Object} The processed logits.
*/
_call(input_ids, logits) {
for (const bad_word_ids of this.bad_words_ids) {
// Whether to modify the logits of the last token in the bad word id sequence
let mark = true;
// For each bad word in the list, if the current sequence of input ids ends with this sequence (excluding the last),
// then we set the logits of the last bad word id to -Infinity.
for (let i = 1; i <= bad_word_ids.length - 1 && bad_word_ids.length < input_ids.length; ++i) {
if (bad_word_ids.at(-i - 1) !== input_ids.at(-i)) {
// We have found a mismatch
mark = false;
break;
}
}
if (mark) {
logits.data[bad_word_ids.at(-1)] = -Infinity;
}
}
return logits
}
}
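A small sketch of how the new processor behaves, using a mock logits object; the tiny vocabulary and the token ids are made up for illustration:

```javascript
// Block the two-token sequence [5, 7]; the eos token id (2) is illustrative.
const processor = new NoBadWordsLogitsProcessor([[5, 7]], 2);

const input_ids = [1, 4, 5];                   // generation currently ends with token 5
const logits = { data: new Float32Array(10) }; // pretend vocabulary of 10 tokens

processor._call(input_ids, logits);
// logits.data[7] is now -Infinity, so token 7 (which would complete the
// banned sequence [5, 7]) can no longer be sampled.
```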
/**

@@ -496,0 +539,0 @@ * Class that holds a configuration for a generation task.

@@ -19,2 +19,3 @@

const BROWSER_ENV = typeof self !== 'undefined';
const WEBWORKER_ENV = BROWSER_ENV && self.constructor.name === 'DedicatedWorkerGlobalScope';

@@ -94,2 +95,6 @@ let createCanvasFunction;

get size() {
return [this.width, this.height];
}
/**

@@ -392,2 +397,54 @@ * Helper method for reading an image from a variety of input types.

async crop([x_min, y_min, x_max, y_max]) {
// Ensure crop bounds are within the image
x_min = Math.max(x_min, 0);
y_min = Math.max(y_min, 0);
x_max = Math.min(x_max, this.width - 1);
y_max = Math.min(y_max, this.height - 1);
// Do nothing if the crop is the entire image
if (x_min === 0 && y_min === 0 && x_max === this.width - 1 && y_max === this.height - 1) {
return this;
}
const crop_width = x_max - x_min + 1;
const crop_height = y_max - y_min + 1;
if (BROWSER_ENV) {
// Store number of channels before resizing
const numChannels = this.channels;
// Create canvas object for this image
const canvas = this.toCanvas();
// Create a new canvas of the desired size. This is needed since if the
// image is too small, we need to pad it with black pixels.
const ctx = createCanvasFunction(crop_width, crop_height).getContext('2d');
// Draw image to context, cropping in the process
ctx.drawImage(canvas,
x_min, y_min, crop_width, crop_height,
0, 0, crop_width, crop_height
);
// Create image from the resized data
const resizedImage = new RawImage(ctx.getImageData(0, 0, crop_width, crop_height).data, crop_width, crop_height, 4);
// Convert back so that image has the same number of channels as before
return resizedImage.convert(numChannels);
} else {
// Create sharp image from raw data
const img = this.toSharp().extract({
left: x_min,
top: y_min,
width: crop_width,
height: crop_height,
});
return await loadImageFunction(img);
}
}
async center_crop(crop_width, crop_height) {

@@ -508,2 +565,11 @@ // If the image is already the desired size, return it

async toBlob(type = 'image/png', quality = 1) {
if (!BROWSER_ENV) {
throw new Error('toBlob() is only supported in browser environments.')
}
const canvas = this.toCanvas();
return await canvas.convertToBlob({ type, quality });
}
toCanvas() {

@@ -582,13 +648,17 @@ if (!BROWSER_ENV) {

*/
save(path) {
async save(path) {
if (BROWSER_ENV) {
if (WEBWORKER_ENV) {
throw new Error('Unable to save an image from a Web Worker.')
}
const extension = path.split('.').pop().toLowerCase();
const mime = CONTENT_TYPE_MAP.get(extension) ?? 'image/png';
// Convert image to canvas
const canvas = this.toCanvas();
// Convert image to Blob
const blob = await this.toBlob(mime);
// Convert the canvas content to a data URL
const dataURL = canvas.toDataURL(mime);
const dataURL = URL.createObjectURL(blob);

@@ -613,3 +683,3 @@ // Create an anchor element with the data URL as the href attribute

const img = this.toSharp();
img.toFile(path);
return await img.toFile(path);
}

@@ -616,0 +686,0 @@ }
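Taken together, a browser-side usage sketch of the new `RawImage` methods above (`crop`, `toBlob`, and the now-async `save`); the image URL and file names are illustrative:

```javascript
import { RawImage } from '@xenova/transformers';

// Read an image from a URL (illustrative).
const image = await RawImage.read('https://example.com/cats.jpg');

// Crop a region given as [x_min, y_min, x_max, y_max]; bounds are clamped to the image.
const cropped = await image.crop([0, 0, 255, 255]);

// In the browser, export the crop as a Blob...
const blob = await cropped.toBlob('image/png');

// ...or trigger a download. Note that save() is now async and must be awaited.
await cropped.save('crop.png');
```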

@@ -235,3 +235,3 @@

* Returns the value and index of the minimum element in an array.
* @param {number[]} arr array of numbers.
* @param {number[]|TypedArray} arr array of numbers.
* @returns {number[]} the value and index of the minimum element, of the form: [valueOfMin, indexOfMin]

@@ -256,3 +256,3 @@ * @throws {Error} If array is empty.

* Returns the value and index of the maximum element in an array.
* @param {number[]} arr array of numbers.
* @param {number[]|TypedArray} arr array of numbers.
* @returns {number[]} the value and index of the maximum element, of the form: [valueOfMax, indexOfMax]

@@ -259,0 +259,0 @@ * @throws {Error} If array is empty.

@@ -37,3 +37,2 @@ /**

/** @type {Object} */
const ONNXTensor = ONNX.Tensor;

@@ -40,0 +39,0 @@

@@ -1,4 +0,5 @@

/** @type {module} The ONNX runtime module. */
export let ONNX: any;
/** @type {import('onnxruntime-web')} The ONNX runtime module. */
export let ONNX: typeof ONNX_WEB;
export const executionProviders: string[];
import * as ONNX_WEB from 'onnxruntime-web';
//# sourceMappingURL=onnx.d.ts.map

@@ -20,5 +20,5 @@ export namespace env {

}
declare const onnx_env: any;
declare const onnx_env: import("onnxruntime-common").Env;
declare const __dirname: any;
declare const VERSION: "2.8.0";
declare const VERSION: "2.9.0";
declare const localModelPath: any;

@@ -25,0 +25,0 @@ declare const FS_AVAILABLE: boolean;

@@ -7,2 +7,3 @@ /**

* - `"automatic-speech-recognition"`: will return a `AutomaticSpeechRecognitionPipeline`.
* - `"depth-estimation"`: will return a `DepthEstimationPipeline`.
* - `"document-question-answering"`: will return a `DocumentQuestionAnsweringPipeline`.

@@ -25,2 +26,3 @@ * - `"feature-extraction"`: will return a `FeatureExtractionPipeline`.

* - `"zero-shot-image-classification"`: will return a `ZeroShotImageClassificationPipeline`.
* - `"zero-shot-object-detection"`: will return a `ZeroShotObjectDetectionPipeline`.
* @param {string} [model=null] The name of the pre-trained model to use. If not specified, the default model for the task will be used.

@@ -897,10 +899,85 @@ * @param {import('./utils/hub.js').PretrainedOptions} [options] Optional parameters for the pipeline.

}): Promise<any>;
}
/**
* Zero-shot object detection pipeline. This pipeline predicts bounding boxes of
* objects when you provide an image and a set of `candidate_labels`.
*
* **Example:** Zero-shot object detection w/ `Xenova/owlvit-base-patch32`.
* ```javascript
* let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/astronaut.png';
* let candidate_labels = ['human face', 'rocket', 'helmet', 'american flag'];
* let detector = await pipeline('zero-shot-object-detection', 'Xenova/owlvit-base-patch32');
* let output = await detector(url, candidate_labels);
* // [
* // {
* // score: 0.24392342567443848,
* // label: 'human face',
* // box: { xmin: 180, ymin: 67, xmax: 274, ymax: 175 }
* // },
* // {
* // score: 0.15129457414150238,
* // label: 'american flag',
* // box: { xmin: 0, ymin: 4, xmax: 106, ymax: 513 }
* // },
* // {
* // score: 0.13649864494800568,
* // label: 'helmet',
* // box: { xmin: 277, ymin: 337, xmax: 511, ymax: 511 }
* // },
* // {
* // score: 0.10262022167444229,
* // label: 'rocket',
* // box: { xmin: 352, ymin: -1, xmax: 463, ymax: 287 }
* // }
* // ]
* ```
*
* **Example:** Zero-shot object detection w/ `Xenova/owlvit-base-patch32` (returning top 4 matches and setting a threshold).
* ```javascript
* let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/beach.png';
* let candidate_labels = ['hat', 'book', 'sunglasses', 'camera'];
* let detector = await pipeline('zero-shot-object-detection', 'Xenova/owlvit-base-patch32');
* let output = await detector(url, candidate_labels, { topk: 4, threshold: 0.05 });
* // [
* // {
* // score: 0.1606510728597641,
* // label: 'sunglasses',
* // box: { xmin: 347, ymin: 229, xmax: 429, ymax: 264 }
* // },
* // {
* // score: 0.08935828506946564,
* // label: 'hat',
* // box: { xmin: 38, ymin: 174, xmax: 258, ymax: 364 }
* // },
* // {
* // score: 0.08530698716640472,
* // label: 'camera',
* // box: { xmin: 187, ymin: 350, xmax: 260, ymax: 411 }
* // },
* // {
* // score: 0.08349756896495819,
* // label: 'book',
* // box: { xmin: 261, ymin: 280, xmax: 494, ymax: 425 }
* // }
* // ]
* ```
*/
export class ZeroShotObjectDetectionPipeline extends Pipeline {
/**
* Helper function to convert list [xmin, ymin, xmax, ymax] into object { "xmin": xmin, ... }
* @param {number[]} box The bounding box as a list.
* @param {boolean} asInteger Whether to cast to integers.
* @returns {Object} The bounding box as an object.
* @private
*/
private _get_bounding_box;
/**
* Detect objects (bounding boxes & classes) in the image(s) passed as inputs.
* @param {Array} images The input images.
* @param {string[]} candidate_labels What the model should recognize in the image.
* @param {Object} options The options for the classification.
* @param {number} [options.threshold] The probability necessary to make a prediction.
* @param {number} [options.topk] The number of top predictions that will be returned by the pipeline.
* If the provided number is `null` or higher than the number of predictions available, it will default
* to the number of predictions.
* @param {boolean} [options.percentage=false] Whether to return the boxes coordinates in percentage (true) or in pixels (false).
* @returns {Promise<any>} An array of classifications for each input image or a single classification object if only one input image is provided.
*/
_call(images: any[], candidate_labels: string[], { threshold, topk, percentage, }?: {
threshold?: number;
topk?: number;
percentage?: boolean;
}): Promise<any>;
}

@@ -1013,2 +1090,34 @@ /**

}
/**
* Depth estimation pipeline using any `AutoModelForDepthEstimation`. This pipeline predicts the depth of an image.
*
* **Example:** Depth estimation w/ `Xenova/dpt-hybrid-midas`
* ```javascript
* let url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg';
* let depth_estimator = await pipeline('depth-estimation', 'Xenova/dpt-hybrid-midas');
* let out = await depth_estimator(url);
* // {
* // predicted_depth: Tensor {
* // dims: [ 384, 384 ],
* // type: 'float32',
* // data: Float32Array(147456) [ 542.859130859375, 545.2833862304688, 546.1649169921875, ... ],
* // size: 147456
* // },
* // depth: RawImage {
* // data: Uint8Array(307200) [ 86, 86, 86, ... ],
* // width: 640,
* // height: 480,
* // channels: 1
* // }
* // }
* ```
*/
export class DepthEstimationPipeline extends Pipeline {
/**
* Predicts the depth for the image(s) passed as inputs.
* @param {any} images The images to compute depth for.
* @returns {Promise<any>} An image or a list of images containing result(s).
*/
_call(images: any): Promise<any>;
}
export type QuestionAnsweringResult = {

@@ -1015,0 +1124,0 @@ /**

@@ -68,5 +68,7 @@ declare const FeatureExtractor_base: new () => {

size: any;
size_divisor: any;
do_center_crop: any;
crop_size: any;
do_convert_rgb: any;
do_crop_margin: any;
pad_size: any;

@@ -87,2 +89,9 @@ do_pad: any;

/**
* Crops the margin of the image. Gray pixels are considered margin (i.e., pixels with a value below the threshold).
* @param {RawImage} image The image to be cropped.
* @param {number} gray_threshold Value below which pixels are considered to be gray.
* @returns {Promise<RawImage>} The cropped image.
*/
crop_margin(image: RawImage, gray_threshold?: number): Promise<RawImage>;
/**
* Pad the image by a certain amount.

@@ -142,2 +151,8 @@ * @param {Float32Array} pixelData The pixel data to pad.

}
export class DPTFeatureExtractor extends ImageFeatureExtractor {
}
export class GLPNFeatureExtractor extends ImageFeatureExtractor {
}
export class CLIPFeatureExtractor extends ImageFeatureExtractor {
}
export class ConvNextFeatureExtractor extends ImageFeatureExtractor {

@@ -149,2 +164,19 @@ }

}
export class OwlViTFeatureExtractor extends ImageFeatureExtractor {
/**
* Post-processes the outputs of the model (for object detection).
* @param {Object} outputs The outputs of the model that must be post-processed
* @param {Tensor} outputs.logits The logits
* @param {Tensor} outputs.pred_boxes The predicted boxes.
* @param {number} [threshold=0.5] The threshold to use for the scores.
* @param {number[][]} [target_sizes=null] The sizes of the original images.
* @param {boolean} [is_zero_shot=false] Whether zero-shot object detection was performed.
* @return {Object[]} An array of objects containing the post-processed outputs.
* @private
*/
post_process_object_detection(outputs: {
logits: Tensor;
pred_boxes: Tensor;
}, threshold?: number, target_sizes?: number[][], is_zero_shot?: boolean): any[];
}
export class DeiTFeatureExtractor extends ImageFeatureExtractor {

@@ -157,2 +189,4 @@ }

}
export class NougatImageProcessor extends DonutFeatureExtractor {
}
/**

@@ -181,3 +215,7 @@ * @typedef {object} DetrFeatureExtractorResultProps

* @param {Tensor} outputs.pred_boxes The predicted boxes.
* @param {number} [threshold=0.5] The threshold to use for the scores.
* @param {number[][]} [target_sizes=null] The sizes of the original images.
* @param {boolean} [is_zero_shot=false] Whether zero-shot object detection was performed.
* @return {Object[]} An array of objects containing the post-processed outputs.
* @private
*/

@@ -187,3 +225,3 @@ post_process_object_detection(outputs: {

pred_boxes: Tensor;
}, threshold?: number, target_sizes?: any): any[];
}, threshold?: number, target_sizes?: number[][], is_zero_shot?: boolean): any[];
/**

@@ -250,3 +288,7 @@ * Binarize the given masks using `object_mask_threshold`, it returns the associated values of `masks`, `scores` and `labels`.

* @param {Tensor} outputs.pred_boxes The predicted boxes.
* @param {number} [threshold=0.5] The threshold to use for the scores.
* @param {number[][]} [target_sizes=null] The sizes of the original images.
* @param {boolean} [is_zero_shot=false] Whether zero-shot object detection was performed.
* @return {Object[]} An array of objects containing the post-processed outputs.
* @private
*/

@@ -256,3 +298,3 @@ post_process_object_detection(outputs: {

pred_boxes: Tensor;
}, threshold?: number, target_sizes?: any): any[];
}, threshold?: number, target_sizes?: number[][], is_zero_shot?: boolean): any[];
}

@@ -434,2 +476,4 @@ /**

}
export class OwlViTProcessor extends Processor {
}
/**

@@ -470,3 +514,7 @@ * Helper class which is used to instantiate pretrained processors with the `from_pretrained` function.

MobileViTFeatureExtractor: typeof MobileViTFeatureExtractor;
OwlViTFeatureExtractor: typeof OwlViTFeatureExtractor;
CLIPFeatureExtractor: typeof CLIPFeatureExtractor;
ConvNextFeatureExtractor: typeof ConvNextFeatureExtractor;
DPTFeatureExtractor: typeof DPTFeatureExtractor;
GLPNFeatureExtractor: typeof GLPNFeatureExtractor;
BeitFeatureExtractor: typeof BeitFeatureExtractor;

@@ -477,2 +525,3 @@ DeiTFeatureExtractor: typeof DeiTFeatureExtractor;

DonutFeatureExtractor: typeof DonutFeatureExtractor;
NougatImageProcessor: typeof NougatImageProcessor;
SamImageProcessor: typeof SamImageProcessor;

@@ -488,2 +537,3 @@ Swin2SRImageProcessor: typeof Swin2SRImageProcessor;

SpeechT5Processor: typeof SpeechT5Processor;
OwlViTProcessor: typeof OwlViTProcessor;
};

@@ -490,0 +540,0 @@ /**

@@ -0,1 +1,7 @@

/**
* Helper method for adding `token_type_ids` to model inputs
* @param {Object} inputs An object containing the input ids and attention mask.
* @returns {Object} The prepared inputs object.
*/
export function add_token_types(inputs: any): any;
declare const TokenizerModel_base: new () => {

@@ -99,2 +105,4 @@ (...args: any[]): any;

sep_token_id: number;
unk_token: string;
unk_token_id: number;
model_max_length: any;

@@ -449,2 +457,4 @@ /** @type {boolean} Whether or not to strip the text when tokenizing (removing excess spaces before and after the string). */

}
export class NougatTokenizer extends PreTrainedTokenizer {
}
/**

@@ -492,2 +502,3 @@ * Helper class which is used to instantiate pretrained tokenizers with the `from_pretrained` function.

SpeechT5Tokenizer: typeof SpeechT5Tokenizer;
NougatTokenizer: typeof NougatTokenizer;
PreTrainedTokenizer: typeof PreTrainedTokenizer;

@@ -494,0 +505,0 @@ };

@@ -97,2 +97,9 @@ /**

/**
* Helper function to convert list [xmin, ymin, xmax, ymax] into object { "xmin": xmin, ... }
* @param {number[]} box The bounding box as a list.
* @param {boolean} asInteger Whether to cast to integers.
* @returns {Object} The bounding box as an object.
*/
export function get_bounding_box(box: number[], asInteger: boolean): any;
/**
* A base class for creating callable objects.

@@ -99,0 +106,0 @@ *

@@ -280,2 +280,19 @@ declare const LogitsProcessorList_base: new () => {

}
export class NoBadWordsLogitsProcessor extends LogitsProcessor {
/**
* Create a `NoBadWordsLogitsProcessor`.
* @param {number[][]} bad_words_ids List of list of token ids that are not allowed to be generated.
* @param {number|number[]} eos_token_id The id of the *end-of-sequence* token. Optionally, use a list to set multiple *end-of-sequence* tokens.
*/
constructor(bad_words_ids: number[][], eos_token_id: number | number[]);
bad_words_ids: number[][];
eos_token_id: number[];
/**
* Apply logit processor.
* @param {Array} input_ids The input IDs.
* @param {Object} logits The logits.
* @returns {Object} The processed logits.
*/
_call(input_ids: any[], logits: any): any;
}
/**

@@ -282,0 +299,0 @@ * Class that holds a configuration for a generation task.

@@ -48,2 +48,3 @@ export class RawImage {

channels: 2 | 1 | 3 | 4;
get size(): number[];
/**

@@ -76,3 +77,5 @@ * Convert the image to grayscale format.

pad([left, right, top, bottom]: [any, any, any, any]): Promise<any>;
crop([x_min, y_min, x_max, y_max]: [any, any, any, any]): Promise<any>;
center_crop(crop_width: any, crop_height: any): Promise<any>;
toBlob(type?: string, quality?: number): Promise<any>;
toCanvas(): any;

@@ -103,5 +106,5 @@ /**

*/
save(path: string): void;
save(path: string): Promise<any>;
toSharp(): any;
}
//# sourceMappingURL=image.d.ts.map

@@ -71,14 +71,14 @@ /**

* Returns the value and index of the minimum element in an array.
* @param {number[]} arr array of numbers.
* @param {number[]|TypedArray} arr array of numbers.
* @returns {number[]} the value and index of the minimum element, of the form: [valueOfMin, indexOfMin]
* @throws {Error} If array is empty.
*/
export function min(arr: number[]): number[];
export function min(arr: number[] | TypedArray): number[];
/**
* Returns the value and index of the maximum element in an array.
* @param {number[]} arr array of numbers.
* @param {number[]|TypedArray} arr array of numbers.
* @returns {number[]} the value and index of the maximum element, of the form: [valueOfMax, indexOfMax]
* @throws {Error} If array is empty.
*/
export function max(arr: number[]): number[];
export function max(arr: number[] | TypedArray): number[];
/**

@@ -85,0 +85,0 @@ * Return the Discrete Fourier Transform sample frequencies.
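A short sketch of the widened `min`/`max` signatures accepting typed arrays; the relative import path is an assumption about the library's internal layout, since these helpers are not re-exported from the package's main entry point:

```javascript
// Assumed internal module path (not part of the public API surface).
import { min, max } from './src/utils/maths.js';

const logits = new Float32Array([0.1, 2.5, -1.0]);

const [maxValue, maxIndex] = max(logits); // [2.5, 1]
const [minValue, minIndex] = min(logits); // [-1, 2]
```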

@@ -74,5 +74,4 @@ /**

export function ones_like(tensor: Tensor): Tensor;
declare const Tensor_base: any;
declare const Tensor_base: import("onnxruntime-common").TensorConstructor;
export class Tensor extends Tensor_base {
[x: string]: any;
/**

@@ -82,3 +81,3 @@ * Create a new Tensor or copy an existing Tensor.

*/
constructor(...args: [string, DataArray, number[]] | [any]);
constructor(...args: [string, DataArray, number[]] | [import("onnxruntime-common").TensorConstructor]);
/**

@@ -207,3 +206,2 @@ * Index into a Tensor object.

squeeze_(dim?: any): this;
dims: any;
/**

@@ -210,0 +208,0 @@ * Returns a new tensor with a dimension of size one inserted at the specified position.

Sorry, the diffs of the remaining files are either too big to display or not supported yet.
