Updated transformers.js from v2 to v3
Migrated the quantization option from onnxEmbeddingModelQuantized (boolean) to dtype ('fp32', 'fp16', 'q8', 'q4')
Updated Web UI to use new dtype option
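
For callers still passing the old boolean, the option migration can be sketched as a small translation step. This is a minimal sketch, not the library's own code: `migrateQuantizationOption` is a hypothetical helper, the `maxTokenSize` key is only a sample option, and the mapping of `true` to 'q8' and `false` to 'fp32' is an assumption based on transformers.js v3 dtype naming.

```javascript
// Hypothetical helper: translate the pre-2.3.0 boolean option into a dtype string.
// Assumption: true -> 'q8' and false -> 'fp32'; the changelog does not state this mapping.
function migrateQuantizationOption(options) {
  const { onnxEmbeddingModelQuantized, ...rest } = options;
  // An explicit dtype wins, and there is nothing to do without the legacy flag.
  if (rest.dtype !== undefined || onnxEmbeddingModelQuantized === undefined) {
    return rest;
  }
  return { ...rest, dtype: onnxEmbeddingModelQuantized ? "q8" : "fp32" };
}

const legacyOptions = { maxTokenSize: 500, onnxEmbeddingModelQuantized: true };
const migratedOptions = migrateQuantizationOption(legacyOptions);
console.log(migratedOptions); // { maxTokenSize: 500, dtype: 'q8' }
```

The legacy key is always dropped from the returned object, so the result can be passed on without carrying both options at once.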

2.2.4

jparkerweb

published 2.2.4 • 2 months ago

[2.2.4] - 2024-11-08

🐛 Fixed

Fixed issue with Web UI embedding cache not being cleared when a new model is initialized

2.2.1

[2.2.1] - 2024-11-06

✨ Added

Added Highlight.js to Web UI for syntax highlighting of JSON results and code samples
Added JSON results toggle button to turn line wrapping on/off

2.2.0

[2.2.0] - 2024-11-05

✨ Added

New Web UI tool for experimenting with semantic chunking settings
- Interactive form interface for all chunking parameters
- Real-time text processing and results display
- Visual feedback for similarity thresholds
- Model selection and configuration
- Results download in JSON format
- Code generation for settings
- Example texts for testing
- Dark mode interface
Added excludeChunkPrefixInResults option to chunkit and cramit functions
- Allows removal of chunk prefix from final results while maintaining prefix for embedding calculations
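
The effect of excludeChunkPrefixInResults can be illustrated with a small sketch: the prefix stays on the text that is embedded and is dropped only from the text handed back. `buildChunkTexts` is a hypothetical function, and the `prefix: text` separator is an assumption, not something the changelog specifies.

```javascript
// Hypothetical sketch: keep the prefix for embedding, optionally strip it from results.
// Assumption: the prefix is joined to the chunk body with ": ".
function buildChunkTexts(sentences, chunkPrefix, excludeChunkPrefixInResults) {
  const body = sentences.join(" ");
  const prefixed = chunkPrefix ? `${chunkPrefix}: ${body}` : body;
  return {
    textForEmbedding: prefixed, // always carries the prefix when one is set
    textForResult: excludeChunkPrefixInResults ? body : prefixed,
  };
}

const prefixDemo = buildChunkTexts(["Cats purr.", "Dogs bark."], "search_document", true);
console.log(prefixDemo.textForEmbedding); // search_document: Cats purr. Dogs bark.
console.log(prefixDemo.textForResult); // Cats purr. Dogs bark.
```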

📦 Updated

Improved error handling and feedback in chunking functions
Enhanced documentation with Web UI usage examples
Added more embedding models to supported list

🐛 Fixed

Fixed issue with chunk prefix handling in embedding calculations
Improved token length calculation reliability

2.1.4

[2.1.4] - 2024-03-01

📦 Updated

Updated README cramit example script to use updated document object input format.

2.1.3

[2.1.3] - 2024-11-04

🐛 Fixed

Fixed cramit function to properly pack sentences up to maxTokenSize

📦 Updated

Improved chunk creation logic to better handle both chunkit and cramit modes
Enhanced token size calculation efficiency
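
Packing sentences up to maxTokenSize, as described for cramit above, amounts to a greedy fill. The sketch below is one plausible reading, not the library's implementation: `packSentences` is a hypothetical name, and `countTokens` is a whitespace-based stand-in for a real tokenizer.

```javascript
// Stand-in token counter: whitespace-separated words. A real tokenizer would count differently.
const countTokens = (text) => text.split(/\s+/).filter(Boolean).length;

// Hypothetical greedy packer: fill each chunk until the next sentence would exceed the limit.
function packSentences(sentences, maxTokenSize) {
  const chunks = [];
  let current = [];
  let currentTokens = 0;
  for (const sentence of sentences) {
    const tokens = countTokens(sentence);
    // Flush the current chunk when adding this sentence would go over the limit.
    if (current.length > 0 && currentTokens + tokens > maxTokenSize) {
      chunks.push(current.join(" "));
      current = [];
      currentTokens = 0;
    }
    // A single sentence longer than the limit still becomes its own chunk.
    current.push(sentence);
    currentTokens += tokens;
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}

const packed = packSentences(
  ["One two three.", "Four five.", "Six seven eight nine.", "Ten."],
  5
);
console.log(packed); // [ 'One two three. Four five.', 'Six seven eight nine. Ten.' ]
```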

2.1.2

[2.1.2] - 2024-11-04

🐛 Fixed

Improved semantic chunking accuracy with stricter similarity thresholds
Enhanced logging in similarity calculations for better debugging
Fixed chunk creation to better respect semantic boundaries

📦 Updated

Default similarity threshold increased to 0.5
Default dynamic threshold bounds adjusted (0.4 - 0.8)
Improved chunk rebalancing logic with similarity checks
Updated logging for similarity scores between sentences
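
How a similarity threshold and its bounds interact can be sketched as follows. This is an illustration under stated assumptions rather than the library's algorithm: the function names are hypothetical, the defaults (0.5, and bounds of 0.4 and 0.8) are taken from the entries above, and splitting on the similarity between adjacent sentences is one common approach to semantic chunking.

```javascript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep a dynamically chosen threshold inside the configured bounds.
const clampThreshold = (value, lower = 0.4, upper = 0.8) =>
  Math.min(upper, Math.max(lower, value));

// Hypothetical splitter: start a new chunk when adjacent sentences are too dissimilar.
function splitBySimilarity(sentences, embeddings, threshold = 0.5) {
  if (sentences.length === 0) return [];
  const chunks = [[sentences[0]]];
  for (let i = 1; i < sentences.length; i++) {
    const similarity = cosineSimilarity(embeddings[i - 1], embeddings[i]);
    if (similarity < threshold) chunks.push([]);
    chunks[chunks.length - 1].push(sentences[i]);
  }
  return chunks;
}

// Toy 2-d vectors standing in for real sentence embeddings.
const grouped = splitBySimilarity(["a", "b", "c"], [[1, 0], [1, 0], [0, 1]]);
console.log(grouped); // [ [ 'a', 'b' ], [ 'c' ] ]
```

A higher threshold produces more, smaller chunks, which is consistent with the "stricter" behaviour these entries describe.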

2.1.1

[2.1.1] - 2024-11-01

📦 Updated

Updated example scripts in README.

2.1.0

[2.1.0] - 2024-11-01

📦 Updated

⚠️ BREAKING: Input format now accepts array of document objects
Output array of chunks extended with the following new properties:
- document_id: Timestamp in milliseconds when processing started
- document_name: Original document name or ""
- number_of_chunks: Total number of chunks for the document
- chunk_number: Current chunk number (1-based)
- model_name: Name of the embedding model used
- is_model_quantized: Whether the model is quantized
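
The output properties listed above can be pictured with a short sketch. `annotateChunks` is a hypothetical function, and `document_text` on the input plus `text` on each chunk are assumed field names; only the six listed keys come from this entry.

```javascript
// Hypothetical sketch of the 2.1.0 output shape.
// Assumptions: input field `document_text` and per-chunk field `text` are illustrative names.
function annotateChunks(document, chunkTexts, modelName, isModelQuantized) {
  const documentId = Date.now(); // timestamp in milliseconds when processing started
  return chunkTexts.map((text, index) => ({
    document_id: documentId,
    document_name: document.document_name ?? "",
    number_of_chunks: chunkTexts.length,
    chunk_number: index + 1, // 1-based
    model_name: modelName,
    is_model_quantized: isModelQuantized,
    text,
  }));
}

// Input is an array of document objects rather than a bare string.
const documents = [{ document_name: "notes", document_text: "First part. Second part." }];
const annotated = annotateChunks(documents[0], ["First part.", "Second part."], "example-model", true);
console.log(annotated[1].chunk_number, annotated[1].number_of_chunks); // 2 2
```

Every chunk from the same document shares one document_id, so chunks can be grouped back by document after processing.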

semantic-chunking - npm Package Versions

2.3.1

2.3.0

[2.3.0] - 2024-11-11

📦 Updated
