semantic-chunking

semantic-chunking - npm Package Versions

jparkerweb published 2.3.0

[2.3.0] - 2024-11-11

📦 Updated

  • Updated transformers.js from v2 to v3
  • Migrated quantization option from onnxEmbeddingModelQuantized (boolean) to dtype ('fp32', 'fp16', 'q8', 'q4')
  • Updated Web UI to use new dtype option
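
For callers still passing the old boolean, the migration could be handled with a small shim like the one below. This is only a sketch: `migrateQuantizationOption` is a hypothetical helper, not part of semantic-chunking, and the mapping of `true` to 'q8' and `false` to 'fp32' is an assumption based on transformers.js v3 dtype naming, not something this changelog states.

```javascript
// Hypothetical helper, not part of semantic-chunking.
// Assumption: the legacy quantized flag corresponds to 'q8', and an
// unquantized model corresponds to 'fp32'.
const VALID_DTYPES = ['fp32', 'fp16', 'q8', 'q4'];

function migrateQuantizationOption(options) {
  // Pull the legacy flag out so it is never forwarded to the new API.
  const { onnxEmbeddingModelQuantized, ...rest } = options;

  // An explicit dtype always wins over the legacy flag.
  if (rest.dtype !== undefined) {
    if (!VALID_DTYPES.includes(rest.dtype)) {
      throw new RangeError(`Unsupported dtype: ${rest.dtype}`);
    }
    return rest;
  }

  // Nothing to migrate: leave dtype unset so the library default applies.
  if (onnxEmbeddingModelQuantized === undefined) {
    return rest;
  }

  return { ...rest, dtype: onnxEmbeddingModelQuantized ? 'q8' : 'fp32' };
}
```

The returned object can then be passed as the options argument in place of the old one; an unknown dtype string fails early with a RangeError rather than reaching the model loader.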

jparkerweb published 2.2.4

[2.2.4] - 2024-11-08

🐛 Fixed

  • Fixed issue with Web UI embedding cache not being cleared when a new model is initialized

jparkerweb published 2.2.1

[2.2.1] - 2024-11-06

✨ Added

  • Added Highlight.js to Web UI for syntax highlighting of JSON results and code samples
  • Added JSON results toggle button to turn line wrapping on/off

jparkerweb published 2.2.0

[2.2.0] - 2024-11-05

✨ Added

  • New Web UI tool for experimenting with semantic chunking settings
    • Interactive form interface for all chunking parameters
    • Real-time text processing and results display
    • Visual feedback for similarity thresholds
    • Model selection and configuration
    • Results download in JSON format
    • Code generation for settings
    • Example texts for testing
    • Dark mode interface
  • Added excludeChunkPrefixInResults option to chunkit and cramit functions
    • Allows removal of chunk prefix from final results while maintaining prefix for embedding calculations
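
The effect of excludeChunkPrefixInResults can be pictured roughly as follows. This is a minimal sketch, not the library's code: `withPrefix` and `finalizeChunks` are hypothetical helpers, the `chunkPrefix` option name is assumed, and joining the prefix as "<prefix>: <text>" is also an assumption.

```javascript
// Hypothetical helpers, not part of semantic-chunking.
// Assumption: the prefix is joined to the chunk text as "<prefix>: <text>".
function withPrefix(text, chunkPrefix) {
  return chunkPrefix ? `${chunkPrefix}: ${text}` : text;
}

function finalizeChunks(
  chunkTexts,
  { chunkPrefix = '', excludeChunkPrefixInResults = false } = {}
) {
  return chunkTexts.map((text) => {
    // The embedding model always sees the prefixed text.
    const embeddedText = withPrefix(text, chunkPrefix);
    return {
      // The returned text drops the prefix only when the option is set.
      text: excludeChunkPrefixInResults ? text : embeddedText,
      embeddedText,
    };
  });
}
```

With the option enabled, the text returned to the caller stays clean while the string used for embedding still carries the prefix.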

📦 Updated

  • Improved error handling and feedback in chunking functions
  • Enhanced documentation with Web UI usage examples
  • Added more embedding models to supported list

🐛 Fixed

  • Fixed issue with chunk prefix handling in embedding calculations
  • Improved token length calculation reliability

jparkerweb published 2.1.4

[2.1.4] - 2024-03-01

📦 Updated

  • Updated README cramit example script to use updated document object input format.

jparkerweb published 2.1.3

[2.1.3] - 2024-11-04

🐛 Fixed

  • Fixed cramit function to properly pack sentences up to maxTokenSize

📦 Updated

  • Improved chunk creation logic to better handle both chunkit and cramit modes
  • Enhanced token size calculation efficiency
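
Packing sentences up to maxTokenSize, as the cramit fix describes, amounts to a greedy fill. The sketch below illustrates that idea only. `countTokens` and `packSentences` are hypothetical names, and counting whitespace-separated words stands in for a real tokenizer, so actual token counts will differ.

```javascript
// Assumption: tokens are approximated by whitespace-separated words.
function countTokens(text) {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Greedily pack consecutive sentences into chunks of at most maxTokenSize.
// A single sentence longer than the budget still becomes its own chunk.
function packSentences(sentences, maxTokenSize) {
  const chunks = [];
  let current = [];
  let currentTokens = 0;

  for (const sentence of sentences) {
    const tokens = countTokens(sentence);
    // Close the current chunk when this sentence would exceed the budget.
    if (current.length > 0 && currentTokens + tokens > maxTokenSize) {
      chunks.push(current.join(' '));
      current = [];
      currentTokens = 0;
    }
    current.push(sentence);
    currentTokens += tokens;
  }

  // Flush whatever is left in the last chunk.
  if (current.length > 0) {
    chunks.push(current.join(' '));
  }
  return chunks;
}
```

Unlike similarity-based chunking, this mode ignores meaning entirely and only tries to fill each chunk as close to the budget as sentence boundaries allow.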

jparkerweb published 2.1.2

[2.1.2] - 2024-11-04

🐛 Fixed

  • Improved semantic chunking accuracy with stricter similarity thresholds
  • Enhanced logging in similarity calculations for better debugging
  • Fixed chunk creation to better respect semantic boundaries

📦 Updated

  • Default similarity threshold increased to 0.5
  • Default dynamic threshold bounds adjusted (0.4 - 0.8)
  • Improved chunk rebalancing logic with similarity checks
  • Updated logging for similarity scores between sentences
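
To make the threshold values concrete, here is a rough sketch of similarity-based splitting. It is not the library's algorithm: `cosineSimilarity`, `clampThreshold`, and `groupBySimilarity` are hypothetical names, embeddings are assumed to be plain numeric arrays, and how the dynamic threshold is actually derived is not covered by this changelog, so only the clamping to the 0.4 to 0.8 bounds is shown.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep a dynamically adjusted threshold inside the configured bounds.
function clampThreshold(value, lower = 0.4, upper = 0.8) {
  return Math.min(upper, Math.max(lower, value));
}

// Start a new group whenever adjacent sentences fall below the threshold.
function groupBySimilarity(sentences, embeddings, similarityThreshold = 0.5) {
  if (sentences.length === 0) return [];
  const groups = [[sentences[0]]];
  for (let i = 1; i < sentences.length; i++) {
    const similarity = cosineSimilarity(embeddings[i - 1], embeddings[i]);
    if (similarity >= similarityThreshold) {
      groups[groups.length - 1].push(sentences[i]);
    } else {
      groups.push([sentences[i]]);
    }
  }
  return groups;
}
```

A higher default threshold means adjacent sentences must be more alike to stay together, which generally yields more, smaller chunks.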

jparkerweb published 2.1.1

[2.1.1] - 2024-11-01

📦 Updated

  • Updated example scripts in README.

jparkerweb published 2.1.0

[2.1.0] - 2024-11-01

📦 Updated

  • ⚠️ BREAKING: Input format now accepts array of document objects
  • Output array of chunks extended with the following new properties:
    • document_id: Timestamp in milliseconds when processing started
    • document_name: Original document name or ""
    • number_of_chunks: Total number of chunks for the document
    • chunk_number: Current chunk number (1-based)
    • model_name: Name of the embedding model used
    • is_model_quantized: Whether the model is quantized
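
The shape of the new input and output can be illustrated with the sketch below. `annotateChunks` is a hypothetical helper, not the library's implementation. The output property names come from the list above, while the input field names `document_name` and `document_text` and the output `text` field are assumptions, and the caller supplies the actual splitting logic through `splitFn`.

```javascript
// Hypothetical helper, not part of semantic-chunking.
// Assumptions: input objects use document_name / document_text, and each
// output chunk carries its content in a `text` field.
function annotateChunks(documents, splitFn, { modelName, isModelQuantized }) {
  const results = [];
  for (const doc of documents) {
    // document_id: timestamp in milliseconds when processing started.
    const documentId = Date.now();
    const texts = splitFn(doc.document_text);

    texts.forEach((text, index) => {
      results.push({
        document_id: documentId,
        document_name: doc.document_name ?? '', // original name or ""
        number_of_chunks: texts.length,
        chunk_number: index + 1, // 1-based
        model_name: modelName,
        is_model_quantized: isModelQuantized,
        text,
      });
    });
  }
  return results;
}
```

Because every chunk carries its document name and position, results from several documents can be returned in one flat array and regrouped later.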