semantic-chunking
Changelog
[2.1.0] - 2024-11-01
document_id: Timestamp in milliseconds when processing started
document_name: Original document name or ""
number_of_chunks: Total number of chunks for the document
chunk_number: Current chunk number (1-based)
model_name: Name of the embedding model used
is_model_quantized: Whether the model is quantized
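
The metadata fields listed for 2.1.0 make it possible to regroup chunks by source document after processing. The following is a minimal sketch of that idea, not the library's own code: `groupByDocument` and the `sampleChunks` values are hypothetical, and the `text` property is assumed to hold the chunk content, as the 2.0.0 entry suggests.

```javascript
// Hypothetical sample data: chunk objects carrying the metadata fields
// named in the 2.1.0 entry. All values are made up for illustration.
const sampleChunks = [
  { document_id: 1730419200000, document_name: "a.txt", number_of_chunks: 2,
    chunk_number: 2, model_name: "example-model", is_model_quantized: true, text: "world" },
  { document_id: 1730419200500, document_name: "", number_of_chunks: 2,
    chunk_number: 1, model_name: "example-model", is_model_quantized: true, text: "Only half" },
  { document_id: 1730419200000, document_name: "a.txt", number_of_chunks: 2,
    chunk_number: 1, model_name: "example-model", is_model_quantized: true, text: "Hello" },
];

// Hypothetical helper: group chunks by document_id, order them by
// chunk_number (1-based), and flag documents with missing chunks.
function groupByDocument(chunks) {
  const byId = new Map();
  for (const chunk of chunks) {
    if (!byId.has(chunk.document_id)) byId.set(chunk.document_id, []);
    byId.get(chunk.document_id).push(chunk);
  }

  const documents = [];
  for (const [documentId, group] of byId) {
    group.sort((a, b) => a.chunk_number - b.chunk_number);
    documents.push({
      document_id: documentId,
      document_name: group[0].document_name,
      // Complete only if every chunk announced by number_of_chunks is present.
      complete: group.length === group[0].number_of_chunks,
      text: group.map((c) => c.text).join(" "),
    });
  }
  return documents;
}

console.log(groupByDocument(sampleChunks));
```

Because `chunk_number` is 1-based and `number_of_chunks` is repeated on every chunk, a consumer can detect a missing chunk without any extra bookkeeping.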
[2.0.0] - 2024-11-01
returnEmbedding option to chunkit and cramit functions to include embeddings in the output.
returnTokenLength option to chunkit and cramit functions to include token length in the output.
chunkPrefix option to prefix each chunk with a task instruction (e.g., "search_document: ", "search_query: ").
chunkPrefix with embedding models that support task prefixes.
text, embedding, and tokenLength properties. Previous versions returned an array of strings.
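
Since 2.0.0 results carry `text`, `embedding`, and `tokenLength` properties where earlier versions returned plain strings, calling code may need to accept both shapes during an upgrade. The sketch below shows one way to do that, along with what prefixing a chunk with a task instruction looks like. Both helpers (`normalizeChunks`, `applyChunkPrefix`) are hypothetical and do not call the library; how `chunkit` and `cramit` apply `chunkPrefix` internally is not described in this changelog, so the prefix is treated here as a literal string.

```javascript
// Hypothetical helper: accept either the pre-2.0.0 shape (array of strings)
// or the 2.0.0 shape (array of objects) and return objects in both cases.
function normalizeChunks(result) {
  return result.map((chunk) =>
    typeof chunk === "string"
      ? { text: chunk, embedding: null, tokenLength: null }
      : chunk
  );
}

// Hypothetical helper: prepend a task instruction to each chunk text,
// treating the prefix as a literal string such as "search_document: ".
function applyChunkPrefix(texts, chunkPrefix = "") {
  return texts.map((text) => chunkPrefix + text);
}

const legacyResult = ["first chunk", "second chunk"];
const normalized = normalizeChunks(legacyResult);
console.log(normalized.map((c) => c.text));
console.log(applyChunkPrefix(legacyResult, "search_document: "));
```

Missing values are set to `null` in the normalized form so that downstream code can distinguish "not requested" from an empty embedding.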
[1.5.1] - 2024-11-01
cramit function.
[1.5.0] - 2024-10-11
[1.4.0] - 2024-09-24
[1.3.0] - 2024-09-09
[1.1.0] - 2024-05-09
[1.0.0] - 2024-02-29