
A command-line interface (CLI) for training and searching Boox datasets.
Install boox-cli globally using npm or yarn:
npm install -g boox-cli
# Or
yarn global add boox-cli
To train a Boox dataset, use the train command:
boox-cli train <source> [destination] [options]
<source>: The path to your dataset file (JSON format).
[destination]: (Optional) The path where the trained data will be saved. Defaults to the current directory.
Options:
-i, --id <field>: The field in your dataset objects that uniquely identifies each document (default: 'id').
-f, --features <fields...>: The fields to index for search (multiple fields can be specified).
-a, --attributes <fields...>: The fields to include as-is without indexing (multiple fields can be specified).
-d, --deflate: Compress the trained data as a .dat file (default: false).
-c, --cwd <folder>: The working directory (default: current directory).
-r, --rcname <name>: The name of the Boox configuration file (default: 'boox').
Example:
boox-cli train data/products.json -f title description -a price
This command will train a Boox dataset from the data/products.json file, indexing the title and description fields for search and including the price field as-is. The trained data will be saved as a compressed .gz file.
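For reference, here is a minimal sketch of what a source dataset for the command above might look like. It assumes the dataset is a JSON array of flat objects; the product records and the file name make-dataset.js are hypothetical and only mirror the id, title, description, and price fields used in the example.

```javascript
// make-dataset.js (hypothetical helper, not part of boox-cli)
// Assumption: the dataset is a JSON array of objects, one per document.
const products = [
  {
    id: 1, // matched by the default --id field ('id')
    title: 'Trail running shoes', // indexed via -f title
    description: 'Lightweight shoes with a grippy sole.', // indexed via -f description
    price: 89.99 // kept as-is via -a price
  },
  {
    id: 2,
    title: 'Canvas backpack',
    description: 'A 20-litre everyday backpack.',
    price: 45
  }
]

// Serialize the dataset; redirect stdout to a file to save it.
const json = JSON.stringify(products, null, 2)
console.log(json)
```

Running `node make-dataset.js > data/products.json` would produce a file that could then be passed as <source> to the train command.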
To search a trained Boox dataset, use the search command:
boox-cli search <source> <query> [options]
<source>: The path to the trained dataset file (.dat or .gz).
<query>: The search query string.
Options:
-o, --offset <number>: The offset for pagination (default: '1').
-l, --length <number>: The number of results per page (default: '10').
-k, --context <field>: Display the context instead of the paginated results object.
-a, --attrs <fields...>: Fields to display when --context is provided.
-d, --deflate: Assume the trained data is deflated as a .dat file (default: false).
-c, --cwd <folder>: The working directory (default: current directory).
-r, --rcname <name>: The name of the Boox configuration file (default: 'boox').
Example:
boox-cli search data/products-trained.gz "shoes" -o 2 -l 20
This command will search the data/products-trained.gz dataset for documents containing the word "shoes", starting from the second page and displaying 20 results per page.
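To make the pagination options concrete, the sketch below shows one way --offset and --length could map onto a list of results. It assumes --offset is a 1-based page number, as the "second page" wording above suggests; the paginate function and the doc-N identifiers are hypothetical and are not taken from the boox-cli source.

```javascript
// Assumption: offset is a 1-based page number, length is the page size.
function paginate(results, offset = 1, length = 10) {
  const start = (offset - 1) * length
  return results.slice(start, start + length)
}

// 45 placeholder hits: 'doc-1' ... 'doc-45'
const hits = Array.from({ length: 45 }, (_, i) => `doc-${i + 1}`)

// Equivalent of `-o 2 -l 20`: the second page of 20 results
const page2 = paginate(hits, 2, 20)
console.log(page2[0], page2[page2.length - 1]) // doc-21 doc-40
```

Under this reading, a third page with the same length would hold only the remaining five results.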
You can create a Boox configuration file in your project's root directory to specify default options for the boox-cli train and boox-cli search commands:
.booxrc
.booxrc.json
.booxrc.{yaml,yml}
.boox.{mjs,cjs,js}
boox.config.{mjs,cjs,js}
Before using the example below, make sure to install the required libraries:
npm install -D double-metaphone stemmer stopword marked marked-plaintify
Here's an example of a Boox configuration file:
// boox.config.js
import { doubleMetaphone } from 'double-metaphone'
import { Marked } from 'marked'
import markedPlaintify from 'marked-plaintify'
import { stemmer } from 'stemmer'
import { removeStopwords } from 'stopword'

const marked = new Marked({ gfm: true }).use(markedPlaintify())
const wordRegexp = /\b\w+\b/g

/** @type {() => import('boox').BooxOptions} */
export default function defineBooxConfig() {
  return {
    id: 'customId',
    features: ['title', 'content', 'tags'],
    attributes: ['author', 'date'],
    modelOptions: {
      normalizer(input) {
        // Remove Markdown formatting
        return marked.parse(input)
      },
      tokenizer(input) {
        const tokens = Array.from(input.match(wordRegexp) || [])
        return removeStopwords(tokens)
      },
      stemmer: stemmer,
      phonetic: doubleMetaphone
    }
  }
}
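To see what the tokenizer step in this configuration does without installing anything, here is a dependency-free sketch. It reuses the same word regular expression, but the small stopWords set is a hypothetical stand-in for the stopword package, and the lowercasing is a simplification added here so the set lookup works.

```javascript
const wordRegexp = /\b\w+\b/g

// Hypothetical stop-word list standing in for the `stopword` package
const stopWords = new Set(['the', 'a', 'of', 'and'])

function tokenize(input) {
  // Split the input into word tokens; match() returns null when nothing matches
  const tokens = input.toLowerCase().match(wordRegexp) || []
  // Drop common words that carry little search value
  return tokens.filter((token) => !stopWords.has(token))
}

console.log(tokenize('The art of the search index'))
// logs: [ 'art', 'search', 'index' ]
```

The remaining tokens are what a stemmer and a phonetic encoder would then receive in a configuration like the one above.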
The --rcname flag allows you to customize the name of the configuration file. For example, to use a configuration file named my-appname.config.js, you would run the following command:
boox-cli train src/dataset.json --rcname my-appname
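The sketch below illustrates how a custom --rcname might translate into candidate configuration file names. It simply substitutes the name into the file-name patterns listed earlier; the candidateConfigNames function is hypothetical, and whether boox-cli searches exactly these names, or in this order, is an assumption.

```javascript
// Hypothetical helper: expands the documented file-name patterns for a given rcname.
function candidateConfigNames(rcname = 'boox') {
  const scriptExts = ['mjs', 'cjs', 'js']
  return [
    `.${rcname}rc`,
    `.${rcname}rc.json`,
    `.${rcname}rc.yaml`,
    `.${rcname}rc.yml`,
    ...scriptExts.map((ext) => `.${rcname}.${ext}`),
    ...scriptExts.map((ext) => `${rcname}.config.${ext}`)
  ]
}

console.log(candidateConfigNames('my-appname'))
```

With the name my-appname, the list includes my-appname.config.js, the file used in the example above.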