Tools for running data in a SQLite database through AWS Comprehend
See "sqlite-comprehend: run AWS entity extraction against content in a SQLite database" for background on this project.
Install this tool using pip:
pip install sqlite-comprehend
You can see examples of tables generated using this command here:
You will need AWS credentials with the comprehend:BatchDetectEntities IAM permission.
You can configure credentials using these instructions. You can also save them to a JSON or INI configuration file and pass them to the command using -a credentials.ini, or pass them using the --access-key and --secret-key options.
The sqlite-comprehend entities command runs entity extraction against every row in the specified table and saves the results to your database.
Specify the database, the table and one or more columns containing text in that table. The following runs against the text column in the pages table of the sfms.db SQLite database:
sqlite-comprehend entities sfms.db pages text
Results will be written into a pages_comprehend_entities table. Change the name of the output table by passing -o other_table_name.
You can run against a subset of rows by adding a --where clause:
sqlite-comprehend entities sfms.db pages text --where 'id < 10'
You can also use named parameters in your --where clause:
sqlite-comprehend entities sfms.db pages text --where 'id < :maxid' -p maxid 10
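Named parameters such as :maxid are standard SQLite placeholders: the clause keeps the placeholder and the value is supplied separately. The sketch below shows the same binding with Python's sqlite3 module against a made-up in-memory pages table. It illustrates the SQL mechanism only and is not taken from the tool's own code.

```python
import sqlite3

# Hypothetical in-memory stand-in for the pages table in sfms.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO pages (id, text) VALUES (?, ?)",
    [(i, f"page {i}") for i in range(1, 21)],
)

# The WHERE clause contains the :maxid placeholder; the value is bound
# separately, so nothing has to be quoted inside the clause itself.
where = "id < :maxid"
params = {"maxid": 10}
rows = conn.execute(
    f"SELECT id FROM pages WHERE {where} ORDER BY id", params
).fetchall()
selected_ids = [row[0] for row in rows]
print(selected_ids)
```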
Only the first 5,000 characters of each row will be considered. Be sure to review Comprehend's pricing, which starts at $0.0001 per hundred characters.
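To get a feel for what a run might cost, the two figures above can be combined into a rough estimate. The sketch below is an approximation under stated assumptions: it uses the starting price quoted above, rounds each row up to a whole 100-character unit, and ignores any minimum charges, free tier, or volume discounts. The function name estimate_cost is made up for this example, so check the current pricing page before relying on the result.

```python
import math

MAX_CHARS = 5000         # per-row limit stated above
UNIT_SIZE = 100          # pricing is quoted per hundred characters
PRICE_PER_UNIT = 0.0001  # starting price in US dollars, as stated above


def estimate_cost(texts):
    """Return a rough cost estimate in US dollars for a list of row texts."""
    units = 0
    for text in texts:
        # Only the first 5,000 characters of each row are considered.
        considered = min(len(text), MAX_CHARS)
        # Assumption: a partial unit is billed as a whole unit.
        units += math.ceil(considered / UNIT_SIZE)
    return units * PRICE_PER_UNIT


# One 250-character row (3 units) and one 6,000-character row
# (truncated to 5,000 characters, 50 units): 53 units in total.
print(f"${estimate_cost(['a' * 250, 'b' * 6000]):.4f}")
```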
If your content includes HTML tags, you can strip them out before extracting entities by adding --strip-tags:
sqlite-comprehend entities sfms.db pages text --strip-tags
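Stripping tags matters because markup counts towards the character limit and can confuse entity detection. As an illustration of what tag stripping involves, here is a minimal sketch using Python's standard html.parser module. It is one possible approach and not necessarily how --strip-tags is implemented; the helper name strip_tags is made up for this example.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text content and discards all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; character references such as
        # &amp; are already decoded by the parser's default settings.
        self.parts.append(data)


def strip_tags(html):
    collector = _TextCollector()
    collector.feed(html)
    collector.close()  # flush any buffered trailing text
    return "".join(collector.parts)


print(strip_tags("<p>Hello <b>world</b></p>"))
```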
Rows that have been processed are recorded in the pages_comprehend_entities_done table. If you run the command more than once it will only process rows that have been newly added.
You can delete records from that _done table to run them again.
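Deleting from the _done table can be done with any SQLite client. The sketch below uses Python's sqlite3 module on a made-up in-memory database that mirrors the pages and pages_comprehend_entities_done tables. The final query is only an assumption about how unprocessed rows could be identified, and the sketch does not touch any entity rows already stored for the page.

```python
import sqlite3

# Hypothetical in-memory database mirroring the two relevant tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute(
    "CREATE TABLE pages_comprehend_entities_done "
    "(id INTEGER PRIMARY KEY REFERENCES pages(id))"
)
conn.executemany(
    "INSERT INTO pages (id, text) VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)
# Pretend all three pages have already been processed.
conn.executemany(
    "INSERT INTO pages_comprehend_entities_done (id) VALUES (?)",
    [(1,), (2,), (3,)],
)

# Remove the record for page 2 so that it counts as unprocessed again.
conn.execute("DELETE FROM pages_comprehend_entities_done WHERE id = ?", (2,))
conn.commit()

# Assumed check: pages with no matching row in the _done table.
pending = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM pages WHERE id NOT IN "
        "(SELECT id FROM pages_comprehend_entities_done) ORDER BY id"
    )
]
print(pending)
```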
Usage: sqlite-comprehend entities [OPTIONS] DATABASE TABLE COLUMNS...

  Detect entities in columns in a table

  To extract entities from columns text1 and text2 in mytable:

      sqlite-comprehend entities my.db mytable text1 text2

  To run against just a subset of the rows in the table, add:

      --where "id < :max_id" -p max_id 50

  Results will be written to a table called mytable_comprehend_entities

  To specify a different output table, use -o custom_table_name

Options:
  --where TEXT                WHERE clause to filter table
  -p, --param <TEXT TEXT>...  Named :parameters for SQL query
  -o, --output TEXT           Custom output table
  -r, --reset                 Start from scratch, deleting previous results
  --strip-tags                Strip HTML tags before extracting entities
  --access-key TEXT           AWS access key ID
  --secret-key TEXT           AWS secret access key
  --session-token TEXT        AWS session token
  --endpoint-url TEXT         Custom endpoint URL
  -a, --auth FILENAME         Path to JSON/INI file containing credentials
  --help                      Show this message and exit.
Assuming an input table called pages, the tables created by this tool will have the following schema:
CREATE TABLE [pages] (
   [id] INTEGER PRIMARY KEY,
   [text] TEXT
);
CREATE TABLE [comprehend_entity_types] (
   [id] INTEGER PRIMARY KEY,
   [value] TEXT
);
CREATE TABLE [comprehend_entities] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [type] INTEGER REFERENCES [comprehend_entity_types]([id])
);
CREATE TABLE [pages_comprehend_entities] (
   [id] INTEGER REFERENCES [pages]([id]),
   [score] FLOAT,
   [entity] INTEGER REFERENCES [comprehend_entities]([id]),
   [begin_offset] INTEGER,
   [end_offset] INTEGER
);
CREATE UNIQUE INDEX [idx_comprehend_entity_types_value]
    ON [comprehend_entity_types] ([value]);
CREATE UNIQUE INDEX [idx_comprehend_entities_type_name]
    ON [comprehend_entities] ([type], [name]);
CREATE TABLE [pages_comprehend_entities_done] (
   [id] INTEGER PRIMARY KEY REFERENCES [pages]([id])
);
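Because entity names and types are stored in separate lookup tables, reading the results back usually means joining three tables. The sketch below builds a simplified copy of the schema in memory, inserts a few made-up rows (the page text, entities, scores and offsets are invented for illustration), and lists each entity with its type. It is one way to query the output, not a feature of the tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified copy of the schema above (indexes and _done table omitted).
conn.executescript(
    """
    CREATE TABLE pages (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE comprehend_entity_types (id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE comprehend_entities (
        id INTEGER PRIMARY KEY,
        name TEXT,
        type INTEGER REFERENCES comprehend_entity_types(id)
    );
    CREATE TABLE pages_comprehend_entities (
        id INTEGER REFERENCES pages(id),
        score FLOAT,
        entity INTEGER REFERENCES comprehend_entities(id),
        begin_offset INTEGER,
        end_offset INTEGER
    );
    """
)

# Made-up sample data standing in for real extraction results.
conn.execute("INSERT INTO pages VALUES (1, 'Jane Doe visited Seattle.')")
conn.executemany(
    "INSERT INTO comprehend_entity_types VALUES (?, ?)",
    [(1, "PERSON"), (2, "LOCATION")],
)
conn.executemany(
    "INSERT INTO comprehend_entities VALUES (?, ?, ?)",
    [(1, "Jane Doe", 1), (2, "Seattle", 2)],
)
conn.executemany(
    "INSERT INTO pages_comprehend_entities VALUES (?, ?, ?, ?, ?)",
    [(1, 0.99, 1, 0, 8), (1, 0.98, 2, 17, 24)],
)

# Resolve each detected entity to its name and type, in document order.
results = conn.execute(
    """
    SELECT pe.id, e.name, t.value
    FROM pages_comprehend_entities AS pe
    JOIN comprehend_entities AS e ON e.id = pe.entity
    JOIN comprehend_entity_types AS t ON t.id = e.type
    ORDER BY pe.id, pe.begin_offset
    """
).fetchall()

for page_id, name, entity_type in results:
    print(page_id, name, entity_type)
```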
To contribute to this tool, first check out the code. Then create a new virtual environment:
cd sqlite-comprehend
python -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest