New Case Study:See how Anthropic automated 95% of dependency reviews with Socket.Learn More
Socket
Sign inDemoInstall
Socket

better-sqlite3-schema

Package Overview
Dependencies
Maintainers
1
Versions
35
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

better-sqlite3-schema - npm Package Compare versions

Comparing version 0.2.0 to 0.3.0

codegen.d.ts

37

package.json
{
"name": "better-sqlite3-schema",
"version": "0.2.0",
"version": "0.3.0",
"description": "Migrate (nested and multi-dimensional) json data to/from sqlite database with better-sqlite3-helper",

@@ -17,4 +17,4 @@ "keywords": [

"license": "BSD-2-Clause",
"main": "dist/index.js",
"types": "dist/index.d.ts",
"main": "index.js",
"types": "index.d.ts",
"scripts": {

@@ -27,17 +27,9 @@ "test": "npm run rawtest",

"build": "tsc -p tsconfig.build.json",
"postbuild": "mk-package",
"pretest": "npm run format && npm run build",
"rawtest": "ts-node test/index.ts && jest",
"prepublishOnly": "npm run rawtest && npm run build"
"rawtest": "ts-node test/index.ts && jest"
},
"husky": {
"hook": {
"pre-commit": "npm run format"
}
},
"directories": {
"test": "test"
},
"files": [
"dist"
],
"dependencies": {

@@ -48,19 +40,2 @@ "better-sqlite3": "^7.1.0",

},
"devDependencies": {
"@beenotung/tslib": "^16.35.0",
"@types/better-sqlite3": "^5.4.0",
"@types/jest": "^26.0.14",
"@types/node": "*",
"husky": "^4.3.0",
"jest": "^26.4.2",
"prettier": "^2.1.2",
"rimraf": "^3.0.2",
"ts-jest": "^26.4.1",
"ts-node": "^9.0.0",
"tslint": "^6.1.3",
"tslint-config-prettier": "^1.18.0",
"tslint-eslint-rules": "^5.4.0",
"tslint-etc": "^1.13.7",
"typescript": "^4.0.3"
},
"repository": {

@@ -74,2 +49,2 @@ "type": "git",

"homepage": "https://github.com/beenotung/better-sqlite3-schema#readme"
}
}

@@ -70,2 +70,97 @@ # better-sqlite3-schema

Detail example see `makePredefinedInsertRowFn()` and `makeGeneralInsertRowFn()` in [sample-test.ts](./test/sample-test.ts)
### Functional Approach (compose at runtime)
The functional approach allows one to compose customizable helper functions at runtime.
Explore the dataset and auto built schema with
- `makeSchemaScanner()`
Compose insert functions with
- `makeInsertRowFnFromSchema()`
- `makeDeduplicatedInsertRowFnFromSchema()`
Compose select functions with
- `makeSelectRowFnFromSchema()`
- `makeSelectRefFieldArray()`
- `makeGetRefValueFnFromSchema()`
Detail example see `makePredefinedInsertRowFn()` and `makeGeneralInsertRowFn()` in [functional-test.ts](./example/functional-test.ts)
### Code Generation Approach (compose at build-time)
The code generation approach allows one to compose customizable helper functions at build-time. Which can archive ~50% speed up compared to the runtime composing.
## Benchmark
### Sample 1: HTTP Proxy Log
8GiB of HTTP proxy server log.
Each line is a compact json text.
Sample text:
```json
{"timestamp":1600713130016,"type":"request","userAgent":"Mozilla/5.0 (Linux; Android 10; LIO-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Mobile Safari/537.36","referer":"https://www.example.net/sw.js","protocol":"https","host":"www.example.net","method":"GET","url":"/build/p-7794655c.js"}
```
When stored into sqlite3, the data are normalized into multiple tables to avoid duplication, e.g. only storing the full text of each type of user agent and url once.
File size in varies format:
| storage | size | size compared with plain text | Remark |
|---|---|---|---|
| plain text | 8256M | - | |
| sqlite without index | 920M | 11.1% | |
| zip of non-indexed sqlite file | 220M | 2.7% | 23.9% of sqlite3 file |
| sqlite with indices | 1147M | 13.9% | +24% of sqlite file |
| zip of indexed sqlite file | 268M | 3.2% | 23.4% of indexed sqlite3 file |
Time used to import:
- 6 minutes 10 seconds: with inlined helper functions with code generation
- 14 minutes: with runtime-composed helper functions
Optimization used:
- code generation from schema
- bulk insert (batch each 8K items with a transaction)
- cache id of normalized, repeatable values (with js object)
- create unique index on normalized values
- `PRAGMA synchronous = OFF`
- `PRAGMA journal_mode = MEMORY`
- `PRAGMA cache_size = ${(200 * 1000 ** 2) / 4}`
(default page size is 4K, we largely increase the cache_size to avoid massive tedious disk write)
**Remark**:
Using index increases the file size by 1/4, but hugely speeds up the import process.
To archive the best of both aspects, create indices during import;
and remove indices (then VACUUM) for archive file.
It takes 4.9s to build the indices;
and 16.3s to vacuum the database after removal of indices.
### Sample 2: Online Forum Data
291119 sample json data crawled from online forum (threads and posts)
Total size: 843M
The objects have consistent shape.
Some data are duplicated, e.g. user name, and some common comments.
Same as the dataset used in [binary-object](https://github.com/beenotung/binary-object)
File size in varies format:
| storage | size |
|---|---|
| json text | 843M |
| sqlite3 with index | 669M |
| sqlite3 without index | 628M |
| zip of sqlite3 without index | 171M |
**Remark**:
The data in sqlite3 are normalized to avoid duplication
SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc