@icekuma/bktdb
Comparing version 1.0.6 to 1.0.7
{
  "name": "@icekuma/bktdb",
- "version": "1.0.6",
+ "version": "1.0.7",
  "description": "",
@@ -13,2 +13,4 @@ "main": "dist/index.js",
  "@types/node": "^14.14.37",
+ "grpc": "^1.24.9",
+ "grpc-tools": "^1.11.1",
  "mongodb": "^3.6.5"
@@ -15,0 +17,0 @@ },
readme.md
@@ -1,1 +0,224 @@
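# bktdb
## Concepts
- bucket: a data bucket. Logically it is a collection of one kind of data. Its granularity is defined by the business layer and may change as the business evolves. It is the smallest unit of operation for both business code and operations work.
- vtable: a virtual table, that is, a set of buckets of the same kind. When the data volume is too large, the data is usually split into multiple buckets by hash.
![](./bucket.png)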
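## Interfaces
All interface operations are bucket-based. The interfaces are used the same way whether or not the data is sharded.
Business interfaces
- `add(item)`: insert one record
- `list(filter, limit, offset, sort)`: enumerate records. This operation is not idempotent: the same conditions may return different results across calls, because the data set keeps changing and because of the large-data search algorithm.
- `remove(filter)`: delete records
- `update(filter, item)`: update records
- `search()`: search backed by ES. The data must first be synced to ES with `vTableToEsFull` and `vTableToEsChange`.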
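To make the shape of the business interface concrete, here is a minimal in-memory sketch. It is not the bktdb implementation: the class name `MemoryBucket`, the equality-only filter, the `[field, direction]` sort format, the default `limit`, and the return values are all assumptions made for illustration, and the real calls are most likely asynchronous because they go through MongoDB.

```javascript
// Hypothetical in-memory stand-in for a bucket; NOT the bktdb implementation.
class MemoryBucket {
  constructor() {
    this.items = [];
  }

  // True when every key in the filter equals the item's value (equality-only filter).
  static matches(item, filter) {
    return Object.keys(filter).every((k) => item[k] === filter[k]);
  }

  add(item) {
    this.items.push({ ...item });
  }

  // sort is an assumed [field, direction] pair, e.g. ['size', -1] for descending.
  list(filter = {}, limit = 10, offset = 0, sort = null) {
    let out = this.items.filter((it) => MemoryBucket.matches(it, filter));
    if (sort) {
      const [field, dir] = sort;
      out = [...out].sort((a, b) => {
        if (a[field] < b[field]) return -dir;
        if (a[field] > b[field]) return dir;
        return 0;
      });
    }
    return out.slice(offset, offset + limit).map((it) => ({ ...it }));
  }

  // Returns the number of removed items.
  remove(filter) {
    const before = this.items.length;
    this.items = this.items.filter((it) => !MemoryBucket.matches(it, filter));
    return before - this.items.length;
  }

  // Merges the given fields into every matching item; returns the match count.
  update(filter, item) {
    let n = 0;
    for (const it of this.items) {
      if (MemoryBucket.matches(it, filter)) {
        Object.assign(it, item);
        n += 1;
      }
    }
    return n;
  }
}
```

Because every call goes through one bucket object, the caller does not need to know whether the underlying data is sharded.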
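Operations interfaces
For scenarios where kv or user data grows explosively (database and table sharding)
- `bucketReMap()`: remap bucket -> collection to cope with large data volumes. Mapping types: Hash, Range, Single, date. It listens to oplog data changes, so the upgrade is close to seamless (indexes are migrated as well). Single database only.
- `bucketMigrate`: migrate a bucket, for example moving hot data to high-performance SSD
- `bucketReMeta()`: rebuild the meta information, usually run after remap or migrate
- `bucketSwitch()`: switch buckets after a migration (rolling bucket upgrade). During the upgrade the bucket is locked read-only, data integrity is verified, and then the configuration is switched (the temporary table must be renamed).
- full migration
- hot-data migration
- `bucketVerify()`: data verification. Checks that the old and new buckets hold the same data by comparing every field of every Doc.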
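The field-by-field comparison that `bucketVerify()` describes could look roughly like the sketch below. The function name `verifyBuckets`, the `_id` default, and the shape of the returned diff records are assumptions; the package's actual verification logic is not shown in this README.

```javascript
// Hypothetical helper: compare old and new bucket contents doc by doc, field by field.
function verifyBuckets(oldDocs, newDocs, idField = '_id') {
  const diffs = [];
  const newById = new Map(newDocs.map((d) => [d[idField], d]));

  for (const oldDoc of oldDocs) {
    const id = oldDoc[idField];
    const newDoc = newById.get(id);
    if (!newDoc) {
      diffs.push({ id, reason: 'missing in new bucket' });
      continue;
    }
    newById.delete(id);

    const fields = new Set([...Object.keys(oldDoc), ...Object.keys(newDoc)]);
    for (const f of fields) {
      // JSON comparison is a simplification: it is sensitive to key order in nested objects.
      if (JSON.stringify(oldDoc[f]) !== JSON.stringify(newDoc[f])) {
        diffs.push({ id, reason: `field "${f}" differs` });
      }
    }
  }

  // Anything left in the map exists only in the new bucket.
  for (const id of newById.keys()) {
    diffs.push({ id, reason: 'missing in old bucket' });
  }
  return diffs;
}
```

An empty result means the two buckets match; a switch would only proceed in that case.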
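For data upgrades?
- `bucketUpgrade()`: upgrade bucket fields. Rarely needed with mgo: usually old Docs stay compatible and new Docs are distinguished by their fields, unlike MySQL, where the table schema is strictly consistent.
For operations statistics
- `bucketScan(callback)`: scan a bucket; used for statistics and similar common tasks
[DEMO](./demo/demo.js)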
Business code should avoid the following operations (unless they are infrequent)
1. Global scans of a vtable
2. Statistics across buckets
3.
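## 2021-04-02
Three scenarios:
1. kv data, such as file stor: can be distributed into N buckets by hash
2. User data: each user is treated as one bucket, for example a user's personal-file bucket; one table holds multiple buckets
3. Resource data, such as SKUs and templates: no sharding is needed below 100 million records; enumerating across buckets is prohibited
Features:
1. A bucket can live anywhere (this needs separate configuration; by default all buckets are in one db)
2. The bucket is the smallest unit for business and operations work, resource scheduling, etc.
3. Buckets can be split by the hash or by the value of a field
4. Each application owns one db exclusively
5. Inside the db, the application manages its own collections
Extensions:
1. Bucket locks
2. Rolling upgrades
3. Data migration
demo:
Cloud-drive scenario
bid = <name>_<type>_<num/value>
item -> bucket
1. hash: bid_stor_0_0, bid_stor_0_1
2. group: bid_file_1_u10001, bid_file_1_u10002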
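The item -> bucket step can be sketched as a small function that builds a bucket id. Several things here are assumptions read off the examples rather than confirmed behaviour: `bid` is treated as a literal prefix, the type codes are taken to be 0 for hash and 1 for group, and `hashString` is an arbitrary djb2-style hash chosen only to make the sketch deterministic.

```javascript
// Type codes inferred from the examples: 0 = hash, 1 = group (assumption).
const BUCKET_TYPE = { hash: 0, group: 1 };

// Simple deterministic string hash (djb2-style); the real hash function is unknown.
function hashString(s) {
  let h = 5381;
  for (let i = 0; i < s.length; i += 1) {
    h = ((h * 33) ^ s.charCodeAt(i)) >>> 0; // keep an unsigned 32-bit integer
  }
  return h;
}

// Hypothetical helper: derive the bucket id for one item.
// - hash:  the key value is hashed and reduced modulo `num` buckets
// - group: the key value itself (e.g. a user id) names the bucket
function bucketId(name, type, keyValue, num = 1) {
  if (type === 'hash') {
    return `bid_${name}_${BUCKET_TYPE.hash}_${hashString(String(keyValue)) % num}`;
  }
  if (type === 'group') {
    return `bid_${name}_${BUCKET_TYPE.group}_${keyValue}`;
  }
  throw new Error(`unknown bucket type: ${type}`);
}
```

For example, `bucketId('file', 'group', 'u10001')` yields `bid_file_1_u10001`, matching the group example, while hash ids depend on the chosen hash function and the bucket count.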
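bucket -> collection (at most 128 collections)
1. hash: after taking the modulus, each remainder gets its own table; default rule, coll_stor_0_0 (drawback: changing the modulus affects all data)
2. range: by range, for example 0-100,000; the range field must be numeric and the ranges must be preset, coll_stor_1_0, coll_stor_1_1
3. single: a single table; all buckets live in one table; default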
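The three bucket -> collection mappings can be sketched as one lookup function. The `map` object layout (`type`, `num`, `bounds`), the exclusive upper bounds, and the collection name used for the single case are assumptions; only the `coll_stor_0_*` and `coll_stor_1_*` naming and the limit of 128 collections come from the notes above.

```javascript
const MAX_COLLECTIONS = 128; // limit stated in the notes

// Hypothetical helper: pick the physical collection for a numeric value
// (a hash value for 'hash', the numeric range field for 'range').
function collectionFor(name, map, value) {
  switch (map.type) {
    case 'hash': {
      if (map.num < 1 || map.num > MAX_COLLECTIONS) {
        throw new Error('num must be between 1 and 128');
      }
      return `coll_${name}_0_${value % map.num}`;
    }
    case 'range': {
      // bounds are preset, ascending, exclusive upper limits, e.g. [100000, 200000]
      const slot = map.bounds.findIndex((upper) => value < upper);
      if (slot === -1) throw new Error('value is outside the preset ranges');
      return `coll_${name}_1_${slot}`;
    }
    case 'single':
      return `coll_${name}`; // assumed name; every bucket shares this one table
    default:
      throw new Error(`unknown map type: ${map.type}`);
  }
}
```

The hash case shows the drawback mentioned above directly: the result depends on `map.num`, so changing the modulus changes the target collection for most values.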
bucket_topo fields
| name | type | key | num | map | ctime | mtime | ver |
bucket_meta fields
| vtable | name | db | coll | alias_coll | type | key | status | rcnt | wcnt | ctime | mtime | ver |
bucket_tag fields
| vtable | name | tags |
Bucket data volume
count(user) x N x M + hash_128
100,000 users x 5 dimensions x 100 items + 128 = about 50 million records
2,000 daily active users (cached)
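Operations scenarios
1. Migration
2. Hot data (frequently accessed; placed on high-performance SSD machines)
Questions?
1. Splitting and merging buckets does not affect the actual storage, which is defined by the map, but it does affect bucket data and the global bucket state
2. Logically, a bucket is just something like a tag group
3. Keys are indexed automatically
4. All data of one bucket is always in the same table
Difficulties?
1. How to build the global state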
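## Global information
1. Physical table information
- Read from the topology; usually static and created in advance
2. Virtual table information
- Read directly from the topology; usually static and preset
3. Bucket information
- Buckets are dynamic
- Option 1: read the physical table information from the topology, iterate over the physical tables, and use distcnt(key) to get the number of buckets. Buckets added while counting may be missed (adding a ckp makes this safe in most cases), and every run needs a full scan (backed by an index).
- Option 2: read the bucket information from bucketmeta. bucketmeta has to be updated on every bucketMgr.get and may be inconsistent (adding a ckp makes this safe in most cases), but the result is clear at a glance.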
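The first approach to bucket discovery, scanning each physical table for distinct key values, can be sketched in memory as follows. The function name `discoverBuckets` and the data layout (collection name mapped to an array of docs) are assumptions; against MongoDB the inner loop would typically be replaced by a distinct query on the key field.

```javascript
// Hypothetical sketch: find all buckets by collecting distinct key values
// from every physical table listed in the topology.
function discoverBuckets(physicalTables, keyField) {
  const buckets = new Map(); // bucket key -> collection where it was first seen
  for (const [collName, docs] of Object.entries(physicalTables)) {
    for (const doc of docs) {
      const key = doc[keyField];
      if (!buckets.has(key)) buckets.set(key, collName);
    }
  }
  return buckets;
}
```

Buckets created after a table has been scanned are missed by a single pass, which is the gap the notes above describe.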
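## Problems solved?
1. Database management (console)
2. Hot data migration (at bucket granularity)
3. Large data volumes (database and table sharding, map)
## Bucket state management
1. Global bucket state
2. Read/write state of a single bucket
3. State changes after a bucket migration
4. Real-time sharing of bucket state across multiple instances
5. Hot bucket scheduling (the 80/20 rule: 80% of buckets are low-frequency)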
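One way to pick hot buckets for scheduling is to rank them by the read and write counters kept in bucket_meta (`rcnt`, `wcnt`) and take the smallest set that covers a given share of the traffic. This is only a sketch of that idea: the function name `hotBuckets` and the cumulative-share rule are assumptions, not something the package is known to implement.

```javascript
// Hypothetical sketch: return the names of the busiest buckets that together
// account for at least `share` of all reads and writes.
function hotBuckets(metas, share = 0.8) {
  const load = (m) => m.rcnt + m.wcnt;
  const ranked = [...metas].sort((a, b) => load(b) - load(a));
  const total = ranked.reduce((sum, m) => sum + load(m), 0);

  const hot = [];
  let covered = 0;
  for (const m of ranked) {
    if (total === 0 || covered >= share * total) break;
    hot.push(m.name);
    covered += load(m);
  }
  return hot;
}
```

The buckets returned would be the candidates for migration to faster storage; the rest can stay on regular machines.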
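## Progress
2021-04-14:
DONE
1. Added the virtual table vtable, which hides bucket details; business operations are routed to the right bucket automatically
2. Added the topology definition, which defines the logical partitioning and the physical mapping of bucket data
3. Logical partitioning of bucket data supports hash, group, and other types (which data belongs in the same bucket)
4. Physical mapping of bucket data supports hash, range, single (the traditional single-table type), and other types (which db and which collection the data is stored in)
5. Added the bucketReMap command to redo the physical mapping (involves full copy, incremental sync, and data verification)
6. Added bucket_meta metadata to record bucket state, locks, read/write counts, hot-bucket statistics, etc.
7. Added the bucketScan command, which iterates over a bucket and runs the given callback
8. Cloud-drive demo that operates on three types of data: stor, file, share
TODO
1. Consistent bucket state sync across multiple app instances (hard)
2. bucketSwitch(oldbucket, newbucket) command for hot bucket updates (MVCC), rolling upgrades after a remap, and switching the business to the new bucket (hard) (switch topology?)
3. bucketMigrate(bucket, newlocation): bucket migration, for example moving hot data to high-performance SSD
4. Bucket scheduling management: because the bucket is the finest granularity, operating across multiple data centers would be cumbersome; bucket_tag may be needed to schedule bucket data in batches by tag
5. Automatic key indexing (select)
Questions?
1. A remap can split or merge buckets. At the business layer, an item that used to belong to a bucket may suddenly no longer be in it, or an item that was not in it may suddenly appear. This usually happens in hash mode (in group mode the groups are normally per user and do not change).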
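The effect of a remap in hash mode can be illustrated by counting how many keys land in a different bucket once the modulus changes. This sketch assumes plain modulo placement on already-hashed integer keys; `movedKeys` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical sketch: which (already hashed) integer keys change bucket
// when the number of hash buckets goes from oldNum to newNum?
function movedKeys(keys, oldNum, newNum) {
  return keys.filter((k) => k % oldNum !== k % newNum);
}

// Twenty sample hash values: 0, 1, ..., 19.
const sampleKeys = Array.from({ length: 20 }, (_, i) => i);

// Going from 4 to 5 buckets relocates every key except 0..3.
const movedTo5 = movedKeys(sampleKeys, 4, 5);

// Doubling from 4 to 8 buckets relocates only the keys whose remainder mod 8 is 4..7.
const movedTo8 = movedKeys(sampleKeys, 4, 8);
```

With these sample keys, 16 of 20 move when going from 4 to 5 buckets and 8 of 20 move when doubling to 8, which is why a modulus change touches most of the data. Group mode avoids this because the bucket is named by the group value itself.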
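2021-04-21
1. Added automatic creation of bucket key indexes
2. Added two events, onBucketMigrate and onBucketRemap, for hot-updating bucket database connections and topology information
3. Metadata is refreshed by polling the database on a timer
4. Added migrate_hot and remap demos
5. Added bucket_tag to schedule bucket data in batches by tag
TODO:
1. SDK-as-a-service demo (embedding the SDK does not scale to large deployments; running it as a service makes the SDK easier to operate and upgrade)
2. Try introducing bucket in the legal consultation project [x]
3. Code structure cleanup, robustness, usability
4. Added vtable index creation [x]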
## Performance
Single machine: 2 GB memory
1. ops: 337/s
2. mongostat
```
*0 *0 *0 *0 0 1|0 0.2% 10.0% 0 2.21G 339M 0|0 1|0 535b 43.6k 20 rs01 PRI Apr 23 09:10:17.166
66 98 64 *0 82 126|0 0.7% 10.0% 0 2.20G 339M 0|1 2|1 214k 279k 31 rs01 PRI Apr 23 09:10:18.182
259 298 257 *0 350 354|0 0.8% 10.0% 0 2.20G 339M 0|0 1|0 756k 897k 31 rs01 PRI Apr 23 09:10:19.174
insert query update delete getmore command dirty used flushes vsize res qrw arw net_in net_out conn set repl time
316 298 319 *0 409 483|0 0.9% 10.1% 0 2.20G 339M 0|1 2|1 947k 1.07m 31 rs01 PRI Apr 23 09:10:20.166
329 290 327 *0 374 434|0 0.9% 10.2% 0 2.20G 339M 0|1 1|1 893k 1.04m 31 rs01 PRI Apr 23 09:10:21.183
342 393 342 *0 384 455|0 1.0% 10.2% 0 2.20G 339M 0|0 1|0 960k 1.11m 31 rs01 PRI Apr 23 09:10:22.175
353 307 353 *0 403 439|0 1.2% 9.9% 0 2.20G 338M 0|0 2|1 932k 1.10m 31 rs01 PRI Apr 23 09:10:23.185
341 386 342 *0 405 500|0 1.2% 10.0% 0 2.20G 337M 0|0 1|4 1.01m 1.14m 31 rs01 PRI Apr 23 09:10:24.169
352 337 352 *0 460 501|0 1.3% 10.1% 0 2.20G 337M 0|0 4|3 1.02m 1.17m 31 rs01 PRI Apr 23 09:10:25.193
363 356 361 *0 421 522|0 1.4% 10.2% 0 2.20G 337M 0|0 1|2 1.04m 1.18m 31 rs01 PRI Apr 23 09:10:26.171
341 372 341 *0 444 452|0 1.4% 10.2% 0 2.20G 337M 0|0 1|6 972k 1.14m 31 rs01 PRI Apr 23 09:10:27.168
276 240 278 *0 307 391|0 1.5% 10.3% 0 2.20G 337M 0|3 2|1 772k 881k 31 rs01 PRI Apr 23 09:10:28.182
359 363 359 *0 428 506|0 1.5% 10.3% 0 2.20G 337M 0|1 1|1 1.03m 1.18m 32 rs01 PRI Apr 23 09:10:29.180
insert query update delete getmore command dirty used flushes vsize res qrw arw net_in net_out conn set repl time
347 374 347 *0 391 489|0 1.5% 10.3% 0 2.20G 337M 0|1 1|0 992k 1.13m 32 rs01 PRI Apr 23 09:10:30.178
367 330 366 *0 353 463|0 0.4% 10.0% 0 2.20G 341M 0|0 2|0 954k 1.10m 32 rs01 PRI Apr 23 09:10:31.172
351 375 350 *0 440 477|0 0.5% 10.1% 0 2.20G 340M 0|0 1|3 1.00m 1.16m 32 rs01 PRI Apr 23 09:10:32.174
349 333 351 *0 423 481|0 0.5% 10.2% 0 2.20G 340M 0|3 1|1 986k 1.13m 32 rs01 PRI Apr 23 09:10:33.183
372 362 371 *0 408 462|0 0.6% 10.2% 0 2.20G 340M 0|0 2|1 987k 1.16m 32 rs01 PRI Apr 23 09:10:34.174
346 371 346 *0 438 484|0 0.7% 9.9% 0 2.20G 338M 0|0 1|1 1.00m 1.16m 32 rs01 PRI Apr 23 09:10:35.172
342 318 342 *0 384 416|0 0.8% 10.0% 0 2.21G 338M 0|0 2|0 899k 1.07m 32 rs01 PRI Apr 23 09:10:36.176
359 377 360 *0 463 527|0 0.9% 10.1% 0 2.21G 338M 0|0 1|0 1.07m 1.21m 32 rs01 PRI Apr 23 09:10:37.169
355 360 354 *0 387 468|0 0.9% 10.2% 0 2.21G 338M 0|2 1|1 972k 1.12m 32 rs01 PRI Apr 23 09:10:38.172
319 311 319 *0 437 430|0 1.0% 10.2% 0 2.21G 338M 0|0 1|3 912k 1.07m 32 rs01 PRI Apr 23 09:10:39.180
insert query update delete getmore command dirty used flushes vsize res qrw arw net_in net_out conn set repl time
234 222 234 *0 271 316|0 1.0% 10.3% 0 2.21G 339M 0|1 1|0 648k 765k 32 rs01 PRI Apr 23 09:10:40.187
350 367 351 *0 393 459|0 1.1% 10.3% 0 2.21G 338M 0|0 2|0 964k 1.12m 32 rs01 PRI Apr 23 09:10:41.181
341 329 340 *0 417 477|0 1.2% 10.4% 0 2.21G 338M 0|0 1|2 973k 1.11m 32 rs01 PRI Apr 23 09:10:42.180
325 350 326 *0 408 464|0 1.2% 10.5% 0 2.21G 338M 0|1 1|0 951k 1.09m 32 rs01 PRI Apr 23 09:10:43.168
364 331 363 *0 450 472|0 1.3% 10.5% 0 2.21G 338M 0|0 1|0 997k 1.17m 32 rs01 PRI Apr 23 09:10:44.168
358 372 358 *0 443 506|0 1.4% 10.6% 0 2.21G 338M 0|0 1|0 1.03m 1.19m 32 rs01 PRI Apr 23 09:10:45.178
347 356 347 *0 418 482|0 1.5% 10.7% 0 2.21G 338M 0|0 1|3 990k 1.14m 32 rs01 PRI Apr 23 09:10:46.166
374 357 375 *0 454 508|0 1.5% 10.3% 1 2.21G 338M 0|1 1|0 1.05m 1.21m 32 rs01 PRI Apr 23 09:10:47.179
95 55 94 *0 110 138|0 1.5% 10.3% 0 2.21G 337M 0|0 1|0 261k 327k 32 rs01 PRI Apr 23 09:10:48.164
*0 *0 *0 *0 0 5|0 1.5% 10.3% 0 2.21G 337M 0|0 1|0 1.13k 46.0k 32 rs01 PRI Apr 23 09:10:49.163
```
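2021-04-27
1. Business interface: added count and list to vtable
2. Added unit tests and a benchmark (single machine, 1U, 2 GB memory: 337 op/s)
3. Operations interface: added full and incremental sync of bucket data to Elasticsearch, vTableToEsFull and vTableToEsChange
4. Operations interface: added index creation, indexAllBucket and dropIndexAllBucket
2021-05-08
1. Added bucket.updateMany
2. Added vtable.searchFullText and vtable.searchFieldText, which integrate ES search
3. Fixed two bugs: a wrong bucket name, and incremental sync to ES
TODO
1. Code structure cleanup, robustness, exception handling, etc.
2. Cache bucket_meta statistics to reduce direct DB operations
3. bucket-proxy, multi-language support over grpc [x]
2021-05-14
1. Added the bucket-proxy service, which uses the grpc protocol to support clients in multiple languages
2. Added bucket-js, bucket-go, and bucket-dotnet client DEMOs
3. Added vtable.searchFullText and vtable.searchFieldText, which integrate ES search
TODO
1. Complete the bucket-proxy error handling, platform authentication, etc.
2. bucket-meta performance tuning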
+ Added grpc@^1.24.9
+ Added grpc-tools@^1.11.1
+ Added @mapbox/node-pre-gyp@1.0.11 (transitive)
+ Added @types/bytebuffer@5.0.49 (transitive)
+ Added @types/long@3.0.32 (transitive)
+ Added abbrev@1.1.1 (transitive)
+ Added agent-base@6.0.2 (transitive)
+ Added ansi-regex@2.1.1, 5.0.1 (transitive)
+ Added aproba@2.0.0 (transitive)
+ Added are-we-there-yet@2.0.0 (transitive)
+ Added ascli@1.0.1 (transitive)
+ Added balanced-match@1.0.2 (transitive)
+ Added brace-expansion@1.1.11 (transitive)
+ Added bytebuffer@5.0.1 (transitive)
+ Added camelcase@2.1.1 (transitive)
+ Added chownr@2.0.0 (transitive)
+ Added cliui@3.2.0 (transitive)
+ Added code-point-at@1.1.0 (transitive)
+ Added color-support@1.1.3 (transitive)
+ Added colour@0.7.1 (transitive)
+ Added concat-map@0.0.1 (transitive)
+ Added console-control-strings@1.1.0 (transitive)
+ Added decamelize@1.2.0 (transitive)
+ Added delegates@1.0.0 (transitive)
+ Added detect-libc@2.0.3 (transitive)
+ Added emoji-regex@8.0.0 (transitive)
+ Added fs-minipass@2.1.0 (transitive)
+ Added fs.realpath@1.0.0 (transitive)
+ Added gauge@3.0.2 (transitive)
+ Added glob@7.2.3 (transitive)
+ Added grpc@1.24.11 (transitive)
+ Added grpc-tools@1.12.4 (transitive)
+ Added has-unicode@2.0.1 (transitive)
+ Added https-proxy-agent@5.0.1 (transitive)
+ Added inflight@1.0.6 (transitive)
+ Added invert-kv@1.0.0 (transitive)
+ Added is-fullwidth-code-point@1.0.0, 3.0.0 (transitive)
+ Added lcid@1.0.0 (transitive)
+ Added lodash.camelcase@4.3.0 (transitive)
+ Added lodash.clone@4.5.0 (transitive)
+ Added long@3.2.0 (transitive)
+ Added make-dir@3.1.0 (transitive)
+ Added minimatch@3.1.2 (transitive)
+ Added minipass@3.3.6, 5.0.0 (transitive)
+ Added minizlib@2.1.2 (transitive)
+ Added mkdirp@1.0.4 (transitive)
+ Added nan@2.22.0 (transitive)
+ Added node-fetch@2.7.0 (transitive)
+ Added nopt@5.0.0 (transitive)
+ Added npmlog@5.0.1 (transitive)
+ Added number-is-nan@1.0.1 (transitive)
+ Added object-assign@4.1.1 (transitive)
+ Added once@1.4.0 (transitive)
+ Added optjs@3.2.2 (transitive)
+ Added os-locale@1.4.0 (transitive)
+ Added path-is-absolute@1.0.1 (transitive)
+ Added protobufjs@5.0.3 (transitive)
+ Added readable-stream@3.6.2 (transitive)
+ Added rimraf@3.0.2 (transitive)
+ Added semver@6.3.1, 7.6.3 (transitive)
+ Added set-blocking@2.0.0 (transitive)
+ Added signal-exit@3.0.7 (transitive)
+ Added string-width@1.0.2, 4.2.3 (transitive)
+ Added string_decoder@1.3.0 (transitive)
+ Added strip-ansi@3.0.1, 6.0.1 (transitive)
+ Added tar@6.2.1 (transitive)
+ Added tr46@0.0.3 (transitive)
+ Added webidl-conversions@3.0.1 (transitive)
+ Added whatwg-url@5.0.0 (transitive)
+ Added wide-align@1.1.5 (transitive)
+ Added window-size@0.1.4 (transitive)
+ Added wrap-ansi@2.1.0 (transitive)
+ Added wrappy@1.0.2 (transitive)
+ Added y18n@3.2.2 (transitive)
+ Added yallist@4.0.0 (transitive)
+ Added yargs@3.32.0 (transitive)