@mapbox/carmen-cache
carmen-cache is a low-level storage layer used in the carmen geocoder.
To install carmen-cache, run:
yarn install
By default, binaries are provided for 64-bit OS X >= 10.8 and 64-bit Linux (>= Ubuntu Trusty). On those platforms no external dependencies are needed.
Other platforms will fall back to a source compile: see [Source Build](#Source Build) for details.
To build from source run:
make
yarn test
This will automatically fetch the build dependencies (into mason_packages) via mason: bzip, rocksdb, protozero, and the clang++ compiler.
To do a full rebuild run: make clean
See CONTRIBUTING for how to release a new carmen-cache version.
What carmen-cache does
When doing a forward (text -> coordinates) geocode, carmen goes through a few steps. The carmen README has a full rundown, but abbreviated for the purposes of this module, carmen does approximately the following:
The steps in bold are implemented in this module in C++ for disk/memory compactness and speed; the rest are implemented in carmen itself in JavaScript, or in other carmen dependencies.
As a concrete example, if a user were searching for "paris france," carmen might determine that the country index contained the string "france" and the city index contained multiple occurrences of "paris" (one for the real one in France, one for the one in Texas, etc.). carmen-cache would be responsible for retrieving the tile coordinates of all the tiles covering the country France, and the tile coordinates covering each Paris, and seeing which aligned with which; it would discover that the French Paris's tiles overlapped with some of the tiles in the France feature, making that combination a plausible stacking, while the Texas Paris's tiles would not align with France. It would return these results to carmen for further verification.
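The alignment check in this example can be sketched as a set intersection over tile coordinates. This is a toy illustration only: the `overlaps` helper, the "z/x/y" string form of a tile, and the coordinates themselves are all made up for the example and are not carmen-cache's internal representation or API.

```javascript
// Hypothetical model: each feature is covered by a list of "z/x/y" tile ids.
// The coordinates below are invented for illustration, not real tiles.
function overlaps(tilesA, tilesB) {
  const seen = new Set(tilesA);
  return tilesB.some((tile) => seen.has(tile));
}

const france = ['6/32/22', '6/33/22', '6/32/23'];
const cities = [
  { name: 'Paris, France', tiles: ['6/33/22'] },
  { name: 'Paris, Texas', tiles: ['6/15/25'] },
];

// Keep only the city candidates that share at least one tile with the country.
const plausible = cities
  .filter((city) => overlaps(france, city.tiles))
  .map((city) => city.name);

console.log(plausible);
```

Only the candidate whose tiles intersect the country's tiles survives as a plausible stacking; the rest are discarded before anything is handed back for verification.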
carmen-cache exposes two implementations of the same interface: a read-write version called MemoryCache and a disk-based read-only version called RocksDBCache, built on Facebook's RocksDB. The read-write version is used during carmen's index-building process, at the end of which it's serialized into the read-only version for storage. At query time, the read-only version is used instead, as it's both faster and more memory-efficient.
Carmen-cache knows about the following kinds of data:
See test/grid.js for an example of how these 5-tuples should be packed into an integer representation.
carmen-cache can apply a penalty for results that match a given query in a language other than the one the user requested. carmen-cache represents each language as a number, and has no internal concept of which real-world language each number maps to; it's carmen's responsibility to perform that mapping.
The MemoryCache supports setting a key (and an optional list of language numbers) to a list of grid numbers. It also supports a pack operation, which writes out the read-write form into a henceforth-read-only version encoded on disk as a RocksDB database.
Both versions support a list operation to retrieve all keys, a get operation to retrieve a grid list for a given key and language set, and a getMatching operation that can do either or both of:
RocksDBCache format
The RocksDB representation of the cache condenses the data for on-disk storage as a RocksDB database. It is a key-value store:
Each key in the RocksDB database is a string, followed by a | delimiter, followed by as many as 128 bits of data representing the language annotation. If this bit set is of zero length, it is interpreted as 2^128 - 1 (i.e., all 1s, which matches all languages). If it's any other length less than 128 bits, it's interpreted as the least significant (conceptually right-most) bits of a 128-bit integer (in other words, it's padded with 0s on the most significant/conceptually-left-most side).
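The padding rule for the language field can be sketched as follows. This is a minimal sketch, not the module's code: `languageMask` and `matchesLanguage` are hypothetical names, the byte order (most significant byte first) is an assumption, and so is the convention that language number n corresponds to bit n of the mask.

```javascript
// 2^128 - 1: every language bit set, i.e. "matches all languages".
const ALL_LANGUAGES = (1n << 128n) - 1n;

// Interpret the bytes that follow the "|" delimiter as a 128-bit mask.
// Assumption: bytes are ordered most significant first.
function languageMask(bytes) {
  if (bytes.length === 0) return ALL_LANGUAGES; // zero length means all 1s
  if (bytes.length > 16) throw new RangeError('at most 128 bits of language data');
  let mask = 0n;
  for (const byte of bytes) {
    mask = (mask << 8n) | BigInt(byte);
  }
  // Shorter inputs are implicitly zero-padded on the most significant side.
  return mask;
}

// Assumption: language number n is represented by bit n of the mask.
function matchesLanguage(mask, languageNumber) {
  return ((mask >> BigInt(languageNumber)) & 1n) === 1n;
}
```

For instance, an empty field yields a mask that matches every language number from 0 to 127, while a single byte 0x05 yields a mask with only bits 0 and 2 set.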
Each value is a compact representation of the set of grid integers for a given set of grids. This representation is obtained by sorting the integer representations of the grids in descending order, delta-encoding them (that is, storing all values after the first as the difference between it and its predecessor), and packing them as variable-length integers into a protobuf buffer. Reading from this structure operates in reverse, expanding out all values after the first subtractively, and can be done lazily.
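The sort, delta-encode, and varint steps can be sketched in a few lines. This is an illustration of the technique under stated assumptions rather than the module's implementation: `encodeGrids` and `decodeGrids` are hypothetical names, grids are passed as BigInt values, and the output is a bare sequence of base-128 varints without any surrounding protobuf field framing.

```javascript
// Encode: sort descending, store the first value as-is and every later value
// as (predecessor - value), then write each number as a base-128 varint.
function encodeGrids(grids) {
  const sorted = [...grids].sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));
  const out = [];
  sorted.forEach((grid, i) => {
    let v = i === 0 ? grid : sorted[i - 1] - grid; // never negative: list is descending
    do {
      let byte = Number(v & 0x7fn);
      v >>= 7n;
      if (v > 0n) byte |= 0x80; // continuation bit: more bytes follow
      out.push(byte);
    } while (v > 0n);
  });
  return Uint8Array.from(out);
}

// Decode lazily: a generator yields one grid at a time, subtracting each
// delta from the previously yielded value.
function* decodeGrids(bytes) {
  let previous = null;
  let value = 0n;
  let shift = 0n;
  for (const byte of bytes) {
    value |= BigInt(byte & 0x7f) << shift;
    if (byte & 0x80) {
      shift += 7n;
      continue;
    }
    const current = previous === null ? value : previous - value;
    yield current;
    previous = current;
    value = 0n;
    shift = 0n;
  }
}

const encoded = encodeGrids([10n, 300n, 7n]);
const decoded = [...decodeGrids(encoded)];
```

Because the list is sorted in descending order, every delta is non-negative and usually small, so most values after the first fit in one or two varint bytes. The generator form mirrors the lazy reading described above: a consumer can stop iterating once it has enough grids.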
The RocksDB representation contains an additional optimization to assist in autocomplete queries: it precomputes combined sorted lists of grids automatically for fixed-length prefixes of length 3 and length 6, so as to reduce the number of seeks and reads necessary to calculate autocomplete results for very short autocomplete queries. These precomputed versions are stored with a key that begins with =1 or =2 (for shorter and longer prefixes, respectively), followed by the prefix string, followed by the | delimiter and language bitmask as per usual. Prefixes include language annotations and are thus per-language-set just like other keys. This process is transparent to carmen: these keys are calculated and populated automatically at pack time, read automatically instead of reading the full grid lists at getMatching time if the requested key is sufficiently short, and hidden from, e.g., carmen's list operation.
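A rough sketch of how such prefix entries could be built at pack time is shown below. The names `prefixKeys` and `buildPrefixCache` are hypothetical, the | delimiter and language bitmask are omitted for brevity, prefixes are measured in UTF-16 code units, and the choice to skip phrases shorter than the prefix length is an assumption rather than documented behavior.

```javascript
const SHORT_PREFIX = 3; // stored under keys starting with "=1"
const LONG_PREFIX = 6;  // stored under keys starting with "=2"

// Return the precomputed-prefix keys a phrase contributes to.
// Assumption: a phrase shorter than a prefix length contributes nothing there.
function prefixKeys(phrase) {
  const keys = [];
  if (phrase.length >= SHORT_PREFIX) keys.push('=1' + phrase.slice(0, SHORT_PREFIX));
  if (phrase.length >= LONG_PREFIX) keys.push('=2' + phrase.slice(0, LONG_PREFIX));
  return keys;
}

// Merge the grid lists of all phrases sharing a prefix into one list,
// kept sorted in descending order.
function buildPrefixCache(entries) {
  const cache = new Map();
  for (const [phrase, grids] of entries) {
    for (const key of prefixKeys(phrase)) {
      const merged = (cache.get(key) || []).concat(grids);
      merged.sort((a, b) => b - a);
      cache.set(key, merged);
    }
  }
  return cache;
}

const prefixCache = buildPrefixCache([
  ['paris', [5, 1]],
  ['parma', [9]],
  ['parisian', [4]],
]);
```

With these entries, a three-character autocomplete query for "par" could be answered from the single "=1par" entry instead of reading and merging the grid lists of every key that starts with "par".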
carmen-cache's coalesce operation is what computes the possible stacking of combinations of substrings and returns the results to carmen. It can take advantage of the C++ threadpool to consider multiple possible stackings in parallel, and contains two implementations: coalesceSingle and coalesceMulti. The former handles cases where a given query could be satisfied in its entirety by a single index, whereas the latter considers multi-index interactions. coalesce expects a set of phrasematch objects (see carmen's source for what they contain), and returns a set of coalesce results via callback to carmen.
A brief diagrammatic overview of how coalesceMulti works follows: