@graphitation/apollo-forest-run
**Experimental** custom cache for Apollo client using indexing and diffing for data syncing (instead of normalization). Aims to be a drop-in replacement* for Apollo InMemoryCache.
Most GraphQL clients use "normalization" to keep cached data up to date. However, normalization is not a silver bullet and comes with costs: performance overhead, the need for garbage collection, consistency pitfalls due to the partial nature of data in GraphQL results, and increased overall complexity of the GraphQL client cache (read about normalization pitfalls to learn more).
Forest Run explores an alternative to "normalization" that doesn't require a central normalized store to keep data up to date. It aims to be a drop-in replacement for Apollo InMemoryCache (with some restrictions).
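Since Forest Run targets the same cache interface as InMemoryCache, switching should amount to swapping the cache instance passed to Apollo Client. A minimal sketch of that intended usage (the `ForestRun` export name is an assumption here; check the package's own documentation for the exact API):

```typescript
import { ApolloClient } from "@apollo/client";
// Assumed export name; verify against the package documentation.
import { ForestRun } from "@graphitation/apollo-forest-run";

const client = new ApolloClient({
  uri: "https://example.com/graphql",
  // cache: new InMemoryCache(), // before
  cache: new ForestRun(), // after: same ApolloCache interface
});
```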
Expected benefits over InMemoryCache stem from the design described below: the heavy lifting moves to writes, so reads become cheap key lookups.
We view GraphQL clients as, essentially, emulations of "live queries" running locally. Clients keep those "local live queries" up to date by syncing with overlapping data from other incoming GraphQL operations (mutations, subscriptions, other queries).
In this view of the world, the normalized store is primarily an implementation detail of this "syncing" mechanism. It is not a source of truth or a data store for local state; or at least, caching is not the primary purpose of a GraphQL client.
Unlike normalized caches, where data is stored in a single flat object and then propagated to "local live queries" via cache reads, Forest Run preserves data in the original shape in which it arrives from the GraphQL server and attempts to keep results in sync for all "local live queries" on incoming writes.
All the heavy lifting happens where you naturally expect it: during writes. Reads are O(1) lookups by a key (with a few exceptions).
So how can data syncing work without normalization?
In ForestRun, data syncing is a three-step process: indexing, diffing, and updating.
This separation into steps is a deliberate design decision allowing some further optimizations and features, like building indexes on the server or offloading both indexing and diffing to a worker thread.
In addition, it improves overall debuggability and traceability, because it is possible to capture the history of every GraphQL result.
Let's briefly walk through each step individually.
Indexing is the process of remembering the positions of individual nodes (objects with an identity) inside an operation result.
Imagine we are querying a post with a list of comments:
```graphql
query {
  post(id: "p1") {
    __typename
    id
    title
    comments {
      __typename
      id
      text
    }
  }
}
```
And receive the following data from the server:
```js
{
  post: {
    __typename: "Post",
    id: "p1",
    title: "My first post",
    comments: [
      {
        __typename: "Comment",
        id: "c1",
        text: "hello"
      },
      {
        __typename: "Comment",
        id: "c2",
        text: "world"
      }
    ]
  }
}
```
The indexing algorithm produces a data structure containing all the information necessary for comparing and updating objects on incoming changes (note: this is only a high-level, conceptual overview):
```js
const index = {
  "Post:p1": [
    {
      data: {
        // the actual reference to the Post object from the GraphQL result
      },
      selection: {
        // descriptor of Post fields (produced from the GraphQL AST)
      },
      parent: "ROOT_QUERY", // reference to the parent object within the operation
    },
  ],
  "Comment:c1": [
    {
      data: {
        /*...*/
      },
      selection: {
        // descriptor of Comment fields - the same instance for all comments
      },
      parent: "Post:p1",
    },
  ],
  "Comment:c2": [
    {
      data: {
        /*...*/
      },
      selection: {
        /*...*/
      },
      parent: "Post:p1",
    },
  ],
};
```
At first glance this may seem similar to normalization, but there are several important differences: the index holds references to objects in the original result rather than copies, each entry carries the selection descriptor it was produced from, and every "chunk" records its parent within the operation.
In addition to the operation index, another compact index is created: `Map<NODE_ID, OPERATION_ID[]>`, which lets us locate all operations containing a "chunk" of a given node. Together, these two structures help us quickly locate all node "chunks" across different operations.
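To make the idea concrete, here is a minimal TypeScript sketch of such an indexing pass. All names here (`NodeChunk`, `indexResult`) are hypothetical and greatly simplified; the real implementation also records the selection descriptors produced from the GraphQL AST:

```typescript
// Walks a GraphQL result and records where each identifiable node
// ("__typename:id") lives, keyed exactly like the index example above.
type NodeChunk = {
  data: Record<string, unknown>; // reference into the original result, not a copy
  parent: string; // key of the enclosing node within this operation
};
type OperationIndex = Map<string, NodeChunk[]>;

function indexResult(data: Record<string, unknown>): OperationIndex {
  const index: OperationIndex = new Map();

  function visit(value: unknown, parent: string): void {
    if (Array.isArray(value)) {
      for (const item of value) visit(item, parent);
      return;
    }
    if (value === null || typeof value !== "object") return;
    const obj = value as Record<string, unknown>;
    let key = parent;
    // Objects with an identity become index entries.
    if (typeof obj.__typename === "string" && obj.id !== undefined) {
      key = `${obj.__typename}:${obj.id}`;
      const chunks = index.get(key) ?? [];
      chunks.push({ data: obj, parent });
      index.set(key, chunks);
    }
    for (const field of Object.keys(obj)) visit(obj[field], key);
  }

  visit(data, "ROOT_QUERY");
  return index;
}
```

Running this over the Post query result above yields entries for `Post:p1`, `Comment:c1`, and `Comment:c2`, each pointing back into the original result object.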
After indexing, we walk through all the ids we found and look for other "local live queries" that have the same node in their index. For ids already present in the cache, a diffing process is performed, producing a normalized difference object containing only the changed fields. This normalized difference is later used to update all cached operations containing a node with this id.
Example. Imagine the author edits his comment `c1` (see above) and replaces the text `hello` with `hi`. So the following mutation is issued:
```graphql
mutation {
  editComment(id: "c1", text: "hi") {
    __typename
    id
    text
  }
}
```
And the incoming mutation result is:
```js
const data = {
  __typename: "Comment",
  id: "c1",
  text: "hi",
};
```
Now we need to sync the state of all our "local live queries" with this latest state.
After indexing this result, we see that it contains an entity with the id `Comment:c1`. Next, we find that the index of our original Post query also contains an entity with the same id `Comment:c1` (see above). We can quickly access the entity object from both results through their respective indexes. So we diff this entity representation against the same entity `Comment:c1` stored in the original query to get a normalized difference:
```js
const c1CommentDifference = {
  fields: {
    text: {
      oldValue: "hello",
      newValue: "hi",
    },
  },
  dirty: ["text"],
};
```
The normalized difference can be used to update multiple operations containing the same node to the latest state of the world.
Note: this is just a conceptual overview. Diffing is the most complex part of the implementation and has many nuances (embedded object diffing, list diffing, abstract type diffing; missing fields, fields with errors, arguments, aliases, etc.).
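For illustration only, here is a minimal TypeScript sketch of per-node diffing that reproduces the difference object above. The names (`diffNodeChunk`, `NodeDifference`) are hypothetical, and the sketch deliberately ignores all the nuances just listed, comparing only top-level scalar fields:

```typescript
type FieldDiff = { oldValue: unknown; newValue: unknown };
type NodeDifference = {
  fields: Record<string, FieldDiff>;
  dirty: string[];
};

// Compares a cached node chunk with an incoming chunk of the same entity
// and returns only the changed fields, or null if nothing changed.
function diffNodeChunk(
  cached: Record<string, unknown>,
  incoming: Record<string, unknown>,
): NodeDifference | null {
  const fields: Record<string, FieldDiff> = {};
  const dirty: string[] = [];
  for (const field of Object.keys(incoming)) {
    if (field in cached && cached[field] !== incoming[field]) {
      fields[field] = { oldValue: cached[field], newValue: incoming[field] };
      dirty.push(field);
    }
  }
  return dirty.length ? { fields, dirty } : null;
}
```

Feeding it the cached and incoming `Comment:c1` chunks from the example yields exactly the `c1CommentDifference` object shown above.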
After diffing, we get a normalized difference object for each incoming node. For every changed node we can find the affected operations and apply the difference.
The updating process produces a single copy of the "dirty" operation result. Only changed objects and lists are copied; objects and lists that didn't change are recycled. There are no unnecessary allocations, and structural sharing is guaranteed.
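A minimal sketch of how such a structurally shared update could look (hypothetical names, reusing the `NodeDifference` type from the diffing sketch): only objects and arrays on the path to the changed node are copied; everything else is returned by reference:

```typescript
// Rebuilds an operation result, applying `diff` to the chunk `target`.
// Unchanged subtrees are recycled (returned by reference).
function applyDifference(
  value: unknown,
  target: Record<string, unknown>, // the cached chunk of the changed node
  diff: NodeDifference,
): unknown {
  if (Array.isArray(value)) {
    const next = value.map((item) => applyDifference(item, target, diff));
    // Recycle the array itself when no element changed.
    return next.some((item, i) => item !== value[i]) ? next : value;
  }
  if (value === null || typeof value !== "object") return value;
  const obj = value as Record<string, unknown>;
  if (obj === target) {
    const copy = { ...obj }; // copy only the dirty node
    for (const field of diff.dirty) copy[field] = diff.fields[field].newValue;
    return copy;
  }
  let changed = false;
  const next: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) {
    const child = applyDifference(obj[key], target, diff);
    next[key] = child;
    if (child !== obj[key]) changed = true;
  }
  return changed ? next : obj; // recycle unchanged objects
}
```

Applied to the Post query result above with the `text` difference for `Comment:c1`, this allocates fresh copies only of the root object, the `post` object, the `comments` array, and the `c1` comment; the unchanged `c2` comment object is recycled.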
In ForestRun, the state of an individual entity is spread across multiple GraphQL operation results. There is no single centralized state, as there is in a normalized store. This may seem counter-intuitive, but it makes sense for a UI application, where true data consistency cannot be guaranteed and the actual source of truth lives somewhere else.
Having said that, it is quite convenient to have a single view into an entity's current state across all chunks. This is achieved through aggregated entity views.
An aggregated entity view is basically a short-lived object created on demand (e.g., for the diffing phase). This object is fed the different chunks of the same entity found in different GraphQL results (via indexing), and so acts as a convenience tool for representing a single entity's state assembled from multiple chunks.
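As a rough illustration (all names and the extra `author` field here are hypothetical, not the package's API), an aggregated view can be modeled as an on-demand object that answers field reads by consulting every known chunk of the entity:

```typescript
// A short-lived view over all chunks of one entity, created on demand.
class AggregatedEntityView {
  constructor(private chunks: Record<string, unknown>[]) {}

  // Returns the field value from the first chunk that contains it.
  get(field: string): unknown {
    for (const chunk of this.chunks) {
      if (field in chunk) return chunk[field];
    }
    return undefined; // field not present in any chunk
  }
}

// Usage: combine chunks of Comment:c1 found in different operation results.
const view = new AggregatedEntityView([
  { __typename: "Comment", id: "c1", text: "hi" },
  { __typename: "Comment", id: "c1", author: "Alice" }, // hypothetical chunk
]);
view.get("text"); // "hi"
view.get("author"); // "Alice"
```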