org.brutusin:flea-db
A Java library for creating standalone, portable, schema-full object databases that support pagination and faceted search and offer both strongly typed and generic APIs.
Built on top of Apache Lucene.
Main features:
##Motivation
##Maven dependency
<dependency>
<groupId>org.brutusin</groupId>
<artifactId>flea-db</artifactId>
</dependency>
Click here to see the latest available version released to the Maven Central Repository.
If you are not using Maven and need help, you can ask here.
##APIs
All `flea-db` functionality is defined by the `FleaDB` interface.
The library provides two implementations of it:
- `GenericFleaDB`.
- `ObjectFleaDB`, built on top of the previous one.

###GenericFleaDB
`GenericFleaDB` is the lowest-level `flea-db` implementation. It defines the database schema using a JSON schema, and stores and indexes records of type `JsonNode`. It uses the Apache Lucene APIs and the `org.brutusin:json` SPI to maintain two different indexes (one for the terms and another for the taxonomy, see index structure), hiding the underlying complexity from the user.
This is how it works:
- A `JsonSchema` and an index folder are passed, depending on whether the database is new and/or persistent. Then the JSON schema (passed or read from the `flea.json` descriptor file of the existing database) is processed, looking for its `index` properties, and finally a database schema is created.
- Each `JsonNode` record is validated against the JSON schema. Then a `JsonTransformer` instance (making use of the processed database schema) transforms the record into terms understandable by Lucene (documents, fields, facet fields ...), and finally the storage is delegated to the Lucene API.
- `Query` and `Sort` objects are transformed into terms understandable by Lucene, making use of the database schema. The returned paginator is basically a wrapper around the underlying Lucene `IndexSearcher` and `Query` objects that lazily (on demand) performs searches against the index.

###ObjectFleaDB
`ObjectFleaDB` is built on top of `GenericFleaDB`.
Basically, an `ObjectFleaDB` delegates all its functionality to a wrapped `GenericFleaDB` instance, making use of `org.brutusin:json` to perform the `POJO<->JsonNode` and `Class<->JsonSchema` transformations. This is the reason why all `flea-db` databases can be used with `GenericFleaDB`.
###JSON SPI
As mentioned before, this library makes use of the `org.brutusin:json` SPI, so a JSON service provider like `json-provider` is needed at runtime. The chosen provider determines JSON serialization, validation, parsing, schema generation and expression semantics.
###JSON Schema extension
The standard JSON schema specification has been extended to declare indexable properties (`"index":"index"` and `"index":"facet"` options). See the annotations section for more details.
Example:
{
"type": "object",
"properties": {
"age": {
"type": "integer",
"index": "index"
},
"category": {
"type": "string",
"index": "facet"
}
}
}
- `"index":"index"`: the property is indexed by Lucene under a field whose name is set according to the rules explained in the nomenclature section.
- `"index":"facet"`: the property is indexed as in the previous case, and a facet is also created with this field name.

###Annotations
See the documentation in JSON SPI for the annotations supported in the strong-typed scenario.
###Indexed fields nomenclature
Databases are self-descriptive: they provide information about their schema and indexed fields (via `Schema`).
Field semantics are inherited from the expression semantics defined in `org.brutusin:json-provider`.
Suppose a `JsonNode node` is to be stored, and let `fieldId` be the expression identifying a database field, according to the previous section.
Expression exp = JsonCodec.getInstance().compile(fieldId);
JsonSchema fieldSchema = exp.projectSchema(rootSchema);
JsonNode fieldNode = exp.projectNode(node);
Then, the following rules apply to extract index and facet values for that field:
fieldSchema | index:index | index:facet |
---|---|---|
String | fieldNode.asString() | fieldNode.asString() |
Boolean | fieldNode.asString() | fieldNode.asString() |
Integer | fieldNode.asLong() | Unsupported |
Number | fieldNode.asDouble() | Unsupported |
Object | each of its property names | each of its property names |
Array | recurse for each of its elements | recurse for each of its elements |
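The rules in the table can be read as a small recursive function. The following is a minimal sketch of that reading, not flea-db code: `IndexValueSketch` and its methods are hypothetical names, and plain Java values (`String`, `Boolean`, `Long`, `Double`, `Map`, `List`) stand in for `JsonNode` so that the example runs without the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the extraction rules in the table; not flea-db code.
// Plain Java values play the role of JsonNode: String, Boolean, Long (integer),
// Double (number), Map (object) and List (array).
class IndexValueSketch {

    /** Collects the values that would be indexed ("index":"index") for one field node. */
    static List<Object> indexValues(Object fieldNode) {
        List<Object> out = new ArrayList<>();
        collect(fieldNode, false, out);
        return out;
    }

    /** Collects facet values ("index":"facet"); numeric nodes are rejected as unsupported. */
    static List<Object> facetValues(Object fieldNode) {
        List<Object> out = new ArrayList<>();
        collect(fieldNode, true, out);
        return out;
    }

    private static void collect(Object node, boolean facet, List<Object> out) {
        if (node instanceof String || node instanceof Boolean) {
            out.add(String.valueOf(node));              // String and Boolean rows
        } else if (node instanceof Long || node instanceof Double) {
            if (facet) {
                throw new UnsupportedOperationException("numeric facets are unsupported");
            }
            out.add(node);                              // Integer and Number rows
        } else if (node instanceof Map) {
            out.addAll(((Map<?, ?>) node).keySet());    // Object row: property names
        } else if (node instanceof List) {
            for (Object element : (List<?>) node) {     // Array row: recurse per element
                collect(element, facet, out);
            }
        } else {
            throw new IllegalArgumentException("Unexpected node: " + node);
        }
    }

    public static void main(String[] args) {
        System.out.println(indexValues(List.of("a", true, 3L)));
        System.out.println(facetValues(Map.of("color", "red")));
    }
}
```

Note that an array of objects yields the property names of each element, because the array rule recurses and the object rule then applies to every element.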
##Usage
Databases can be created in RAM or on disk, depending on the characteristics of the problem at hand (performance, dataset size, indexing time ...).
To create a persistent database, one of the constructors taking a `File` argument has to be chosen:
FleaDB<JsonNode> db1 = new GenericFleaDB(indexFolder, jsonSchema);
// or
FleaDB<Record> db2 = new ObjectFleaDB(indexFolder, Record.class);
NOTE: Multiple instances can be used to read the same persistent database (for example different concurrent JVM executions), but only one can hold the writing file-lock (claimed the first time a write method is called).
Otherwise, when a constructor without a `File` argument is used, the database is kept in RAM and lost at the end of the JVM execution.
FleaDB<JsonNode> db1 = new GenericFleaDB(jsonSchema);
// or
FleaDB<Record> db2 = new ObjectFleaDB(Record.class);
The following operations perform modifications on the database.
To store a record, use the `store(...)` method:
db1.store(jsonNode);
// or
db2.store(record);
Internally, this ends up calling `addDocument` on the underlying Lucene `IndexWriter`.
A set of records can be deleted using `delete(Query q)`.
NOTE: Due to Lucene facet internals, categories are never deleted from the taxonomy index, even when they become orphaned.
The previous operations (store and delete) are not visible, and never become visible, until `commit()` is called. On commit, the underlying searchers and writers are released, to be lazily re-created by later read or write operations.
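The visibility rule for `store`, `delete` and `commit()` can be illustrated with a toy in-memory model. This is only a sketch of the observable behavior described here, under the assumption that pending changes are applied in the order they were issued; `CommitVisibilitySketch` is a hypothetical class, uses a `Predicate` in place of a flea-db `Query`, and says nothing about how Lucene buffers changes internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of "changes are invisible until commit()"; hypothetical, not flea-db code.
class CommitVisibilitySketch<E> {
    private final List<E> committed = new ArrayList<>();      // what readers can see
    private final List<Runnable> pending = new ArrayList<>(); // buffered modifications

    /** Buffers a store; readers do not see the record yet. */
    void store(E record) {
        pending.add(() -> committed.add(record));
    }

    /** Buffers a delete of every record matching the (stand-in) query. */
    void delete(Predicate<E> query) {
        pending.add(() -> committed.removeIf(query));
    }

    /** Applies all buffered modifications in order, making them visible. */
    void commit() {
        pending.forEach(Runnable::run);
        pending.clear();
    }

    /** Reads only committed state. */
    List<E> query(Predicate<E> query) {
        List<E> result = new ArrayList<>();
        for (E record : committed) {
            if (query.test(record)) {
                result.add(record);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CommitVisibilitySketch<String> db = new CommitVisibilitySketch<>();
        db.store("record-0");
        System.out.println("before commit: " + db.query(r -> true));
        db.commit();
        System.out.println("after commit:  " + db.query(r -> true));
    }
}
```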
Databases can be optimized for better performance by using `optimize()`. This method triggers a very costly merge (in terms of free disk space and computation) of the Lucene index segments into a single one.
Nevertheless, this operation is useful for immutable databases, which can be optimized once before use.
Two kinds of read operations can be performed, both accepting a `Query` argument that defines the search criteria.
Record queries can be paginated, and the ordering of the results can be specified via a `Sort` argument.
public E getSingleResult(final Query q)
public Paginator<E> query(final Query q)
public Paginator<E> query(final Query q, final Sort sort)
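To make the idea of a paginator that searches lazily more concrete, here is a minimal sketch. It is not the library's `Paginator`: `LazyPaginatorSketch` is a hypothetical class that assumes the total hit count is known up front and that a search can be re-run for any (offset, limit) window. Only the method names `getTotalPages(pageSize)` and `getPage(page, pageSize)` and the 1-based page numbering are taken from the examples in this document.

```java
import java.util.List;
import java.util.function.BiFunction;

// Illustrative model only; a hypothetical class, not flea-db's Paginator.
class LazyPaginatorSketch<E> {
    private final int totalHits;
    // Stand-in for an index search: (offset, limit) -> hits in that window.
    private final BiFunction<Integer, Integer, List<E>> search;
    private int searchesRun = 0;

    LazyPaginatorSketch(int totalHits, BiFunction<Integer, Integer, List<E>> search) {
        this.totalHits = totalHits;
        this.search = search;
    }

    int getTotalPages(int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (totalHits + pageSize - 1) / pageSize; // ceiling division
    }

    /** Pages are 1-based, matching the loops in the examples of this document. */
    List<E> getPage(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageNumber > getTotalPages(pageSize)) {
            throw new IllegalArgumentException("No such page: " + pageNumber);
        }
        int offset = (pageNumber - 1) * pageSize;
        int limit = Math.min(pageSize, totalHits - offset);
        searchesRun++; // the search only happens here, on demand
        return search.apply(offset, limit);
    }

    int getSearchesRun() {
        return searchesRun;
    }

    public static void main(String[] args) {
        List<String> hits = List.of("r1", "r2", "r3", "r4", "r5");
        LazyPaginatorSketch<String> paginator = new LazyPaginatorSketch<>(
                hits.size(), (offset, limit) -> hits.subList(offset, offset + limit));
        int pageSize = 2;
        for (int i = 1; i <= paginator.getTotalPages(pageSize); i++) {
            System.out.println("page " + i + ": " + paginator.getPage(i, pageSize));
        }
        System.out.println("searches run: " + paginator.getSearchesRun());
    }
}
```

In this model, creating the paginator costs nothing; a search is run once per requested page, so a caller that only reads the first page never pays for the rest.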
`FacetResponse` represents the faceting info returned by the database.
public List<FacetResponse> getFacetValues(final Query q, FacetMultiplicities activeFacets)
public List<FacetResponse> getFacetValues(final Query q, int maxFacetValues)
public List<FacetResponse> getFacetValuesStartingWith(String facetName, String prefix, Query q, int max)
public int getNumFacetValues(Query q, String facetName)
public double getFacetValueMultiplicity(String facetName, String facetValue, Query q)
Faceting is provided by lucene-facet.
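As a rough illustration of what a facet query conveys, the sketch below counts how many matching records carry each value of a facet. It is a conceptual model only: `FacetCountSketch` is a hypothetical class, records are plain maps, a `Predicate` stands in for a flea-db `Query`, and the counting is done by brute force rather than through the taxonomy index that lucene-facet maintains. Whether these counts correspond exactly to what the API calls multiplicities is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Conceptual model of facet counting; hypothetical, not flea-db or lucene-facet code.
class FacetCountSketch {

    /** Counts, per facet value, the records that match the (stand-in) query. */
    static Map<String, Integer> countFacetValues(
            List<Map<String, String>> records,
            Predicate<Map<String, String>> query,
            String facetName) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by facet value
        for (Map<String, String> record : records) {
            if (!query.test(record)) {
                continue; // only records matching the query contribute
            }
            String value = record.get(facetName);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = List.of(
                Map.of("category", "books"),
                Map.of("category", "music"),
                Map.of("category", "books"));
        System.out.println(countFacetValues(records, r -> true, "category"));
    }
}
```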
Databases must be closed after use via the `close()` method in order to free the resources and locks held. Closing a database makes it no longer usable.
##Threading issues
Both implementations are thread safe and can be shared across multiple threads.
##Index structure
Persistent `flea-db` databases create the following index structure:
/flea-db/
|-- flea.json
|-- record-index
| |-- ...
|-- taxonomy-index
| |-- ...
where `flea.json` is the database descriptor containing its schema, and the `record-index` and `taxonomy-index` subfolders hold the underlying Lucene index structures.
##ACID properties
`flea-db` offers the following ACID properties, inherited from those of Lucene:
##Examples
Generic API:
// Generic interaction with a previously created database
FleaDB<JsonNode> db = new GenericFleaDB(indexFolder);
// Store records
JsonNode json = JsonCodec.getInstance().parse("...");
db.store(json);
db.commit();
// Query records
Query q = Query.createTermQuery("$.id", "0");
Paginator<JsonNode> paginator = db.query(q);
int totalPages = paginator.getTotalPages(pageSize);
// Pages are numbered starting at 1
for (int i = 1; i <= totalPages; i++) {
    List<JsonNode> page = paginator.getPage(i, pageSize);
    for (int j = 0; j < page.size(); j++) {
        JsonNode record = page.get(j);
        System.out.println(record);
    }
}
// Release resources and locks
db.close();
Strong-typed API:
// Create object database
FleaDB<Record> db = new ObjectFleaDB(indexFolder, Record.class);
// Store records
for (int i = 0; i < REC_NO; i++) {
Record r = new Record();
// ... populate record
db.store(r);
}
db.commit();
// Query records
Query q = Query.createTermQuery("$.id", "0");
Paginator<Record> paginator = db.query(q);
int totalPages = paginator.getTotalPages(pageSize);
for (int i = 1; i <= totalPages; i++) {
List<Record> page = paginator.getPage(i, pageSize);
for (int j = 0; j < page.size(); j++) {
Record r = page.get(j);
System.out.println(r);
}
}
db.close();
See available test classes for more examples.
##Main stack
This module would not be possible without:
##Lucene version
4.10.3 (Dec, 2014)
https://github.com/brutusin/flea-db/issues
Contributions are always welcome and greatly appreciated!
##License
Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0