AwaDB - AI Native Database for embedding vectors
Easy to Use - No tedious database schema definition. No need to pay attention to vector indexing details.
Realtime Search - A lock-free realtime index keeps new data fresh with millisecond-level latency. No waiting, no manual operation.
Stability - AwaDB builds upon over 4 years of experience at JD.com running production workloads at scale using a system called Vearch, combined with best-of-breed ideas and practices from the community.
Run awadb locally on Mac OSX or Linux
First, install awadb:
pip3 install awadb
Then use it as follows:
import awadb
awadb_client = awadb.Client()
awadb_client.Create("test_llm1")
awadb_client.Add([{'embedding_text':'The man is happy'}, {'source' : 'pic1'}])
awadb_client.Add([{'embedding_text':'The man is very happy'}, {'source' : 'pic2'}])
awadb_client.Add([{'embedding_text':'The cat is happy'}, {'source' : 'pic3'}])
awadb_client.Add([{'embedding_text':'The man is eating'}, {'source':'pic4'}])
query = "The man is happy"
results = awadb_client.Search(query, 3)
print(results)
Here the text is embedded by SentenceTransformer, which is supported by Hugging Face.
For more detailed usage of the local Python library, see here.
Run AwaDB as a service
If you are on the Windows platform or want an awadb service, you can download and deploy the awadb docker.
For installation of the awadb docker, please see here.
First, install gRPC and the awadb service Python client as below:
pip3 install grpcio
pip3 install awadb-client
A simple example is shown below:
from awadb_client import Awa
client = Awa()
client.add("example1", {'name':'david', 'feature':[1.3, 2.5, 1.9]})
client.add("example1", {'name':'jim', 'feature':[1.1, 1.4, 2.3]})
results = client.search("example1", [1.0, 2.0, 3.0])
print(results)
db_name: "default"
table_name: "example1"
results {
total: 2
msg: "Success"
result_items {
score: 0.860000074
fields {
name: "_id"
value: "64ddb69d-6038-4311-9118-605686d758d9"
}
fields {
name: "name"
value: "jim"
}
}
result_items {
score: 1.55
fields {
name: "_id"
value: "f9f3035b-faaf-48d4-a947-801416c005b3"
}
fields {
name: "name"
value: "david"
}
}
}
result_code: SUCCESS
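The scores in the output above appear consistent with the squared Euclidean (L2) distance between the query vector and each stored `feature` vector, which would mean a lower score is a closer match. This is an inference from the numbers shown, not a statement about AwaDB internals, and the helper name `squared_l2` is hypothetical. A minimal sketch that rechecks the arithmetic without needing the service:

```python
# Recompute the example scores by hand.
# Assumption: the score is the squared Euclidean distance. This is inferred
# from the printed numbers, not taken from AwaDB documentation.

def squared_l2(a, b):
    """Return the squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = [1.0, 2.0, 3.0]
stored = {
    "david": [1.3, 2.5, 1.9],
    "jim": [1.1, 1.4, 2.3],
}

scores = {name: squared_l2(vector, query) for name, vector in stored.items()}

# Sort ascending: under this assumption the smallest distance is the best match.
ranking = sorted(scores, key=scores.get)
for name in ranking:
    print(f"{name}: {scores[name]:.2f}")
```

This lists jim (0.86) before david (1.55), the same order and values as the service output.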
For more on the Python SDK for the service, see here.
More detailed quick start examples can be found here.
curl -H "Content-Type: application/json" -X POST -d '{"db":"default", "table":"test", "docs":[{"_id":1, "name":"lj", "age":23, "f":[1,0]},{"_id":2, "name":"david", "age":32, "f":[1,2]}]}' http://localhost:8080/add
curl -H "Content-Type: application/json" -X POST -d '{"db":"default", "table":"test", "vector_query":{"f":[1, 1]}}' http://localhost:8080/search
A more detailed description of the RESTful API is here.
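The same two calls can be prepared from Python with only the standard library. The sketch below reuses the host, port, paths, and payloads from the curl commands; the helper names `build_request` and `send` are hypothetical, and the response format is not assumed, so `send` simply returns the raw body text. Building the body with `json.dumps` avoids hand-written JSON syntax slips.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same host and port as the curl examples


def build_request(path, payload):
    """Build a JSON POST request. Nothing is sent until urlopen is called."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


add_payload = {
    "db": "default",
    "table": "test",
    "docs": [
        {"_id": 1, "name": "lj", "age": 23, "f": [1, 0]},
        {"_id": 2, "name": "david", "age": 32, "f": [1, 2]},
    ],
}
search_payload = {"db": "default", "table": "test", "vector_query": {"f": [1, 1]}}

add_request = build_request("/add", add_payload)
search_request = build_request("/search", search_payload)

# Show what would be sent, without needing a running server.
print(add_request.get_method(), add_request.full_url)
print(add_request.data.decode("utf-8"))
```

With the awadb service running, calling `send(add_request)` and then `send(search_request)` issues the same requests as the two curl commands.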
What are Embeddings?
Any unstructured data (image/text/audio/video) can be converted by AI models (LLMs or other deep neural networks) into vectors, a form that computers can process directly.
For example, the sentence "The man is happy" can be converted to a 384-dimensional vector (a list of numbers [0.23, 1.98, ....]) by the SentenceTransformer language model. This process is called embedding.
More detailed information about embeddings can be read from OpenAI.
AwaDB uses Sentence Transformers to embed sentences by default, but you can also use OpenAI or other LLMs to do the embedding according to your needs.
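Once text is embedded, sentences with similar meaning end up as nearby vectors. The sketch below illustrates the idea with cosine similarity, one common similarity measure. The 3-dimensional vectors are invented for illustration and are not real SentenceTransformer outputs (those would have 384 dimensions), and `cosine_similarity` is a hypothetical helper, not part of AwaDB.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up toy embeddings, chosen only to illustrate the idea.
query_vector = [0.9, 0.8, 0.1]  # stands in for "The man is happy"
toy_embeddings = {
    "The man is very happy": [0.95, 0.85, 0.1],
    "The cat is happy": [0.2, 0.8, 0.9],
}

similarities = {
    sentence: cosine_similarity(query_vector, vector)
    for sentence, vector in toy_embeddings.items()
}

# Print the sentences from most to least similar to the query.
for sentence in sorted(similarities, key=similarities.get, reverse=True):
    print(f"{similarities[sentence]:.3f}  {sentence}")
```

With these toy values, "The man is very happy" scores close to 1 and ranks above "The cat is happy", which is the behavior a vector search relies on.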
Combined with LLMs (here LLaMa and ChatGLM) by LangChain
For examples of combining LLaMa or quantized Alpaca with llama.cpp to build a local knowledge database, please see here.
For examples of combining ChatGLM to build a local knowledge database, please see here.
Get involved
License
Apache 2.0
Join the AwaDB community to share any problem, suggestion, or discussion with us: