Upstash Vector Python SDK
The Upstash Vector Python client
[!NOTE]
This project is in GA Stage.
The Upstash Professional Support fully covers this project. It receives regular updates and bug fixes.
The Upstash team is committed to maintaining and improving its functionality.
Installation
Install a released version from pip:
pip3 install upstash-vector
Usage
To use this client, head over to the Upstash Console and create a vector database.
Then copy the UPSTASH_VECTOR_REST_URL and the UPSTASH_VECTOR_REST_TOKEN from the dashboard.
Initializing the Index
from upstash_vector import Index
index = Index(url=UPSTASH_VECTOR_REST_URL, token=UPSTASH_VECTOR_REST_TOKEN)
Alternatively, initialize the index from environment variables:
export UPSTASH_VECTOR_REST_URL=[URL]
export UPSTASH_VECTOR_REST_TOKEN=[TOKEN]
from upstash_vector import Index
index = Index.from_env()
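Since this form of initialization depends on both variables being present, a small pre-flight check can make a misconfigured shell easier to diagnose. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and not part of the SDK.

```python
import os

# The two variable names used by the environment-based initialization above.
REQUIRED_VARS = ("UPSTASH_VECTOR_REST_URL", "UPSTASH_VECTOR_REST_TOKEN")


def missing_env_vars(environ=os.environ):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; it is not provided by upstash-vector.
    """
    return [name for name in REQUIRED_VARS if not environ.get(name)]


# Passing a plain dict makes the behavior easy to see without touching the shell.
example = missing_env_vars({"UPSTASH_VECTOR_REST_URL": "https://example.invalid"})
```

Calling `missing_env_vars()` with no argument inspects the real environment, so it can run right before `Index.from_env()`.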
Upsert Vectors
Vectors can be upserted (inserted or updated) into a namespace of an index
to be queried or fetched later.
Upserts accept vectors as tuples, dictionaries, or Vector objects:
index.upsert(
    vectors=[
        ("id1", [0.1, 0.2], {"metadata_field": "metadata_value"}, "data-value"),
        ("id2", [0.2, 0.2], {"metadata_field": "metadata_value"}),
        ("id3", [0.3, 0.4]),
    ]
)

index.upsert(
    vectors=[
        {"id": "id4", "vector": [0.1, 0.2], "metadata": {"field": "value"}, "data": "value"},
        {"id": "id5", "vector": [0.1, 0.2], "metadata": {"field": "value"}},
        {"id": "id6", "vector": [0.1, 0.2], "data": "value"},
        {"id": "id7", "vector": [0.5, 0.6]},
    ]
)

from upstash_vector import Vector

index.upsert(
    vectors=[
        Vector(id="id5", vector=[1, 2], metadata={"field": "value"}),
        Vector(id="id6", vector=[1, 2], data="value"),
        Vector(id="id7", vector=[6, 7]),
    ]
)
If the index is created with an embedding model, raw string data can be upserted.
In this case, the data field of the vector will also be set to the data passed
below, so that it can be accessed later.
from upstash_vector import Data

res = index.upsert(
    vectors=[
        Data(id="id5", data="Goodbye World", metadata={"field": "value"}),
        Data(id="id6", data="Hello World"),
    ]
)
Also, a namespace can be specified to upsert vectors into it.
When no namespace is provided, the default namespace is used.
index.upsert(
    vectors=[
        ("id1", [0.1, 0.2]),
        ("id2", [0.3, 0.4]),
    ],
    namespace="ns",
)
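When there are many vectors to upsert, it can be convenient to send them in several calls rather than one. The following is a minimal sketch of that pattern; the helper name `batched` and the batch size of 2 are illustrative assumptions, not SDK features or documented limits. The actual `index.upsert` call is left in a comment because it needs a live index.

```python
from itertools import islice


def batched(items, size):
    """Yield consecutive lists of at most `size` items from any iterable.

    Hypothetical helper for illustration; it is not provided by upstash-vector.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Same tuple form as in the examples above: (id, vector).
vectors = [(f"id{i}", [i / 10, i / 10]) for i in range(5)]
batches = list(batched(vectors, 2))

# With a real index, each batch would be sent separately, for example:
# for batch in batched(vectors, 2):
#     index.upsert(vectors=batch, namespace="ns")
```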
Query Vectors
Up to top_k vectors that are approximately most similar to a given
query vector can be requested from a namespace of an index.
res = index.query(
    vector=[0.6, 0.9],
    top_k=5,
    include_vectors=False,
    include_metadata=True,
    include_data=True,
    filter="metadata_f = 'metadata_v'",
)

for r in res:
    print(
        r.id,
        r.score,
        r.vector,
        r.metadata,
        r.data,
    )
If the index is created with an embedding model, raw string data can be queried.
res = index.query(
    data="hello",
    top_k=5,
    include_vectors=False,
    include_metadata=True,
    include_data=True,
)
When a filter is provided, query results are narrowed down to
the vectors whose metadata matches it.
See the Metadata Filtering documentation
for more information regarding the filter syntax.
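Filter strings such as `"metadata_f = 'metadata_v'"` can be assembled from Python values instead of being written by hand. The following is a minimal sketch under stated assumptions: the helper name `equality_filter` is hypothetical, only equality clauses on strings and numbers are handled, and joining clauses with `AND` follows the Metadata Filtering documentation rather than anything shown in this README, so check the syntax there before relying on it.

```python
def equality_filter(**fields):
    """Build a filter string such as "genre = 'drama' AND year = 2020".

    Hypothetical helper for illustration; it is not provided by upstash-vector.
    """
    clauses = []
    for key, value in fields.items():
        if isinstance(value, str):
            if "'" in value:
                # Escaping rules are not covered here, so refuse instead of guessing.
                raise ValueError(f"unsupported quote in value for {key!r}")
            clauses.append(f"{key} = '{value}'")
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            clauses.append(f"{key} = {value}")
        else:
            raise TypeError(f"unsupported value type for {key!r}")
    return " AND ".join(clauses)


# Reproduces the filter used in the query example above.
filter_str = equality_filter(metadata_f="metadata_v")
```

The result can be passed as the `filter` argument of `index.query`.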
Also, a namespace can be specified to query from.
When no namespace is provided, the default namespace is used.
res = index.query(
    vector=[0.6, 0.9],
    top_k=5,
    namespace="ns",
)
Fetch Vectors
A set of vectors can be fetched from a namespace of an index.
res = index.fetch(
    ids=["id3", "id4"],
    include_vectors=False,
    include_metadata=True,
    include_data=True,
)

for r in res:
    if not r:
        continue
    print(
        r.id,
        r.vector,
        r.metadata,
        r.data,
    )
Or, to fetch a single vector:
res = index.fetch(
    "id1",
    include_vectors=True,
    include_metadata=True,
    include_data=False,
)

r = res[0]
if r:
    print(
        r.id,
        r.vector,
        r.metadata,
        r.data,
    )
Also, a namespace can be specified to fetch from.
When no namespace is provided, the default namespace is used.
res = index.fetch(
    ids=["id3", "id4"],
    namespace="ns",
)
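The `if not r` checks in the fetch examples skip entries for ids that were not found. The following is a minimal sketch that separates found ids from missing ones. It assumes that results come back in the same order as the requested ids, which this README does not state explicitly; the helper name `split_found_missing` is hypothetical, and a plain list stands in for the fetch result so the sketch runs without a database.

```python
def split_found_missing(ids, results):
    """Pair each requested id with its result and separate hits from misses.

    Hypothetical helper for illustration; it assumes `results` is ordered like `ids`.
    """
    found = {}
    missing = []
    for vector_id, result in zip(ids, results):
        if result:
            found[vector_id] = result
        else:
            missing.append(vector_id)
    return found, missing


# Stand-in for `index.fetch(ids=ids)`. Real results are objects with id, vector,
# metadata and data attributes; dicts are used here only to keep the sketch offline.
ids = ["id3", "id4", "unknown-id"]
fake_results = [{"id": "id3"}, {"id": "id4"}, None]
found, missing = split_found_missing(ids, fake_results)
```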
Range Over Vectors
The vectors upserted into a namespace of an index can be scanned
page by page.
# Start from an empty cursor and follow next_cursor until it is empty again.
cursor = ""
while True:
    res = index.range(
        cursor=cursor,
        limit=100,
        include_vectors=False,
        include_metadata=True,
        include_data=True,
    )
    # Process every page, including the first and the last one.
    for v in res.vectors:
        print(
            v.id,
            v.vector,
            v.metadata,
            v.data,
        )
    if res.next_cursor == "":
        break
    cursor = res.next_cursor
Also, a namespace can be specified to range from.
When no namespace is provided, the default namespace is used.
res = index.range(
    cursor="",
    limit=100,
    namespace="ns",
)
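The cursor handling can also be wrapped in a generator so that callers iterate over vectors without managing pages themselves. The following is a minimal sketch: `iter_vectors` is a hypothetical helper, not part of the SDK, and `FakeIndex` is an in-memory stand-in that only mimics the `next_cursor` and `vectors` fields used above, so the example runs without a database.

```python
from types import SimpleNamespace


def iter_vectors(index, limit=100, **range_kwargs):
    """Yield every vector by following next_cursor until it is empty.

    Hypothetical helper; extra keyword arguments (for example namespace or
    include_metadata) are passed through to index.range unchanged.
    """
    cursor = ""
    while True:
        res = index.range(cursor=cursor, limit=limit, **range_kwargs)
        yield from res.vectors
        if res.next_cursor == "":
            return
        cursor = res.next_cursor


class FakeIndex:
    """In-memory stand-in that mimics the shape of a range() response."""

    def __init__(self, items):
        self._items = list(items)

    def range(self, cursor="", limit=100, **_):
        start = int(cursor or 0)
        end = start + limit
        # An empty next_cursor signals that there are no more pages.
        next_cursor = str(end) if end < len(self._items) else ""
        return SimpleNamespace(next_cursor=next_cursor, vectors=self._items[start:end])


scanned = list(iter_vectors(FakeIndex(["id1", "id2", "id3", "id4", "id5"]), limit=2))
```

With a real index, the call would look like `for v in iter_vectors(index, limit=100, namespace="ns")`.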
Delete Vectors
A list of vectors can be deleted from a namespace of an index.
If no vectors with the given ids exist, this is a no-op.
res = index.delete(
    ids=["id1", "id2"],
)

print(
    res.deleted,
)
Or, to delete a single vector:
res = index.delete(
    "id1",
)

print(res)
Also, a namespace can be specified to delete from.
When no namespace is provided, the default namespace is used.
res = index.delete(
    ids=["id1", "id2"],
    namespace="ns",
)
Update a Vector
Either the vector value (or the data, for indexes created with an embedding model) or the metadata
can be updated without setting the other.
res = index.update(
    "id1",
    metadata={"new_field": "new_value"},
)

print(res)
Also, the namespace that contains the vector can be specified.
When no namespace is provided, the default namespace is used.
res = index.update(
    "id1",
    metadata={"new_field": "new_value"},
    namespace="ns",
)
Reset the Namespace
All vectors can be removed from a namespace of an index.
index.reset()
Also, a namespace can be specified to reset.
When no namespace is provided, the default namespace is used.
index.reset(
    namespace="ns",
)
All namespaces under the index can be reset with a single call as well.
index.reset(
    all=True,
)
Index Info
Some information regarding the status and type of the index can be requested.
This information also contains per-namespace status.
info = index.info()
print(
    info.vector_count,
    info.pending_vector_count,
    info.index_size,
    info.dimension,
    info.similarity_function,
)

for ns, ns_info in info.namespaces.items():
    print(
        ns,
        ns_info.vector_count,
        ns_info.pending_vector_count,
    )
List Namespaces
All the names of active namespaces can be listed.
namespaces = index.list_namespaces()
for ns in namespaces:
    print(ns)
Delete a Namespace
A namespace can be deleted entirely.
If no such namespace exists, an exception is raised.
The default namespace cannot be deleted.
index.delete_namespace(namespace="ns")
Contributing
Preparing the environment
This project uses Poetry for packaging and dependency management. Make sure you are able to create the poetry shell with relevant dependencies.
You will also need a vector database on Upstash.
poetry install
Code Formatting
poetry run ruff format .
Running tests
To run all the tests, make sure the Poetry virtual environment is activated with all
the necessary dependencies.
Create two vector databases on Upstash. The first one should have 2 dimensions. The second one should use an embedding model. Set the necessary environment variables:
URL=****
TOKEN=****
EMBEDDING_URL=****
EMBEDDING_TOKEN=****
Then, run the following command to run tests:
poetry run pytest