DashVector Client Python Library
DashVector is a scalable and fully-managed vector-database service for building various machine learning applications. The DashVector client SDK is your gateway to access the DashVector service.
For more information about DashVector, please visit: https://help.aliyun.com/document_detail/2510225.html
Installation
To install the DashVector client Python SDK, simply run:
pip install dashvector
QuickStart
import numpy as np
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
rsp = client.create(name='quickstart', dimension=4)
assert rsp
collection = client.get(name='quickstart')
collection.insert(
    [
        dashvector.Doc(id=str(i), vector=np.random.rand(4), fields={'anykey': 'anyvalue'})
        for i in range(16)
    ]
)
docs = collection.query([0.1, 0.2, 0.3, 0.4], topk=5)
print(docs)
stats = collection.stats()
print(stats)
client.delete(name='quickstart')
Reference
Create a Client
Client hosts various APIs for interacting with DashVector Collections.
dashvector.Client(
    api_key: str,
    endpoint: str = 'dashvector.cn-hangzhou.aliyuncs.com',
    protocol: dashvector.DashVectorProtocol = dashvector.DashVectorProtocol.GRPC,
    timeout: float = 10.0
) -> Client
Parameters | Type | Required | Description |
---|---|---|---|
api_key | str | Yes | Your DashVector API-KEY |
endpoint | str | No | Service endpoint. Default value: dashvector.cn-hangzhou.aliyuncs.com |
protocol | DashVectorProtocol | No | Communication protocol; HTTP and GRPC are supported. Default value: DashVectorProtocol.GRPC |
timeout | float | No | Timeout period in seconds; -1 means no timeout. Default value: 10.0 |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
assert client
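Hard-coding the key as in the example above is fine for a quick test, but it is usually safer to read it from the environment. The sketch below shows one way to do that. The helper `load_api_key` and the variable name `DASHVECTOR_API_KEY` are assumptions made for this illustration, not conventions of the SDK.

```python
import os


def load_api_key(env_var: str = "DASHVECTOR_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing.

    The variable name is an assumption for this sketch, not an SDK convention.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return api_key


# Intended use (requires the dashvector package and a valid key):
#   client = dashvector.Client(api_key=load_api_key())
```

Failing early with a clear message avoids sending a request with an empty key.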
Create Collection
Client.create(
    name: str,
    dimension: int,
    dtype: Union[Type[int], Type[float]] = float,
    fields_schema: Optional[Dict[str, Union[Type[str], Type[int], Type[float], Type[bool]]]] = None,
    metric: str = 'cosine',
    timeout: Optional[int] = None
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Collection to create. |
dimension | int | Yes | The dimension of the Collection's vectors. Valid values: 1 to 20,000 |
dtype | Union[Type[int], Type[float]] | No | The data type of the Collection's vectors. Default value: float |
fields_schema | Optional[Dict[str, Union[Type[str], Type[int], Type[float], Type[bool]]]] | No | Fields schema of the Collection, e.g. {"name": str, "age": int}. Default value: None |
metric | str | No | Vector similarity metric. Valid values: cosine (default), dotproduct, euclidean. For cosine, dtype must be float. |
timeout | Optional[int] | No | Timeout period in seconds; -1 means the Collection is created asynchronously. Default value: None |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
rsp = client.create('YOUR-COLLECTION-NAME', dimension=4)
assert rsp
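To make the three `metric` values concrete, the sketch below computes the textbook definitions locally with the standard library. It is only an illustration: the scores the service reports may be scaled or transformed differently, and the function names are chosen for this example.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products (larger means more similar)."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the two vectors after normalising each to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between the two points (smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine ignores vector length and compares direction only, which is one reason it is paired with float vectors.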
List Collections
Client.list() -> DashVectorResponse
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collections = client.list()
for collection in collections:
    print(collection)
Describe Collection
Client.describe(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Collection to describe. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
rsp = client.describe('YOUR-COLLECTION-NAME')
print(rsp)
Delete Collection
Client.delete(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Collection to delete. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
client.delete('YOUR-COLLECTION-NAME')
Get a Collection Instance
Collection provides APIs for accessing Docs and Partitions.
Client.get(name: str) -> Collection
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Collection to get. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
assert collection
Describe Collection Statistics
Collection.stats() -> DashVectorResponse
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.stats()
print(rsp)
Insert/Update/Upsert Docs
Collection.insert(
    docs: Union[Doc, List[Doc], Tuple, List[Tuple]],
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
docs | Union[Doc, List[Doc], Tuple, List[Tuple]] | Yes | The docs to Insert/Update/Upsert. |
partition | Optional[str] | No | Name of the partition to Insert/Update/Upsert. Default value: None |
async_req | bool | No | Enable async request or not. Default value: False |
Example:
import dashvector
import numpy as np
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
# insert a single doc as an (id, vector) tuple
collection.insert(('YOUR-DOC-ID1', [0.1, 0.2, 0.3, 0.4]))
# insert a single doc as an (id, vector, fields) tuple
collection.insert(('YOUR-DOC-ID2', [0.2, 0.3, 0.4, 0.5], {'age': 30, 'name': 'alice', 'anykey': 'anyvalue'}))
# insert a single Doc object
collection.insert(
    dashvector.Doc(
        id='YOUR-DOC-ID3',
        vector=[0.3, 0.4, 0.5, 0.6],
        fields={'foo': 'bar'}
    )
)
# insert a batch of tuples
ret = collection.insert(
    [
        ('YOUR-DOC-ID4', [0.2, 0.7, 0.8, 1.3], {'age': 1}),
        ('YOUR-DOC-ID5', [0.3, 0.6, 0.9, 1.2], {'age': 2}),
        ('YOUR-DOC-ID6', [0.4, 0.5, 1.0, 1.1], {'age': 3})
    ]
)
# insert a batch of Doc objects
ret = collection.insert(
    [
        dashvector.Doc(id=str(i), vector=np.random.rand(4)) for i in range(10)
    ]
)
# insert a batch asynchronously, then wait for the result
ret_future = collection.insert(
    [
        dashvector.Doc(id=str(i+10), vector=np.random.rand(4)) for i in range(10)
    ],
    async_req=True
)
ret = ret_future.get()
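When there are many docs to load, it is common to split them into fixed-size batches and call `insert` once per batch. The sketch below shows a generic batching helper. The name `chunked` and the batch size of 4 are arbitrary choices for this illustration; check the service limits for a suitable size.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Ten docs in the (id, vector) tuple form accepted by insert
docs = [(str(i), [0.1 * i, 0.2, 0.3, 0.4]) for i in range(10)]
batches = list(chunked(docs, 4))

# Intended use (requires a Collection instance):
#   for batch in chunked(docs, 4):
#       rsp = collection.insert(batch)
#       assert rsp
```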
Query a Collection
Collection.query(
    vector: Optional[Union[List[Union[int, float]], np.ndarray]] = None,
    id: Optional[str] = None,
    topk: int = 10,
    filter: Optional[str] = None,
    include_vector: bool = False,
    partition: Optional[str] = None,
    output_fields: Optional[List[str]] = None,
    async_req: bool = False
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
vector | Optional[Union[List[Union[int, float]], np.ndarray]] | No | The vector to query. Default value: None |
id | Optional[str] | No | The doc id to query. When id is set, the search uses the vector stored for that id. Default value: None |
topk | int | No | Number of most similar results to return. Default value: 10 |
filter | Optional[str] | No | Expression used to filter results, e.g. age>20. Default value: None |
include_vector | bool | No | Return vector details or not. Default value: False |
partition | Optional[str] | No | Name of the partition to query. Default value: None |
output_fields | Optional[List[str]] | No | List of field names to return, e.g. ['name', 'age']. Default value: None, which returns all fields |
async_req | bool | No | Enable async request or not. Default value: False |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
match_docs = collection.query([0.1, 0.2, 0.3, 0.4], topk=100, filter='age>20', include_vector=True, output_fields=['age','name','foo'])
if match_docs:
    for doc in match_docs:
        print(doc.id)
        print(doc.vector)
        print(doc.fields)
        print(doc.score)
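To illustrate what `topk` means, the sketch below runs an exact, brute-force search over a small in-memory set of vectors and keeps the `topk` highest cosine scores. It is a local reference only, useful for sanity-checking results on tiny datasets; it makes no claim about how the service indexes or scores vectors, and `brute_force_topk` is a name invented for this example.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_topk(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    topk: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to `topk` (id, score) pairs, highest score first."""
    scored = ((doc_id, cosine(query, vec)) for doc_id, vec in vectors.items())
    return heapq.nlargest(topk, scored, key=lambda pair: pair[1])


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
best = brute_force_topk([1.0, 0.0], vectors, topk=2)
```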
Delete Docs
Collection.delete(
    ids: Union[str, List[str]],
    delete_all: bool = False,
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
ids | Union[str, List[str]] | Yes | The id (or list of ids) for the Doc(s) to Delete |
delete_all | bool | No | Delete all vectors from partition. Default value: False |
partition | Optional[str] | No | Name of the partition to Delete from. Default value: None |
async_req | bool | No | Enable async request or not. Default value: False |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
collection.delete(['YOUR-DOC-ID1','YOUR-DOC-ID2'])
Fetch Docs
Collection.fetch(
    ids: Union[str, List[str]],
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
ids | Union[str, List[str]] | Yes | The id (or list of ids) for the Doc(s) to Fetch |
partition | Optional[str] | No | Name of the partition to Fetch from. Default value: None |
async_req | bool | No | Enable async request or not. Default value: False |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
fetch_docs = collection.fetch(['YOUR-DOC-ID1', 'YOUR-DOC-ID2'])
if fetch_docs:
    for doc_id in fetch_docs:
        doc = fetch_docs[doc_id]
        print(doc.id)
        print(doc.vector)
        print(doc.fields)
Create Collection Partition
Collection.create_partition(name: str, timeout: Optional[int] = None) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Partition to create. |
timeout | Optional[int] | No | Timeout period in seconds; -1 means the Partition is created asynchronously. Default value: None |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.create_partition('YOUR-PARTITION-NAME')
assert rsp
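Because `insert` takes a single `partition` name per call, docs destined for different partitions need to be grouped first. The sketch below groups docs in the `(id, vector, fields)` tuple form by the value of one field. The helper `group_by_field`, the `category` field, and the fallback name `'default'` are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_by_field(
    docs: List[Tuple], field: str, fallback: str = "default"
) -> Dict[str, List[Tuple]]:
    """Group (id, vector[, fields]) tuples by the value of `field`.

    Docs without fields, or without `field`, go under `fallback`
    (a placeholder name chosen for this sketch).
    """
    groups: Dict[str, List[Tuple]] = defaultdict(list)
    for doc in docs:
        fields = doc[2] if len(doc) > 2 else {}
        groups[str(fields.get(field, fallback))].append(doc)
    return dict(groups)


docs = [
    ("1", [0.1, 0.2, 0.3, 0.4], {"category": "shoes"}),
    ("2", [0.2, 0.3, 0.4, 0.5]),
    ("3", [0.3, 0.4, 0.5, 0.6], {"category": "hats"}),
    ("4", [0.4, 0.5, 0.6, 0.7], {"category": "shoes"}),
]
groups = group_by_field(docs, "category")

# Intended use (requires a Collection with these partitions already created):
#   for partition_name, batch in groups.items():
#       collection.insert(batch, partition=partition_name)
```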
Delete Collection Partition
Collection.delete_partition(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Partition to Delete. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.delete_partition('YOUR-PARTITION-NAME')
assert rsp
List Collection Partitions
Collection.list_partitions() -> DashVectorResponse
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
partitions = collection.list_partitions()
assert partitions
for pt in partitions:
    print(pt)
Describe Collection Partition
Collection.describe_partition(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Partition to Describe. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.describe_partition('shoes')
print(rsp)
Statistics for Collection Partition
Collection.stats_partition(name: str) -> DashVectorResponse
Parameters | Type | Required | Description |
---|---|---|---|
name | str | Yes | The name of the Partition to get Statistics. |
Example:
import dashvector
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.stats_partition('shoes')
print(rsp)
Class
dashvector.Doc
@dataclass(frozen=True)
class Doc(object):
    id: str
    vector: Union[List[int], List[float], numpy.ndarray]
    fields: Optional[Dict[str, Union[str, int, float, bool]]] = None
    score: float = 0.0
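Since a Collection can declare a `fields_schema` such as `{"name": str, "age": int}`, it can help to check a doc's `fields` against that schema locally before inserting. The sketch below is one strict way to do so. The helper `check_fields` is hypothetical, and how the service itself treats mismatched or undeclared fields is not described here.

```python
from typing import Any, Dict, List


def check_fields(fields: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the fields match.

    Fields absent from `schema` are ignored, and declared fields may be
    omitted. Both choices are assumptions of this sketch.
    """
    problems = []
    for name, expected in schema.items():
        if name not in fields:
            continue
        value = fields[name]
        # bool is a subclass of int, so reject True/False for non-bool fields
        wrong_bool = isinstance(value, bool) and expected is not bool
        if wrong_bool or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


schema = {"name": str, "age": int}
ok = check_fields({"name": "alice", "age": 30, "anykey": "anyvalue"}, schema)
bad = check_fields({"name": "alice", "age": "30"}, schema)
```

The check is deliberately strict: an `int` value is reported for a `float` field, so relax it if implicit conversion is acceptable.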
dashvector.DashVectorResponse
class DashVectorResponse(object):
    code: DashVectorCode
    message: str
    request_id: str
    output: Any
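The examples in this document test responses with `assert rsp` or `if rsp:`. A small wrapper can turn a failed response into an exception that carries `code`, `message`, and `request_id`. The sketch below uses a stand-in class so it runs without the SDK. `FakeResponse`, `unwrap`, and the assumption that code 0 means success are all illustrative, not part of the documented API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeResponse:
    """Stand-in with the four attributes listed for DashVectorResponse."""
    code: int
    message: str
    request_id: str
    output: Any = None

    def __bool__(self) -> bool:
        # Assumption for this sketch: truthy only when code is 0 (success)
        return self.code == 0


def unwrap(rsp: Any) -> Any:
    """Return rsp.output, or raise with the request details on failure."""
    if not rsp:
        raise RuntimeError(
            f"request {rsp.request_id} failed: [{rsp.code}] {rsp.message}"
        )
    return rsp.output


# Intended use with a real response:
#   docs = unwrap(collection.fetch(['YOUR-DOC-ID1']))
```

Including `request_id` in the error message makes it easier to trace a failed call when contacting support.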
License
This project is licensed under the Apache License (Version 2.0).