dashvector

DashVector Client Python SDK Library

  • 1.0.18
  • PyPI

DashVector Client Python Library

DashVector is a scalable and fully-managed vector-database service for building various machine learning applications. The DashVector client SDK is your gateway to access the DashVector service.

For more information about DashVector, please visit: https://help.aliyun.com/document_detail/2510225.html

Installation

To install the DashVector client Python SDK, simply run:

pip install dashvector

QuickStart

import numpy as np
import dashvector

# Use DashVector `Client` api to communicate with the backend vectorDB service.
client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')

# Create a collection named "quickstart" with dimension of 4, using the default Cosine distance metric
rsp = client.create(name='quickstart', dimension=4)
assert rsp

# Get a collection by name
collection = client.get(name='quickstart')

# Operations on a `Collection` include Insert/Query/Upsert/Update/Delete/Fetch of docs
# Here we insert 16 sample docs (4-dimensional vectors) in a single batch
collection.insert(
    [
        dashvector.Doc(id=str(i), vector=np.random.rand(4), fields={'anykey': 'anyvalue'}) 
        for i in range(16)
    ]
)

# Query a vector from the collection
docs = collection.query([0.1, 0.2, 0.3, 0.4], topk=5)
print(docs)

# Get statistics about collection
stats = collection.stats()
print(stats)

# Delete a collection by name
client.delete(name='quickstart')

Reference

Create a Client

The Client hosts the APIs for interacting with DashVector Collections.

dashvector.Client(
    api_key: str,
    endpoint: str = 'dashvector.cn-hangzhou.aliyuncs.com',
    protocol: dashvector.DashVectorProtocol = dashvector.DashVectorProtocol.GRPC,
    timeout: float = 10.0
) -> Client

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| api_key | str | Yes | Your DashVector API-KEY |
| endpoint | str | No | Service endpoint. Default value: dashvector.cn-hangzhou.aliyuncs.com |
| protocol | DashVectorProtocol | No | Communication protocol; HTTP and GRPC are supported. Default value: DashVectorProtocol.GRPC |
| timeout | float | No | Timeout period (in seconds); -1 means no timeout. Default value: 10.0 |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
assert client
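
The examples in this README hard-code the API key for brevity. A minimal sketch of reading it from the environment instead is shown below. The variable name `DASHVECTOR_API_KEY` and the helper `load_api_key` are hypothetical and are not part of the SDK.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 var_name: str = "DASHVECTOR_API_KEY") -> str:
    """Return the API key stored under `var_name`, or raise if it is missing.

    Both this helper and the variable name are illustrative assumptions.
    """
    source = os.environ if env is None else env
    key = source.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before creating a client")
    return key


# Intended usage (requires the dashvector package and a valid key):
#   client = dashvector.Client(api_key=load_api_key())
```

Passing a plain dict as `env` makes the helper easy to test without touching the real environment.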

Create Collection

Client.create(
    name: str,
    dimension: int,
    dtype: Union[Type[int], Type[float]] = float,
    fields_schema: Optional[Dict[str, Union[Type[str], Type[int], Type[float], Type[bool]]]] = None,
    metric: str = 'cosine',
    timeout: Optional[int] = None
) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | The name of the Collection to create. |
| dimension | int | Yes | The dimension of the Collection's vectors. Valid values: 1-20,000 |
| dtype | Union[Type[int], Type[float]] | No | The data type of the Collection's vectors. Default value: float |
| fields_schema | Optional[Dict[str, Union[Type[str], Type[int], Type[float], Type[bool]]]] | No | Fields schema of the Collection. Default value: None. e.g. {"name": str, "age": int} |
| metric | str | No | Vector similarity metric. For cosine, dtype must be float. Valid values: cosine (default), dotproduct, euclidean |
| timeout | Optional[int] | No | Timeout period (in seconds); -1 means the Collection is created asynchronously. Default value: None |
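
To make the three `metric` values concrete, here is a plain-Python sketch of the quantities they are conventionally named after. It only illustrates the math. How the DashVector service computes or normalizes the `score` it returns is not specified in this document, and the function names below are illustrative rather than SDK functions.

```python
import math
from typing import Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")


def dotproduct(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products."""
    _check_same_length(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the two vector norms."""
    norm = math.sqrt(dotproduct(a, a)) * math.sqrt(dotproduct(b, b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dotproduct(a, b) / norm


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between the two vectors."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Note that cosine and dot product are similarities (larger means closer), while Euclidean is a distance (smaller means closer).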

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')

rsp = client.create('YOUR-COLLECTION-NAME', dimension=4)
assert rsp
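
The constraints stated for `Client.create` can be checked locally before a request is sent. The helper below is a hypothetical sketch (`validate_create_args` is not an SDK function). It encodes only the rules given in this section: a dimension between 1 and 20,000, one of the three listed metrics, and a float dtype when the metric is cosine.

```python
VALID_METRICS = ("cosine", "dotproduct", "euclidean")


def validate_create_args(dimension: int,
                         dtype: type = float,
                         metric: str = "cosine") -> None:
    """Raise if the arguments violate the documented constraints."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(dimension, bool) or not isinstance(dimension, int):
        raise TypeError("dimension must be an int")
    if not 1 <= dimension <= 20000:
        raise ValueError("dimension must be between 1 and 20000")
    if dtype not in (int, float):
        raise ValueError("dtype must be int or float")
    if metric not in VALID_METRICS:
        raise ValueError(f"metric must be one of {VALID_METRICS}")
    if metric == "cosine" and dtype is not float:
        raise ValueError("the cosine metric requires dtype=float")


# Intended usage (requires the dashvector package and a valid key):
#   validate_create_args(dimension=4)
#   rsp = client.create('YOUR-COLLECTION-NAME', dimension=4)
```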

List Collections

Client.list() -> DashVectorResponse

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')

collections = client.list()

for collection in collections:
    print(collection)
# outputs:
# 'quickstart'

Describe Collection

Client.describe(name: str) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | The name of the Collection to describe. |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
rsp = client.describe('YOUR-COLLECTION-NAME')

print(rsp)
# example output:
# {
#   "request_id": "8d3ac14e-5382-4736-b77c-4318761ddfab",
#   "code": 0,
#   "message": "",
#   "output": {
#     "name": "quickstart",
#     "dimension": 4,
#     "dtype": "FLOAT",
#     "metric": "dotproduct",
#     "fields_schema": {
#       "name": "STRING",
#       "age": "INT",
#       "height": "FLOAT"
#     },
#     "status": "SERVING",
#     "partitions": {
#       "default": "SERVING"
#     }
#   }
# }

Delete Collection

Client.delete(name: str) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | The name of the Collection to delete. |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
client.delete('YOUR-COLLECTION-NAME')

Get a Collection Instance

A Collection provides the APIs for accessing Docs and Partitions.

Client.get(name: str) -> Collection

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | The name of the Collection to get. |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
assert collection

Describe Collection Statistics

Collection.stats() -> DashVectorResponse

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.stats()

print(rsp)
# example output:
# {
#   "request_id": "14448bcb-c9a3-49a8-9152-0de3990bce59",
#   "code": 0,
#   "message": "Success",
#   "output": {
#     "total_doc_count": "26",
#     "index_completeness": 1.0,
#     "partitions": {
#       "default": {
#         "total_doc_count": "26"
#       }
#     }
#   }
# }

Insert/Update/Upsert Docs

Collection.insert(
    docs: Union[Doc, List[Doc], Tuple, List[Tuple]],
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| docs | Union[Doc, List[Doc], Tuple, List[Tuple]] | Yes | The docs to Insert/Update/Upsert. |
| partition | Optional[str] | No | Name of the partition to Insert/Update/Upsert. Default value: None |
| async_req | bool | No | Enable async request or not. Default value: False |

Example:

import dashvector
import numpy as np

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')

# insert a doc with Tuple
collection.insert(('YOUR-DOC-ID1', [0.1, 0.2, 0.3, 0.4]))
collection.insert(('YOUR-DOC-ID2', [0.2, 0.3, 0.4, 0.5], {'age': 30, 'name': 'alice', 'anykey': 'anyvalue'}))

# insert a doc with dashvector.Doc
collection.insert(
    dashvector.Doc(
        id='YOUR-DOC-ID3', 
        vector=[0.3, 0.4, 0.5, 0.6], 
        fields={'foo': 'bar'}
    )
)

# insert in batches
ret = collection.insert(
    [
        ('YOUR-DOC-ID4', [0.2, 0.7, 0.8, 1.3], {'age': 1}),
        ('YOUR-DOC-ID5', [0.3, 0.6, 0.9, 1.2], {'age': 2}),
        ('YOUR-DOC-ID6', [0.4, 0.5, 1.0, 1.1], {'age': 3})
    ]
)

# insert in batches
ret = collection.insert(
    [
        dashvector.Doc(id=str(i), vector=np.random.rand(4)) for i in range(10)
    ]
)

# async insert: the call returns immediately; get() waits for the result
ret_future = collection.insert(
    [
        dashvector.Doc(id=str(i+10), vector=np.random.rand(4)) for i in range(10)
    ],
    async_req=True
)
ret = ret_future.get()
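
For larger data sets it can help to split docs into fixed-size batches before calling `insert`. A minimal sketch follows. The `batched` helper and the batch size of 16 are illustrative choices, not SDK features or documented limits.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Docs in the tuple form accepted by insert: (id, vector, fields)
docs = [(str(i), [0.1 * i, 0.2, 0.3, 0.4], {"age": i}) for i in range(40)]
batches = list(batched(docs, 16))
print([len(b) for b in batches])  # prints [16, 16, 8]

# Intended usage (requires the dashvector package and a valid key):
#   for batch in batched(docs, 16):
#       rsp = collection.insert(batch)
#       assert rsp
```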

Query a Collection

Collection.query(
    vector: Optional[Union[List[Union[int, float]], np.ndarray]] = None,
    id: Optional[str] = None,
    topk: int = 10,
    filter: Optional[str] = None,
    include_vector: bool = False,
    partition: Optional[str] = None,
    output_fields: Optional[List[str]] = None,
    async_req: bool = False
) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| vector | Optional[Union[List[Union[int, float]], np.ndarray]] | No | The vector to query. |
| id | Optional[str] | No | The doc id to query. Setting id means searching by the vector that corresponds to that id. |
| topk | int | No | Number of similar results to return. Default value: 10 |
| filter | Optional[str] | No | Expression used to filter results. Default value: None. e.g. age>20 |
| include_vector | bool | No | Return vector details or not. Default value: False |
| partition | Optional[str] | No | Name of the partition to Query. Default value: None |
| output_fields | Optional[List[str]] | No | List of field names to return. Default value: None, which means all fields are returned. e.g. ['name', 'age'] |
| async_req | bool | No | Enable async request or not. Default value: False |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
match_docs = collection.query(
    [0.1, 0.2, 0.3, 0.4],
    topk=100,
    filter='age>20',
    include_vector=True,
    output_fields=['age', 'name', 'foo']
)
if match_docs:
    for doc in match_docs:
        print(doc.id)
        print(doc.vector)
        print(doc.fields)
        print(doc.score)
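
Conceptually, a query ranks stored vectors by similarity to the query vector and returns the best `topk`. The sketch below does this by brute force with cosine similarity over an in-memory dict. It is a teaching aid only: it makes no claim about how the DashVector service indexes or scores data, and `local_topk` is a hypothetical name.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def local_topk(query: Sequence[float],
               stored: Dict[str, Sequence[float]],
               topk: int = 10) -> List[Tuple[str, float]]:
    """Return up to `topk` (doc_id, score) pairs, most similar first."""
    scored = ((_cosine(query, vec), doc_id) for doc_id, vec in stored.items())
    best = heapq.nlargest(topk, scored)
    return [(doc_id, score) for score, doc_id in best]


stored = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
result = local_topk([1.0, 0.1], stored, topk=2)
print([doc_id for doc_id, _ in result])  # prints ['a', 'c']
```

A brute-force scan is linear in the number of stored vectors, which is fine for a handful of docs but is the cost a vector database is meant to avoid at scale.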

Delete Docs

Collection.delete(
    ids: Union[str, List[str]],
    delete_all: bool = False,
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| ids | Union[str, List[str]] | Yes | The id (or list of ids) for the Doc(s) to Delete. |
| delete_all | bool | No | Delete all vectors from the partition. Default value: False |
| partition | Optional[str] | No | Name of the partition to Delete from. Default value: None |
| async_req | bool | No | Enable async request or not. Default value: False |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
collection.delete(['YOUR-DOC-ID1','YOUR-DOC-ID2'])

Fetch Docs

Collection.fetch(
    ids: Union[str, List[str]],
    partition: Optional[str] = None,
    async_req: bool = False
) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| ids | Union[str, List[str]] | Yes | The id (or list of ids) for the Doc(s) to Fetch. |
| partition | Optional[str] | No | Name of the partition to Fetch from. Default value: None |
| async_req | bool | No | Enable async request or not. Default value: False |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
fetch_docs = collection.fetch(['YOUR-DOC-ID1', 'YOUR-DOC-ID2'])
if fetch_docs:
    for doc_id in fetch_docs:
        doc = fetch_docs[doc_id]
        print(doc.id)
        print(doc.vector)
        print(doc.fields)

Create Collection Partition

Collection.create_partition(name: str, timeout: Optional[int] = None) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | The name of the Partition to Create. |
| timeout | Optional[int] | No | Timeout period (in seconds); -1 means the Partition is created asynchronously. Default value: None |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.create_partition('YOUR-PARTITION-NAME')
assert rsp

Delete Collection Partition

Collection.delete_partition(name: str) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | The name of the Partition to Delete. |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
rsp = collection.delete_partition('YOUR-PARTITION-NAME')
assert rsp

List Collection Partitions

Collection.list_partitions() -> DashVectorResponse

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')
partitions = collection.list_partitions()

assert partitions
for pt in partitions:
    print(pt)

Describe Collection Partition

Collection.describe_partition(name: str) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | The name of the Partition to Describe. |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')

rsp = collection.describe_partition('shoes')
print(rsp)
# example output:
# {"request_id":"296267a7-68e2-483a-87e6-5992d85a5806","code":0,"message":"","output":"SERVING"}

Statistics for Collection Partition

Collection.stats_partition(name: str) -> DashVectorResponse

| Parameters | Type | Required | Description |
| --- | --- | --- | --- |
| name | str | Yes | The name of the Partition to get statistics for. |

Example:

import dashvector

client = dashvector.Client(api_key='YOUR-DASHVECTOR-API-KEY')
collection = client.get('YOUR-COLLECTION-NAME')

rsp = collection.stats_partition('shoes')
print(rsp)
# example output:
# {
#     "code":0,
#     "message":"",
#     "request_id":"330a2bcb-e4a7-4fc6-a711-2fe5f8a24e8c",
#     "output":{
#         "total_doc_count":0
#     }
# }

Class

dashvector.Doc

@dataclass(frozen=True)
class Doc(object):
    id: str
    vector: Union[List[int], List[float], numpy.ndarray]
    fields: Optional[Dict[str, Union[str, int, float, bool]]] = None
    score: float = 0.0
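
Because `Doc` is declared with `frozen=True`, assigning to a field after construction raises an error. The stand-in class below (`SketchDoc`, a simplified hypothetical copy so the example runs without the SDK) shows that behavior and how the standard-library `dataclasses.replace` produces a modified copy instead.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class SketchDoc:
    """Simplified stand-in for dashvector.Doc, for illustration only."""
    id: str
    vector: List[float]
    fields: Optional[Dict[str, Union[str, int, float, bool]]] = None
    score: float = 0.0


doc = SketchDoc(id="1", vector=[0.1, 0.2, 0.3, 0.4])

try:
    doc.score = 1.0  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False

# Build a new instance with one field changed; `doc` itself is untouched.
updated = replace(doc, fields={"age": 30})
```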

dashvector.DashVectorResponse

class DashVectorResponse(object):
    code: DashVectorCode
    message: str
    request_id: str
    output: Any
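
The sample outputs in this document show `code: 0` on successful calls. Under that assumption, a caller might centralize error handling as sketched below. `ensure_ok` and `FakeResponse` are hypothetical names; the stand-in class only mimics the four attributes listed above so that the example runs without the SDK.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeResponse:
    """Stand-in with the same attribute names as DashVectorResponse."""
    code: int
    message: str = ""
    request_id: str = ""
    output: Any = None


def ensure_ok(rsp: Any) -> Any:
    """Return rsp.output when rsp.code equals 0, otherwise raise.

    Assumes a code of 0 signals success, as in the sample outputs.
    """
    if rsp.code != 0:
        raise RuntimeError(
            f"request {rsp.request_id} failed with code {rsp.code}: {rsp.message}"
        )
    return rsp.output


# Intended usage (requires the dashvector package and a valid key):
#   stats = ensure_ok(collection.stats())
```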

License

This project is licensed under the Apache License (Version 2.0).
