This package contains the LangChain integrations for using DataStax Astra DB.
DataStax Astra DB is a serverless vector-capable database built on Apache Cassandra® and made conveniently available through an easy-to-use JSON API.
Installation of this partner package:
pip install langchain-astradb
See the LangChain docs page and the API reference for more details.
# Vector store
# (my_embedding is assumed to be an existing LangChain embeddings object)
from langchain_astradb import AstraDBVectorStore

my_store = AstraDBVectorStore(
    embedding=my_embedding,
    collection_name="my_store",
    api_endpoint="https://...",
    token="AstraCS:...",
)

# Chat message history
from langchain_astradb import AstraDBChatMessageHistory

message_history = AstraDBChatMessageHistory(
    session_id="test-session",
    api_endpoint="https://...",
    token="AstraCS:...",
)

# LLM cache
from langchain_astradb import AstraDBCache

cache = AstraDBCache(
    api_endpoint="https://...",
    token="AstraCS:...",
)

# Semantic LLM cache
# (my_embedding is assumed to be an existing LangChain embeddings object)
from langchain_astradb import AstraDBSemanticCache

cache = AstraDBSemanticCache(
    embedding=my_embedding,
    api_endpoint="https://...",
    token="AstraCS:...",
)

# Document loader
from langchain_astradb import AstraDBLoader

loader = AstraDBLoader(
    collection_name="my_collection",
    api_endpoint="https://...",
    token="AstraCS:...",
)

# Key-value store
from langchain_astradb import AstraDBStore

store = AstraDBStore(
    collection_name="my_kv_store",
    api_endpoint="https://...",
    token="AstraCS:...",
)

# Byte store
from langchain_astradb import AstraDBByteStore

store = AstraDBByteStore(
    collection_name="my_kv_store",
    api_endpoint="https://...",
    token="AstraCS:...",
)
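Every constructor above takes the same api_endpoint and token pair. As a minimal sketch, assuming you keep those values in environment variables, the helper below collects them once so they are not hard-coded in each call. The function name astra_connection_kwargs and the variable names ASTRA_DB_API_ENDPOINT and ASTRA_DB_APPLICATION_TOKEN are illustrative choices, not something this package requires.

```python
import os

# Hypothetical variable names: use whatever your deployment defines.
ENDPOINT_VAR = "ASTRA_DB_API_ENDPOINT"
TOKEN_VAR = "ASTRA_DB_APPLICATION_TOKEN"


def astra_connection_kwargs(environ=None):
    """Return the api_endpoint/token keyword arguments shared by the classes above."""
    env = os.environ if environ is None else environ
    endpoint = env.get(ENDPOINT_VAR)
    token = env.get(TOKEN_VAR)
    # Fail early with a clear message instead of passing None to a constructor.
    missing = [name for name, value in ((ENDPOINT_VAR, endpoint), (TOKEN_VAR, token)) if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"api_endpoint": endpoint, "token": token}
```

With this in place, a call such as AstraDBStore(collection_name="my_kv_store", **astra_connection_kwargs()) would receive the same two arguments shown in the snippets above.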
When creating an Astra DB object in LangChain, such as an AstraDBVectorStore, you may see a warning similar to the following:
Astra DB collection '...' is detected as having indexing turned on for all fields (either created manually or by older versions of this plugin). This implies stricter limitations on the amount of text each string in a document can store. Consider reindexing anew on a fresh collection to be able to store longer texts.
The warning appears because the requested collection already exists on the database and is configured to index all of its fields for search, possibly implicitly, by default. When the LangChain object tries to create the collection, it attempts instead to enforce an indexing policy tailored to the expected usage. For example, the LangChain vector store indexes the metadata but leaves the textual content out: this both allows very long texts to be stored and avoids indexing fields that will never be used to filter a search (indexing those would also carry a slight performance cost for writes).
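To make the difference concrete, the sketch below models two indexing policies as plain dictionaries and checks which fields each one would index. It is an illustration only: the allow/deny dictionary shape, the field names content and metadata, and the helper is_indexed are assumptions made for this example, not the plugin's internal code.

```python
def _covers(entries, field_path):
    """True if field_path equals one of the entries or is nested under it."""
    return any(field_path == e or field_path.startswith(e + ".") for e in entries)


def is_indexed(field_path, policy=None):
    """Return True if a field would be indexed under an allow/deny policy.

    Hypothetical helper: an empty or missing policy means "index every field".
    """
    if not policy:
        return True
    if "allow" in policy:
        return _covers(policy["allow"], field_path)
    return not _covers(policy.get("deny", []), field_path)


# A collection created with default settings: everything is indexed.
INDEX_EVERYTHING = None
# A policy in the spirit of the vector store's: index metadata, skip the text.
METADATA_ONLY = {"allow": ["metadata"]}

for field in ("content", "metadata.source"):
    print(field, is_indexed(field, INDEX_EVERYTHING), is_indexed(field, METADATA_ONLY))
```

Under the metadata-only policy the text field is not indexed, which is what lets it hold longer strings while metadata stays available for filtering.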
Typically there are two reasons why you may encounter the warning:
- you created the collection by means other than letting the AstraDBVectorStore do it for you: for example, through the Astra UI, or by calling AstraPy's create_collection method of class Database directly;
- you created the collection with an older version of the integration (one predating the langchain-astradb partner package).

Keep in mind that this is a warning and your application will continue running just fine, as long as you don't store very long texts. Should you need to add to a vector store, for example, a Document whose page_content exceeds ~8K in length, you will receive an indexing error from the database.
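If you keep using a fully indexed collection, it can help to spot oversized documents before sending them to the database. The sketch below is one way to do that, under stated assumptions: the threshold of 8000 and the choice to measure UTF-8 bytes are guesses based on the "~8K" figure above, and split_by_length is a hypothetical helper. It works on any object exposing a page_content attribute, so simple stand-ins are used here in place of real Document instances.

```python
from types import SimpleNamespace

# Assumed threshold: the text only says "~8K in length". Check the exact
# limit, and whether it counts bytes or characters, against the database docs.
MAX_INDEXED_TEXT_BYTES = 8000


def split_by_length(documents, limit=MAX_INDEXED_TEXT_BYTES):
    """Partition documents into (fits, too_long) by UTF-8 size of page_content."""
    fits, too_long = [], []
    for doc in documents:
        size = len(doc.page_content.encode("utf-8"))
        (too_long if size > limit else fits).append(doc)
    return fits, too_long


# Stand-ins for Document objects: only the page_content attribute is needed.
docs = [
    SimpleNamespace(page_content="short text"),
    SimpleNamespace(page_content="x" * 10_000),
]
fits, too_long = split_by_length(docs)
print(len(fits), "document(s) fit;", len(too_long), "exceed the assumed limit")
```

Documents in the second list could then be chunked further or stored in a collection created with the narrower indexing policy.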
You have several options:
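- ignore the warning, if you know your application will never need to store very long texts;
- ignore the warning and explicitly instruct the plugin not to create the collection, assuming it already exists: store = AstraDBVectorStore(..., setup_mode=langchain_astradb.utils.astradb.SetupMode.OFF). In this case the collection will be used as-is, no (indexing) questions asked;
- reindex anew on a fresh collection, as the warning message suggests, so that longer texts can be stored.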