langchain-upstage
This package contains the LangChain integrations for Upstage through their APIs.
Installation and Setup
- Install the LangChain partner package
pip install -U langchain-upstage
- Get an Upstage API key from the Upstage Console and set it as an environment variable (UPSTAGE_API_KEY)
Chat Models
This package contains the ChatUpstage class, which is the recommended way to interface with Upstage models.
See a usage example
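As a minimal sketch of the chat interface, the snippet below sends a single question through ChatUpstage. The build_messages and ask helpers and the system prompt are hypothetical names introduced for illustration. The sketch assumes UPSTAGE_API_KEY is already set and that the default model is acceptable; the API call is skipped when the key is missing.

```python
import os


def build_messages(question: str) -> list[tuple[str, str]]:
    """Return a (role, content) message list; the system prompt is illustrative."""
    return [
        ("system", "You are a concise, helpful assistant."),
        ("human", question),
    ]


def ask(question: str) -> str:
    """Send one question to an Upstage chat model and return the reply text."""
    # Imported lazily so build_messages works without the package installed.
    from langchain_upstage import ChatUpstage

    chat = ChatUpstage()  # assumption: the key is read from UPSTAGE_API_KEY
    return chat.invoke(build_messages(question)).content


# Only call the API when a key is configured.
if os.environ.get("UPSTAGE_API_KEY"):
    print(ask("What is LangChain?"))
```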
Embeddings
See a usage example
Use the solar-embedding-1-large model for embeddings. Do not add suffixes such as -query or -passage to the model name. UpstageEmbeddings will automatically add the suffixes based on the method called.
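The naming rule above can be sketched as follows. check_model_name and embed_example are hypothetical helpers, not part of langchain-upstage; the guard simply rejects a model name that already carries a suffix before handing it to UpstageEmbeddings. The sample texts are placeholders, and embed_example assumes UPSTAGE_API_KEY is set when it is called.

```python
RESERVED_SUFFIXES = ("-query", "-passage")


def check_model_name(model: str) -> str:
    """Reject model names that already include a query/passage suffix."""
    if model.endswith(RESERVED_SUFFIXES):
        raise ValueError(
            f"Pass the base model name without a suffix, got {model!r}"
        )
    return model


def embed_example(model: str = "solar-embedding-1-large"):
    """Embed one query and two passages with the same base model name."""
    # Imported lazily so check_model_name works without the package installed.
    from langchain_upstage import UpstageEmbeddings

    embeddings = UpstageEmbeddings(model=check_model_name(model))
    # Per the text above, the suffix is chosen by the method that is called.
    query_vector = embeddings.embed_query("What does Upstage offer?")
    passage_vectors = embeddings.embed_documents(
        ["First placeholder passage.", "Second placeholder passage."]
    )
    return query_vector, passage_vectors
```

Calling embed_example() returns one vector for the query and a list of vectors for the passages.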
Document Parse Loader
See a usage example
The use_ocr option determines whether OCR is used to extract text from documents. If the option is not specified, the default policy of the Upstage Document Parse API service applies. When use_ocr is set to True, text is extracted with OCR; for PDF documents, this means the PDF is converted into images before OCR is performed. When use_ocr is set to False for a PDF document, the text embedded in the PDF is used directly. If the input document is not a PDF (for example, an image), setting use_ocr to False results in an error.
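from langchain_upstage import UpstageDocumentParseLoader

# Path to the document to parse
file_path = "/PATH/TO/YOUR/FILE.image"
# Split the parsed output by page
loader = UpstageDocumentParseLoader(file_path, split="page")
docs = loader.load()

# Print the first few parsed documents
for doc in docs[:3]:
    print(doc)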
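The use_ocr rules described above can be captured in a small helper. This is only a sketch: pick_use_ocr and build_loader are hypothetical names, the PDF check relies on the file extension alone, and build_loader assumes the loader accepts use_ocr as a keyword argument, as the text describes. Check the loader signature in your installed version before relying on it.

```python
from pathlib import Path
from typing import Optional


def pick_use_ocr(file_path: str, prefer_embedded_text: bool = False) -> Optional[bool]:
    """Choose a use_ocr value that avoids the non-PDF error case.

    Returns True for non-PDF inputs, False for PDFs when embedded text
    is preferred, and None to leave the service default in place.
    """
    is_pdf = Path(file_path).suffix.lower() == ".pdf"
    if not is_pdf:
        # Images and other non-PDF inputs fail with use_ocr=False.
        return True
    if prefer_embedded_text:
        return False
    return None


def build_loader(file_path: str, prefer_embedded_text: bool = False):
    """Create a page-split loader, passing use_ocr only when one was chosen."""
    # Imported lazily so pick_use_ocr works without the package installed.
    from langchain_upstage import UpstageDocumentParseLoader

    use_ocr = pick_use_ocr(file_path, prefer_embedded_text)
    # Assumption: use_ocr is accepted as a keyword argument.
    kwargs = {} if use_ocr is None else {"use_ocr": use_ocr}
    return UpstageDocumentParseLoader(file_path, split="page", **kwargs)
```

Leaving use_ocr out for PDFs keeps the service default policy, while forcing True for other inputs sidesteps the error case.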
If you are a Windows user, please ensure that the Visual C++ Redistributable is installed before using the loader.