This Python package offers a robust client for seamless interaction with the Document Intelligence API. It enables you to efficiently retrieve, manage, and process document data within your workspace.
To install the package using pip, run the following command:
pip install sema4ai-di-client
After installing the package, you can import it and start using the DocumentIntelligenceClient class.
from sema4ai.di_client import DocumentIntelligenceClient
To retrieve a document work item, set the required environment variables before initializing the client: both DOCUMENT_INTELLIGENCE_SERVICE_URL and AGENTS_EVENTS_SERVICE_URL must be set in your environment before the code runs. When running in Sema4.ai Control Room, the platform handles these for you.
Then create the client with DocumentIntelligenceClient(workspace_id='<your_workspace_id>'). The workspace_id argument is optional; on non-local environments it is deduced from the URL passed.
Here's an example:
from sema4ai.di_client import DocumentIntelligenceClient
# Ensure environment variables are set for the service URLs
# Example:
# export DOCUMENT_INTELLIGENCE_SERVICE_URL='https://api.yourdomain.com'
# export AGENTS_EVENTS_SERVICE_URL='https://agents.yourdomain.com'
# Initialize the client
client = DocumentIntelligenceClient()
# Specify the document ID you want to retrieve
document_id = 'your_document_id'
# Get the document work item
try:
    document_work_item = client.get_document_work_item(document_id)
    if document_work_item:
        print("Document Work Item:")
        print(document_work_item)
    else:
        print("No document work item found for the given document ID.")
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    client.close()  # Make sure to close the client connection
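Because the client depends on the two service URLs, it can help to check for them before constructing it. The following is a minimal sketch using only the standard library; the helper name missing_env_vars is hypothetical and not part of the package.

```python
import os

# Variable names taken from the text above.
REQUIRED_ENV_VARS = (
    "DOCUMENT_INTELLIGENCE_SERVICE_URL",
    "AGENTS_EVENTS_SERVICE_URL",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of sema4ai-di-client.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required service URLs are set.")
```

Running this before creating the client turns a missing variable into an explicit message rather than a later connection error.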
The DocumentIntelligenceClient offers several methods to interact with the Document Intelligence API and manage work items. Below are examples of the available operations:
Get Document Type: Retrieve details about a specific document type.
document_type = client.get_document_type(document_type_name)
Get Document Format: Fetch the format of a document based on its type and class.
document_format = client.get_document_format(document_type_name, document_class_name)
Store Extracted Content: Store extracted content after processing a document.
client.store_extracted_content(extracted_content)
Store Transformed Content: Save content that has been transformed by a process.
client.store_transformed_content(transformed_content)
Store Computed Content: Submit content that has been computed after analysis.
client.store_computed_content(computed_content)
Get Document Content: Retrieve document content in various states, such as raw, extracted, transformed, or computed.
content = client.get_document_content(document_id, content_state)
Remove Document Content: Delete content for a document in a specific state.
client.remove_document_content(document_id, content_state)
Complete Work Item Stage: Mark a work item’s current stage as complete and move to the next stage.
response = client.work_items_complete_stage(
    work_item_id='your_work_item_id',
    status='SUCCESS',  # or 'FAILURE'
    status_reason='optional_reason',  # Optional
    log_details_path='optional_log_path'  # Optional
)
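One way to use the stage-completion call is to wrap a processing step so that the stage is reported as SUCCESS or FAILURE depending on whether the step raised. The sketch below is illustrative only: complete_stage_after and RecordingClient are hypothetical names, and RecordingClient is a stand-in that merely records keyword arguments. It uses only the work_items_complete_stage keyword arguments shown above; the real method's return value is not assumed here.

```python
def complete_stage_after(client, work_item_id, step):
    """Run step() and report SUCCESS or FAILURE for the work item's stage."""
    try:
        step()
    except Exception as exc:
        # Report the failure, using the exception text as the reason.
        return client.work_items_complete_stage(
            work_item_id=work_item_id,
            status="FAILURE",
            status_reason=str(exc),
        )
    return client.work_items_complete_stage(
        work_item_id=work_item_id,
        status="SUCCESS",
    )


class RecordingClient:
    """Hypothetical stand-in, NOT the real client; it only records calls."""

    def __init__(self):
        self.calls = []

    def work_items_complete_stage(self, **kwargs):
        self.calls.append(kwargs)
        return kwargs


def failing_step():
    raise ValueError("extraction failed")


demo = RecordingClient()
complete_stage_after(demo, "your_work_item_id", lambda: None)
complete_stage_after(demo, "your_work_item_id", failing_step)
print(demo.calls)
```

With a real DocumentIntelligenceClient, the same wrapper would be passed the client instance in place of the stand-in.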
The package requires the following dependencies:
urllib3 >= 1.25.3, < 2.1.0
python-dateutil
pydantic >= 2
typing-extensions >= 4.7.1
These should be installed automatically when you install the package via pip.
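To see which versions of these dependencies are actually present in an environment, the standard library's importlib.metadata can be queried. This is a small sketch, assuming Python 3.8 or later; installed_versions is a hypothetical helper, and it only reports versions rather than checking them against the ranges listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed above.
DEPENDENCIES = ("urllib3", "python-dateutil", "pydantic", "typing-extensions")


def installed_versions(names=DEPENDENCIES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```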
Below is an example code snippet for testing against a production environment.
from sema4ai.di_client import DocumentIntelligenceClient
client = DocumentIntelligenceClient()
# Fetch and print the document type
try:
    document_type = client.get_document_type("CounterParty Reconciliation")
    print(document_type)
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    client.close()  # Ensure proper resource cleanup
Below is an example code snippet for testing locally.
from sema4ai.di_client import DocumentIntelligenceClient
import os
# Set the required environment variables within the Python code
os.environ['DOCUMENT_INTELLIGENCE_SERVICE_URL'] = 'http://127.0.0.1:9080'
os.environ['AGENTS_EVENTS_SERVICE_URL'] = 'http://127.0.0.1:9080'
workspace_id = "<your Sema4.ai Control Room Workspace ID>"
client = DocumentIntelligenceClient(workspace_id=workspace_id)
# Fetch and print the document type
try:
    document_type = client.get_document_type("CounterParty Reconciliation")
    print(document_type)
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    client.close()  # Ensure proper resource cleanup
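The try/finally pattern used in these examples can also be written with contextlib.closing from the standard library, which calls close() on the wrapped object when the with-block exits, even if an exception is raised. The sketch below uses a hypothetical StubClient in place of the real client so that it runs without any service; only the close() and get_document_type() method names are taken from the examples above.

```python
from contextlib import closing


class StubClient:
    """Hypothetical stand-in for DocumentIntelligenceClient (illustration only)."""

    def __init__(self):
        self.closed = False

    def get_document_type(self, document_type_name):
        # The real client calls the API; this stub just echoes the name.
        return {"name": document_type_name}

    def close(self):
        self.closed = True


stub = StubClient()
with closing(stub) as client:
    document_type = client.get_document_type("CounterParty Reconciliation")
    print(document_type)

# close() has been called on exit from the with-block.
print(stub.closed)
```

Since the examples above show the real client exposing close(), wrapping it in closing(...) should work the same way, though that has not been verified against the package here.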
FAQs
Sema4AI Document Intelligence API client
We found that sema4ai-di-client demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.