
sensible-api


Comparing version 0.0.5 to 0.0.6


package.json
{
"name": "sensible-api",
"version": "0.0.5",
"version": "0.0.6",
"description": "Javascript SDK for Sensible, the developer-first platform for extracting structured data from documents so that you can build document-automation features into your SaaS products",

@@ -5,0 +5,0 @@ "keywords": ["IDP","parsing","conversion","openai","processing","csv","excel","convert","json","LLMs","pdf","png","tiff","jpeg","doc","docx","document","text","data","extraction","extract","classification","classify","sensible","openapi","gpt-3","gpt-4","senseml","automation","sdk","query","document-processing"," intelligent-document-processing"," pdf-conversion"," pdf-extraction"," document-extraction"," pdf-crawler","pdf-parser","pdf-extract","pdf-to-data","document-data-extraction","document-automation","sensible-api"],

@@ -0,6 +1,11 @@

# Sensible Node SDK
The open-source Sensible Node SDK offers convenient access to the [Sensible API](https://docs.sensible.so/reference/choosing-an-endpoint). Use the Sensible Node SDK to:
Welcome! Sensible is a developer-first platform for extracting structured data from documents, for example, business forms in PDF format. Use Sensible to build document-automation features into your SaaS products. Sensible is highly configurable: you can get simple data [in minutes](https://docs.sensible.so/docs/getting-started-ai) by leveraging GPT-4 and other large language models (LLMs), or you can tackle complex and idiosyncratic document formatting with Sensible's powerful [layout-based document primitives](https://docs.sensible.so/docs/getting-started).
- [Extract](#extract-document-data): Extract structured data from your custom documents. Configure the extractions for a set of similar documents, or *document type*, in the Sensible app or Sensible API, then you run extractions for documents of the type with this SDK.
![Click to enlarge](https://raw.githubusercontent.com/sensible-hq/sensible-docs/main/readme-sync/assets/v0/images/final/intro_SDK_2.png)
This open-source Sensible SDK offers convenient access to the [Sensible API](https://docs.sensible.so/reference/choosing-an-endpoint). Use this SDK to:
- [Extract](#extract-document-data): Extract structured data from your custom documents. Configure the extractions for a set of similar documents, or *document type*, in the Sensible app or Sensible API, then run extractions for documents of the type with this SDK.
- [Classify](#classify): Classify documents by the types you define, for example, bank statements or tax forms. Use classification to determine which documents to extract prior to calling a Sensible extraction endpoint, or route each document in a system of record.

@@ -10,7 +15,18 @@

For configuration options, see [Node SDK reference](https://docs.sensible.so/docs/sdk-node).
- For extraction and classification response schemas, see [Sensible API](https://docs.sensible.so/reference/choosing-an-endpoint).
- For configuring document extractions, see [SenseML reference](https://docs.sensible.so/docs/senseml-reference-introduction).
## Versions
- The latest version of this SDK is v0.
- The latest version of the Sensible API is v0.
## Node and TypeScript support
- This SDK supports all non-end-of-life Node versions.
- This SDK supports all non-end-of-life TypeScript versions.
## Install
In an environment in which you've installed Node, create a directory for a test project, open a command prompt in the directory, and install the dependencies:
In an environment in which you've installed Node, create a directory for a test project, open a command prompt in the directory, and install the dependencies:

@@ -31,3 +47,3 @@ ```shell

To initialize the dependency, paste the following code into your `index.mjs` file and replace `YOUR_API_KEY` with your [API key](https://app.sensible.so/account/):
To initialize the dependency, paste the following code into your `index.mjs` file and replace `YOUR_API_KEY` with your [API key](https://app.sensible.so/account/).

@@ -40,21 +56,24 @@ ```node

## Extract document data
## Quickstart
#### Option 1: document URL
To extract data from a sample document at a URL:
1. Paste the following code into your `index.mjs` file:
1. Install the Sensible SDK using the steps in the previous section.
2. Paste the following code into an empty `index.mjs` file:
```node
import { SensibleSDK } from "sensible-api"
const sensible = new SensibleSDK(YOUR_API_KEY); //replace with your API key
const request = await sensible.extract({
url: "https://github.com/sensible-hq/sensible-docs/raw/main/readme-sync/assets/v0/pdfs/contract.pdf",
documentType: "sensible_instruct_basics",
environment: "development" // see Node SDK reference for full list of configuration options
environment: "development"
});
const results = await sensible.waitFor(request); // waitFor is optional if you configure a webhook
console.log(results); // see Node SDK reference to convert results from JSON to Excel
const results = await sensible.waitFor(request); // polls every 5 seconds. Optional if you configure a webhook
console.log(results);
```
2. In a command prompt in the same directory as your `index.mjs` file, run the code with the following command:
3. Replace `YOUR_API_KEY` with your [API key](https://app.sensible.so/account/).
4. In a command prompt in the same directory as your `index.mjs` file, run the code with the following command:

@@ -65,31 +84,8 @@ ```shell

The code extracts data from an example document (`contract.pdf`) using an example document type (`sensible_instruct_basics`) and an example extraction configuration.
The code extracts data from an example document (`contract.pdf`) using an example document type (`sensible_instruct_basics`) and an example extraction configuration.
#### Option 2: local file
#### Results
To extract from a local file:
You should see the following extracted document text in the `parsed_document` object in the logged response:
1. Download the following example file and save it in the same directory as your `index.mjs` file:
| Example document | [Download link](https://github.com/sensible-hq/sensible-docs/raw/main/readme-sync/assets/v0/pdfs/contract.pdf) |
| ---------------- | ------------------------------------------------------------ |
2. Paste the following code into your `index.mjs` file, then run it according to the steps in the previous option:
```node
const request = await sensible.extract({
path: ("./contract.pdf"),
documentType: "sensible_instruct_basics",
});
const results = await sensible.waitFor(request); // waitFor is optional if you configure a webhook
console.log(results); // see Node SDK reference to convert results from JSON to Excel
```
This code uploads your local file to a Sensible-hosted URL and extracts data from an example document (`contract.pdf`) using an example document type (`sensible_instruct_basics`) and an example extraction configuration.
#### Check results
The following excerpt of the results shows the extracted document text in the `parsed_document` object:
```json

@@ -110,92 +106,116 @@ {

For more information about the response body schema, see [Extract data from a document](https://docs.sensible.so/reference/extract-data-from-a-document) and expand the 200 responses in the middle pane and the right pane to see the model and an example, respectively.
#### Optional: Understand extraction
#### Optional: understand extraction
Navigate to https://app.sensible.so/editor/instruct/?d=sensible_instruct_basics&c=contract&g=contract to see how the extraction you just ran works in the Sensible app. You can add more fields to the left pane to extract more data:
Navigate to https://app.sensible.so/editor/instruct/?d=sensible_instruct_basics&c=contract&g=contract to see how the extraction you just ran works in the Sensible app. You can add more fields to the extraction configuration to extract more data:
![Click to enlarge](https://raw.githubusercontent.com/sensible-hq/sensible-docs/main/readme-sync/assets/v0/images/final/sdk_node_1.png)
#### Complete code example
## Usage: Extract document data
See the following code for a complete example of how to use the SDK for document extraction in your own app.
You can use this SDK to extract data from a document, as specified by the extraction configurations and document types defined in your Sensible account.
```node
import { SensibleSDK } from "sensible-api"
### Overview
const sensible = new SensibleSDK(YOUR_API_KEY);
const request = await sensible.extract({
path: ("./contract.pdf"),
documentType: "sensible_instruct_basics",
environment: "development" // see Node SDK reference for configuration options
});
const results = await sensible.waitFor(request); // waitFor is optional if you configure a webhook
console.log(results); // see Node SDK reference to convert results from JSON to Excel
```
See the following steps for an overview of the SDK's workflow for document data extraction. Every method returns a chainable promise:
## Classify
1. Instantiate an SDK object with `new SensibleSDK()`.
2. Request a document extraction with `sensible.extract()`. Use the following required parameters:
1. **(required)** Specify the document from which to extract data using the `url`, `path`, or `file` parameter.
2. **(required)** Specify the user-defined document type or types using the `documentType` or `documentTypes` parameter.
3. Wait for the result. Use `sensible.waitFor()`, or use a webhook.
4. Optionally convert extractions to an Excel file with `generateExcel()`.
5. Consume the data.
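The wait step in the workflow above can be pictured as a polling loop. The following is a minimal sketch under stated assumptions, not the SDK's implementation of `waitFor()`: `pollUntilDone`, `checkStatus`, and the `done` flag are hypothetical names, and the 5-second default only mirrors the polling interval mentioned in the quickstart comment.

```javascript
// Hypothetical helper: repeatedly calls `checkStatus` until it reports completion.
// `checkStatus` is an assumed async function that resolves to an object with a
// boolean `done` flag; the real SDK's status shape may differ.
async function pollUntilDone(checkStatus, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.done) return status;
    if (attempt < maxAttempts) {
      // Pause between attempts so the service is not queried in a tight loop.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`No result after ${maxAttempts} attempts`);
}

// Usage with a stand-in status function that completes on the third call.
let calls = 0;
const fakeStatus = async () => ({ done: ++calls >= 3, calls });
pollUntilDone(fakeStatus, { intervalMs: 10 }).then((status) => console.log(status));
```

Capping the number of attempts keeps a stuck request from blocking forever. A webhook avoids polling altogether.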
You can classify a document by its similarity to each document type you define in your Sensible account. For example, if you define a [bank statements](https://github.com/sensible-hq/sensible-configuration-library/tree/main/bank_statements) type and a [tax_forms](https://github.com/sensible-hq/sensible-configuration-library/tree/main/tax_forms) type in your account, you can classify 1040 forms, 1099 forms, Bank of America statements, Chase statements, and other documents, into those two types.
### Extraction configuration
See the following code example for classifying a document.
You can configure options for document data extraction:
```node
const request = await sensible.classify({path: "./boa_sample.pdf"});
const results = await sensible.waitFor(request);
const request = await sensible.extract({
path: ("./1040_john_doe.pdf"),
documentType: "tax_forms",
webhook: {
url:"YOUR_WEBHOOK_URL",
payload: "additional info, for example, a UUID for verification",
}});
```
To classify an example document, take the following steps:
See the following table for information about configuration options:
1. Follow the steps in [Out-of-the-box extractions](https://docs.sensible.so/reference/choosing-an-endpoint/library-quickstart) to add support for bank statements to your account.
| key | value | description |
| ----------------- | ---------------------------------------------------------- | ------------------------------------------------------------ |
| path | string | An option for submitting the document you want to extract data from.<br/> Pass the path to the document. For more information about supported file types, see [Supported file types](https://docs.sensible.so/docs/file-types). |
| file | string | An option for submitting the document you want to extract data from.<br/> Pass the non-encoded document bytes. |
| url | string | An option for submitting the document you want to extract data from.<br/>URL that responds to a GET request with the bytes of the document you want to extract data from. This URL must be either publicly accessible, or presigned with a security token as part of the URL path. To check if the URL meets these criteria, open the URL with a web browser. The browser must either render the document as a full-page view with no other data, or download the document, without prompting for authentication. |
| documentType | string | An option for specifying the document type or types.<br/>Type of document to extract from. Create your custom type in the Sensible app (for example, `rate_confirmation`, `certificate_of_insurance`, or `home_inspection_report`). |
| documentTypes | array | An option for specifying the document type or types.<br/>Types of documents to extract from. Use this parameter to extract from multiple documents that are packaged into one file (a "portfolio"). This parameter specifies the document types contained in the portfolio. Sensible then segments the portfolio into documents using the specified document types (for example, 1099, w2, and bank_statement) and then runs extractions for each document. For more information, see [Multi-doc extraction](https://docs.sensible.so/docs/portfolio). |
| configurationName | string | If specified, Sensible uses the specified config to extract data from the document instead of automatically choosing the best-scoring extraction in the document type.<br/>If unspecified, Sensible automatically detects the best-fit extraction from among the extraction queries ("configs") in the document type.<br/>Not applicable for portfolios. |
| documentName | string | If you specify the filename of the document using this parameter, then Sensible returns the filename in the extraction response and populates the file name in the Sensible app's list of recent extractions. |
| environment       | `"production"` or `"development"`. default: `"production"` | If you specify `development`, Sensible extracts preferentially using config versions published to the development environment in the Sensible app. The extraction runs all configs in the doc type before picking the best fit. For each config, Sensible falls back to the production version if no development version of the config exists. |
| webhook | object | Specifies to return extraction results to the specified webhook URL as soon as they're complete, so you don't have to poll for results status. Sensible also calls this webhook on error.<br/> The webhook object has the following parameters:<br/>`url`: string. Webhook destination. Sensible will POST to this URL when the extraction is complete.<br/>`payload`: string, number, boolean, object, or array. Information additional to the API response, for example a UUID for verification. |
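The options in the table above can be checked before you send a request. The following is a minimal sketch, assuming that exactly one document source and exactly one of `documentType` or `documentTypes` is expected per request. `checkExtractOptions` is a hypothetical helper, not part of the SDK, and the SDK or API may apply its own, different validation.

```javascript
// Keys and allowed values are taken from the configuration table above.
const DOCUMENT_SOURCES = ["path", "file", "url"];
const ENVIRONMENTS = ["production", "development"];

// Hypothetical pre-flight check; returns a list of human-readable problems.
function checkExtractOptions(options) {
  const problems = [];

  // Assumption: a request carries exactly one document source.
  const sources = DOCUMENT_SOURCES.filter((key) => options[key] !== undefined);
  if (sources.length !== 1) {
    problems.push(`expected exactly one of ${DOCUMENT_SOURCES.join(", ")}; got ${sources.length}`);
  }

  // Assumption: documentType and documentTypes are mutually exclusive.
  const hasType = options.documentType !== undefined;
  const hasTypes = Array.isArray(options.documentTypes);
  if (hasType === hasTypes) {
    problems.push("expected either documentType or documentTypes");
  }

  // The table notes that configurationName is not applicable for portfolios.
  if (hasTypes && options.configurationName !== undefined) {
    problems.push("configurationName is not applicable for portfolios");
  }

  if (options.environment !== undefined && !ENVIRONMENTS.includes(options.environment)) {
    problems.push(`environment must be one of ${ENVIRONMENTS.join(", ")}`);
  }

  if (options.webhook !== undefined && typeof options.webhook.url !== "string") {
    problems.push("webhook.url must be a string");
  }

  return problems;
}

// An empty list means the options passed every check in this sketch.
console.log(checkExtractOptions({
  path: "./contract.pdf",
  documentType: "sensible_instruct_basics",
  environment: "development",
}));
```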
2. Follow the steps in the preceding sections to install and initialize the SDK.
### Extraction results
3. Download the following example file and save it in the same directory as your `index.mjs` file:
Get extraction results by using a webhook or calling the Wait For method.
| Example document | [Download link](https://github.com/sensible-hq/sensible-configuration-library/raw/main/bank_statements/bank_of_america/boa_sample.pdf) |
| ---------------- | ------------------------------------------------------------ |
For the schema for the results of an extraction request, see [Extract data from a document](https://docs.sensible.so/reference/extract-data-from-a-document) and expand the 200 responses in the middle pane and the right pane to see the model and an example, respectively.
4. Paste the preceding code into your `index.mjs` file. Ensure you replaced `YOUR_API_KEY` with your [API key](https://app.sensible.so/account/) and `YOUR_DOCUMENT.pdf` with `boa_sample.pdf`. See the following code example to check your code completeness.
### Example: Extract from PDFs in directory and output an Excel file
5. In a command prompt in the same directory as your `index.mjs` file, run the code with the following command:
See the following code for a complete example of how to use the SDK for document extraction in your own app.
```shell
node index.mjs
The example:
1. Filters a directory to find the PDF files.
2. Extracts data from the PDF files using the extraction configurations in a `bank_statements` document type.
3. Writes the extractions to an Excel file. The Generate Excel method takes an extraction or an array of extractions, and outputs an Excel file. For more information about the conversion process, see [SenseML to spreadsheet reference](https://docs.sensible.so/docs/excel-reference).
```node
import { promises as fs } from "fs";
import { SensibleSDK } from "sensible-api";
import got from "got";
const apiKey = process.env.SENSIBLE_APIKEY;
const sensible = new SensibleSDK(apiKey);
const dir = process.argv[2];
const files = (await fs.readdir(dir)).filter((file) => file.match(/\.pdf$/));
const extractions = await Promise.all(
files.map(async (filename) => {
const path = `${dir}/${filename}`;
return sensible.extract({
path,
documentType: "bank_statements",
});
})
);
await Promise.all(
extractions.map((extraction) => sensible.waitFor(extraction))
);
const excel_download = await sensible.generateExcel(extractions);
console.log(excel_download);
const excelFile = await got(excel_download.url);
await fs.writeFile(`${dir}/output.xlsx`, excelFile.rawBody);
```
#### Check results
## Usage: Classify documents by type
The following excerpt of the results shows the extracted document text in the `TO_DO` object:
You can use this SDK to classify a document by type, as specified by the document types defined in your Sensible account. For more information, see [Classifying documents by type](https://docs.sensible.so/docs/classify).
```json
{
"document_type": {
"id": "22666f4f-b8d6-4cb5-ad52-d00996989729",
"name": "bank_statements",
"score": 0.8922476745112722
},
"reference_documents": [
{
"id": "c82ac28e-7725-4e42-b77c-e74551684caa",
"name": "boa_sample",
"score": 0.9999980536061833
},
{
"id": "f80424a0-58f8-40e7-814a-eb49b199221e",
"name": "wells_fargo_checking_sample",
"score": 0.8946129923339182
},
{
"id": "cf17daf8-7e8b-4b44-bc4b-7cdd6518d963",
"name": "chase_consolidated_balance_summary_sample",
"score": 0.8677569417649393
}
]
}
```
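To consume a classification response like the excerpt above, you can read the best-fit document type and the closest reference document. The following is a minimal sketch: `summarizeClassification` and the `minScore` threshold of 0.5 are hypothetical, and the field names are assumed to match the excerpt shown above.

```javascript
// Classification response copied from the excerpt above.
const results = {
  document_type: {
    id: "22666f4f-b8d6-4cb5-ad52-d00996989729",
    name: "bank_statements",
    score: 0.8922476745112722,
  },
  reference_documents: [
    { id: "c82ac28e-7725-4e42-b77c-e74551684caa", name: "boa_sample", score: 0.9999980536061833 },
    { id: "f80424a0-58f8-40e7-814a-eb49b199221e", name: "wells_fargo_checking_sample", score: 0.8946129923339182 },
    { id: "cf17daf8-7e8b-4b44-bc4b-7cdd6518d963", name: "chase_consolidated_balance_summary_sample", score: 0.8677569417649393 },
  ],
};

// Hypothetical helper: `minScore` is an illustrative cutoff, not a Sensible default.
function summarizeClassification(response, minScore = 0.5) {
  const { document_type: docType, reference_documents: refs = [] } = response;
  // Copy before sorting so the original response is left unchanged.
  const closest = [...refs].sort((a, b) => b.score - a.score)[0];
  return {
    type: docType.name,
    confident: docType.score >= minScore,
    closestReference: closest ? closest.name : null,
  };
}

console.log(summarizeClassification(results));
```

A summary like this can drive routing, for example sending only confidently classified documents on to an extraction request.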
### Overview
#### Complete code example
See the following steps for an overview of the SDK's workflow for document classification. Every method returns a chainable promise:
Here's a complete example of how to use the SDK for document classification in your own app:
1. Instantiate an SDK object (`new SensibleSDK()`).
2. Request a document classification (`sensible.classify()`). Specify the document to classify using the `path` or `file` parameter.
3. Poll for the result (`sensible.waitFor()`).
4. Consume the data.
### Classification configuration
You can configure options for document classification:
```node

@@ -205,3 +225,5 @@ import { SensibleSDK } from "sensible-api"

const sensible = new SensibleSDK(YOUR_API_KEY);
const request = await sensible.classify({path:"./boa_sample.pdf"});
const request = await sensible.classify({
path:"./boa_sample.pdf"
});
const results = await sensible.waitFor(request);

@@ -211,6 +233,14 @@ console.log(results);

See the following table for information about configuration options:
| key | value | description |
| ---- | ------ | ------------------------------------------------------------ |
| path | string | An option for submitting the document you want to classify. Pass the path to the document. For more information about supported file types, see [Supported file types](https://docs.sensible.so/docs/file-types). |
| file | string | Pass the non-encoded document bytes. For information about supported file types, see [Supported file types](https://docs.sensible.so/docs/file-types). |
### Classification results
Get results from this method by calling the Wait For method. For the schema for the results of a classification request, see [Classify document by type (sync)](https://docs.sensible.so/reference/classify-document-sync) and expand the 200 responses in the middle pane and the right pane to see the model and an example, respectively.