Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

@ascari/reco

Package Overview
Dependencies
Maintainers
1
Versions
2
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

@ascari/reco

Generic text classifier for Mexican electronic invoices.

  • 1.0.1
  • latest
  • npm
  • Socket score

Version published
Weekly downloads
1
Maintainers
1
Weekly downloads
 
Created
Source

reco

A text recognition engine that classifies concept descriptions found in electronic invoices used in Mexico.

Installation

Command Line Utility

You must install reco globally to use the command line interface.

npm i @ascari/reco -g

Module

npm i @ascari/reco --save

USAGE

Command Line Utility

Create reco.json
reco init

A ./reco.json file will be created with default values. By default it uses a sqlite3 database.

You may edit the configuration now.

Scaffold a new project

Requires a valid ./reco.json file to be present.

reco create

A database will be scaffolded in the current directory.

By default it creates a ./database folder where a sqlite3 database file will be stored.

You may edit the first migration: ./database/migrations/0.js to better accomodate your database structure. Keep in mind that the autogenerated tables and columns are required.


NOTE The following commands can only be called after creating a project.


Add a single invoice

Load xml, parse information and store unique: suppliers, clients & concepts found.

reco xml path/to/invoice.xml

Note Ideally, valid SAT invoices should be fed to reco, however reco does not verify its integrity, this means you can feed non-compliant xml invoices as well, as long as they follow a similar structure:

<?xml version="1.0" encoding="utf-8"?>
<Comprobante fecha="{{INVOICE_DATE}}" sello="{{SELLO_DIGITAL}}">
  <Emisor rfc="{{CLIENT_RFC}}" name="{{CLIENT_NAME}}" />
  <Receptor rfc="{{SUPPLIER_RFC}}" name="{{SUPPLIER_NAME}}" />
  <Conceptos>
    <Concepto descripcion="{{CONCEPT_DESCRIPTION_A}}" />
    <Concepto descripcion="{{CONCEPT_DESCRIPTION_B}}" />
  </Conceptos>
</Comprobante>
Add all invoices in a folder

Load and store all invoice files found in a folder.

reco xmls path/to/invoices

Currenly, reco cannot read folders recursively

Label a concept for training

Will create a label that is used to train a classifier.

reco label "LABEL" "CONCEPT"

You may use the --rfc option to scope a label to a supplier.

reco label "LABEL" "CONCEPT" --rfc XXX0123456X7

Scoping a label improves recognition accuracy. The imporovment comes from weighting higher classifications that belong to a supplier when a recognition test is also scoped to a supplier. The reasoning being that a supplier will generally have their own unique set of concepts for their products and or services, that will more likely match a label scoped to the same supplier.

In other words, a Pizza supplier will tend to better identify concepts with the word "pizza", since its products have the word "pizza" in them, when we are classifying a concept from the Pizza supplier. Otherwise, a Toy supplier with toy pizza games may rank higher.

One more time: When we know a invoice concept comes from a certain supplier, it is better to test it against a classifier that has only been trained on its own invoice concepts and labels from the same supplier.

Add all labels found in a file

Add labels found in a list.

reco labels path/to/labels.lst

Labels are seperated by new lines, where labels and concept are seperated by a (:) colon.

example:

apple:I WANT AN APPLE
orange:ORANGE YOU GLAD?
lemon:EAT SOME LEMON PIE

You may use the -v option to see progress.

You may use the --rfc option to scope labels to a supplier.

You may use the --delim option to specify a different delimeter.

You mau use the --no-delim option to specify that the list is not delimeted, that is, it does not have a label and a concept, instead the label and the concept are the same.

This is usefull for adding a supplier's catalog, when identifying their concepts.

Train classifiers

Train classifiers.

Be patient, it may take a while.

reco train
Test an arbitrary concept

Test recognition by classifying a specified concept. Will return label with the best score.

reco test "CONCEPT"

You may use the --rfc option to scope test to a supplier.

You may use the -v option to return classification information. 10 rows are returned by default, if you specify a number: -v 20, you can specify how many classification rows to return, ordered from best match to least.

Module

const Reco = require('reco');

// Reco configuration
const recoConfig = { .... };

// instanciate
const reco = new Reco(recoConfig);
API

Reco::contructor(recoConfig);

Where recoConfig can have the follwing options:

Note the database property is fed to knex.

{
  database: {
    client: 'sqlite3',
    connection: {
      filename: './database/database.sqlite'
    },
    migrations: {
      tableName: 'migrations',
      directory: './database/migrations',
      stub: './database/stub.migration.js'
    },
    seeds: {
      directory: './database/seeds',
    },
    useNullAsDefault: true,
  },
}

Promise Reco::addLabel(String label, String concept, String [supplierRfc=null]);

Add a label.

Promise Reco::addXmlInvoice(String xml);

Add an invoice.

Promise Reco::train();

Train classifiers.

Promise Reco::test(String input, String [supplierRfc=null]);

Test classifiers

License

See LICENSE in respository.

Keywords

FAQs

Package last updated on 13 Sep 2017

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc