pdf-aggregator

Aggregate account PDF statements into JSON and visualize aggregated financial data as timeline

Version: 0.0.1 (PyPI)
Maintainers: 1

PDF aggregator

Aggregate account PDF statements into JSON and visualize aggregated financial data as timeline.

Works offline, using tika for PDF parsing and matplotlib for plotting. Regular expressions stored in simple configuration files extract each bank statement's balance, date, account number...
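As a rough illustration of regex-driven extraction, the sketch below maps field names to regular expressions and applies them to statement text. The config layout, the field names, the sample text and the extract_fields helper are all assumptions made for this example; the actual configuration file format used by pdf-aggregator may differ.

```python
import re

# Hypothetical configuration: field name -> regular expression.
# This is NOT the real pdf-aggregator config format, only an illustration.
EXAMPLE_CONFIG = {
    "account": r"Account number:\s*(\d+)",
    "balance": r"Ending balance on \d+/\d+/\d+\s+\$([\d,]+\.\d{2})",
}


def extract_fields(text, config):
    """Return the captured groups of the first match for each configured field."""
    result = {}
    for field, pattern in config.items():
        match = re.search(pattern, text)
        # None marks a field whose pattern did not match the text.
        result[field] = match.groups() if match else None
    return result


# Made-up statement text standing in for the output of the PDF parser.
SAMPLE_TEXT = "Account number: 12345678\nEnding balance on 3/31/2024 $1,234.56\n"

fields = extract_fields(SAMPLE_TEXT, EXAMPLE_CONFIG)
print(fields)
```

Keeping the patterns in data rather than in code is what allows a new bank format to be supported by adding a configuration file instead of changing the script.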

Installation

pip install -r requirements.txt

Usage

Aggregate

Scan PDF files and aggregate financial data into an accounts.json summary file:

python aggregate.py path/to/folder/with/PDF

or

python aggregate.py path/to/file.pdf

Use --help for more options.
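To give an idea of what aggregation into a summary file could look like, here is a minimal sketch that groups per-statement records by account and serializes them as JSON. The record fields, the aggregate helper and the resulting layout are hypothetical; the schema of the accounts.json file written by aggregate.py is not documented here and may be different.

```python
import json
from collections import defaultdict

# Hypothetical per-statement records, as an extraction step might produce them.
statements = [
    {"account": "12345678", "date": "2024-02-29", "balance": 1100.00},
    {"account": "12345678", "date": "2024-01-31", "balance": 1000.00},
    {"account": "87654321", "date": "2024-01-31", "balance": 250.50},
]


def aggregate(records):
    """Group statement records by account, sorted by ISO date within each account."""
    by_account = defaultdict(list)
    for rec in records:
        by_account[rec["account"]].append(
            {"date": rec["date"], "balance": rec["balance"]}
        )
    # ISO-8601 date strings sort chronologically when compared as text.
    return {
        account: sorted(entries, key=lambda entry: entry["date"])
        for account, entries in by_account.items()
    }


summary = aggregate(statements)
accounts_json = json.dumps(summary, indent=2, sort_keys=True)
print(accounts_json)
```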

Add a new config

python aggregate.py path/to/PDF/file --test

This should print the text content of the PDF. Then test a regular expression against it:

python aggregate.py path/to/PDF/file --test 'Ending balance on (\d+)/(\d+)/(\d+)'
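The three capture groups in that expression can also be checked in a plain Python session before running the tool. The sample line below is made up, and reading the groups as month/day/year is an assumption about the statement's date format, not something the tool guarantees.

```python
import re
from datetime import date

PATTERN = r"Ending balance on (\d+)/(\d+)/(\d+)"

# Made-up line of statement text for illustration.
sample = "Ending balance on 12/31/2023   $4,321.00"

match = re.search(PATTERN, sample)
if match is None:
    raise ValueError("pattern did not match the sample text")

# Assumption: the statement uses US-style month/day/year ordering.
month, day, year = (int(group) for group in match.groups())
statement_date = date(year, month, day)
print(statement_date.isoformat())
```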

You can then create a configuration file and test detection with -vvv:

python aggregate.py path/to/PDF/file -vvv

Plot

Plot aggregated data:

python plot.py path/to/folder/with/multiple/accounts.json

or

python plot.py path/to/accounts.json

Use --help for more options.

Example:

python.exe .\plot.py .\accounts\ --subtotals --no_real_estate_appreciation

Keywords

pdf aggregate extract banking financial statement
