pdfforms

Populate fillable pdf forms from csv data file

.. home-start

pdfforms is a small utility for populating fillable pdf forms from a spreadsheet data source. It was created with the intent of filling US tax forms using tax data prepared with a spreadsheet, but should be equally applicable to other forms.

Features

  • Assigns numeric id for each field
  • Generates test pdf showing ids of text fields
  • Merges spreadsheet data into final filled pdf
  • Works with multiple spreadsheet formats
  • Can process multiple pdfs at a time
  • Can be used as a library or CLI
  • Optional rounding and number formatting
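As an illustration of what rounding and number formatting mean for a value headed into a form field, here is a minimal sketch. It is not pdfforms's own code: the function name ``format_amount`` and its parameters are hypothetical, and the rounding mode pdfforms actually uses is not stated here, so half-up rounding is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_amount(raw, places=0, thousands=True):
    """Round a numeric string half-up and optionally add thousands separators.

    Hypothetical helper for illustration only; not part of pdfforms.
    """
    # Strip any separators already present in the spreadsheet value.
    value = Decimal(str(raw).replace(",", ""))
    # places=0 -> quantum 1, places=2 -> quantum 0.01
    quantum = Decimal(1).scaleb(-places)
    rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)
    return f"{rounded:,}" if thousands else f"{rounded}"


print(format_amount("13852.5"))               # → 13,853
print(format_amount("2119.4325", places=2))   # → 2,119.43
print(format_amount("60,000", thousands=False))  # → 60000
```

The sketch uses ``Decimal`` rather than the built-in ``round`` because ``round(13852.5)`` returns 13852 in Python 3 (ties go to the even neighbour), which is rarely what a tax form expects.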

Requirements

pdfforms requires Python 3.5 or higher, pyexcel_ for data loading, and pdftk_, which does all the real work.

.. _pyexcel: https://pyexcel.readthedocs.io/en/stable/index.html
.. _pdftk: https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
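Because pdftk is an external program rather than a Python package, ``pip`` cannot install it for you. The following is a minimal sketch of a pre-flight check. The function name ``missing_requirements`` is hypothetical, and the check assumes the executable is named ``pdftk`` and is on your ``PATH``.

```python
import shutil
import sys


def missing_requirements(min_python=(3, 5), tool="pdftk"):
    """Return a list of human-readable problems; empty when all is well.

    Hypothetical helper for illustration only; not part of pdfforms.
    """
    problems = []
    if sys.version_info < min_python:
        problems.append(
            "Python {}.{} or higher is required".format(*min_python)
        )
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(tool) is None:
        problems.append("{} was not found on PATH".format(tool))
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

An empty result only means the prerequisites listed above were found. It does not verify that pyexcel or pdfforms itself is installed.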

Installation

To install::

    pip install pdfforms

.. home-end

Documentation

For complete documentation, see https://pdfforms.readthedocs.io/

Example

.. cli-example-start

Let's say you have a spreadsheet with your tax calculations. You want to populate your tax forms with the data from the spreadsheet. pdfforms allows you to do so with the following steps:

#. First, pdfforms must inspect the forms to be filled. It extracts a list of the fields in each of the specified documents, assigns each field a numeric id, and generates test documents with the forms filled in to show the id of each text field::

    $ pdfforms inspect f1040*.pdf
    f1040sse.pdf
    f1040sce.pdf
    f1040.pdf

The filled test pdfs are stored by default in the test/ subdirectory.

#. Browse the test pdf files and add the numbers of the fields you need to fill to your spreadsheet. pdfforms reads only the first and third columns of the datafile. The first column holds either the name of a pdf file to fill or the number of a field in that file; as in the example below, field rows follow the row naming their file. The third column holds the data to be written into the field. The rest of the sheet is ignored, so you can use it for notes, calculations, etc.

pdfforms is case sensitive! The file name in the spreadsheet must exactly match the name of the pdf to be filled.

Below is an example spreadsheet for a (fictional) 2016 tax return.

.. csv-table::

    f1040.pdf,Form 1040,,2016,
    3,First Name and initial,John Q,,
    4,Last Name,Public,,
    5,SSN,321546789,321-54-6789,
    6,Spouse's Name,Susie,,
    7,Spouse's Last Name,Public,,
    8,SSN,132458697,132-45-8697,
    9,Address,5776 Winding Ln,,
    11,,"Springfield, MA",,
    18,Filing status,MJ,,
    24,Exemption - self,1,,
    25,Exemption - spouse,1,,
    27,Dependent name,Timothy Public,,
    28,Dependent ssn,531248680,531-24-8680,
    29,Dependent relationship,Son,,
    30,Dependent under 17,1,,
    31,Dependent name,Abigail Public,,
    32,Dependent ssn,428775031,428-77-5031,
    33,Dependent relationship,Daughter,,
    34,Dependent under 17,1,,
    45,Line 6a,2,,
    46,Line 6c,2,,
    49,Line 6d,4,,
    50,Line 7,"60,000",salaries,
    52,Line 8a,124,taxable interest,
    64,Line 12,"15,000",business income,
    92,Line 22,"75,124",total income,
    102,Line 27,"1,060",half SE tax,
    121,Line 36,"1,060",,
    123,Line 37,"74,064",Adjusted Gross Income,
    125,Line 38,"74,064",,
    133,Line 40,"12,600",Standard Deduction,
    135,Line 41,"61,464",,
    137,Line 42,"16,200",Exemptions,"$ 4,050"
    139,Line 43,"45,264",Taxable income,
    145,Line 44,"4,528",Tax,
    151,Line 47,"4,528",,
    161,Line 52,"2,000",Child Tax Credit,
    171,Line 55,"2,000",Total Credits,
    173,Line 56,"2,528",,
    175,Line 57,"2,119",Self-employment tax,
    196,Line 63,"4,647",Total Tax,
    198,Line 64,"8,688",Tax withheld,
    225,Line 74,"8,688",Total Payments,
    227,Line 75,"4,041",Amount you overpaid,
    230,Line 76a,"4,041",Amount you want refunded,
    232,Line 76b,123654789,Routing Number,
    234,Line 76c,Savings,Account Type,
    235,Line 76d,135724,Account Number,
    247,Occupation,Salesman,,
    248,Daytime phone number,413-555-1212,,
    249,Spouse's Occupation,Artist,,
    ,,,,
    f1040sce.pdf,Schedule C-EZ,,,
    0,Name,Susie Public,,
    1,SSN,132-45-8697,,
    9,Line F,2,No,
    2,Line A,Artist,Principal business or profession,
    3,Line B,711510,Business Code,
    13,Line 1,"22,000",gross receipts,
    15,Line 2,"7,000",total expenses,
    17,Line 3,"15,000",net profit,
    ,,,,
    f1040sse.pdf,Form SE - Section A Short Schedule SE,,,
    0,Name,Susie Public,,
    1,SSN,132-45-8697,,
    6,Line 2,"15,000",,
    8,Line 3,"15,000",92.35%,
    10,Line 4,"13,853",15.30%,
    12,Line 5,"2,119",50.00%,
    14,Line 6,"1,060",,

The test pdfs do not show field numbers for checkboxes. Currently the only way to fill checkboxes is to examine the fields.json file and find the field number and allowed values of the checkbox.

#. Once the file name and field numbers have been added to your spreadsheet, save the spreadsheet as a csv file and fill the forms::

    $ pdfforms fill mydata.csv
    f1040sse.pdf
    f1040sce.pdf
    f1040.pdf

The final, populated pdf files are saved by default to the filled/ subdirectory.
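The datafile layout described in step 2 can be sketched in a few lines of Python. This is not how pdfforms itself loads data (it uses pyexcel). The function name ``read_datafile`` is hypothetical, and skipping rows with an empty third column is an assumption made for the sketch.

```python
import csv
import io


def read_datafile(lines):
    """Group field values by pdf file name.

    Column 1 holds either a pdf file name or a field number, column 3
    holds the value, and every other column is ignored.
    Hypothetical helper for illustration only; not part of pdfforms.
    """
    forms = {}
    current = None
    for row in csv.reader(lines):
        row = row + [""] * (3 - len(row))  # pad short or blank rows
        key, value = row[0].strip(), row[2].strip()
        if key.lower().endswith(".pdf"):
            # A file name row starts a new block of field rows.
            current = forms.setdefault(key, {})
        elif key.isdigit() and current is not None and value:
            # Assumption: rows without a value are skipped.
            current[int(key)] = value
    return forms


sample = io.StringIO(
    'f1040.pdf,Form 1040,,2016,\n'
    '3,First Name and initial,John Q,,\n'
    '50,Line 7,"60,000",salaries,\n'
    ',,,,\n'
    'f1040sce.pdf,Schedule C-EZ,,,\n'
    '0,Name,Susie Public,,\n'
)
forms = read_datafile(sample)
print(forms)
# → {'f1040.pdf': {3: 'John Q', 50: '60,000'}, 'f1040sce.pdf': {0: 'Susie Public'}}
```

Running a check like this over your own csv export before filling is a quick way to confirm that notes in the second, fourth, and later columns are ignored and that each field number sits under the intended file name.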

.. cli-example-end

Changelog

2.0.0
"""""
:date: 15 Aug, 2021

  • Use pyexcel to load spreadsheet data, supports xlsx, ods, csv, and more
  • Add options to round values, add thousands separators
  • Split codebase up and publish an API
  • Make .pdf suffix recognition case-insensitive
  • Better handling of invalid input
  • Expanded documentation
  • General code clean-up, refactoring, linting, and reformatting

1.2.1
"""""
:date: 3 July, 2020

  • Don’t crash when subcommand not supplied (Thanks @PiDelport for the PR)

1.2.0
"""""
:date: 24 September, 2019

  • Added --no-flatten option to keep form fillable
  • inspect doesn’t crash if passed a pdf without a fillable form

1.1.0
"""""
:date: 4 July, 2018

  • Fixed handling of whitespace (Thanks @rohitkhirapate for the bug report)
  • Added python 3.4 compatibility (Thanks @oneyb for the PR)

1.0.0
"""""
:date: 1 May, 2017

  • Initial release
