@itaylor/pdf2square

CLI and Library: Convert PDF pages to exactly NxN images (letterboxed), using pdfjs + Sharp

Version: 0.0.4

Version published: 5 months ago

Maintainers: 1

Created: 5 months ago

pdf2square

Convert PDF pages to exactly square (NxN) images with letterboxing, plus extract text from each page. Available as both a CLI tool and a JavaScript library.

Features

📄 Convert PDF pages to square images (letterboxed to maintain aspect ratio)
🖼️ Support for PNG and JPEG output formats
📝 Extract text from each page
🎯 High DPI rendering for crisp text (configurable)
⚡ Concurrent processing for better performance
🎨 Customizable background colors (including transparency for PNG)
📚 Available as both CLI tool and programmatic library
🔤 Base64 encoded output for library usage
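
Letterboxing means scaling the page until its longer side fills the square, then padding the shorter side evenly. The sketch below illustrates that geometry only; `letterbox` is a hypothetical helper, not part of this package, and the package's own rounding may differ.

```javascript
// Hypothetical helper: compute where a width x height page lands inside
// a size x size square when it is scaled to fit and centered.
function letterbox(width, height, size) {
  // Scale so the longer side exactly fills the square.
  const scale = size / Math.max(width, height);
  const scaledWidth = Math.round(width * scale);
  const scaledHeight = Math.round(height * scale);
  // Split the leftover space evenly; an odd remainder goes right/bottom.
  const left = Math.floor((size - scaledWidth) / 2);
  const top = Math.floor((size - scaledHeight) / 2);
  return { scaledWidth, scaledHeight, left, top };
}

// US Letter portrait (612 x 792 pt) into the default 896 px square.
console.log(letterbox(612, 792, 896));
```

For a portrait page the padding appears on the left and right; for a landscape page it appears on the top and bottom.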

Requirements

No system dependencies are required. The package uses PDF.js for PDF processing and Sharp for image manipulation.

Installation

npm install @itaylor/pdf2square

Or for global CLI usage:

npm install -g @itaylor/pdf2square

CLI Usage

Basic Usage

# Convert PDF to 896x896 PNG images + text files  
pdf2square input.pdf

# Specify output prefix
pdf2square input.pdf output/page

# Convert with custom options
pdf2square input.pdf --size 512 --dpi 300 --format jpg

CLI Options

| Option | Description | Default |
| --- | --- | --- |
| `-n, --max-pages <int>` | Maximum pages to convert | 10 |
| `-s, --size <int>` | Target square size in pixels | 896 |
| `--dpi <int>` | Render DPI (higher = crisper text) | 700 |
| `--first <int>` | First page to convert (1-based) | 1 |
| `--format <fmt>` | Output format: png or jpg | png |
| `--bg <color>` | Background color (#RRGGBB[AA] or 'transparent') | #ffffffff |
| `--concurrency <int>` | Max parallel processes | 4 |
| `--keep-intermediate` | Keep intermediate renders | false |

CLI Examples

# Convert first 5 pages to 512x512 JPEG with white background
pdf2square document.pdf --max-pages 5 --size 512 --format jpg --bg "#ffffff"

# Convert pages 3-7 with transparent background (PNG only)
pdf2square document.pdf --first 3 --max-pages 5 --bg transparent

# High DPI conversion for crisp text
pdf2square document.pdf --dpi 1000 --size 1024

# Process with higher concurrency
pdf2square document.pdf --concurrency 8
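
The "pages 3-7" example follows from how `--first` and `--max-pages` combine: the last page is `first + maxPages - 1`. A minimal sketch of that arithmetic, assuming the selection is clamped to the document length (`selectPages` is a hypothetical name, not a function exported by the package):

```javascript
// Hypothetical helper: resolve the 1-based pages selected by
// --first and --max-pages for a document with totalPages pages.
function selectPages(first, maxPages, totalPages) {
  // Assumption: the range stops at the end of the document.
  const last = Math.min(first + maxPages - 1, totalPages);
  const pages = [];
  for (let page = first; page <= last; page++) pages.push(page);
  return pages;
}

console.log(selectPages(3, 5, 20)); // [3, 4, 5, 6, 7]
console.log(selectPages(3, 5, 4)); // [3, 4]
```

When `first` is beyond the last page the result is empty, which is the situation the "No pages to convert" error presumably describes.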

Library Usage

Basic Example

import { convert } from '@itaylor/pdf2square';

// Convert a PDF file to square images with extracted text
const results = await convert('./path/to/your/document.pdf');

results.forEach((page) => {
  console.log(`Page ${page.pageNumber}:`);
  console.log(`- Image: ${page.base64EncodedImage.substring(0, 50)}...`);
  console.log(`- Text: ${page.extractedText.substring(0, 100)}...`);
});

API Reference

`convert(pathToPdf, options?)`

Converts PDF pages to base64 encoded square images with extracted text.

Parameters:

pathToPdf (string): Path to the input PDF file
options (object, optional): Conversion options

Returns: Promise<ConvertedPDFPage[]>

ConvertedPDFPage

interface ConvertedPDFPage {
  pageNumber: number; // Page number (1-based)
  originalPath: string; // Path to the original PDF file
  base64EncodedImage: string; // Base64 encoded image with data URL prefix
  extractedText: string; // Extracted text from the page
}

ConvertOptions

interface ConvertOptions {
  maxPages?: number; // Maximum pages to convert (default: 10)
  size?: number; // Target square size in pixels (default: 896)
  dpi?: number; // Render DPI (default: 700)
  first?: number; // First page to convert (default: 1)
  format?: 'png' | 'jpg'; // Output format (default: 'png')
  bg?: string; // Background color (default: '#ffffffff')
  concurrency?: number; // Max parallel processes (default: 4)
}
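
Because every field is optional, it can help to see the effective settings for a partial options object. The sketch below only mirrors the documented defaults; `effectiveOptions` is a hypothetical helper, and how the library merges options internally is not stated here.

```javascript
// Defaults copied from the ConvertOptions documentation above.
const DEFAULTS = Object.freeze({
  maxPages: 10,
  size: 896,
  dpi: 700,
  first: 1,
  format: 'png',
  bg: '#ffffffff',
  concurrency: 4,
});

// Hypothetical helper: show the settings a partial options object implies.
function effectiveOptions(options = {}) {
  // Drop undefined values so they do not overwrite a default.
  const provided = Object.fromEntries(
    Object.entries(options).filter(([, value]) => value !== undefined),
  );
  return { ...DEFAULTS, ...provided };
}

console.log(effectiveOptions({ size: 512, format: 'jpg' }));
```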

Library Examples

Custom Options

import { convert } from '@itaylor/pdf2square';

const results = await convert('./path/to/document.pdf', {
  maxPages: 5,
  size: 512,
  dpi: 300,
  format: 'jpg',
  bg: '#ffffff',
  concurrency: 2,
});

Save Images to Files

import { convert } from '@itaylor/pdf2square';
import fs from 'node:fs/promises';

const results = await convert('./path/to/document.pdf');

for (const page of results) {
  // Extract base64 data (remove data URL prefix)
  const base64Data = page.base64EncodedImage.replace(
    /^data:image\/\w+;base64,/,
    '',
  );
  const buffer = Buffer.from(base64Data, 'base64');

  // Save image and text files
  await fs.writeFile(`page-${page.pageNumber}.png`, buffer);
  await fs.writeFile(`page-${page.pageNumber}.txt`, page.extractedText);
}
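
The example above hard-codes the `.png` extension, which matches the default format. With `format: 'jpg'` the extension can be read from the data URL prefix instead. A minimal sketch, assuming JPEG output carries the standard `image/jpeg` MIME type (`decodeDataUrl` is a hypothetical helper):

```javascript
// Hypothetical helper: split a base64 image data URL into its MIME type,
// a file extension, and the decoded bytes.
function decodeDataUrl(dataUrl) {
  const match = /^data:(image\/(\w+));base64,(.*)$/s.exec(dataUrl);
  if (!match) throw new Error('Not a base64 image data URL');
  const [, mimeType, subtype, payload] = match;
  // The JPEG MIME subtype is "jpeg"; map it to the usual file extension.
  const extension = subtype === 'jpeg' ? 'jpg' : subtype;
  return { mimeType, extension, buffer: Buffer.from(payload, 'base64') };
}

// Stand-in payload for demonstration; a real page holds image bytes.
const sample = `data:image/jpeg;base64,${Buffer.from('demo').toString('base64')}`;
const { extension, buffer } = decodeDataUrl(sample);
console.log(extension, buffer.length);
```

The file name then becomes `` `page-${page.pageNumber}.${extension}` `` rather than a fixed `.png`.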

Web Service Integration

import { convert } from '@itaylor/pdf2square';

// Express.js route handler example
export async function processPDF(req, res) {
  try {
    // req.file.path comes from multer file upload middleware
    const results = await convert(req.file.path, {
      maxPages: 10,
      size: 896,
      format: 'png',
    });

    res.json({
      success: true,
      totalPages: results.length,
      pages: results.map((page) => ({
        pageNumber: page.pageNumber,
        image: page.base64EncodedImage,
        text: page.extractedText,
        // filter(Boolean) drops empty strings so blank pages count as 0 words
        textWordCount: page.extractedText.split(/\s+/).filter(Boolean).length,
      })),
    });
  } catch (error) {
    res.status(500).json({
      success: false,
      error: error.message,
    });
  }
}

Batch Processing

import { convert } from '@itaylor/pdf2square';

async function processBatch(pdfPaths) {
  const results = [];

  for (const pdfPath of pdfPaths) {
    console.log(`Processing ${pdfPath}...`);

    try {
      const pages = await convert(pdfPath, {
        maxPages: 5,
        concurrency: 2, // Lower concurrency when processing multiple files
      });

      results.push({
        pdfPath,
        success: true,
        pageCount: pages.length,
        data: pages,
      });
    } catch (error) {
      console.error(`Failed to process ${pdfPath}:`, error.message);
      results.push({
        pdfPath,
        success: false,
        error: error.message,
      });
    }
  }

  return results;
}

// Example usage:
// const results = await processBatch([
//   './documents/file1.pdf',
//   './documents/file2.pdf',
//   './documents/file3.pdf'
// ]);
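
The loop above handles one file at a time. If several files should be in flight at once, a small worker pool can cap the parallelism. This is a generic sketch, not something the package provides; `mapWithConcurrency` is a hypothetical name.

```javascript
// Hypothetical helper: run worker over items with at most `limit`
// calls in flight, returning results in input order.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until none remain.
  async function run() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

A call such as `mapWithConcurrency(pdfPaths, 2, (pdfPath) => convert(pdfPath, { maxPages: 5 }))` would keep two files in progress. Each `convert` call also runs its own `concurrency` workers, so the two limits multiply. A rejected worker rejects the whole call, so wrap the worker in try/catch if per-file errors should be collected instead.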

Output Format

CLI Output Files

When using the CLI, files are saved with the following naming convention:

input.pdf → input-001.png, input-001.txt
          → input-002.png, input-002.txt
          → ...
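
The listing suggests a three-digit, zero-padded page number appended to the output prefix. A minimal sketch of that convention, inferred from the example rather than taken from the CLI source (`outputNames` is a hypothetical helper):

```javascript
// Hypothetical helper: build the image and text file names for one page,
// assuming a three-digit, zero-padded page number as in the listing above.
function outputNames(prefix, pageNumber, format = 'png') {
  const suffix = String(pageNumber).padStart(3, '0');
  return {
    image: `${prefix}-${suffix}.${format}`,
    text: `${prefix}-${suffix}.txt`,
  };
}

console.log(outputNames('input', 1)); // { image: 'input-001.png', text: 'input-001.txt' }
```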

Library Output

The library returns base64 encoded images with data URL prefixes:

{
  pageNumber: 1,
  originalPath: "/path/to/input.pdf",
  base64EncodedImage: "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
  extractedText: "This is the text content from page 1..."
}
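
The `iVBORw0KGgo` prefix in the sample is the base64 form of the eight-byte PNG file signature, which gives a cheap sanity check on returned images. A sketch of that check (`looksLikePng` is a hypothetical helper, not part of the library):

```javascript
// The eight-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: report whether a data URL's payload starts with PNG bytes.
function looksLikePng(dataUrl) {
  const comma = dataUrl.indexOf(',');
  if (comma === -1) return false;
  const bytes = Buffer.from(dataUrl.slice(comma + 1), 'base64');
  return bytes.length >= 8 && bytes.subarray(0, 8).equals(PNG_SIGNATURE);
}

console.log(looksLikePng('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA')); // true
```

This confirms only the file header, not that the rest of the image is intact.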

Background Colors

Supported background color formats:

Hex colors: #RRGGBB or #RRGGBBAA
Transparent: transparent (PNG only)
Examples: #ffffff, #ff0000aa, transparent

Note: JPEG output cannot be transparent and falls back to a white background.
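
To make the accepted formats concrete, the sketch below parses a background value into RGBA channels. It is an illustration of the documented formats, not the package's parser: `parseBackground` is a hypothetical name, and the handling of partially transparent colors with JPEG is not documented, so the sketch leaves them unchanged.

```javascript
// Hypothetical helper: turn a --bg value into RGBA channels (alpha 0..1).
function parseBackground(color, format = 'png') {
  let rgba;
  if (color === 'transparent') {
    rgba = { r: 0, g: 0, b: 0, alpha: 0 };
  } else {
    // Six hex digits for RGB, optionally two more for alpha.
    const match = /^#([0-9a-f]{6})([0-9a-f]{2})?$/i.exec(color);
    if (!match) throw new Error('Invalid background color');
    const [r, g, b] = [0, 2, 4].map((i) => parseInt(match[1].slice(i, i + 2), 16));
    const alpha = match[2] === undefined ? 1 : parseInt(match[2], 16) / 255;
    rgba = { r, g, b, alpha };
  }
  // JPEG has no alpha channel; full transparency falls back to white.
  if (format === 'jpg' && rgba.alpha === 0) {
    return { r: 255, g: 255, b: 255, alpha: 1 };
  }
  return rgba;
}

console.log(parseBackground('#ff0000aa'));
console.log(parseBackground('transparent', 'jpg'));
```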

Performance Tips

Concurrency: Adjust --concurrency based on your CPU cores and available memory
DPI: Higher DPI produces better text quality but increases processing time and memory usage
Format: JPEG is faster and produces smaller files than PNG but doesn't support transparency
Page Range: Use --first and --max-pages to process only the pages you need
Memory: Large PDFs with high DPI settings may require significant memory
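
A rough estimate shows why DPI dominates memory use. PDF page sizes are measured in points (1/72 inch by default), so the pixel dimensions scale with `dpi / 72`. The sketch assumes a page is rasterized at the full render DPI as uncompressed RGBA before being scaled down; actual usage depends on the renderer. `estimateRender` is a hypothetical helper.

```javascript
// Hypothetical helper: estimate the size of one uncompressed RGBA render.
function estimateRender(widthPt, heightPt, dpi) {
  // One PDF point is 1/72 inch, so pixels = points * dpi / 72.
  const widthPx = Math.round((widthPt * dpi) / 72);
  const heightPx = Math.round((heightPt * dpi) / 72);
  const bytes = widthPx * heightPx * 4; // 4 bytes per pixel (R, G, B, A)
  return { widthPx, heightPx, bytes, mebibytes: bytes / 1024 ** 2 };
}

// US Letter (612 x 792 pt) at the default 700 DPI.
console.log(estimateRender(612, 792, 700));
```

Under these assumptions a US Letter page at 700 DPI is 5950 x 7700 pixels, roughly 175 MiB, and that figure multiplies by the number of pages rendered concurrently. At 300 DPI the same page is 2550 x 3300 pixels, roughly 32 MiB.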

Error Handling

Common errors and solutions:

"Could not determine page count": PDF may be corrupted, password-protected, or invalid
"No pages to convert": Check --first and --max-pages parameters
"Invalid PDF structure": The file is not a valid PDF document
"Format must be 'png' or 'jpg'": Use a supported output format
"Invalid background color": Use hex format (#RRGGBB or #RRGGBBAA) or 'transparent'
Memory errors: Reduce DPI, page count, or concurrency settings
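
In library code, the messages above can be mapped to a next step for the caller. The sketch assumes thrown errors contain these texts verbatim, which the list suggests but does not guarantee; `hintFor` and `ERROR_HINTS` are hypothetical names.

```javascript
// Message fragments quoted from the list above, each paired with a next step.
const ERROR_HINTS = [
  ['Could not determine page count', 'Check that the PDF is not corrupted or password-protected.'],
  ['No pages to convert', 'Check the first and maxPages settings.'],
  ['Invalid PDF structure', 'Confirm the file is a valid PDF document.'],
  ["Format must be 'png' or 'jpg'", 'Use png or jpg as the output format.'],
  ['Invalid background color', "Use #RRGGBB, #RRGGBBAA, or 'transparent'."],
];

// Hypothetical helper: return a hint for a known error, or null otherwise.
function hintFor(error) {
  const message = String(error && error.message ? error.message : error);
  const entry = ERROR_HINTS.find(([fragment]) => message.includes(fragment));
  return entry ? entry[1] : null;
}

console.log(hintFor(new Error('No pages to convert')));
```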

TypeScript Support

The library includes TypeScript declarations for a better development experience:

import { convert, ConvertedPDFPage, ConvertOptions } from '@itaylor/pdf2square';

const options: ConvertOptions = {
  maxPages: 5,
  size: 512,
  format: 'png',
  dpi: 300,
  bg: '#ffffff',
};

const results: ConvertedPDFPage[] = await convert(
  './path/to/document.pdf',
  options,
);

// Process results with full type safety
results.forEach((page: ConvertedPDFPage) => {
  console.log(
    `Page ${page.pageNumber}: ${page.extractedText.length} characters`,
  );
});

License

MIT License - see LICENSE file for details.

Package last updated on 31 Oct 2025
