@transkripid/pdf-text-replace

Find and replace text in PDF files with preserved formatting

Version: 1.0.0 (latest, npm)
Maintainers: 1

pdf-text-replace

License: MIT

Find and replace text in PDF files while preserving formatting.

Features

  • Chainable API mimicking JavaScript's String.replace()
  • Supports string and RegExp search patterns
  • Preserves font styles, colors, and layout
  • Handles FlateDecode compressed streams
  • Graceful error handling (returns original buffer on failure)
  • Automatic Unicode transliteration (CJK, Cyrillic, accented characters → ASCII)
  • Pure TypeScript with minimal dependencies (pako for zlib, any-ascii for transliteration)

Installation

# From npm
npm install @transkripid/pdf-text-replace

# From local path
npm install /path/to/pdf-text-replace

Usage

import { PDF } from '@transkripid/pdf-text-replace';
import { readFileSync, writeFileSync } from 'fs';

const input = readFileSync('document.pdf');

const modified = new PDF(input)
  .replace('John Doe', 'Jane Smith')
  .replace('old@email.com', 'new@email.com')
  .replace(/\d{4}-\d{4}-\d{4}/g, 'XXXX-XXXX-XXXX')
  .toBuffer();

writeFileSync('modified.pdf', modified);

API

new PDF(input: Buffer | Uint8Array)

Create a new PDF instance from a buffer.

.replace(search: string | RegExp, replacement: string): this

Queue a text replacement operation. Returns this for chaining.

  • search - String or RegExp pattern to find
  • replacement - Text to replace matches with

.toBuffer(): Buffer

Apply all queued replacements and return the modified PDF as a Buffer.

Returns the original buffer unchanged if:

  • No matches are found
  • An error occurs during processing
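Because both cases hand back the input bytes, a caller cannot tell "no match" from "processing failed" by looking at the return value alone. A minimal sketch of a byte-level check follows; `wasModified` is a hypothetical helper, not part of the package, and it compares contents rather than references since the README does not say whether the same Buffer object is returned.

```typescript
// Hypothetical helper (not part of the package API).
// Returns true when the output bytes differ from the input bytes.
function wasModified(original: Uint8Array, result: Uint8Array): boolean {
  // Buffer.compare returns 0 only when both byte sequences are identical.
  return Buffer.compare(original, result) !== 0;
}

// Possible usage with the API described above:
//   const output = new PDF(input).replace('John Doe', 'Jane Smith').toBuffer();
//   if (!wasModified(input, output)) {
//     console.warn('Nothing was replaced (no match, or an error occurred).');
//   }
```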

How It Works

The library parses PDF content streams (both raw and FlateDecode compressed), finds text operators (Tj, TJ), and performs replacements while:

  • Preserving the original font and styling
  • Adjusting horizontal scaling (Tz operator) when replacement text has different width
  • Rebuilding the PDF with updated stream lengths and xref table
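The steps above can be illustrated with a simplified sketch. It is not the package's implementation: `replaceInTj` and `replaceInFlateStream` are hypothetical names, Node's built-in `node:zlib` stands in for pako, width is estimated by character count instead of font metrics, and the original horizontal scaling is assumed to be 100. It only handles plain `(...) Tj` strings and does not rebuild stream lengths or the xref table.

```typescript
import { deflateSync, inflateSync } from 'node:zlib';

// Escape the characters that are special inside a PDF literal string.
function escapePdfString(text: string): string {
  return text.replace(/([\\()])/g, '\\$1');
}

// Replace `search` inside simple `(...) Tj` operators of a decoded content stream.
// Assumptions: no escaped or nested parentheses in the string, width is
// approximated by character count, and the prior Tz value was 100.
function replaceInTj(stream: string, search: string, replacement: string): string {
  return stream.replace(/\(([^()\\]*)\)\s*Tj/g, (whole: string, text: string) => {
    if (!text.includes(search)) return whole;
    const updated = text.split(search).join(replacement);
    if (updated.length === 0) return '() Tj';
    // Squeeze or stretch the new text into roughly the old width.
    const scale = Math.round((text.length / updated.length) * 100);
    const shown = `(${escapePdfString(updated)}) Tj`;
    return scale === 100 ? shown : `${scale} Tz ${shown} 100 Tz`;
  });
}

// Same replacement for a FlateDecode (zlib) compressed content stream.
function replaceInFlateStream(
  compressed: Uint8Array,
  search: string,
  replacement: string,
): Buffer {
  const decoded = inflateSync(compressed).toString('latin1');
  const updated = replaceInTj(decoded, search, replacement);
  return deflateSync(Buffer.from(updated, 'latin1'));
}
```

With these assumptions, replacing the 8-character "John Doe" with the 10-character "Jane Smith" yields a horizontal scaling of 80, so the longer text occupies about the same width as the original.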

Unicode Support

Replacement text containing Unicode characters is automatically transliterated to ASCII for compatibility with standard PDF fonts (WinAnsiEncoding):

// Chinese → Pinyin
.replace('Author', '银宵')        // Becomes "YinXiao"

// Korean → Romanized  
.replace('Name', '스트레이')      // Becomes "seuteulei"

// Cyrillic → Latin
.replace('Hello', 'Привет')       // Becomes "Privet"

// Accented → Plain ASCII
.replace('Name', 'José García')   // Becomes "Jose Garcia"

This uses any-ascii for transliteration.
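To see in advance whether a replacement string will be transliterated, a caller could test it for non-ASCII characters. The sketch below uses hypothetical helpers that are not part of the package. `stripDiacritics` covers only the accented-Latin case using built-in Unicode normalization; it is not a substitute for any-ascii and leaves CJK, Hangul, and Cyrillic text unchanged.

```typescript
// Hypothetical helpers (not part of the package API).

// True when the text contains any character outside 7-bit ASCII,
// i.e. when the library would need to transliterate it.
function needsTransliteration(text: string): boolean {
  return /[^\x00-\x7f]/.test(text);
}

// Accented Latin only: decompose each character (NFD), then drop the
// combining marks in the range U+0300 to U+036F.
function stripDiacritics(text: string): string {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

console.log(needsTransliteration('Jane Smith'));  // false
console.log(needsTransliteration('José García')); // true
console.log(stripDiacritics('José García'));      // "Jose Garcia"
```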

Limitations

  • Only works with PDFs using WinAnsiEncoding (standard Latin text)
  • Complex font encodings (CID, Identity-H) are not supported
  • Unicode replacement text is transliterated to ASCII (original Unicode cannot be preserved)
  • Text split across multiple operators may not be found
  • Scanned/image-based PDFs cannot be modified
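Since unsupported input simply results in an unchanged buffer, a rough pre-check can help explain why nothing was replaced. The following is a heuristic sketch with a hypothetical function name, not part of the package. It scans the raw bytes for the PDF names associated with CID fonts and Identity encodings. Font dictionaries stored inside compressed object streams are not visible to this scan, so an empty result does not guarantee the file is supported.

```typescript
// Hypothetical helper (not part of the package API).
// Returns the unsupported-encoding markers found in the raw PDF bytes.
function findUnsupportedEncodingMarkers(pdf: Uint8Array): string[] {
  // latin1 maps every byte to exactly one character, so binary data is safe to scan.
  const raw = Buffer.from(pdf).toString('latin1');
  const markers = ['/Identity-H', '/Identity-V', '/CIDFontType0', '/CIDFontType2'];
  return markers.filter((marker) => raw.includes(marker));
}

// Possible usage:
//   const found = findUnsupportedEncodingMarkers(readFileSync('document.pdf'));
//   if (found.length > 0) {
//     console.warn(`Likely unsupported font encodings: ${found.join(', ')}`);
//   }
```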

License

MIT

Keywords

pdf

Package last updated on 28 Dec 2025
