
PDFlex?
PDFlex is a PDF processing toolkit for Python. It provides tools for PDF validation, text extraction, merging (with custom separator pages), searching, and more, all built to streamline PDF automation workflows.
PDFlex is available on PyPI. To install using pip:
pip install -U pdflex
Alternatively, install in an isolated environment with pipx:
pipx install pdflex
For the fastest installation, use uv:
uv tool install pdflex
PDFlex provides a convenient CLI for merging and searching PDFs. The CLI supports two primary commands: merge and search.
Merge multiple PDF files into a single document while automatically inserting a separator page before each document.
Usage:
pdflex merge /path/to/file1.pdf /path/to/file2.pdf -o merged_output.pdf
Add the --landscape flag to create separator pages in landscape orientation:
pdflex merge /path/to/file1.pdf /path/to/file2.pdf -o merged_output.pdf --landscape
Search for PDF files in a directory based on filename filters (or search for lecture slides with numeric float prefixes) and merge them into one PDF.
Usage:
General Search:
pdflex search /path/to/search -o merged_output.pdf --prefix "Chapter" --suffix ".pdf"
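The prefix and suffix filters above amount to a simple filename match. The following is a minimal sketch of that idea using only the standard library; `filter_by_name` is a hypothetical helper written for illustration and is not part of PDFlex, whose actual matching rules (recursion, case sensitivity) may differ.

```python
from pathlib import Path


def filter_by_name(directory, prefix="", suffix=".pdf"):
    """Return files directly inside `directory` whose names start with
    `prefix` and end with `suffix`, sorted by path.

    Hypothetical helper for illustration only; not the PDFlex implementation.
    """
    root = Path(directory)
    return sorted(
        p
        for p in root.iterdir()
        if p.is_file() and p.name.startswith(prefix) and p.name.endswith(suffix)
    )
```

Calling `filter_by_name("/path/to/search", prefix="Chapter", suffix=".pdf")` would return the matching paths in sorted order, which is the kind of list a merge step could then consume.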
Lecture Slides Merge:
(Merges all PDFs whose filenames start with a numeric float prefix like 1.2_, 3.2_, etc., in sorted order. Separator pages will be in landscape orientation.)
pdflex search /path/to/algorithms-and-computation -o merged_lectures.pdf --lecture
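To make the numeric-prefix ordering concrete, here is a rough sketch of how such filenames could be selected and sorted. The regular expression and the `sort_lecture_files` helper are assumptions made for illustration, not PDFlex's own code. In particular, this sketch compares the two number parts as integers, so `1.10_` sorts after `1.2_`; a tool that parses the prefix as a float would order those two the other way around.

```python
import re
from pathlib import Path

# Assumed pattern: digits, a dot, digits, then an underscore (e.g. "1.2_Intro.pdf").
NUMERIC_PREFIX = re.compile(r"^(\d+)\.(\d+)_")


def sort_lecture_files(names):
    """Keep PDF names with a numeric prefix and sort them by (major, minor).

    Hypothetical helper for illustration only; not the PDFlex implementation.
    """
    keyed = []
    for name in names:
        match = NUMERIC_PREFIX.match(Path(name).name)
        if match and name.lower().endswith(".pdf"):
            # Compare the parts as integers so that 1.10 follows 1.2.
            key = (int(match.group(1)), int(match.group(2)))
            keyed.append((key, name))
    return [name for _, name in sorted(keyed)]
```

Names without a matching prefix, such as `notes.pdf`, are dropped rather than sorted to the end.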
You can also use PDFlex directly from your Python code. Below are examples for some common tasks.
from pathlib import Path
from pdflex.merge import merge_pdfs
# List of PDF file paths to merge
pdf_files = [
    "/path/to/document1.pdf",
    "/path/to/document2.pdf",
]
# Merge files, using landscape separator pages (ideal for lecture slides)
merge_pdfs(pdf_files, output_path="merged_output.pdf", landscape=True)
from pdflex.search import search_pdfs, search_numeric_prefixed_pdfs
# General search: Find PDFs that start with a prefix and/or end with a suffix
pdf_list = search_pdfs("/path/to/search", prefix="Chapter", suffix=".pdf")
print("Found PDFs:", pdf_list)
# Lecture slides: Find PDFs with numeric float prefixes (e.g., "1.2_Intro.pdf")
lecture_slides = search_numeric_prefixed_pdfs("/path/to/algorithms-and-computation")
print("Found lecture slides:", lecture_slides)
Contributions are welcome, whether as bug reports, feature requests, or code contributions.
This project is built upon several open-source PDF projects.
PDFlex is released under the MIT license.
Copyright (c) 2020 to present PDFlex and contributors.
FAQs
Python tools for PDF automation.
We found that pdflex demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.