node-pptx-parser

A PowerPoint (PPTX) parser that extracts text content with preserved formatting

Source: npm · Version: 1.0.1 (latest) · Weekly downloads: 6.8K (-21.69%) · Maintainers: 1

A Node.js library for parsing PowerPoint (PPTX) files and extracting text content. This library maintains text formatting, line breaks, and paragraph structures from the original presentation.

Features

  • Extract text content from PPTX files with preserved formatting

  • Parse PPTX structure into manageable JavaScript objects

  • Access raw XML content of presentation components

  • Written in TypeScript for type safety

  • Promise-based API

  • Preserves line breaks and paragraph formatting

  • Minimal dependencies

Installation


npm install node-pptx-parser

Usage

Once the package is installed, you can load it with either an import or a require statement:

// ESM import:
import PptxParser from "node-pptx-parser";

// CommonJS require:
const PptxParser = require("node-pptx-parser").default;

Basic Text Extraction

import PptxParser from "node-pptx-parser";

async function main() {
  const parser = new PptxParser("presentation.pptx");

  try {
    // Extract text from all slides
    const textContent = await parser.extractText();

    // Print text from each slide
    textContent.forEach((slide) => {
      console.log(`\nSlide ${slide.id}:`);

      console.log(slide.text.join("\n"));
    });
  } catch (error) {
    console.error("Error:", error.message);
  }
}

main();
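
The array returned by extractText() can be flattened into a single plain-text document. The sketch below is a minimal illustration, not part of the library: slidesToPlainText is a hypothetical helper, the SlideText interface keeps only the id and text fields from the SlideTextContent type documented below, and the sample ids are made up to stand in for real parser output.

```typescript
// Reduced version of the README's SlideTextContent type: only the two
// fields this helper needs.
interface SlideText {
  id: string;
  text: string[];
}

// Hypothetical helper (not part of node-pptx-parser): joins the text of
// every slide into one string, with a heading line per slide and a blank
// line between slides.
function slidesToPlainText(slides: SlideText[]): string {
  return slides
    .map((slide) => `Slide ${slide.id}:\n${slide.text.join("\n")}`)
    .join("\n\n");
}

// Made-up sample data standing in for `await parser.extractText()`.
const sample: SlideText[] = [
  { id: "slide1", text: ["Title", "Subtitle"] },
  { id: "slide2", text: ["First point", "Second point"] },
];

console.log(slidesToPlainText(sample));
```

In real use, the sample array would be replaced by the result of parser.extractText(), and the returned string could be written to a file or passed to a search index.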

Advanced Usage - Full Presentation Parsing

import PptxParser from "node-pptx-parser";

async function main() {
  const parser = new PptxParser("presentation.pptx");

  try {
    // Get complete parsed presentation content
    const parsedContent = await parser.parse();

    // Access presentation structure
    console.log(parsedContent.presentation.parsed);

    // Access individual slides
    parsedContent.slides.forEach((slide) => {
      console.log(`Slide ${slide.id}:`, slide.parsed);
    });

    // Access raw XML if needed
    console.log(parsedContent.presentation.xml);
  } catch (error) {
    console.error("Error:", error.message);
  }
}

main();
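
The parsed field is typed as any, so its exact shape is not documented here. As a rough sketch, assuming it follows xml2js's default output (child elements as arrays, element text as a string or as an object with a "_" key when attributes are present), the text runs of a slide can be gathered by walking the tree and collecting every a:t element, which is where DrawingML stores run text. collectTextRuns, runText, and the sample tree are illustrative names and data, not library API.

```typescript
// Returns the text of one `a:t` entry, or undefined if it has none.
// Assumption: xml2js-style values, i.e. a plain string, or an object whose
// "_" key holds the text when the element also carries attributes.
function runText(run: unknown): string | undefined {
  if (typeof run === "string") return run;
  if (run !== null && typeof run === "object") {
    const inner = (run as Record<string, unknown>)["_"];
    if (typeof inner === "string") return inner;
  }
  return undefined;
}

// Hypothetical helper: depth-first walk that collects every `a:t` text
// run found anywhere in the tree, in document order.
function collectTextRuns(node: unknown, out: string[] = []): string[] {
  if (Array.isArray(node)) {
    for (const child of node) collectTextRuns(child, out);
  } else if (node !== null && typeof node === "object") {
    for (const [key, value] of Object.entries(node as Record<string, unknown>)) {
      if (key === "a:t") {
        for (const run of Array.isArray(value) ? value : [value]) {
          const text = runText(run);
          if (text !== undefined) out.push(text);
        }
      } else {
        collectTextRuns(value, out);
      }
    }
  }
  return out;
}

// Hand-written tree imitating what `slide.parsed` might look like.
const sampleSlide = {
  "p:sld": {
    "p:cSld": [{
      "p:spTree": [{
        "p:sp": [{
          "p:txBody": [{
            "a:p": [{
              "a:r": [
                { "a:t": ["Hello"] },
                { "a:t": [{ _: " world", $: { "xml:space": "preserve" } }] },
              ],
            }],
          }],
        }],
      }],
    }],
  },
};

console.log(collectTextRuns(sampleSlide)); // [ 'Hello', ' world' ]
```

For most uses extractText() is the simpler route; a walker like this is only worth writing when you need something the text API does not expose.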

API Reference

PptxParser

The main class for parsing PPTX files.

Constructor


constructor(filePath: string)

Creates a new instance of PptxParser.

  • filePath: Path to the PPTX file to be parsed

Methods

parse()

async parse(): Promise<ParsedPresentation>

Parses the entire PPTX file and returns its content.

  • Returns: Promise resolving to a ParsedPresentation object containing the complete presentation structure

extractText()

async extractText(): Promise<SlideTextContent[]>

Extracts formatted text content from all slides.

  • Returns: Promise resolving to an array of SlideTextContent objects

Types

ParsedPresentation

interface ParsedPresentation {
  presentation: {
    path: string;
    xml: string;
    parsed: any;
  };
  relationships: {
    path: string;
    xml: string;
    parsed: any;
  };
  slides: ParsedSlide[];
}

ParsedSlide

interface ParsedSlide {
  id: string;
  path: string;
  xml: string;
  parsed: any;
}

SlideTextContent

interface SlideTextContent extends ParsedSlide {
  text: string[];
}
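
Because SlideTextContent carries both the slide id and its lines of text, simple queries over a deck need no further parsing. The following sketch repeats the two interfaces from above so that it is self-contained, then adds a hypothetical findSlidesContaining helper; the sample slides, including their id and path values, are invented for illustration.

```typescript
// Interfaces copied from the Types section above.
interface ParsedSlide {
  id: string;
  path: string;
  xml: string;
  parsed: any;
}

interface SlideTextContent extends ParsedSlide {
  text: string[];
}

// Hypothetical helper (not part of the library): returns the ids of all
// slides with at least one line containing `query`, ignoring case.
function findSlidesContaining(slides: SlideTextContent[], query: string): string[] {
  const needle = query.toLowerCase();
  return slides
    .filter((slide) => slide.text.some((line) => line.toLowerCase().includes(needle)))
    .map((slide) => slide.id);
}

// Invented sample data; real values come from `await parser.extractText()`.
const slides: SlideTextContent[] = [
  { id: "slide1", path: "ppt/slides/slide1.xml", xml: "", parsed: {}, text: ["Quarterly review"] },
  { id: "slide2", path: "ppt/slides/slide2.xml", xml: "", parsed: {}, text: ["Budget", "Next steps"] },
];

console.log(findSlidesContaining(slides, "budget")); // [ 'slide2' ]
```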

Error Handling

The library throws errors in the following cases:

  • Invalid PPTX file structure

  • File reading errors

  • XML parsing errors

Example error handling:

import PptxParser from "node-pptx-parser";

try {
  const parser = new PptxParser("presentation.ppt");
  const content = await parser.extractText();
} catch (error) {
  if (error.message.includes("Invalid PPTX file structure")) {
    console.error("The PPTX file is corrupted or invalid");
  } else {
    console.error("An error occurred:", error.message);
  }
}
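
In TypeScript with strict settings, the value caught by a catch clause is typed as unknown, so reading error.message directly does not compile. One way to handle this, sketched below, is to narrow the value first and then apply the same message check as in the example above. classifyError and ErrorKind are hypothetical names, and matching on the "Invalid PPTX file structure" substring simply mirrors that example rather than any documented error contract.

```typescript
type ErrorKind = "invalid-pptx" | "other";

// Hypothetical helper: turns any thrown value into a message plus a coarse
// category, without assuming the value is an Error instance.
function classifyError(error: unknown): { kind: ErrorKind; message: string } {
  const message = error instanceof Error ? error.message : String(error);
  const kind: ErrorKind = message.includes("Invalid PPTX file structure")
    ? "invalid-pptx"
    : "other";
  return { kind, message };
}

// Simulated failures standing in for errors thrown by the parser.
console.log(classifyError(new Error("Invalid PPTX file structure")).kind); // invalid-pptx
console.log(classifyError(new Error("ENOENT: no such file")).kind); // other
```

Inside a real catch block, the call would be classifyError(error), followed by a branch on the returned kind.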

Dependencies

  • unzipper: For extracting PPTX files
  • xml2js: For parsing XML content
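
A PPTX file is a ZIP archive, which is why an unzip library is needed. As a side note, a caller could rule out obviously wrong inputs (such as a legacy binary .ppt file) before handing a path to the parser by checking for the ZIP local-file-header signature, the bytes 50 4B 03 04. The sketch below is an optional pre-check under that assumption; looksLikeZip is a hypothetical helper, not something the library provides, and passing the check does not guarantee a valid presentation.

```typescript
// ZIP local file header signature: "PK" followed by 0x03 0x04.
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04];

// Hypothetical helper: true when the buffer starts with the ZIP signature.
// It only inspects the first four bytes.
function looksLikeZip(bytes: Uint8Array): boolean {
  if (bytes.length < ZIP_SIGNATURE.length) return false;
  return ZIP_SIGNATURE.every((expected, index) => bytes[index] === expected);
}

// Example inputs: a ZIP-like header and the header of an OLE2 file
// (the container format used by legacy .ppt files).
const zipHeader = Uint8Array.from([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00]);
const ole2Header = Uint8Array.from([0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1]);

console.log(looksLikeZip(zipHeader)); // true
console.log(looksLikeZip(ole2Header)); // false
```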

License

MIT

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Keywords

pptx

Package last updated on 17 Feb 2025
