@anyparser/test

Package was removed
Sorry, it seems this package was removed from the registry

The `@anyparser/core` TypeScript SDK lets developers quickly extract structured data from a wide variety of sources, such as PDFs, images, websites, audio, and video.

Version: 0.1.1 (latest, unpublished)
Registry: npm
Weekly downloads: 0
Maintainers: 0

@anyparser/core

A powerful JavaScript SDK for parsing and extracting structured data from various file formats including PDFs, images, and web pages.

Table of Contents

  • Features
  • Requirements
  • Installation
  • Quick Start
  • Usage Examples
  • API Reference
  • Configuration
  • Contributing
  • License

Features

  • 📄 Text extraction from PDFs and images using OCR
  • 🌐 Support for multiple languages and OCR presets
  • 🕷️ Web crawling capabilities
  • 🔄 Configurable output formats (JSON, Markdown, HTML)
  • ✅ Built-in validation and error handling
  • 🚀 Promise-based async/await API
  • 📦 TypeScript support out of the box

Requirements

  • Node.js 20.x or higher

Installation

npm install @anyparser/core

For TypeScript users, types are included in the package.

Quick Start

Before starting, create an API key in Anyparser Studio, then export the API URL and key as environment variables:

export ANYPARSER_API_URL=https://anyparserapi.com
export ANYPARSER_API_KEY=<your-api-key>

Then parse a document:

import { Anyparser } from '@anyparser/core'

const parser = new Anyparser()

async function main() {
  const result = await parser.parse('docs/sample.docx')
  console.log(result)
}

main().catch(console.error)
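
A missing or mistyped environment variable is an easy mistake to make at this step. The sketch below checks both variables up front and fails with a readable message. `readAnyparserEnv` and `AnyparserEnv` are hypothetical names used for illustration and are not part of `@anyparser/core`; only the two variable names come from the Quick Start above.

```typescript
// Hypothetical helper, not part of @anyparser/core.
interface AnyparserEnv {
  apiUrl: URL
  apiKey: string
}

// Accepts any env-like object so it can be exercised without touching process.env.
function readAnyparserEnv(env: Record<string, string | undefined>): AnyparserEnv {
  const rawUrl = env['ANYPARSER_API_URL']
  const apiKey = env['ANYPARSER_API_KEY']

  if (!rawUrl) {
    throw new Error('ANYPARSER_API_URL is not set')
  }
  if (!apiKey) {
    throw new Error('ANYPARSER_API_KEY is not set')
  }

  // new URL() throws a TypeError on malformed input, which surfaces typos early.
  return { apiUrl: new URL(rawUrl), apiKey }
}
```

In Node.js this could be called as `readAnyparserEnv(process.env)` before constructing the parser. The returned `apiUrl` and `apiKey` match the shape of the corresponding `AnyparserOption` fields.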

Usage Examples

1. Parsing Multiple Files

import { Anyparser, AnyparserResultBase } from '@anyparser/core'

const parser = new Anyparser()

async function main() {
  const files = ['docs/sample1.docx', 'docs/sample2.pdf']
  const result = await parser.parse(files) as AnyparserResultBase[]

  for (const item of result) {
    console.log('File:', item.originalFilename)
    console.log('Total characters:', item.totalCharacters)
    console.log('Markdown:', item.markdown?.substring(0, 500))
  }
}

main().catch(console.error)
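
The loop above can be factored into a small pure function, which keeps the logging separate from the data shaping. This is a minimal sketch: `ParsedItem` is a local stand-in that mirrors only the three fields used in the example (their exact types in the SDK are an assumption), and `summarize` and `ItemSummary` are hypothetical names.

```typescript
// Local stand-in for the fields read in the example above; types are assumed.
interface ParsedItem {
  originalFilename: string
  totalCharacters: number
  markdown?: string
}

interface ItemSummary {
  file: string
  characters: number
  preview: string
}

// Builds one summary per parsed item, truncating the Markdown preview.
function summarize(items: ParsedItem[], previewLength = 500): ItemSummary[] {
  return items.map((item) => ({
    file: item.originalFilename,
    characters: item.totalCharacters,
    // Items without Markdown output get an empty preview instead of undefined.
    preview: (item.markdown ?? '').slice(0, previewLength)
  }))
}
```

Returning plain objects rather than printing directly makes the result easy to serialize or pass to `console.table`.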

2. OCR Configuration

import { Anyparser, AnyparserOption, OCR_LANGUAGES } from '@anyparser/core'

const options: AnyparserOption = {
  model: 'ocr',
  format: 'markdown',
  ocrLanguage: [OCR_LANGUAGES.JAPANESE]
}

const parser = new Anyparser(options)

async function main() {
  const result = await parser.parse('docs/document.png')
  console.log(result)
}

main().catch(console.error)

API Reference

Anyparser Class

Constructor

new Anyparser(options?: AnyparserOption)

Methods

  • parse(filePathsOrUrl: string | string[]): Promise<Result>
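
Because `parse` accepts either one path or a list, and the multi-file example casts its result to an array, calling code may want to handle both shapes uniformly. The helper below is a sketch under that assumption; `toArray` is a hypothetical name and not part of the SDK.

```typescript
// Hypothetical helper: wraps a single value in an array, leaves arrays untouched.
function toArray<T>(result: T | T[]): T[] {
  return Array.isArray(result) ? result : [result]
}
```

With this in place, a caller could write `for (const item of toArray(await parser.parse(input)))` whether `input` is a string or a string array.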

Configuration Options

The AnyparserOption interface supports the following configuration:

interface AnyparserOption {
  apiUrl?: URL                    // API endpoint URL
  apiKey?: string                 // Your API key
  format?: 'json' | 'markdown' | 'html'  // Output format
  model?: string                  // e.g. 'ocr'
  image?: boolean                 // Extract images
  table?: boolean                 // Extract tables
  ocrLanguage?: string[]          // OCR language codes
  ocrPreset?: string              // OCR preset ('scan', etc)
}
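
Since every field is optional, one way to manage settings is to keep a shared base object and layer per-task overrides on top. The sketch below redeclares the interface locally so it compiles on its own. The default values and the `'jpn'` language code are placeholders chosen for illustration, not documented SDK values, and `withOverrides` is a hypothetical helper.

```typescript
// Local copy of the interface shown above, so the sketch is self-contained.
interface AnyparserOption {
  apiUrl?: URL
  apiKey?: string
  format?: 'json' | 'markdown' | 'html'
  model?: string
  image?: boolean
  table?: boolean
  ocrLanguage?: string[]
  ocrPreset?: string
}

// Placeholder defaults for illustration; the SDK's real defaults are not stated in this README.
const baseOptions: AnyparserOption = {
  format: 'markdown',
  image: false,
  table: true
}

// Later keys win. Note that an override explicitly set to undefined also replaces the base value.
function withOverrides(base: AnyparserOption, overrides: AnyparserOption): AnyparserOption {
  return { ...base, ...overrides }
}

// 'jpn' is a placeholder code; prefer the OCR_LANGUAGES constants exported by the package.
const ocrOptions = withOverrides(baseOptions, {
  model: 'ocr',
  ocrLanguage: ['jpn'],
  ocrPreset: 'scan'
})
```

The resulting object can be passed to `new Anyparser(ocrOptions)`. Annotating the objects as `AnyparserOption` keeps `format` checked against its literal union instead of widening to `string`.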

Contributing

We welcome contributions! Please see our Contributing Guide for details.

License

Apache-2.0

Keywords

anyparser

Package last updated on 25 Feb 2025
