sentence-parse

A simple utility to parse text into sentences

Version: 1.3.1 (latest)
Source: npm
Maintainers: 1
Weekly downloads: 1.5K (-49.85%)
📄 Sentence Parse

A simple utility to parse text into sentences.


Installation

npm install sentence-parse

Usage

The parser can be used to split text into sentences with various options. Here's a basic example:

import { readFile } from 'fs/promises';
import { join } from 'path';
import { parseSentences } from 'sentence-parse';

// Parse from string
const text = "Hello world! This is a test.";
const sentences = await parseSentences(text);
console.log(sentences);
// Output: ["Hello world!", "This is a test."]

// Parse from file
const fileText = await readFile(join(process.cwd(), 'text-file.txt'), 'utf8');
const fileSentences = await parseSentences(fileText);
console.log(fileSentences);

Options

  • observeMultipleLineBreaks: Treats two or more consecutive line breaks as a sentence boundary, so the text on either side is returned as separate sentences. Default is false.
  • removeStartLineSequences: Removes specified sequences at the start of each line. Default is an empty array [].
  • preserveHTMLBreaks: Preserves HTML <br> and <p> tags as line breaks in the text. Default is true.
  • preserveListItems: Preserves list items by adding a prefix to each <li> element. Default is true.
  • listItemPrefix: Specifies the prefix to use for list items when preserveListItems is true. Default is '- '.
  • excludeNonLetterSentences: Excludes segments that contain no letters (only numbers, symbols, etc). Default is false.
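
The defaults listed above can be collected into a single object, which makes it easier to see what a call with no options is documented to use. This is a minimal sketch: the names DEFAULT_OPTIONS and withDefaults are hypothetical and are not exported by sentence-parse; only the option names and default values come from the list above.

```javascript
// Documented defaults, copied from the options list.
// DEFAULT_OPTIONS is a hypothetical name, not part of the package.
const DEFAULT_OPTIONS = {
  observeMultipleLineBreaks: false,
  removeStartLineSequences: [],
  preserveHTMLBreaks: true,
  preserveListItems: true,
  listItemPrefix: '- ',
  excludeNonLetterSentences: false,
};

// Hypothetical helper: overlay caller overrides on the documented defaults.
function withDefaults(overrides = {}) {
  return { ...DEFAULT_OPTIONS, ...overrides };
}

const options = withDefaults({ listItemPrefix: '* ' });
console.log(options.listItemPrefix);     // "* "
console.log(options.preserveHTMLBreaks); // true
```

An object built this way could be passed as the second argument to parseSentences, assuming the package accepts a plain options object as in the examples in this README.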

Examples

Using observeMultipleLineBreaks

import { parseSentences } from 'sentence-parse';

const text = "Hello world!\n\nThis is a test.";
const sentences = await parseSentences(text, { observeMultipleLineBreaks: true });
console.log(sentences);
// Output: ["Hello world!", "This is a test."]
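
To illustrate why this option matters, the sketch below shows the idea of treating a run of two or more line breaks as a boundary. It is not the package's implementation; splitOnBlankLines is a hypothetical helper written only to demonstrate the behaviour on text such as headings that lack closing punctuation.

```javascript
// Hypothetical helper, not part of sentence-parse: split text wherever
// two or more consecutive line breaks occur, then drop empty pieces.
function splitOnBlankLines(text) {
  return text
    .split(/(?:\r?\n){2,}/)
    .map((piece) => piece.trim())
    .filter(Boolean);
}

// A heading without a period is still separated from the next paragraph.
const pieces = splitOnBlankLines('Chapter One\n\nIt was late.');
console.log(pieces);
// Output: ["Chapter One", "It was late."]
```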

Using removeStartLineSequences

import { parseSentences } from 'sentence-parse';

const text = "> Hello world!\n> This is a test.";
const sentences = await parseSentences(text, { removeStartLineSequences: ['>'] });
console.log(sentences);
// Output: ["Hello world!", "This is a test."]
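
The effect of this option can be pictured as a per-line cleanup step that runs before sentence splitting. The following is a rough sketch under that assumption; stripLineStarts is a hypothetical function and may differ from what the package does internally (for example, in how it handles whitespace after the removed sequence).

```javascript
// Hypothetical helper, not part of sentence-parse: remove the first matching
// sequence from the start of each line, plus any whitespace that follows it.
function stripLineStarts(text, sequences) {
  return text
    .split('\n')
    .map((line) => {
      for (const seq of sequences) {
        if (line.startsWith(seq)) {
          return line.slice(seq.length).trimStart();
        }
      }
      return line; // no sequence matched, keep the line unchanged
    })
    .join('\n');
}

const cleaned = stripLineStarts('> Hello world!\n> This is a test.', ['>']);
console.log(cleaned);
// Output:
// Hello world!
// This is a test.
```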

Using HTML Options

import { parseSentences } from 'sentence-parse';

const htmlText = `
<p>Hello world!<br>This is a test.</p>
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>
`;

const sentences = await parseSentences(htmlText, {
  preserveHTMLBreaks: true,
  preserveListItems: true,
  listItemPrefix: '* '
});

console.log(sentences);
// Output: ["Hello world!", "This is a test.", "* First item", "* Second item"]
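
As a way to picture what the HTML options describe, the sketch below turns <br>, </p> and <li> markup into line breaks and prefixes each list item. It is an illustration only: htmlToLines is a hypothetical helper, the regex-based tag handling is a simplification that would not cope with arbitrary HTML, and the package's own processing may differ.

```javascript
// Hypothetical helper, not part of sentence-parse.
function htmlToLines(html, listItemPrefix = '- ') {
  return html
    // Start each list item on its own line with the chosen prefix.
    .replace(/<li(?:\s[^>]*)?>/gi, () => '\n' + listItemPrefix)
    // Treat <br>, </p> and </li> as line breaks.
    .replace(/<br\s*\/?>|<\/p>|<\/li>/gi, '\n')
    // Drop any remaining tags.
    .replace(/<[^>]+>/g, '')
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
}

const lines = htmlToLines(
  '<p>Hello world!<br>This is a test.</p><ul><li>First item</li><li>Second item</li></ul>',
  '* '
);
console.log(lines);
// Output: ["Hello world!", "This is a test.", "* First item", "* Second item"]
```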

Using excludeNonLetterSentences

import { parseSentences } from 'sentence-parse';

const text = "Hello world! $4,000,000. This is a test.";
const sentences = await parseSentences(text, { excludeNonLetterSentences: true });
console.log(sentences);
// Output: ["Hello world!", "This is a test."]
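
The "contains no letters" rule can be expressed as a small predicate. The sketch below uses a Unicode letter class so that non-Latin scripts count as letters; whether the package uses the same definition of "letter" is an assumption, and hasLetter is a hypothetical name.

```javascript
// Hypothetical predicate, not part of sentence-parse: true when the segment
// contains at least one Unicode letter.
function hasLetter(segment) {
  return /\p{L}/u.test(segment);
}

const segments = ['Hello world!', '$4,000,000.', 'This is a test.'];
const kept = segments.filter(hasLetter);
console.log(kept);
// Output: ["Hello world!", "This is a test."]
```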

Example

Check out example/example.js for a working example that parses sentences from a text file.

Run the example:

cd example
node example.js

Keywords: sentence


Package last updated on 30 Jan 2025
