Socket
Socket
Sign inDemoInstall

parse-english

Package Overview
Dependencies
7
Maintainers
1
Versions
55
Alerts
File Explorer

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

    parse-english

English natural language parser


Version published
Weekly downloads
96K
decreased by-4.44%
Maintainers
1
Install size
114 kB
Created
Weekly downloads
 

Readme

Source

parse-english

Build Coverage Downloads Size Chat

English language parser for retext producing nlcst nodes.

Install

This package is ESM only: Node 12+ is needed to use it and it must be imported instead of required.

npm:

npm install parse-english

Use

import inspect from 'unist-util-inspect'
import {ParseEnglish} from 'parse-english'

var tree = new ParseEnglish().parse(
  'Mr. Henry Brown: A hapless but friendly City of London worker.'
)

console.log(inspect(tree))

Yields:

RootNode[1] (1:1-1:63, 0-62)
└─ ParagraphNode[1] (1:1-1:63, 0-62)
   └─ SentenceNode[23] (1:1-1:63, 0-62)
      ├─ WordNode[2] (1:1-1:4, 0-3)
      │  ├─ TextNode: "Mr" (1:1-1:3, 0-2)
      │  └─ PunctuationNode: "." (1:3-1:4, 2-3)
      ├─ WhiteSpaceNode: " " (1:4-1:5, 3-4)
      ├─ WordNode[1] (1:5-1:10, 4-9)
      │  └─ TextNode: "Henry" (1:5-1:10, 4-9)
      ├─ WhiteSpaceNode: " " (1:10-1:11, 9-10)
      ├─ WordNode[1] (1:11-1:16, 10-15)
      │  └─ TextNode: "Brown" (1:11-1:16, 10-15)
      ├─ PunctuationNode: ":" (1:16-1:17, 15-16)
      ├─ WhiteSpaceNode: " " (1:17-1:18, 16-17)
      ├─ WordNode[1] (1:18-1:19, 17-18)
      │  └─ TextNode: "A" (1:18-1:19, 17-18)
      ├─ WhiteSpaceNode: " " (1:19-1:20, 18-19)
      ├─ WordNode[1] (1:20-1:27, 19-26)
      │  └─ TextNode: "hapless" (1:20-1:27, 19-26)
      ├─ WhiteSpaceNode: " " (1:27-1:28, 26-27)
      ├─ WordNode[1] (1:28-1:31, 27-30)
      │  └─ TextNode: "but" (1:28-1:31, 27-30)
      ├─ WhiteSpaceNode: " " (1:31-1:32, 30-31)
      ├─ WordNode[1] (1:32-1:40, 31-39)
      │  └─ TextNode: "friendly" (1:32-1:40, 31-39)
      ├─ WhiteSpaceNode: " " (1:40-1:41, 39-40)
      ├─ WordNode[1] (1:41-1:45, 40-44)
      │  └─ TextNode: "City" (1:41-1:45, 40-44)
      ├─ WhiteSpaceNode: " " (1:45-1:46, 44-45)
      ├─ WordNode[1] (1:46-1:48, 45-47)
      │  └─ TextNode: "of" (1:46-1:48, 45-47)
      ├─ WhiteSpaceNode: " " (1:48-1:49, 47-48)
      ├─ WordNode[1] (1:49-1:55, 48-54)
      │  └─ TextNode: "London" (1:49-1:55, 48-54)
      ├─ WhiteSpaceNode: " " (1:55-1:56, 54-55)
      ├─ WordNode[1] (1:56-1:62, 55-61)
      │  └─ TextNode: "worker" (1:56-1:62, 55-61)
      └─ PunctuationNode: "." (1:62-1:63, 61-62)

API

This package exports the following identifiers: ParseEnglish. There is no default export.

parse-english has the same API as parse-latin.

Algorithm

All of parse-latin is included, and the following support for the English natural language:

  • Unit abbreviations (tsp., tbsp., oz., ft., and more)
  • Time references (sec., min., tues., thu., feb., and more)
  • Business Abbreviations (Inc. and Ltd.)
  • Social titles (Mr., Mmes., Sr., and more)
  • Rank and academic titles (Dr., Rep., Gen., Prof., Pres., and more)
  • Geographical abbreviations (Ave., Blvd., Ft., Hwy., and more)
  • American state abbreviations (Ala., Minn., La., Tex., and more)
  • Canadian province abbreviations (Alta., Qué., Yuk., and more)
  • English county abbreviations (Beds., Leics., Shrops., and more)
  • Common elision (omission of letters) (’n’, ’o, ’em, ’twas, ’80s, and more)

License

MIT © Titus Wormer

Keywords

FAQs

Last updated on 01 Apr 2021

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc