compromise-dates

plugin for nlp-compromise


date-parsing plugin for compromise
npm install compromise-dates

This library is an earnest attempt to get date information out of text in a clear way, including all informal text formats and folksy shorthands.

import nlp from 'compromise'
import datePlugin from 'compromise-dates'
nlp.plugin(datePlugin)

let doc = nlp('the second monday of february')
doc.dates().get()[0]
/*
  { start: '2021-02-08T00:00:00.000Z', end: '2021-02-08T23:59:59.999Z'}
*/

  • Tokenization and disambiguation with compromise.
  • Timezone and DST reckoning with spacetime.
  • Number-parsing with compromise-numbers.
  • Timezone reconciliation with spacetime-informal.

Things it does well:

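| explicit-dates | description | Start | End |
| --- | --- | --- | --- |
| march 2nd | | March 2, 12:00am | March 2, 11:59pm |
| 2 march | | '' | '' |
| tues march 2 | | '' | '' |
| march the second | natural-language number | '' | '' |
| on the 2nd | implicit months | '' | '' |
| tuesday the 2nd | date-reckoning | '' | '' |

| numeric-dates | description | Start | End |
| --- | --- | --- | --- |
| 2020/03/02 | iso formats | '' | '' |
| 2020-03-02 | | '' | '' |
| 03-02-2020 | british formats | '' | '' |
| 03/02 | | '' | '' |
| 2020.08.13 | alt-ISO | '' | '' |

| named-dates | description | Start | End |
| --- | --- | --- | --- |
| today | | - | - |
| tomorrow | | '' | '' |
| christmas eve | calendar-holidays | Dec 24, 12:00am | Dec 24, 11:59pm |
| easter | astronomical holidays | -depends- | - |
| q1 | | Jan 1, 12:00am | Mar 31, 11:59pm |

| times | description | Start | End |
| --- | --- | --- | --- |
| 2pm | | '' | '' |
| 2:12pm | | '' | '' |
| 2:12 | | '' | '' |
| 02:12:00 | weird iso-times | '' | '' |
| two oclock | written formats | '' | '' |
| before 1 | | '' | '' |
| noon | | '' | '' |
| at night | informal daytimes | '' | '' |
| in the morning | | '' | '' |
| tomorrow evening | | '' | '' |

| timezones | description | Start | End |
| --- | --- | --- | --- |
| eastern time | informal zone support | '' | '' |
| est | TZ shorthands | '' | '' |
| peru time | | '' | '' |
| ..in beirut | by location | '' | '' |
| GMT+9 | by UTC/GMT offset | '' | '' |
| -4h | '' | '' | '' |
| Canada/Eastern | IANA codes | '' | '' |

| relative durations | description | Start | End |
| --- | --- | --- | --- |
| this march | | '' | '' |
| this week | | '' | '' |
| this sunday | | '' | '' |
| next april | | '' | '' |
| this past year | | '' | '' |
| second week of march | | '' | '' |
| last weekend of march | | '' | '' |
| last spring | | '' | '' |
| the saturday after next | | '' | '' |

| punted dates | description | Start | End |
| --- | --- | --- | --- |
| in seven weeks | now+duration | '' | '' |
| two days after june 6th | date+duration | '' | '' |
| 2 weeks from now | | '' | '' |
| 2 weeks after june | | '' | '' |
| 2 years, 4 months, and 5 days ago | complex durations | '' | '' |
| a week and a half before | written-out numbers | '' | '' |
| a week friday | idiom format | '' | '' |

| start/end | description | Start | End |
| --- | --- | --- | --- |
| end of the week | up-against the ending | '' | '' |
| start of next year | lean-toward starting | '' | '' |
| middle of q2 last year | rough-center calculation | '' | '' |

| date-ranges | description | Start | End |
| --- | --- | --- | --- |
| between june and july | explicit ranges | '' | '' |
| from today to next haloween | | '' | '' |
| aug 1 - aug 31 | dash-ranges | '' | '' |
| 22-23 February | | '' | '' |
| today to next friday | | '' | '' |
| during june | | '' | '' |
| aug to june 1999 | shared range info | '' | '' |
| before [2019] | up-to a date | '' | '' |
| by march | | '' | '' |
| after february | date-to-infinity | '' | '' |

| repeating-intervals | description | Start | End |
| --- | --- | --- | --- |
| any wednesday | n-repeating dates | | |
| any day in June | repeating-date in range | June 1 ... | .. June 30 |
| any wednesday this week | | '' | '' |
| weekends in July | more-complex interval | '' | '' |
| every weekday until February | interval until date | '' | '' |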

Things it does awkwardly:

| hmmm, | description | Start | End |
| --- | --- | --- | --- |
| middle of 2019/June | tries to find the sorta-center | June 15 | '' |
| good friday 2025 | tries to reckon astronomically-set holidays | '' | '' |
| Oct 22 1975 2am in PST | historical DST changes (assumes current dates) | '' | '' |

Things it doesn't do:

| 😓, | description | Start | End |
| --- | --- | --- | --- |
| not this Saturday, but the Saturday after | self-reference logic | '' | '' |
| 3 years ago tomorrow | folksy short-hand | '' | '' |
| 2100 | military time formats | '' | '' |
| may 97 | 'bare' 2-digit years | '' | '' |

API

Configuration:

.dates() accepts an optional object that lets you set the context for the date parsing.

const context = {
  timezone: 'Canada/Eastern', //the default timezone is 'ETC/UTC'
  today: '2020-02-20', //the implicit, or reference day/year
  punt: { weeks: 2 }, // the implied duration to use for 'after june 2nd'
  dayStart: '8:00am',
  dayEnd: '5:30pm',
}

nlp('in two days').dates(context).get()
/*
  [{ start: '2020-02-22T08:00:00.000-05:00', end: '2020-02-22T17:30:00.000-05:00' }]
*/

Opinions:

Start of week:

By default, weeks start on a Monday, and 'next week' will run from Monday morning to Sunday night. This can be configured in spacetime, but right now we are not passing this config through.
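
To make the Monday-to-Sunday convention concrete, here is a rough, dependency-free sketch. `nextWeekRange` is a made-up helper for illustration, not a function from compromise-dates or spacetime, and it works in UTC only.

```javascript
// Hypothetical helper (not part of compromise-dates or spacetime).
// Returns the Monday-to-Sunday range of the week after `today`, in UTC.
function nextWeekRange(today) {
  const DAY = 24 * 60 * 60 * 1000
  // getUTCDay(): 0 = Sunday ... 6 = Saturday; shift so that Monday = 0
  const daysSinceMonday = (today.getUTCDay() + 6) % 7
  const midnight = Date.UTC(today.getUTCFullYear(), today.getUTCMonth(), today.getUTCDate())
  const thisMonday = midnight - daysSinceMonday * DAY
  const start = new Date(thisMonday + 7 * DAY)
  // the range ends on the last millisecond of Sunday
  const end = new Date(start.getTime() + 7 * DAY - 1)
  return { start, end }
}

const week = nextWeekRange(new Date('2020-02-20T12:00:00Z')) // a Thursday
console.log(week.start.toISOString()) // 2020-02-24T00:00:00.000Z (Monday)
console.log(week.end.toISOString()) // 2020-03-01T23:59:59.999Z (Sunday)
```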

Implied durations:

'after October' returns a range starting Nov 1st and ending 2 weeks later, by default. This can be configured by setting the punt param in the context object:

doc.dates({ punt: { month: 1 } })
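
As a rough illustration of the idea (not the library's code), an open-ended phrase like 'after October' can be bounded by adding a default duration to the start date. `afterMonth` and the `weeks`/`months`/`days` keys below are this sketch's own assumptions; the units the real punt option accepts may differ.

```javascript
// Hypothetical helper: turn "after <month>" into a bounded range by
// adding a default "punt" duration to the start date (UTC only).
function afterMonth(year, monthIndex, punt = { weeks: 2 }) {
  // the 1st of the following month (Date.UTC rolls month 12 into next year)
  const start = new Date(Date.UTC(year, monthIndex + 1, 1))
  const end = new Date(start.getTime())
  end.setUTCMonth(end.getUTCMonth() + (punt.months || 0))
  end.setUTCDate(end.getUTCDate() + (punt.weeks || 0) * 7 + (punt.days || 0))
  return { start, end }
}

const range = afterMonth(2021, 9) // 'after October' 2021 (months are 0-indexed)
console.log(range.start.toISOString()) // 2021-11-01T00:00:00.000Z
console.log(range.end.toISOString()) // 2021-11-15T00:00:00.000Z
```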

Future bias:

'May 7th' will prefer a May 7th in the future.

The parser will still return a past date, though, if it falls in the current month:

// from march 2nd
nlp('feb 30th').dates({ today: '2021-02-01' }).get()
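
One way to read the future-bias rule is sketched below. `resolveMonthDay` is a hypothetical helper, and the exact tie-breaking the library uses is an assumption here: a date in the current month stays in this year even if it has passed, any other past date moves to next year.

```javascript
// Hypothetical helper, not the library's code: pick a year for a bare month + day.
function resolveMonthDay(today, monthIndex, day) {
  const year = today.getUTCFullYear()
  const todayMidnight = Date.UTC(year, today.getUTCMonth(), today.getUTCDate())
  const candidate = Date.UTC(year, monthIndex, day)
  const inCurrentMonth = monthIndex === today.getUTCMonth()
  // keep this year's date if it has not passed yet, or if it is in the current month
  if (candidate >= todayMidnight || inCurrentMonth) {
    return new Date(candidate)
  }
  // otherwise prefer the same date next year
  return new Date(Date.UTC(year + 1, monthIndex, day))
}

const today = new Date('2021-03-02T00:00:00Z')
console.log(resolveMonthDay(today, 4, 7).toISOString()) // 2021-05-07T00:00:00.000Z
console.log(resolveMonthDay(today, 2, 1).toISOString()) // 2021-03-01T00:00:00.000Z
console.log(resolveMonthDay(today, 1, 7).toISOString()) // 2022-02-07T00:00:00.000Z
```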

This/Next/Last:

Named weeks or months, e.g. 'this/next/last week', are mostly straightforward.

This monday

A bare 'monday' will always refer to itself, or the upcoming monday.

  • Saying 'this monday' on a monday is itself.
  • Saying 'this monday' on a tuesday is next week.

Likewise, 'this june' in June, is itself. 'this june' in any other month, is the nearest June in the future.

Future versions of this library could look at sentence-tense to help disambiguate these dates - 'i paid on monday' vs 'i will pay on monday'.
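
The 'this monday' rule above amounts to "the nearest matching weekday, today included". A minimal sketch, with `thisWeekday` as a made-up name and UTC dates assumed:

```javascript
// Hypothetical helper: weekday uses getUTCDay() numbering (0 = Sunday ... 6 = Saturday).
function thisWeekday(today, weekday) {
  // days until the requested weekday; 0 when today already is that weekday
  const daysAhead = (weekday - today.getUTCDay() + 7) % 7
  return new Date(
    Date.UTC(today.getUTCFullYear(), today.getUTCMonth(), today.getUTCDate() + daysAhead)
  )
}

const MONDAY = 1
// said on a monday: itself
console.log(thisWeekday(new Date('2020-02-17T00:00:00Z'), MONDAY).toISOString()) // 2020-02-17T00:00:00.000Z
// said on a tuesday: monday of next week
console.log(thisWeekday(new Date('2020-02-18T00:00:00Z'), MONDAY).toISOString()) // 2020-02-24T00:00:00.000Z
```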

Last monday

If it's Tuesday, 'last monday' will not mean yesterday.

  • Saying 'last monday' on a tuesday will be -1 week.
  • Saying 'a week ago monday' will also work.
  • Saying 'this past monday' will return yesterday.

For reference, the Wit.ai and chronic libraries both return yesterday. Natty and SugarJs return -1 week, like we do.

'last X' can be less than 7 days backward, if it crosses a week starting-point:

  • Saying 'last friday' on a monday will be only a few days back.

Next Friday

If it's Tuesday, 'next wednesday' will not be tomorrow. It will be a week after tomorrow.

  • Saying 'next wednesday' on a tuesday, will be +1 week.
  • Saying 'a week wednesday' will also be +1 week.
  • Saying 'this coming wednesday' will be tomorrow.

For reference, Wit.ai, chronic, and Natty libraries all return tomorrow. SugarJs returns +1 week, like we do.
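
The 'last X' and 'next X' examples are consistent with picking the weekday inside the previous or following Monday-first calendar week. That reading is an assumption, and `weekdayInWeek` is a hypothetical helper, but it reproduces the cases listed above:

```javascript
// Hypothetical helper: find `weekday` in the calendar week that is
// `weekOffset` weeks away from today's week (weeks start on Monday, UTC).
function weekdayInWeek(today, weekday, weekOffset) {
  // getUTCDay() numbering is 0 = Sunday ... 6 = Saturday; shift so that Monday = 0
  const daysSinceMonday = (today.getUTCDay() + 6) % 7
  const target = (weekday + 6) % 7
  const shift = weekOffset * 7 + target - daysSinceMonday
  return new Date(
    Date.UTC(today.getUTCFullYear(), today.getUTCMonth(), today.getUTCDate() + shift)
  )
}

const tuesday = new Date('2020-02-18T00:00:00Z')
const monday = new Date('2020-02-17T00:00:00Z')
// 'last monday' on a tuesday: not yesterday, but a week before it
console.log(weekdayInWeek(tuesday, 1, -1).toISOString()) // 2020-02-10T00:00:00.000Z
// 'next wednesday' on a tuesday: a week after tomorrow
console.log(weekdayInWeek(tuesday, 3, 1).toISOString()) // 2020-02-26T00:00:00.000Z
// 'last friday' on a monday: only three days back
console.log(weekdayInWeek(monday, 5, -1).toISOString()) // 2020-02-14T00:00:00.000Z
```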

Nth Week:

The first week of a month, or a year, is the first week with a thursday in it. This is a weird, but widely-held standard. I believe it's a military formalism. It cannot be (easily) configured. This means that the start-date for the first week of January may be a Monday in December, etc.

As expected, first monday of January will always be in January.
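
The "first week with a thursday in it" rule can be sketched in a few lines. `firstWeekStart` is a hypothetical helper that assumes Monday-first weeks and UTC dates; it is not taken from the library.

```javascript
// Hypothetical helper: start (Monday) of the first week of a month,
// where the first week is the one containing the month's first Thursday.
function firstWeekStart(year, monthIndex) {
  const first = new Date(Date.UTC(year, monthIndex, 1))
  // days from the 1st to the first Thursday (getUTCDay() === 4)
  const toThursday = (4 - first.getUTCDay() + 7) % 7
  // the Monday of that week is three days before the Thursday;
  // Date.UTC accepts day values below 1 and rolls back into the prior month
  return new Date(Date.UTC(year, monthIndex, 1 + toThursday - 3))
}

console.log(firstWeekStart(2021, 0).toISOString()) // 2021-01-04T00:00:00.000Z
console.log(firstWeekStart(2020, 0).toISOString()) // 2019-12-30T00:00:00.000Z (a Monday in December)
```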

British/American ambiguity:

By default, we use the same interpretation of dates as javascript does - we assume 01/02/2020 is Jan 2nd (US version), but allow 13/01/2020 to be Jan 13th (UK version). This should be possible to configure in the near future.
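
A small sketch of that fallback, assuming a strict `a/b/yyyy` input: `parseNumericDate` is a made-up name, and it only flips to day/month when the first number cannot be a month.

```javascript
// Hypothetical helper: month/day/year by default, day/month/year
// only when the first number is too large to be a month.
function parseNumericDate(str) {
  const [a, b, year] = str.split('/').map(Number)
  const [month, day] = a > 12 ? [b, a] : [a, b]
  return new Date(Date.UTC(year, month - 1, day))
}

console.log(parseNumericDate('01/02/2020').toISOString()) // 2020-01-02T00:00:00.000Z
console.log(parseNumericDate('13/01/2020').toISOString()) // 2020-01-13T00:00:00.000Z
```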

Seasons:

By default, 'this summer' will return June 1 - Sept 1, which is northern hemisphere ISO. Configuring the default hemisphere should be possible in the future.

Day times:

There are some hardcoded times for 'lunch time' and others, but mainly, a day begins at 12:00am and ends at 11:59pm - the last millisecond of the day.

Invalid dates:

compromise will tag anything that looks like a date, but not validate the dates until they are parsed.

  • 'january 34th 2020' will return Jan 31 2020.
  • 'tomorrow at 2:62pm' will just return 'tomorrow'.
  • '6th week of february' will return the 2nd week of march.
  • Setting an hour that's skipped, or repeated by a DST change will return the closest valid time to the DST change.
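
The first case above ('january 34th 2020' becoming Jan 31) looks like clamping the day to the length of the month. A minimal sketch of that one behaviour, with `clampDay` as a hypothetical helper (the week rollover and DST cases are not covered):

```javascript
// Hypothetical helper: clamp an out-of-range day to the month's valid days.
function clampDay(year, monthIndex, day) {
  // day 0 of the next month is the last day of this month
  const lastDay = new Date(Date.UTC(year, monthIndex + 1, 0)).getUTCDate()
  const safeDay = Math.min(Math.max(day, 1), lastDay)
  return new Date(Date.UTC(year, monthIndex, safeDay))
}

console.log(clampDay(2020, 0, 34).toISOString()) // 2020-01-31T00:00:00.000Z
console.log(clampDay(2021, 1, 30).toISOString()) // 2021-02-28T00:00:00.000Z
```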

Inclusive/exclusive ranges:

'between january and march' will include all of march. This is usually pretty ambiguous.
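
In other words, the end of the range is the last millisecond of the closing month. A rough sketch, with `monthRange` as a hypothetical helper working in UTC:

```javascript
// Hypothetical helper: an inclusive month-to-month range.
function monthRange(year, startMonth, endMonth) {
  const start = new Date(Date.UTC(year, startMonth, 1))
  // one millisecond before the first day of the month after `endMonth`
  const end = new Date(Date.UTC(year, endMonth + 1, 1) - 1)
  return { start, end }
}

const janToMarch = monthRange(2021, 0, 2) // 'between january and march'
console.log(janToMarch.start.toISOString()) // 2021-01-01T00:00:00.000Z
console.log(janToMarch.end.toISOString()) // 2021-03-31T23:59:59.999Z
```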

Date greediness:

This library makes no assumptions about the input text, and is careful to avoid false-positive dates. If you know your text is a date, you can crank-up the date-tagger with a compromise-plugin, like so:

nlp.extend(function (Doc, world) {
  // ambiguous words
  world.addWords({
    weds: 'WeekDay',
    wed: 'WeekDay',
    sat: 'WeekDay',
    sun: 'WeekDay',
  })
  world.postProcess(doc => {
    // tag '2nd quarter' as a date
    doc.match('#Ordinal quarter').tag('#Date')
    // tag '2/2' as a date (not a fraction)
    doc.match('/[0-9]{1,2}/[0-9]{1,2}/').tag('#Date')
  })
})

Misc:

  • 'thursday the 16th' - will set to the 16th, even if it's not thursday
  • 'in a few hours/years' - in 2 hours/years
  • 'jan 5th 2008 to Jan 6th the following year' - date-range explicit references
  • 'half past 5' - assumed to be pm (5:30pm)

About:

1 - Regular-expressions are too-brittle to parse dates.
2 - Neural-nets are too-wonky to parse dates.
3 - A corporation, or startup is the wrong place to build a universal date-parser.

Parsing dates, times, durations, and intervals from natural language can be a solved-problem.

A rule-based, community open-source library - one based on simple NLP - is the best way to build a natural language date parser - commercial, or otherwise - for the frontend, or the backend.

The match-syntax is effective and easy, javascript is prevailing, and the more people who contribute, the better.

MIT licenced

Last updated on 16 Feb 2024
