A spaCy extension for enhanced date and number entity recognition and extraction as structured data.
Date spaCy is a collection of custom spaCy pipeline components that lets you identify date entities in a text and retrieve the parsed date values through spaCy's custom extension attributes. It uses regular expressions to find dates and then uses the dateparser library to convert those dates into structured datetime data. One current limitation is that if no year is given, the current year is assumed. The dateparser output is stored in a custom entity extension: ._.date.
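The find-then-parse idea described above can be illustrated with a minimal standard-library sketch. It is not the package's implementation: the pattern `DATE_PATTERN`, the helper `find_dates_sketch`, and the `today` parameter are hypothetical names introduced here, and plain `datetime` stands in for dateparser. It only handles day-month dates with an optional year, and it reproduces the stated limitation by falling back to the current year.

```python
import re
from datetime import datetime

# Month names mapped to month numbers (assumption: English names only).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Hypothetical pattern: "25 August 2023", "25th August 2023", or "10 September".
DATE_PATTERN = re.compile(
    r"\b(?P<day>\d{1,2})(?:st|nd|rd|th)?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")"
    r"(?:\s+(?P<year>\d{4}))?\b",
    re.IGNORECASE,
)


def find_dates_sketch(text, today=None):
    """Return (matched_text, datetime) pairs for simple day-month dates."""
    today = today or datetime.now()
    results = []
    for match in DATE_PATTERN.finditer(text):
        # No year in the text: assume the current year, as the package does.
        year = int(match.group("year")) if match.group("year") else today.year
        month = MONTHS[match.group("month").lower()]
        try:
            parsed = datetime(year, month, int(match.group("day")))
        except ValueError:
            continue  # skip impossible dates such as "31 February"
        results.append((match.group(0), parsed))
    return results


for matched_text, parsed in find_dates_sketch("We meet on 10 September."):
    print(f"Text: {matched_text} -> Parsed Date: {parsed}")
```

Passing a fixed `today` makes the year fallback reproducible, which is useful when processing older documents.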
This lightweight approach can be added to an existing spaCy pipeline or to a blank model. If you use it in an existing spaCy pipeline, be sure to add it before the NER model.
To install date_spacy, simply run:
pip install date-spacy
First, you'll need to import the find_dates component and add it to your spaCy pipeline:
import spacy
from date_spacy import find_dates
# Load your desired spaCy model
nlp = spacy.blank('en')
# Add the component to the pipeline
nlp.add_pipe('find_dates')
After adding the component, you can process text as usual:
doc = nlp("""The event is scheduled for 25th August 2023.
We also have a meeting on 10 September and another one on the twelfth of October and a
final one on January fourth.""")
You can iterate over the entities in the doc and access the special date extension:
for ent in doc.ents:
    if ent.label_ == "DATE":
        print(f"Text: {ent.text} -> Parsed Date: {ent._.date}")
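This will output the following (dates without a year resolve to the current year, which was 2023 when this example was run):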
Text: 25th August 2023 -> Parsed Date: 2023-08-25 00:00:00
Text: 10 September -> Parsed Date: 2023-09-10 00:00:00
Text: twelfth of October -> Parsed Date: 2023-10-12 00:00:00
Text: January fourth -> Parsed Date: 2023-01-04 00:00:00
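Once the dates are parsed, they can be exported as structured records. The sketch below assumes that `ent._.date` holds a standard `datetime` object, as the printed output suggests; `dates_to_json` and `example_pairs` are hypothetical names used only for illustration, and the sample pairs are copied from the output above rather than produced by the pipeline.

```python
import json
from datetime import datetime


def dates_to_json(pairs):
    """Serialize (text, datetime) pairs as JSON records with ISO 8601 dates."""
    records = [
        {"text": text, "date": parsed.date().isoformat()}
        for text, parsed in pairs
    ]
    return json.dumps(records, indent=2)


# In real use, the pairs would be built from the pipeline output, e.g.
# [(ent.text, ent._.date) for ent in doc.ents if ent.label_ == "DATE"].
example_pairs = [
    ("25th August 2023", datetime(2023, 8, 25)),
    ("10 September", datetime(2023, 9, 10)),
]

print(dates_to_json(example_pairs))
```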
We found that date-spacy demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.