text-annotator

A JavaScript library for locating and annotating plain text in HTML

Version: 0.9.7 (latest)


News

We released a new version of text-annotator, text-annotator-v2, with some improvements and breaking changes. This library is still maintained here and remains stable in our product, but you are welcome to try text-annotator-v2, which is intended to be more robust and lightweight.

Introduction

text-annotator is a JavaScript library for annotating plain text in HTML.
The annotation process has two steps:

  1. Search: Search for a piece of plain text in the HTML. If it is found, store its location under an index and return that index for later annotation.
  2. Annotate: Annotate the found text given its index.

Splitting annotation into search and annotate allows more flexibility: for example, you can search for all pieces of text first and annotate them later when required (e.g., when a button is clicked). There is also a function that combines the two steps, shown in the "An example of the usage" section.

text-annotator can be used in the browser or on a Node.js server.
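
The two-step flow can be illustrated with a minimal, self-contained sketch. It works on plain text only (no HTML tags) and is not text-annotator's implementation; the class name `TinyAnnotator` and its internals are hypothetical, chosen only to mirror the search-then-highlight shape described above.

```javascript
// Hypothetical sketch of the search-then-annotate idea on plain text.
// This is NOT text-annotator's implementation; it ignores HTML tags entirely.
class TinyAnnotator {
  constructor(content) {
    this.content = content
    this.locations = [] // locations[i] = { start, end } for highlight index i
  }

  // Step 1: find the text, remember where it is, and return an index (or -1).
  search(str) {
    const start = this.content.indexOf(str)
    if (start === -1) return -1
    this.locations.push({ start, end: start + str.length })
    return this.locations.length - 1
  }

  // Step 2: wrap the stored location in an annotation tag, whenever needed.
  highlight(index) {
    const { start, end } = this.locations[index]
    return (
      this.content.slice(0, start) +
      `<span id="highlight-${index}" class="highlight">` +
      this.content.slice(start, end) +
      '</span>' +
      this.content.slice(end)
    )
  }
}

const tiny = new TinyAnnotator('Europe PMC is developed by EMBL-EBI.')
const index = tiny.search('EMBL-EBI') // location stored, nothing annotated yet
const annotated = index !== -1 ? tiny.highlight(index) : tiny.content
console.log(annotated)
```

Because `search` only records a location, the call to `highlight` can be deferred, for instance until a button is clicked.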

Import

Install it via npm:

npm install --save text-annotator

import TextAnnotator from 'text-annotator'

Or include it in the head tag:

<script src="public/js/text-annotator.min.js"></script>

An example of the usage

// below is the HTML
/*
<div id="content">
  <p><b>Europe PMC</b> is an <i>open science platform</i> that enables access to a worldwide collection of life science publications and preprints from trusted sources around the globe.</p>
  <p>Europe PMC is <i>developed by <b>EMBL-EBI</b></i>. It is a partner of <b>PubMed Central</b> and a repository of choice for many international science funders.</p>
</div>
*/

// create an instance of TextAnnotator
// content is the HTML string within which a piece of text can be annotated
var annotator = new TextAnnotator({content: document.getElementById('content').innerHTML})

// search for 'EMBL-EBI' in the HTML
// if found, store the location of 'EMBL-EBI' and then return the index; otherwise return -1
var highlightIndex = annotator.search('EMBL-EBI')
// highlightIndex = 0

// annotate 'EMBL-EBI' in the HTML
if (highlightIndex !== -1) {
  document.getElementById('content').innerHTML = annotator.highlight(highlightIndex)
  // <span id="highlight-0" class="highlight"> is used to annotate 'EMBL-EBI', see below
/*
<div id="content">
  <p><b>Europe PMC</b> is an <i>open science platform</i> that enables access to a worldwide collection of life science publications and preprints from trusted sources around the globe.</p>
  <p>Europe PMC is <i>developed by <span id="highlight-0" class="highlight"><b>EMBL-EBI</b></span></i>. It is a partner of <b>PubMed Central</b> and a repository of choice for many international science funders.</p>
</div>
*/
}

// search for all occurrences of 'Europe PMC' in the HTML
var highlightIndexes = annotator.searchAll('Europe PMC')
// highlightIndexes = [1, 2]

// annotate all the found occurrences of 'Europe PMC' given their indexes
if (highlightIndexes.length) {
  document.getElementById('content').innerHTML = annotator.highlightAll(highlightIndexes)
  // <span id="highlight-1" class="highlight"> and <span id="highlight-2" class="highlight"> are used to annotate 'Europe PMC', see below
/*
<div id="content">
  <p><span id="highlight-1" class="highlight"><b>Europe PMC</b></span> is an <i>open science platform</i> that enables access to a worldwide collection of life science publications and preprints from trusted sources around the globe.</p>
  <p><span id="highlight-2" class="highlight">Europe PMC</span> is <i>developed by <span id="highlight-0" class="highlight"><b>EMBL-EBI</b></span></i>. It is a partner of <b>PubMed Central</b> and a repository of choice for many international science funders.</p>
</div>
*/
}

// search for and then annotate 'a partner of PubMed Central'
document.getElementById('content').innerHTML = annotator.searchAndHighlight('a partner of PubMed Central').content
// searchAndHighlight returns { content, highlightIndex }
// <span id="highlight-3" class="highlight"> is used to annotate 'a partner of PubMed Central', see below
/*
<div id="content">
  <p><span id="highlight-1" class="highlight"><b>Europe PMC</b></span> is an <i>open science platform</i> that enables access to a worldwide collection of life science publications and preprints from trusted sources around the globe.</p>
  <p><span id="highlight-2" class="highlight">Europe PMC</span> is <i>developed by <span id="highlight-0" class="highlight"><b>EMBL-EBI</b></span></i>. It is <span id="highlight-3" class="highlight">a partner of <b>PubMed Central</b></span> and a repository of choice for many international science funders.</p>
</div>
*/

// remove annotation 'EMBL-EBI' given its index
// the index is 0 as shown above
document.getElementById('content').innerHTML = annotator.unhighlight(highlightIndex)
// annotation <span id="highlight-0" class="highlight"> is removed, see below
/*
<div id="content">
  <p><span id="highlight-1" class="highlight"><b>Europe PMC</b></span> is an <i>open science platform</i> that enables access to a worldwide collection of life science publications and preprints from trusted sources around the globe.</p>
  <p><span id="highlight-2" class="highlight">Europe PMC</span> is <i>developed by <b>EMBL-EBI</b></i>. It is <span id="highlight-3" class="highlight">a partner of <b>PubMed Central</b></span> and a repository of choice for many international science funders.</p>
</div>
*/

// disambiguate one occurrence of 'science', the one within 'international science funders', by providing the prefix and postfix of 'science'
var highlightIndex = annotator.search('science', { prefix: 'international ', postfix: ' funders' })
if (highlightIndex !== -1) {
  document.getElementById('content').innerHTML = annotator.highlight(highlightIndex)
}
// <span id="highlight-4" class="highlight"> is used to annotate 'science' within 'international science funders', see below
/*
<div id="content">
  <p><span id="highlight-1" class="highlight"><b>Europe PMC</b></span> is an <i>open science platform</i> that enables access to a worldwide collection of life science publications and preprints from trusted sources around the globe.</p>
  <p><span id="highlight-2" class="highlight">Europe PMC</span> is <i>developed by <b>EMBL-EBI</b></i>. It is <span id="highlight-3" class="highlight">a partner of <b>PubMed Central</b></span> and a repository of choice for many international <span id="highlight-4" class="highlight">science</span> funders.</p>
</div>
*/

Constructor

new TextAnnotator(options = {content})

Prop | Type | Description
content | string | The HTML string within which a piece of text can be annotated.

Search APIs

search(str, options = {trim, caseSensitive, prefix, postfix})
searchAll(str, options = {trim, caseSensitive, prefix, postfix})

Prop | Type | Description
trim | boolean | Whether to trim the piece of text to be annotated. Default is true.
caseSensitive | boolean | Whether to consider case in search. Default is false.
prefix | string | A string BEFORE the piece of text to be annotated. Default is ''.
postfix | string | A string AFTER the piece of text to be annotated. Default is ''.
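
As a rough illustration of how these four options might interact, here is a plain-text sketch. The function `findWithContext` is hypothetical and is not part of text-annotator; the library itself searches within HTML and may treat edge cases differently. The sketch assumes that prefix and postfix must sit immediately around the target text.

```javascript
// Hypothetical plain-text sketch of the search options; the real library
// works on HTML and may differ in detail.
function findWithContext(text, str, options = {}) {
  const { trim = true, caseSensitive = false, prefix = '', postfix = '' } = options
  const target = trim ? str.trim() : str
  // Match the target only where it sits between prefix and postfix.
  const needle = prefix + target + postfix
  const haystack = caseSensitive ? text : text.toLowerCase()
  const probe = caseSensitive ? needle : needle.toLowerCase()
  const at = haystack.indexOf(probe)
  // Return the offset of the target itself, not of the prefix.
  return at === -1 ? -1 : at + prefix.length
}

const text = 'an open science platform for international science funders'

// No context: the first occurrence wins.
console.log(findWithContext(text, 'science')) // 8

// Context plus default trimming and case-insensitivity picks the second one.
console.log(findWithContext(text, ' Science ', { prefix: 'international ', postfix: ' funders' })) // 43

// Case-sensitive search for 'Science' finds nothing in this text.
console.log(findWithContext(text, 'Science', { caseSensitive: true })) // -1
```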

Annotation APIs

highlight(highlightIndex, options = {highlightTagName, highlightClass, highlightIdPattern})
highlightAll(highlightIndexes, options = {highlightTagName, highlightClass, highlightIdPattern})
unhighlight(highlightIndex, options = {highlightTagName, highlightClass, highlightIdPattern})

Prop | Type | Description
highlightTagName | string | The name of the annotation tag. Default is span so that the tag is <span ...>.
highlightClass | string | The class name of the annotation tag. Default is highlight so that the tag is <span class="highlight" ...>.
highlightIdPattern | string | The ID pattern of the annotation tag. Default is highlight- so that the tag is <span id="highlight-[highlightIndex]" ...>.
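
To make the effect of the three options concrete, the sketch below assembles an opening tag from them. `buildOpeningTag` is a hypothetical helper written for this illustration, not a function exported by text-annotator; the defaults are the ones listed above, and the attribute order follows the example output earlier in this document.

```javascript
// Hypothetical helper showing how the three options combine into a tag.
// The defaults are the ones listed in the options table.
function buildOpeningTag(highlightIndex, options = {}) {
  const {
    highlightTagName = 'span',
    highlightClass = 'highlight',
    highlightIdPattern = 'highlight-'
  } = options
  return `<${highlightTagName} id="${highlightIdPattern}${highlightIndex}" class="${highlightClass}">`
}

const defaultTag = buildOpeningTag(0)
const customTag = buildOpeningTag(3, {
  highlightTagName: 'mark',
  highlightClass: 'note',
  highlightIdPattern: 'note-'
})

console.log(defaultTag) // <span id="highlight-0" class="highlight">
console.log(customTag) // <mark id="note-3" class="note">
```

Since unhighlight accepts the same options, it seems reasonable to pass it the same values that were given to highlight, so that the tag being removed matches the one that was inserted; this is an assumption rather than documented behavior.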

searchAndHighlight API

searchAndHighlight(str, options), where options = { searchOptions, highlightOptions }. searchOptions and highlightOptions are described above in the Search APIs and Annotation APIs sections, respectively. It returns { content, highlightIndex }.
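
A short sketch of how the nested options might be passed. The wrapper `highlightFunderScience` and the class name `funder` are hypothetical; `annotator` is assumed to be any object exposing searchAndHighlight with the signature and return shape described in this document. What the method returns when nothing is found is not stated here, so the sketch does not handle that case.

```javascript
// Hypothetical wrapper around searchAndHighlight. `annotator` is assumed to be
// a TextAnnotator instance (or any object with the same method).
function highlightFunderScience(annotator) {
  const { content, highlightIndex } = annotator.searchAndHighlight('science', {
    // Same keys as the options of search/searchAll
    searchOptions: { prefix: 'international ', postfix: ' funders' },
    // Same keys as the options of highlight/highlightAll
    highlightOptions: { highlightClass: 'funder' }
  })
  return { content, highlightIndex }
}

// Usage in the browser, assuming `annotator` was created as in the example section:
// document.getElementById('content').innerHTML = highlightFunderScience(annotator).content
```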

Examples from Europe PMC

text-annotator has been widely used in Europe PMC, an open science platform that enables access to a worldwide collection of life science publications. Here is a list of examples:

  1. Article title highlighting: https://europepmc.org/search?query=cancer
  2. Snippets: https://europepmc.org/article/PPR/PPR158972 (Visit from https://europepmc.org/search?query=cancer)
  3. SciLite: https://europepmc.org/article/PPR/PPR158972 (Click the Annotations link in the right panel)
  4. Linkback: https://europepmc.org/article/PPR/PPR158972#europepmc-6e6312219dcad15c9a7dda8f71dce9af (In the popup shown in Example 3, click "Share" to get this linkback URL)

Contact

Zhan Huang

Package last updated on 30 Jun 2022
