maven

Text Processing

org.apache.commons:commons-text

Apache Commons Text is a set of utility functions and reusable components for processing and manipulating text in a Java environment.

1.14.0 • 2 months ago

edu.stanford.nlp:stanford-parser

Stanford Parser processes raw text in English, Chinese, German, Arabic, and French, and extracts constituency parse trees.

3.9.2 • 7 years ago

org.apache.jena:jena-text

The Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are characterized by a collaborative, consensus based development process, an open and pragmatic software license, and a desire to create high quality software that leads the way in its field. We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.

5.5.0 • 3 months ago

com.twitter:twitter-text

Text processing routines for Twitter Tweets

1.14.7 • 8 years ago

com.univocity:univocity-parsers

univocity's open source parsers for processing different text formats using a consistent API

2.9.1 • 5 years ago

org.nlpcn:nlp-lang

Basic text NLP processing utilities.

1.7.9 • 4 years ago

cc.mallet:mallet

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

2.0.8 • 9 years ago

com.twitter.twittertext:twitter-text

Text processing routines for Twitter Tweets

3.1.0 • 6 years ago

com.twitter.penguin:korean-text

Scala library to process Korean text

4.4.4 • 9 years ago

org.apache.seatunnel:seatunnel-format-text

Production ready big data processing product based on Apache Spark and Apache Flink.

2.3.12 • 4 weeks ago

de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.text-asl

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework.

1.10.0 • 7 years ago

org.openkoreantext:open-korean-text

Scala/Java library to process Korean text

2.3.1 • 7 years ago

org.dkpro.core:dkpro-core-io-text-asl

DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework.

2.5.0 • last year

org.modeshape:modeshape-sequencer-text

ModeShape Sequencer that processes fixed width and delimited text files

5.4.1.Final • 8 years ago

com.github.sanskrit-coders:indic-transliteration

A collection of Scala and Java classes for basic character-level processing of Sanskrit and other Indic languages (Kannada, Telugu, etc.), contributed by the open source sanskrit-coders projects and friends. Some notable facilities: transliterating text from one script or encoding scheme to another, and some grammar simulation. Examples: see https://github.com/sanskrit-coders/indic-transliteration. Contributions and suggestions are invited at https://github.com/sanskrit-coders/indic-transliteration. (Sister projects there may also be of interest.)

1.6 • 8 years ago

org.opencastproject:opencast-speech-to-text-api

Opencast is a media capture, processing, management and distribution system

16.10 • 6 months ago

com.ibm.watson.developer_cloud:natural-language-understanding

Natural language processing for advanced text analysis

6.14.2 • 6 years ago

uk.ac.gate:gate-core

GATE - general architecture for text engineering - is open source software capable of solving almost any text processing problem. This artifact enables you to embed the core GATE Embedded with its essential dependencies. You will be able to use the GATE Embedded API and load and store GATE XML documents. This artifact is the perfect dependency for CREOLE plugins or for applications that need to customize the GATE dependencies due to conflicts with their own dependencies or to reduce their footprint.

9.0.1 • 5 years ago

com.github.steveash.mallet:mallet

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

2.0.12 • 8 years ago

org.dkpro.similarity:dkpro-similarity-algorithms-api-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

edu.ucla.sspace:sspace-wordsi

The S-Space Package is a collection of algorithms for building Semantic Spaces as well as a highly-scalable library for designing new distributional semantics algorithms. Distributional algorithms process text corpora and represent the semantics of words as high-dimensional feature vectors. This package also includes matrices, vectors, and numerous clustering algorithms. These approaches are known by many names, such as word spaces, semantic spaces, or distributed semantics, and rest upon the Distributional Hypothesis: words that appear in similar contexts have similar meanings.

2.0 • 13 years ago

com.johnsnowlabs.nlp:spark-nlp_2.12

Spark NLP is an open-source text processing library for advanced natural language processing.

6.1.4 • last week

org.dkpro.similarity:dkpro-similarity-uima-core-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

com.github.rrodriguessilico:mallet

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

2.0.8-RC3-Unofficial • 10 years ago

com.twitter.penguin:korean-text-scala-2.10

Scala library to process Korean text

4.2.0 • 10 years ago

org.openimaj:nlp

The OpenIMAJ NLP Library contains a text pre-processing pipeline which goes from raw unstructured text to part of speech tagged stemmed text.

1.3.10 • 6 years ago

com.johnsnowlabs.nlp:spark-nlp-gpu_2.12

Spark NLP is an open-source text processing library for advanced natural language processing.

6.1.4 • last week

org.dkpro.similarity:dkpro-similarity-algorithms-lexical-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.dkpro.similarity:dkpro-similarity-uima-api-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.dkpro.similarity:dkpro-similarity-algorithms-vsm-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.apache.stanbol:org.apache.stanbol.commons.solr.extras.kuromoji

This provides a bundle for processing Japanese text with Lucene.

1.0.0 • 9 years ago

org.dkpro.similarity:dkpro-similarity-ml-core-gpl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

com.johnsnowlabs.nlp:spark-nlp-aarch64_2.12

Spark NLP is an open-source text processing library for advanced natural language processing.

6.1.4 • last week

de.tudarmstadt.langtech.substituter.twsi:de.tudarmstadt.langtech.substituter.twsi

This is software to produce lexical substitutions in context for over 1000 frequent nouns. The software processes English text.

1.0.2 • 12 years ago

org.dkpro.similarity:dkpro-similarity-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.dkpro.similarity:dkpro-similarity-uima-core-gpl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.dkpro.similarity:dkpro-similarity-uima-io-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.fcrepo:modeshape-sequencer-text

ModeShape Sequencer that processes fixed width and delimited text files

5.5.1.fcr • 4 years ago

org.dkpro.similarity:dkpro-similarity-uima-data-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.dkpro.similarity:dkpro-similarity-ml-io-gpl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.dkpro.similarity:dkpro-similarity-algorithms-structure-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.dkpro.similarity:dkpro-similarity-algorithms-style-asl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.dkpro.similarity:dkpro-similarity-algorithms-ml-gpl

DKPro Similarity is an open source framework for text similarity. Our goal is to provide a comprehensive repository of text similarity measures which are implemented using standardized interfaces. The framework is designed to complement DKPro Core, a collection of software components for natural language processing based on the Apache UIMA framework.

2.3.0 • 7 years ago

org.opencastproject:opencast-speech-to-text-remote

Opencast is a media capture, processing, management and distribution system

16.10 • 6 months ago

com.github.nyla-solutions:nyla.solutions.core

This Java API provides support for application utilities (application configuration, data encryption, debugger, text processing, and more).

2.3.3 • 2 months ago

org.scify:JInsect

The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.

1.1 • 6 years ago

org.openimaj:openimaj

OpenIMAJ (Open Intelligent Multimedia in Java) is a collection of libraries and tools for multimedia analysis written in the Java programming language. OpenIMAJ intends to be the first truly complete multimedia analysis library and contains modules for analysing images, videos, text, audio and even webpages. The OpenIMAJ image and video analysis and feature extraction modules contain methods for processing visual content and extracting state-of-the-art features, including SIFT. The OpenIMAJ clustering and nearest-neighbour libraries contain efficient, multi-threaded implementations of clustering algorithms including Hierarchical K-Means and Approximate K-Means. The clustering library makes it possible to easily create visual-bag-of-words representations for images and video with very large vocabularies. The text-analysis modules contain implementations of a statistical language classifier and low-level processing pipeline. A number of modules deal with content creation, including interactive slideshows and animations. The hardware integration modules allow cross-platform integration with devices including webcams, the Microsoft Kinect, and even devices such as GPS receivers. OpenIMAJ also incorporates a number of tools to enable extremely-large-scale multimedia analysis using a distributed computing approach based on Apache Hadoop.

1.3.10 • 6 years ago

org.apache.hyracks.examples.text:textserver

The Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are characterized by a collaborative, consensus based development process, an open and pragmatic software license, and a desire to create high quality software that leads the way in its field. We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.

0.3.9 • 2 years ago

org.apache.hyracks:text-example

The Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are characterized by a collaborative, consensus based development process, an open and pragmatic software license, and a desire to create high quality software that leads the way in its field. We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.

0.3.9 • 2 years ago

com.textrazor:textrazor

Official Java SDK for the TextRazor Text Analytics API. TextRazor offers state-of-the-art natural language processing tools through a simple API, allowing you to build semantic technology into your applications in minutes. Hundreds of applications rely on TextRazor to understand unstructured text across a range of verticals, with use cases including social media monitoring, enterprise search, recommendation systems and ad targeting.

1.0.14 • 8 months ago
