Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
WSO2 is an open source application development software company focused on providing service-oriented architecture solutions for professional developers.
Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
Apache Solr (module: extraction)
Cloud-Native Headless Content Management Services (CMS) for Spring. Integrates with Spring Data, Spring Data REST and Apache Solr
Solr resources for StormCrawler
Vind is built to enable the integration of search facilities in Java projects without getting too deep into the search topic.
Kylo is an enterprise-ready, open source data lake management software platform for Hadoop and Spark, integrating best practices around metadata management, governance, and security learned from Think Big's 150+ successful big data projects.
Consumer for writing documents to a Solr server
Adds Teiid SOLR translator
java tools
Cloud-Native Headless Content Management Services (CMS) for Spring. Integrates with Spring Data, Spring Data REST and Apache Solr
Apache Solr Server
Snowball Analyzers
Build Information $Id: pom.xml 5365 2010-09-30 00:30:05Z mdiggory $ $URL: https://scm.dspace.org/svn/repo/modules/dspace-discovery/tags/discovery-modules-0.9.2/provider/pom.xml $
Solr Specific Additional Analyzers
java tools
OpenCB commons project contains several Java libs for Bioinformatics
Hadoop Unit
Solr resources for StormCrawler
Kite SDK is a set of libraries, tools, and docs to simplify the development of data-related systems.
SIREn Solr plugin
Cloud-Native Headless Content Management Services (CMS) for Spring. Integrates with Spring Data, Spring Data REST and Apache Solr
Implementation of an annotation engine that links the content item to a set of possible categories from a dedicated Solr index using MoreLikeThis queries. The classification can be either applied to a complete document (text in a given language) which is the default behavior or to a specific portion of the text (using a TextAnnotation).
Implementation of the Solr service API
An implementation of the Logisland datastore API for Chronix
A fork of the Apache Cassandra project that uses Lucene indexes to provide near-real-time search comparable to Elasticsearch or Solr, including full-text search capabilities, multi-dimensional queries, and relevance scoring.
Cloud-Native Headless Content Management Services (CMS) for Spring. Integrates with Spring Data, Spring Data REST and Apache Solr
A lightweight open source Chinese tokenizer that supports keyword, key-sentence, and summary extraction and provides embedding APIs for the latest Lucene, Solr, and Elasticsearch.
testcontainers-scala-solr
Camel-based indexing service for Solr
Vitro semantic web application project
The Eclipse JNoSQL communication layer implementation to Apache Solr
Funktion :: Connector :: Solr
Bobo Solr plugin
Queries - various query object exotica not in core
Solr bindings for tap-room
Provides support for installing Solr indexes using the Apache Sling Installer framework
This is the highlighter for Apache Lucene Java
Box is an add-on framework for Java applications that asynchronously gathers source data, processes or transforms that data into output JSON documents, and exposes those documents through a web API. Box excels at maintaining existing documents, including keeping them up to date in real time and deleting them when required. Box supports aggregating data from multiple sources as well as splitting data into multiple documents. How documents are processed is left up to each application. The web API makes the documents available by ID or by harvesting all documents. Harvests can be filtered using facets, and documents can be pared down by requesting only the needed fields.
A simple Solr JDBC connection holder and a Solr synonym filter loading synonyms from JDBC.
Spell Checker