
A Python implementation of Lunr.js by Oliver Nightingale.
A bit like Solr, but much smaller and not as bright.
This Python version of Lunr.js aims to bring simple and powerful full-text search capabilities to Python, with results as close to the original implementation as possible.
Lunr is a simple full-text search solution for situations where deploying a full-scale solution like Elasticsearch isn't possible or viable, or where you're simply prototyping. Lunr parses a set of documents and creates an inverted index for quick full-text searches, in the same way as other, more complicated solutions.
The trade-off is that Lunr keeps the inverted index in memory and requires you to recreate or read the index at the start of your application.
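To make the idea of an in-memory inverted index concrete, here is a minimal sketch using only the standard library. It is not Lunr's actual pipeline: there is no stemming, stop-word filtering, or scoring, and the helper name `build_inverted_index` is hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents, ref, fields):
    """Map each lowercased token to the set of document refs containing it.

    Illustrative only: real Lunr also stems tokens and records the data
    needed to score matches.
    """
    index = defaultdict(set)
    for doc in documents:
        for field in fields:
            for token in doc[field].lower().split():
                # Strip trailing punctuation so "study." and "study" match.
                index[token.strip(".,")].add(doc[ref])
    return index


documents = [
    {"id": "a", "body": "Mr. Green killed Colonel Mustard in the study."},
    {"id": "b", "body": "Professor Plumb has a green plant in his study"},
]
index = build_inverted_index(documents, ref="id", fields=("body",))
print(sorted(index["green"]))  # ['a', 'b']
print(sorted(index["plant"]))  # ['b']
```

Because the whole mapping lives in a Python dict, lookups are fast, but the structure has to be rebuilt or loaded every time the process starts, which is the trade-off described above.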
A core objective of Lunr.py is to provide interoperability with the JavaScript version.
An example can be found in the MkDocs documentation library. MkDocs produces a set of documents from the pages of the documentation and uses Lunr.js in the frontend to power its built-in search engine. This set of documents takes the form of a JSON file, which Lunr.js must fetch and parse to create the inverted index when your application starts.
While this is not a problem for most sites, depending on the size of your document set, this can take some time.
Lunr.py provides a backend solution, allowing you to parse the documents in Python ahead of time and create a serialized Lunr.js index that the browser version can read, minimizing the startup time of your application.
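A minimal sketch of that ahead-of-time step might look like the following. It assumes the `Index.serialize()` method described in the lunr.py documentation, which returns a JSON-compatible dict. The function name `write_serialized_index` and the output path are hypothetical, chosen only for illustration.

```python
import json


def write_serialized_index(documents, path, ref="id", fields=("title", "body")):
    """Build a Lunr index in Python and write it to `path` as JSON.

    Hypothetical helper: only lunr() and Index.serialize() come from lunr.py.
    """
    # Imported here so the module can be loaded even where lunr is missing.
    from lunr import lunr

    idx = lunr(ref=ref, fields=fields, documents=documents)
    with open(path, "w") as fd:
        # serialize() is assumed to return a plain dict that json can dump.
        json.dump(idx.serialize(), fd)
    return idx
```

Calling `write_serialized_index(documents, "idx.json")` during a build would produce a file the frontend can fetch. On the browser side, Lunr.js should be able to read the parsed JSON with `lunr.Index.load`, so the index does not have to be rebuilt on every page load.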
Each version of lunr.py targets a specific version of lunr.js and produces the same results for a non-trivial corpus of documents.
pip install lunr
Optional, experimental support for other languages, built on the Natural Language Toolkit stemmers, is also available via
pip install lunr[languages]
Use of the language feature is subject to NLTK corpus licensing clauses.
Please refer to the documentation page on languages for more information.
First, you'll need a list of dicts representing the documents you want to search on. These documents must have a unique field which will serve as a reference and a series of fields you'd like to search on.
Lunr provides a convenience lunr function to quickly index this set of documents:
>>> from lunr import lunr
>>>
>>> documents = [{
... 'id': 'a',
... 'title': 'Mr. Green kills Colonel Mustard',
... 'body': 'Mr. Green killed Colonel Mustard in the study with the candlestick.',
... }, {
... 'id': 'b',
... 'title': 'Plumb waters plant',
... 'body': 'Professor Plumb has a green plant in his study',
... }]
>>> idx = lunr(
... ref='id', fields=('title', 'body'), documents=documents
... )
>>> idx.search('kill')
[{'ref': 'a', 'score': 0.6931722372559913, 'match_data': <MatchData "kill">}]
>>> idx.search('study')
[{'ref': 'b', 'score': 0.23576799568081389, 'match_data': <MatchData "studi">}, {'ref': 'a', 'score': 0.2236629211724517, 'match_data': <MatchData "studi">}]
Please refer to the documentation for more usage examples.