
GrimoireELK is the component of GrimoireLab that interacts with the ElasticSearch database. Its goal is two-fold: first, it offers a convenient way to store the data coming from Perceval; second, it processes and enriches that data into a format that can be consumed by Kibiter.
The Perceval data is stored in ElasticSearch indexes as raw documents (one per item extracted by Perceval). Those raw documents, referred to as "raw data" in this documentation, include all the information coming from the original data source, which allows the platform to perform multiple analyses without downloading the same data over and over again. Once the raw data is retrieved, a new phase starts in which the data is enriched according to the data source it was collected from and stored in ElasticSearch indexes. The enrichment removes information not needed by Kibiter and adds information that is not directly available in the raw data, for instance pair programming information for Git data, time to solve (i.e., close or merge) issues and pull requests for GitHub data, and identity and organization information coming from SortingHat. The enriched data is stored as JSON documents, which embed information linked to the corresponding raw documents to ease debugging and guarantee traceability.
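The raw-to-enriched step can be illustrated with a minimal sketch. This is not GrimoireELK's actual enrichment code: the function `enrich_issue` and the field names (`uuid`, `origin`, `data`, `created_at`, `closed_at`, `time_to_close_days`) are assumptions chosen for illustration. It only shows the idea described above: drop what is not needed, derive a "time to close" value, and keep a link back to the raw document.

```python
from datetime import datetime


def enrich_issue(raw_item):
    """Build a flat enriched document from one raw item (illustrative only)."""
    data = raw_item["data"]
    created = datetime.fromisoformat(data["created_at"])
    closed_at = data.get("closed_at")

    enriched = {
        # Identifiers copied from the raw document, kept for traceability.
        "uuid": raw_item["uuid"],
        "origin": raw_item["origin"],
        # Only the fields needed for visualization are kept; the body is dropped.
        "title": data["title"],
        "created_at": data["created_at"],
        "closed_at": closed_at,
        # Derived field that is not present in the raw data.
        "time_to_close_days": None,
    }
    if closed_at is not None:
        elapsed = datetime.fromisoformat(closed_at) - created
        enriched["time_to_close_days"] = round(elapsed.total_seconds() / 86400, 2)
    return enriched


# Hypothetical raw item, shaped only for this example.
raw_item = {
    "uuid": "0001",
    "origin": "https://github.com/example/repo",
    "data": {
        "title": "Example issue",
        "body": "Long text that the enriched document does not keep",
        "created_at": "2024-01-01T00:00:00+00:00",
        "closed_at": "2024-01-03T12:00:00+00:00",
    },
}
enriched = enrich_issue(raw_item)
print(enriched["time_to_close_days"])  # 2.5
```

An open issue (no `closed_at`) keeps `time_to_close_days` as `None` in this sketch, so the derived field is always present in the enriched document.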
Each raw document stored in an ElasticSearch index contains a set of common first level fields, regardless of the data source:
Each enriched index includes one or more types of documents, which are summarized below.
Each enriched document contains a set of fields, which can be (i) common to all data sources (e.g., metadata fields, time field), (ii) specific to the data source, (iii) related to the contributor's profile information (i.e., identity fields), or (iv) related to the project listed in the Mordred projects.json (i.e., project fields).
enrich_extra_data study.
Details of the fields of each data source are available in the Schema folder.
There are several ways to install GrimoireELK on your system: from packages, or from the source code using Poetry or pip.
GrimoireELK can be installed using pip, a tool for installing Python packages. To do so, run the following command:
$ pip install grimoire-elk
To install from the source code you will need to clone the repository first:
$ git clone https://github.com/chaoss/grimoirelab-elk
$ cd grimoirelab-elk
Then use pip or Poetry to install the package along with its dependencies.
To install the package from the local directory, run the following command:
$ pip install .
If you are a developer, you should install GrimoireELK in editable mode:
$ pip install -e .
We use Poetry for dependency management and packaging. You can install it by following its documentation. Once it is installed, you can install GrimoireELK and its dependencies in a project-isolated environment using:
$ poetry install
To spawn a new shell within the virtual environment, use:
$ poetry shell
Tests are located in the folder tests. To run them, you need instances (or Docker containers) of ElasticSearch and MySQL running on your machine.
Then you need to:
- update the tests configuration file if your ElasticSearch instance is not available at http://localhost:9200. For example, if you are using the secure edition of ElasticSearch, it will be located at https://admin:admin@localhost:9200
- set both the user and password parameters in the [Database] section of the file, as required by your MySQL instance
- create the databases test_sh and test_projects in your MySQL instance (e.g., mysql -u root -e "create database test_sh"; if you are running MySQL in a container, use docker exec -i <container id> mysql -u root -e "create database test_sh")
- populate the database test_projects with the SQL file test_projects.sql (e.g., mysql -u root test_projects < tests/test_projects.sql)
The full battery of tests can be executed with run_tests.py. However, it is also possible to execute a subset of the tests by running single test files (the test_* files in the tests folder).
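Before launching the tests, it can help to confirm that ElasticSearch answers at the URL you configured. The following is a minimal sketch using only the Python standard library; it is not part of GrimoireELK, and the helper names `split_credentials` and `es_is_reachable` are hypothetical. It assumes that a plain HTTP GET on the root URL returns status 200 when the instance is up.

```python
import base64
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_credentials, username, password)."""
    parts = urlsplit(url)
    netloc = parts.hostname or ""
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    clean_url = urlunsplit((parts.scheme, netloc, parts.path, parts.query, ""))
    return clean_url, parts.username, parts.password


def es_is_reachable(url="http://localhost:9200", timeout=3.0):
    """Return True if a GET on the given URL answers with HTTP 200."""
    clean_url, username, password = split_credentials(url)
    request = urllib.request.Request(clean_url)
    if username is not None:
        # urllib does not use credentials embedded in the URL, so they are
        # sent as an HTTP Basic Authorization header instead.
        raw = f"{username}:{password or ''}".encode()
        request.add_header("Authorization", "Basic " + base64.b64encode(raw).decode())
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, TLS failures and HTTP errors
        # (URLError and HTTPError are OSError subclasses).
        return False
```

Calling `es_is_reachable()` checks the default location, and `es_is_reachable("https://admin:admin@localhost:9200")` checks the secure one. With a self-signed certificate the second call returns False in this sketch, because the certificate is verified by default.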
The tests can be run in combination with the Python package coverage. The steps below show how to do it:
$ pip3 install coveralls
$ cd <path-to-ELK>/tests
$ python3 -m coverage run --source=grimoire_elk run_tests.py
Coverage will generate a file .coverage in the tests folder, which can be inspected with the following command:
$ cd <path-to-ELK>/tests
$ python3 -m coverage report -m
The output will be similar to the following one:
Name                               Stmts   Miss  Cover   Missing
----------------------------------------------------------------
.../ELK/grimoire_elk/__init__.py       4      0   100%
.../ELK/grimoire_elk/_version.py       1      0   100%