cognite-extractor-manager
cogex is a tool for managing extractors for Cognite Data Fusion written in Python. It provides
utilities for initializing a new extractor project and building self-contained executables of
Python-based extractors.
pyenv is a neat tool for managing Python installations. Since cogex uses PyInstaller to build
executables, Python needs to be installed with a shared instance of libpython, which pyenv does
not do by default. To fix this, make sure to add the --enable-shared flag when installing new
Python versions with pyenv, like so:
env PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.9.0
You can read more about it in the PyInstaller documentation.
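To confirm that an interpreter was built with a shared libpython, one option is to ask Python itself. The sketch below uses the standard sysconfig module. The helper name has_shared_libpython is made up for this illustration, and the Py_ENABLE_SHARED variable is only meaningful on Unix-like builds (on Windows it may be unset).

```python
import sysconfig


def has_shared_libpython() -> bool:
    """Return True if this interpreter reports a shared libpython build.

    Assumption: Py_ENABLE_SHARED is 1 for builds configured with
    --enable-shared, and 0 or None otherwise (for example on Windows).
    """
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


if __name__ == "__main__":
    print("shared libpython:", has_shared_libpython())
```

Running this with the interpreter that pyenv installed gives a quick hint about whether that installation needs to be rebuilt with the flag above.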
To start a new extractor project, move to the desired directory and run
cogex init
You will first be prompted for some information, after which cogex will initialize a new project.
Extractor projects initialized with cogex will use poetry for managing dependencies. Running
cogex init will automatically install the Cognite SDK and extractor-utils framework, but if your
extractor needs any other dependency, simply add it using poetry, like so:
poetry add requests
It is recommended that you run code checkers on your extractor, in particular:
black is an opinionated code style checker that will enforce a consistent code style throughout your project. This is useful for avoiding unnecessary changes and minimizing PR diffs.
isort is a tool that sorts your imports, also contributing to a consistent code style and minimal PR diffs.
mypy is a static type checker for Python which ensures that you are not making any type errors in your code that would go unnoticed before suddenly breaking your extractor in production.
cogex will install all of these, and automatically run them on every commit. If you for some
reason need to perform a commit despite one of these failing, you can run git commit --no-verify,
although this is not recommended.
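As an illustration of the kind of mistake a static type checker catches, consider a configuration value that may be missing. The function and parameter names below are hypothetical and not part of cogex or the extractor-utils framework. With the type annotations in place, removing the None check would make mypy report a type error before the code ever runs.

```python
from typing import Optional


def parse_batch_size(raw: Optional[str], default: int = 1000) -> int:
    """Convert an optional config string to an integer batch size."""
    # Without this check, mypy rejects int(raw) because raw may be None;
    # at runtime the same mistake would raise a TypeError.
    if raw is None:
        return default
    return int(raw)


if __name__ == "__main__":
    print(parse_batch_size(None))   # falls back to the default
    print(parse_batch_size("250"))  # parsed from the string
```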
It is not always an option to rely on a Python installation on the machine where your extractor will be deployed. For those scenarios it is useful to package the extractor, including its dependencies and the Python runtime, into a single self-contained executable. To do this, run
cogex build
This will create a new executable (for the operating system you ran cogex build from) in the dist directory.
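Code that runs inside a self-contained executable sometimes needs to know that it is bundled, for example to locate data files. Since cogex builds with PyInstaller, the sketch below relies on PyInstaller's documented run-time attributes sys.frozen and sys._MEIPASS. The helper names is_bundled and resource_dir are assumptions made for this example, not part of cogex.

```python
import sys
from pathlib import Path


def is_bundled() -> bool:
    """Return True when running from a PyInstaller-built executable."""
    return bool(getattr(sys, "frozen", False)) and hasattr(sys, "_MEIPASS")


def resource_dir(source_dir: Path) -> Path:
    """Return the directory in which to look for bundled data files.

    In a PyInstaller bundle this is the unpacked bundle directory;
    otherwise it is the source directory supplied by the caller.
    """
    if is_bundled():
        return Path(sys._MEIPASS)
    return source_dir


if __name__ == "__main__":
    print("bundled:", is_bundled())
    print("resources:", resource_dir(Path.cwd()))
```

In a regular module, callers would typically pass Path(__file__).parent as source_dir.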
To build a docker image, you first need to add a [tool.cogex.docker] section to your pyproject.toml file. The required fields are
tags: A list of tags to tag the resulting image with. These support some simple templating: if you include {version} in your tag, it will be replaced with the current version of the extractor. {major} will be replaced with the current major version.
If [tool.poetry.scripts] includes multiple entries, you need to specify which one to use in the docker image with the entrypoint field.
In addition, you have some optional fields:
base-image: Which base image to use. By default, the debian-slim based python image for the python version currently running will be chosen.
install-dir: Where in the image the extractor should be installed.
preamble: Additional dockerfile statements to run at the beginning of the dockerfile.
Minimal example:
[tool.cogex.docker]
tags = ["cognite/my-extractor:{version}"]
Larger example (from the DB Extractor):
[tool.cogex.docker]
base-image = "python:3.10"
preamble = """
RUN apt-get update \
&& apt-get dist-upgrade -y dirmngr gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server \
&& gpgconf gpgsm gpgv libssl-dev libssl1.1 openssl
RUN apt-get install -y apt-utils build-essential
RUN apt-get install -y unixodbc-dev unixodbc
"""
tags = [
"eu.gcr.io/cognite-registry/db-extractor-base:latest",
"eu.gcr.io/cognite-registry/db-extractor-base:{version}",
"cognite/db-extractor-base:{version}",
]
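The tag templating described above can be pictured with a few lines of Python. This is only a sketch of the substitution rule as stated in the text, not the code cogex itself uses; the function name expand_tags is hypothetical, and the version is assumed to be a plain dotted string such as 1.4.2.

```python
from typing import List


def expand_tags(tags: List[str], version: str) -> List[str]:
    """Replace {version} and {major} placeholders in each image tag."""
    # Assumption: the major version is everything before the first dot.
    major = version.split(".")[0]
    return [
        tag.replace("{version}", version).replace("{major}", major)
        for tag in tags
    ]


if __name__ == "__main__":
    tags = ["cognite/my-extractor:{version}", "cognite/my-extractor:{major}"]
    print(expand_tags(tags, "1.4.2"))
    # → ['cognite/my-extractor:1.4.2', 'cognite/my-extractor:1']
```

Tags without placeholders, such as one ending in :latest, pass through unchanged.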
You can now build and tag docker images with
cogex build --dockerimage
If you just want to see the generated dockerfile, instead run
cogex build --dockerfile
To keep track of which version of the code base is running at a given deployment, it is very useful to version your extractor. When releasing a new version, run
poetry version [patch/minor/major]
to automatically bump the corresponding version number. Note that this only updates the version
number in pyproject.toml. When running cogex build, this new version number will be propagated
through the rest of the code base.
Any extractor project should follow semantic versioning, which means you should bump
patch for any minor bug fixes or improvements
minor for new features or bigger improvements that don't break compatibility
major for new features or improvements that break compatibility with previous versions, in other words for those scenarios where the new version is not a drop-in replacement for an old version. For example:
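The effect of each bump on a version number can be sketched in Python. This is an illustration of the semantic versioning rules for plain MAJOR.MINOR.PATCH strings, not Poetry's implementation; the function name bump_version is hypothetical, and pre-release or build suffixes are not handled.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version after bumping the given part.

    Assumption: version has exactly three numeric components.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Breaking change: reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Bug fix or small improvement.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


if __name__ == "__main__":
    print(bump_version("1.4.2", "patch"))  # → 1.4.3
    print(bump_version("1.4.2", "minor"))  # → 1.5.0
    print(bump_version("1.4.2", "major"))  # → 2.0.0
```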