AexPy /eɪkspaɪ/ is Api EXplorer in PYthon for detecting API breaking changes in Python packages.
Explore AexPy's APIs, and the main branch, on AexPy itself. AexPy also runs an index project for some packages, shown here. Try replacing pypi.org with aexpy.netlify.app in a package's PyPI URL to explore its APIs.
[!NOTE] AexPy is the prototype implementation of the conference paper "AexPy: Detecting API Breaking Changes in Python Packages" in Proceedings of the 33rd IEEE International Symposium on Software Reliability Engineering (ISSRE 2022), Charlotte, North Carolina, USA, October 31 - November 3, 2022.
If you use our approach or results in your work, please cite it according to the citation file.
X. Du and J. Ma, "AexPy: Detecting API Breaking Changes in Python Packages," 2022 IEEE 33rd International Symposium on Software Reliability Engineering (ISSRE), 2022, pp. 470-481, doi: 10.1109/ISSRE55969.2022.00052.
graph LR;
Package-->Version-1;
Package-->Version-2;
Version-1-->Preprocessing-1;
Version-2-->Preprocessing-2;
Preprocessing-1-->Extraction-1;
Preprocessing-2-->Extraction-2;
Extraction-1-->Difference;
Extraction-2-->Difference;
Difference-->Evaluation;
Evaluation-->Breaking-Changes;
AexPy also provides a framework to process Python packages, extract APIs, and detect changes, which is designed for easy reuse and customization. See the "Advanced Tools" section below and the source code for details.
Take the package generator-oj-problem v0.0.1 and v0.0.2 as an example.
cache/api1.json
and cache/api2.json
report.txt
# Install AexPy package and tool
pip install aexpy
# Extract APIs from v0.0.1
echo generator-oj-problem@0.0.1 | aexpy extract - api1.json -r
# Extract APIs from v0.0.2
echo generator-oj-problem@0.0.2 | aexpy extract - api2.json -r
# Diff APIs between two versions
aexpy diff api1.json api2.json changes.json
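The same three steps can also be driven from a script. The following is a minimal sketch that only assembles the commands shown above and, when asked, runs them with subprocess. The helper names extract_command, diff_command, and run_pipeline are hypothetical and not part of AexPy.

```python
import subprocess


def extract_command(output: str) -> list[str]:
    """Command line for `aexpy extract - <output> -r` (release ID read from stdin)."""
    return ["aexpy", "extract", "-", output, "-r"]


def diff_command(old: str, new: str, output: str) -> list[str]:
    """Command line for `aexpy diff <old> <new> <output>`."""
    return ["aexpy", "diff", old, new, output]


def run_pipeline(project: str, old_version: str, new_version: str,
                 execute: bool = False) -> list[list[str]]:
    """Build the extract/extract/diff commands; run them only if execute is True."""
    commands = []
    for index, version in enumerate((old_version, new_version), start=1):
        command = extract_command(f"api{index}.json")
        commands.append(command)
        if execute:
            # Equivalent to: echo project@version | aexpy extract - apiN.json -r
            subprocess.run(command, input=f"{project}@{version}\n", text=True, check=True)
    command = diff_command("api1.json", "api2.json", "changes.json")
    commands.append(command)
    if execute:
        subprocess.run(command, check=True)
    return commands


# Dry run: print the commands without executing them.
for command in run_pipeline("generator-oj-problem", "0.0.1", "0.0.2"):
    print(" ".join(command))
```

Pass execute=True to actually invoke the installed aexpy executable; the dry run is the default so the script is safe to try without network access.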
View results on online AexPy.
See also API Level, Call Graph, and Inheritance Diagram.
We provide the Python package on PyPI. Use pip to install the package.
python -m pip install --upgrade aexpy
aexpy --help
[!IMPORTANT] Please ensure your Python interpreter works in UTF-8 mode.
We also provide the Docker image to avoid environment errors.
docker pull stardustdl/aexpy:latest
docker run --rm stardustdl/aexpy:latest --help
# or the image from the main branch
docker pull stardustdl/aexpy:main
[!TIP]
- AexPy matches commands by their prefixes, so you do not need to write the whole command name, only a distinguishable prefix.
# aexpy preprocess --help
aexpy pre --help
- All results produced by AexPy are in JSON format, so you can modify them in any text editor.
- Pass - to I/O arguments to use stdin/stdout.
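Prefix matching of this kind can be sketched in a few lines. The example below is an illustration only, not AexPy's actual implementation: the resolve_command helper is hypothetical, and the command list is limited to the commands described in this document.

```python
COMMANDS = ("preprocess", "extract", "diff", "report", "view")


def resolve_command(prefix: str, commands: tuple[str, ...] = COMMANDS) -> str:
    """Return the single command that starts with `prefix`.

    An exact match always wins; otherwise the prefix must identify
    exactly one command, or a ValueError is raised.
    """
    if prefix in commands:
        return prefix
    matches = [name for name in commands if name.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {prefix!r}")
    raise ValueError(f"ambiguous prefix {prefix!r}: {', '.join(matches)}")


print(resolve_command("pre"))  # preprocess
print(resolve_command("d"))    # diff
```

Raising on an ambiguous prefix, rather than picking the first match, keeps short forms stable as long as they stay distinguishable.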
Preprocess a distribution for a package release.
AexPy provides four preprocessing modes:
- -s, --src: (default) Use the given distribution information (path to code, package name, modules)
- -r, --release: Download and unpack the package wheel, and automatically load from dist-info
- -w, --wheel: Unpack an existing package wheel file, and automatically load from dist-info
- -d, --dist: Automatically load from an unpacked wheel and its dist-info
AexPy will automatically load the package name, version, top-level modules, and dependencies from dist-info.
There are also options to specify fields in the distribution:
- -p, --project: Package name and its version, e.g. project@version.
- -m, --module: (multiple) Top-level module names.
- -D, --depends: (multiple) Package dependencies.
- -R, --requirements: Path to the package's requirements.txt file, used to load dependencies.
- -P, --pyversion: Python version for this distribution; Python 3.8+ is supported.
[!TIP] You can modify the generated distribution file in a text editor to change field values.
# download the package wheel and unpack into ./cache
# output the distribution file to ./cache/distribution.json
aexpy preprocess -r -p generator-oj-problem@0.0.1 ./cache ./cache/distribution.json
# or output the distribution file to stdout
aexpy preprocess -r -p generator-oj-problem@0.0.1 ./cache -
# use existing wheel file
aexpy preprocess -w ./cache/generator_oj_problem-0.0.1-py3-none-any.whl ./cache/distribution.json
# use existing unpacked wheel directory, auto load metadata from .dist-info directory
aexpy preprocess -d ./cache/generator_oj_problem-0.0.1-py3-none-any ./cache/distribution.json
# use existing source code directory, given the package's name, version, and top-level modules
aexpy preprocess ./cache/generator_oj_problem-0.0.1-py3-none-any ./cache/distribution.json -p generator-oj-problem@0.0.1 -m generator_oj_problem
View results at AexPy Online.
Extract the API description from a distribution.
AexPy provides four modes for the input distribution file:
- -j, --json: (default) The file is the JSON file produced by AexPy (preprocess command)
- -r, --release: The file is a text file containing the release ID, e.g., aexpy@0.1.0
- -w, --wheel: The file is a wheel, i.e., a .whl file. When reading from stdin, please also give the wheel file name through the --wheel-name option.
- -s, --src: The file is a ZIP file that contains the package code directory
[!IMPORTANT] About Dependencies AexPy dynamically imports the target module to detect all available APIs. Please ensure all dependencies are installed in the extraction environment, or specify the dependencies field in the distribution, and AexPy will install them into the extraction environment.
If the wheelFile field is valid (i.e. the target file exists), AexPy will first try to install the wheel and ignore the dependencies field (which is used only if the wheel installation fails).
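Because the distribution file is plain JSON, the dependencies field can also be set from a script before extraction. A minimal sketch, assuming the file is a JSON object with a top-level dependencies list as described above; the helper name set_dependencies and the sample values are hypothetical, and real distribution files contain more fields than this stand-in.

```python
import json
import tempfile
from pathlib import Path


def set_dependencies(path: Path, dependencies: list[str]) -> dict:
    """Overwrite the `dependencies` field of a distribution file and save it."""
    data = json.loads(path.read_text(encoding="utf-8"))
    data["dependencies"] = list(dependencies)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return data


with tempfile.TemporaryDirectory() as folder:
    # Stand-in for a file produced by `aexpy preprocess` (hypothetical content).
    sample = Path(folder) / "distribution.json"
    sample.write_text(json.dumps({"dependencies": []}), encoding="utf-8")
    updated = set_dependencies(sample, ["click", "requests"])
    print(updated["dependencies"])  # ['click', 'requests']
```

All other fields in the file are left untouched, so the result can be passed to the extract command as usual.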
[!TIP] About Environment AexPy uses micromamba as the default environment manager. Use the AEXPY_ENV_PROVIDER environment variable to specify conda, mamba, or micromamba (if the variable is not set, AexPy detects the environment manager automatically).
- Use the --no-temp flag to let AexPy use the current Python environment (the same one that runs AexPy) as the extraction environment (the default behavior of the installed AexPy package).
- Use the --temp flag to let AexPy create a temporary mamba (conda) environment that matches the distribution's pyversion field (the default behavior of our Docker image).
- Use the -e, --env option to specify an existing mamba (conda) environment name as the extraction environment (the temp flag is then ignored).
aexpy extract ./cache/distribution.json ./cache/api.json
# or input the distribution file from stdin
# (this feature is also supported in other commands)
aexpy extract - ./cache/api.json
# or output the api description file to stdout
aexpy extract ./cache/distribution.json -
# extract from the target project release
echo aexpy@0.0.1 | aexpy extract - api.json -r
# extract from the wheel file
aexpy extract ./temp/aexpy-0.1.0.whl api.json -w
cat ./temp/aexpy-0.1.0.whl | aexpy extract - api.json -w --wheel-name aexpy-0.1.0.whl
# extract from the project source code ZIP archive
zip -r - ./project | aexpy extract - api.json -s
# Use a env named demo-env
aexpy extract ./cache/distribution.json - -e demo-env
# Create a temporary env
aexpy extract ./cache/distribution.json - --temp
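The environment-manager selection described in the tip above can be pictured as follows. This is a hedged sketch, not AexPy's implementation: it reads AEXPY_ENV_PROVIDER and otherwise probes PATH for the three managers. The probing order and the pick_env_provider helper are assumptions made for illustration.

```python
import os
import shutil

PROVIDERS = ("micromamba", "mamba", "conda")  # assumed probing order


def pick_env_provider(environ=None, which=shutil.which):
    """Return the environment manager to use, or None if none is found."""
    environ = os.environ if environ is None else environ
    requested = environ.get("AEXPY_ENV_PROVIDER", "").strip()
    if requested:
        if requested not in PROVIDERS:
            raise ValueError(f"unsupported provider: {requested!r}")
        return requested
    # No explicit choice: fall back to the first manager found on PATH.
    for name in PROVIDERS:
        if which(name) is not None:
            return name
    return None


print(pick_env_provider({"AEXPY_ENV_PROVIDER": "mamba"}))  # mamba
```

The environ and which parameters exist only so the lookup can be exercised without touching the real environment.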
View results at AexPy Online.
Diff two API descriptions and detect changes.
aexpy diff ./cache/api1.json ./cache/api2.json ./cache/diff.json
View results at AexPy Online.
[!TIP] If both OLD and NEW are read from stdin, separate the two API descriptions with a comma (,). This is supported only in normal IO mode, not in compressed IO mode.
echo "," | cat ./api1.json - ./api2.json | aexpy diff - - ./changes.json
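To see why a single comma is enough to separate the two descriptions, the sketch below joins two JSON documents with a comma and splits them again using the standard library's json.JSONDecoder.raw_decode. It only illustrates the stream format; how AexPy parses it internally is not shown here, and join_descriptions and split_descriptions are hypothetical helpers.

```python
import json


def join_descriptions(old: str, new: str) -> str:
    """Concatenate two JSON documents with a comma line between them."""
    return old + ",\n" + new


def split_descriptions(stream: str):
    """Parse two comma-separated JSON documents from one text stream."""
    decoder = json.JSONDecoder()
    text = stream.lstrip()
    # raw_decode stops at the end of the first document and reports where.
    first, end = decoder.raw_decode(text)
    rest = text[end:].lstrip()
    if not rest.startswith(","):
        raise ValueError("expected a comma between the two documents")
    second, _ = decoder.raw_decode(rest[1:].lstrip())
    return first, second


stream = join_descriptions('{"version": "0.0.1"}\n', '{"version": "0.0.2"}\n')
old, new = split_descriptions(stream)
print(old["version"], new["version"])  # 0.0.1 0.0.2
```

A complete JSON document cannot end in the middle of a value, so the first top-level comma after it is unambiguous as a separator.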
Generate a report from detected changes.
aexpy report ./cache/diff.json ./cache/report.json
View results at AexPy Online.
View produced data.
aexpy view ./cache/distribution1.json
aexpy view ./cache/distribution2.json
aexpy view ./cache/api1.json
aexpy view ./cache/api2.json
aexpy view ./cache/diff.json
aexpy view ./cache/report.json
The Docker image keeps the same command-line interface, but always uses stdin/stdout to transfer data between the host and the container.
echo generator-oj-problem@0.0.1 | docker run -i aexpy/aexpy extract - - -r > ./api.json
echo "," | cat ./api1.json - ./api2.json | docker run -i aexpy/aexpy diff - - - > ./changes.json
[!TIP] If you want to write processed data to the filesystem instead of standard IO, add a volume mapping to /data for file access.
Please run the container as the user who owns the mounted directory, so that the mounted files are accessible.
docker run -v $(pwd)/cache:/data -u $(id -u):$(id -g) aexpy/aexpy extract /data/distribution.json /data/api.json
Once the AexPy package is installed, you can use the tool runimage command as a quick runner for containers (if you have Docker installed).
[!TIP] The volume directory will be mounted to /data in the container.
All file path arguments passed to the container should use absolute paths with the /data prefix, or paths relative to /data.
# Use the same version of the image as current AexPy version
# Use the current directory as the mount directory
aexpy tool runimage -- --version
aexpy runimage -- --version
# Extract from ./dist.json
aexpy runimage -- extract ./dist.json ./api.json
# Use a specified image tag and mount directory
aexpy tool runimage -v ./mount -t stardustdl/aexpy:latest -- --version
# Extract from ./mount/dist.json
aexpy runimage -v ./mount -- extract ./dist.json ./api.json
aexpy runimage -v ./mount -- extract /data/dist.json /data/api.json
Processing may take time; use multiple -v flags for more verbose logs (which are written to stderr).
aexpy -vvv view ./cache/report.json
When the package is large, the JSON data produced by AexPy might be large, too. AexPy supports the gzip format to compress and decompress IO streams; use the -z/--gzip option or the AEXPY_GZIP_IO environment variable to enable it.
aexpy --gzip extract ./cache/distribution.json ./cache/api.json.gz
AEXPY_GZIP_IO=1 aexpy extract ./cache/distribution.json.gz ./cache/api.json
aexpy view ./cache/api.json.gz
[!TIP] AexPy detects the input file format automatically, whether or not compressed-IO mode is enabled.
When compressed-IO mode is enabled, all output JSON streams are treated as gzip JSON streams.
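Format auto-detection like this is commonly done by checking for the two-byte gzip magic number (0x1f 0x8b) at the start of the input. The sketch below shows that approach for reading a result whether or not it is compressed. It is an assumption about how such detection can work, not a copy of AexPy's code, and load_json_auto is a hypothetical helper.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def load_json_auto(raw: bytes):
    """Decode JSON from bytes, transparently decompressing gzip input."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


# Sample payload (hypothetical content, not a real AexPy product).
plain = json.dumps({"changes": 3}).encode("utf-8")
packed = gzip.compress(plain)
print(load_json_auto(plain) == load_json_auto(packed))  # True
```

Since valid JSON text never starts with byte 0x1f, the check cannot mistake an uncompressed result for a compressed one.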
Add -i or --interact to enable interactive mode: every command then opens an interactive Python shell after it finishes processing. Here are some useful variables available in the interactive Python shell.
- result: The produced data object
- context: The producing context; use exception to access the exception if processing failed
aexpy -i view ./cache/report.json
[!TIP] Feel free to use locals() and dir() to explore the interactive environment.
AexPy provides tools to count numbers from produced data in the aexpy.tools.stats module.
It loads products from the given files, runs built-in counters, and then records the counts as key-value pairs for the release (or release pair).
aexpy tool stat ./*.json ./stats.json
aexpy stat ./*.json ./stats.json
aexpy view ./stats.json
AexPy has four loosely-coupled stages in its pipeline. Adjacent stages exchange data as JSON, using the models defined in the models directory. You can easily write your own implementation of any stage and combine it into the pipeline.
To write your own services, start from a copy of aexpy/services.py, write your own subclass of ServiceProvider, and modify the getService function to return an instance of it.
from aexpy.services import ServiceProvider

class MyServiceProvider(ServiceProvider):
    ...

def getService():
    return MyServiceProvider()
Then you can load your service file with the -s/--service option or the AEXPY_SERVICE environment variable.
aexpy -s services.py -vvv view --help
AEXPY_SERVICE=services.py aexpy -vvv view --help
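For intuition, a service file of this shape can be loaded from a path with the standard importlib machinery. The sketch below is an assumption about how such loading can work, not AexPy's own loader. The load_service helper and the module name user_services are hypothetical, and the stand-in service file returns a plain dictionary so the example runs without AexPy installed.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_service(path: Path):
    """Import a Python file by path and return the result of its getService()."""
    spec = importlib.util.spec_from_file_location("user_services", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load service file: {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.getService()


with tempfile.TemporaryDirectory() as folder:
    service_file = Path(folder) / "services.py"
    # Stand-in file; a real one would return a ServiceProvider subclass instance.
    service_file.write_text(
        "def getService():\n    return {'name': 'demo'}\n", encoding="utf-8"
    )
    print(load_service(service_file))  # {'name': 'demo'}
```

Loading by path rather than by module name means the service file does not need to be installed or placed on sys.path.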
We have implemented an image service provider, which replaces the default extractor, differ, and reporter with a container worker. See aexpy/tools/workers/services.py for its implementation. Here is a demo service file that uses the image service provider.
from aexpy.tools.workers.services import DockerWorkerServiceProvider

def getService():
    return DockerWorkerServiceProvider(tag="stardustdl/aexpy:latest")