booktest


Booktest is a snapshot testing library for review-driven testing.

0.3.45
PyPI

Maintainers: 1

Booktest

booktest is a review-driven testing tool that combines Jupyterbook-style data science development with traditional regression testing. booktest is developed by Lumoa.me, the actionable feedback analytics platform.

booktest is designed to tackle common problems with data science R&D workflows and regression testing:

Data science produces results such as probability estimates, which can be good or bad, but not really right or wrong as in traditional software engineering.
- Because DS results are not strictly right or wrong, it is very difficult to use assertions for quality assurance and for preventing regressions.
- For example, you cannot really say that an accuracy of 0.84 is correct while an accuracy of 0.83 is incorrect, especially if other measurements (such as log likelihood) give conflicting results. Likewise, evaluating a topic model as correct or incorrect is nonsensical. In practice, most data science applications require an expert review.
- This ambiguous notion of quality also creates a need for better visibility into how the system behaves. One typically wants to print out edge cases and their diagnostics to see the behavior, the intermediate steps, and the results for different data sets.
There is also the problem of data science data being big and intermediate results being computationally expensive.
- Jupyter Notebook deals with this problem by keeping the state in memory between runs, while traditional unit tests tend to lose the program state between runs. Losing the state leads to very slow test runs, slow iteration speed and low productivity.
While Jupyter Notebook provides the visibility into results that expert review requires, along with powerful caching functionality, it falls short in three ways: a) it often requires copy-pasting production code to make results visible, b) it doesn't support automated regression testing, and c) expert review requires an expensive full review even if nothing changed.

booktest addresses these problems by delivering on 3 main points:

- Focus on the results and analytics, as in Jupyter Notebook, by allowing the user to print the results as MD files.
- Keep the intermediate results cached either in memory or in the filesystem by having a two-level cache.
- Instead of doing strict assertions, do testing by comparing old results with new results.
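
The two-level cache idea can be illustrated with a minimal sketch. This is not booktest's implementation or API: the `TwoLevelCache` class, its `get` method, and the pickle file layout are hypothetical, chosen only to show a lookup that tries memory first, then the filesystem, and computes the value only when both miss.

```python
import pickle
import tempfile
from pathlib import Path


class TwoLevelCache:
    """Hypothetical helper (not part of booktest): memory first, then disk."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.memory = {}  # level 1: lives only as long as this process

    def get(self, key, compute):
        # Level 1: in-memory hit, no I/O at all.
        if key in self.memory:
            return self.memory[key]
        # Level 2: a file written by this or an earlier process.
        path = self.cache_dir / f"{key}.pickle"
        if path.exists():
            with path.open("rb") as f:
                value = pickle.load(f)
        else:
            # Both levels missed: run the expensive step and persist it.
            value = compute()
            with path.open("wb") as f:
                pickle.dump(value, f)
        self.memory[key] = value
        return value


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        cache = TwoLevelCache(cache_dir)
        # The lambda stands in for an expensive intermediate result.
        print(cache.get("total", lambda: sum(range(1000))))
```

A second `TwoLevelCache` pointed at the same directory would find the pickled value on disk, which is what lets an expensive intermediate result survive between separate test runs.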

Because tests compare old results with new ones, booktest does snapshot testing, and it stores the snapshots in the filesystem and in Git. An additional benefit of this approach is that you can trace how the results develop over time in Git.
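
The comparison step behind snapshot testing can be sketched with the Python standard library alone. The functions `review_snapshot` and `accept_snapshot` below are hypothetical names for illustration and are not booktest's API; the sketch only shows the general pattern of diffing newly printed results against a stored file and updating that file once a reviewer accepts the change.

```python
import difflib
import tempfile
from pathlib import Path


def review_snapshot(snapshot_path, new_text):
    """Hypothetical helper: diff new_text against the stored snapshot.

    Returns (changed, diff_lines). A missing snapshot counts as empty text.
    """
    path = Path(snapshot_path)
    old_text = path.read_text() if path.exists() else ""
    diff = list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="snapshot", tofile="new", lineterm=""))
    return bool(diff), diff


def accept_snapshot(snapshot_path, new_text):
    """Hypothetical helper: store new_text as the reviewed snapshot."""
    Path(snapshot_path).write_text(new_text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        snapshot = Path(d) / "evaluation.md"
        accept_snapshot(snapshot, "# Model evaluation\naccuracy: 0.84\n")
        changed, diff = review_snapshot(
            snapshot, "# Model evaluation\naccuracy: 0.83\n")
        # Nothing is asserted about 0.83 being wrong; the diff is shown
        # so that a person can decide whether to accept the new result.
        print("\n".join(diff))
```

If the snapshot file is committed to Git, each accepted change becomes part of the repository history, which matches the traceability benefit described above.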

Getting started guide

You can find the getting started guide here

Workflows, coverage and CI

You can find a guide on common workflows, coverage measurements and continuous integration here

Examples

Examples are found in the test example directory.

Example results are visible in the book index.

There is also a separate example project.

API reference

The API reference is generated under the docs directory. The main classes are:

- TestCaseRun, which provides the API for tests
- TestBook, which provides a base class for test suite objects
- TestSuite, which provides a base class for test suite objects
- Tests, which manages the CLI

Developing booktest

The development guide is available here
