CoverUp

LLM-powered test coverage improver

0.2.0

by Juan Altmayer Pizzorno and Emery Berger at UMass Amherst's PLASMA lab.

About CoverUp

CoverUp automatically generates tests that exercise more of your code, increasing its code coverage. CoverUp can also create a test suite from scratch if you don't yet have one. The new tests are based on your code, making them useful for regression testing.

CoverUp is designed to work closely with the pytest test framework. To generate tests, it first measures your suite's coverage using SlipCover, our state-of-the-art coverage analyzer. It then selects portions of the code that need more testing (that is, code that is uncovered). Next, CoverUp engages in a conversation with an LLM, prompting for tests, checking the results to verify that they run and increase coverage (again using SlipCover), and re-prompting for adjustments as necessary. Finally, CoverUp optionally checks that the new tests integrate well, attempting to resolve any issues it finds.
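The prompt, check, and re-prompt cycle described above can be pictured with a minimal sketch. This is not CoverUp's implementation: `Segment`, `improve_segment`, `ask_llm`, `run_test`, the prompt wording, and the attempt limit are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Segment:
    """Hypothetical unit of code that still has uncovered lines."""
    name: str
    missing_lines: set[int]


def improve_segment(
    segment: Segment,
    ask_llm: Callable[[str], str],
    run_test: Callable[[str], tuple[bool, set[int]]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask for a test, check it, and re-prompt with feedback if needed.

    ask_llm and run_test are stand-ins supplied by the caller; they are
    not CoverUp functions. run_test returns (passed, lines_covered).
    """
    prompt = (
        f"Write a pytest test covering lines "
        f"{sorted(segment.missing_lines)} of {segment.name}."
    )
    for _ in range(max_attempts):
        test_code = ask_llm(prompt)
        passed, covered = run_test(test_code)
        if not passed:
            # The test does not run cleanly: ask for a fix.
            prompt = "The test failed; please fix it."
        elif not (covered & segment.missing_lines):
            # The test runs but adds no coverage: ask for an adjustment.
            prompt = "The test passes but covers none of the missing lines."
        else:
            return test_code  # Runs and increases coverage: keep it.
    return None  # Give up on this segment after max_attempts.


# Demo with fake collaborators: the first reply fails, the second succeeds.
replies = iter(["assert False", "assert True"])


def fake_llm(prompt: str) -> str:
    return next(replies)


def fake_run(code: str) -> tuple[bool, set[int]]:
    ok = code == "assert True"
    return ok, ({10} if ok else set())


result = improve_segment(Segment("mymod.f", {10, 11}), fake_llm, fake_run)
print(result)
```

Passing the LLM and the test runner in as callables keeps the sketch self-contained and lets the loop be exercised without network access.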

For technical details and a complete evaluation, see our arXiv paper, CoverUp: Coverage-Guided LLM-Based Test Generation (PDF).

Installing CoverUp

CoverUp is available from PyPI, so you can install it with

$ python3 -m pip install coverup

LLM model access

CoverUp can be used with OpenAI, Anthropic or AWS Bedrock models; it requires that the access details be defined as shell environment variables: OPENAI_API_KEY, ANTHROPIC_API_KEY or AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY/AWS_REGION_NAME, respectively.

For example, for OpenAI you would create an account, ensure it has a positive balance and then create an API key, storing its "secret key" (usually a string starting with sk-) in an environment variable named OPENAI_API_KEY:

$ export OPENAI_API_KEY=<...your-api-key...>
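Before running CoverUp, it can help to confirm that the variables for at least one provider are set. The following is a small sketch, not part of CoverUp: the `REQUIRED_VARS` table and the `configured_providers` helper are hypothetical, and only the environment variable names come from the text above.

```python
import os

# Environment variable names as listed above; the provider labels are
# just dictionary keys used by this sketch.
REQUIRED_VARS = {
    "OpenAI": ["OPENAI_API_KEY"],
    "Anthropic": ["ANTHROPIC_API_KEY"],
    "AWS Bedrock": [
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
        "AWS_REGION_NAME",
    ],
}


def configured_providers(env=os.environ):
    """Return the providers whose variables are all set and non-empty."""
    return [
        provider
        for provider, names in REQUIRED_VARS.items()
        if all(env.get(name) for name in names)
    ]


print("Configured providers:", configured_providers())
```

The check only confirms that the variables exist; it does not verify that a key is valid or that the account has a positive balance.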

Using CoverUp

If your module is named mymod, its sources are under src and the tests under tests, you can run CoverUp as

$ coverup --source-dir src/mymod --tests-dir tests

CoverUp then creates tests named test_coverup_N.py, where N is a number, under the tests directory.
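To review what a run produced, the generated files can be listed by their number. This is a sketch built only on the test_coverup_N.py naming pattern stated above; the `coverup_tests` helper is hypothetical and not something CoverUp provides.

```python
import re
from pathlib import Path

# Matches the whole file name, so a file such as
# disabled_test_coverup_19.py is not reported.
PATTERN = re.compile(r"test_coverup_(\d+)\.py")


def coverup_tests(tests_dir):
    """Return generated test file names, ordered by their number N."""
    found = []
    for path in Path(tests_dir).iterdir():
        match = PATTERN.fullmatch(path.name)
        if match:
            found.append((int(match.group(1)), path.name))
    return [name for _, name in sorted(found)]


if Path("tests").is_dir():
    for name in coverup_tests("tests"):
        print(name)
```

Sorting on the parsed integer rather than the file name keeps test_coverup_2.py ahead of test_coverup_10.py.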

Example

Here we have CoverUp create additional tests for the popular package Flask:

$ coverup --source-dir src/flask --tests-dir tests --disable-polluting --no-isolate-tests
Measuring test suite coverage...  starting coverage: 90.2%
Prompting gpt-4-1106-preview for tests to increase coverage...
100%|███████████████████████████████████████████████████| 95/95 [02:49<00:00,  1.79s/it, usage=~$3.30, G=51, F=141, U=22, R=0]
Checking test suite...  tests/test_coverup_2.py is failing, looking for culprit(s)...
Disabling tests/test_coverup_19.py
Checking test suite...  tests ok!
End coverage: 94.2%

In under 3 minutes, CoverUp increases Flask's test coverage from 90.2% to 94.2%. It detected that one of the new tests, test_coverup_19, was causing another test to fail and disabled it. That test remains as disabled_test_coverup_19.py, where it can be reviewed for the cause and possibly re-added to the suite.

Running CoverUp with Docker

To evaluate the tests generated by the LLM, CoverUp must execute them. For best security and to minimize the risk of damage to your system, we recommend running CoverUp with Docker.

Evaluation

The graph shows CoverUp in comparison to the state-of-the-art CodaMosa, which itself uses LLM queries to improve on the Pynguin test generator. For this experiment, both CoverUp and CodaMosa created tests "from scratch", that is, ignoring any existing test suite. The bars show the difference in coverage percentage between CoverUp and CodaMosa for various Python modules; green bars, above 0, indicate that CoverUp achieved higher coverage.

As the graph shows, CoverUp achieves higher coverage in almost every case.

Work In Progress

This is an early release of CoverUp. Please enjoy it, and pardon any disruptions as we work to improve it. We welcome bug reports, experience reports, and feature requests (please open an issue).
