vcfpy

Python 3 VCF library with good support for both reading and writing

0.13.8
PyPI

Maintainers: 1

VCFPy

Free software: MIT license
Documentation: https://vcfpy.readthedocs.io.

Features

Support for reading and writing VCF v4.3
The interface to INFO and FORMAT fields is based on OrderedDict, which allows for easier modification than in PyVCF (I also find this more Pythonic)
Read (and jump in) and write BGZF files just using vcfpy
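
To illustrate why a dict-based interface makes modification easy, here is a minimal sketch using only the standard library. The helpers `parse_info` and `format_info` are hypothetical names for this example and are not part of vcfpy; the sketch also ignores typing and escaping, which a real parser has to handle.

```python
from collections import OrderedDict


def parse_info(info_str):
    """Parse a raw INFO column into an OrderedDict (hypothetical helper, not vcfpy API)."""
    info = OrderedDict()
    if info_str == ".":
        return info  # "." means there are no INFO entries
    for entry in info_str.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info[key] = value.split(",")  # keep values as a list of strings
        else:
            info[entry] = True  # flag entry without a value
    return info


def format_info(info):
    """Serialize the mapping back to an INFO string, preserving key order."""
    if not info:
        return "."
    parts = []
    for key, value in info.items():
        if value is True:
            parts.append(key)
        else:
            parts.append("{}={}".format(key, ",".join(value)))
    return ";".join(parts)


info = parse_info("DP=14;AF=0.5,0.1;DB")
info["DP"] = ["20"]     # modify an existing entry in place
info["SOMATIC"] = True  # add a flag
del info["DB"]          # remove an entry
print(format_info(info))  # DP=20;AF=0.5,0.1;SOMATIC
```

Because the mapping keeps insertion order, the modified record serializes with its keys in the same order in which they were read.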

Why another VCF parser for Python?

I've used PyVCF with considerable success in the past. However, its main bottleneck appears when you want to modify the per-sample genotype information. There are some related issues in the PyVCF tracker, but none of them can really be considered solved. I spent several hours trying to solve these problems within PyVCF, but this either never got far or headed towards a complete rewrite...

For this reason, VCFPy was born and here it is!

What's the State?

VCFPy is the result of two full days of development plus some later maintenance work. I'm using it in several projects, but it is not as battle-tested as PyVCF.

Why Python 3 Only?

As I'm only using Python 3 code, I see no advantage in carrying around support for legacy Python 2 and maintaining it. At a later point when VCFPy is known to be stable, Python 2 support might be added if someone contributes a pull request.

Changelog

0.13.8 (2024-01-10)

Bug Fixes

fixing manifest for changelog (#169) (83c5b8e)

0.13.7 (2024-01-10)

Bug Fixes

remove versioneer for Python 3.12 compatibility (#160) (5e2860e)

0.13.6 (2022-11-28)

Fixing bug in setup.py that prevented the pysam dependency from being loaded (#150).

v0.13.5 (2022-11-13)

Treat .bgz files the same as .gz (#145, #149)

v0.13.4 (2022-04-13)

Switching to GitHub Actions for CI
Fix INFO flag raising TypeError (#146)

v0.13.3 (2020-09-14)

Adding Record.update_calls.
Making Record.{format,calls} use list when empty

v0.13.2 (2020-08-20)

Adding Call.set_genotype().

v0.13.1 (2020-08-20)

Fixed Call.ploidy.
Fixed Call.is_variant.

v0.13.0 (2020-07-10)

Fixing bug in case GT describes only one allele.
Proper escaping of colon and semicolon (or the lack of escaping) in INFO and FORMAT.
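
As a rough illustration of the escaping entry above: the VCF v4.3 specification lists percent-encodings for characters with special meaning. The sketch below applies them selectively per column. The function names and the exact per-column character sets (`INFO_SPECIAL`, `FORMAT_SPECIAL`) are assumptions made for this example, not vcfpy's implementation.

```python
import re

# Percent-encodings for special characters as listed in the VCF v4.3 specification.
VCF_ESCAPES = {
    "%": "%25", ":": "%3A", ";": "%3B", "=": "%3D", ",": "%2C",
    "\r": "%0D", "\n": "%0A", "\t": "%09",
}

# Assumed character sets: ';' separates INFO entries and ':' separates
# FORMAT entries, so each is escaped only in the column where it is a delimiter.
INFO_SPECIAL = "%;=,\r\n\t"
FORMAT_SPECIAL = "%:,\r\n\t"

_REVERSE = {code: char for char, code in VCF_ESCAPES.items()}
_PATTERN = re.compile("|".join(re.escape(code) for code in _REVERSE))


def escape(value, special):
    """Percent-encode only the characters contained in `special`."""
    return "".join(VCF_ESCAPES[c] if c in special else c for c in value)


def unescape(value):
    """Decode the known percent-encodings in a single left-to-right pass."""
    return _PATTERN.sub(lambda m: _REVERSE[m.group(0)], value)


print(escape("a;b:c", INFO_SPECIAL))    # a%3Bb:c
print(escape("a;b:c", FORMAT_SPECIAL))  # a;b%3Ac
print(unescape("a%3Bb%3Ac"))            # a;b:c
```

Decoding in a single pass matters: a literal `%253A` decodes to `%3A` rather than being decoded twice into `:`.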

v0.12.2 (2020-04-29)

Fixing bug in case GT describes only one allele.

v0.12.1 (2019-03-08)

Not warning on PASS filter if not defined in header.

v0.12.0 (2019-01-29)

Fixing tests for Python >=3.6
Fixing CI, improving tox integration.
Applying black formatting.
Replacing Makefile with more minimal one.
Removing some linting errors from flake8.
Adding support for reading VCF without FORMAT or any sample column.
Adding support for writing headers and records without FORMAT and any sample columns.

v0.11.2 (2018-04-16)

Removing pip module from setup.py which is not recommended anyway.

v0.11.1 (2018-03-06)

Working around problem in HTSJDK output with incomplete FORMAT fields (#127). Writing out . instead of keeping trailing empty records empty.

v0.11.0 (2017-11-22)

The field FORMAT/FT is now expected to be a semicolon-separated string. Internally, we will handle it as a list.
Switching from warning helper utility code to Python warnings module.
Return str in case of problems with parsing value.
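
The FORMAT/FT handling described above could look roughly like the following sketch. The helper names are hypothetical, and treating `.` as an empty list is an assumption of this example rather than documented vcfpy behavior.

```python
def parse_ft(raw):
    """Split a raw FORMAT/FT value into a list of filter names."""
    if raw in (".", ""):
        return []  # assumption: a missing value becomes an empty list
    return raw.split(";")


def format_ft(filters):
    """Join the filter names back into the semicolon-separated form."""
    return ";".join(filters) if filters else "."


print(parse_ft("q10;s50"))         # ['q10', 's50']
print(format_ft(["q10", "s50"]))   # q10;s50
```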

v0.10.0 (2017-02-27)

Extending API to allow for reading subsets of records. (Writing for sample subsets or reordered samples is possible through using the appropriate names list in the SamplesInfos for the Writer).
Deep-copying header lines and samples infos on Writer construction
Using samples attribute from Header in Reader and Writer instead of passing explicitly

0.9.0 (2017-02-26)

Restructuring of requirements.txt files
Fixing parsing of no-call GT fields
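
For context on the no-call fix: a GT value such as `./.` uses `.` in place of an allele index. A minimal sketch of parsing such values follows; `parse_gt` is a hypothetical helper for illustration and does not reproduce vcfpy's parser.

```python
import re


def parse_gt(gt):
    """Return (alleles, phased); a no-call allele '.' is represented as None."""
    alleles = [None if a == "." else int(a) for a in re.split(r"[/|]", gt)]
    phased = "|" in gt  # '|' separates phased alleles, '/' unphased ones
    return alleles, phased


print(parse_gt("0|1"))  # ([0, 1], True)
print(parse_gt("./."))  # ([None, None], False)
print(parse_gt("1"))    # ([1], False)
```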

0.8.1 (2017-02-08)

PEP8 style adjustments
Using versioneer for versioning
Using requirements*.txt files now from setup.py
Fixing dependency on cyordereddict to be for Python <3.6 instead of <3.5
Jumping by samtools coordinate string now also allowed
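
A samtools-style coordinate string has the form `chrom:begin-end` with 1-based, inclusive positions. The sketch below shows one way to convert such a string into a 0-based, half-open interval; `parse_region` is a hypothetical helper and only handles the full `chrom:begin-end` form.

```python
def parse_region(region):
    """Parse 'chrom:begin-end' (1-based, inclusive) into (chrom, begin0, end)."""
    # Split on the last ':' because contig names may themselves contain colons.
    chrom, sep, interval = region.rpartition(":")
    if not sep or not chrom:
        raise ValueError("expected 'chrom:begin-end', got {!r}".format(region))
    begin_str, sep, end_str = interval.partition("-")
    if not sep:
        raise ValueError("expected 'begin-end', got {!r}".format(interval))
    # Thousands separators such as "1,000" are tolerated and stripped here.
    begin = int(begin_str.replace(",", ""))
    end = int(end_str.replace(",", ""))
    if begin < 1 or end < begin:
        raise ValueError("invalid interval in {!r}".format(region))
    return chrom, begin - 1, end  # 0-based, half-open


print(parse_region("chr1:1,000-2,000"))  # ('chr1', 999, 2000)
```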

0.8.0 (2016-10-31)

Adding Header.has_header_line for querying existence of header line
Header.add_*_line now return a bool indicating any conflicts
Construction of Writer uses samples within header and no extra parameter (breaks API)

0.7.0 (2016-09-25)

Smaller improvements and fixes to documentation
Adding Codacy coverage and static code analysis results to README
Various smaller code cleanup triggered by Codacy results
Adding __eq__, __neq__ and __hash__ to data types (where applicable)

0.6.0 (2016-09-25)

Refining implementation for breakend and symbolic allele class
Removing record.SV_CODES
Refactoring parser module a bit to make the code cleaner
Fixing small typos and problems in documentation

0.5.0 (2016-09-24)

Deactivating warnings on record parsing by default because of performance
Adding validation for INFO and FORMAT fields on reading (#8)
Adding predefined INFO and FORMAT fields to vcfpy.header (#32)

0.4.1 (2016-09-22)

Initially enabling codeclimate

0.4.0 (2016-09-22)

Exporting constants for encoding variant types
Exporting genotype constants HOM_REF, HOM_ALT, HET
Implementing Call.is_phased, Call.is_het, Call.is_variant, Call.is_hom_ref, Call.is_hom_alt
Removing Call.phased (breaks API, next release is 0.4.0)
Adding tests, fixing bugs for methods of Call
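
The genotype constants and `Call` predicates listed above can be pictured with the following sketch. It reuses the constant names from the changelog, but the numeric values, the `classify` helper, and the treatment of no-calls and haploid calls are assumptions made for this example.

```python
# Constant names follow the changelog; the values here are illustrative only.
HOM_REF, HET, HOM_ALT = 0, 1, 2


def classify(alleles):
    """Classify a list of allele indices; None entries stand for no-calls."""
    if any(a is None for a in alleles):
        return None  # assumption: any no-call allele leaves the type undefined
    if all(a == 0 for a in alleles):
        return HOM_REF
    if len(set(alleles)) > 1:
        return HET  # at least two different alleles, e.g. 0/1 or 1/2
    return HOM_ALT  # all alleles identical and non-reference


print(classify([0, 0]) == HOM_REF)  # True
print(classify([0, 1]) == HET)      # True
print(classify([1, 1]) == HOM_ALT)  # True
```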

0.3.1 (2016-09-21)

Work around FORMAT/FT being a string; this is done so in the Delly output

0.3.0 (2016-09-21)

Reader and Writer can now be used as context managers (with the with statement)
Including license in documentation, including Biopython license
Adding support for writing bgzf files (taken from Biopython)
Adding support for parsing arrays in header lines
Removing example-4.1-bnd.vcf example file because v4.1 tumor derival lacks ID field
Adding AltAlleleHeaderLine, MetaHeaderLine, PedigreeHeaderLine, and SampleHeaderLine
Renaming SimpleHeaderFile to SimpleHeaderLine
Warn on missing FILTER entries on parsing
Reordered parameters in from_stream and from_file (#18)
Renamed from_file to from_stream (#18)
Renamed Reader.jump_to to Reader.fetch
Adding header_without_lines function
Generally extending API to make it easier to use
Upgrading dependencies, enabling pyup-bot
Greatly extending documentation

0.2.1 (2016-09-19)

First release on PyPI
