django-pgcrypto-fields
django-pgcrypto-fields
is a Django
extension which relies upon pgcrypto
to
encrypt and decrypt data for fields.
Requirements
- postgres with
pgcrypto
- Supports Django 2.2.x, 3.0.x, 3.1.x and 3.2.x
- Compatible with Python 3 only
Last version of this library that supports Django
1.8.x, 1.9.x, 1.10.x
was django-pgcrypto-fields
2.2.0.
Last version of this library that supports Django
2.0.x and 2.1.x was
was django-pgcrypto-fields
2.5.2.
Installation
Install package
pip install django-pgcrypto-fields
Django settings
Our library support different crypto keys for multiple databases by
defining the keys in your DATABASES
settings.
In settings.py
:
import os
BASEDIR = os.path.dirname(os.path.dirname(__file__))
PUBLIC_PGP_KEY_PATH = os.path.abspath(os.path.join(BASEDIR, 'public.key'))
PRIVATE_PGP_KEY_PATH = os.path.abspath(os.path.join(BASEDIR, 'private.key'))
PUBLIC_PGP_KEY = open(PUBLIC_PGP_KEY_PATH).read()
PRIVATE_PGP_KEY = open(PRIVATE_PGP_KEY_PATH).read()
PGCRYPTO_KEY='ultrasecret'
DIFF_PUBLIC_PGP_KEY_PATH = os.path.abspath(
os.path.join(BASEDIR, 'tests/keys/public_diff.key')
)
DIFF_PRIVATE_PGP_KEY_PATH = os.path.abspath(
os.path.join(BASEDIR, 'tests/keys/private_diff.key')
)
INSTALLED_APPS = (
'pgcrypto',
)
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql_psycopg2',
'NAME': 'pgcryto_fields',
'USER': 'pgcryto_fields',
'PASSWORD': 'xxxx',
'HOST': 'psql.test.com',
'PORT': 5432,
'OPTIONS': {
'sslmode': 'require',
}
},
'diff_keys': {
'ENGINE': 'django.db.backends.postgresql_psycopg2',
'NAME': 'pgcryto_fields_diff',
'USER': 'pgcryto_fields_diff',
'PASSWORD': 'xxxx',
'HOST': 'psqldiff.test.com',
'PORT': 5432,
'OPTIONS': {
'sslmode': 'require',
},
'PGCRYPTO_KEY': 'djangorocks',
'PUBLIC_PGP_KEY': open(DIFF_PUBLIC_PGP_KEY_PATH, 'r').read(),
'PRIVATE_PGP_KEY': open(DIFF_PRIVATE_PGP_KEY_PATH, 'r').read(),
},
}
Generate GPG keys if using Public Key Encryption
The public key is going to encrypt the message and the private key will be
needed to decrypt the content. The following commands have been taken from the
pgcrypto documentation
(see Generating PGP Keys with GnuPG).
Generating a public and a private key (The preferred key type is "DSA and Elgamal".):
$ gpg --gen-key
$ gpg --list-secret-keys
/home/bob/.gnupg/secring.gpg
---------------------------
sec 2048R/21 2014-10-23
uid Test Key <example@example.com>
ssb 2048R/42 2014-10-23
$ gpg -a --export 42 > public.key
$ gpg -a --export-secret-keys 21 > private.key
Limitations
This library currently does not support Public Key Encryption private keys that are password protected yet. See Issue #89 to help implement it.
Upgrading to 2.4.0 from previous versions
The 2.4.0 version of this library received a large rewrite in order to support
auto-decryption when getting encrypted field data as well as the ability to filter
on encrypted fields without using the old PGPCrypto aggregate functions available
in previous versions.
The following items in this library have been removed and therefore references in
your application to these items need to be removed as well:
managers.PGPManager
admin.PGPAdmin
aggregates.*
Fields
django-pgcrypto-fields
has 3 kinds of fields:
- Hash based fields
- Public Key (PGP) fields
- Symmetric fields
Hash Based Fields
Supported hash based fields are:
TextDigestField
TextHMACField
TextDigestField
is hashed in the database using the digest
pgcrypto function
using the sha512
algorithm.
TextHMACField
is hashed in the database using the hmac
pgcrypto function
using a key and the sha512
algorithm. This is similar to the digest version however
the hash can only be recalculated knowing the key. This prevents someone from altering
the data and also changing the hash to match.
Public Key Encryption Fields
Supported PGP public key fields are:
CharPGPPublicKeyField
EmailPGPPublicKeyField
TextPGPPublicKeyField
DatePGPPublicKeyField
DateTimePGPPublicKeyField
TimePGPPublicKeyField
IntegerPGPPublicKeyField
BigIntegerPGPPublicKeyField
DecimalPGPPublicKeyField
FloatPGPPublicKeyField
Public key encryption creates a token generated with a public key to
encrypt the data and a private key to decrypt it.
Public and private keys can be set in settings with PUBLIC_PGP_KEY
and
PRIVATE_PGP_KEY
.
Symmetric Key Encryption Fields
Supported PGP symmetric key fields are:
CharPGPSymmetricKeyField
EmailPGPSymmetricKeyField
TextPGPSymmetricKeyField
DatePGPSymmetricKeyField
DateTimePGPSymmetricKeyField
TimePGPSymmetricKeyField
IntegerPGPSymmetricKeyField
BigIntegerPGPSymerticKeyField
DecimalPGPSymmetricKeyField
FloatPGPSymmetricKeyField
Encrypt and decrypt the data with settings.PGCRYPTO_KEY
which acts like a password.
Django Model Field Equivalents
Django Field | Public Key Field | Symmetric Key Field |
---|
CharField | CharPGPPublicKeyField | CharPGPSymmetricKeyField |
EmailField | EmailPGPPublicKeyField | EmailPGPSymmetricKeyField |
TextField | TextPGPPublicKeyField | TextPGPSymmetricKeyField |
DateField | DatePGPPublicKeyField | DatePGPSymmetricKeyField |
DateTimeField | DateTimePGPPublicKeyField | DateTimePGPSymmetricKeyField |
TimeField | TimePGPPublicKeyField | TimePGPSymmetricKeyField |
IntegerField | IntegerPGPPublicKeyField | IntegerPGPSymmetricKeyField |
BigIntegerField | BigIntegerPGPPublicKeyField | BigIntegerPGPSymmetricKeyField |
DecimalField | DecimalPGPPublicKeyField | DecimalPGPSymmetricKeyField |
FloatField | FloatPGPPublicKeyField | FloatPGPSymmetricKeyField |
Other Django model fields are not currently supported. Pull requests are welcomed.
Usage
Model Definition
from django.db import models
from pgcrypto import fields
class MyModel(models.Model):
digest_field = fields.TextDigestField()
digest_with_original_field = fields.TextDigestField(original='pgp_sym_field')
hmac_field = fields.TextHMACField()
hmac_with_original_field = fields.TextHMACField(original='pgp_sym_field')
email_pgp_pub_field = fields.EmailPGPPublicKeyField()
integer_pgp_pub_field = fields.IntegerPGPPublicKeyField()
pgp_pub_field = fields.TextPGPPublicKeyField()
date_pgp_pub_field = fields.DatePGPPublicKeyField()
datetime_pgp_pub_field = fields.DateTimePGPPublicKeyField()
time_pgp_pub_field = fields.TimePGPPublicKeyField()
decimal_pgp_pub_field = fields.DecimalPGPPublicKeyField()
float_pgp_pub_field = fields.FloatPGPPublicKeyField()
email_pgp_sym_field = fields.EmailPGPSymmetricKeyField()
integer_pgp_sym_field = fields.IntegerPGPSymmetricKeyField()
pgp_sym_field = fields.TextPGPSymmetricKeyField()
date_pgp_sym_field = fields.DatePGPSymmetricKeyField()
datetime_pgp_sym_field = fields.DateTimePGPSymmetricKeyField()
time_pgp_sym_field = fields.TimePGPSymmetricKeyField()
decimal_pgp_sym_field = fields.DecimalPGPSymmetricKeyField()
float_pgp_sym_field = fields.FloatPGPSymmetricKeyField()
Encrypting
Data is automatically encrypted when inserted into the database.
Example:
>>> MyModel.objects.create(value='Value to be encrypted...')
Hash fields can have hashes auto updated if you use the original
attribute. This
attribute allows you to indicate another field name to base the hash value on.
from django.db import models
from pgcrypto import fields
class User(models.Model):
first_name = fields.TextPGPSymmetricKeyField(max_length=20, verbose_name='First Name')
first_name_hashed = fields.TextHMACField(original='first_name')
In the above example, if you specify the optional original attribute it would
take the unencrypted value from the first_name model field as the input value
to create the hash. If you did not specify an original attribute, the field
would work as it does now and would remain backwards compatible.
PGP fields
When accessing the field name attribute on a model instance we are getting the
decrypted value.
Example:
>>> # When using a PGP public key based encryption
>>> my_model = MyModel.objects.get()
>>> my_model.value
'Value decrypted'
Filtering encrypted values is now handled automatically as of 2.4.0. And aggregate
methods are not longer supported and have been removed from the library.
Also, auto-decryption is support for select_related()
models.
from django.db import models
from pgcrypto import fields
class EncryptedFKModel(models.Model):
fk_pgp_sym_field = fields.TextPGPSymmetricKeyField(blank=True, null=True)
class EncryptedModel(models.Model):
pgp_sym_field = fields.TextPGPSymmetricKeyField(blank=True, null=True)
fk_model = models.ForeignKey(
EncryptedFKModel, blank=True, null=True, on_delete=models.CASCADE
)
Example:
>>> import EncryptedModel
>>> my_model = EncryptedModel.objects.get().select_releated('fk_model')
>>> my_model.pgp_sym_field
'Value decrypted'
>>> my_model.fk_model.fk_pgp_sym_field
'Value decrypted'
Hash fields
To filter hash based values we need to compare hashes. This is achieved by using
a __hash_of
lookup.
Example:
>>> my_model = MyModel.objects.filter(digest_field__hash_of='value')
[<MyModel: MyModel object>]
>>> my_model = MyModel.objects.filter(hmac_field__hash_of='value')
[<MyModel: MyModel object>]
Limitations
.distinct('encrypted_field_name')
Due to a missing feature in the Django ORM, using distinct()
on an encrypted field
does not work for Django 2.0.x and lower.
The normal distinct works on Django 2.1.x and higher:
items = EncryptedFKModel.objects.filter(
pgp_sym_field__startswith='P'
).only(
'id', 'pgp_sym_field', 'fk_model__fk_pgp_sym_field'
).distinct(
'pgp_sym_field'
)
Workaround for Django 2.0.x and lower:
from django.db import models
items = EncryptedFKModel.objects.filter(
pgp_sym_field__startswith='P'
).annotate(
_distinct=models.F('pgp_sym_field')
).only(
'id', 'pgp_sym_field', 'fk_model__fk_pgp_sym_field'
).distinct(
'_distinct'
)
This works because the annotated field is auto-decrypted by Django as a F
field and that
field is used in the distinct()
.
Migrating existing fields into PGCrypto Fields
Migrating existing fields into PGCrypto Fields is not performed by this library. You will need to migrate the data
in a forwards migration or other means. The only migration that is supported except to create/activate the pgcrypto
extension in Postgres.
Migrating data is complicated as there might be a few things to consider such as:
- the shape of the data
- validations/constrains done on the table/model/form and anywhere else
The library has no way of doing all these guesses or to make all these decisions.
If you need to migrate data from unencrypted fields to encrypted fields, three ways to solve it:
- When there's no data in the db it should be possible to start from scratch by recreating the db
- When there's no data in the table it should be possible to recreate the table
- When there's data or if the project is shared it should be possible to do it in a non destructive way
Option 1: No data is in the db
- Drop the database
- Squash the migrations
- Recreate the db
Option 2: No data in the table
- Create a migration to drop the table
- Create a new migration for the table with the encrypted field
- Optionally squash the migration
Option 3: Migrating in a non-destructive way
The goal here is to be able to use to legacy field if something goes wrong.
Part 1:
- Create new field
- When data is saved write both to legacy and new field
- Create a data migration to cast data from legacy field to new field
- check existing data from legacy and new field are the same if possible
Part 2:
- Rename the fields and drop legacy fields
- Update the code to use only the new field
Security Limitations
Taken direction from the PostgreSQL documentation:
https://www.postgresql.org/docs/9.6/static/pgcrypto.html#AEN187024
All pgcrypto functions run inside the database server. That means that all the
data and passwords move between pgcrypto and client applications in clear text. Thus you must:
- Connect locally or use SSL connections.
- Trust both system and database administrator.
If you cannot, then better do crypto inside client application.
The implementation does not resist side-channel attacks. For example, the time
required for a pgcrypto decryption function to complete varies among ciphertexts of
a given size.
CHANGELOG
Master (unreleased)
2.6.0
- Added support for Django 3.1.x
- Updated requirements_dev.txt
- Dropped support for Python 3.5
- Dropped support for Django below 2.2.x LTS release
- Added support for BigIntegerFields (#169)
- Added documentation for migration existing data (#246)
2.5.2
- Added support for Django 3.x
- Updated requirements_dev.txt
2.5.1
- Fixed regression in the definition of EmailPGPPublicKeyField (#77)
- Removed dead code (remove_validators and RemoveMaxLengthValidatorMixin)
- Updated requirements_dev.txt
2.5.0
- Added new DecimalFields for both public and symmetric key (#64)
- Added new FloatFields for both public and symmetric key (#64)
- Added new TimeFields for both public and symmetric key (#64)
- Added support for different keys based on database (#67)
2.4.0
- Added auto-decryption of all encrypted fields including FK tables
- Removed django-pgcrypto-fields
aggregates
, PGPManager
and PGPAdmin
as they are no longer needed - Added support for
get_or_create()
and update_or_create()
(#27) - Added support for
get_by_natural_key()
(#23) - Added support for
only()
and defer()
as they were not supported with PGPManager
- Added support for
distinct()
(Django 2.1+ with workaround available for 2.0 and lower) - Separated out dev requirements from setup.py requirements
- Updated packaging / setup.py to include long description
- Added AUTHORS and updated CONTRIBUTING
- Updated TravisCI to use Xenial to gain Python 3.7 in the matrix
2.3.1
- Added
__range
lookup for Date / DateTime fields (#59) - Remove compatibility for
Django 1.8, 1.9, and 1.10
(#62) - Improved
setup.py
:
- check for Python 3.5+
- updated classifiers
- Improved
make
file for release to use twine
- Added additional shields to
README
- Updated Travis config to include Python 3.5 and 3.6
- Refactored lookups and mixins
2.3.0
- Invalid release, bump to 2.3.1
2.2.0
- Merge
.coveragerc
into setup.cfg
- Added
.gitignore
file - Updated out-dated requirements (latest versions of
Flake8
and pycodestyle
are incompatible with each other) - Updated
README
with better explanations of the fields - Implemented DatePGPPublicKeyField and DateTimePGPPublicKeyField
2.1.1
- Added support for Django 2.x+
- Updated requirements for testing
- Updated travis config with Python 3.6 and additional environments
2.1.0
Thanks to @peterfarrell:
- Add support for
DatePGPSymmetricKeyField
and DateTimePGPSymmetricKeyField
including support for serializing / deserializing django form fields. - Add support for auto decryption of symmetric key and public key fields via
the PGPManager (and support for disabling it in the Django Admin via the PGPAdmin)
2.0.0
- Remove compatibility for
Django 1.7
. - Add compatibility for
Django 1.10
. - Add
Django 1.9
to the travis matrix.
v1.0.1
- Exclude tests app from distributed package.
v1.0.0
- Rename package from
pgcrypto_fields
to pgcrypto
.
v0.7.0
- Make
get_placeholder
accepts a new argument compiler
- Fix buggy import to
Aggregate
Note: these changes have been done for django > 1.8.0.
v0.6.4
- Remove
MaxLengthValidator
from email fields.
v0.6.3
- Avoid setting
max_length
on PGP fields.
v0.6.2
- Allow/check
NULL
values for:
TextDigestField
;
TextHMACField
;
EmailPGPPublicKeyField
;
IntegerPGPPublicKeyField
;
TextPGPPublicKeyField
;
EmailPGPSymmetricKeyField
.
IntegerPGPSymmetricKeyField
.
TextPGPSymmetricKeyField
.
v0.6.1
- Fix
cast
ing bug when sending negative values to integer fields.
v0.6.0
- Add
EmailPGPPublicKeyField
and EmailPGPSymmetricKeyField
.
v0.5.0
- Rename the following fields:
PGPPublicKeyField
to TextPGPPublicKeyField
;
PGPSymmetricKeyField
to TextPGPSymmetricKeyField
;
DigestField
to TextDigestField
;
HMACField
to TextHMACField
. - Add new integer fields:
IntegerPGPPublicKeyField
;
IntegerPGPSymmetricKeyField
.
v0.4.0
- Make accessing decrypted value transparent. Fix bug when field had a string
representation of
memoryview
for PGP and keyed hash fields.
v0.3.1
- Fix
EncryptedProxyField
to select the correct item.
v0.3.0
- Access
PGPPublicKeyField
and PGPSymmetricKeySQL
decrypted values with
field's proxy _decrypted
. - Remove descriptor for field's name and raw value.
v0.2.0
- Add hash based lookup for
DigestField
and HMACField
. - Add
DigestField
, HMACField
, PGPPublicKeyAggregate
, PGPSymmetricKeyAggregate
.
v0.1.0
- Add decryption through an aggregate class.
- Add encryption when inserting data to the database.