drf-partial-response
Implements Google's partial response in Django RestFramework (Fork from Zapier's package and upgrade version)
Forked from https://github.com/zapier/django-rest-framework-jsonmask
Implements Google's Partial Response in Django RestFramework
Install using pip
...
$ pip install drf-partial-response
Most DRF addons that support ?fields=-style data pruning do so purely at the serialization layer. Many hydrate full ORM objects, including all of their verbose relationships, and then cut unwanted data immediately before JSON serialization. Any unwanted related data is still fetched from the database and hydrated into Django ORM objects, which severely undermines the usefulness of field pruning.
drf_partial_response aims to do one better by allowing developers to declaratively augment their queryset in direct relation to individual requests. Under this pattern, you declare only the base queryset and any universal relationships on your ViewSet.queryset, leaving all additional enhancements as runtime opt-ins.
To use drf_partial_response, first include its ViewSet and Serializer mixins in your code where appropriate. The following examples are taken from the mini-project used in this library's own unit tests.
# api/views.py
from rest_framework import viewsets
from drf_partial_response.views import OptimizedQuerySetMixin
# `data_predicate` must also be imported from drf_partial_response;
# its module path is not shown in this README.

class TicketViewSet(OptimizedQuerySetMixin, viewsets.ReadOnlyModelViewSet):

    # Normally, for optimal performance, you would apply the `select_related('author')`
    # call to the base queryset, but that is no longer desirable for data relationships
    # that your frontend may stop asking for.
    queryset = Ticket.objects.all()
    serializer_class = TicketSerializer

    # Data-predicate declaration is optional, but encouraged. This
    # is where the library really shines!
    @data_predicate('author')
    def load_author(self, queryset):
        return queryset.select_related('author')
# api/serializers.py
from rest_framework import serializers
from drf_partial_response.serializers import FieldsListSerializerMixin

class TicketSerializer(FieldsListSerializerMixin, serializers.ModelSerializer):

    # Aside from the mixin, everything else is exactly like normal
    author = UserSerializer()

    class Meta:
        # DRF reads the `model` attribute (singular) from Meta
        model = my_module.models.Ticket
        fields = ('id', 'title', 'body', 'author',)
You have now set up your API to skip unnecessary joins (and possibly prefetches), unless the requesting client requires that data. Let's consider a few hypothetical requests and the responses they will each receive. (For brevity, in each of these examples, I will pretend pagination is turned off.)
GET /api/tickets/
200 OK
[
    {
        "id": 1,
        "title": "This is a ticket",
        "body": "This is its text",
        "author": {
            "id": 5,
            "username": "HomerSimpson"
        }
    }
]
Because no ?fields querystring parameter was provided, author records were still loaded and serialized as normal.
Note: drf_partial_response treats all requests that lack any field definition as if all possible data is requested, and thus executes all data predicates. In the above example, author data was loaded via select_related('author'), not via N+1 queries.
GET /api/tickets/?fields=id,title,body
200 OK
[
    {
        "id": 1,
        "title": "This is a ticket",
        "body": "This is its text"
    }
]
In this example, since author was not specified, it was omitted from the response payload and, more importantly, never queried for or serialized in the first place.
GET /api/tickets/?fields=id,title,body,author/username
200 OK
[
    {
        "id": 1,
        "title": "This is a ticket",
        "body": "This is its text",
        "author": {
            "username": "HomerSimpson"
        }
    }
]
In this example, author data was loaded because the ?fields declaration asked for it, but no unwanted keys appear in the response.
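To make the ?fields syntax concrete, here is a minimal sketch of how such an expression could be turned into a nested mask. It handles the slash form used above (author/username) and the parenthesised sub-selection form from Google's partial response syntax. The names parse_fields, _parse_item, _parse_list and _merge are hypothetical; this is not the parser drf_partial_response actually uses.

```python
def _merge(target, source):
    """Recursively fold `source` into `target` (both nested dict masks)."""
    for key, value in source.items():
        _merge(target.setdefault(key, {}), value)


def _parse_item(expr, pos):
    """Parse `name`, `name/item`, or `name(list)`; return (name, submask, pos)."""
    start = pos
    while pos < len(expr) and expr[pos] not in ",/()":
        pos += 1
    name = expr[start:pos].strip()
    if not name:
        raise ValueError(f"expected a field name at position {start}")
    sub = {}
    if pos < len(expr) and expr[pos] == "/":
        # `a/b` selects only `b` inside `a`
        child_name, child_sub, pos = _parse_item(expr, pos + 1)
        sub = {child_name: child_sub}
    elif pos < len(expr) and expr[pos] == "(":
        # `a(b,c)` selects several fields inside `a`
        sub, pos = _parse_list(expr, pos + 1)
        if pos >= len(expr) or expr[pos] != ")":
            raise ValueError("missing closing parenthesis")
        pos += 1
    return name, sub, pos


def _parse_list(expr, pos):
    """Parse a comma-separated list of items; return (mask, pos)."""
    mask = {}
    while True:
        name, sub, pos = _parse_item(expr, pos)
        _merge(mask.setdefault(name, {}), sub)
        if pos < len(expr) and expr[pos] == ",":
            pos += 1
        else:
            return mask, pos


def parse_fields(expr):
    """Return a nested dict mask; an empty dict means "the whole field"."""
    mask, pos = _parse_list(expr, 0)
    if pos != len(expr):
        raise ValueError(f"unexpected {expr[pos]!r} at position {pos}")
    return mask


if __name__ == "__main__":
    print(parse_fields("id,title,body,author/username"))
```

A nested dict is a convenient target because each level can be handed to the matching nested serializer unchanged. This sketch does not try to resolve conflicting requests such as author,author/username in one expression.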
This is all good and fun, but what if author has rarely used but expensive relationships, too? drf_partial_response supports this, via the exact same mechanisms spelled out above, though sometimes a little extra attention to detail can be important. Let's now imagine that AuthorSerializer looks like this:
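from django.contrib.auth import get_user_model

class AuthorSerializer(FieldsListSerializerMixin, serializers.ModelSerializer):
    accounts = AccountSerializer(many=True)

    class Meta:
        # Meta.model must be a model class; settings.AUTH_USER_MODEL is only a string
        model = get_user_model()
        fields = ('id', 'username', 'email', 'photo', 'accounts', ...)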
Naturally, if accounts is sensitive, internal data, you simply might not use this serializer for external API consumption. That would also settle the question of whether to serialize accounts data -- the supplied serializer would know nothing about that field! But let's pretend that in our case accounts is safe for public consumption, and some ticketing API calls require it for triaging purposes, whereas others do not. In such a situation, we'll redefine our ViewSet like so:
class TicketViewSet(OptimizedQuerySetMixin, viewsets.ReadOnlyModelViewSet):

    queryset = Ticket.objects.all()
    serializer_class = TicketSerializer

    @data_predicate('author')
    def load_author(self, queryset):
        return queryset.select_related('author')

    # Add this extra data_predicate, which prefetches `accounts` if and only if
    # the request promises to use that information
    @data_predicate('author.accounts')
    def load_author_with_accounts(self, queryset):
        return queryset.select_related('author').prefetch_related('author__accounts')
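One way to picture how dotted predicate paths such as 'author.accounts' relate to a request is the sketch below. It assumes the requested fields have already been parsed into a nested dict mask, with None meaning that no ?fields parameter was sent and an empty dict meaning "everything under this field". The names is_requested, PREDICATES and predicates_to_run are hypothetical, and the library's real dispatch logic may differ.

```python
def is_requested(mask, dotted_path):
    """Return True if the relationship at `dotted_path` is covered by `mask`."""
    if mask is None:
        # No ?fields parameter: everything is considered requested
        return True
    node = mask
    for part in dotted_path.split("."):
        if not node:
            # Empty dict: the whole subtree below this point was requested
            return True
        if part not in node:
            return False
        node = node[part]
    return True


# Hypothetical registry: predicate path -> description of the queryset change
PREDICATES = {
    "author": "select_related('author')",
    "author.accounts": "prefetch_related('author__accounts')",
}


def predicates_to_run(mask):
    """List the predicate paths whose data the request will actually use."""
    return [path for path in PREDICATES if is_requested(mask, path)]


if __name__ == "__main__":
    print(predicates_to_run({"id": {}, "title": {}, "author": {"username": {}}}))
```

Under these assumptions, a request for author(username,photo) triggers only the author predicate, while a bare author or a missing ?fields parameter triggers both.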
Now, it is up to the client to decide which of the following options (or anything else imaginable) is most appropriate:
# Includes specified local fields plus all author fields and relationships
GET /api/tickets/?fields=id,title,author
200 OK
[
    {
        "id": 1,
        "title": "This is a ticket",
        "author": {
            "id": 5,
            "username": "HomerSimpson",
            "accounts": [
                {"all_fields": "will_be_present"}
            ]
        }
    }
]
or
# Includes specified local fields plus specified author fields and relationships
GET /api/tickets/?fields=id,title,author(username,photo)
200 OK
[
    {
        "id": 1,
        "title": "This is a ticket",
        "author": {
            "username": "HomerSimpson",
            "photo": "image_url"
        }
    }
]
or
# Includes specified local fields plus specified author fields and relationships plus specified accounts fields and relationships
GET /api/tickets/?fields=id,title,author(id,accounts(id,type_of,date))
200 OK
[
    {
        "id": 1,
        "title": "This is a ticket",
        "author": {
            "id": 5,
            "accounts": [
                {
                    "id": 8001,
                    "type_of": "business",
                    "date": "2018-01-01T12:00:00Z"
                },
                {
                    "id": 6500,
                    "type_of": "trial",
                    "date": "2017-06-01T12:00:00Z"
                }
            ]
        }
    }
]
In short, as long as the entire chain of Serializers implements the FieldsListSerializerMixin, arbitrarily deep nesting of ?fields declarations will be honored. In practice, however, because relationships are expensive to hydrate, you will probably want to limit that information and control what data you actually load using the @data_predicate decorator on ViewSet methods.
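The effect of arbitrarily deep nesting on the response shape can be illustrated with a small recursive function over plain dicts and lists. This is only a sketch of the output-shaping half of the problem, assuming a nested dict mask in which an empty dict means "keep the whole value". The name prune is hypothetical, and the mixin itself works on serializers rather than on already-serialized data.

```python
def prune(data, mask):
    """Keep only the keys of `data` named in `mask`, recursing into nested values."""
    if not mask:
        # None or empty dict: nothing to filter at this level
        return data
    if isinstance(data, list):
        # Apply the same mask to every element of a to-many relationship
        return [prune(item, mask) for item in data]
    if isinstance(data, dict):
        return {
            key: prune(value, mask[key])
            for key, value in data.items()
            if key in mask
        }
    return data


if __name__ == "__main__":
    ticket = {
        "id": 1,
        "title": "This is a ticket",
        "body": "This is its text",
        "author": {"id": 5, "username": "HomerSimpson"},
    }
    print(prune([ticket], {"id": {}, "title": {}, "author": {"username": {}}}))
```

Filtering output this way does nothing to reduce database work by itself, which is why the data predicates remain the part that controls what is actually loaded.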
Run the tests with:
$ make tests
or keep them running on change:
$ make watch
You can also use the excellent tox testing tool to run the tests against all supported versions of Python and Django. Install tox globally, and then simply run:
$ tox
To build the documentation, run:
$ make docs
FAQs
We found that drf-partial-response demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.