.. Copyright (C) 2008-2013 Canonical Ltd.
This file is part of wadllib.
wadllib is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, version 3 of the License.
wadllib is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with wadllib. If not, see http://www.gnu.org/licenses/.
An Application object represents a web service described by a WADL file.
>>> try:
...     import importlib.resources as importlib_resources
...     _ = importlib_resources.files  # missing on Python 3.8
... except (ImportError, AttributeError):
...     import importlib_resources
>>> import os
>>> from wadllib.application import Application
The first argument to the Application constructor is the URL at which the WADL file was found. The second argument may be raw WADL markup.
>>> def get_test_resource(filename):
...     return importlib_resources.files('wadllib.tests.data').joinpath(
...         filename)

>>> wadl_bytes = get_test_resource('launchpad-wadl.xml').read_bytes()
>>> wadl = Application("http://api.launchpad.dev/beta/", wadl_bytes)
Or the second argument may be an open filehandle containing the markup.
>>> def application_for(filename, url="http://www.example.com/"):
...     with get_test_resource(filename).open('rb') as wadl_stream:
...         return Application(url, wadl_stream)
>>> wadl = application_for("launchpad-wadl.xml",
...                        "http://api.launchpad.dev/beta/")
The preferred technique for finding a resource is to start at one of the resources defined in the WADL file, and follow links. This code retrieves the definition of the root resource.
>>> service_root = wadl.get_resource_by_path('')
>>> service_root.url
'http://api.launchpad.dev/beta/'
>>> service_root.type_url
'#service-root'
The service root resource supports GET.
>>> get_method = service_root.get_method('get')
>>> get_method.id
'service-root-get'

>>> get_method = service_root.get_method('GET')
>>> get_method.id
'service-root-get'
If we want to invoke this method, we send a GET request to the service root URL.
>>> get_method.name
'get'
>>> get_method.build_request_url()
'http://api.launchpad.dev/beta/'
The WADL description of a resource knows which representations are available for that resource. In this case, the service root resource has a JSON representation, and it defines parameters like 'people_collection_link', a link to a list of people in Launchpad. We should be able to use the get_parameter() method to get the WADL definition of the 'people_collection_link' parameter and find out more about it. For instance, is it a link to another resource?
>>> def test_raises(exc_class, method, *args, **kwargs):
...     try:
...         method(*args, **kwargs)
...     except Exception as e:
...         if isinstance(e, exc_class):
...             print(e)
...             return
...         raise
...     raise Exception("Expected exception %s not raised" % exc_class)

>>> from wadllib.application import NoBoundRepresentationError
>>> link_name = 'people_collection_link'
>>> test_raises(
...     NoBoundRepresentationError, service_root.get_parameter, link_name)
Resource is not bound to any representation, and no media media type was specified.
Oops. The code has no way to know whether 'people_collection_link' is a parameter of the JSON representation or some other kind of representation. We can pass a media type to get_parameter and let it know which representation the parameter lives in.
>>> link_parameter = service_root.get_parameter(
...     link_name, 'application/json')
>>> test_raises(NoBoundRepresentationError, link_parameter.get_value)
Resource is not bound to any representation.
Oops again. The parameter is available, but it has no value, because there's no actual data associated with the resource. The browser can look up the description of the GET method to make an actual GET request to the service root, and bind the resulting representation to the WADL description of the service root.
You can't bind just any representation to a WADL resource description. It has to be of a media type understood by the WADL description.
>>> from wadllib.application import UnsupportedMediaTypeError
>>> test_raises(
...     UnsupportedMediaTypeError, service_root.bind,
...     'Some HTML', 'text/html')
This resource doesn't define a representation for media type text/html
The WADL description of the service root resource has a JSON representation. Here it is.
>>> json_representation = service_root.get_representation_definition(
...     'application/json')
>>> json_representation.media_type
'application/json'
We already have a WADL representation of the service root resource, so let's try binding it to that JSON representation. We use test JSON data from a file to simulate the result of a GET request to the service root.
>>> def get_testdata(filename):
...     return get_test_resource(filename + '.json').read_bytes()

>>> def bind_to_testdata(resource, filename):
...     return resource.bind(get_testdata(filename), 'application/json')
The return value is a new Resource object that's "bound" to that JSON test data.
>>> bound_service_root = bind_to_testdata(service_root, 'root')
>>> sorted([param.name for param in bound_service_root.parameters()])
['bugs_collection_link', 'people_collection_link']
>>> sorted(bound_service_root.parameter_names())
['bugs_collection_link', 'people_collection_link']
>>> [method.id for method in bound_service_root.method_iter]
['service-root-get']
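The media-type check involved in binding can be pictured with a small stand-alone sketch. It does not use wadllib and is not how wadllib is implemented; `PARSERS` and `bind` are hypothetical names, and the JSON payload is made up for illustration. The idea is simply that binding first looks up a parser for the media type and refuses anything it does not recognize.

```python
import json

# Hypothetical registry: the media types this sketch knows how to parse.
PARSERS = {"application/json": lambda raw: json.loads(raw.decode("utf-8"))}


def bind(raw, media_type):
    """Parse raw bytes, refusing media types without a registered parser."""
    try:
        parser = PARSERS[media_type]
    except KeyError:
        raise ValueError(
            "No representation defined for media type %s" % media_type)
    return parser(raw)


# Made-up payload standing in for the response to a GET request.
bound = bind(
    b'{"people_collection_link": "http://api.launchpad.dev/beta/people"}',
    "application/json")
```

Once the bytes are parsed, looking up a parameter such as `bound["people_collection_link"]` is an ordinary dictionary access.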
Now the bound resource object has a JSON representation, and now 'people_collection_link' makes sense. We can follow the 'people_collection_link' to a new Resource object.
>>> link_parameter = bound_service_root.get_parameter(link_name)
>>> link_parameter.style
'plain'
>>> print(link_parameter.get_value())
http://api.launchpad.dev/beta/people
>>> personset_resource = link_parameter.linked_resource
>>> personset_resource.__class__
<class 'wadllib.application.Resource'>
>>> print(personset_resource.url)
http://api.launchpad.dev/beta/people
>>> personset_resource.type_url
'http://api.launchpad.dev/beta/#people'
This new resource is a collection of people.
>>> personset_resource.id
'people'
The "collection of people" resource supports a standard GET request as well as a special GET and an overloaded POST. The get_method() method is used to retrieve WADL definitions of the possible HTTP requests you might make. Here's how to get the WADL definition of the standard GET request.
>>> get_method = personset_resource.get_method('get')
>>> get_method.id
'people-get'
The method name passed into get_method() is treated case-insensitively.
>>> personset_resource.get_method('GET').id
'people-get'
To invoke the special GET request, the client sets the 'ws.op' query parameter to the fixed string 'findPerson'.
>>> find_method = personset_resource.get_method(
...     query_params={'ws.op' : 'findPerson'})
>>> find_method.id
'people-findPerson'
Given an end-user's values for the non-fixed parameters, it's possible to get the URL that should be used to invoke the method.
>>> print(find_method.build_request_url(text='foo'))
http://api.launchpad.dev/beta/people?text=foo&ws.op=findPerson

>>> print(find_method.build_request_url(
...     {'ws.op' : 'findPerson', 'text' : 'bar'}))
http://api.launchpad.dev/beta/people?text=bar&ws.op=findPerson
An error occurs if the end-user gives an incorrect value for a fixed parameter value, or omits a required parameter.
>>> find_method.build_request_url()
Traceback (most recent call last):
...
ValueError: No value for required parameter 'text'

>>> find_method.build_request_url(
...     {'ws.op' : 'findAPerson', 'text' : 'foo'})
... # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
Traceback (most recent call last):
...
ValueError: Value 'findAPerson' for parameter 'ws.op' conflicts with fixed value 'findPerson'
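The same rules (fixed values win, required values must be present) can be imitated without wadllib. The following is a minimal sketch, not wadllib's implementation: `build_url` and its `fixed` and `required` arguments are hypothetical names chosen for illustration.

```python
from urllib.parse import urlencode


def build_url(base_url, fixed, required, values):
    """Hypothetical helper: merge values with fixed parameters and encode them."""
    params = dict(values)
    for name, fixed_value in fixed.items():
        # Fill in the fixed value if absent; reject a conflicting one.
        given = params.setdefault(name, fixed_value)
        if given != fixed_value:
            raise ValueError(
                "Value %r for parameter %r conflicts with fixed value %r"
                % (given, name, fixed_value))
    for name in required:
        if name not in params:
            raise ValueError("No value for required parameter %r" % name)
    # Sort for a stable parameter order in the query string.
    return base_url + "?" + urlencode(sorted(params.items()))


url = build_url(
    "http://api.launchpad.dev/beta/people",
    fixed={"ws.op": "findPerson"}, required=["text"],
    values={"text": "foo"})
```

Here the caller never has to mention 'ws.op' at all, which is the point of a fixed parameter: it identifies the operation rather than carrying user input.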
To invoke the overloaded POST request, the client sets the 'ws.op' parameter of the request representation to the fixed string 'newTeam':
>>> create_team_method = personset_resource.get_method(
...     'post', representation_params={'ws.op' : 'newTeam'})
>>> create_team_method.id
'people-newTeam'
get_method() returns None when there's no WADL method matching the name or the fixed parameters.
>>> print(personset_resource.get_method('nosuchmethod'))
None

>>> print(personset_resource.get_method(
...     'post', query_params={'ws_op' : 'nosuchparam'}))
None
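As a rough mental model of this lookup, consider the sketch below. It is a simplification under stated assumptions, not wadllib's matching logic: the `METHODS` table and the `find_method` helper are hypothetical, and each method is reduced to an HTTP verb plus a dictionary of fixed parameter values.

```python
# Hypothetical method table; real definitions come from the WADL file.
METHODS = [
    {"id": "people-get", "name": "get", "fixed": {}},
    {"id": "people-findPerson", "name": "get",
     "fixed": {"ws.op": "findPerson"}},
    {"id": "people-newTeam", "name": "post", "fixed": {"ws.op": "newTeam"}},
]


def find_method(methods, http_method=None, fixed_params=None):
    """Return the first method matching the verb (any case) and fixed values."""
    wanted = fixed_params or {}
    for method in methods:
        if http_method is not None and method["name"] != http_method.lower():
            continue
        if method["fixed"] == wanted:
            return method
    # No candidate matched, so signal that with None rather than an exception.
    return None


plain_get = find_method(METHODS, "GET")
named_get = find_method(METHODS, fixed_params={"ws.op": "findPerson"})
missing = find_method(METHODS, "nosuchmethod")
```

Returning None keeps the lookup cheap to probe: a caller can test several candidate operations without wrapping each attempt in exception handling.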
Let's say the browser makes a GET request to the person set resource and gets back a representation. We can bind that representation to our description of the person set resource.
>>> bound_personset = bind_to_testdata(personset_resource, 'personset')
>>> bound_personset.get_parameter("start").get_value()
0
>>> bound_personset.get_parameter("total_size").get_value()
63
We can keep following links indefinitely, so long as we bind a representation to each resource as we get it, and use the representation to find the next link.
>>> next_page_link = bound_personset.get_parameter("next_collection_link")
>>> print(next_page_link.get_value())
http://api.launchpad.dev/beta/people?ws.start=5&ws.size=5
>>> page_two = next_page_link.linked_resource
>>> bound_page_two = bind_to_testdata(page_two, 'personset-page2')
>>> print(bound_page_two.url)
http://api.launchpad.dev/beta/people?ws.start=5&ws.size=5
>>> bound_page_two.get_parameter("start").get_value()
5
>>> print(bound_page_two.get_parameter("next_collection_link").get_value())
http://api.launchpad.dev/beta/people?ws.start=10&ws.size=5
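The fetch, bind, follow cycle generalizes to a loop. The sketch below shows the shape of that loop without wadllib or a network connection; the `PAGES` dictionary stands in for real GET requests, and `iter_pages`, the example.com URLs, and the page contents are all assumptions made for illustration.

```python
import json

# Hypothetical canned responses standing in for real HTTP GET requests.
PAGES = {
    "http://example.com/people": json.dumps(
        {"start": 0,
         "next_collection_link": "http://example.com/people?ws.start=5"}),
    "http://example.com/people?ws.start=5": json.dumps({"start": 5}),
}


def iter_pages(first_url, fetch):
    """Yield each decoded page, following 'next_collection_link' until absent."""
    url = first_url
    while url is not None:
        page = json.loads(fetch(url))
        yield page
        # The last page has no next link, which ends the loop.
        url = page.get("next_collection_link")


starts = [page["start"]
          for page in iter_pages("http://example.com/people",
                                 PAGES.__getitem__)]
```

Passing `fetch` as an argument keeps the traversal independent of how pages are retrieved, so a real HTTP client could be swapped in for the dictionary lookup.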
Let's say the browser makes a POST request that invokes the 'newTeam' named operation. The response will include a number of HTTP headers, including 'Location', which points the way to the newly created team.
>>> headers = { 'Location' : 'http://api.launchpad.dev/~newteam' }
>>> response = create_team_method.response.bind(headers)
>>> location_parameter = response.get_parameter('Location')
>>> location_parameter.get_value()
'http://api.launchpad.dev/~newteam'
>>> new_team = location_parameter.linked_resource
>>> new_team.url
'http://api.launchpad.dev/~newteam'
>>> new_team.type_url
'http://api.launchpad.dev/beta/#team'
The 'linked_resource' property of a parameter lets you follow a link to another object. The 'link' property of a parameter lets you examine links before following them.
>>> import json
>>> links_wadl = application_for('links-wadl.xml')
>>> service_root = links_wadl.get_resource_by_path('')
>>> representation = json.dumps(
... {'scalar_value': 'foo',
... 'known_link': 'http://known/',
... 'unknown_link': 'http://unknown/'})
>>> bound_root = service_root.bind(representation)
>>> print(bound_root.get_parameter("scalar_value").link)
None
>>> known_resource = bound_root.get_parameter("known_link")
>>> unknown_resource = bound_root.get_parameter("unknown_link")
>>> print(known_resource.link.can_follow)
True
>>> print(unknown_resource.link.can_follow)
False
A link whose type is unknown is a link to a resource not described by WADL. Following this link using .linked_resource or .link.follow will cause a wadllib error. You'll need to follow the link using a general HTTP library or some other tool.
>>> known_resource.link.follow
<wadllib.application.Resource object ...>
>>> known_resource.linked_resource
<wadllib.application.Resource object ...>
>>> from wadllib.application import WADLError
>>> test_raises(WADLError, getattr, unknown_resource.link, 'follow')
Cannot follow a link when the target has no WADL
description. Try using a general HTTP client instead.
>>> test_raises(WADLError, getattr, unknown_resource, 'linked_resource')
Cannot follow a link when the target has no WADL
description. Try using a general HTTP client instead.
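For a link that can't be followed through WADL, a general-purpose client is the fallback. The sketch below uses the standard library's urllib as one such client; `build_plain_get` and `fetch_json` are hypothetical helpers, and the assumption that the target returns JSON is ours, since nothing is known about a resource that has no WADL description.

```python
import json
from urllib.request import Request, urlopen


def build_plain_get(url, accept="application/json"):
    """Prepare a GET request for a link that has no WADL description."""
    return Request(url, headers={"Accept": accept}, method="GET")


def fetch_json(url):
    """Send the request and decode the body; requires network access."""
    with urlopen(build_plain_get(url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; only fetch_json() does.
request = build_plain_get("http://unknown/")
```

Checking `link.can_follow` first and calling something like `fetch_json` only when it is False avoids relying on the exception for control flow.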
Although every representation is a representation of some HTTP resource, an HTTP resource doesn't necessarily correspond directly to a WADL <resource> or <resource_type> tag. Sometimes a representation is defined within a WADL <method> tag.
>>> find_method = personset_resource.get_method(
...     query_params={'ws.op' : 'find'})
>>> find_method.id
'people-find'

>>> representation_definition = (
...     find_method.response.get_representation_definition(
...         'application/json'))
There may be no WADL <resource> or <resource_type> tag for the representation defined here. That's why wadllib makes it possible to instantiate an anonymous Resource object using only the representation definition.
>>> from wadllib.application import Resource
>>> anonymous_resource = Resource(
...     wadl, "http://foo/", representation_definition.tag)
We can bind this resource to a representation, as long as we explicitly pass in the representation definition.
>>> anonymous_resource = anonymous_resource.bind(
...     get_testdata('personset'), 'application/json',
...     representation_definition=representation_definition)
Once the resource is bound to a representation, we can get its parameter values.
>>> print(anonymous_resource.get_parameter(
...     'total_size', 'application/json').get_value())
63
If you happen to have the URL to an object lying around, and you know its type, you can construct a Resource object directly instead of by following links.
>>> from wadllib.application import Resource
>>> limi_person = Resource(wadl, "http://api.launchpad.dev/beta/~limi",
...     "http://api.launchpad.dev/beta/#person")
>>> sorted([method.id for method in limi_person.method_iter])[:3]
['person-acceptInvitationToBeMemberOf', 'person-addMember', 'person-declineInvitationToBeMemberOf']

>>> bound_limi = bind_to_testdata(limi_person, 'person-limi')
>>> sorted(bound_limi.parameter_names())[:3]
['admins_collection_link', 'confirmed_email_addresses_collection_link', 'date_created']
>>> languages_link = bound_limi.get_parameter("languages_collection_link")
>>> print(languages_link.get_value())
http://api.launchpad.dev/beta/~limi/languages
You can bind a Resource to a representation when you create it.
>>> limi_data = get_testdata('person-limi')
>>> bound_limi = Resource(
...     wadl, "http://api.launchpad.dev/beta/~limi",
...     "http://api.launchpad.dev/beta/#person", limi_data,
...     "application/json")
>>> print(bound_limi.get_parameter(
...     "languages_collection_link").get_value())
http://api.launchpad.dev/beta/~limi/languages
By default the representation is treated as a string and processed according to the media type you pass into the Resource constructor. If you've already processed the representation, pass in False for the 'representation_needs_processing' argument.
>>> processed_limi_data = json.loads(limi_data.decode())
>>> bound_limi = Resource(wadl, "http://api.launchpad.dev/beta/~limi",
...     "http://api.launchpad.dev/beta/#person", processed_limi_data,
...     "application/json", False)
>>> print(bound_limi.get_parameter(
...     "languages_collection_link").get_value())
http://api.launchpad.dev/beta/~limi/languages
Most of the time, the representation of a resource is of the type you'd get by sending a standard GET to that resource. If that's not the case, you can specify a RepresentationDefinition as the 'representation_definition' argument to bind() or the Resource constructor, to show what the representation really looks like. Here's an example.
There's a method on a person resource such as bound_limi that's identified by a distinctive query argument: ws.op=findPathToTeam.
>>> method = bound_limi.get_method(
...     query_params={'ws.op' : 'findPathToTeam'})
Invoke this method with a GET request and you'll get back a page from a list of people.
>>> people_page_repr_definition = (
...     method.response.get_representation_definition('application/json'))
>>> people_page_repr_definition.tag.attrib['href']
'http://api.launchpad.dev/beta/#person-page'
As it happens, we have a page from a list of people to use as test data.
>>> people_page_repr = get_testdata('personset')
If we bind the resource to the result of the method invocation the same way we bound representations above, we won't be able to access any of the parameters we'd expect. wadllib will think the representation is of type 'person-full', the default GET type for bound_limi.
>>> bad_people_page = bound_limi.bind(people_page_repr)
>>> print(bad_people_page.get_parameter('total_size'))
None
Since we don't actually have a 'person-full' representation, we won't be able to get values for the parameters of that kind of representation.
>>> bad_people_page.get_parameter('name').get_value()
Traceback (most recent call last):
...
KeyError: 'name'
So that's a dead end. But, if we pass the correct representation type into bind(), we can access the parameters associated with a 'person-page' representation.
>>> people_page = bound_limi.bind(
...     people_page_repr,
...     representation_definition=people_page_repr_definition)
>>> people_page.get_parameter('total_size').get_value()
63
If you invoke the method and ask for a media type other than JSON, you won't get anything.
>>> print(method.response.get_representation_definition('text/html'))
None
The values of date and dateTime parameters are automatically converted to Python datetime objects.
>>> data_type_wadl = application_for('data-types-wadl.xml')
>>> service_root = data_type_wadl.get_resource_by_path('')

>>> representation = json.dumps(
...     {'a_date': '2007-10-20',
...      'a_datetime': '2005-06-06T08:59:51.619713+00:00'})
>>> bound_root = service_root.bind(representation, 'application/json')

>>> bound_root.get_parameter('a_date').get_value()
datetime.datetime(2007, 10, 20, 0, 0)
>>> bound_root.get_parameter('a_datetime').get_value()
datetime.datetime(2005, 6, 6, 8, ...)
A 'date' field can include a timestamp, and a 'datetime' field can omit one. wadllib will turn both into datetime objects.
>>> representation = json.dumps(
...     {'a_date': '2005-06-06T08:59:51.619713+00:00',
...      'a_datetime': '2007-10-20'})
>>> bound_root = service_root.bind(representation, 'application/json')

>>> bound_root.get_parameter('a_datetime').get_value()
datetime.datetime(2007, 10, 20, 0, 0)
>>> bound_root.get_parameter('a_date').get_value()
datetime.datetime(2005, 6, 6, 8, ...)
If a date or dateTime parameter has a null value, you get None. If the value is a string that can't be parsed to a datetime object, you get a ValueError.
>>> representation = json.dumps(
...     {'a_date': 'foo', 'a_datetime': None})
>>> bound_root = service_root.bind(representation, 'application/json')
>>> bound_root.get_parameter('a_date').get_value()
Traceback (most recent call last):
...
ValueError: foo
>>> print(bound_root.get_parameter('a_datetime').get_value())
None
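The three cases (a date, a full timestamp, and a null) can be reproduced with the standard library alone. This is a minimal sketch, not wadllib's parser: `parse_timestamp` is a hypothetical helper built on `datetime.fromisoformat`, and unlike the elided output shown in the doctests it keeps the UTC offset when one is present.

```python
from datetime import datetime


def parse_timestamp(value):
    """Hypothetical helper: None stays None; ISO 8601 strings become datetimes."""
    if value is None:
        return None
    # fromisoformat accepts both 'YYYY-MM-DD' and a full timestamp with an
    # offset, and raises ValueError for strings it cannot parse.
    return datetime.fromisoformat(value)


a_date = parse_timestamp("2007-10-20")
a_datetime = parse_timestamp("2005-06-06T08:59:51.619713+00:00")
a_null = parse_timestamp(None)
```

Treating a bare date as midnight means callers always receive the same type, whichever of the two field kinds the value came from.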
You must provide a representation when invoking certain methods. The build_representation() method helps you build one without knowing the details of how a representation is put together.
>>> create_team_method.build_representation(
...     display_name='Joe Bloggs', name='joebloggs')
('application/x-www-form-urlencoded', 'display_name=Joe+Bloggs&name=joebloggs&ws.op=newTeam')
The return value of build_representation is a 2-tuple containing the media type of the built representation, and the string representation itself. Along with the resource's URL, this is all you need to send the representation to a web server.
>>> bound_limi.get_method('patch').build_representation(name='limi2')
('application/json', '{"name": "limi2"}')
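To make "all you need" concrete, here is one way a media type and body pair might be wrapped in an HTTP request using only the standard library. This is a sketch under assumptions: `build_post` is a hypothetical helper, the pair is built by hand with `urlencode` rather than by wadllib, and the request is constructed but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post(url, media_type, body):
    """Wrap a (media type, body) pair in a POST request without sending it."""
    data = body.encode("utf-8") if isinstance(body, str) else body
    return Request(
        url, data=data, headers={"Content-Type": media_type}, method="POST")


# Stand-in for the 2-tuple that a call like build_representation() returns.
media_type = "application/x-www-form-urlencoded"
body = urlencode(
    [("display_name", "Joe Bloggs"), ("name", "joebloggs"),
     ("ws.op", "newTeam")])
request = build_post("http://api.launchpad.dev/beta/people", media_type, body)
```

Passing the resulting object to `urllib.request.urlopen` would perform the POST; the media type travels in the Content-Type header so the server knows how to decode the body.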
Representations may require values for certain parameters.
>>> create_team_method.build_representation()
Traceback (most recent call last):
...
ValueError: No value for required parameter 'display_name'

>>> bound_limi.get_method('put').build_representation(name='limi2')
Traceback (most recent call last):
...
ValueError: No value for required parameter 'mugshot_link'
Some representations may safely include binary data.
>>> with get_test_resource('multipart-binary-wadl.xml').open(
...         'rb') as binary_stream:
...     binary_wadl = Application(
...         "http://www.example.com/", binary_stream)
>>> service_root = binary_wadl.get_resource_by_path('')
Define a helper that processes the representation the same way zope.publisher would. (We simplify handling of the parsed form data, since we only care about form values and file contents.)
>>> import io
>>> import multipart
>>> def assert_message_parts(media_type, doc, expected):
...     environ = {
...         'wsgi.input': io.BytesIO(doc),
...         'REQUEST_METHOD': 'POST',
...         'CONTENT_TYPE': media_type,
...         'CONTENT_LENGTH': str(len(doc)),
...     }
...     forms, files = multipart.parse_form_data(
...         environ, charset="UTF-8", memfile_limit=0)
...     values = []
...     for _, value in forms.iterallitems():
...         values.append(value)
...     for _, item in files.iterallitems():
...         values.append(item.file.read())
...     assert values == expected, (
...         'Expected %s, got %s' % (expected, values))
>>> method = service_root.get_method('post', 'multipart/form-data')
>>> media_type, doc = method.build_representation(
...     text_field="text", binary_field=b"\x01\x02\r\x81\r")
>>> print(media_type)
multipart/form-data; boundary=...
>>> assert_message_parts(media_type, doc, ['text', b'\x01\x02\r\x81\r'])
>>> method = service_root.get_method('post', 'multipart/form-data')
>>> media_type, doc = method.build_representation(
...     text_field="text\n", binary_field=b"\x01\x02\r\x81\n\r")
>>> print(media_type)
multipart/form-data; boundary=...
>>> assert_message_parts(
...     media_type, doc, ['text\r\n', b'\x01\x02\r\x81\n\r'])
>>> method = service_root.get_method('post', 'multipart/form-data')
>>> media_type, doc = method.build_representation(
...     text_field="text\r\nmore\r\n",
...     binary_field=b"\x01\x02\r\n\x81\r\x82\n")
>>> print(media_type)
multipart/form-data; boundary=...
>>> assert_message_parts(
...     media_type, doc, ['text\r\nmore\r\n', b'\x01\x02\r\n\x81\r\x82\n'])
>>> method = service_root.get_method('post', 'text/unknown')
>>> method.build_representation(field="value")
Traceback (most recent call last):
...
ValueError: Unsupported media type: 'text/unknown'
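To see why a multipart/form-data body can carry raw bytes, it helps to look at how such a body is laid out. The encoder below is a hand-rolled sketch, not wadllib's code: `encode_multipart` is a hypothetical name, it does no newline normalization of text fields, and it assumes field names contain no quotes or line breaks.

```python
import uuid


def encode_multipart(fields):
    """Hypothetical encoder: return (media_type, body) for str and bytes fields."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for name, value in fields.items():
        headers = 'Content-Disposition: form-data; name="%s"' % name
        if isinstance(value, bytes):
            # Binary values are written as-is and marked as an octet stream.
            headers += ('; filename="%s"\r\n'
                        'Content-Type: application/octet-stream' % name)
            payload = value
        else:
            payload = value.encode("utf-8")
        chunks.append(
            delimiter + b"\r\n" + headers.encode("utf-8")
            + b"\r\n\r\n" + payload + b"\r\n")
    # The closing delimiter carries two trailing hyphens.
    chunks.append(delimiter + b"--\r\n")
    return "multipart/form-data; boundary=%s" % boundary, b"".join(chunks)


media_type, doc = encode_multipart(
    {"text_field": "text", "binary_field": b"\x01\x02\r\x81\r"})
```

Because each part is framed by the boundary string rather than by escaping, bytes such as `\r` or `\x81` pass through untouched, provided the boundary itself never occurs in the data.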
Some parameters take values from a predefined list of options.
>>> option_wadl = application_for('options-wadl.xml')
>>> definitions = option_wadl.representation_definitions
>>> service_root = option_wadl.get_resource_by_path('')
>>> definition = definitions['service-root-json']
>>> param = definition.params(service_root)[0]
>>> print(param.name)
has_options
>>> sorted([option.value for option in param.options])
['Value 1', 'Value 2']
Such parameters cannot take values that are not in the list.
>>> definition.validate_param_values(
...     [param], {'has_options': 'Value 1'})
{'has_options': 'Value 1'}

>>> definition.validate_param_values(
...     [param], {'has_options': 'Invalid value'})
Traceback (most recent call last):
...
ValueError: Invalid value 'Invalid value' for parameter 'has_options': valid values are: "Value 1", "Value 2"
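The underlying check is a membership test against the option list. A stand-alone sketch follows; `validate_option` is a hypothetical helper rather than part of wadllib, and its error message merely imitates the format of the ValueError shown in the doctest.

```python
def validate_option(name, value, options):
    """Hypothetical check: accept a value only if it is one of the options."""
    if value not in options:
        listed = ", ".join('"%s"' % option for option in options)
        raise ValueError(
            "Invalid value %r for parameter %r: valid values are: %s"
            % (value, name, listed))
    return value


accepted = validate_option("has_options", "Value 1", ["Value 1", "Value 2"])
```

Listing the valid values in the error message saves the caller a second lookup when a value is rejected.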
You'll get None if you try to look up a nonexistent resource.
>>> print(wadl.get_resource_by_path('nosuchresource'))
None
You'll get an exception if you try to look up a nonexistent resource type.
>>> print(wadl.get_resource_type('#nosuchtype'))
Traceback (most recent call last):
KeyError: 'No such XML ID: "#nosuchtype"'
You'll get None if you try to look up a method whose parameters don't match any defined method.
>>> print(bound_limi.get_method(
...     'post', representation_params={ 'foo' : 'bar' }))
None
legacy-cgi is only a test dependency. Make it an optional dependency.
Add Python 3 compatibility
Add the ability to inspect links before following them.
Ensure that the sample data is packaged.
It's now possible to examine a link before following it, to see whether it has a WADL description or whether it needs to be fetched with a general HTTP client.
It's now possible to iterate over a resource's Parameter objects with the .parameters() method.
Remove unnecessary build dependencies.
Add missing dependencies to setup file.
Remove sys.path hack from setup.py.
Navigate HTTP resources using WADL files as guides.