userprovided

"Never trust user input!" is also true outside the security context: You cannot be sure users always provide you with valid and well-formatted data.
For a wide range of data, the Python package userprovided:
- checks for validity and plausibility
- normalizes input
- converts into standardized formats
- performs basic security checks
The code has type hints (PEP 484) and provides useful log and error messages.
Userprovided has functionality for the following inputs: parameters, URLs, email addresses, hashes, calendar dates, and geographic coordinates.
Installation
Install userprovided using pip or pip3. For example:
sudo pip3 install userprovided
Consider using a virtualenv instead of a system-wide installation.
To upgrade to the latest version:
sudo pip3 install userprovided --upgrade
Handle Parameters
Check a Parameter Dictionary
If your application accepts parameters in the form of a dictionary, you have to test whether all needed parameters are provided and whether there are any unknown keys (for example, due to typos). There is a function for that:
userprovided.parameters.validate_dict_keys(
dict_to_check = {'a': 1, 'b': 2, 'c': 3},
allowed_keys = {'a', 'b', 'c', 'd'},
necessary_keys = {'b', 'c'})
Returns True if the dictionary dict_to_check contains only allowed keys and all necessary keys are present.
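The check described above boils down to two set comparisons. The following is a minimal standalone sketch of that idea, not the package's implementation; the function name check_dict_keys is hypothetical, and the real function may report problems differently (for example, by raising an exception).

```python
def check_dict_keys(dict_to_check: dict,
                    allowed_keys: set,
                    necessary_keys: set) -> bool:
    """Hypothetical helper: True if only allowed keys occur
    and no necessary key is missing."""
    found_keys = set(dict_to_check)
    # Keys that are present but not allowed (e.g. typos)
    unknown_keys = found_keys - allowed_keys
    # Keys that are required but absent
    missing_keys = necessary_keys - found_keys
    return not unknown_keys and not missing_keys
```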
Avoid Keys without Value in a Dictionary
Check if all keys in a dictionary have a value. Returns False if the value for any key is None or empty. Works for strings (a whitespace-only string counts as empty), dictionaries, lists, tuples, and sets.
parameters.keys_neither_none_nor_empty({'a': 123, 'b': 'example'})
parameters.keys_neither_none_nor_empty({'a': ' ', 'b': 'example'})
parameters.keys_neither_none_nor_empty({'a': None, 'b': 'example'})
parameters.keys_neither_none_nor_empty({'a': list(), 'b': 'example'})
Convert into a Set
Convert a string, a tuple, or a list into a set (i.e. no duplicates, unordered):
userprovided.parameters.convert_to_set(['a', 'b', 'a'])
Parse Separated Strings into a Set
Parse comma-separated (or custom separator) strings into a set of trimmed, non-empty values. This function supports:
- Custom separators (default: comma)
- Quoted fields to include the separator character within values
- Backslash escaping for special characters
- Automatic trimming and deduplication
userprovided.parameters.separated_string_to_set('a, b, c')
userprovided.parameters.separated_string_to_set('"hello, world", foo, bar')
userprovided.parameters.separated_string_to_set('a\\,b, c')
userprovided.parameters.separated_string_to_set('a|b|c', sep='|')
userprovided.parameters.separated_string_to_set('a, , b, ,c')
userprovided.parameters.separated_string_to_set('"a,b",c', allow_quotes=False)
userprovided.parameters.separated_string_to_set(None)
Parameters:
raw_string: The string to parse (or None)
sep: Separator character (default: ',')
allow_quotes: Enable quote parsing (default: True)
quote_char: Quote character (default: '"')
Raises:
ValueError: If separator/quote_char is not a single character, if quote_char equals separator, or if quotes are unclosed
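For the simplest case (no quoting, no escaping), the trimming and deduplication can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: the name simple_split_to_set is hypothetical, and returning an empty set for None is an assumption of this sketch.

```python
from typing import Optional


def simple_split_to_set(raw_string: Optional[str], sep: str = ',') -> set:
    """Hypothetical helper: split on sep, trim each part,
    drop empty parts, and deduplicate via a set.
    Does NOT handle quoted fields or backslash escapes."""
    if raw_string is None:
        # Assumption of this sketch: None yields an empty set
        return set()
    return {part.strip() for part in raw_string.split(sep) if part.strip()}
```

Quoted fields and escapes need a small state machine or a real parser, which is why a dedicated function is preferable to a bare split().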
Check Range of Numbers and Strings
def numeric_in_range(parameter_name,
given_value,
minimum_value,
maximum_value,
fallback_value) -> Union[int, float]
def string_in_range(string_to_check,
minimum_length,
maximum_length,
strip_string: bool = True) -> bool
userprovided.parameters.is_port(443)
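As an illustration of what a port check involves: TCP and UDP port numbers are 16-bit values, so they lie between 0 and 65535. The sketch below is standalone and hypothetical (looks_like_port is not part of the package); whether the package accepts 0 or non-integer input is not stated here, so those choices are assumptions.

```python
def looks_like_port(value: object) -> bool:
    """Hypothetical helper: True for a plain int in the 16-bit port range."""
    # type(...) is int also rejects bool, which is a subclass of int
    return type(value) is int and 0 <= value <= 65535
```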
Check Integer Range
Similar to numeric_in_range, but with strict type checking to ensure all values are exactly integers (not floats). This is useful when you need to guarantee integer types, for example when working with array indices, counts, or IDs.
userprovided.parameters.int_in_range(
parameter_name='user_age',
given_value=25,
minimum_value=0,
maximum_value=120,
fallback_value=18
)
userprovided.parameters.int_in_range(
parameter_name='page_number',
given_value=500,
minimum_value=1,
maximum_value=100,
fallback_value=1
)
userprovided.parameters.int_in_range(
parameter_name='count',
given_value=5.0,
minimum_value=1,
maximum_value=10,
fallback_value=5
)
The function validates that minimum ≤ maximum and that the fallback value is within the allowed range.
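Why strict type checking matters: in Python, 5.0 == 5 is True, so comparing values alone cannot tell a float from an integer. A minimal sketch of a strict test follows; is_strict_int is a hypothetical name, and excluding bool is an assumption of this sketch rather than documented behavior of int_in_range.

```python
def is_strict_int(value: object) -> bool:
    """Hypothetical helper: True only for genuine integers.
    Floats such as 5.0 fail; bool is excluded explicitly
    because it is a subclass of int."""
    return isinstance(value, int) and not isinstance(value, bool)
```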
Enforce Boolean Type
Validates that a parameter is exactly of type bool (True or False), not just a truthy or falsy value. Use this when you need to ensure strict boolean parameters and avoid subtle bugs from implicit type conversions.
userprovided.parameters.enforce_boolean(True)
userprovided.parameters.enforce_boolean(False, parameter_name='debug_mode')
userprovided.parameters.enforce_boolean(1)
userprovided.parameters.enforce_boolean('true')
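The subtle bugs mentioned above typically come from truthiness: bool('false') is True because any non-empty string is truthy. A strict check can be sketched as follows. The name require_bool and the choice of ValueError are assumptions for illustration; the package may signal the error differently.

```python
def require_bool(value: object, parameter_name: str = 'parameter') -> bool:
    """Hypothetical helper: return value unchanged if it is a bool,
    otherwise raise ValueError (exception type chosen for this sketch)."""
    if not isinstance(value, bool):
        raise ValueError(
            f"{parameter_name} must be True or False, "
            f"got {type(value).__name__}")
    return value
```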
Validate AWS S3 Bucket Names
Check if a string complies with AWS S3 bucket naming rules. AWS has strict requirements for bucket names to ensure they work properly across all regions and services.
userprovided.parameters.is_aws_s3_bucket_name('my-valid-bucket-name')
userprovided.parameters.is_aws_s3_bucket_name('192.168.1.1')
userprovided.parameters.is_aws_s3_bucket_name('xn--bucket')
userprovided.parameters.is_aws_s3_bucket_name('bucket-s3alias')
AWS S3 bucket name requirements enforced:
- Length: 3-63 characters
- Allowed characters: lowercase letters, numbers, hyphens, and dots
- Must start and end with a letter or number
- Cannot resemble an IP address (e.g., 192.168.1.1)
- Cannot contain consecutive dots (..) or dot-hyphen combinations (.- or -.)
- Cannot start with reserved prefixes:
xn--, sthree-, amzn-s3-demo-
- Cannot end with reserved suffixes:
-s3alias, --ol-s3, .mrap, --x-s3, --table-s3
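The rules listed above can be expressed compactly with two regular expressions and a few string tests. This is a standalone sketch of exactly those rules, not the package's implementation; looks_like_s3_bucket_name is a hypothetical name.

```python
import re

RESERVED_PREFIXES = ('xn--', 'sthree-', 'amzn-s3-demo-')
RESERVED_SUFFIXES = ('-s3alias', '--ol-s3', '.mrap', '--x-s3', '--table-s3')

# 3-63 characters, allowed characters only,
# first and last character must be a lowercase letter or digit
ALLOWED_SHAPE = re.compile(r'[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]')
# Four dot-separated groups of 1-3 digits look like an IPv4 address
IPV4_LIKE = re.compile(r'\d{1,3}(\.\d{1,3}){3}')


def looks_like_s3_bucket_name(name: str) -> bool:
    """Hypothetical helper applying the rules listed above."""
    if not ALLOWED_SHAPE.fullmatch(name):
        return False
    if IPV4_LIKE.fullmatch(name):
        return False
    if '..' in name or '.-' in name or '-.' in name:
        return False
    if name.startswith(RESERVED_PREFIXES) or name.endswith(RESERVED_SUFFIXES):
        return False
    return True
```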
Handle URLs
Normalize URLs
Normalizing a URL means:
- remove whitespace around it,
- convert scheme and hostname to lowercase,
- remove ports if they are the standard port for the scheme,
- remove duplicate slashes from the path,
- remove fragments (like #foo),
- remove empty elements of the query part,
- order the elements in the query part alphabetically
The optional parameter drop_keys allows you to remove specific keys, like session ids or trackers, from the query part of the URL.
url = ' https://www.Example.com:443//index.py?c=3&a=1&b=2&d='
userprovided.url.normalize_url(url)
userprovided.url.normalize_url(url, drop_keys=['c'])
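To make the normalization steps concrete, here is a rough sketch built only on urllib.parse. It is not the package's implementation: normalize_url_sketch is a hypothetical name, only the default ports for http and https are known to it, and user credentials and IPv6 hosts are not handled.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption of this sketch: only these two default ports are known
DEFAULT_PORTS = {'http': 80, 'https': 443}


def normalize_url_sketch(url: str, drop_keys=()) -> str:
    """Hypothetical helper following the steps listed above."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or '').lower()
    port = parts.port
    # Keep the port only if it is not the default for the scheme
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f'{host}:{port}'
    # Collapse runs of slashes in the path
    path = re.sub(r'/{2,}', '/', parts.path)
    # parse_qsl drops parameters with blank values by default
    pairs = [(key, value) for key, value in parse_qsl(parts.query)
             if key not in drop_keys]
    query = urlencode(sorted(pairs))
    # An empty last element removes the fragment
    return urlunsplit((scheme, netloc, path, query, ''))
```

Applied to the example URL above, this sketch yields 'https://www.example.com/index.py?a=1&b=2&c=3', and 'https://www.example.com/index.py?a=1&b=2' when called with drop_keys=['c'].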
Check URLs
To check whether a string is a valid URL - including a scheme (like https) - use userprovided.url.is_url.
userprovided.url.is_url('https://www.example.com')
userprovided.url.is_url('www.example.com')
You can insist on a specific scheme:
userprovided.url.is_url('https://www.example.com', ('ftp',))
userprovided.url.is_url('ftp://www.example.com', ('ftp',))
To check the URL with an actual connection attempt, you could use the salted library.
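The purely syntactic part of such a check (is there a scheme, is there a host, is the scheme one of those required) can be sketched with urllib.parse. The helper name has_scheme_and_host is hypothetical, and the real is_url may apply further tests.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def has_scheme_and_host(url: str,
                        allowed_schemes: Optional[Tuple[str, ...]] = None
                        ) -> bool:
    """Hypothetical helper: minimal syntactic URL check."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return False
    if allowed_schemes and parts.scheme.lower() not in allowed_schemes:
        return False
    return True
```

Note the trailing comma in a one-element tuple such as ('ftp',): without it, ('ftp') is just the string 'ftp'.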
Check for Shortened URLs
Check whether a URL is from a known URL shortening service. Such URLs can be useful and harmless, but could also be a way for an attacker to disguise the target of a link.
userprovided.url.is_shortened_url('https://bit.ly/example')
userprovided.url.is_shortened_url('https://www.example.com/page')
userprovided.url.is_shortened_url('https://youtu.be/dQw4w9WgXcQ')
This function recognizes a list of 22 popular URL shortening services that allow random targets. By design, it will not recognize platform-specific short URLs like youtu.be as they point to a specific platform (YouTube) rather than arbitrary destinations.
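Conceptually, this is a lookup of the URL's hostname in a fixed list. The sketch below uses a small illustrative subset of shortening services; the package's actual list is longer and may differ, and is_known_shortener is a hypothetical name.

```python
from urllib.parse import urlsplit

# Illustrative subset only, NOT the package's list
KNOWN_SHORTENERS = {'bit.ly', 'tinyurl.com', 'is.gd'}


def is_known_shortener(url: str) -> bool:
    """Hypothetical helper: True if the hostname is in the subset above."""
    host = (urlsplit(url).hostname or '').lower()
    return host in KNOWN_SHORTENERS
```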
Determine a File Extension
Guess the correct filename extension from a URL and/or the MIME type returned by the server.
Sometimes a valid URL does not contain a file extension (like https://www.example.com/), or the extension is ambiguous.
In those cases the MIME type acts as a fallback. If the correct extension cannot be determined at all, it is set to 'unknown'.
userprovided.url.determine_file_extension(
url='https://www.example.com',
provided_mime_type='text/html'
)
userprovided.url.determine_file_extension(
'https://www.example.com/example.pdf',
None
)
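The fallback logic can be sketched with the standard library's mimetypes module. This is an illustration only: guess_extension_sketch is a hypothetical name, it always prefers the extension found in the URL, and returning the extension with a leading dot is a choice of this sketch.

```python
import mimetypes
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlsplit


def guess_extension_sketch(url: str, mime_type: Optional[str] = None) -> str:
    """Hypothetical helper: URL extension first, MIME type as fallback."""
    suffix = PurePosixPath(urlsplit(url).path).suffix
    if suffix:
        return suffix
    if mime_type:
        # Strip parameters such as '; charset=utf-8' before the lookup
        guessed = mimetypes.guess_extension(mime_type.split(';')[0].strip())
        if guessed:
            return guessed
    return 'unknown'
```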
Extract Domain from URL
Extract the domain (hostname) from a URL, with optional subdomain removal. Correctly handles 2-part TLDs like .co.uk and .com.au, and returns IP addresses and localhost unchanged.
userprovided.url.extract_domain('https://www.example.com:8080/path')
userprovided.url.extract_domain('https://www.example.com', drop_subdomain=True)
userprovided.url.extract_domain('https://subdomain.example.co.uk/page', drop_subdomain=True)
userprovided.url.extract_domain('https://www.example.com.au/page', drop_subdomain=True)
userprovided.url.extract_domain('http://192.168.1.1:8080/path', drop_subdomain=True)
userprovided.url.extract_domain('http://localhost:3000', drop_subdomain=True)
Extract TLD from URL
Extract the top-level domain (TLD) from a URL. Correctly identifies 2-part TLDs like .co.uk and .com.au, returning them as a single unit.
userprovided.url.extract_tld('https://www.example.com/path')
userprovided.url.extract_tld('https://example.co.uk')
userprovided.url.extract_tld('https://subdomain.example.com.au/page')
userprovided.url.extract_tld('http://192.168.1.1')
userprovided.url.extract_tld('http://localhost')
Check Email Addresses
userprovided.mail.is_email('example@example.com')
userprovided.mail.is_email('example+test@example.com')
userprovided.mail.is_email('invalid.email')
Hashes
Check Hash Availability
You can check whether a specific hash method is available. This will raise a DeprecatedHashAlgorithm exception for MD5 and SHA1 even if they are available, because they are deprecated.
print(userprovided.hash.hash_available('md5'))
print(userprovided.hash.hash_available('sha256'))
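The underlying idea can be sketched with hashlib.algorithms_available. In this standalone sketch, hash_usable is a hypothetical name and ValueError stands in for the package's DeprecatedHashAlgorithm exception.

```python
import hashlib

DEPRECATED_ALGORITHMS = {'md5', 'sha1'}


def hash_usable(hash_method: str) -> bool:
    """Hypothetical helper: reject deprecated algorithms,
    then ask hashlib whether the method exists on this system."""
    method = hash_method.lower()
    if method in DEPRECATED_ALGORITHMS:
        # Stand-in for the package's own exception type
        raise ValueError(f'{method} is deprecated')
    return method in hashlib.algorithms_available
```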
Calculate a File Hash
You can calculate hash sums for files. If you do not provide the method, this defaults to SHA256. Other supported methods are SHA224 and SHA512.
userprovided.hash.calculate_file_hash(pathlib.Path('./foo.txt'))
If you provide an expected value for the hash, you can check for file changes or tampering. If the provided value and the calculated hash do not match, a ValueError exception is raised.
userprovided.hash.calculate_file_hash(
file_path = pathlib.Path('./foo.txt'),
hash_method = 'sha512',
expected_hash = 'not_the_right_value')
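For readers curious how such a check works, here is a minimal sketch using hashlib that reads the file in chunks so large files do not have to fit into memory. It is not the package's implementation; file_hash_sketch is a hypothetical name, and the case-insensitive comparison is a choice of this sketch.

```python
import hashlib
import pathlib
from typing import Optional


def file_hash_sketch(file_path: pathlib.Path,
                     hash_method: str = 'sha256',
                     expected_hash: Optional[str] = None) -> str:
    """Hypothetical helper: hash a file and optionally verify the result."""
    digest = hashlib.new(hash_method)
    with open(file_path, 'rb') as handle:
        # Read 64 KiB at a time until read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(65536), b''):
            digest.update(chunk)
    calculated = digest.hexdigest()
    if expected_hash is not None and calculated != expected_hash.lower():
        raise ValueError('Calculated hash does not match the expected hash.')
    return calculated
```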
Calculate String Hash
Compute a deterministic hash of string data for non-security use cases such as fingerprints, cache keys, or content de-duplication.
userprovided.hash.calculate_string_hash('example data')
userprovided.hash.calculate_string_hash('example data', hash_method='sha512')
userprovided.hash.calculate_string_hash('example data', encoding='utf-8')
Important Security Warning: Do NOT use this function for:
- Password storage
- Message integrity/authenticity
- Anything needing resistance to brute force or active attackers
This is a generic hash utility for non-security scenarios only. For security-sensitive applications, use proper cryptographic libraries with salting, key derivation functions (like bcrypt, scrypt, or Argon2), or HMAC.
The function supports the same hash methods as calculate_file_hash: SHA224, SHA256 (default), SHA384, SHA512, SHA3 variants, BLAKE2 variants, and other algorithms available in hashlib, but rejects deprecated algorithms (MD5, SHA1).
Handle Calendar Dates
Check Date Existence
Does a specific date exist?
userprovided.date.date_exists(2020, 2, 31)
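One common way to answer this question in plain Python is to let datetime.date attempt to construct the date. The sketch below shows that technique with integer arguments; date_exists_sketch is a hypothetical name and not necessarily how the package does it.

```python
import datetime


def date_exists_sketch(year: int, month: int, day: int) -> bool:
    """Hypothetical helper: True if the calendar date can be constructed."""
    try:
        datetime.date(year, month, day)
    except ValueError:
        # Raised for impossible dates such as February 31
        return False
    return True
```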
Normalize Long Form Dates
Normalize German or English long form dates:
userprovided.date.date_en_long_to_iso('October 3, 1990')
userprovided.date.date_de_long_to_iso('3. Oktober 1990')
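To illustrate what such a normalization involves, the following sketch converts a German long form date to ISO 8601 (YYYY-MM-DD) with a month lookup table. It is a simplified stand-in with a hypothetical name, german_long_date_to_iso, and accepts only full month names.

```python
import datetime
import re

GERMAN_MONTHS = {
    'januar': 1, 'februar': 2, 'märz': 3, 'april': 4,
    'mai': 5, 'juni': 6, 'juli': 7, 'august': 8,
    'september': 9, 'oktober': 10, 'november': 11, 'dezember': 12}

# Day, dot, month name, four-digit year; surrounding whitespace tolerated
GERMAN_LONG_DATE = re.compile(r'\s*(\d{1,2})\.\s*([A-Za-zäÄ]+)\s+(\d{4})\s*')


def german_long_date_to_iso(date_string: str) -> str:
    """Hypothetical helper: '3. Oktober 1990' becomes '1990-10-03'."""
    match = GERMAN_LONG_DATE.fullmatch(date_string)
    if not match:
        raise ValueError(f'Not a German long form date: {date_string!r}')
    day, month_name, year = match.groups()
    month = GERMAN_MONTHS.get(month_name.lower())
    if month is None:
        raise ValueError(f'Unknown month name: {month_name!r}')
    # datetime.date also rejects impossible dates such as 31. Februar
    return datetime.date(int(year), month, int(day)).isoformat()
```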
Validate Geographic Coordinates
Check if latitude and longitude values are within valid Earth ranges. This validates that coordinates are mathematically possible, not whether they point to land, sea, or a specific feature.
userprovided.geo.is_valid_coordinates(48.8566, 2.3522)
userprovided.geo.is_valid_coordinates(45, 181)
userprovided.geo.is_valid_coordinates('51.5074', '-0.1278')
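The valid ranges are -90 to 90 degrees for latitude and -180 to 180 degrees for longitude. A minimal sketch of such a range check follows; coordinates_in_range is a hypothetical name, and accepting numeric strings by converting them with float() is an assumption of this sketch.

```python
def coordinates_in_range(latitude, longitude) -> bool:
    """Hypothetical helper: True if both values are valid decimal degrees."""
    try:
        lat = float(latitude)
        lon = float(longitude)
    except (TypeError, ValueError):
        # Not convertible to a number at all
        return False
    # NaN fails both comparisons and is therefore rejected as well
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
```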
Update and Deprecation Policy
- No breaking changes in micro-versions.
- It makes no sense to duplicate functionality already available in the Python Standard Library. Therefore, if this package contains functionality that becomes superseded by the Standard Library, it will start to log a deprecation warning. The functionality itself is planned to stay available for at least one major version of userprovided and as long as Python versions not containing this functionality are supported.