Security News
Research
Data Theft Repackaged: A Case Study in Malicious Wrapper Packages on npm
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
:warning: Ensure that your schema definitions come from internal or trusted sources. Yamale does not protect against intentionally malicious schemas. |
---|
A schema and validator for YAML.
What's YAML? See the current spec here and an introduction to the syntax here.
$ pip install yamale
NOTE: Some platforms, e.g., Mac OS, may ship with only Python 2 and may not have pip installed. Installation of Python 3 should also install pip. To preserve any system dependencies on default software, consider installing Python 3 as a local package. Please note replacing system-provided Python may disrupt other software. Mac OS users may wish to investigate MacPorts, homebrew, or building Python 3 from source; in all three cases, Apple's Command Line Tools (CLT) for Xcode may be required. See also developers, below.
python setup.py install
(may have to prepend sudo
)Yamale can be run from the command line to validate one or many YAML files. Yamale will search the directory you supply (current directory is default) for YAML files. Each YAML file it finds it will look in the same directory as that file for its schema, if there is no schema Yamale will keep looking up the directory tree until it finds one. If Yamale can not find a schema it will tell you.
Usage:
usage: yamale [-h] [-s SCHEMA] [-n CPU_NUM] [-p PARSER] [--no-strict] [PATH]
Validate yaml files.
positional arguments:
PATH folder to validate. Default is current directory.
optional arguments:
-h, --help show this help message and exit
-s SCHEMA, --schema SCHEMA
filename of schema. Default is schema.yaml.
-n CPU_NUM, --cpu-num CPU_NUM
number of CPUs to use. Default is 4.
-p PARSER, --parser PARSER
YAML library to load files. Choices are "ruamel" or
"pyyaml" (default).
--no-strict Disable strict mode, unexpected elements in the data
will be accepted.
There are several ways to feed Yamale schema and data files. The simplest way is to let Yamale take care of reading and parsing your YAML files.
All you need to do is supply the files' path:
# Import Yamale and make a schema object:
import yamale
schema = yamale.make_schema('./schema.yaml')
# Create a Data object
data = yamale.make_data('./data.yaml')
# Validate data against the schema. Throws a ValueError if data is invalid.
yamale.validate(schema, data)
You can pass a string of YAML to make_schema()
and make_data()
instead of passing a file path
by using the content=
parameter:
data = yamale.make_data(content="""
name: Bill
age: 26
height: 6.2
awesome: True
""")
If data
is valid, nothing will happen. However, if data
is invalid Yamale will throw a
YamaleError
with a message containing all the invalid nodes:
try:
yamale.validate(schema, data)
print('Validation success! 👍')
except ValueError as e:
print('Validation failed!\n%s' % str(e))
exit(1)
and an array of ValidationResult
.
try:
yamale.validate(schema, data)
print('Validation success! 👍')
except YamaleError as e:
print('Validation failed!\n')
for result in e.results:
print("Error validating data '%s' with '%s'\n\t" % (result.data, result.schema))
for error in result.errors:
print('\t%s' % error)
exit(1)
You can also specify an optional parser
if you'd like to use the ruamel.yaml
(YAML 1.2 support) instead:
# Import Yamale and make a schema object, make sure ruamel.yaml is installed already.
import yamale
schema = yamale.make_schema('./schema.yaml', parser='ruamel')
# Create a Data object
data = yamale.make_data('./data.yaml', parser='ruamel')
# Validate data against the schema same as before.
yamale.validate(schema, data)
:warning: Ensure that your schema definitions come from internal or trusted sources. Yamale does not protect against intentionally malicious schemas. |
---|
To use Yamale you must make a schema. A schema is a valid YAML file with one or more documents
inside. Each node terminates in a string which contains valid Yamale syntax. For example, str()
represents a String validator.
A basic schema:
name: str()
age: int(max=200)
height: num()
awesome: bool()
And some YAML that validates:
name: Bill
age: 26
height: 6.2
awesome: True
Take a look at the Examples section for more complex schema ideas.
Schema files may contain more than one YAML document (nodes separated by ---
). The first document
found will be the base schema. Any additional documents will be treated as Includes. Includes allow
you to define a valid structure once and use it several times. They also allow you to do recursion.
A schema with an Include validator:
person1: include('person')
person2: include('person')
---
person:
name: str()
age: int()
Some valid YAML:
person1:
name: Bill
age: 70
person2:
name: Jill
age: 20
Every root node not in the first YAML document will be treated like an include:
person: include('friend')
group: include('family')
---
friend:
name: str()
family:
name: str()
Is equivalent to:
person: include('friend')
group: include('family')
---
friend:
name: str()
---
family:
name: str()
You can get recursion using the Include validator.
This schema:
person: include('human')
---
human:
name: str()
age: int()
friend: include('human', required=False)
Will validate this data:
person:
name: Bill
age: 50
friend:
name: Jill
age: 20
friend:
name: Will
age: 10
After you construct a schema you can add extra, external include definitions by calling
schema.add_include(dict)
. This method takes a dictionary and adds each key as another include.
By default Yamale will provide errors for extra elements present in lists and maps that are not
covered by the schema. With strict mode disabled (using the --no-strict
command line option),
additional elements will not cause any errors. In the API, strict mode can be toggled by passing
the strict=True/False flag to the validate function.
It is possible to mix strict and non-strict mode by setting the strict=True/False flag in the include validator, setting the option only for the included validators.
Here are all the validators Yamale knows about. Every validator takes a required
keyword telling
Yamale whether or not that node must exist. By default every node is required. Example: str(required=False)
You can also require that an optional value is not None
by using the none
keyword. By default
Yamale will accept None
as a valid value for a key that's not required. Reject None
values
with none=False
in any validator. Example: str(required=False, none=False)
.
Some validators take keywords and some take arguments, some take both. For instance the enum()
validator takes one or more constants as arguments and the required
keyword:
enum('a string', 1, False, required=False)
str(min=int, max=int, equals=string, starts_with=string, ends_with=string, matches=regex, exclude=string, ignore_case=False, multiline=False, dotall=False)
Validates strings.
min
: len(string) >= minmax
: len(string) <= maxequals
: string == value (add ignore_case=True
for case-insensitive checking)starts_with
: Accepts only strings starting with given value (add ignore_case=True
for
case-insensitive checking)matches
: Validates the string against a given regex. Similar to the regex()
validator,
you can use ignore_case
, multiline
and dotall
)ends_with
: Accepts only strings ending with given value (add ignore_case=True
for case-insensitive checking)exclude
: Rejects strings that contains any character in the excluded valueignore_case
: Validates strings in a case-insensitive manner.multiline
: ^
and $
in a pattern match at the beginning and end of each line in a string
in addition to matching at the beginning and end of the entire string. (A pattern matches
at the beginning of a string even in
multiline mode; see below for a workaround.); only allowed in conjunction with a matches
keyword.dotall
: .
in a pattern matches newline characters in a validated string in addition to
matching every character that isn't a newline.; only allowed in conjunction with a matches
keyword.Examples:
str(max=10, exclude='?!')
: Allows only strings less than 11 characters that don't contain ?
or !
.regex([patterns], name=string, ignore_case=False, multiline=False, dotall=False)
Validates strings against one or more regular expressions.
name
: A friendly description for the patterns.ignore_case
: Validates strings in a case-insensitive manner.multiline
: ^
and $
in a pattern match at the beginning and end of each line in a string
in addition to matching at the beginning and end of the entire string. (A pattern matches
at the beginning of a string even in
multiline mode; see below for a workaround.)dotall
: .
in a pattern matches newline characters in a validated string in addition to
matching every character that isn't a newline.Examples:
regex('^[^?!]{,10}$')
: Allows only strings less than 11 characters that don't contain ?
or !
.regex(r'^(\d+)(\s\1)+$', name='repeated natural')
: Allows only strings that contain two or
more identical digit sequences, each separated by a whitespace character. Non-matching strings
like sugar
are rejected with a message like 'sugar' is not a repeated natural.
regex('.*^apples$', multiline=True, dotall=True)
: Allows the string apples
as well
as multiline strings that contain the line apples
.int(min=int, max=int)
Validates integers.
min
: int >= minmax
: int <= maxnum(min=float, max=float)
Validates integers and floats.
min
: num >= minmax
: num <= maxbool()
Validates booleans.
null()
Validates null values.
enum([primitives])
Validates from a list of constants.
Examples:
enum('a string', 1, False)
: a value can be either 'a string'
, 1
or False
day(min=date, max=date)
Validates a date in the form of YYYY-MM-DD.
min
: date >= minmax
: date <= maxExamples:
day(min='2001-01-01', max='2100-01-01')
: Only allows dates between 2001-01-01 and 2100-01-01.timestamp(min=time, max=time)
Validates a timestamp in the form of YYYY-MM-DD HH:MM:SS.
min
: time >= minmax
: time <= maxExamples:
timestamp(min='2001-01-01 01:00:00', max='2100-01-01 23:00:00')
: Only allows times between
2001-01-01 01:00:00 and 2100-01-01 23:00:00.list([validators], min=int, max=int)
Validates lists. If one or more validators are passed to list()
only nodes that pass at
least one of those validators will be accepted.
min
: len(list) >= minmax
: len(list) <= maxExamples:
list()
: Validates any listlist(include('custom'), int(), min=4)
: Only validates lists that contain the custom
include
or integers and contains a minimum of 4 items.map([validators], key=validator, min=int, max=int)
Validates maps. Use when you want a node to contain freeform data. Similar to List
, Map
takes
one or more validators to run against the values of its nodes, and only nodes that pass at least
one of those validators will be accepted. By default, only the values of nodes are validated and
the keys aren't checked.
key
: A validator for the keys of the map.min
: len(map) >= minmax
: len(map) <= maxExamples:
map()
: Validates any mapmap(str(), int())
: Only validates maps whose values are strings or integers.map(str(), key=int())
: Only validates maps whose keys are integers and values are strings. 1: one
would be valid but '1': one
would not.map(str(), min=1)
: Only validates a non-empty map.ip()
Validates IPv4 and IPv6 addresses.
version
: 4 or 6; explicitly force IPv4 or IPv6 validationExamples:
ip()
: Allows any valid IPv4 or IPv6 addressip(version=4)
: Allows any valid IPv4 addressip(version=6)
: Allows any valid IPv6 addressmac()
Validates MAC addresses.
Examples:
mac()
: Allows any valid MAC addresssemver()
Validates Semantic Versioning strings.
Examples:
semver()
: Allows any valid SemVer stringany([validators])
Validates against a union of types. Use when a node must contain one and only one of several types. It is valid if at least one of the listed validators is valid. If no validators are given, accept any value.
Examples:
any(int(), null())
: Validates either an integer or a null value.any(num(), include('vector'))
: Validates either a number or an included 'vector' type.any(str(min=3, max=3),str(min=5, max=5),str(min=7, max=7))
: validates to a string that is exactly 3, 5, or 7 characters longany()
: Allows any value.subset([validators], allow_empty=False)
Validates against a subset of types. Unlike the Any
validator, this validators allows one or more of several types.
As such, it automatically validates against a list. It is valid if all values can be validated against at least one
validator.
ValueError
exception will be raised)allow_empty
: Allow the subset to be empty (and is, therefore, also optional). This overrides the required
flag.Examples:
subset(int(), str())
: Validators against an integer, a string, or a list of either.subset(int(), str(), allow_empty=True)
: Same as above, but allows the empty set and makes the subset optional.include(include_name)
Validates included structures. Must supply the name of a valid include.
Examples:
include('person')
It is also possible to add your own custom validators. This is an advanced topic, but here is an
example of adding a Date
validator and using it in a schema as date()
import yamale
import datetime
from yamale.validators import DefaultValidators, Validator
class Date(Validator):
""" Custom Date validator """
tag = 'date'
def _is_valid(self, value):
return isinstance(value, datetime.date)
validators = DefaultValidators.copy() # This is a dictionary
validators[Date.tag] = Date
schema = yamale.make_schema('./schema.yaml', validators=validators)
# Then use `schema` as normal
:warning: Ensure that your schema definitions come from internal or trusted sources. Yamale does not protect against intentionally malicious schemas. |
---|
optional: str(required=False)
optional_min: int(min=1, required=False)
min: num(min=1.5)
max: int(max=100)
optional_min: 10
min: 1.6
max: 100
customerA: include('customer')
customerB: include('customer')
recursion: include('recurse')
---
customer:
name: str()
age: int()
custom: include('custom_type')
custom_type:
integer: int()
recurse:
level: int()
again: include('recurse', required=False)
customerA:
name: bob
age: 900
custom:
integer: 1
customerB:
name: jill
age: 1
custom:
integer: 3
recursion:
level: 1
again:
level: 2
again:
level: 3
again:
level: 4
list_with_two_types: list(str(), include('variant'))
questions: list(include('question'))
---
variant:
rsid: str()
name: str()
question:
choices: list(include('choices'))
questions: list(include('question'), required=False)
choices:
id: str()
list_with_two_types:
- 'some'
- rsid: 'rs123'
name: 'some SNP'
- 'thing'
- rsid: 'rs312'
name: 'another SNP'
questions:
- choices:
- id: 'id_str'
- id: 'id_str1'
questions:
- choices:
- id: 'id_str'
- id: 'id_str1'
list(include('human'), min=2, max=2)
---
human:
name: str()
age: int(max=200)
height: num()
awesome: bool()
- name: Bill
age: 26
height: 6.2
awesome: True
- name: Adrian
age: 23
height: 6.3
awesome: True
To validate YAML files when you run your program's tests use Yamale's YamaleTestCase
Example:
class TestYaml(YamaleTestCase):
base_dir = os.path.dirname(os.path.realpath(__file__))
schema = 'schema.yaml'
yaml = 'data.yaml'
# or yaml = ['data-*.yaml', 'some_data.yaml']
def runTest(self):
self.assertTrue(self.validate())
base_dir
: String path to prepend to all other paths. This is optional.
schema
: String of path to the schema file to use. One schema file per test case.
yaml
: String or list of yaml files to validate. Accepts globs.
Yamale is formatted with ruff. There is a github action enforcing
ruff formatting and linting rules. You can run this locally via make lint
or by installing
the pre-commit hooks via make install-hooks
Yamale uses Tox to run its tests against multiple Python
versions. To run tests, first checkout Yamale, install Tox, then run make test
in Yamale's root
directory. You may also have to install the correct Python versions to test with as well.
NOTE on Python versions: tox.ini
specifies the lowest and highest versions of Python supported by
Yamale. Unless your development environment is configured to support testing against multiple Python
versions, one or more of the test branches may fail. One method of enabling testing against multiple
versions of Python is to install pyenv
and tox-pyenv
and to use pyenv install
and pyenv local
to ensure that tox is able to locate appropriate Pythons.
Yamale uses Github Actions to upload new tags to PyPi. To release a new version:
yamale/VERSION
.make release
.Github Actions will take care of the rest.
FAQs
A schema and validator for YAML.
We found that yamale demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
Research
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Research
Security News
Attackers used a malicious npm package typosquatting a popular ESLint plugin to steal sensitive data, execute commands, and exploit developer systems.
Security News
The Ultralytics' PyPI Package was compromised four times in one weekend through GitHub Actions cache poisoning and failure to rotate previously compromised API tokens.