============
python-split
Functions to split and partition sequences.
Installation
::
pip install split
Usage
All functions in this module return iterators, and consume input
lazily. In the examples below, the results are forced using list
and dict
.
Chunks of equal size
To partition a sequence into chunks of equal size, use ``chop``::
>>> from split import chop
>>> list(chop(3, range(10)))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
If ``truncate=True`` keyword argument is given, then sequence length is
truncated to a multiple of chunk size, and all chunks have the same
size::
>>> list(chop(3, range(10), truncate=True))
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]
Subsequences by a predicate
To split a sequence into two by a given predicate, use partition
::
>>> from split import partition
>>> def odd(x): return x%2
>>> map(list, partition(odd, range(5)))
[[1, 3], [0, 2, 4]]
For more general partitioning, use groupby
::
>>> [(k, list(i)) for k,i in groupby(lambda x: x%3, range(7))]
[(0, [0, 3, 6]), (1, [1, 4]), (2, [2, 5])]
This function is different from itertools.groupby
: it returns only
one subsequence iterator per predicate value. Its return value can be
converted into dictionary.
When working with very long sequences, consider using
predicate_values
keyword argument to avoid scanning the entire
sequence. For example::
>>> longseq = xrange(int(1e9))
>>> pred = lambda x: x%3
>>> dict(groupby(pred, longseq, predicate_values=(0,1,2)))
{0: <generator object subsequence at 0x301b7d0>,
1: <generator object subsequence at 0x301b780>,
2: <generator object subsequence at 0x301b730>}
Breaking on separators
To break a sequence into chunks on some separators, use ``split``. For
example, breaking on zero elements::
>>> list(split(0, [1,2,3,0,4,5,0,0,6]))
[[1, 2, 3], [4, 5], [], [6]]
You can use a function as a predicate too::
>>> list(split(lambda x: x==5, range(10)))
[[0, 1, 2, 3, 4], [6, 7, 8, 9]]