multisort - NoneType Safe Multi Column Sorting For Python

Simplified, NoneType-safe multi-column sorting of lists of tuples, dicts, lists, or objects.

Installation

python3 -m pip install multisort

Dependencies

None

Performance

Average over 10 iterations with 1000 rows.

Test       Secs
superfast  0.0005
multisort  0.0035
pandas     0.0079
cmp_func   0.0138
reversor   0.037

Hands down the fastest is the superfast methodology shown below. You do not need this library to use it, as it is just core Python.

multisort from this library gives reasonable performance for large data sets; e.g. it is faster than pandas up to about 5,500 records. It is also much simpler to read and write, and its error handling does its best to give useful error messages.
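The timing setup described above (an average over 10 iterations on 1000 rows) could be approximated with the standard library's timeit module. This is only a sketch, not the project's tests/performance_tests.py: make_rows and two_pass_sort are hypothetical names, the data is randomly generated, and timings will differ from machine to machine.

```python
import random
import timeit

def make_rows(n, seed=0):
    # Hypothetical test data: n dict rows shaped like the examples on this page.
    rng = random.Random(seed)
    grades = ['A', 'B', 'C', 'D', 'F', None]
    return [
        {'idx': i, 'grade': rng.choice(grades), 'attend': rng.randint(0, 100)}
        for i in range(n)
    ]

def two_pass_sort(rows):
    # Lowest-priority column first; list.sort is stable, so the second
    # pass keeps the attend order within each grade.
    out = sorted(rows, key=lambda r: (r['attend'] is None, r['attend']))
    out.sort(key=lambda r: (r['grade'] is None, r['grade']), reverse=True)
    return out

rows = make_rows(1000)
iterations = 10
# timeit returns the total time for `number` calls, so divide for the average.
avg_secs = timeit.timeit(lambda: two_pass_sort(rows), number=iterations) / iterations
print(f"two-pass sort: {avg_secs:.4f} s per run")
```

Swapping another sorter into the lambda passed to timeit.timeit would time it under the same conditions.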

Note on NoneType and sorting

If your data may contain None, make sure your sort is written to handle it. sorted compares values with <, which NoneType does not support, so mixing None with other values raises an error such as: TypeError: '<' not supported between instances of 'NoneType' and 'str'. All examples on this page are written to handle None values.
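As a minimal illustration of the problem and one common workaround, the sketch below uses only core Python. The helper name none_safe is hypothetical; it wraps each value in a tuple so that the first element decides the order whenever None is involved and None is never compared with a string.

```python
grades = ['C', 'a', None, 'B']

# Plain sorted() fails because None cannot be ordered against str.
try:
    sorted(grades)
    failed = False
except TypeError:
    failed = True

def none_safe(value):
    # (False, value) for real values, (True, None) for None.
    return value is None, value

none_last = sorted(grades, key=none_safe)
none_first = sorted(grades, key=none_safe, reverse=True)

print(failed)      # True
print(none_last)   # ['B', 'C', 'a', None]
print(none_first)  # [None, 'a', 'C', 'B']
```

Note that uppercase letters sort before lowercase ones, which is why 'a' lands after 'C' in the ascending result.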

Methodologies

Method     Descr                                                                                                      Notes
multisort  Simple one-liner modeled on the multisort example in the Python docs                                       Second fastest of the bunch, but the most configurable and easy to read.
cmp_func   Multi-column sorting in the model of java.util.Comparator                                                  Reasonable speed.
superfast  NoneType-safe sample implementation of multi-column sorting, as mentioned in the example in the Python docs  Fastest in the benchmark above (about 7x faster than multisort), but a bit more complex to write.

Dictionary Examples

For data:

rows_before = [
     {'idx': 0, 'name': 'joh', 'grade': 'C', 'attend': 100}
    ,{'idx': 1, 'name': 'jan', 'grade': 'a', 'attend': 80}
    ,{'idx': 2, 'name': 'dav', 'grade': 'B', 'attend': 85}
    ,{'idx': 3, 'name': 'bob' , 'grade': 'C', 'attend': 85}
    ,{'idx': 4, 'name': 'jim' , 'grade': 'F', 'attend': 55}
    ,{'idx': 5, 'name': 'joe' , 'grade': None, 'attend': 55}
]

multisort

Sort rows_before by grade descending, then attend ascending, and put None first in the results:

from multisort import multisort, mscol
rows_sorted = multisort(rows_before, [
        mscol('grade', reverse=True),
        'attend'
])

-or- without mscol

from multisort import multisort
rows_sorted = multisort(rows_before, [
        ('grade', True),
        'attend'
])

Sort rows_before by grade descending, then attend, calling upper() on grade before comparing:

from multisort import multisort, mscol
rows_sorted = multisort(rows_before, [
        mscol('grade', reverse=True, clean=lambda s: None if s is None else s.upper()),
        'attend'
])

-or- without mscol

from multisort import multisort
rows_sorted = multisort(rows_before, [
        ('grade', True, lambda s: None if s is None else s.upper()),
        'attend'
])

multisort parameters:

option   dtype           description
rows     list            Rows to sort: a list of dicts, objects, tuples, or lists
spec     str, int, list  Sort specification. Can be as simple as a column key / index, or a list of spec entries such as mscol
reverse  bool            Reverse order of final sort (default = False)

spec entry options:

option    position  dtype       description
key       0         int or str  Key to access data. int for tuple or list
reverse   1         bool        Reverse sort of column
clean     2         func        Function / lambda to clean the value. These calls can cause a significant slowdown.
default   3         any         Value to substitute if required==False and the key does not exist or None is found. Can be used to achieve functionality similar to pandas na_position
required  4         bool        Default True. If False, will substitute None or default if the key is not found (not applicable for list or tuple rows)

* spec entries can be passed as:

    type     description
    String   Column name
    tuple    Tuple of 1 or more spec options in their order as listed (see position)
    mscol()  Importable helper to aid in readability. Suggested for three or more of the options.
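To make the spec format more concrete, here is a core-Python sketch of how a spec list could drive a multi-pass sort. It is not the library's implementation: sort_by_spec is a hypothetical helper, it handles only dict rows and the first three spec positions (key, reverse, clean), and where it places None is a choice made for this sketch rather than documented multisort behavior.

```python
def sort_by_spec(rows, spec):
    """Hypothetical helper: multi-pass sort of dict rows driven by a spec list.

    Each spec entry is a column name or a tuple (key, reverse, clean).
    """
    result = list(rows)  # copy, so the input stays untouched
    # Apply the lowest-priority column first; each later stable pass
    # preserves the earlier order among rows that tie on its column.
    for entry in reversed(spec):
        if isinstance(entry, str):
            entry = (entry,)
        key = entry[0]
        reverse = entry[1] if len(entry) > 1 else False
        clean = entry[2] if len(entry) > 2 else None

        def column_key(row, key=key, clean=clean):
            value = row[key]
            if clean is not None:
                value = clean(value)
            # The leading bool keeps None from being compared with real values.
            return value is None, value

        result.sort(key=column_key, reverse=reverse)
    return result

rows_before = [
    {'idx': 0, 'name': 'joh', 'grade': 'C', 'attend': 100},
    {'idx': 1, 'name': 'jan', 'grade': 'a', 'attend': 80},
    {'idx': 2, 'name': 'dav', 'grade': 'B', 'attend': 85},
    {'idx': 3, 'name': 'bob', 'grade': 'C', 'attend': 85},
    {'idx': 4, 'name': 'jim', 'grade': 'F', 'attend': 55},
    {'idx': 5, 'name': 'joe', 'grade': None, 'attend': 55},
]

rows_sorted = sort_by_spec(rows_before, [
    ('grade', True, lambda s: None if s is None else s.upper()),
    'attend',
])
order = [row['idx'] for row in rows_sorted]
print(order)  # [5, 4, 3, 0, 2, 1]
```

In this sketch the two 'C' rows tie on grade, so their attend values (85, then 100) decide their relative order.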



sorted with cmp_func

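Sort rows_before by grade descending with None first, then attend ascending (grades are compared as-is, without upper()):

from multisort import cmp_func  # assumed to be exported by multisort; functools.cmp_to_key plays a comparable role in the standard library

def cmp_student(a, b):
    # grade: None first, then descending
    k = 'grade'; va = a[k]; vb = b[k]
    if va != vb:
        if va is None: return -1
        if vb is None: return 1
        return -1 if va > vb else 1
    # attend: ascending
    k = 'attend'; va = a[k]; vb = b[k]
    if va != vb: return -1 if va < vb else 1
    return 0

# The comparator already encodes each column's direction, so reverse is not passed.
rows_sorted = sorted(rows_before, key=cmp_func(cmp_student))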
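The comparator style can also be tried without installing anything by using functools.cmp_to_key from the standard library, which is assumed here to fill the same role as cmp_func. The sketch below also shows what passing reverse=True does to a comparator-based sort: it flips the whole ordering the comparator defines, including the position of None.

```python
from functools import cmp_to_key

rows_before = [
    {'idx': 0, 'name': 'joh', 'grade': 'C', 'attend': 100},
    {'idx': 1, 'name': 'jan', 'grade': 'a', 'attend': 80},
    {'idx': 2, 'name': 'dav', 'grade': 'B', 'attend': 85},
    {'idx': 3, 'name': 'bob', 'grade': 'C', 'attend': 85},
    {'idx': 4, 'name': 'jim', 'grade': 'F', 'attend': 55},
    {'idx': 5, 'name': 'joe', 'grade': None, 'attend': 55},
]

def cmp_student(a, b):
    # grade: None first, then descending (case-sensitive)
    va, vb = a['grade'], b['grade']
    if va != vb:
        if va is None: return -1
        if vb is None: return 1
        return -1 if va > vb else 1
    # attend: ascending
    va, vb = a['attend'], b['attend']
    if va != vb:
        return -1 if va < vb else 1
    return 0

order = [r['idx'] for r in sorted(rows_before, key=cmp_to_key(cmp_student))]
reversed_order = [
    r['idx'] for r in sorted(rows_before, key=cmp_to_key(cmp_student), reverse=True)
]

print(order)           # [5, 1, 4, 3, 0, 2]
print(reversed_order)  # [2, 0, 3, 4, 1, 5]
```

Because the comparison is case-sensitive, the lowercase 'a' sorts above 'F' in the descending result.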

For reference: superfast methodology with list of dicts:

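def key_grade(student):
    grade = student['grade']
    # The leading bool keeps None from being compared with str values
    return grade is None, grade
def key_attend(student):
    attend = student['attend']
    return attend is None, attend
# Sort by the lowest-priority column first; the second, stable sort keeps that order within each grade
rows_sorted = sorted(rows_before, key=key_attend)
rows_sorted.sort(key=key_grade, reverse=True)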
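The order of the two passes matters: because Python's sort is stable, the last pass applied becomes the primary sort column. The following sketch runs the passes in both orders on the dictionary data from this section to show the difference. The variable names correct and swapped are illustrative only.

```python
rows_before = [
    {'idx': 0, 'name': 'joh', 'grade': 'C', 'attend': 100},
    {'idx': 1, 'name': 'jan', 'grade': 'a', 'attend': 80},
    {'idx': 2, 'name': 'dav', 'grade': 'B', 'attend': 85},
    {'idx': 3, 'name': 'bob', 'grade': 'C', 'attend': 85},
    {'idx': 4, 'name': 'jim', 'grade': 'F', 'attend': 55},
    {'idx': 5, 'name': 'joe', 'grade': None, 'attend': 55},
]

def key_grade(row):
    grade = row['grade']
    return grade is None, grade

def key_attend(row):
    attend = row['attend']
    return attend is None, attend

# attend first, grade last: grade is the primary column.
correct = sorted(rows_before, key=key_attend)
correct.sort(key=key_grade, reverse=True)

# grade first, attend last: attend becomes the primary column instead.
swapped = sorted(rows_before, key=key_grade, reverse=True)
swapped.sort(key=key_attend)

print([r['idx'] for r in correct])  # [5, 1, 4, 3, 0, 2]
print([r['idx'] for r in swapped])  # [5, 4, 1, 3, 2, 0]
```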

Object Examples

For data:

class Student():
    def __init__(self, idx, name, grade, attend):
        self.idx = idx
        self.name = name
        self.grade = grade
        self.attend = attend
    def __str__(self): return f"name: {self.name}, grade: {self.grade}, attend: {self.attend}"
    def __repr__(self): return self.__str__()

rows_before = [
     Student(0, 'joh', 'C', 100)
    ,Student(1, 'jan', 'a', 80)
    ,Student(2, 'dav', 'B', 85)
    ,Student(3, 'bob', 'C', 85)
    ,Student(4, 'jim', 'F', 55)
    ,Student(5, 'joe', None, 55)
]

multisort

(Same syntax as with Dictionary example above)


sorted with cmp_func

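Sort rows_before by grade descending with None first, then attend ascending (grades are compared as-is, without upper()):

from multisort import cmp_func  # assumed to be exported by multisort; functools.cmp_to_key plays a comparable role in the standard library

def cmp_student(a, b):
    # grade: None first, then descending
    if a.grade != b.grade:
        if a.grade is None: return -1
        if b.grade is None: return 1
        return -1 if a.grade > b.grade else 1
    # attend: ascending
    if a.attend != b.attend:
        return -1 if a.attend < b.attend else 1
    return 0

# The comparator already encodes each column's direction, so reverse is not passed.
rows_sorted = sorted(rows_before, key=cmp_func(cmp_student))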
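For objects, the two-pass key-function approach could look like the sketch below. none_safe_attr is a hypothetical helper that builds a None-safe key function for one attribute; the Student class is a trimmed copy of the one defined above.

```python
class Student:
    def __init__(self, idx, name, grade, attend):
        self.idx = idx
        self.name = name
        self.grade = grade
        self.attend = attend

rows_before = [
    Student(0, 'joh', 'C', 100),
    Student(1, 'jan', 'a', 80),
    Student(2, 'dav', 'B', 85),
    Student(3, 'bob', 'C', 85),
    Student(4, 'jim', 'F', 55),
    Student(5, 'joe', None, 55),
]

def none_safe_attr(name):
    # Hypothetical helper: returns a key function that reads one attribute
    # and pairs it with a flag so None is never compared with real values.
    def key(obj):
        value = getattr(obj, name)
        return value is None, value
    return key

# Lowest-priority column first, then the primary column (stable sort).
rows_sorted = sorted(rows_before, key=none_safe_attr('attend'))
rows_sorted.sort(key=none_safe_attr('grade'), reverse=True)

order = [s.idx for s in rows_sorted]
print(order)  # [5, 1, 4, 3, 0, 2]
```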

List / Tuple Examples

For data:

rows_before = [
     (0, 'joh', 'a'  , 100)
    ,(1, 'joe', 'B'  , 80)
    ,(2, 'dav', 'A'  , 85)
    ,(3, 'bob', 'C'  , 85)
    ,(4, 'jim', None , 55)
    ,(5, 'jan', 'B'  , 70)
]
(COL_IDX, COL_NAME, COL_GRADE, COL_ATTEND) = range(0,4)

multisort

(Same syntax as with Dictionary example above)


sorted with cmp_func

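Sort rows_before by grade descending with None first, then attend ascending (grades are compared as-is, without upper()):

from multisort import cmp_func  # assumed to be exported by multisort; functools.cmp_to_key plays a comparable role in the standard library

def cmp_student(a, b):
    # grade: None first, then descending
    k = COL_GRADE; va = a[k]; vb = b[k]
    if va != vb:
        if va is None: return -1
        if vb is None: return 1
        return -1 if va > vb else 1
    # attend: ascending
    k = COL_ATTEND; va = a[k]; vb = b[k]
    if va != vb:
        return -1 if va < vb else 1
    return 0

# The comparator already encodes each column's direction, so reverse is not passed.
rows_sorted = sorted(rows_before, key=cmp_func(cmp_student))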
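The same two-pass key-function approach works for tuple rows by indexing with the column constants. In this sketch the grade key also upper-cases the value, so 'a' and 'A' compare as equal; that normalization is a choice made for the example.

```python
rows_before = [
    (0, 'joh', 'a', 100),
    (1, 'joe', 'B', 80),
    (2, 'dav', 'A', 85),
    (3, 'bob', 'C', 85),
    (4, 'jim', None, 55),
    (5, 'jan', 'B', 70),
]
(COL_IDX, COL_NAME, COL_GRADE, COL_ATTEND) = range(0, 4)

def key_attend(row):
    attend = row[COL_ATTEND]
    return attend is None, attend

def key_grade(row):
    grade = row[COL_GRADE]
    # Compare case-insensitively, keeping None out of the comparison.
    return grade is None, None if grade is None else grade.upper()

# Lowest-priority column first, then the primary column (stable sort).
rows_sorted = sorted(rows_before, key=key_attend)
rows_sorted.sort(key=key_grade, reverse=True)

order = [row[COL_IDX] for row in rows_sorted]
print(order)  # [4, 3, 5, 1, 2, 0]
```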



Basic sorting

multisort can also be used as a basic non-destructive sorter of lists, whereas list.sort modifies the list in place:

_orig = [1, 4, 3, 6, 5]
_orig.sort(reverse=True)

This sorts _orig in place.

With list.sort, a non-destructive sort takes two steps: copy the list, then sort the copy (the builtin sorted also returns a new list):

_orig = [1, 4, 3, 6, 5]
_clone = _orig[:]
_clone.sort(reverse=True)

With multisort it's just one line:

from multisort import multisort
_orig = [1, 4, 3, 6, 5]
_sorted = multisort(_orig, reverse=True)

Here _orig is left unchanged.


multisort library Test / Sample files (/tests)

Name                        Descr                                     Other
tests/test_multisort.py     multisort unit tests                      -
tests/performance_tests.py  Tunable performance tests using asyncio   requires pandas
tests/hand_test.py          Hand testing                              -
