🚀 Big News: Socket Acquires Coana to Bring Reachability Analysis to Every Appsec Team.Learn more →

Book a Demo Install Sign in

codext

Package Overview

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

codext

Native codecs extension

1.15.8

PyPI

Maintainers: 1

CodExt

Encode/decode anything.

CodExt is a (Python2-3 compatible) library that extends the native codecs library (namely for adding new custom encodings and character mappings) and provides 120+ new codecs, hence its name combining CODecs EXTension. It also features a guess mode for decoding multiple layers of encoding and CLI tools for convenience.

$ pip install codext

Want to contribute a new codec ?	Want to contribute a new macro ?
Check the documentation first Then PR your new codec	PR your updated version of `macros.json`

Demonstrations

Using CodExt from the command line

Using base tools from the command line

Using the unbase command line tool

//img.shields.io/badge/Tweet%20(codext)--lightgrey?logo=twitter&style=social" alt="Tweet on codext" height="20"/>

$ codext -i test.txt encode dna-1
GTGAGCGGGTATGTGA

$ echo -en "test" | codext encode morse
- . ... -

$ echo -en "test" | codext encode braille
⠞⠑⠎⠞

$ echo -en "test" | codext encode base100
👫👜👪👫

Chaining codecs

$ echo -en "Test string" | codext encode reverse
gnirts tseT

$ echo -en "Test string" | codext encode reverse morse
--. -. .. .-. - ... / - ... . -

$ echo -en "Test string" | codext encode reverse morse dna-2
AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC

$ echo -en "Test string" | codext encode reverse morse dna-2 octal
101107124103101107124103101107124107101107101101101107124103101107124107101107101101101107124107101107124107101107101101101107124107101107124103101107124107101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124124101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124107101107101101101107124103

$ echo -en "AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC" | codext -d dna-2 morse reverse
test string

Using macros

$ codext add-macro my-encoding-chain gzip base63 lzma base64

$ codext list macros
example-macro, my-encoding-chain

$ echo -en "Test string" | codext encode my-encoding-chain
CQQFAF0AAIAAABuTgySPa7WaZC5Sunt6FS0ko71BdrYE8zHqg91qaqadZIR2LafUzpeYDBalvE///ug4AA==

$ codext remove-macro my-encoding-chain

$ codext list macros
example-macro

//img.shields.io/badge/Tweet%20(unbase)--lightgrey?logo=twitter&style=social" alt="Tweet on unbase" height="20"/>

$ echo "Test string !" | base122
*.7!ft9�-f9Â

$ echo "Test string !" | base91 
"ONK;WDZM%Z%xE7L

$ echo "Test string !" | base91 | base85
B2P|BJ6A+nO(j|-cttl%

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr
QVx5tvgjvCAkXaMSuKoQmCnjeCV1YyyR3WErUUErFf

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | base58-flickr -d | base36 -d | base85 -d | base91 -d
Test string !

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | unbase -m 3
Test string !

$ echo "Test string !" | base91 | base85 | base36 | base58-flickr | unbase -f Test
Test string !

Usage (Python)

Getting the list of available codecs:

>>> import codext

>>> codext.list()
['ascii85', 'base85', 'base100', 'base122', ..., 'tomtom', 'dna', 'html', 'markdown', 'url', 'resistor', 'sms', 'whitespace', 'whitespace-after-before']

>>> codext.encode("this is a test", "base58-bitcoin")
'jo91waLQA1NNeBmZKUF'

>>> codext.encode("this is a test", "base58-ripple")
'jo9rA2LQwr44eBmZK7E'

>>> codext.encode("this is a test", "base58-url")
'JN91Wzkpa1nnDbLyjtf'

>>> codecs.encode("this is a test", "base100")
'👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫'

>>> codecs.decode("👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫", "base100")
'this is a test'

>>> for i in range(8):
        print(codext.encode("this is a test", "dna-%d" % (i + 1)))
GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA
CTCACGGACGGCCTATAGAACGGCCTATAGAACGACAGAACTCACGCCCTATCTCA
ACAGATTGATTAACGCGTGGATTAACGCGTGGATGAGTGGACAGATAAACGCACAG
AGACATTCATTAAGCGCTCCATTAAGCGCTCCATCACTCCAGACATAAAGCGAGAC
TCTGTAAGTAATTCGCGAGGTAATTCGCGAGGTAGTGAGGTCTGTATTTCGCTCTG
TGTCTAACTAATTGCGCACCTAATTGCGCACCTACTCACCTGTCTATTTGCGTGTC
GAGTGCCTGCCGGATATCTTGCCGGATATCTTGCTGTCTTGAGTGCGGGATAGAGT
CACTCGGTCGGCCATATGTTCGGCCATATGTTCGTCTGTTCACTCGCCCATACACT
>>> codext.decode("GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA", "dna-1")
'this is a test'

>>> codecs.encode("this is a test", "morse")
'- .... .. ... / .. ... / .- / - . ... -'

>>> codecs.decode("- .... .. ... / .. ... / .- / - . ... -", "morse")
'this is a test'

>>> with open("morse.txt", 'w', encoding="morse") as f:
	f.write("this is a test")
14

>>> with open("morse.txt",encoding="morse") as f:
	f.read()
'this is a test'

>>> codext.decode("""
      =            
              X         
   :            
      x         
  n  
    r 
        y   
      Y            
              y        
     p    
         a       
 `          
            n            
          |    
  a          
o    
       h        
          `            
          g               
           o 
   z      """, "whitespace-after+before")
'CSC{not_so_invisible}'

>>> print(codext.encode("An example test string", "baudot-tape"))
***.**
   . *
***.* 
*  .  
   .* 
*  .* 
   . *
** .* 
***.**
** .**
   .* 
*  .  
* *. *
   .* 
* *.  
* *. *
*  .  
* *.  
* *. *
***.  
  *.* 
***.* 
 * .*

List of codecs

BaseXX

This category also contains ascii85, adobe, [x]btoa, zeromq with the base85 codec.

Binary

Common

a1z26: keeps words whitespace-separated and uses a custom character separator
cases: set of case-related encodings (including camel-, kebab-, lower-, pascal-, upper-, snake- and swap-case, slugify, capitalize, title)
dummy: set of simple encodings (including integer, replace, reverse, word-reverse, substite and strip-spaces)
octal: dummy octal conversion (converts to 3-digits groups)
octal-spaced: variant of octal ; dummy octal conversion, handling whitespace separators
ordinal: dummy character ordinals conversion (converts to 3-digits groups)
ordinal-spaced: variant of ordinal ; dummy character ordinals conversion, handling whitespace separators

Compression

gzip: standard Gzip compression/decompression
lz77: compresses the given data with the algorithm of Lempel and Ziv of 1977
lz78: compresses the given data with the algorithm of Lempel and Ziv of 1978
pkzip_deflate: standard Zip-deflate compression/decompression
pkzip_bzip2: standard BZip2 compression/decompression
pkzip_lzma: standard LZMA compression/decompression

:warning: Compression functions are of course definitely NOT encoding functions ; they are implemented for leveraging the .encode(...) API from codecs.

Cryptography

:warning: Crypto functions are of course definitely NOT encoding functions ; they are implemented for leveraging the .encode(...) API from codecs.

Hashing

blake: includes BLAKE2b and BLAKE2s (Python 3 only ; relies on hashlib)
checksums: includes Adler32 and CRC32 (relies on zlib)
crypt: Unix's crypt hash for passwords (Python 3 and Unix only ; relies on crypt)
md: aka Message Digest ; includes MD4 and MD5 (relies on hashlib)
sha: aka Secure Hash Algorithms ; includes SHA1, 224, 256, 384, 512 (Python2/3) but also SHA3-224, -256, -384 and -512 (Python 3 only ; relies on hashlib)
shake: aka SHAKE hashing (Python 3 only ; relies on hashlib)

:warning: Hash functions are of course definitely NOT encoding functions ; they are implemented for convenience with the .encode(...) API from codecs and useful for chaning codecs.

Languages

Others

dna: implements the 8 rules of DNA sequences (N belongs to [1,8])
letter-indices: encodes consonants and/or vowels with their corresponding indices
markdown: unidirectional encoding from Markdown to HTML

Steganography

hexagram: uses Base64 and encodes the result to a charset of I Ching hexagrams (as implemented here)
klopf: aka Klopf code ; Polybius square with trivial alphabetical distribution
resistor: aka resistor color codes
rick: aka Rick cipher (in reference to Rick Astley's song "Never gonna give you up")
sms: also called T9 code ; uses "-" as a separator for encoding, "-" or "_" or whitespace for decoding
whitespace: replaces bits with whitespaces and tabs
whitespace_after_before: variant of whitespace ; encodes characters as new characters with whitespaces before and after according to an equation described in the codec name (e.g. "whitespace+2*after-3*before")

Web

html: implements entities according to this reference
url: aka URL encoding

Keywords

FAQs

What is codext?

Is codext well maintained?

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install