
Security News
Software Engineering Daily Podcast: Feross on AI, Open Source, and Supply Chain Risk
Socket CEO Feross Aboukhadijeh joins Software Engineering Daily to discuss modern software supply chain attacks and rising AI-driven security risks.
stringlifier
Advanced tools
Python module for detecting password, api keys hashes and any other string that resembles a randomly generated character sequence.
String-classifier - is a python module for detecting random string and hashes text/code.
Typical usage scenarios include:
You can see Stringlifier in action by checking out this interactive notebook hosted on Colaboratory.
You can quickly use stringlifier via pip-installation:
$ pip install stringlifier
In case you are using the pip3 installation that comes with Python3, use pip3 instead of pip in the above command.
$ pip3 install stringlifier
API example:
from stringlifier.api import Stringlifier
stringlifier=Stringlifier()
s = stringlifier("com.docker.hyperkit -A -u -F vms/0/hyperkit.pid -c 8 -m 8192M -b 127.0.0.1 --pass=\"NlcXVpYWRvcg\" -s 0:0,hostbridge -s 31,lpc -s 1:0,virtio-vpnkit,path=vpnkit.eth.sock,uuid=45172425-08d1-41ec-9d13-437481803412 -U c6fb5010-a83e-4f74-9a5a-50d9086b9")
After this, s should be:
'com.docker.hyperkit -A -u -F vms/0/hyperkit.pid -c 8 -m 8192M -b <IP_ADDR> --pass="<RANDOM_STRING>" -s 0:0,hostbridge -s 31,lpc -s 1:0,virtio-vpnkit,path=vpnkit.eth.sock,uuid=<UUID> -U <UUID>'
You can also choose to see the full tokenization and classification output:
s, tokens = stringlifier("com.docker.hyperkit -A -u -F vms/0/hyperkit.pid -c 8 -m 8192M -b 127.0.0.1 --pass=\"NlcXVpYWRvcg\" -s 0:0,hostbridge -s 31,lpc -s 1:0,virtio-vpnkit,path=vpnkit.eth.sock,uuid=45172425-08d1-41ec-9d13-437481803412 -U c6fb5010-a83e-4f74-9a5a-50d9086b9", return_tokens=True)
s will be the same as before and tokens will contain the following data:
[[('0', 33, 34, '<NUMERIC>'),
('8', 51, 52, '<NUMERIC>'),
('8192', 56, 60, '<NUMERIC>'),
('127.0.0.1', 65, 74, '<IP_ADDR>'),
('NlcXVpYWRvcg', 83, 95, '<RANDOM_STRING>'),
('0', 100, 101, '<NUMERIC>'),
('0', 102, 103, '<NUMERIC>'),
('31', 118, 120, '<NUMERIC>'),
('1', 128, 129, '<NUMERIC>'),
('0', 130, 131, '<NUMERIC>'),
('45172425-08d1-41ec-9d13-437481803412', 172, 208, '<UUID>'),
('c6fb5010-a83e-4f74-9a5a-50d9086b9', 212, 244, '<UUID>')]]
You can also train your own model if you want to detect different types of strings. For this you can use the Command Line Interface for the string classifier:
$ python3 stringlifier/modules/stringc.py --help
Usage: stringc.py [options]
Options:
-h, --help show this help message and exit
--interactive
--train
--resume
--train-file=TRAIN_FILE
--dev-file=DEV_FILE
--store=OUTPUT_BASE
--patience=PATIENCE (default=20)
--batch-size=BATCH_SIZE
(default=32)
--device=DEVICE
For instructions on how to generate your training data, use this link.
Important note: This model might not scale if detecting a type of string depends on the surrounding tokens. In this case, you can look at a more advanced tool for sequence processing such as NLP-Cube
FAQs
Python module for detecting password, api keys hashes and any other string that resembles a randomly generated character sequence.
We found that stringlifier demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 3 open source maintainers collaborating on the project.
Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Security News
Socket CEO Feross Aboukhadijeh joins Software Engineering Daily to discuss modern software supply chain attacks and rising AI-driven security risks.

Security News
GitHub has revoked npm classic tokens for publishing; maintainers must migrate, but OpenJS warns OIDC trusted publishing still has risky gaps for critical projects.

Security News
Rust’s crates.io team is advancing an RFC to add a Security tab that surfaces RustSec vulnerability and unsoundness advisories directly on crate pages.