Generic functions for dealing with and generating strings
strings-oc requires Python 3.10 or higher
pip install strings-oc
Returns a human-readable string representing the given number of bytes.
>>> from strings import bytes_human
>>> bytes_human(1024)
'1.0KiB'
>>> bytes_human(1024*1024)
'1.0MiB'
>>> bytes_human(1024*1024*1024+1000000000)
'1.9GiB'
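One way such a formatter could work is sketched below. This is not the package's implementation; the name `bytes_human_sketch`, the unit list, and the one-decimal rounding are assumptions chosen to reproduce the three outputs shown above.

```python
def bytes_human_sketch(num: int) -> str:
    """Format a byte count using binary (1024-based) units.

    Hypothetical re-implementation for illustration only.
    """
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit in the list.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f}{unit}"
        value /= 1024
    return f"{value:.1f}{units[-1]}"  # not reached; keeps type checkers happy


print(bytes_human_sketch(1024 * 1024))
```

How the real function formats values below 1024 bytes is not shown in the examples above, so the sketch's behavior there is a guess.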
Shortens a string so that it ends on a full word instead of halfway between words.
>>> from strings import cut
>>> cut('12345 7890', 8)
'12345...'
>>> cut('12345 7890', 8, '…')
'12345…'
>>> cut('Hello, my name is Frank', 16, '…')
'Hello, my name…'
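The examples suggest the maximum length includes the ellipsis, and that a word split by the limit is dropped entirely. A minimal sketch under those assumptions (the name `cut_sketch` and its parameter names are hypothetical, not the package's API):

```python
def cut_sketch(text: str, max_length: int, ellipsis: str = "...") -> str:
    """Truncate text at a word boundary so the result fits in max_length."""
    if len(text) <= max_length:
        return text
    # Room left for real text once the ellipsis is accounted for.
    budget = max(max_length - len(ellipsis), 0)
    truncated = text[:budget]
    # If the cut landed in the middle of a word, back up to the last space.
    if not text[budget].isspace():
        last_space = truncated.rfind(" ")
        if last_space != -1:
            truncated = truncated[:last_space]
    return truncated.rstrip() + ellipsis


print(cut_sketch("Hello, my name is Frank", 16, "…"))
```

If the text contains no space before the limit, this sketch falls back to a hard cut; the real function may behave differently in that case.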
Returns only the digits, i.e. the numeric characters, in the given string as a new string. Be careful, as this does not return numbers, but number characters, and will strip out valid float/decimal characters
>>> from strings import digits
>>> digits('1234abcd')
'1234'
>>> digits('a1b2c3d4')
'1234'
>>> digits('3.1415')
'31415'
>>> digits('1e+7')
'17'
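The warning about floats is easy to see in a small stand-in. The function below is a hypothetical equivalent (not the package's code) that keeps only the ASCII characters 0 through 9:

```python
def digits_sketch(text: str) -> str:
    """Keep only the ASCII digit characters of text, in order."""
    return "".join(ch for ch in text if ch in "0123456789")


# The sign, decimal point, and exponent marker are all discarded,
# so the result must not be parsed back as the original number.
print(digits_sketch("-3.5e2"))
```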
Returns the entire file as a string
>>> from strings import from_file
>>> from_file('version.dat')
'1.0.1\n'
Assuming version.dat contained the following:
1.0.1
If the file doesn't exist, from_file returns None. This can be changed by passing a default value as the second argument.
>>> from strings import from_file
>>> from_file('doesnotexist', '1.0.0')
'1.0.0'
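The default-on-missing behavior can be approximated with `pathlib`. This is a sketch only: the name `from_file_sketch` and the UTF-8 encoding are assumptions, and the real function may handle other I/O errors differently.

```python
from pathlib import Path


def from_file_sketch(path, default=None):
    """Return the file's full contents, or default if it does not exist."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        # Only a missing file falls back to the default; other errors
        # (permissions, decoding) still propagate in this sketch.
        return default


print(from_file_sketch("doesnotexist-for-sure.dat", "1.0.0"))
```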
join creates a single string from a list of keys that may or may not exist in the passed dict.
>>> from strings import join
>>> d = { 'title': 'Mr.', 'first': 'Homer', 'last': 'Simpson' }
>>> join(d, ['title', 'first', 'last', 'post'])
'Mr. Homer Simpson'
>>> d = { 'title': 'Dr.', 'first': 'Julius', 'last': 'Hibbert', 'post': 'MD' }
>>> join(d, ['title', 'first', 'last', 'post'])
'Dr. Julius Hibbert MD'
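The skip-missing-keys behavior amounts to a filtered join. A minimal sketch, assuming values are joined with a single space (the `separator` parameter and the name `join_sketch` are hypothetical):

```python
def join_sketch(data: dict, keys: list, separator: str = " ") -> str:
    """Join the values of the keys that are present in data, in key order."""
    return separator.join(str(data[key]) for key in keys if key in data)


homer = {"title": "Mr.", "first": "Homer", "last": "Simpson"}
# 'post' is absent from the dict, so it is silently skipped.
print(join_sketch(homer, ["title", "first", "last", "post"]))
```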
Returns, as well as possible, a normalized string converted from another string containing characters with special accents. It does this by finding special characters and converting them into their simpler, single-character versions. This is useful for things like automatically generating URLs, or for converting from Unicode to ASCII.
>>> from strings import normalize
>>> normalize('Ȟěƚľỡ, Ẉợɽḷᶁ!')
'Hello, World!'
>>> normalize('ffiDzǼij')
'ffiDAEij'
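For comparison, the standard library offers a different technique: Unicode NFKD decomposition followed by dropping combining marks. The sketch below is not how normalize works (it is described later as being built on strtr) and it covers fewer characters; the name `strip_accents_sketch` is hypothetical.

```python
import unicodedata


def strip_accents_sketch(text: str) -> str:
    """Remove accents via NFKD decomposition (standard library only)."""
    # NFKD splits 'è' into 'e' plus a combining grave accent, and
    # expands compatibility characters such as the 'fi' ligature.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents_sketch("Crème brûlée"))
```

Characters with no Unicode decomposition, such as 'ø', pass through this sketch unchanged, which is one reason a replacement table can give better coverage.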
Returns a random string based on set parameters.
>>> from strings import random
>>> random()
'NQFsxVTi'
>>> random()
'KFCMjKQg'
>>> random()
'HJEvCjlA'
random can take 3 optional parameters.
length represents the number of random characters you wish to return.
>>> random(length = 10)
'PvIwnubCyN'
>>> random(length = 4)
'bGXE'
>>> random(16)
'WMLdawtSCEFeNtsg'
characters represents the set of chars that are allowed in the string. There is no limit on the size of this set, and no requirement that the values be distinct. Repeating a character makes it more likely to be picked, which lets you weight the output toward the characters you want to appear more often.
>>> random(8, 'AAAAa')
'AAAAAAAA'
>>> random(length = 8, characters = 'AAAaa')
'aAAAAAaA'
>>> random(characters = 'AAaaa', length = 8)
'aaaaAAaA'
characters can also be set using special built-in sets, which are selected by passing a list of set names instead of a string.
>>> random(16, ['aZ'])
'KwDNSoFPlVVTxwhj'
>>> random(characters = ['az*', '10'])
'a003jsut'
>>> random(characters = ['0x'], length = 32)
'9ce511ab223cef1d65c400ce2e836759'
name | characters |
---|---|
0x | 0123456789abcdef |
0 | 01234567 |
10 | 0123456789 |
az | abcdefghijklmnopqrstuvwxyz |
az* | abcdefghijkmnopqrstuvwxyz |
AZ | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
AZ* | ABCDEFGHJKLMNPQRSTUVWXYZ |
aZ | abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ |
aZ* | abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ |
! | !@#$%^&*-_+.? |
!* | !@$%^*-_. |
Sets marked with * omit characters that might confuse either systems or humans: &, #, etc. for the former, and I, l, O, etc. for the latter.
By default random allows duplicate characters in a string, and doesn't see any issue with that. But it's possible you have an issue with it, and want a string made up completely of non-repeating characters. If so, set duplicates to False.
>>> random(16, ['az'], False)
'ifaunxgtzbywkmpr'
>>> random(26, ['az'], False)
'rmqghwsayntkpizfbeldvcxoju'
>>> random(27, ['az'], False)
ValueError: Can not generate random string with no duplicates from the given characters "abcdefghijklmnopqrstuvwxyz" in random
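The interplay of length, characters, and duplicates can be sketched with the standard library. Everything here is an assumption for illustration: the name `random_string_sketch`, the default alphabet, and the use of `secrets` as the randomness source (how the package generates randomness is not stated). Named sets such as ['az'] are not modeled.

```python
import secrets
import string


def random_string_sketch(length=8, characters=string.ascii_letters,
                         duplicates=True):
    """Build a random string from characters (hypothetical stand-in)."""
    if duplicates:
        # Repeated characters in the pool raise their odds of selection.
        return "".join(secrets.choice(characters) for _ in range(length))
    # Without duplicates, each distinct character may be used once.
    pool = list(dict.fromkeys(characters))
    if length > len(pool):
        raise ValueError(
            f"cannot build {length} unique characters from {len(pool)}"
        )
    return "".join(secrets.SystemRandom().sample(pool, length))


print(random_string_sketch(16, string.ascii_lowercase, duplicates=False))
```

As in the 27-character example above, asking for more unique characters than the pool holds has no valid answer, so the sketch raises ValueError as well.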
shorten_filename allows for truncating filenames without losing or damaging the extension of the file. Useful for when you need to add a filename for an uploaded file to a database and you are limited by the length of the field.
>>> from strings import shorten_filename
>>> shorten_filename('hello_there_my_friend.txt', 16)
'hello_there_.txt'
As you can see, the name part of the file is shortened, but the .txt stays intact, avoiding potential problems with mime lookup.
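A rough equivalent can be built on `os.path.splitext`, which separates the final extension from the rest of the name. This sketch assumes the limit applies to the whole filename including the extension; `shorten_filename_sketch` is a hypothetical name.

```python
import os


def shorten_filename_sketch(filename: str, max_length: int) -> str:
    """Trim the name part of filename so the whole thing fits max_length."""
    if len(filename) <= max_length:
        return filename
    stem, extension = os.path.splitext(filename)
    # Reserve space for the extension, then cut the stem to what is left.
    keep = max(max_length - len(extension), 0)
    return stem[:keep] + extension


print(shorten_filename_sketch("hello_there_my_friend.txt", 16))
```

Only the last extension is protected here, so a name like archive.tar.gz keeps .gz but may lose .tar.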
strip_html takes an HTML string and removes all the tags/elements while retaining the cdata. Useful for content that needs to be displayed without formatting.
>>> from strings import strip_html
>>> strip_html('<p>This is a test</p>')
'This is a test'
>>> strip_html('<p>Wanna see some <b>BOLD</b> text?</p>')
'Wanna see some BOLD text?'
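One standard-library way to get the same effect is to subclass `html.parser.HTMLParser` and collect only the text nodes. This is a sketch, not the package's implementation, and the names below are hypothetical.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes and ignores every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html_sketch(html: str) -> str:
    """Return the text content of html with all tags removed."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return "".join(collector.parts)


print(strip_html_sketch("<p>Wanna see some <b>BOLD</b> text?</p>"))
```

Note that this sketch also keeps the contents of script and style elements, which may not be desirable for display.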
strtr is a partial copy of the PHP function of the same name. This version does not support the singular use of one $from, and one $to, but the same can be achieved by using a dict with a single key and value. The primary purpose of this function is to be the actual workhorse of the normalize function, but there's no reason other people can't make use of it.
>>> from strings import strtr
>>> strtr('Hello, World!', {'World': 'Chris'})
'Hello, Chris!'
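PHP's array form of strtr tries the longest keys first and never re-scans text it has already replaced. A sketch of that behavior using a single regular-expression pass follows; whether this package matches PHP on those two points is an assumption, and `strtr_sketch` is a hypothetical name.

```python
import re


def strtr_sketch(text: str, replacements: dict) -> str:
    """Replace every key of replacements found in text, in one pass."""
    # Longest keys first so 'World!' wins over 'World' when both match.
    keys = sorted((k for k in replacements if k), key=len, reverse=True)
    if not keys:
        return text
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # A single pass means replaced text is never replaced again.
    return pattern.sub(lambda match: replacements[match.group(0)], text)


print(strtr_sketch("ab", {"a": "b", "b": "a"}))
```

Chained calls to `str.replace` would turn 'ab' into 'aa' for the same mapping, which is why the single pass matters.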
to_bool turns a string into a boolean, and raises an exception if the value does not represent a bool as it sees it. First, it converts the string to lowercase, then it checks it against the following:
Valid True values are '1', 'on', 't', 'true', 'y', 'yes', 'x'
Valid False values are '0', 'f', 'false', 'n', 'no', 'off', ''
>>> from strings import to_bool
>>> to_bool('true')
True
>>> to_bool('F')
False
>>> to_bool('2')
ValueError: "2" is not a valid boolean representation in to_bool
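The rules above translate directly into two lookup sets. The sketch below uses the value lists given in this section; the function name `to_bool_sketch` and the exact error message are assumptions.

```python
TRUE_VALUES = {"1", "on", "t", "true", "y", "yes", "x"}
FALSE_VALUES = {"0", "f", "false", "n", "no", "off", ""}


def to_bool_sketch(value: str) -> bool:
    """Convert a string to a bool, raising ValueError if unrecognized."""
    lowered = value.lower()
    if lowered in TRUE_VALUES:
        return True
    if lowered in FALSE_VALUES:
        return False
    raise ValueError(f'"{value}" is not a valid boolean representation')


print(to_bool_sketch("Yes"))
```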
Stores a string in a file, overwriting the existing contents, or creating the file if it didn't exist.
>>> from strings import to_file
>>> to_file('version.dat', '1.1.0')
True
The version.dat file will now contain the following:
1.1.0
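A comparable write helper is short with `pathlib`. The example above shows True on success; returning False on failure, the UTF-8 encoding, and the name `to_file_sketch` are assumptions of this sketch.

```python
import os
import tempfile
from pathlib import Path


def to_file_sketch(path, contents: str) -> bool:
    """Write contents to path, replacing any existing file."""
    try:
        Path(path).write_text(contents, encoding="utf-8")
    except OSError:
        # Assumed failure signal; the real function may raise instead.
        return False
    return True


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "version.dat")
    print(to_file_sketch(target, "1.1.0"))
```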
Used to add dashes "-" to a string representation of a UUID that has none.
>>> from strings import uuid_add_dashes
>>> uuid_add_dashes('b22eb45ac98311eca05a80fa5b0d7c77')
'b22eb45a-c983-11ec-a05a-80fa5b0d7c77'
Used to strip dashes "-" from a string representation of a UUID that has them.
>>> from strings import uuid_strip_dashes
>>> uuid_strip_dashes('b22eb45a-c983-11ec-a05a-80fa5b0d7c77')
'b22eb45ac98311eca05a80fa5b0d7c77'
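Both conversions are also available through the standard `uuid` module, which validates the input along the way. The helper names below are hypothetical; unlike a pure string operation, this route rejects malformed input and lowercases the result.

```python
import uuid


def add_dashes_sketch(hex32: str) -> str:
    """Turn a 32-character hex string into the dashed UUID form."""
    return str(uuid.UUID(hex=hex32))


def strip_dashes_sketch(dashed: str) -> str:
    """Turn a dashed UUID string into its 32-character hex form."""
    return uuid.UUID(dashed).hex


print(add_dashes_sketch("b22eb45ac98311eca05a80fa5b0d7c77"))
```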
Compares two version strings and returns -1 if the first is less than the second, 0 if they are equal, or 1 if the first is greater.
>>> from strings import version_compare
>>> version_compare('1.0.1', '1.0')
1
>>> version_compare('1.0.1', '1.0.1')
0
>>> version_compare('1.0.1', '1.1')
-1
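A component-wise comparison reproduces the three results above. The sketch assumes purely numeric, dot-separated versions and treats missing components as 0 (so '1.0' equals '1.0.0'); whether the package agrees on that last point is not shown, and `version_compare_sketch` is a hypothetical name.

```python
from itertools import zip_longest


def version_compare_sketch(first: str, second: str) -> int:
    """Return -1, 0, or 1 as first is less than, equal to, or greater than second."""
    parts_a = [int(part) for part in first.split(".")]
    parts_b = [int(part) for part in second.split(".")]
    # Compare component by component, padding the shorter version with 0.
    for a, b in zip_longest(parts_a, parts_b, fillvalue=0):
        if a != b:
            return 1 if a > b else -1
    return 0


print(version_compare_sketch("1.10", "1.9"))
```

Comparing integers rather than strings is what makes '1.10' sort after '1.9'.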
FAQs
Generic functions for dealing with and generating strings
We found that strings-oc demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.