
search-string-overvaagning
SearchString is a custom implementation for searching strings for km24.dk.
You can install search-string-overvaagning from PyPI:
$ pip install search-string-overvaagning
The package is supported on Python 3.9+. However, it is only compiled with mypyc from Python 3.10 onward, which makes it about twice as fast. Running it on Python 3.10+ is therefore strongly advised.
This package implements the search string object that is used across km24.dk for different types of surveillance. It is used for searching a text. For a text to count as a match, it must match first_str and, if second_str is not empty, it must also match second_str. If not_str is not empty, the text must not match not_str. The three conditions are combined with a logical AND. Each of the three strings can be a collection of strings separated by semicolons, in which case the alternatives are combined with a logical OR. You can use '~' to mark a word boundary. Finally, you can append !global to a string to signal that that part should be checked globally.
Quick examples:
>>> ss = SearchString('example;hello', 'text', 'elephant', data=None)
>>> ss.match('This is an example text')
True
>>> ss.match('This text says hello')
True
>>> ss.match('This is just an example')
False
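The matching rules described above can be sketched in plain Python. This is a simplified model and not the package's implementation: the names simple_match and _any_alternative_matches are hypothetical, case-insensitive matching is an assumption inferred from the examples, '~' is assumed to behave like a regular-expression word boundary, and !global is not handled.

```python
import re


def _any_alternative_matches(part: str, text: str) -> bool:
    """Return True if any ';'-separated alternative of `part` occurs in `text`."""
    for alt in part.split(';'):
        alt = alt.strip()
        if not alt:
            continue
        # Assumption: '~' marks a word boundary, modelled here as regex \b.
        pattern = r'\b'.join(re.escape(piece) for piece in alt.split('~'))
        # Assumption: matching is case-insensitive.
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False


def simple_match(text: str, first_str: str, second_str: str = '', not_str: str = '') -> bool:
    """Logical AND of the three conditions; an empty part imposes no constraint."""
    if first_str and not _any_alternative_matches(first_str, text):
        return False
    if second_str and not _any_alternative_matches(second_str, text):
        return False
    if not_str and _any_alternative_matches(not_str, text):
        return False
    return True


print(simple_match('This is an example text', 'example;hello', 'text', 'elephant'))
print(simple_match('This is just an example', 'example;hello', 'text', 'elephant'))
```

Under these assumptions the sketch gives the same results as the quick examples, and a pattern such as '~kan~' matches "kan" only as a whole word.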
Start by importing the SearchString class:
>>> from search_string import SearchString
Construct a new search string by supplying the first_str, second_str, not_str and any data that can be useful to refer back to later, such as an ID:
>>> ss = SearchString('first', '', '', data=2)
Optionally, you can also supply a third_str that works in the same way as first_str and not_str but has to be supplied as a keyword argument:
>>> ss = SearchString('first', '', '', data=2, third_str='third')
If you just need to find out whether a given search string matches a text, you can use the .match method on a SearchString instance.
Often, you will want to match a collection of search strings against a list of texts, e.g. sentences. You can do that as follows:
>>> from search_string import SearchString
>>> search_strings = [
... SearchString('kan', '', 'ritzau', data=1),
... SearchString('kan', '', 'ritzau!global', data=2)
... ]
>>> sentences = [
... 'Du kan skrive din tekst her.',
... 'Den kan bestå af flere sætninger.',
... 'Dig og Ritzau kan bestemme hvordan det skal være.',
... 'Nogle kan være lange, andre kan være korte.'
... ]
>>> res = SearchString.find_all(sentences, search_strings)
>>> res
[SearchString(kan, -, ritzau, data=1)]
For each of the matched search strings (in the above example, only one), you can extract the data and the matched text as follows:
>>> res[0].data
1
>>> res[0].matched_text
'Du kan skrive din tekst her. Den kan bestå af flere sætninger. (...) Nogle kan være lange, andre kan være korte.'
>>> res[0].matched_text_highligthed
'Du <b>kan</b> skrive din tekst her. Den <b>kan</b> bestå af flere sætninger. (...) Nogle <b>kan</b> være lange, andre <b>kan</b> være korte.'
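The difference between 'ritzau' and 'ritzau!global' in the example above can be illustrated with a small stand-alone model. It is only an interpretation inferred from the output shown, not the package's code: the function names are hypothetical, matching is reduced to case-insensitive substring tests, and a part ending in !global is assumed to be checked against all sentences joined together instead of against the current sentence.

```python
GLOBAL_SUFFIX = '!global'


def _hits(part: str, text: str) -> bool:
    """Case-insensitive substring test over ';'-separated alternatives."""
    lowered = text.lower()
    return any(alt.strip().lower() in lowered
               for alt in part.split(';') if alt.strip())


def _check(part: str, sentence: str, full_text: str) -> bool:
    """Check a part against the sentence, or the full text if marked !global."""
    if part.endswith(GLOBAL_SUFFIX):
        return _hits(part[:-len(GLOBAL_SUFFIX)], full_text)
    return _hits(part, sentence)


def matching_sentences(sentences, first_str, second_str='', not_str=''):
    """Return the sentences that satisfy all three conditions (assumed semantics)."""
    full_text = ' '.join(sentences)
    matched = []
    for sentence in sentences:
        ok = (
            (not first_str or _check(first_str, sentence, full_text))
            and (not second_str or _check(second_str, sentence, full_text))
            and not (not_str and _check(not_str, sentence, full_text))
        )
        if ok:
            matched.append(sentence)
    return matched


sentences = [
    'Du kan skrive din tekst her.',
    'Den kan bestå af flere sætninger.',
    'Dig og Ritzau kan bestemme hvordan det skal være.',
    'Nogle kan være lange, andre kan være korte.',
]
print(matching_sentences(sentences, 'kan', '', 'ritzau'))
print(matching_sentences(sentences, 'kan', '', 'ritzau!global'))
```

In this model the plain 'ritzau' exclusion only drops the third sentence, while 'ritzau!global' rejects every sentence because "Ritzau" appears somewhere in the full text. That is consistent with only the search string with data=1 being returned.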
If you construct a completely empty search string, it will match everything (even the empty string), but the matched text will always be the empty string. This is to allow for "catch-all" search strings:
>>> ss = SearchString('', '', '', data=1)
>>> ss.match('')
True
>>> ss.match('Some random text')
True
>>> ss.matched_text
''
If you will repeatedly be matching new texts against the same collection of search strings, it is highly advised to use the SearchStringCollection, which behind the scenes uses a trie for efficient search when many search strings are present. There is some initial cost in building the trie, so it is recommended that you initialize the collection once and then keep reusing it.
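To show why a trie helps when many search strings are present, here is a minimal, generic keyword trie. It is not the data structure used inside SearchStringCollection, whose internals are not described here. The class name KeywordTrie and its methods are hypothetical, and lowercasing is an assumption.

```python
_END = ''  # Sentinel key marking the end of a keyword; never equal to a single character.


class KeywordTrie:
    """Maps keywords to data values and finds all keywords occurring in a text."""

    def __init__(self):
        self.root = {}

    def add(self, keyword: str, data) -> None:
        """Insert a keyword; this is the one-time build cost."""
        node = self.root
        for ch in keyword.lower():
            node = node.setdefault(ch, {})
        node.setdefault(_END, set()).add(data)

    def find(self, text: str) -> set:
        """Return the data values of every keyword that occurs in `text`."""
        lowered = text.lower()
        found = set()
        for start in range(len(lowered)):
            node = self.root
            # Walk the trie from this start position until no branch fits.
            for i in range(start, len(lowered)):
                node = node.get(lowered[i])
                if node is None:
                    break
                found.update(node.get(_END, ()))
        return found


trie = KeywordTrie()
trie.add('kan', 1)
trie.add('kande', 2)
trie.add('ritzau', 3)
print(trie.find('Du kan skrive din tekst her.'))
```

Keywords that share a prefix share trie nodes, so a single walk from each start position checks all of them at once instead of testing every keyword separately. Building the trie is a one-time cost, which is why reusing the collection pays off.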
The most important method on SearchStringCollection is find_all, which takes a sentence (str) or list of sentences (list[str]) and returns the matched search strings, very similar to the familiar SearchString.find_all.
>>> from search_string import SearchString, SearchStringCollection
>>> search_strings = [
... SearchString('kan', '', 'ritzau', data=1),
... SearchString('kan', '', 'ritzau!global', data=2)
... ]
>>> sentences = ... # Same as before
>>> ss_collection = SearchStringCollection(search_strings)
>>> res = ss_collection.find_all(sentences)
>>> res
[SearchString(kan, -, ritzau, data=1)]
Importantly, SearchStringCollection relies on data being set on every search string in the collection. If it is set to None, or if multiple search strings have the same value, the behavior is undefined.