Research
Security News
Malicious npm Packages Inject SSH Backdoors via Typosquatted Libraries
Socket’s threat research team has detected six malicious npm packages typosquatting popular libraries to insert SSH backdoors.
Add this line to your application's Gemfile:
gem 'scrub_db'
And then execute:
$ bundle
Or install it yourself as:
$ gem install scrub_db
strings_criteria = {
pos_criteria: %w[your positive criteria here],
neg_criteria: %w[your negative criteria here]
}
strings_obj = ScrubDb::Strings.new(strings_criteria)
web_criteria = {
neg_urls: %w[aprov avis budget collis eat],
pos_urls: %w[acura audi bmw bentley],
neg_paths: %w[buy bye call cash cheap click collis cont distrib],
pos_paths: %w[team staff management],
neg_exts: %w[au ca edu es gov in ru uk us],
pos_exts: %w[com net]
}
scrub_web_obj = ScrubDb::Webs.new(web_criteria)
scrub_web_obj = ScrubDb::Webs.new(criteria)
scrubbed_web_hashes = scrub_web_obj.scrub_urls(urls)
strings_obj = ScrubDb::Strings.new(strings_criteria)
scrubbed_strings = strings_obj.scrub_strings(array_of_strings)
strings_obj = ScrubDb::Strings.new(strings_criteria)
scrubbed_prop_strings = strings_obj.scrub_proper_strings(array_of_props)
This is an example of scrubbing auto dealership urls. We only want URLs based in the US, and paths of the staff. Most of our URLs are good, but we want to confirm that they all meet our requirements.
First step is to load your Webs criteria in hash format. It's not required to enter all the keys below, but for those you are using, each key must be a symbol and be exactly the same as the ones below. The values must each be an array of strings.
criteria = {
neg_urls: %w[aprov avis budget collis eat],
pos_urls: %w[acura audi bmw bentley],
neg_paths: %w[buy bye call cash cheap click collis cont distrib],
pos_paths: %w[team staff management],
neg_exts: %w[au ca edu es gov in ru uk us],
pos_exts: %w[com net]
}
scrub_web_obj = ScrubDb::Webs.new(criteria)
Next, pass your list of URLs to scrub_urls(urls)
with the syntax below.
urls = %w[
austinchevrolet.not.real
smith_acura.com/staff
abcrepair.ca
hertzrentals.com/review
londonhyundai.uk/fleet
http://www.townbuick.net/staff
http://youtube.com/download
www.madridinfiniti.es/collision
www.mitsubishideals.sofake
www.dallassubaru.com.sofake
www.quickeats.net/contact_us
www.school.edu/teachers
www.www.nissancars/inventory
www.www.toyotatown.net/staff/management
www.www.yellowpages.com/business
]
scrubbed_web_hashes = scrub_web_obj.scrub_urls(urls)
Notice that the URLs in the list above are NOT uniformly formatted. ScrubDb leverages the Utf8Sanitizer gem
and CrmFormatter gem
to first format the URLs. Then, it passes the formatted URL hashes to be scrubbed based on the criteria passes in earlier. The results will be returned in the syntax below:
scrubbed_web_hashes = [
{
web_status: 'invalid',
url: 'austinchevrolet.not.real',
url_f: nil,
url_path: nil,
web_neg: 'error: ext.invalid [not, real]',
url_exts: [],
neg_exts: [],
pos_exts: [],
neg_paths: [],
pos_paths: [],
neg_urls: [],
pos_urls: []
},
{ web_status: 'formatted',
url: 'smith_acura.com/staff',
url_f: 'http://www.smith_acura.com',
url_path: '/staff',
web_neg: nil,
url_exts: ['com'],
neg_exts: [],
pos_exts: ['com'],
neg_paths: [],
pos_paths: ['staff'],
neg_urls: [],
pos_urls: ['acura']
},
{ web_status: 'formatted',
url: 'abcrepair.ca',
url_f: 'http://www.abcrepair.ca',
url_path: nil,
web_neg: nil,
url_exts: ['ca'],
neg_exts: ['ca'],
pos_exts: [],
neg_paths: [],
pos_paths: [],
neg_urls: ['repair'],
pos_urls: []
},
{ web_status: 'formatted',
url: 'hertzrentals.com/review',
url_f: 'http://www.hertzrentals.com',
url_path: '/review',
web_neg: nil,
url_exts: ['com'],
neg_exts: [],
pos_exts: ['com'],
neg_paths: ['review'],
pos_paths: [],
neg_urls: ['hertz, rent'],
pos_urls: []
},
{ web_status: 'formatted',
url: 'londonhyundai.uk/fleet',
url_f: 'http://www.londonhyundai.uk',
url_path: '/fleet',
web_neg: nil,
url_exts: ['uk'],
neg_exts: ['uk'],
pos_exts: [],
neg_paths: ['fleet'],
pos_paths: [],
neg_urls: [],
pos_urls: ['hyundai']
},
{ web_status: 'formatted',
url: 'http://www.townbuick.net/staff',
url_f: 'http://www.townbuick.net',
url_path: nil,
web_neg: nil,
url_exts: ['net'],
neg_exts: [],
pos_exts: ['net'],
neg_paths: [],
pos_paths: [],
neg_urls: [],
pos_urls: ['buick']
},
{ web_status: 'formatted',
url: 'http://youtube.com/download',
url_f: 'http://www.youtube.com',
url_path: nil,
web_neg: nil,
url_exts: ['com'],
neg_exts: [],
pos_exts: ['com'],
neg_paths: [],
pos_paths: [],
neg_urls: ['youtube'],
pos_urls: []
},
{ web_status: 'formatted',
url: 'www.madridinfiniti.es/collision',
url_f: 'http://www.madridinfiniti.es',
url_path: '/collision',
web_neg: nil,
url_exts: ['es'],
neg_exts: ['es'],
pos_exts: [],
neg_paths: ['collis'],
pos_paths: [],
neg_urls: [],
pos_urls: ['infiniti']
},
{ web_status: 'invalid',
url: 'www.mitsubishideals.sofake',
url_f: nil,
url_path: nil,
web_neg: 'error: ext.invalid [sofake]',
url_exts: [],
neg_exts: [],
pos_exts: [],
neg_paths: [],
pos_paths: [],
neg_urls: [],
pos_urls: []
},
{ web_status: 'formatted',
url: 'www.dallassubaru.com.sofake',
url_f: 'http://www.dallassubaru.com',
url_path: nil,
web_neg: nil,
url_exts: ['com'],
neg_exts: [],
pos_exts: ['com'],
neg_paths: [],
pos_paths: [],
neg_urls: [],
pos_urls: ['subaru']
},
{ web_status: 'formatted',
url: 'www.quickeats.net/contact_us',
url_f: 'http://www.quickeats.net',
url_path: '/contact_us',
web_neg: nil,
url_exts: ['net'],
neg_exts: [],
pos_exts: ['net'],
neg_paths: ['cont'],
pos_paths: [],
neg_urls: ['eat, quick'],
pos_urls: []
},
{ web_status: 'formatted',
url: 'www.school.edu/teachers',
url_f: 'http://www.school.edu',
url_path: '/teachers',
web_neg: nil,
url_exts: ['edu'],
neg_exts: ['edu'],
pos_exts: [],
neg_paths: [],
pos_paths: [],
neg_urls: [],
pos_urls: []
},
{ web_status: 'invalid',
url: 'www.www.nissancars/inventory',
url_f: nil,
url_path: nil,
web_neg: 'error: ext.none',
url_exts: [],
neg_exts: [],
pos_exts: [],
neg_paths: [],
pos_paths: [],
neg_urls: [],
pos_urls: []
},
{ web_status: 'formatted',
url: 'www.www.toyotatown.net/staff/management',
url_f: 'http://www.toyotatown.net',
url_path: '/staff/management',
web_neg: nil,
url_exts: ['net'],
neg_exts: [],
pos_exts: ['net'],
neg_paths: [],
pos_paths: ['management, staff'],
neg_urls: [],
pos_urls: ['toyota']
},
{ web_status: 'formatted',
url: 'www.www.yellowpages.com/business',
url_f: 'http://www.yellowpages.com',
url_path: '/business',
web_neg: nil,
url_exts: ['com'],
neg_exts: [],
pos_exts: ['com'],
neg_paths: ['business'],
pos_paths: [],
neg_urls: ['yellowpages'],
pos_urls: []
}
]
You can scrub an array of strings with or without formatting. For scrubbing proper strings (account and business names, job titles, article titles, brands, locations, etc.) like below, you might prefer the proper scrub method, but these examples will use the same criteria and same array of strings to illustrate the difference.
Continuing with the auto dealership example above, the following examples are to scrub the auto dealership account names. We want to prioritize our data based on those who match our positive criteria, those who match our negative criteria, and those who are neutral.
First step is to load your Strings criteria in hash format. It's not required to enter all the keys below, but for those you are using, each key must be a symbol and be exactly the same as the ones below. The values must each be an array of strings.
pos_criteria = ["acura", "alfa romeo", "aston martin", "audi", "bmw", "bentley", "bugatti", "buick", "cdjr", "cadillac", "chevrolet", "chrysler", "dodge", "ferrari", "fiat", "ford", "gmc", "group", "group", "honda", "hummer", "hyundai", "infiniti", "isuzu", "jaguar", "jeep", "kia", "lamborghini", "lexus", "lincoln", "lotus", "mini", "maserati", "mazda", "mclaren", "mercedes-benz", "mitsubishi", "nissan", "porsche", "ram", "rolls-royce", "saab", "scion", "smart", "subaru", "suzuki", "toyota", "volkswagen", "volvo"]
neg_criteria = %w[approv avis budget collis eat enterprise facebook financ food google gourmet hertz hotel hyatt insur invest loan lube mobility motel motorola parts quick rent repair restaur rv ryder service softwar travel twitter webhost yellowpages yelp youtube]
strings_criteria = {
pos_criteria: pos_criteria,
neg_criteria: neg_criteria
}
strings_obj = ScrubDb::Strings.new(strings_criteria)
Next, pass your list of strings to scrub_strings(strings)
with the syntax below.
array_of_strings = [
'quick auto approval, inc',
'the gmc and bmw-world of AUSTIN tx',
'DOWNTOWN CAR REPAIR, INC',
'TEXAS TRAVEL, CO',
'123 Car-world Kia OF CHICAGO IL',
'Main Street Ford in DALLAS tX',
'broad st fiat of houston',
'hot-deal auto insurance',
'BUDGET - AUTOMOTORES ZONA & FRANCA, INC',
'Young Gmc Trucks',
'youmans Chevrolet',
'yazell chevy',
'quick cAr LUBE',
'yAtEs AuTo maLL',
'YADKIN VALLEY COLLISION CO',
'XIT FORD INC'
]
scrubbed_strings = strings_obj.scrub_strings(array_of_strings)
scrubbed_strings = [
{
string: 'quick auto approval, inc',
pos_criteria: [],
neg_criteria: ['approv, quick']
},
{
string: 'the gmc and bmw-world of AUSTIN tx',
pos_criteria: ['bmw, gmc'],
neg_criteria: []
},
{
string: 'DOWNTOWN CAR REPAIR, INC',
pos_criteria: [],
neg_criteria: ['repair']
},
{
string: 'TEXAS TRAVEL, CO',
pos_criteria: [],
neg_criteria: ['travel']
},
{
string: '123 Car-world Kia OF CHICAGO IL',
pos_criteria: ['kia'],
neg_criteria: []
},
{
string: 'Main Street Ford in DALLAS tX',
pos_criteria: ['ford'],
neg_criteria: []
},
{
string: 'broad st fiat of houston',
pos_criteria: ['fiat'],
neg_criteria: []
},
{
string: 'hot-deal auto insurance',
pos_criteria: [],
neg_criteria: ['insur']
},
{
string: 'BUDGET - AUTOMOTORES ZONA & FRANCA, INC',
pos_criteria: [],
neg_criteria: ['budget']
},
{
string: 'Young Gmc Trucks',
pos_criteria: ['gmc'],
neg_criteria: []
},
{
string: 'youmans Chevrolet',
pos_criteria: ['chevrolet'],
neg_criteria: []
},
{
string: 'yazell chevy',
pos_criteria: [],
neg_criteria: []
},
{
string: 'quick cAr LUBE',
pos_criteria: [],
neg_criteria: ['lube, quick']
},
{
string: 'yAtEs AuTo maLL',
pos_criteria: [],
neg_criteria: []
},
{
string: 'YADKIN VALLEY COLLISION CO',
pos_criteria: [],
neg_criteria: ['collis']
},
{
string: 'XIT FORD INC',
pos_criteria: ['ford'],
neg_criteria: []
}
]
This method is designed for scrubbing proper strings, like account and business names, job titles, article titles, brands, locations, etc.
This method is identical to example 2 above (Scrub Array of Strings), except this method first formats the strings using the Utf8Sanitizer gem
and CrmFormatter gem
, then passes the results to the method above to scrub. So, this is a 2-in-1 method, Format + Scrub! Again, this method treats your strings as if they are proper nouns, so compare the results of these two methods to determine which is most suitable for your data.
First step is to load your Strings criteria in hash format. It's not required to enter all the keys below, but for those you are using, each key must be a symbol and be exactly the same as the ones below. The values must each be an array of strings.
pos_criteria = ["acura", "alfa romeo", "aston martin", "audi", "bmw", "bentley", "bugatti", "buick", "cdjr", "cadillac", "chevrolet", "chrysler", "dodge", "ferrari", "fiat", "ford", "gmc", "group", "group", "honda", "hummer", "hyundai", "infiniti", "isuzu", "jaguar", "jeep", "kia", "lamborghini", "lexus", "lincoln", "lotus", "mini", "maserati", "mazda", "mclaren", "mercedes-benz", "mitsubishi", "nissan", "porsche", "ram", "rolls-royce", "saab", "scion", "smart", "subaru", "suzuki", "toyota", "volkswagen", "volvo"]
neg_criteria = %w[approv avis budget collis eat enterprise facebook financ food google gourmet hertz hotel hyatt insur invest loan lube mobility motel motorola parts quick rent repair restaur rv ryder service softwar travel twitter webhost yellowpages yelp youtube]
strings_criteria = {
pos_criteria: pos_criteria,
neg_criteria: neg_criteria
}
strings_obj = ScrubDb::Strings.new(strings_criteria)
Next, pass your list of strings to scrub_proper_strings(strings)
with the syntax below.
array_of_strings = [
'quick auto approval, inc',
'the gmc and bmw-world of AUSTIN tx',
'DOWNTOWN CAR REPAIR, INC',
'TEXAS TRAVEL, CO',
'123 Car-world Kia OF CHICAGO IL',
'Main Street Ford in DALLAS tX',
'broad st fiat of houston',
'hot-deal auto insurance',
'BUDGET - AUTOMOTORES ZONA & FRANCA, INC',
'Young Gmc Trucks',
'youmans Chevrolet',
'yazell chevy',
'quick cAr LUBE',
'yAtEs AuTo maLL',
'YADKIN VALLEY COLLISION CO',
'XIT FORD INC'
]
scrubbed_proper_strings = strings_obj.scrub_proper_strings(array_of_strings)
scrubbed_proper_strings = [
{
proper_status: 'formatted',
proper: 'quick auto approval, inc',
proper_f: 'Quick Auto Approval, Inc',
pos_criteria: [],
neg_criteria: ['approv, quick']
},
{
proper_status: 'formatted',
proper: 'the gmc and bmw-world of AUSTIN tx',
proper_f: 'The GMC and BMW-World of Austin TX',
pos_criteria: ['bmw, gmc'],
neg_criteria: []
},
{
proper_status: 'formatted',
proper: 'DOWNTOWN CAR REPAIR, INC',
proper_f: 'Downtown Car Repair, Inc',
pos_criteria: [],
neg_criteria: ['repair']
},
{
proper_status: 'formatted',
proper: 'TEXAS TRAVEL, CO',
proper_f: 'Texas Travel, Co',
pos_criteria: [],
neg_criteria: ['travel']
},
{
proper_status: 'formatted',
proper: '123 Car-world Kia OF CHICAGO IL',
proper_f: '123 Car-World Kia of Chicago IL',
pos_criteria: ['kia'],
neg_criteria: []
},
{
proper_status: 'formatted',
proper: 'Main Street Ford in DALLAS tX',
proper_f: 'Main Street Ford in Dallas TX',
pos_criteria: ['ford'],
neg_criteria: []
},
{
proper_status: 'formatted',
proper: 'broad st fiat of houston',
proper_f: 'Broad St Fiat of Houston',
pos_criteria: ['fiat'],
neg_criteria: []
},
{
proper_status: 'formatted',
proper: 'hot-deal auto insurance',
proper_f: 'Hot-Deal Auto Insurance',
pos_criteria: [],
neg_criteria: ['insur']
},
{
proper_status: 'formatted',
proper: 'BUDGET - AUTOMOTORES ZONA & FRANCA, INC',
proper_f: 'Budget - Automotores Zona & Franca, Inc',
pos_criteria: [],
neg_criteria: ['budget']
},
{
proper_status: 'formatted',
proper: 'Young Gmc Trucks',
proper_f: 'Young GMC Trucks',
pos_criteria: ['gmc'],
neg_criteria: []
},
{
proper_status: 'formatted',
proper: 'youmans Chevrolet',
proper_f: 'Youmans Chevrolet',
pos_criteria: ['chevrolet'],
neg_criteria: []
},
{
proper_status: 'formatted',
proper: 'yazell chevy',
proper_f: 'Yazell Chevy',
pos_criteria: [],
neg_criteria: []
},
{
proper_status: 'formatted',
proper: 'quick cAr LUBE',
proper_f: 'Quick Car Lube',
pos_criteria: [],
neg_criteria: ['lube, quick']
},
{
proper_status: 'formatted',
proper: 'yAtEs AuTo maLL',
proper_f: 'Yates Auto Mall',
pos_criteria: [],
neg_criteria: []
},
{
proper_status: 'formatted',
proper: 'YADKIN VALLEY COLLISION CO',
proper_f: 'Yadkin Valley Collision Co',
pos_criteria: [],
neg_criteria: ['collis']
},
{
proper_status: 'formatted',
proper: 'XIT FORD INC',
proper_f: 'Xit Ford Inc',
pos_criteria: ['ford'],
neg_criteria: []
}
]
Adam J Booth - 4rlm
After checking out the repo, run bin/setup
to install dependencies. Then, run rake spec
to run the tests. You can also run bin/console
for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install
. To release a new version, update the version number in version.rb
, and then run bundle exec rake release
, which will create a git tag for the version, push git commits and tags, and push the .gem
file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/4rlm/scrub_db. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
The gem is available as open source under the terms of the MIT License.
Everyone interacting in the ScrubDb project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
FAQs
Unknown package
We found that scrub_db demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Research
Security News
Socket’s threat research team has detected six malicious npm packages typosquatting popular libraries to insert SSH backdoors.
Security News
MITRE's 2024 CWE Top 25 highlights critical software vulnerabilities like XSS, SQL Injection, and CSRF, reflecting shifts due to a refined ranking methodology.
Security News
In this segment of the Risky Business podcast, Feross Aboukhadijeh and Patrick Gray discuss the challenges of tracking malware discovered in open source softare.