ramparts

  • 0.3.1
  • Rubygems
Maintainers
2

Ramparts - Spam Detection

Parses blocks of text to find phone numbers (including phonetically spelled numbers), emails, and spam URLs.

Example

Find obfuscated phone numbers

>> message = "Contact me directly ( FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE  ). Hope you cracked that number code."
>> Ramparts.find_phone_numbers(message)
[{start_offset: 22, end_offset: 71, type: :phone, value: 'FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE'}]
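
The offsets in a match can be used to slice the matched text back out of the original message. A minimal sketch, assuming end_offset is exclusive (which is what the numbers in the example above imply). The helper name matched_text is hypothetical and the gem itself is not called here.

```ruby
# Hypothetical helper: slices the matched text out of the original message.
# Assumes end_offset is exclusive, as the example's numbers suggest.
def matched_text(message, match)
  message[match[:start_offset]...match[:end_offset]]
end

PHONE_MESSAGE = "Contact me directly ( FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE  ). Hope you cracked that number code."
# Offsets copied from the example output above.
PHONE_MATCH = { start_offset: 22, end_offset: 71, type: :phone }.freeze

puts matched_text(PHONE_MESSAGE, PHONE_MATCH)
# prints: FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE
```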

Find obfuscated emails

>> message = "Looking for honest worker .. contact ashley73299 AT yahoo dot com for more info"
>> Ramparts.find_emails(message)
[{start_offset: 37, end_offset: 65, type: :email, value: 'ashley73299 AT yahoo dot com'}]

Find both obfuscated emails and phone numbers.

>> message = "Looking for honest worker .. contact ashley73299 AT yahoo dot com or FOUR FIVE ONE 456 8900 for more info"
>> Ramparts.find_phone_numbers_and_emails(message)
[{start_offset: 37, end_offset: 65, type: :email, value: 'ashley73299 AT yahoo dot com'}, {start_offset: 70, end_offset: 92, type: :phone, value: 'FOUR FIVE ONE 456 8900'}]

Count the occurrences of well known spam URLs and keywords

>> message = "cialis vs viagra spam guestbook.php?action=http://cialiswalmart.shop"
>> Ramparts.count_urls(message)
3

Installation

In the root directory of your project

gem install ramparts

Remember to require ramparts as necessary

require 'ramparts'

API

count_phone_numbers(text, options = {})
  • Description: Returns the number of phone numbers in the text. Currently uses a map-reduce paradigm, which loses information but is cleaner to implement, achieves better results, and is ~2x faster than find_phone_numbers.
  • Input:
    • text [String]
    • options [Hash]
      • parse_leet [Boolean][Default → True]
        • Parses phone numbers written in l33t speak. When true, input such as FivE 4 3 F0r On3 67 NiN3 is caught.
      • remove_spaces [Boolean][Default → True]
        • Parses phone numbers with spaces between the characters. When true, input such as F i v E 4 3 F 0 r O n 3 67 N i N 3 is caught.
  • Output:
    • number of occurrences of phone numbers [Integer]
  • Example
    • Input:
      • text → "If you're interested in this position, do contact me directly on my phone number ( FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE ). Hope you cracked that number code."
    • Output: 1
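One way to read the map-reduce description: map every obfuscation to plain digits, then reduce by counting digit runs. The sketch below is not the gem's implementation. The word table, the method name count_phone_numbers_sketch, and the 7 to 15 digit bounds are assumptions, and l33t parsing is left out. Because the text is rewritten before counting, the original positions are lost, which is the information loss mentioned above.

```ruby
# Hypothetical word table; the gem's real mappings are not shown in this README.
NUMBER_WORDS = {
  "zero" => "0", "one" => "1", "two" => "2", "three" => "3", "four" => "4",
  "five" => "5", "six" => "6", "seven" => "7", "eight" => "8", "nine" => "9"
}.freeze
NUMBER_WORD_PATTERN = /\b(?:#{NUMBER_WORDS.keys.join("|")})\b/

def count_phone_numbers_sketch(text)
  # Map step: normalise each obfuscation into plain digits.
  digits = text.downcase
               .gsub(/\b([a-z]) (?=[a-z]\b)/, '\1')               # "e i g h t" -> "eight"
               .gsub(NUMBER_WORD_PATTERN) { |word| NUMBER_WORDS[word] } # "eight" -> "8"
               .gsub(/(?<=\d) +(?=\d)/, "")                        # "4 1 5" -> "415"
  # Reduce step: count digit runs long enough to look like a phone number.
  # The 7..15 bounds are an assumption, not taken from the gem.
  digits.scan(/\d{7,15}/).length
end

text = "If you're interested in this position, do contact me directly on my phone number " \
       "( FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE ). Hope you cracked that number code."
puts count_phone_numbers_sketch(text)
# prints: 1
```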
find_phone_numbers(text, options = {})
  • Description: Finds all occurrences of phone numbers within a block of text, even when l33t speak, phonetics, and space variations are used.
  • Input:
    • text [String]
    • options [Hash]
      • To Be Implemented
  • Output:
    • [Array]
      • match [Hash]
        • start_offset: [Integer]
        • end_offset: [Integer]
        • type: [Symbol]
        • value: [String]
  • Example
    • Input:
      • text → "If you're interested in this position, do contact me directly on my phone number ( FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE ). Hope you cracked that number code."
    • Output: [{start_offset: 84, end_offset: 133, type: :phone, value: 'FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE'}]
replace_phone_numbers(text, options = {}, &block)
  • Description: Replaces all occurrences of phone numbers within the text with what is returned by the block. Returns the redacted text.
  • Input:
    • text [String]
    • insertable [String]
    • options [Hash]
      • To Be Implemented
  • Output:
    • updated text [String]
  • Example
    • Usage: altered_text = replace_phone_numbers(...) do "CENSORED" end
    • Input:
      • text → "If you're interested in this position, do contact me directly on my phone number ( FOUR ONE FIVE E I G H T 9 FOUR TWO EIGHT SIX FIVE ). Hope you cracked that number code."
    • Output: "If you're interested in this position, do contact me directly on my phone number ( CENSORED ). Hope you cracked that number code."
count_emails(text, options = {})
  • Description: Returns the number of emails in the text. Currently uses a map-reduce paradigm, which loses information but is cleaner to implement, achieves better results, and is ~2x faster than find_emails.
  • Input:
    • text [String]
    • options [Hash]
      • aggressive [Boolean] [Default → False]
        • Doesn't require a . or dot followed by a TLD at the end; instead compares the last word against a list of well-known email domains (e.g. contact ashley @ yandex for more info would be caught).
  • Output:
    • number of occurrences of emails [Integer]
  • Example
    • Input:
      • text → "Hi, Are you seriously interested ..Looking for honest worker .. My e-mail is ashley73299 AT yahoo dot com, I repeat ashley73299 @ yahoo . com ?.. Ashley"
    • Output: 2
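The same map-then-count idea can be sketched for emails: rewrite the obfuscated separators into @ and ., then count what a plain email pattern finds. This is only an illustration under assumptions, not the gem's code. The method name count_emails_sketch and both patterns are hypothetical, and only the uppercase AT is treated as @ here, since the lowercase word is too common to rewrite safely.

```ruby
def count_emails_sketch(text)
  # Map step: turn spaced or spelled-out separators into real ones.
  normalised = text
               .gsub(/\s+(?:AT|@)\s+/, "@")        # "ashley AT yahoo" -> "ashley@yahoo"
               .gsub(/\s+(?:dot|DOT|\.)\s+/, ".")  # "yahoo dot com"   -> "yahoo.com"
  # Reduce step: count plain-looking addresses in the normalised text.
  normalised.scan(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/).length
end

text = "Hi, Are you seriously interested ..Looking for honest worker .. " \
       "My e-mail is ashley73299 AT yahoo dot com, I repeat ashley73299 @ yahoo . com ?.. Ashley"
puts count_emails_sketch(text)
# prints: 2
```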
find_emails(text, options = {})
  • Description: Finds all occurrences of emails within a block of text, even when l33t speak or phonetics are used.
  • Input:
    • text [String]
    • options [Hash]
      • aggressive [Boolean] [Default → False]
        • Doesn't require a . or dot followed by a TLD at the end; instead compares the last word against a list of well-known email domains (e.g. contact ashley @ yandex for more info would be caught).
      • check_for_at [Boolean] [Default → False]
        • Treats the word 'at' as '@'. This can currently make the algorithm overly greedy, because 'at' is such a common word.
  • Output:
    • [Array]
      • match [Hash]
        • start_offset: [Integer]
        • end_offset: [Integer]
        • type: [Symbol]
        • value: [String]
  • Example
    • Input:
      • text → "Hi, Are you seriously interested ..Looking for honest worker .. My e-mail is ashley73299 AT yahoo dot com, I repeat ashley73299 @ yahoo . com ?.. Ashley"
    • Output: [{start_offset: 78, end_offset: 106, type: :email, value: 'ashley73299 AT yahoo dot com'}, {start_offset: 118, end_offset: 143, type: :email, value: 'ashley73299 @ yahoo . com'}]
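The aggressive option described above can be pictured as a lookup against a domain list instead of a TLD check. A rough sketch under assumptions: the domain list, the pattern, and the method name aggressive_email_matches are all hypothetical, and the gem's own list is not reproduced here.

```ruby
# Hypothetical list of well-known email domains (without TLDs).
KNOWN_EMAIL_DOMAINS = %w[gmail yahoo yandex hotmail outlook].freeze

def aggressive_email_matches(text)
  matches = []
  # Look for "<local part> @ <word>" with optional spaces around the @.
  text.scan(/[\w.+-]+\s*@\s*([a-z]+)/i) do
    m = Regexp.last_match
    # Keep the candidate only if the word after @ is a known domain.
    next unless KNOWN_EMAIL_DOMAINS.include?(m[1].downcase)

    matches << { start_offset: m.begin(0), end_offset: m.end(0), type: :email, value: m[0] }
  end
  matches
end

p aggressive_email_matches("contact ashley @ yandex for more info")
```

With the list above, the yandex example is kept, while something like "ping me @ home" is dropped because home is not a known domain.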
replace_emails(text, options = {}, &block)
  • Description: Replaces all occurrences of emails within the text with what is returned by the block. Returns the redacted text.
  • Input:
    • text [String]
    • options [Hash]
      • aggressive [Boolean] [Default → False]
        • Doesn't require a . or dot followed by a TLD at the end; instead compares the last word against a list of well-known email domains (e.g. contact ashley @ yandex for more info would be caught).
      • check_for_at [Boolean] [Default → False]
        • Treats the word 'at' as '@'. This can currently make the algorithm overly greedy, because 'at' is such a common word.
  • Output:
    • updated text [String]
  • Example
    • Usage: altered_text = replace_emails(...) do "CENSORED" end
    • Input:
      • text → "My name is Cynthia, a friend of mine needs a nanny to watch her baby in your area, her contact is ( jbush042@gmail.com ) She will be waiting to hear from you kindly send her an email now!"
    • Output: "My name is Cynthia, a friend of mine needs a nanny to watch her baby in your area, her contact is ( CENSORED ) She will be waiting to hear from you kindly send her an email now!"
count_phone_numbers_and_emails(text, options = {})
  • Description: Returns the number of phone numbers and emails in the text. Currently uses a map-reduce paradigm, which loses information but is cleaner to implement, achieves better results, and is ~2x faster than find_phone_numbers_and_emails.
  • Input:
    • text [String]
    • options [Hash]
      • parse_leet [Boolean][Default → True]
        • Parses phone numbers written in l33t speak. When true, input such as FivE 4 3 F0r On3 67 NiN3 is caught.
      • remove_spaces [Boolean][Default → True]
        • Parses phone numbers with spaces between the characters. When true, input such as F i v E 4 3 F 0 r O n 3 67 N i N 3 is caught.
      • aggressive [Boolean] [Default → False]
        • Doesn't require a . or dot followed by a TLD at the end; instead compares the last word against a list of well-known email domains (e.g. contact ashley @ yandex for more info would be caught).
      • check_for_at [Boolean] [Default → False]
        • Treats the word 'at' as '@'. This can currently make the algorithm overly greedy, because 'at' is such a common word.
  • Output:
    • number of occurrences of phone numbers and emails [Integer]
  • Example
    • Input:
      • text → "Hi, Are you seriously interested ..Looking for honest worker .. My e-mail is ashley73299 AT yahoo dot com, phone 416 090 78 NINE 5 ?.. Ashley"
    • Output: 2
find_phone_numbers_and_emails(text, options = {})
  • Description: Finds all occurrences of phone numbers and emails within a block of text.
  • Input:
    • text [String]
    • options [Hash]
      • parse_leet [Boolean][Default → True]
        • Parses phone numbers written in l33t speak. When true, input such as FivE 4 3 F0r On3 67 NiN3 is caught.
      • remove_spaces [Boolean][Default → True]
        • Parses phone numbers with spaces between the characters. When true, input such as F i v E 4 3 F 0 r O n 3 67 N i N 3 is caught.
      • aggressive [Boolean] [Default → False]
        • Doesn't require a . or dot followed by a TLD at the end; instead compares the last word against a list of well-known email domains (e.g. contact ashley @ yandex for more info would be caught).
      • check_for_at [Boolean] [Default → False]
        • Treats the word 'at' as '@'. This can currently make the algorithm overly greedy, because 'at' is such a common word.
  • Output:
    • [Array]
      • match [Hash]
        • start_offset: [Integer]
        • end_offset: [Integer]
        • type: [Symbol]
        • value: [String]
  • Example
    • Input:
      • text → "Hi, Are you seriously interested ..Looking for honest worker .. My e-mail is ashley73299 AT yahoo dot com, phone 416 090 78 NINE 5 ?.. Ashley"
    • Output: [{start_offset: 78, end_offset: 106, type: :email, value: 'ashley73299 AT yahoo dot com'}, {start_offset: 115, end_offset: 132, type: :phone, value: '416 090 78 NINE 5'}]
replace_phone_numbers_and_emails(text, options = {}, &block)
  • Description: Replaces all occurrences of phone numbers and emails within the text with what is returned by the block. Returns the redacted text.
  • Input:
    • text [String]
    • options [Hash]
      • parse_leet [Boolean][Default → True]
        • Parses phone numbers written in l33t speak. When true, input such as FivE 4 3 F0r On3 67 NiN3 is caught.
      • remove_spaces [Boolean][Default → True]
        • Parses phone numbers with spaces between the characters. When true, input such as F i v E 4 3 F 0 r O n 3 67 N i N 3 is caught.
      • aggressive [Boolean] [Default → False]
        • Doesn't require a . or dot followed by a TLD at the end; instead compares the last word against a list of well-known email domains (e.g. contact ashley @ yandex for more info would be caught).
      • check_for_at [Boolean] [Default → False]
        • Treats the word 'at' as '@'. This can currently make the algorithm overly greedy, because 'at' is such a common word.
  • Output:
    • updated text [String]
  • Example
    • Usage: altered_text = replace_phone_numbers_and_emails(...) do "CENSORED" end
    • Input:
      • text → "My name is Cynthia, a friend of mine needs a nanny to watch her baby in your area, her contact is ( jbush042@gmail.com or FOUR FIVE ONE 789 4568 ) She will be waiting to hear from you kindly send her an email now!"
    • Output: "My name is Cynthia, a friend of mine needs a nanny to watch her baby in your area, her contact is ( CENSORED or CENSORED ) She will be waiting to hear from you kindly send her an email now!"
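Block-based replacement over a list of matches can be sketched as follows. This is an illustration, not the gem's implementation: redact_matches and match_for are hypothetical helpers, end_offset is assumed to be exclusive, and passing the match hash to the block is an assumption as well. Replacing from the last match backwards keeps the earlier offsets valid while the string changes length.

```ruby
# Hypothetical helper, not part of the gem: replaces each match's span with
# whatever the block returns. Assumes end_offset is exclusive.
def redact_matches(text, matches)
  result = text.dup
  # Replace from the last match backwards so earlier offsets stay valid.
  matches.sort_by { |m| m[:start_offset] }.reverse_each do |m|
    result[m[:start_offset]...m[:end_offset]] = yield(m)
  end
  result
end

# Builds a match hash by locating a known substring (stand-in for a find_* call).
def match_for(text, value, type)
  start = text.index(value)
  { start_offset: start, end_offset: start + value.length, type: type, value: value }
end

REDACT_TEXT = "her contact is ( jbush042@gmail.com or FOUR FIVE ONE 789 4568 ) She will be waiting"
REDACT_MATCHES = [
  match_for(REDACT_TEXT, "jbush042@gmail.com", :email),
  match_for(REDACT_TEXT, "FOUR FIVE ONE 789 4568", :phone)
].freeze

puts redact_matches(REDACT_TEXT, REDACT_MATCHES) { "CENSORED" }
# prints: her contact is ( CENSORED or CENSORED ) She will be waiting
```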
count_urls(text, options = {})
  • Description: Simple union regex to check whether the text contains bad URLs or keywords, e.g. viagra/cialis. Returns the number of occurrences that appear in the text.
  • Input:
    • text [String]
    • options [Hash]
      • To Be Implemented
  • Output:
    • number of occurrences of matches [Integer]
  • Example
    • Input:
      • text → "cialis vs cialis spam guestbook.php?action=http://cialiswalmart.shop"
    • Output: 3
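
A union regex of this kind can be built with Ruby's Regexp.union. The sketch below assumes a two-word keyword list purely for illustration; the gem ships its own list, which this README does not enumerate, and the method name count_spam_keywords_sketch is hypothetical.

```ruby
# Hypothetical keyword list, for illustration only.
SPAM_KEYWORDS = %w[cialis viagra].freeze
# Regexp.union escapes each string and joins them with "|".
SPAM_PATTERN = Regexp.union(SPAM_KEYWORDS)

def count_spam_keywords_sketch(text)
  # Downcase first so the match is case-insensitive.
  text.downcase.scan(SPAM_PATTERN).length
end

puts count_spam_keywords_sketch("cialis vs cialis spam guestbook.php?action=http://cialiswalmart.shop")
# prints: 3
```

The third hit comes from the keyword embedded in the domain name, since the pattern has no word boundaries.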

Package last updated on 09 Oct 2018