@noscrape/noscrape

protect your content from scraping



Project Goal

This project aims to help you prevent others from scraping your content.




Concept

The core idea is to take any TrueType font and let noscrape generate a new version of it with shuffled Unicode code points and nothing that could be used to calculate them back. Strings and integers are obfuscated and can only be read when rendered with the generated obfuscation font.
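
To illustrate the idea of shuffled code points, here is a minimal sketch that is independent of noscrape's actual implementation. It only shows the text side: every distinct character gets a random, unique code point from the Private Use Area. The names `shuffle`, `buildMapping` and `translate` are hypothetical; in a real font, the glyph of each original character would additionally have to be registered under its new code point.

```javascript
// Hypothetical sketch of code point shuffling; not noscrape's real code.
const PUA_START = 0xe000; // first code point of the Unicode Private Use Area
const PUA_END = 0xf8ff; // last code point of the Private Use Area

// Fisher-Yates shuffle; `random` can be injected for testing.
function shuffle(items, random = Math.random) {
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Assign every distinct character of `text` a random, unique target character.
// Assumes the text has fewer distinct characters than the range has code points.
function buildMapping(text, random = Math.random) {
  const chars = [...new Set(text)];
  const pool = [];
  for (let cp = PUA_START; cp <= PUA_END; cp++) pool.push(cp);
  const targets = shuffle(pool, random);
  return new Map(chars.map((ch, i) => [ch, String.fromCodePoint(targets[i])]));
}

// Replace each character by its mapped counterpart; unmapped characters pass through.
function translate(text, mapping) {
  return [...text].map((ch) => mapping.get(ch) ?? ch).join('');
}

const mapping = buildMapping('obfuscation');
const hidden = translate('obfuscation', mapping);
```

Without the mapping (which stays on the server) or the matching font, `hidden` is just a sequence of private-use characters with no standard meaning.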



What cannot be removed from the font are the glyph paths. At the moment the paths are obfuscated by shifting them randomly by a small amount (see the obfuscation strength multiplier under Options). This makes it hard, but not impossible, to calculate them back, and an ML algorithm might still be able to "guess" them.
It would be nice if someone came up with a better solution or helped to improve this 😅
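
The random shifting of glyph paths can be pictured roughly as follows. This is only a sketch under assumptions: the point format (`{ x, y }` in font units), the function name `jitterPath`, and the way `strength` scales the offset are all hypothetical and may differ from what noscrape does internally.

```javascript
// Hypothetical sketch: move every outline point by a small random offset.
// `strength` bounds the offset per axis; `random` can be injected for testing.
function jitterPath(points, strength = 1, random = Math.random) {
  return points.map(({ x, y }) => ({
    x: x + (random() * 2 - 1) * strength,
    y: y + (random() * 2 - 1) * strength,
  }));
}

// A simple square outline standing in for a glyph path.
const square = [
  { x: 0, y: 0 },
  { x: 100, y: 0 },
  { x: 100, y: 100 },
  { x: 0, y: 100 },
];

const shifted = jitterPath(square, 1);
```

With a small `strength` the glyph still looks the same to a reader, but its coordinates no longer match those of the original font byte for byte.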




IMPORTANT NOTE

Bots cannot process obfuscated text, which can lead to unpredictable analytics results, among other things.
So be careful about using this technique on content that matters for indexed pages!

The obfuscation takes time (around 50-60 ms on my machine 😉).
This should not be a problem for prerendered pages. For API requests, consider moving the obfuscation logic into a cronjob-like task and reusing its result multiple times instead of calculating everything again for every request.
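
One way to reuse a result across requests is a small time-based cache. The following is a minimal sketch, not part of noscrape: `createCachedObfuscation` is a hypothetical helper, and `compute` stands in for whatever expensive obfuscation call you make.

```javascript
// Hypothetical sketch: compute once, then reuse the result until it is too old.
// `now` can be injected so the behaviour is testable without waiting.
function createCachedObfuscation(compute, maxAgeMs, now = Date.now) {
  let hasValue = false;
  let cached;
  let createdAt = 0;

  return function get() {
    if (!hasValue || now() - createdAt >= maxAgeMs) {
      cached = compute(); // the expensive step runs only here
      createdAt = now();
      hasValue = true;
    }
    return cached;
  };
}

// Usage idea (assumes an `obfuscate` function as in the example below):
// const getObfuscated = createCachedObfuscation(
//   () => obfuscate(object, 'path/to/your/font.ttf'),
//   60 * 1000
// );
```

Every request handler would then call `getObfuscated()` and pay the obfuscation cost at most once per interval. Note that a longer interval also gives a scraper more time to work against one fixed mapping.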


Example

// server-side obfuscation
const object = { title: "noscrape", text: "obfuscation" }
const { font, value }  = obfuscate(object, 'path/to/your/font.ttf')


⬇⬇⬇⬇     provide data     ⬇⬇⬇⬇


// the font is provided as a Buffer
const b64 = font.toString('base64')

<!-- client-side visualization -->

<style> 
    @font-face {        
        font-family: 'noscrape-obfuscated';        
        src: url('data:font/truetype;charset=utf-8;base64,${b64}');    
    }
</style>

...

<span style="font-family: noscrape-obfuscated">
    <div>{ value.title }</div>
    <div>{ value.text }</div>
</span>    
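
If no template engine is involved, the markup above could be assembled on the server roughly like this. It is a sketch under assumptions: `renderObfuscated` and `escapeHtml` are hypothetical helpers, `font` is expected to be a Node.js Buffer, and `value` an object with `title` and `text` as in the example.

```javascript
// Escape characters that have a special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: embed the font as a data URL and render the values with it.
function renderObfuscated(font, value) {
  const b64 = font.toString('base64');
  return [
    '<style>',
    '  @font-face {',
    "    font-family: 'noscrape-obfuscated';",
    `    src: url('data:font/truetype;charset=utf-8;base64,${b64}');`,
    '  }',
    '</style>',
    '<div style="font-family: noscrape-obfuscated">',
    `  <div>${escapeHtml(value.title)}</div>`,
    `  <div>${escapeHtml(value.text)}</div>`,
    '</div>',
  ].join('\n');
}
```

Escaping the values is a precaution here: depending on the chosen character range, obfuscated strings could contain characters that HTML treats specially.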

example-code

live demo


Options

strength

obfuscation strength multiplier (default: 1)
values below 0.1 make no sense (the paths can simply be calculated back)
values above 10 make no sense (the text looks like 💩)

characterRange

character range used for the obfuscated characters
PRIVATE_USE_AREA (default)
LATIN
GREEK
CYRILLIC
HIRAGANA
KATAKANA

lowMemory

use only if memory is limited and noscrape cannot load the given font file
default: false
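
The defaults and sensible bounds listed above can be summarised in a small helper. This is only a sketch: `normalizeOptions` is hypothetical and not part of noscrape, and clamping `strength` into the range 0.1 to 10 is a design choice made here, not documented behaviour. How the options are actually passed to noscrape is not shown in this README.

```javascript
// Character range names as listed in the options above.
const CHARACTER_RANGES = [
  'PRIVATE_USE_AREA',
  'LATIN',
  'GREEK',
  'CYRILLIC',
  'HIRAGANA',
  'KATAKANA',
];

// Hypothetical helper: fill in defaults and keep `strength` within 0.1 to 10.
function normalizeOptions(options = {}) {
  const {
    strength = 1,
    characterRange = 'PRIVATE_USE_AREA',
    lowMemory = false,
  } = options;

  if (typeof strength !== 'number' || Number.isNaN(strength)) {
    throw new TypeError('strength must be a number');
  }
  if (!CHARACTER_RANGES.includes(characterRange)) {
    throw new RangeError(`unknown characterRange: ${characterRange}`);
  }

  return {
    strength: Math.min(10, Math.max(0.1, strength)),
    characterRange,
    lowMemory: Boolean(lowMemory),
  };
}
```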



Contributions

Contributions, issues and feature requests are very welcome. If you are using this package and fixed a bug for yourself, please consider submitting a PR!




License

MIT @ Bernhard Schönberger


Package last updated on 28 Jul 2022
