
Snapcrawl is a command-line utility for crawling a website and saving screenshots.
Using Docker
You can run Snapcrawl by using this docker image (which contains all the necessary prerequisites):
$ alias snapcrawl='docker run --rm -it --network host --volume "$PWD:/app" dannyben/snapcrawl'
For more information on the Docker image, refer to the docker-snapcrawl repository.
Using Ruby
$ gem install snapcrawl
Note that Snapcrawl requires PhantomJS and ImageMagick.
Snapcrawl can be configured either through a YAML configuration file or by specifying options on the command line.
$ snapcrawl
Usage:
snapcrawl URL [--config FILE] [SETTINGS...]
snapcrawl -h | --help
snapcrawl -v | --version
The default configuration filename is snapcrawl.yml.
Using the --config flag will create a template configuration file if it is not present:
$ snapcrawl example.com --config snapcrawl
All configuration options can be specified in the command line as key=value pairs:
$ snapcrawl example.com log_level=0 depth=2 width=1024
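To illustrate how key=value arguments like the ones above could map onto the same settings a YAML file would provide, here is a minimal Ruby sketch. It is not Snapcrawl's actual implementation: the `parse_settings` helper is hypothetical, and parsing each value as YAML (so that `0` becomes an integer) is an assumption made for the example.

```ruby
require 'yaml'

# Hypothetical helper (not part of Snapcrawl): converts arguments such as
# ["log_level=0", "depth=2"] into a settings hash.
def parse_settings(args)
  args.each_with_object({}) do |arg, settings|
    key, value = arg.split('=', 2)
    next if value.nil? # skip arguments that are not key=value pairs

    # Assumption: values are read as YAML scalars, so "0" becomes the
    # integer 0 just as it would in a configuration file.
    settings[key] = value.empty? ? nil : YAML.safe_load(value)
  end
end

settings = parse_settings(%w[log_level=0 depth=2 width=1024])
p settings
```

Arguments without an `=` (such as the URL) are skipped, so the full argument list could be passed through unchanged.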
# All values below are the default values
# log level (0-4) 0=DEBUG 1=INFO 2=WARN 3=ERROR 4=FATAL
log_level: 1
# log_color (yes, no, auto)
# yes = always show log color
# no = never use colors
# auto = only use colors when running in an interactive terminal
log_color: auto
# number of levels to crawl, 0 means capture only the root URL
depth: 1
# screenshot width in pixels
width: 1280
# screenshot height in pixels, 0 means the entire height
height: 0
# number of seconds to consider the page cache and its screenshot fresh
cache_life: 86400
# where to store the HTML page cache
cache_dir: cache
# where to store screenshots
snaps_dir: snaps
# screenshot filename template, where '%{url}' will be replaced with a
# slug version of the URL (no need to include the .png extension)
name_template: '%{url}'
# urls not matching this regular expression will be ignored
url_whitelist:
# urls matching this regular expression will be ignored
url_blacklist:
# take a screenshot of this CSS selector only
css_selector:
# when true, ignore SSL related errors
skip_ssl_verification: false
# set to any number of seconds to wait for the page to load before taking
# a screenshot, leave empty to not wait at all (only needed for pages with
# animations or other post-load events).
screenshot_delay:
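The following Ruby sketch shows one way the `url_whitelist`, `url_blacklist`, `name_template`, and `cache_life` settings above could be interpreted. It is an illustration under stated assumptions, not Snapcrawl's own code: the helper names (`crawl?`, `slug`, `snapshot_name`, `fresh?`), the sample setting values, and the exact slug rule are all hypothetical.

```ruby
# Hypothetical settings, shaped like the configuration file above.
config = {
  'url_whitelist' => 'example\.com/docs',
  'url_blacklist' => '\.pdf$',
  'name_template' => 'site-%{url}',
  'cache_life' => 86_400
}

# A URL is kept when it matches the whitelist (if set) and does not
# match the blacklist (if set).
def crawl?(url, config)
  whitelist = config['url_whitelist']
  blacklist = config['url_blacklist']
  return false if whitelist && url !~ Regexp.new(whitelist)
  return false if blacklist && url =~ Regexp.new(blacklist)

  true
end

# Assumed slug rule: lowercase, runs of non-alphanumerics become '-',
# and a leading or trailing '-' is trimmed.
def slug(url)
  url.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-|-\z/, '')
end

# Fills '%{url}' in the template using Ruby's built-in format, then
# appends the .png extension.
def snapshot_name(url, config)
  format(config['name_template'], url: slug(url)) + '.png'
end

# A cached page counts as fresh while it is younger than cache_life seconds.
def fresh?(saved_at, config, now = Time.now)
  now - saved_at < config['cache_life']
end

puts crawl?('https://example.com/docs/intro', config)
puts snapshot_name('https://example.com/docs/intro', config)
```

With these sample values, a URL under `example.com/docs` is crawled unless it ends in `.pdf`, and its screenshot name is built from the slug of the URL.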
If you experience any issue, have a question or a suggestion, or if you wish to contribute, feel free to open an issue.