Socket
Book a DemoInstallSign in
Socket

wasapi_client

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

wasapi_client

0.2.0
bundlerRubygems
Version published
Maintainers
1
Created
Source

CircleCI codecov

WasapiClient

WasapiClient is a Ruby gem that acts as a client to Internet Archive's WASAPI APIs. It gets information about WARCs and downloads them. It is a successor to wasapi-downloader but is not provider-generic and is intended for use with Archive-It collections.

Installation

Once the gem has been published, it will be possible to install the gem and add to the application's Gemfile by executing:

bundle add wasapi_client

If bundler is not being used to manage dependencies, install the gem by executing:

gem install wasapi_client

Usage

Each Archive-It account has its own username and password for downloading WARCs. An account includes many collections, which each have a numeric id. Since we have many accounts, when making requests we need to provide the username and password for the account to which the Archive-It collection belongs.

require 'wasapi_client'

# NOTE: The settings below live in the consumer, not in the gem.
client = WasapiClient.new(username: 'username', password: 'password')
client.fetch_warcs(
  output_dir: 'path/to/save/warcs',
  collection: '12345',
  crawl_start_after: '2023-01-01',
  crawl_start_before: '2023-01-31'
)

# Get filenames for a collection (used when auditing)
client.filenames(
  collection: '12345',
  crawl_start_after: '2025-01-01',
  crawl_start_before: '2025-06-30'
)

# Fetch a single WARC by URL
client.fetch_file(
  file: 'https://warcs.archive-it.org/webdatafile/ARCHIVEIT-123-example.warc.gz',
  output_dir: 'path/to/save/warcs'
)

# Fetch a single WARC by filename (used when auditing/remediating)
client.fetch_file(
  file: 'ARCHIVEIT-123-example.warc.gz',
  output_dir: 'path/to/save/warcs',
  base_url: 'https://other-archive-it-location.org'
)

# Get the URLs for WARCs meeting collection and crawl time criteria
client.get_locations(
  collection: '12345',
  crawl_start_after: '2025-01-01',
  crawl_start_before: '2025-06-30'
)

TODO

  • Add store-time- params to support usage with backfill downloads

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install.

To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and the created tag, and push the .gem file to rubygems.org.

FAQs

Package last updated on 17 Jul 2025

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

About

Packages

Stay in touch

Get open source security insights delivered straight into your inbox.

  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc

U.S. Patent No. 12,346,443 & 12,314,394. Other pending.