
instamancer
Scrape Instagram's API with Puppeteer.
Instamancer is a new type of scraping tool that leverages Puppeteer's ability to intercept requests made by a webpage to an API.
Read more about how Instamancer works here.
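The interception idea described above can be sketched without a browser. This is a minimal sketch and not Instamancer's actual implementation: `ResponseLike`, `makeCollector`, and the `/graphql/query` URL marker are hypothetical names chosen for illustration, and the endpoint Instamancer really watches may differ. The handler relies only on `url()` and `json()`, two methods that Puppeteer's response objects provide.

```typescript
// Minimal structural type covering the two response methods the handler needs.
interface ResponseLike {
    url(): string;
    json(): Promise<unknown>;
}

// Hypothetical marker for API traffic (an assumption, not taken from Instamancer).
const API_MARKER = "/graphql/query";

// Builds a handler that stores the JSON body of every response that looks like an API call.
function makeCollector(marker: string = API_MARKER) {
    const payloads: unknown[] = [];
    const onResponse = async (response: ResponseLike): Promise<void> => {
        if (!response.url().includes(marker)) {
            return; // Ignore images, scripts and other page assets
        }
        try {
            payloads.push(await response.json());
        } catch (err) {
            // The body was not valid JSON, so skip it
        }
    };
    return {payloads, onResponse};
}
```

With Puppeteer, the handler would be registered through `page.on("response", collector.onResponse)` before navigating, so that every matching API response the page triggers ends up in `collector.payloads`.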
Metadata that Instamancer is able to gather from posts:
Enable user namespace cloning:
sysctl -w kernel.unprivileged_userns_clone=1
Or run without a sandbox:
# WARNING: unsafe
export NO_SANDBOX=true
If you wish to install Instamancer without downloading Chromium, set the PUPPETEER_SKIP_CHROMIUM_DOWNLOAD environment variable before installation:
export PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
npm install -g instamancer
If you're installing globally as root, use the following command so that the Puppeteer dependency is installed as well:
sudo npm install -g instamancer --unsafe-perm=true
npx instamancer
git clone https://github.com/ScriptSmith/instamancer.git
cd instamancer
npm install
npm run build
npm install -g
$ instamancer
Usage: instamancer <command> [options]
Commands:
instamancer hashtag [id] Scrape a hashtag
instamancer user [id] Scrape a users posts
instamancer post [ids] Scrape a comma-separated list of posts
instamancer search [query] Perform a search of users, tags and places
instamancer batch [batchfile] Read newline-separated arguments from a file
Configuration
--count, -c Number of posts to download (0 for all) [number] [default: 0]
--full, -f Retrieve full post data [boolean] [default: false]
--sleep, -s Seconds to sleep between interactions [number] [default: 2]
--graft, -g Enable grafting [boolean] [default: true]
--browser, -b Browser path. Defaults to the puppeteer version [string]
--sameBrowser Use a single browser when grafting [boolean] [default: false]
Download
--download, -d Save images from posts [boolean] [default: false]
--downdir Download path [default: "downloads/[endpoint]/[id]"]
--video, -v Download videos (requires full) [boolean] [default: false]
--sync Force download between requests [boolean] [default: false]
--threads, -k Parallel download / depot threads [number] [default: 4]
--waitDownload, -w Download media after scraping [boolean] [default: false]
Upload
--bucket Upload files to an AWS S3 bucket [string]
--depot Upload files to a URL with a PUT request (depot) [string]
Output
--file, -o Output filename. '-' for stdout [string] [default: "[id]"]
--type, -t Filetype [choices: "csv", "json", "both"] [default: "json"]
--mediaPath, -m Add filepaths to _mediaPath [boolean] [default: false]
Display
--visible Show browser on the screen [boolean] [default: false]
--quiet, -q Disable progress output [boolean] [default: false]
Logging
--logging, -l [choices: "none", "error", "info", "debug"] [default: "none"]
--logfile Log file name [string] [default: "instamancer.log"]
Validation
--strict Throw an error on response type mismatch [boolean] [default: false]
Plugins
--plugin, -p Use a plugin from the plugins directory [array] [default: []]
Options:
--help Show help [boolean]
--version Show version number [boolean]
Examples:
instamancer hashtag instagood -fvd
    Download all the available posts, and their media from #instagood
instamancer user arianagrande --type=csv --logging=info --visible
    Download Ariana Grande's posts to a CSV file with a non-headless browser, and log all events
Source code available at https://github.com/ScriptSmith/instamancer
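The `batch` command in the help text reads newline-separated arguments from a file. The sketch below generates such a file's contents from a list of jobs. It assumes that each line holds the arguments that would otherwise follow `instamancer` on the command line; `BatchJob` and `buildBatchFile` are hypothetical helpers, not part of Instamancer.

```typescript
// One scraping job: a command from the help text, its id, and optional extra flags.
interface BatchJob {
    command: "hashtag" | "user" | "post" | "search";
    id: string;
    flags?: string[];
}

// Renders jobs as newline-separated argument lines, one job per line.
function buildBatchFile(jobs: BatchJob[]): string {
    return jobs
        .map((job) => [job.command, job.id, ...(job.flags ?? [])].join(" "))
        .join("\n");
}

const batch = buildBatchFile([
    {command: "hashtag", id: "instagood", flags: ["-fvd"]},
    {command: "user", id: "arianagrande", flags: ["--type=csv", "--logging=info"]},
]);
console.log(batch);
```

Saved to a file (for example `jobs.txt`, a made-up name), the result could then be passed as `instamancer batch jobs.txt`.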
ES2018 TypeScript example:
import {createApi, IOptions} from "instamancer";

const options: IOptions = {
    total: 10
};
const hashtag = createApi("hashtag", "beach", options);

(async () => {
    for await (const post of hashtag.generator()) {
        console.log(post);
    }
})();
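The `for await` loop above works with any async iterable, so collecting a bounded number of posts can be wrapped in a small helper. The following is a sketch under stated assumptions: `take` and `fakePosts` are hypothetical names, and `fakePosts` only stands in for `hashtag.generator()` so that the code runs without a browser or network access.

```typescript
// Collects at most `limit` items from any async iterable, then stops iterating.
async function take<T>(source: AsyncIterable<T>, limit: number): Promise<T[]> {
    const items: T[] = [];
    if (limit <= 0) {
        return items;
    }
    for await (const item of source) {
        items.push(item);
        if (items.length >= limit) {
            break; // Leaving the loop early lets the generator clean up
        }
    }
    return items;
}

// Stand-in for hashtag.generator(): yields fake posts forever.
async function* fakePosts(): AsyncGenerator<{id: number}> {
    let id = 0;
    while (true) {
        yield {id: id++};
    }
}

take(fakePosts(), 3).then((posts) => console.log(posts.length)); // prints 3
```

A helper like this keeps the stopping condition on the caller's side, which can be handy when the limit depends on the posts themselves rather than on a fixed `total`.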
import {createApi} from "instamancer"
createApi("hashtag", id, options);
createApi("user", id, options);
createApi("post", ids, options);
createApi("search", query, options);
const options: Instamancer.IOptions = {
    // Total posts to download. 0 for unlimited
    total: number,

    // Run Chrome in headless mode
    headless: boolean,

    // Logging events
    logger: winston.Logger,

    // Run without output to stdout
    silent: boolean,

    // Time to sleep between interactions with the page
    sleepTime: number,

    // Throw an error if type validation fails
    strict: boolean,

    // Time to sleep when rate-limited
    hibernationTime: number,

    // Enable the grafting process
    enableGrafting: boolean,

    // Extract the full amount of information from the API
    fullAPI: boolean,

    // Use a proxy in Chrome to connect to Instagram
    proxyURL: string,

    // Location of the chromium / chrome binary executable
    executablePath: string,

    // Custom io-ts validator
    validator: Type<unknown>,

    // Custom plugins
    plugins: IPlugin[]
}
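As an illustration of how these fields might be filled in, the sketch below merges caller overrides into a set of defaults. Everything here is an assumption made for the example: `ScrapeOptions` is a local subset of the fields listed above rather than the library's own type, and the default values are copied from the command-line help text (`--count 0`, `--sleep 2`, `--graft true`, and so on), which the library itself may or may not share.

```typescript
// Local subset of the option fields listed above; not the library's own type.
interface ScrapeOptions {
    total: number;
    headless: boolean;
    silent: boolean;
    sleepTime: number;
    strict: boolean;
    enableGrafting: boolean;
    fullAPI: boolean;
}

// Assumed defaults, mirroring the CLI help text rather than the library source.
const CLI_LIKE_DEFAULTS: ScrapeOptions = {
    total: 0, // 0 means unlimited
    headless: true, // --visible defaults to false
    silent: false, // --quiet defaults to false
    sleepTime: 2,
    strict: false,
    enableGrafting: true,
    fullAPI: false,
};

// Returns a complete options object, letting overrides win over the defaults.
function withDefaults(overrides: Partial<ScrapeOptions> = {}): ScrapeOptions {
    if (overrides.total !== undefined && overrides.total < 0) {
        throw new RangeError("total must be 0 (unlimited) or a positive number");
    }
    return {...CLI_LIKE_DEFAULTS, ...overrides};
}

const scrapeOptions = withDefaults({total: 10, fullAPI: true});
console.log(scrapeOptions);
```

The resulting object could then be passed as the third argument to `createApi`, provided the field names match the `IOptions` type of the installed version.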
A comparison of Instagram scraping tools. Please suggest more tools and criteria through a pull request.
To see a speed comparison, visit this page
Tool | Hashtags | Users | Tagged posts | Locations | Posts | Stories | Login not required | Private feeds | Batch mode | Plugins | Command-line | Library/Module | Download media | Download metadata | Scraping method | Daily builds | Main language | Speed | License | Last commit | Open Issues | Closed Issues | Build status | Test coverage | Code quality |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Instamancer | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :x: | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | Web API request interception | :heavy_check_mark: | Typescript | ||||||||
Instaphyte | :heavy_check_mark: | :x: | :x: | :x: | :x: | :x: | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | Web API simulation | :heavy_check_mark: | Python | ||||||||
Instaloader | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | Web API simulation | :x: | Python | :question: | :question: | ||||||
Instalooter | :heavy_check_mark: | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | Web API simulation | :x: | Python | ||||||||
Instagram crawler | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :x: | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: | :heavy_check_mark: | Web DOM reading | :x: | Python | :question: | :question: | :question: | |||||
Instagram Scraper | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: | :heavy_check_mark: | :x: | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | Web API simulation | :x: | Python | :question: | :question: | ||||||
Instagram Private API | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | App and Web API simulation | :x: | Python | :question: | :question: | :question: | |||||
Instagram PHP Scraper | :heavy_check_mark: | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | Web API simulation | :x: | PHP | :question: | :question: | :question: | :question: |
FAQs
Scrape the Instagram API with Puppeteer
We found that instamancer showed an unhealthy version release cadence and level of project activity, because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.