
image-crawler-utils
A highly customizable image crawler framework, designed to download images together with their metadata using multi-threading. Several helper classes and functions are also provided to make it easier to build a custom image crawler of your own.
A GIF in the original README demonstrates a sample run of the crawler.
Please follow the rules of robots.txt, and use a small number of threads with a long delay time when crawling images. Frequent requests and heavy download traffic may get your IP address banned or your account suspended.
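As a concrete illustration, the request rate can be kept low through CrawlerSettings. The thread and delay options in the sketch below are hypothetical parameter names (left commented out); check the package documentation for the actual options.

from image_crawler_utils import CrawlerSettings

# A minimal politeness-oriented configuration sketch.
# NOTE: thread_num and thread_delay are illustrative names,
# not confirmed API; consult the image-crawler-utils docs.
polite_settings = CrawlerSettings(
    image_num=20,       # keep download volume small while testing
    # thread_num=2,     # hypothetical: few concurrent threads
    # thread_delay=5.0, # hypothetical: seconds to wait between requests
)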
It is recommended to install the package via pip:
pip install image-crawler-utils
Python >= 3.9 is required. Progress is reported through rich progress bars and logging messages (Jupyter Notebook support is included).

Running the example below will download the first 20 images from Danbooru matching the keyword / tag kuon_(utawarerumono) and rating:general into a "Danbooru" folder. Information about the images will be stored in image_info_list.json at the same path as your program. Note that the proxies may need to be changed manually.
from image_crawler_utils import CrawlerSettings, Downloader, save_image_infos
from image_crawler_utils.stations.booru import DanbooruKeywordParser

#======================================================================#
# This part prepares the settings for crawling and downloading images. #
#======================================================================#
crawler_settings = CrawlerSettings(
    image_num=20,
    # If you do not use system proxies, remove '#' and set the proxies manually.
    # proxies={"https": "socks5://127.0.0.1:7890"},
)

#==================================================================#
# This part gets the URLs and information of images from Danbooru. #
#==================================================================#
parser = DanbooruKeywordParser(
    crawler_settings=crawler_settings,
    standard_keyword_string="kuon_(utawarerumono) AND rating:general",
)
image_info_list = parser.run()
# The information will be saved at image_info_list.json
save_image_infos(image_info_list, "image_info_list")

#===================================================================#
# This part downloads the images according to the image information #
# just collected in the image_info_list.                            #
#===================================================================#
downloader = Downloader(
    store_path='Danbooru',
    image_info_list=image_info_list,
    crawler_settings=crawler_settings,
)
downloader.run()
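Since the collected image information is saved as plain JSON, it can be inspected with the standard library alone. The sketch below assumes image_info_list.json contains a JSON array of records; verify this against the actual output of save_image_infos.

import json

# Load the saved image information and report how many records it holds.
# Assumes the file is a JSON array, which may differ in practice.
with open("image_info_list.json", encoding="utf-8") as f:
    infos = json.load(f)
print(f"Collected {len(infos)} image records")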
FAQs
We found that image-crawler-utils demonstrates a healthy version release cadence and project activity: the latest version was released less than a year ago, and the project has one open source maintainer.