
atai-gemma3-tool


CLI tool for generating text from images using the Gemma 3 model.

Version 0.0.3 on PyPI

atai-gemma3-tool is a command-line interface (CLI) tool that uses Google's Gemma 3 model to generate descriptive text from images, given either as local files or as URLs. It passes the image and a prompt to the multimodal model and streams the generated text to the console as it is produced.

Features

  • Multimodal Processing: Accepts image input and produces text output.
  • Real-Time Streaming: Generates and streams tokens as they are produced.
  • Customizable Prompt: Allows users to define a custom prompt.
  • Easy Installation: Installable via pip; the Gemma 3 build of Transformers is installed first with a separate pip command.
  • Asynchronous Generation: Uses asynchronous token streaming, so output starts appearing before generation has finished.
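
The streaming behaviour listed above can be illustrated with a small producer/consumer sketch. This is not the tool's actual source, which this README does not show. It imitates a pattern often used with Transformers, where `generate` runs on a background thread and a `TextIteratorStreamer` hands tokens to the main thread. Here `fake_generate` and its hard-coded tokens are hypothetical stand-ins for the model, so the example runs with the standard library alone.

```python
import queue
import threading
from typing import Iterator

_DONE = object()  # sentinel marking the end of generation


def fake_generate(prompt: str, out: queue.Queue) -> None:
    """Hypothetical stand-in for a model's generate call.

    Pushes tokens onto the queue one at a time, then the sentinel.
    """
    for token in ["A ", "bowl ", "of ", "candy."]:
        out.put(token)
    out.put(_DONE)


def stream_tokens(prompt: str) -> Iterator[str]:
    """Yield tokens as the background producer emits them."""
    out: queue.Queue = queue.Queue()
    worker = threading.Thread(target=fake_generate, args=(prompt, out))
    worker.start()
    while True:
        token = out.get()  # blocks until the next token is ready
        if token is _DONE:
            break
        yield token
    worker.join()


if __name__ == "__main__":
    # Print each token as soon as it arrives instead of waiting for the end.
    for token in stream_tokens("Describe this image in detail."):
        print(token, end="", flush=True)
    print()
```

The consumer never waits for the full text: each `out.get()` returns as soon as one token is available, which is what makes the console output appear incrementally.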

Installation

Install the Gemma 3 release of Transformers from GitHub, then install the package from PyPI:

pip install git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3

pip install atai-gemma3-tool

Usage

Run the CLI tool from your terminal by specifying a local image path or an image URL, plus an optional custom prompt:

atai-gemma3-tool "path/to/your/local_image.jpg" --prompt "Describe this image in detail."

atai-gemma3-tool https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG

Command Line Arguments

  • image_path: The path to your local image file or an image URL.
  • --prompt: (Optional) Custom prompt for text generation.
    Default: "Describe this image in detail."
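
For illustration, the two arguments above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the package's real entry point: `build_parser` and `DEFAULT_PROMPT` are hypothetical names, and only the argument names and the default prompt are taken from this README.

```python
import argparse

# Default prompt as documented in this README.
DEFAULT_PROMPT = "Describe this image in detail."


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one positional argument and one optional flag."""
    parser = argparse.ArgumentParser(
        prog="atai-gemma3-tool",
        description="Generate text from an image using the Gemma 3 model.",
    )
    parser.add_argument(
        "image_path",
        help="Path to a local image file or an image URL.",
    )
    parser.add_argument(
        "--prompt",
        default=DEFAULT_PROMPT,
        help="Custom prompt for text generation.",
    )
    return parser


# Parsing an explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["path/to/your/local_image.jpg"])
print(args.image_path, "|", args.prompt)
```

Omitting `--prompt` yields the documented default, and passing it replaces the default for that run.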

The tool will load the image, process it using the Gemma 3 model, and output the generated text to your console in real time.
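
Since image_path may be either a local file or a URL, the loading step has to tell the two apart. The sketch below shows one way to do that with the standard library. It is an assumption about how such a step could work, not the tool's confirmed implementation, and `is_url` and `read_image_bytes` are hypothetical helper names.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(image_path: str) -> bool:
    """Treat only http(s) inputs as URLs; everything else is a local path."""
    return urlparse(image_path).scheme in ("http", "https")


def read_image_bytes(image_path: str) -> bytes:
    """Return the raw image bytes from a URL or from a local file."""
    if is_url(image_path):
        # Network fetch; the timeout avoids hanging on an unresponsive host.
        with urlopen(image_path, timeout=30) as response:
            return response.read()
    return Path(image_path).read_bytes()
```

Checking the scheme explicitly, instead of testing for any scheme at all, keeps Windows paths such as `C:\images\cat.jpg` from being mistaken for URLs.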

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

  • Google DeepMind: For the Gemma 3 model.
  • Hugging Face: For the Transformers library and supporting tools.

Keywords

google
