nano-banana-pro

Introduction

The nano-banana-pro skill lets developers and creatives call Google's gemini-3-pro-image-preview model from a terminal-based workflow. It invokes image generation through Python scripts run with uv and writes the results to files, covering assets that range from photorealistic images to technical infographics that rely on Google Search grounding for data accuracy. It is particularly useful for generating diagrams, charts, and illustrations on the fly, and for iterative image editing tasks such as style transfers, composition changes, or subject-specific transformations.

Advanced image generation using the gemini-3-pro-image-preview model with support for diverse aspect ratios from 1:1 to 21:9.
Infographic creation backed by real-time Google Search grounding, so that the data shown is factually accurate.
Multimodal image editing and transformation, allowing users to provide input images as references for stylistic or structural modifications.
High-fidelity text rendering within generated visuals, making it ideal for creating diagrams, localized marketing materials, and annotated charts.
Configurable output parameters including 1K, 2K, and 4K resolution options to meet specific project requirements.
Script-based execution managed via uv, with dependencies handled inline so the user's Python environment is not polluted.
Requires a valid GEMINI_API_KEY environment variable provided through Google AI Studio for authentication.
Accepts text prompts as primary input, with optional image file paths provided via flags for context-aware transformations or reference-based generation.
Outputs are typically saved as image files (e.g., PNG, JPG) to a user-defined path, providing flexibility in project organization.
Best suited for scenarios where rapid visual prototyping, data visualization, or creative design assets are needed during a development or research cycle.
Users should ensure input image paths are accessible to the local shell environment during execution for reliable processing.
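
The requirements listed above (an API key in the environment, one of the named resolution options, and readable input image paths) can be checked before a generation run is started. The following is a minimal sketch of such a pre-flight check, not part of the skill itself: the function name `preflight`, its parameters, and the constant `ALLOWED_RESOLUTIONS` are hypothetical and introduced only for illustration.

```python
import os
from pathlib import Path

# Resolution options named in the feature list above.
ALLOWED_RESOLUTIONS = {"1K", "2K", "4K"}


def preflight(resolution, input_images=(), env=None):
    """Return a list of problems found; an empty list means ready to run.

    Hypothetical helper: it only inspects local state and never
    contacts the Gemini API.
    """
    env = os.environ if env is None else env
    problems = []
    # Authentication relies on the GEMINI_API_KEY environment variable.
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution!r}")
    # Input images must be reachable from the local shell environment.
    for path in input_images:
        if not Path(path).is_file():
            problems.append(f"input image not found: {path}")
    return problems


if __name__ == "__main__":
    for problem in preflight("2K"):
        print("warning:", problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass before any API call is made.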
