Introduction

The baoyu-image-gen skill provides a unified interface for AI image generation. It sits between user prompts and commercial image models, so creators and engineers can call the APIs of OpenAI DALL-E, Azure OpenAI, Google Imagen, OpenRouter, DashScope (Qwen/Tongyi), Z.AI, MiniMax, Jimeng, Seedream, and Replicate without working with each provider's SDK individually. It is designed for both rapid single-image prototyping and high-throughput batch generation.
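One way to picture a unified interface over several providers is an adapter registry. The sketch below is an illustration only: `GenerateRequest`, `ImageProvider`, and `ProviderRegistry` are hypothetical names, not the skill's actual internals, which may be organized differently.

```typescript
// Hypothetical types for illustration; the skill's real internals may differ.
interface GenerateRequest {
  prompt: string;
  aspectRatio?: string; // e.g. "16:9"
}

interface ImageProvider {
  readonly name: string;
  // Resolves to an output path or URL for the generated image.
  generate(req: GenerateRequest): Promise<string>;
}

class ProviderRegistry {
  private readonly providers = new Map<string, ImageProvider>();

  register(provider: ImageProvider): void {
    this.providers.set(provider.name, provider);
  }

  // Look up the provider by name and forward the request unchanged.
  async generate(providerName: string, req: GenerateRequest): Promise<string> {
    const provider = this.providers.get(providerName);
    if (!provider) {
      throw new Error(`Unknown provider: ${providerName}`);
    }
    return provider.generate(req);
  }
}
```

Under this assumed design, callers only pick a provider name, and each adapter hides its own SDK details behind the same `generate` method.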

Multi-provider support: Switch between providers and their models via command-line flags or project configuration.
Advanced prompt handling: Supports raw text input, file-based prompt reading, and concatenation for complex compositions.
Precision controls: Native support for aspect ratios (e.g., 16:9, 1:1), high-quality presets, explicit dimensions, and reference-image-guided generation.
Batch processing: Built-in parallel execution with concurrency control, suited to producing large sets of marketing assets or generating datasets.
Configuration management: Implements a hierarchical setup system (Project/XDG/User) to manage API keys and default model parameters securely.
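The batch-processing feature above describes parallel execution with a concurrency limit. A minimal sketch of that general technique follows. `mapWithConcurrency` is a hypothetical helper written for illustration, not the skill's actual implementation, and the worker is a stand-in for a real provider call.

```typescript
// Hypothetical helper: runs `worker` over `items` with at most `limit` tasks in flight.
// Illustrative sketch only; the skill's own batching logic is not shown in this document.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner repeatedly claims the next index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results; // same order as `items`, regardless of completion order
}
```

A fixed pool of runners keeps the number of in-flight requests bounded, which matters when a provider enforces rate limits. Results are written by index, so the output order matches the input order.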
Prerequisites: Requires the Bun runtime to run.
Configuration: A one-time setup via EXTEND.md is required to define provider credentials, default models, and quality presets before any image can be generated.
Usage constraints: The API keys for the selected providers must be set as environment variables (e.g., OPENAI_API_KEY, ARK_API_KEY, REPLICATE_API_TOKEN).
Workflow: Use --promptfiles for automated batching and --ref to provide visual context for models supporting multimodal image generation.
Integration: Designed to be used within AI agent workflows where natural language intent (generate, create, draw) is translated into specific image generation parameters.
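The usage constraint on API keys can be checked before a run starts. The sketch below shows one possible pre-flight check. `missingEnvVars` is a hypothetical helper, not part of the skill, and which variable each provider needs is left to the caller because the source does not spell out that mapping.

```typescript
// Hypothetical helper, not part of the skill: returns the names in `required`
// that are unset or blank in `env`.
function missingEnvVars(
  required: readonly string[],
  env: Readonly<Record<string, string | undefined>>,
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Example with a literal environment object instead of the real process environment.
const missing = missingEnvVars(
  ["OPENAI_API_KEY", "REPLICATE_API_TOKEN"],
  { OPENAI_API_KEY: "placeholder-value" },
);
console.log(missing); // the array contains only "REPLICATE_API_TOKEN"
```

Under Bun, `process.env` could be passed as the second argument. Taking the environment as a parameter keeps the check easy to test.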
