baoyu-image-gen
API-based AI image generation supporting OpenAI, Azure, Google, OpenRouter, DashScope, Replicate, and more. Features text-to-image, reference image guidance, aspect ratio control, and batch processing.
Introduction
The baoyu-image-gen skill provides a unified interface for AI image generation. It acts as a bridge between user prompts and commercial image models, letting creators and engineers use APIs including OpenAI DALL-E, Azure OpenAI, Google Imagen, OpenRouter, DashScope (Qwen/Tongyi), Z.AI, MiniMax, Jimeng, Seedream, and Replicate without handling each provider's SDK separately. It is designed for both quick single-image prototyping and high-throughput batch generation.
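To make the "unified interface" idea concrete, here is a minimal TypeScript sketch of one way a provider-agnostic layer could be organized. Every name in it (ImageProvider, ProviderRegistry, fakeProvider, and the request and result fields) is hypothetical and chosen for illustration; none of it is taken from the skill's source.

```typescript
// Hypothetical sketch: these types and names are assumptions, not the skill's real API.
interface ImageRequest {
  prompt: string;
  aspectRatio?: string; // e.g. "16:9"
}

interface ImageResult {
  provider: string;
  outputPath: string;
}

interface ImageProvider {
  name: string;
  generate(req: ImageRequest): Promise<ImageResult>;
}

class ProviderRegistry {
  private providers = new Map<string, ImageProvider>();

  register(provider: ImageProvider): void {
    this.providers.set(provider.name, provider);
  }

  // Look up a provider by name; fail loudly with the list of known names.
  resolve(name: string): ImageProvider {
    const provider = this.providers.get(name);
    if (!provider) {
      const known = Array.from(this.providers.keys()).join(", ");
      throw new Error(`Unknown provider "${name}". Registered: ${known}`);
    }
    return provider;
  }

  // Callers use the same call shape regardless of which vendor is behind it.
  generate(name: string, req: ImageRequest): Promise<ImageResult> {
    return this.resolve(name).generate(req);
  }
}

// Stand-in adapter for demonstration; a real adapter would call the vendor API here.
function fakeProvider(name: string): ImageProvider {
  return {
    name,
    async generate(_req: ImageRequest): Promise<ImageResult> {
      return { provider: name, outputPath: `${name}-output.png` };
    },
  };
}
```

With a layout like this, switching vendors only changes the name passed to generate, which is the kind of toggle the provider flag or project configuration would drive.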
- Multi-provider support: Easily toggle between specialized models via command-line flags or project configuration.
- Advanced prompt handling: Supports raw text input, file-based prompt reading, and concatenation for complex compositions.
- Precision controls: Native support for aspect ratio (e.g., 16:9, 1:1), high-quality presets, explicit dimensions, and reference-image-guided generation.
- Batch processing: Built-in parallel execution logic with concurrency control, making it ideal for creating large sets of marketing assets or dataset generation.
- Configuration management: Implements a hierarchical setup system (Project/XDG/User) to manage API keys and default model parameters securely.
- Prerequisites: Requires Bun runtime for optimized execution.
- Configuration: The system mandates a one-time setup via EXTEND.md to define provider credentials, default models, and quality presets before generation can occur.
- Usage constraints: Ensure appropriate API keys for the selected providers are set as environment variables (e.g., OPENAI_API_KEY, ARK_API_KEY, REPLICATE_API_TOKEN).
- Workflow: Use --promptfiles for automated batching and --ref to provide visual context for models supporting multimodal image generation.
- Integration: Designed to be used within AI agent workflows where natural language intent (generate, create, draw) is translated into specific image generation parameters.
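The batch-processing item above mentions parallel execution with concurrency control. The skill's own implementation is not shown here, so the following is only a sketch of the general technique in TypeScript: a fixed number of worker lanes pull items from a shared queue, so no more than limit requests are in flight at once. The names runBatch, Outcome, and the worker callback are assumptions made for this example.

```typescript
// Hypothetical sketch of concurrency-limited batching; not the skill's actual code.
type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Run `worker` over every item with at most `limit` calls in flight at once.
// Outcomes are returned in input order; a failure is recorded for its item
// instead of aborting the rest of the batch.
async function runBatch<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<Outcome<R>[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: Outcome<R>[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each lane claims the next unclaimed index until the queue is drained.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await worker(items[i], i) };
      } catch (error) {
        results[i] = { ok: false, error };
      }
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Calling runBatch(prompts, 3, generateOne), where generateOne is whatever function sends a single request, would keep at most three requests running and return one Outcome per prompt in input order, so a single failed prompt does not discard the rest.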
Repository Stats
- Stars: 16,787
- Forks: 1,958
- Open Issues: 1
- Language: TypeScript
- Default Branch: main
- Sync Status: Idle
- Last Synced: Apr 29, 2026, 01:00 PM