baoyu-image-gen

API-based AI image generation supporting OpenAI, Azure, Google, OpenRouter, DashScope, Replicate, and more. Features text-to-image, reference image guidance, aspect ratio control, and batch processing.

Introduction

The baoyu-image-gen skill provides a single interface for API-based AI image generation. It sits between user prompts and commercial image models, so creators and engineers can call OpenAI DALL-E, Azure OpenAI, Google Imagen, OpenRouter, DashScope (Qwen/Tongyi), Z.AI, MiniMax, Jimeng, Seedream, and Replicate without handling each provider's SDK separately. It is designed for both quick single-image prototyping and high-throughput batch generation.

  • Multi-provider support: Switch between providers and models via command-line flags or project configuration.

  • Advanced prompt handling: Supports raw text input, file-based prompt reading, and concatenation for complex compositions.

  • Precision controls: Native support for aspect ratio (e.g., 16:9, 1:1), high-quality presets, explicit dimensions, and reference-image-guided generation.

  • Batch processing: Built-in parallel execution with concurrency control, suited to producing large sets of marketing assets or generating datasets.

  • Configuration management: Implements a hierarchical setup system (Project/XDG/User) to manage API keys and default model parameters securely.

  • Prerequisites: Requires the Bun runtime.

  • Configuration: A one-time setup in EXTEND.md must define provider credentials, default models, and quality presets before any image can be generated.

  • Usage constraints: Set the API key for each selected provider as an environment variable (e.g., OPENAI_API_KEY, ARK_API_KEY, REPLICATE_API_TOKEN).

  • Workflow: Use --promptfiles for automated batching and --ref to provide visual context for models supporting multimodal image generation.

  • Integration: Designed to be used within AI agent workflows where natural language intent (generate, create, draw) is translated into specific image generation parameters.
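
The hierarchical configuration and API-key requirements described above can be illustrated with a minimal TypeScript sketch. It is not the skill's actual implementation: the `Settings` fields, the `resolveSettings` and `missingKey` helpers, the sample preset values, the precedence order (user, then XDG, then project, then command-line flags), and the provider-to-variable mapping are all assumptions made for illustration. Only the environment variable names come from the list above.

```typescript
// Hypothetical settings shape; the real skill's keys and file layout may differ.
type Settings = {
  provider?: string;
  model?: string;
  quality?: string;
  aspectRatio?: string;
};

// Assumed precedence, lowest to highest: user < XDG < project < CLI flags.
function resolveSettings(
  user: Settings,
  xdg: Settings,
  project: Settings,
  flags: Settings = {},
): Settings {
  const merged: Settings = {};
  for (const layer of [user, xdg, project, flags]) {
    for (const [key, value] of Object.entries(layer)) {
      // Skip undefined so an unset key never erases a lower layer's value.
      if (value !== undefined) merged[key as keyof Settings] = value;
    }
  }
  return merged;
}

// Variable names are taken from the text; which provider needs which is assumed.
const REQUIRED_ENV: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  replicate: "REPLICATE_API_TOKEN",
};

// Returns the name of the missing variable, or null if nothing is missing.
function missingKey(
  provider: string,
  env: Record<string, string | undefined>,
): string | null {
  const name = REQUIRED_ENV[provider];
  if (name === undefined) return null; // provider not covered by this sketch
  return env[name] ? null : name;
}

// Example: the project layer overrides the user's provider, a flag overrides quality.
const settings = resolveSettings(
  { provider: "openai", quality: "normal" },
  {},
  { provider: "replicate", aspectRatio: "16:9" },
  { quality: "high" },
);
const missing = missingKey(settings.provider ?? "", {});
```

Checking for a missing key before any request is sent lets a batch fail early with a clear message rather than partway through a run.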

Repository Stats

Stars: 16,787
Forks: 1,958
Open Issues: 1
Language: TypeScript
Default Branch: main
Sync Status: Idle
Last Synced: Apr 29, 2026, 01:00 PM