chinese-learning-orchestrator
A ZFC-style Mandarin learning orchestrator providing LLM-led pedagogy, speech assessment via Azure, and structured fact-based progress tracking.
Introduction
The Chinese Learning Orchestrator (xuezh) is a local-first engine that turns language acquisition into a measurable, Unix-style workflow. It functions as a smart endpoint for LLM agents, providing opinionated Mandarin pedagogy while offloading state, facts, and mechanical operations to a deterministic backend. By treating language learning as a series of bounded events (such as speaking, reviewing, and HSK auditing), the system keeps progress grounded in recorded data rather than in unverifiable AI recommendations. It is aimed at students and educators who prioritize precision, and it integrates into agent runtimes like Clawdbot to support pronunciation coaching and long-term retention tracking through spaced repetition system (SRS) mechanics.
- LLM-first pedagogical orchestration: Use the LLM to design and adapt the lesson flow, adjustments for learner mood, and content selection while maintaining a strict separation between the 'smart' agent and the 'dumb' engine.
- Azure-backed pronunciation assessment: Perform fine-grained voice analysis using the Azure Speech SDK, capturing accuracy, fluency, and completeness scores to provide traceable, evidence-based coaching.
- ZFC (Zero Framework Cognition) compliance: Keep a strict boundary in which the engine acts as a persistent data layer for facts and artifacts, which prevents the LLM from hallucinating recommendation logic or database structures.
- HSK-aligned reporting: Generate progress reports and curriculum audits based on HSK standards, ensuring consistent coverage of vocabulary and grammar.
- Artifact-driven learning: Materialize voice audio, transcripts, and study reports as first-class filesystem artifacts, making it easy to review performance history or share progress across multiple devices.
- Use as a CLI tool or integrated agent backend to maintain local state, keeping all user data within your control rather than relying on opaque third-party cloud learning platforms.
- Inputs include voice notes for pronunciation assessment, text for TTS-generated audio, and grade-based feedback for SRS reviews.
- Outputs provide JSON-structured data for programmatic consumption, ensuring that agent runtimes can reliably parse feedback, artifacts, and progress snapshots.
- Constraints: The system is designed to be lean and bounded; always prioritize word-level chunks over isolated characters to improve pedagogical efficiency.
- Operational discipline: Ensure every user interaction, especially exposures and pronunciations, is logged to keep the internal database truthful and reflective of actual learning outcomes.
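The points above about JSON-structured output and pronunciation scores suggest a payload that an agent runtime can decode mechanically. The following is a minimal sketch of such a consumer, not the project's actual schema: the type `AssessmentResult`, the field names `text`, `accuracy`, `fluency`, `completeness`, and `audio_path`, the helper names, and the 0-100 score scale are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// AssessmentResult is a HYPOTHETICAL payload shape for one pronunciation
// assessment. The field names are illustrative assumptions, not the tool's
// documented output format.
type AssessmentResult struct {
	Text         string  `json:"text"`
	Accuracy     float64 `json:"accuracy"`
	Fluency      float64 `json:"fluency"`
	Completeness float64 `json:"completeness"`
	AudioPath    string  `json:"audio_path"`
}

// parseAssessment decodes one JSON payload and rejects any score outside
// the assumed 0-100 scale.
func parseAssessment(data []byte) (AssessmentResult, error) {
	var r AssessmentResult
	if err := json.Unmarshal(data, &r); err != nil {
		return AssessmentResult{}, fmt.Errorf("decode assessment: %w", err)
	}
	scores := []struct {
		name  string
		value float64
	}{
		{"accuracy", r.Accuracy},
		{"fluency", r.Fluency},
		{"completeness", r.Completeness},
	}
	for _, s := range scores {
		if s.value < 0 || s.value > 100 {
			return AssessmentResult{}, fmt.Errorf("%s score %v outside 0-100", s.name, s.value)
		}
	}
	return r, nil
}

// weakest names the lowest of the three scores; ties resolve in the order
// accuracy, fluency, completeness.
func weakest(r AssessmentResult) string {
	name, low := "accuracy", r.Accuracy
	if r.Fluency < low {
		name, low = "fluency", r.Fluency
	}
	if r.Completeness < low {
		name = "completeness"
	}
	return name
}

func main() {
	// Sample payload written by hand for this sketch, not real tool output.
	payload := []byte(`{"text":"你好","accuracy":91.5,"fluency":78,"completeness":100,"audio_path":"artifacts/sample.wav"}`)
	r, err := parseAssessment(payload)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s: weakest dimension is %s\n", r.Text, weakest(r))
}
```

Keeping the decode and validation step purely mechanical matches the separation described above: the consumer only reports facts, and deciding what to drill next is left to the agent.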
Repository Stats
- Stars: 38
- Forks: 6
- Open Issues: 0
- Language: Go
- Default Branch: main
- Sync Status: Idle
- Last Synced: May 3, 2026, 04:58 PM