Education

chinese-learning-orchestrator

A ZFC-style Mandarin learning orchestrator providing LLM-led pedagogy, speech assessment via Azure, and structured fact-based progress tracking.

Introduction

The Chinese Learning Orchestrator (xuezh) is a local-first engine that turns Mandarin study into a measurable, Unix-style workflow. It acts as an endpoint for LLM agents, providing opinionated Mandarin pedagogy while offloading state, facts, and mechanical operations to a deterministic backend. Learning is treated as a series of bounded events, such as speaking, reviewing, and HSK auditing, so progress is derived from recorded data rather than from unverifiable AI recommendations. The tool is aimed at students and educators who want precise tracking, and it integrates into agent runtimes such as Clawdbot to support pronunciation coaching and long-term retention tracking through spaced repetition system (SRS) mechanics.

  • LLM-first pedagogical orchestration: Use the LLM to design and adapt the lesson flow, respond to the learner's mood, and select content, while maintaining a strict separation between the 'smart' agent and the 'dumb' engine.

  • Azure-backed pronunciation assessment: Perform fine-grained voice analysis using the Azure Speech SDK, capturing accuracy, fluency, and completeness scores to provide traceable, evidence-based coaching.

  • ZFC (Zero Framework Cognition) compliance: Adheres to a strict boundary where the engine acts as a persistent data layer for facts and artifacts, preventing the LLM from hallucinating recommendation logic or database structures.

  • HSK-aligned reporting: Generate accurate progress reports and curriculum audits based on HSK standards, ensuring consistent coverage of vocabulary and grammar.

  • Artifact-driven learning: Materialize voice audio, transcripts, and study reports as first-class filesystem artifacts, making it easy to review performance history or share progress across multiple devices.

  • Usage: Run it as a CLI tool or as an integrated agent backend that maintains local state, keeping all user data within your control rather than relying on opaque third-party cloud learning platforms.

  • Inputs: Voice notes for pronunciation assessment, text for TTS-generated audio, and grade-based feedback for SRS reviews.

  • Outputs: JSON-structured data for programmatic consumption, so that agent runtimes can reliably parse feedback, artifacts, and progress snapshots.

  • Constraints: The system is designed to be lean and bounded; always prioritize word-level chunks over isolated characters to improve pedagogical efficiency.

  • Operational discipline: Ensure every user interaction, especially exposures and pronunciations, is logged to keep the internal database truthful and reflective of actual learning outcomes.
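
The JSON outputs and the three pronunciation score kinds described above suggest how an agent runtime might consume one speaking event. The following Go sketch is illustrative only: the type names, JSON keys, and helper functions (`AssessmentResult`, `parseAssessment`, `weakestScore`) are assumptions made for this example and are not the tool's documented schema or API. It decodes a result strictly, so unexpected fields surface as errors, and then picks the lowest score as a candidate coaching focus.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// PronunciationScores holds the three score kinds named in the description.
// The JSON keys are hypothetical, chosen for this sketch.
type PronunciationScores struct {
	Accuracy     float64 `json:"accuracy"`
	Fluency      float64 `json:"fluency"`
	Completeness float64 `json:"completeness"`
}

// AssessmentResult is a hypothetical envelope for one speaking event.
type AssessmentResult struct {
	ReferenceText string              `json:"reference_text"`
	AudioArtifact string              `json:"audio_artifact"` // path to the saved audio file
	Scores        PronunciationScores `json:"scores"`
}

// parseAssessment decodes one JSON result and rejects unknown fields,
// so a change in the (assumed) schema fails loudly instead of silently.
func parseAssessment(raw []byte) (AssessmentResult, error) {
	var r AssessmentResult
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&r); err != nil {
		return AssessmentResult{}, fmt.Errorf("decode assessment: %w", err)
	}
	return r, nil
}

// weakestScore returns the name of the lowest of the three scores.
// Ties resolve in the order accuracy, fluency, completeness.
func weakestScore(s PronunciationScores) string {
	name, low := "accuracy", s.Accuracy
	if s.Fluency < low {
		name, low = "fluency", s.Fluency
	}
	if s.Completeness < low {
		name = "completeness"
	}
	return name
}

func main() {
	// Sample payload invented for illustration.
	raw := []byte(`{
		"reference_text": "我想喝茶",
		"audio_artifact": "artifacts/speak-0001.wav",
		"scores": {"accuracy": 82.5, "fluency": 74.0, "completeness": 100}
	}`)

	result, err := parseAssessment(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("phrase=%s focus=%s\n", result.ReferenceText, weakestScore(result.Scores))
}
```

Keeping the decision about what to coach next in the agent, and only the decoding and comparison in code, matches the separation between the 'smart' agent and the 'dumb' engine that the list describes.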

Repository Stats

Stars: 38
Forks: 6
Open Issues: 0
Language: Go
Default Branch: main
Sync Status: Idle
Last Synced: May 3, 2026, 04:58 PM