mls
Unified local ML inference server for ASR, TTS, Translation, Image Generation, and Vision on Apple Silicon, powered by MLX.
Introduction
MLS (MLX Local Serving) provides high-performance infrastructure for running multiple on-device machine learning models on macOS with Apple Silicon. By keeping all active models resident in GPU memory, it eliminates cold-start latency and exposes a unified HTTP interface for multimodal AI tasks. It is aimed at developers, researchers, and power users who need reliable, private, low-latency inference for local automation or creative workflows without relying on external cloud APIs.
- Multi-modal capabilities: ASR (Qwen3), TTS (Qwen3-VoiceDesign), neural machine translation (TranslateGemma), image generation (Z-Image-Turbo), and vision (jina-vlm).
- Unified HTTP/JSON API for easy integration with tools such as LangChain, the OpenAI SDKs, and local automation wrappers like OpenClaw.
- Real-time dashboard for monitoring GPU utilization, memory usage, inference queues, and live server logs.
- File-based batch processing for long-form text translation and synthesis, with progress polling via API status endpoints.
- Drop-in OpenAI-compatible vision completion endpoint for multimodal chat applications.
- Requires macOS 14+ on Apple Silicon and Python 3.12+ with the uv package manager.
- Serves on http://127.0.0.1:18321 by default (local access only).
- Inputs for ASR and translation tasks should be absolute local file paths so the server can read them directly.
- Supports 70+ translation languages and customizable voice instructions for TTS (VoiceDesign model) to control output characteristics such as tone and accent.
- Model management via the server control API: individual services can be paused, resumed, or restarted without disrupting the rest of the stack.
- Performance tip for image generation: use the 20-step configuration for higher quality at the cost of latency, or keep the 9-step default for real-time needs.
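Since ASR and translation inputs must be absolute local file paths, a small helper can normalize user-supplied paths before building a request body. This is a minimal sketch; the `file` and `model` field names and the `qwen3-asr` model id are illustrative assumptions, not the documented schema.

```python
from pathlib import Path


def asr_payload(audio_path: str, model: str = "qwen3-asr") -> dict:
    """Build an ASR request body with an absolute local file path.

    MLS expects absolute paths; expanding "~" and resolving relative
    segments here avoids file-not-found errors on the server side.
    Field names and the model id are assumptions for illustration.
    """
    path = Path(audio_path).expanduser().resolve()
    return {"model": model, "file": str(path)}
```

Resolving the path on the client keeps the server from having to guess the caller's working directory.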
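Because the vision endpoint is advertised as OpenAI-compatible, a request can follow the standard chat-completions payload shape and be sent to the default local address with only the standard library. The `/v1/chat/completions` path and the `jina-vlm` model name are assumptions based on the OpenAI convention and the model list above.

```python
import json
import urllib.request

MLS_BASE = "http://127.0.0.1:18321"  # default MLS address


def build_vision_request(image_b64: str, prompt: str, model: str = "jina-vlm") -> dict:
    """Build an OpenAI-style chat completion payload with an inline image.

    Uses the data-URL image_url convention from the OpenAI chat API;
    MLS advertises a drop-in compatible endpoint, so the same shape
    should apply. The model name is an assumption.
    """
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    }


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON payload to the local MLS server and decode the reply."""
    req = urllib.request.Request(
        MLS_BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # With the server running, something like:
    # reply = post_json("/v1/chat/completions",
    #                   build_vision_request(image_b64, "Describe this image."))
    pass
```

Because the endpoint mimics the OpenAI API, the official OpenAI SDK should also work by pointing its `base_url` at `http://127.0.0.1:18321`.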
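The batch-processing feature implies a submit-then-poll workflow. The sketch below shows a generic polling loop against a status endpoint; the `state` field name and the `completed`/`failed` values are assumptions about the API, not documented behavior.

```python
import json
import time
import urllib.request

TERMINAL_STATES = {"completed", "failed"}  # assumed terminal job states


def job_finished(status: dict) -> bool:
    """Return True when a polled status payload reports a terminal state.

    The 'state' field and its values are assumptions for illustration.
    """
    return status.get("state") in TERMINAL_STATES


def poll_job(status_url: str, interval: float = 2.0, timeout: float = 600.0) -> dict:
    """Poll a job status URL until the job finishes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(status_url) as resp:
            status = json.load(resp)
        if job_finished(status):
            return status
        time.sleep(interval)
    raise TimeoutError(f"job at {status_url} did not finish within {timeout}s")
```

A fixed polling interval keeps things simple; long-form synthesis jobs may warrant a longer interval or exponential backoff.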
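Per-service pause/resume/restart through the control API could be wrapped in a small client helper. The `/v1/services/{name}/{action}` route below is a guess at the shape of the control API, not the documented path.

```python
import urllib.request

MLS_BASE = "http://127.0.0.1:18321"  # default MLS address

ACTIONS = {"pause", "resume", "restart"}  # actions named in the README


def control_url(service: str, action: str) -> str:
    """Build a control-API URL for one service.

    The /v1/services/{name}/{action} route is a hypothetical layout
    used for illustration only.
    """
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return f"{MLS_BASE}/v1/services/{service}/{action}"


def control_service(service: str, action: str) -> None:
    """Send a control command (e.g. pause the TTS service) to the server."""
    req = urllib.request.Request(control_url(service, action), method="POST")
    urllib.request.urlopen(req)
```

Targeting one service at a time matches the README's point that a single model can be cycled without disrupting the whole stack.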
Repository Stats
- Stars: 11
- Forks: 1
- Open Issues: 0
- Language: HTML
- Default Branch: main