llm-integration
A toolkit for building robust LLM integrations: API patterns, streaming, function calling, RAG pipelines, and cost-effective model routing.
Introduction
This skill provides an architecture for developers integrating Anthropic Claude and similar LLMs into production TypeScript applications, with a focus on reliability, latency, and cost through structured design patterns. It is aimed at software engineers and AI architects who need to move beyond simple prompt-response cycles into complex, stateful agentic workflows. The patterns help applications manage context windows, keep tool use accurate, and reduce token spend without sacrificing performance.
- Advanced API client patterns including retry logic and exponential backoff mechanisms.
- Streaming response handlers with real-time UI integration support for lower perceived latency.
- Function calling and tool use schemas to enable autonomous agent loops with multi-step database searching.
- Modular RAG pipeline architecture including document chunking with overlap, vector database integration, and source citation.
- Intelligent model routing strategies that select the most cost-effective model based on task complexity (e.g., Haiku for extraction vs. Opus for complex reasoning).
- Input requirements include clean text documents, structured prompts, and vector store access for RAG functionality.
- Output consists of optimized response streams, tool-use JSON schemas, and structured data extraction.
- Validate and sanitize all LLM outputs before using them in downstream application code.
- Monitor token usage closely to prevent runaway costs, and cache embedding results so identical inputs are not re-embedded.
- Consult the anti-patterns guide to avoid common mistakes, such as sending full-document context when targeted chunks suffice.
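The retry logic with exponential backoff mentioned in the feature list could look roughly like the sketch below. It is not taken from the toolkit itself: `RetryOptions`, `backoffDelayMs`, and `withRetry` are hypothetical names, and the use of full jitter (a random wait up to the computed delay) is an assumption about how one might spread out retries.

```typescript
// Hypothetical helper names for illustration; not part of any SDK or of this toolkit.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first call
  baseDelayMs: number; // delay ceiling before the first retry
  maxDelayMs: number;  // upper bound on any single delay
}

// Delay ceiling before retry number `retry` (0-based): base * 2^retry, capped at maxDelayMs.
function backoffDelayMs(retry: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, retry));
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `fn` until it succeeds, the error is judged non-retryable, or attempts run out.
async function withRetry<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
  isRetryable: (err: unknown) => boolean = () => true,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      const isLastAttempt = attempt === opts.maxAttempts - 1;
      if (isLastAttempt || !isRetryable(err)) break;
      // Full jitter (an assumption): wait a random time in [0, computed delay).
      await sleep(Math.random() * backoffDelayMs(attempt, opts));
    }
  }
  throw lastError;
}
```

In practice, `isRetryable` would usually be narrowed to transient failures such as rate limiting or timeouts, so that malformed requests fail immediately instead of being retried.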
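For the RAG feature above, document chunking with overlap can be sketched as a sliding window. This is a minimal illustration under stated assumptions: `chunkWithOverlap` is a hypothetical name, and it measures size in characters, whereas a production pipeline would more likely count tokens or split on sentence boundaries.

```typescript
// Hypothetical helper for illustration; sizes are in characters, not tokens (an assumption).
// Each chunk starts `chunkSize - overlap` characters after the previous one,
// so consecutive chunks share `overlap` characters of context.
function chunkWithOverlap(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const step = chunkSize - overlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the text, to avoid a redundant tail chunk.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// chunkWithOverlap("abcdefghij", 4, 2) returns ["abcd", "cdef", "efgh", "ghij"]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing and embedding some text twice.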
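The model routing idea from the feature list (Haiku for extraction, Opus for complex reasoning) might be expressed as a simple lookup from task kind to model tier. The sketch below is illustrative only: `TaskKind`, `ModelTier`, and `routeModel` are hypothetical names, the `classification` and `planning` kinds are assumptions added for the example, and the tiers are labels rather than real model identifiers.

```typescript
// Tier labels only; mapping them to concrete model IDs is left to the caller.
type ModelTier = "haiku" | "opus";

// "extraction" and "reasoning" come from the feature list;
// "classification" and "planning" are assumed extra kinds for illustration.
type TaskKind = "extraction" | "classification" | "reasoning" | "planning";

const TIER_BY_KIND: Record<TaskKind, ModelTier> = {
  extraction: "haiku",     // cheap, structured output
  classification: "haiku", // assumed to be similarly simple
  reasoning: "opus",       // complex, multi-step reasoning
  planning: "opus",        // assumed to need the stronger model
};

// Returns the cheapest tier judged adequate for the given task kind.
function routeModel(kind: TaskKind): ModelTier {
  return TIER_BY_KIND[kind];
}
```

Keeping the mapping in one table makes the cost policy easy to audit and adjust; a real router might also consider prompt length or past failure rates, which this sketch does not attempt.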
Repository Stats
- Stars: 1,520
- Forks: 460
- Open Issues: 48
- Language: JavaScript
- Default Branch: main
- Sync Status: Idle
- Last Synced: May 3, 2026, 05:33 AM