ai-llm-patterns

Anthropic Claude integration patterns: streaming, RAG with pgvector, tool use, model selection (Haiku/Sonnet/Opus), prompt caching, and cost management for AI-powered engineering.

Introduction

This skill provides a framework for integrating Anthropic Claude into complex software systems. Aimed at AI engineers and fullstack developers, it focuses on production-ready patterns for building scalable RAG pipelines, autonomous agents, and cost-effective LLM features, balancing performance, user experience, and operational efficiency through sound architectural practices.

  • Advanced RAG Architecture: Chunking strategies, vector search using pgvector with cosine similarity, and embedding pipelines built on text-embedding-3-small (retrieval query sketched below).
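
A minimal sketch of the retrieval step, assuming a `chunks` table with an `embedding vector(1536)` column and the `pg` and `openai` Node packages; the table and column names are placeholders:

```ts
import OpenAI from "openai";
import { Pool } from "pg";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const pool = new Pool();     // connection settings come from PG* env vars

async function retrieveChunks(query: string, limit = 5) {
  // Embed the query with the same model used at indexing time.
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });
  const embedding = res.data[0].embedding;

  // pgvector's `<=>` operator is cosine distance; lower means more similar.
  const { rows } = await pool.query(
    `SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
       FROM chunks
      ORDER BY embedding <=> $1::vector
      LIMIT $2`,
    [JSON.stringify(embedding), limit],
  );
  return rows;
}
```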

  • Anthropic SDK & Streaming: Server-Sent Events (SSE) streaming to cut perceived latency and deliver real-time feedback to users (handler sketched below).
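
A minimal streaming handler sketch using `@anthropic-ai/sdk` with Express; the route, model alias, and payload shape are illustrative:

```ts
import express from "express";
import Anthropic from "@anthropic-ai/sdk";

const app = express();
app.use(express.json());
const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

app.post("/chat", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");

  const stream = anthropic.messages.stream({
    model: "claude-3-5-sonnet-latest",
    max_tokens: 1024,
    messages: [{ role: "user", content: req.body.prompt }],
  });

  // Forward each text delta to the client as an SSE event as it arrives.
  stream.on("text", (text) => {
    res.write(`data: ${JSON.stringify({ text })}\n\n`);
  });

  await stream.finalMessage();
  res.write("data: [DONE]\n\n");
  res.end();
});

app.listen(3000);
```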

  • Strategic Model Selection: A decision framework for choosing between Haiku, Sonnet, and Opus based on task complexity, latency requirements, and cost (routing helper sketched below).
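
By way of example, an illustrative routing helper; the task labels and tier mapping are assumptions, not fixed rules:

```ts
type Task = "classification" | "summarization" | "complex-reasoning";

// Hypothetical routing policy: the cheapest model that plausibly handles the task.
function pickModel(task: Task, latencySensitive: boolean): string {
  if (latencySensitive || task === "classification") {
    return "claude-3-5-haiku-latest"; // lowest cost and latency
  }
  if (task === "complex-reasoning") {
    return "claude-3-opus-latest"; // highest capability, highest cost
  }
  return "claude-3-5-sonnet-latest"; // balanced default
}
```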

  • Tool Use & Agent Loops: Designing function-calling interfaces where the LLM orchestrates operations while the application retains control over database writes and other sensitive actions (loop sketched below).
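
A minimal agent-loop sketch with a single read-only tool; the `lookup_order` tool and its backing function are hypothetical:

```ts
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

const tools: Anthropic.Tool[] = [{
  name: "lookup_order",
  description: "Fetch an order record by ID (read-only).",
  input_schema: {
    type: "object",
    properties: { orderId: { type: "string" } },
    required: ["orderId"],
  },
}];

// Hypothetical read-only data access; a real implementation would query the DB.
async function lookupOrder(orderId: string) {
  return { orderId, status: "shipped" };
}

async function runAgent(userMessage: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];

  while (true) {
    const response = await anthropic.messages.create({
      model: "claude-3-5-sonnet-latest",
      max_tokens: 1024,
      tools,
      messages,
    });

    // No pending tool calls means the model produced its final answer.
    if (response.stop_reason !== "tool_use") return response;

    // Execute each requested tool call and feed the results back to the model.
    const results: Anthropic.ToolResultBlockParam[] = [];
    for (const block of response.content) {
      if (block.type === "tool_use") {
        const { orderId } = block.input as { orderId: string };
        results.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: JSON.stringify(await lookupOrder(orderId)),
        });
      }
    }
    messages.push({ role: "assistant", content: response.content });
    messages.push({ role: "user", content: results });
  }
}
```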

  • Context Optimization: Prompt caching for frequently accessed documents, large system prompts, and RAG context windows to reduce token spend and improve responsiveness (cache breakpoint sketched below).
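
A minimal sketch of a cache breakpoint on a large system prompt, using the `cache_control` field from Anthropic's prompt-caching feature; the document-QA framing is illustrative:

```ts
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

async function askWithCachedContext(document: string, question: string) {
  return anthropic.messages.create({
    model: "claude-3-5-sonnet-latest",
    max_tokens: 1024,
    system: [
      {
        type: "text",
        text: `You answer questions about this document:\n\n${document}`,
        // Marks the prefix up to here as cacheable; subsequent calls with an
        // identical prefix read it from the cache at a reduced token rate.
        cache_control: { type: "ephemeral" },
      },
    ],
    messages: [{ role: "user", content: question }],
  });
}
```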

  • Structured Data Extraction: Using Zod for schema enforcement, so LLM outputs conform to a known shape before programmatic consumption (extraction sketched below).
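
A minimal extraction sketch with Zod; the `Invoice` shape and prompt are illustrative:

```ts
import Anthropic from "@anthropic-ai/sdk";
import { z } from "zod";

const Invoice = z.object({
  vendor: z.string(),
  total: z.number().nonnegative(),
  dueDate: z.string(), // ISO date string
});

const anthropic = new Anthropic();

async function extractInvoice(text: string) {
  const response = await anthropic.messages.create({
    model: "claude-3-5-haiku-latest",
    max_tokens: 512,
    messages: [{
      role: "user",
      content: `Extract the invoice as JSON with keys vendor, total, dueDate. Reply with JSON only.\n\n${text}`,
    }],
  });

  const raw = response.content[0].type === "text" ? response.content[0].text : "";
  // Invoice.parse throws on any shape mismatch, so malformed outputs never
  // reach downstream code.
  return Invoice.parse(JSON.parse(raw));
}
```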

  • Use for building production-grade AI features, document retrieval systems, or autonomous agent workflows.

  • Follow the core constraint: never trust LLM outputs directly for database mutations; always apply deterministic validation first (guard sketched below).
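
To make the constraint concrete, an illustrative guard: the model may propose an action, but only an allowlisted, schema-validated action reaches the database. All names here are hypothetical:

```ts
import { Pool } from "pg";
import { z } from "zod";

const pool = new Pool();

// Allowlisted action shape -- the model cannot emit free-form SQL.
const ProposedAction = z.object({
  action: z.literal("update_status"),
  orderId: z.string().uuid(),
  status: z.enum(["pending", "shipped", "cancelled"]),
});

async function applyModelAction(rawModelOutput: string) {
  const parsed = ProposedAction.safeParse(JSON.parse(rawModelOutput));
  if (!parsed.success) {
    throw new Error(`Rejected model action: ${parsed.error.message}`);
  }
  // Only deterministic code performs the write, with bound parameters.
  await pool.query("UPDATE orders SET status = $1 WHERE id = $2", [
    parsed.data.status,
    parsed.data.orderId,
  ]);
}
```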

  • Inputs include target document datasets and function schemas; outputs typically include optimized API responses, retrieved context chunks, or tool execution plans.

  • Adhere to token budget management; cache prompt prefixes exceeding 1024 tokens to cut cost and latency.

  • Refer to the provided documentation in references/ for specific implementation guides on SSE, RAG pipelines, and LLM-ops error handling.

Repository Stats

Stars: 11
Forks: 1
Open Issues: 1
Language: Shell
Default Branch: main
Sync Status: Idle
Last Synced: May 4, 2026, 12:58 AM