Inside Small LLM Coding Agents: Sub-8B Architecture

AI Tools Kit


Multi-Agent Fan-Out: Scatter-Gather, Map-Reduce, and DAG Patterns

Fan-out multiplies agent throughput, and with it cost and failure surface. Covers scatter-gather, map-reduce, and DAG orchestration, with state machines, error models, and token budget math.

Architecture Deep Dive

Inside Google's A2A Protocol: Agent Discovery and Delegation

Google's A2A protocol defines how AI agents discover each other and delegate tasks over HTTP. Full architecture walkthrough: Agent Cards, task state machine, SSE streaming, and key design tradeoffs.

Architecture Deep Dive

Inside QLoRA: How 4-Bit Fine-Tuning Fits LLMs on One GPU

QLoRA fine-tunes 65B-parameter LLMs on a single 48GB GPU using NF4 quantization, double quantization, and paged optimizers. Deep-dive on each technique and its production trade-offs.
