Related Articles
Architecture Deep Dive
Inside QLoRA: How 4-Bit Fine-Tuning Fits LLMs on One GPU
QLoRA fine-tunes 65B-parameter LLMs on a single 48GB GPU using NF4 quantization, double quantization, and paged optimizers. Deep-dive on each technique and its production trade-offs.
Architecture Deep Dive
Inside vLLM: How PagedAttention Enables High-Throughput LLM Serving
vLLM's PagedAttention algorithm achieves up to 24x higher throughput than Hugging Face Transformers by applying OS virtual memory concepts to KV cache management. Here's how the architecture actually works.