LoRA (Low-Rank Adaptation)

A lightweight fine-tuning method that trains small adapter weights instead of updating the whole model.

LoRA is a fine-tuning technique that freezes the base model's weights and trains tiny "adapter" matrices alongside them. Each adapter factorizes a layer's weight update as the product of two low-rank matrices (ΔW = B·A, with rank r typically 4-64), which is why the adapter weights are 100-1,000× smaller than the full model. Training is faster and cheaper, and the result is a tiny artifact (often under 100MB) you can swap in and out.
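
A minimal sketch of the idea in PyTorch, applied to a single linear layer. Real implementations (the original LoRA paper, Hugging Face PEFT) inject this into a transformer's attention projections; the class name, rank, and alpha here are illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the original weights
        # Low-rank factors: only these r*(d_in + d_out) parameters are trained.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # up-projection; zero init, so training starts from the unmodified base model
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank update (B @ A) applied to x.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(4096, 4096))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable {trainable:,} of {total:,} parameters")  # 65,536 of ~16.8M
```

The arithmetic behind the size claim: a 4096×4096 projection has ~16.8M weights, while its LoRA adapter at r = 8 has 2 × 8 × 4096 = 65,536 trainable parameters, roughly 256× fewer.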

Why it matters in production: you can serve one base model with dozens of LoRA adapters for different tenants or use cases. Loading or swapping an adapter at inference time is nearly free, because only the small matrices change. It's the standard approach for most open-model fine-tuning today; QLoRA extends it by training adapters on top of a 4-bit quantized base model, cutting memory requirements even further.
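
A sketch of the multi-tenant pattern using the Hugging Face PEFT library. The model name and adapter directories are illustrative placeholders, not real artifacts.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# One frozen base model shared across all tenants (model name is illustrative).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

# Attach the first tenant's adapter, then register a second one.
# The adapter directories here are hypothetical placeholders.
model = PeftModel.from_pretrained(base, "adapters/tenant-a", adapter_name="tenant-a")
model.load_adapter("adapters/tenant-b", adapter_name="tenant-b")

# Switching tenants swaps only the small LoRA matrices, not the base weights.
model.set_adapter("tenant-b")
```

Because each adapter is only tens of megabytes, keeping many registered and switching per request is far cheaper than hosting a separate fine-tuned copy of the base model for every tenant.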
