Parameter-Efficient Fine-Tuning
A family of methods that fine-tune large models by updating only a small fraction of their parameters.
Full Definition
Parameter-efficient fine-tuning (PEFT) encompasses techniques that adapt large pretrained models to new tasks while updating only a small subset of parameters, dramatically reducing memory and compute requirements. The family includes LoRA, QLoRA, prefix tuning (prepending learnable tokens to the input), prompt tuning (adding soft trainable prompt vectors), and adapter layers (adding bottleneck modules between transformer layers). PEFT methods make it practical to personalise or specialise very large models on limited hardware and enable multi-task deployment where many task-specific adapters share one base model. Hugging Face's PEFT library standardises these methods for easy use.
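The core idea is easiest to see with LoRA. Below is a minimal sketch (illustrative only, not the Hugging Face PEFT API): the frozen pretrained weight W is augmented with a trainable low-rank update B @ A, so only 2·d·r values are learned instead of d². The dimension and rank values are hypothetical.

```python
import numpy as np

d, r = 1024, 8                      # model dim and (hypothetical) LoRA rank

W = np.random.randn(d, d)           # frozen pretrained weight
A = np.random.randn(r, d) * 0.01    # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection (zero-init,
                                    # so training starts from the base model)

def forward(x):
    # Effective weight is W + B @ A; only A and B would receive gradients.
    return x @ (W + B @ A).T

full_params = W.size                # 1,048,576
lora_params = A.size + B.size       # 16,384
print(f"trainable fraction: {lora_params / full_params:.2%}")  # → 1.56%
```

At rank 8 the trainable share of this single layer is about 1.6%; in practice the fraction over a whole model is often far smaller, which is what makes the 0.1%-style figures quoted for PEFT methods possible.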
Examples
Using prefix tuning to adapt a 13B-parameter model for sentiment analysis by training only 0.1% of its parameters.
Deploying 50 customer-specific LoRA adapters on a shared Llama 3 base, switching adapters per tenant without loading separate model copies.
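The multi-tenant pattern above can be sketched in a few lines (a toy illustration with hypothetical tenant names; a real deployment would use the PEFT library's adapter loading and switching on an actual base model): one frozen base weight is shared, and each tenant contributes only a small low-rank delta applied at request time.

```python
import numpy as np

np.random.seed(0)
d, r = 64, 4
base_W = np.random.randn(d, d)                 # shared frozen base model

adapters = {                                   # per-tenant LoRA factors only
    "tenant_a": (np.random.randn(d, r), np.random.randn(r, d)),
    "tenant_b": (np.random.randn(d, r), np.random.randn(r, d)),
}

def forward(x, tenant):
    B, A = adapters[tenant]                    # swap the adapter, not the model
    return x @ (base_W + B @ A).T

x = np.ones((1, d))
out_a = forward(x, "tenant_a")
out_b = forward(x, "tenant_b")
print(np.allclose(out_a, out_b))  # → False: same base, different behaviour
```

Each adapter stores 2·d·r values versus d² for a full model copy, so serving 50 tenants costs 50 small deltas plus one base model rather than 50 full models.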
Related Terms
LoRA (Low-Rank Adaptation)
A parameter-efficient fine-tuning method that updates only small low-rank matric…
QLoRA
A memory-efficient fine-tuning method combining quantisation with LoRA adapters.
Fine-Tuning
Continuing training of a pretrained model on a smaller, task-specific dataset to…