Parameter-efficient finetuning (PEFT) methods have been instrumental in the rapid adoption of finetuned, domain-specific LLMs. PEFT methods not only reduce memory usage and training time but have also been shown to match full finetuning in many practical settings. Recent PEFT adapters such as LoRA and QLoRA have further fuelled this growth.
A hallmark of current state-of-the-art PEFTs is that they modify weights rather than representations. Yet interpretability work has shown that representations encode rich semantic information, suggesting that editing representations might be a more powerful alternative.
This raises a question: instead of learning weight updates, can we learn interventions that modify a small fraction of model representations?
Introducing Representation Finetuning (ReFT), a family of intervention-based representation finetuning methods. An intervention I is a tuple ⟨Φ, P, L⟩ that encapsulates a single inference-time modification of the representations computed by a Transformer-based LM. Its three components are:

* The intervention function Φ : ℝᵈ → ℝᵈ with learned parameters ϕ.
* The set of input positions P ⊆ {1, …, n} at which the intervention is applied.
* The layer L ∈ {1, …, m} at which the intervention is applied.

A ReFT method is then a set of f non-overlapping interventions I = {I₁, …, I_f}, meaning the sets of representations they modify are pairwise disjoint. A minimal sketch of this interface follows.
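To make the tuple concrete, here is a minimal PyTorch sketch of this interface. Everything in it (the `Intervention` dataclass, the hook-based wiring, and the assumption that each Transformer block returns a plain hidden-state tensor) is illustrative scaffolding, not the authors' implementation.

```python
import torch
import torch.nn as nn
from dataclasses import dataclass

@dataclass
class Intervention:
    phi: nn.Module        # Φ: intervention function ℝᵈ → ℝᵈ with learned parameters ϕ
    positions: list[int]  # P: input positions to intervene on
    layer: int            # L: index of the layer whose output is edited

def register_interventions(blocks: nn.ModuleList, interventions: list[Intervention]):
    """Attach each intervention as a forward hook on its target block.

    Assumes each block's forward output is a hidden-state tensor of
    shape (batch, seq_len, d); real Transformer implementations often
    return tuples, which would need unpacking here.
    """
    handles = []
    for iv in interventions:
        def hook(module, args, output, iv=iv):
            h = output.clone()  # avoid in-place edits of autograd buffers
            h[:, iv.positions, :] = iv.phi(h[:, iv.positions, :])
            return h            # the returned tensor replaces the block's output
        handles.append(blocks[iv.layer].register_forward_hook(hook))
    return handles
```

Because the base model's weights are untouched, only the parameters ϕ inside each intervention function are trained.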
Further, the researchers introduce a strong instance of the ReFT family called Low-rank Linear Subspace ReFT (LoReFT). LoReFT is a parameterization of ReFT that intervenes on hidden representations in the linear subspace spanned by the rows of a low-rank projection matrix, building directly on the distributed alignment search (DAS) method.
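Concretely, the paper defines the LoReFT function as Φ(h) = h + Rᵀ(Wh + b − Rh), where R ∈ ℝ^{r×d} has orthonormal rows and W ∈ ℝ^{r×d}, b ∈ ℝʳ form a learned projection: the component of h in the subspace spanned by R's rows is replaced by Wh + b. The sketch below implements that formula in PyTorch; the class name and the use of torch's orthogonal parametrization to keep R's rows orthonormal are my choices, not the reference code.

```python
import torch
import torch.nn as nn

class LoReFTIntervention(nn.Module):
    """Φ(h) = h + Rᵀ(Wh + b − Rh): edit h only inside the rank-r
    subspace spanned by the rows of R."""
    def __init__(self, d: int, r: int):
        super().__init__()
        # R: r×d projection; the orthogonal parametrization keeps its rows
        # orthonormal during training (a stand-in for however the authors
        # enforce the constraint).
        self.R = nn.utils.parametrizations.orthogonal(nn.Linear(d, r, bias=False))
        self.W = nn.Linear(d, r)  # learned replacement source: Wh + b

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # (…, r) @ (r, d) → (…, d): Rᵀ maps the subspace edit back to ℝᵈ
        return h + (self.W(h) - self.R(h)) @ self.R.weight
```

An instance of this module can serve as the `phi` in the earlier sketch. Note the trainable footprint is just R, W, and b, i.e. r(2d + 1) parameters per intervention, which is where the parameter savings reported below come from.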
Evaluating LoReFT on LLaMA-family models against existing PEFTs on standard benchmarks from four domains (commonsense reasoning, arithmetic reasoning, instruction following, and natural language understanding), the researchers found that LoReFT uses 10×–50× fewer parameters while achieving state-of-the-art performance. These findings indicate that ReFT methods may emerge as more efficient and effective alternatives to weight-based PEFTs.