This diagram illustrates the LoRA (Low-Rank Adaptation) process for parameter-efficient fine-tuning, showing how it modifies a pre-trained model by replacing the q/k/v/output projections with LoRA-wrapped layers.
%%{init: {"theme": "base", "themeVariables": {"background": "#ffffff", "primaryColor": "#fdf4ff", "primaryBorderColor": "#d946ef", "primaryTextColor": "#0f172a", "lineColor": "#64748b"}}}%%
flowchart TB
L1["Pre-trained model<br>Freeze original weights"] --> L2["apply_lora_to_model()<br>Replace q/k/v/output proj"]
L2 --> L3["LoRALinear<br>W(x) + scale · B(A(x))"]
L3 --> L4["Train only the A/B matrices"]
L4 --> L5["Save adaptor state_dict"]
L5 --> L6["Training mode: standalone adaptor"]
L5 --> L7["Inference mode: merge_lora()"]
classDef lora fill:#fdf4ff,stroke:#d946ef,color:#0f172a;
classDef out fill:#f0fdf4,stroke:#22c55e,color:#0f172a;
class L1,L2,L3,L4,L5 lora;
class L6,L7 out;
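This flowchart traces the LoRA (Low-Rank Adaptation) workflow: the pre-trained weights are frozen, apply_lora_to_model() replaces the q/k/v/output projections with LoRALinear layers that compute W(x) + scale · B(A(x)), only the A/B matrices are trained, and the adaptor state_dict is saved. From there the adaptor can either stay separate from the base model (training mode) or be folded into the base weights with merge_lora() (inference mode).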
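The first four boxes of the flowchart can be sketched in PyTorch roughly as follows. This is a minimal illustration and not the original implementation: only the names `LoRALinear` and `apply_lora_to_model()` come from the diagram, while the `rank` and `alpha` parameters, the `scale = alpha / rank` convention, the attribute names in `TARGET_NAMES`, the `lora_state_dict()` helper, and the `ToyAttention` module are assumptions made for the example.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep the original weights frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)   # A: down-projection
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)  # B: up-projection
        nn.init.zeros_(self.lora_b.weight)  # B = 0, so the wrapper initially matches the base layer
        self.scale = alpha / rank  # assumed scaling convention

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W(x) + scale * B(A(x))
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


# Hypothetical attribute names for the attention projections; real models differ.
TARGET_NAMES = ("q_proj", "k_proj", "v_proj", "output_proj")


def apply_lora_to_model(model: nn.Module, rank: int = 8, alpha: float = 16.0) -> nn.Module:
    # Freeze every pre-trained parameter.
    for p in model.parameters():
        p.requires_grad = False
    # Collect the targets first so the module tree is not mutated while it is traversed.
    targets = [
        (parent, name, child)
        for parent in model.modules()
        for name, child in parent.named_children()
        if name in TARGET_NAMES and isinstance(child, nn.Linear)
    ]
    for parent, name, child in targets:
        setattr(parent, name, LoRALinear(child, rank, alpha))
    return model


def lora_state_dict(model: nn.Module) -> dict:
    """Hypothetical helper: keep only the A/B tensors, i.e. the adaptor."""
    return {k: v for k, v in model.state_dict().items() if "lora_" in k}


class ToyAttention(nn.Module):
    """Tiny single-head attention block used only to exercise the sketch."""

    def __init__(self, dim: int = 16):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.output_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return self.output_proj(attn @ v)


torch.manual_seed(0)
model = ToyAttention()
x = torch.randn(2, 5, 16)
before = model(x)
model = apply_lora_to_model(model, rank=4, alpha=8.0)
after = model(x)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
adaptor = lora_state_dict(model)
```

Because B is initialised to zero in this sketch, wrapping the projections leaves the model's output unchanged until training updates the A/B matrices, and `adaptor` holds only those matrices, which keeps the saved file small compared with a full checkpoint.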
Use this diagram to explain or implement parameter-efficient fine-tuning for large language models, especially when resources are limited or multiple task-specific adaptors are needed without modifying the base model.
This diagram can be adapted to show different parameter-efficient fine-tuning techniques (e.g., Prompt Tuning, P-tuning), illustrate specific model architectures (e.g., Transformer layers where LoRA is applied), or detail the merging process with specific code examples.
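As one possible way to detail the merging step, the sketch below folds the low-rank update into the base weight, W' = W + scale · (B · A), and checks that the merged layer gives the same output as the unmerged one. The diagram only names `merge_lora()`; the signature used here, the tensor shapes, and the example dimensions are assumptions made for illustration.

```python
import torch


def merge_lora(weight: torch.Tensor, lora_a: torch.Tensor,
               lora_b: torch.Tensor, scale: float) -> torch.Tensor:
    """Return W' = W + scale * (B @ A).

    Assumed shapes: weight (out, in), lora_a (rank, in), lora_b (out, rank).
    """
    return weight + scale * (lora_b @ lora_a)


torch.manual_seed(0)
out_dim, in_dim, rank, scale = 6, 4, 2, 2.0
W = torch.randn(out_dim, in_dim)   # frozen base weight
A = torch.randn(rank, in_dim)      # trained down-projection
B = torch.randn(out_dim, rank)     # trained up-projection
x = torch.randn(3, in_dim)

# Training-mode path: base output plus the separate adaptor branch.
unmerged = x @ W.T + scale * (x @ A.T) @ B.T
# Inference-mode path: a single matmul against the merged weight.
merged = x @ merge_lora(W, A, B, scale).T
outputs_match = torch.allclose(unmerged, merged, atol=1e-5)
```

Merging removes the extra matrix multiplications at inference time, at the cost of no longer being able to swap adaptors without restoring the original weight.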