LoRA Integration and Parameter-Efficient Fine-tuning Workflow

ML & AI · flowchart diagram · MIT

Illustrates the LoRA (Low-Rank Adaptation) process for parameter-efficient fine-tuning, showing how it modifies a pre-trained model to train only a small f

Source: https://github.com/jiaran-king/MicroLM/blob/782ae02f10c14b484a317f22115a066b3b10b91d/Readme/%E9%A1%B9%E7%9B%AE%E5%85%A8%E6%99%AF%E5%9B%BE/00-%E5%85%A8%E6%B5%81%E7%A8%8B%E5%88%86%E6%9E%90%EF%BC%88%E8%AE%AD%E7%BB%83%E3%80%81%E6%8E%A8%E7%90%86%E3%80%81%E8%AF%84%E6%B5%8B%E4%B8%8E%E9%83%A8%E7%BD%B2%EF%BC%89.md
Curated by jiaran-king
Tags: LoRA, PEFT, Fine-tuning, Machine Learning, Deep Learning, LLM, Model Training

Mermaid source

%%{init: {"theme": "base", "themeVariables": {"background": "#ffffff", "primaryColor": "#fdf4ff", "primaryBorderColor": "#d946ef", "primaryTextColor": "#0f172a", "lineColor": "#64748b"}}}%%
flowchart TB
    L1["Pre-trained model<br>Freeze original weights"] --> L2["apply_lora_to_model()<br>Replace q/k/v/output proj"]
    L2 --> L3["LoRALinear<br>W(x) + scale · B(A(x))"]
    L3 --> L4["Train only A/B matrices"]
    L4 --> L5["Save adaptor state_dict"]
    L5 --> L6["Training: standalone adaptor"]
    L5 --> L7["Inference: merge_lora()"]

    classDef lora fill:#fdf4ff,stroke:#d946ef,color:#0f172a;
    classDef out fill:#f0fdf4,stroke:#22c55e,color:#0f172a;
    class L1,L2,L3,L4,L5 lora;
    class L6,L7 out;

What this diagram shows

This flowchart details the LoRA (Low-Rank Adaptation) workflow. It starts from a pre-trained model whose original weights are frozen. apply_lora_to_model() then replaces the q/k/v/output projection layers with LoRALinear modules, each computing W(x) + scale · B(A(x)), so that only the A/B matrices are trained. After training, the adaptor state_dict is saved. From there the adaptor can either stay standalone (the training state) or be merged into the model with merge_lora() for inference.
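
The LoRALinear node can be illustrated with a minimal PyTorch sketch. It is not the MicroLM implementation: the constructor arguments, the alpha / rank scaling convention, the initialization, and the merge() method name are all assumptions made for illustration. Only the formula W(x) + scale · B(A(x)) comes from the diagram.

```python
import math

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update (sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the original weights
        self.scale = alpha / rank  # assumed scaling convention
        self.A = nn.Linear(base.in_features, rank, bias=False)
        self.B = nn.Linear(rank, base.out_features, bias=False)
        nn.init.kaiming_uniform_(self.A.weight, a=math.sqrt(5))
        nn.init.zeros_(self.B.weight)  # the update starts at exactly zero

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W(x) + scale * B(A(x))
        return self.base(x) + self.scale * self.B(self.A(x))

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        """Fold the low-rank update into a plain nn.Linear for inference."""
        has_bias = self.base.bias is not None
        merged = nn.Linear(self.base.in_features, self.base.out_features, bias=has_bias)
        # B.weight is (out, rank) and A.weight is (rank, in), so B @ A is (out, in).
        merged.weight.copy_(self.base.weight + self.scale * (self.B.weight @ self.A.weight))
        if has_bias:
            merged.bias.copy_(self.base.bias)
        return merged


torch.manual_seed(0)
layer = LoRALinear(nn.Linear(16, 32), rank=4)
nn.init.normal_(layer.B.weight, std=0.02)  # stand-in for a trained B matrix
x = torch.randn(2, 16)
merged = layer.merge()
same = torch.allclose(layer(x), merged(x), atol=1e-5)
trainable = [name for name, p in layer.named_parameters() if p.requires_grad]
```

Under these assumptions the merged layer reproduces the adapted layer's output, and only A and B remain trainable. That is why the merged path suits inference: it costs a single matrix multiply per layer.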

When to use it

Use this diagram when explaining or implementing parameter-efficient fine-tuning methods for large language models, especially when aiming to reduce computational resources, memory footprint, and storage requirements for task-specific adaptations.
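
The storage and memory argument can be made concrete with a little arithmetic. The sketch below compares the parameter count of one full projection matrix with that of its A/B pair. The dimensions (4096 x 4096, rank 8) are hypothetical values chosen for illustration, not figures from the source project.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full weight parameters, LoRA A+B parameters) for one linear layer."""
    full = d_in * d_out            # W has shape (d_out, d_in)
    lora = rank * (d_in + d_out)   # A is (rank, d_in), B is (d_out, rank)
    return full, lora


# Hypothetical projection size and rank.
full, lora = lora_param_counts(4096, 4096, 8)
ratio = lora / full
print(f"full={full:,} lora={lora:,} ratio={ratio:.4%}")
```

For this hypothetical layer the A/B pair holds 65,536 parameters against 16,777,216 in the frozen weight, which is 1/256 of the original.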

How to adapt it for your project

Adapt this diagram by specifying different projection layers (e.g., only 'q' and 'v' or all linear layers) where LoRA is applied, adjusting the rank of the A/B matrices for different performance/parameter trade-offs, or integrating other PEFT methods alongside LoRA.
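
One way to make the target layers and rank configurable is sketched below. The helper names (apply_lora_sketch, adaptor_state_dict), the ToyAttention module, and its attribute names are hypothetical. The real signature of apply_lora_to_model() is not shown in the diagram, so this is an illustration of the idea rather than the project's API.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Compact LoRA wrapper (sketch): base(x) + scale * B(A(x))."""

    def __init__(self, base: nn.Linear, rank: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.scale = alpha / rank  # assumed scaling convention
        self.A = nn.Linear(base.in_features, rank, bias=False)
        self.B = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)

    def forward(self, x):
        return self.base(x) + self.scale * self.B(self.A(x))


class ToyAttention(nn.Module):
    """Hypothetical module holding only the four projections named in the diagram."""

    def __init__(self, d: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(d, d, bias=False)
        self.k_proj = nn.Linear(d, d, bias=False)
        self.v_proj = nn.Linear(d, d, bias=False)
        self.output_proj = nn.Linear(d, d, bias=False)


def apply_lora_sketch(model: nn.Module, targets: set, rank: int) -> nn.Module:
    """Freeze every weight, then wrap the nn.Linear children named in `targets`."""
    for p in model.parameters():
        p.requires_grad = False
    for parent in list(model.modules()):
        for name, child in list(parent.named_children()):
            if isinstance(child, nn.Linear) and name in targets:
                setattr(parent, name, LoRALinear(child, rank))
    return model


def adaptor_state_dict(model: nn.Module) -> dict:
    """Collect only the trainable tensors, i.e. the A/B matrices."""
    return {n: p.detach().clone() for n, p in model.named_parameters() if p.requires_grad}


# Adapt only q and v at rank 4; pass all four names to mirror the diagram.
model = apply_lora_sketch(ToyAttention(), targets={"q_proj", "v_proj"}, rank=4)
adaptor = adaptor_state_dict(model)
adaptor_keys = sorted(adaptor)
n_trainable = sum(t.numel() for t in adaptor.values())
```

Changing `targets` or `rank` is then the only edit needed to explore a different trade-off, and the dictionary returned by adaptor_state_dict() is what would be passed to torch.save() in the "save adaptor state_dict" step.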

Key concepts