Illustrates the Qwen LoRA fine-tuning process, from data tokenization and loss mask construction to PEFT LoRA injection and training loop.
%%{init: {"theme": "base", "themeVariables": {"background": "#ffffff", "primaryColor": "#f0fdf4", "primaryBorderColor": "#22c55e", "primaryTextColor": "#0f172a", "lineColor": "#64748b"}}}%%
flowchart TB
W1["train.jsonl / valid.jsonl"] --> W2["HF AutoTokenizer + ChatML template"]
W2 --> W3["样本转为 chat-format tokens"]
W3 --> W4["构造 loss mask<br>prefix 对比法定位 assistant 区间"]
W4 --> W5["Qwen2.5-1.5B-Instruct"]
W5 --> W6["PEFT LoRA 注入 target modules"]
W6 --> W7["训练循环<br>forward → masked loss → backward → eval"]
W7 --> W8["checkpoint / best adaptor / final adaptor"]
classDef data fill:#eff6ff,stroke:#60a5fa,color:#0f172a;
classDef train fill:#f0fdf4,stroke:#22c55e,color:#0f172a;
classDef loss fill:#fff7ed,stroke:#fb923c,color:#0f172a;
classDef out fill:#fdf4ff,stroke:#d946ef,color:#0f172a;
class W1,W2,W3 data;
class W5,W6,W7 train;
class W4 loss;
class W8 out;
This diagram details the fine-tuning workflow for Qwen models using LoRA. It covers the preparation of training data with HF AutoTokenizer and ChatML, the construction of loss masks using a prefix comparison method to identify the assistant's response section, the injection of PEFT LoRA into target modules of the Qwen2.5-1.5B-Instruct model, and the iterative training loop leading to the generation of checkpoints and adaptors.
Use this diagram when understanding or implementing LoRA fine-tuning for large language models like Qwen, especially when working with Hugging Face tokenizers, ChatML templates, and PEFT for efficient model adaptation. It's suitable for researchers and engineers developing custom LLM applications.
To adapt this diagram, one could replace Qwen with other base models, modify the tokenizer or chat template for different model architectures (e.g., Llama, Mistral), or experiment with various PEFT methods beyond LoRA. The loss mask construction method could also be adjusted for different instruction tuning strategies or output formats. Different training loop optimizations or evaluation metrics could also be incorporated.