Qwen LoRA Fine-tuning Path

ML & AI · flowchart diagram · MIT

Illustrates the Qwen LoRA fine-tuning process, from data tokenization and loss mask construction to PEFT LoRA injection and training loop.

Source: https://github.com/jiaran-king/MicroLM/blob/782ae02f10c14b484a317f22115a066b3b10b91d/Readme/%E9%A1%B9%E7%9B%AE%E5%85%A8%E6%99%AF%E5%9B%BE/00-%E5%85%A8%E6%B5%81%E7%A8%8B%E5%88%86%E6%9E%90%EF%BC%88%E8%AE%AD%E7%BB%83%E3%80%81%E6%8E%A8%E7%90%86%E3%80%81%E8%AF%84%E6%B5%8B%E4%B8%8E%E9%83%A8%E7%BD%B2%EF%BC%89.md
Curated by jiaran-king
LLM Fine-tuning LoRA Qwen PEFT Hugging Face Machine Learning

Mermaid source

%%{init: {"theme": "base", "themeVariables": {"background": "#ffffff", "primaryColor": "#f0fdf4", "primaryBorderColor": "#22c55e", "primaryTextColor": "#0f172a", "lineColor": "#64748b"}}}%%
flowchart TB
    W1["train.jsonl / valid.jsonl"] --> W2["HF AutoTokenizer + ChatML template"]
    W2 --> W3["样本转为 chat-format tokens"]
    W3 --> W4["构造 loss mask<br>prefix 对比法定位 assistant 区间"]
    W4 --> W5["Qwen2.5-1.5B-Instruct"]
    W5 --> W6["PEFT LoRA 注入 target modules"]
    W6 --> W7["训练循环<br>forward → masked loss → backward → eval"]
    W7 --> W8["checkpoint / best adaptor / final adaptor"]

    classDef data fill:#eff6ff,stroke:#60a5fa,color:#0f172a;
    classDef train fill:#f0fdf4,stroke:#22c55e,color:#0f172a;
    classDef loss fill:#fff7ed,stroke:#fb923c,color:#0f172a;
    classDef out fill:#fdf4ff,stroke:#d946ef,color:#0f172a;
    class W1,W2,W3 data;
    class W5,W6,W7 train;
    class W4 loss;
    class W8 out;

What this diagram shows

This diagram details the fine-tuning workflow for Qwen models using LoRA. It covers the preparation of training data with HF AutoTokenizer and ChatML, the construction of loss masks using a prefix comparison method to identify the assistant's response section, the injection of PEFT LoRA into target modules of the Qwen2.5-1.5B-Instruct model, and the iterative training loop leading to the generation of checkpoints and adaptors.

When to use it

Use this diagram when understanding or implementing LoRA fine-tuning for large language models like Qwen, especially when working with Hugging Face tokenizers, ChatML templates, and PEFT for efficient model adaptation. It's suitable for researchers and engineers developing custom LLM applications.

How to adapt it for your project

To adapt this diagram, one could replace Qwen with other base models, modify the tokenizer or chat template for different model architectures (e.g., Llama, Mistral), or experiment with various PEFT methods beyond LoRA. The loss mask construction method could also be adjusted for different instruction tuning strategies or output formats. Different training loop optimizations or evaluation metrics could also be incorporated.

Key concepts