Iluvatar GPU ERNIE-4.5-21B-A3B-Base & ERNIE-4.5-21B-A3B Training Quick Start¶
🚀 Quick Start🚀¶
(0)Before starting, you need a Iluvatar GPU machine, and the system requirements for this machine are as follows:¶
| Chip type | Driver version |
|---|---|
| BI150 | 4.3.0 |
Environment Description¶
- Machine: BI150 64GB 8-card machine
- Docker image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-ixuca:latest
- GCC path: /usr/bin/gcc (9.4)
- python version: 3.10
Note: This example uses an 8-card machine: To verify if your machine is Iluvatar GPU, simply enter the command in the system environment and see if there is any output:
ixsmi
#example:$ ixsmi
Timestamp Thu Jul 10 16:59:37 2025
+-----------------------------------------------------------------------------+
| IX-ML: 4.3.0 Driver Version: 4.3.0 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------|
| GPU Name | Bus-Id | Clock-SM Clock-Mem |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Iluvatar BI-V150 | 00000000:13:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Iluvatar BI-V150 | 00000000:16:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 103W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Iluvatar BI-V150 | 00000000:1C:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Iluvatar BI-V150 | 00000000:1F:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 106W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 Iluvatar BI-V150 | 00000000:27:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 Iluvatar BI-V150 | 00000000:2A:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 105W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 Iluvatar BI-V150 | 00000000:34:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 Iluvatar BI-V150 | 00000000:37:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 106W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 8 Iluvatar BI-V150 | 00000000:3D:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 9 Iluvatar BI-V150 | 00000000:40:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 107W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 10 Iluvatar BI-V150 | 00000000:48:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 11 Iluvatar BI-V150 | 00000000:4B:00.0 | 1500MHz 1600MHz |
| N/A 33C P0 103W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 12 Iluvatar BI-V150 | 00000000:54:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 13 Iluvatar BI-V150 | 00000000:57:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 104W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 14 Iluvatar BI-V150 | 00000000:64:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 15 Iluvatar BI-V150 | 00000000:67:00.0 | 1500MHz 1600MHz |
| N/A 36C P0 107W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Process name Usage(MiB) |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
(1) Environment Preparation: (This will take you 5-15 minutes)¶
-
Pull the Image
-
Install the driver kmd on host
-
Start the Container
-
Install paddlepaddle & paddle-iluvatar-gpu
ln -sf /usr/local/bin/python3 /usr/local/bin/python # Install PaddlePaddle CPU package python -m pip install https://paddle-whl.bj.bcebos.com/nightly/cpu/paddlepaddle/paddlepaddle-3.3.0.dev20251023-cp310-cp310-linux_x86_64.whl # Install PaddlePaddle iluvatar-gpu plugin package python -m pip install https://paddle-whl.bj.bcebos.com/nightly/ixuca/paddle-iluvatar-gpu/paddle_iluvatar_gpu-3.0.0.dev20251023-cp310-cp310-linux_x86_64.whl Nightly version link: https://www.paddlepaddle.org.cn/packages/nightly/ixuca/ -
Install ERNIEKit
(2) Start post-traning:(This will take a relatively long time)¶
SFT fine-tuning
export PATH=/usr/local/corex/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/corex/lib64:$LD_LIBRARY_PATH
export LD_PRELOAD=/usr/local/corex/lib64/libcuda.so.1
export PADDLE_XCCL_BACKEND=iluvatar_gpu
rm -rf erniekit_dist_log/ output/ vdl_log/
# ERNIE-4.5-0.3B sft
erniekit train examples/configs/iluvatar_gpu/ERNIE-4.5-0.3B/sft/run_sft_8k.yaml
# ERNIE-4.5-21B sft
erniekit train examples/configs/iluvatar_gpu/ERNIE-4.5-21B-A3B/sft/run_sft_8k.yaml
SFT-LoRA fine-tuning
export PATH=/usr/local/corex/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/corex/lib64:$LD_LIBRARY_PATH
export LD_PRELOAD=/usr/local/corex/lib64/libcuda.so.1
export PADDLE_XCCL_BACKEND=iluvatar_gpu
rm -rf erniekit_dist_log/ output/ vdl_log/
# ERNIE-4.5-0.3B sft-lora
erniekit train examples/configs/iluvatar_gpu/ERNIE-4.5-0.3B/sft/run_sft_lora_8k.yaml
# ERNIE-4.5-21B sft-lora
erniekit train examples/configs/iluvatar_gpu/ERNIE-4.5-21B-A3B/sft/run_sft_lora_8k.yaml
Iluvatar GPU ERNIE-4.5-VL-28B-A3B Training Quick Start¶
🚀 Quick Start🚀¶
(0)Before starting, you need a Iluvatar GPU machine, and the system requirements for this machine are as follows:¶
| Chip type | Driver version |
|---|---|
| BI150 | 4.3.0 |
Environment Description¶
- Machine: BI150 64GB 8-card machine
- Docker image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/device/paddle-ixuca:latest
- GCC path: /usr/bin/gcc (9.4)
- python version: 3.10
Note: This example uses an 8-card machine: To verify if your machine is Iluvatar GPU, simply enter the command in the system environment and see if there is any output:
ixsmi
#example:$ ixsmi
Timestamp Thu Jul 10 16:59:37 2025
+-----------------------------------------------------------------------------+
| IX-ML: 4.3.0 Driver Version: 4.3.0 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------|
| GPU Name | Bus-Id | Clock-SM Clock-Mem |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Iluvatar BI-V150 | 00000000:13:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Iluvatar BI-V150 | 00000000:16:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 103W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Iluvatar BI-V150 | 00000000:1C:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Iluvatar BI-V150 | 00000000:1F:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 106W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 Iluvatar BI-V150 | 00000000:27:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 5 Iluvatar BI-V150 | 00000000:2A:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 105W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 Iluvatar BI-V150 | 00000000:34:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 Iluvatar BI-V150 | 00000000:37:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 106W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 8 Iluvatar BI-V150 | 00000000:3D:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 9 Iluvatar BI-V150 | 00000000:40:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 107W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 10 Iluvatar BI-V150 | 00000000:48:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 11 Iluvatar BI-V150 | 00000000:4B:00.0 | 1500MHz 1600MHz |
| N/A 33C P0 103W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 12 Iluvatar BI-V150 | 00000000:54:00.0 | 1500MHz 1600MHz |
| N/A 34C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 13 Iluvatar BI-V150 | 00000000:57:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 104W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 14 Iluvatar BI-V150 | 00000000:64:00.0 | 1500MHz 1600MHz |
| N/A 35C P0 N/A / N/A | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 15 Iluvatar BI-V150 | 00000000:67:00.0 | 1500MHz 1600MHz |
| N/A 36C P0 107W / 350W | 64MiB / 32768MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Process name Usage(MiB) |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
(1) Environment Preparation: (This will take you 5-15 minutes)¶
-
Pull the Image
-
Install the driver kmd on host
-
Start the Container
-
Install paddlepaddle & paddle-iluvatar-gpu & FastDeploy
Nightly version link: https://www.paddlepaddle.org.cn/packages/nightly/ixuca/# Install PaddlePaddle CPU package pip3 install paddlepaddle==3.2.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/ # Install PaddlePaddle iluvatar-gpu plugin package pip3 install --pre paddle-iluvatar-gpu==3.0.0.dev20250926 -i https://www.paddlepaddle.org.cn/packages/nightly/ixuca/ # Install FastDeploy package pip3 install fastdeploy_iluvatar_gpu==2.3.0.dev0 -i https://www.paddlepaddle.org.cn/packages/stable/ixuca/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simplels -
Install ERNIEKit
-
Install requirements
-
Download DoclingMatix Datasets
(2) Start post-training:(This will take a relatively long time)¶
SFT-LoRA fine-tuning
export PATH=/usr/local/corex/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/corex/lib64:$LD_LIBRARY_PATH
export LD_PRELOAD=/usr/local/corex/lib64/libcuda.so.1
export PADDLE_XCCL_BACKEND=iluvatar_gpu
rm -rf erniekit_dist_log/ output/ vdl_log/
erniekit train examples/configs/iluvatar_gpu/ERNIE-4.5-VL-28B-A3B/sft/run_sft_lora_8k.yaml