| [2026-02-28 21:09:36] Experiment directory created at results-sh-ceph/stage2/rae_lightningdit_b |
| [2026-02-28 21:10:12] Model Parameters: 677.25M |
| [2026-02-28 21:10:23] Evaluation dataset loaded from /home/jovyan/liushanyuan-sh-ceph/project/sub_project/gongyue/data/ILSVRC2012/ILSVRC2012_validation/data, containing 50000 images. |
| [2026-02-28 21:10:23] Saved training worktree and config to results-sh-ceph/stage2/rae_lightningdit_b. |
| [2026-02-28 21:10:23] Stage-1 RAE parameters: 501.93M |
| [2026-02-28 21:10:23] Stage-2 Model parameters: 677.25M |
| [2026-02-28 21:10:23] Clipping gradients to max norm 1.0. |
| [2026-02-28 21:10:23] Optimizer: AdamW with lr=0.0002, betas=(0.9, 0.95), weight_decay=0.0, eps=1e-08 |
| [2026-02-28 21:10:23] Training for 1400 epochs, batch size 64 per GPU. |
| [2026-02-28 21:10:23] Dataset contains 1281167 samples, 1251 steps per epoch. |
| [2026-02-28 21:10:23] Running with world size 16, starting from epoch 0 to 1400. |
| [2026-02-28 21:12:37] [Epoch 0 | Step 0] train/loss: 0.0230, train/lr: 0.0002 |
| [2026-02-28 21:12:37] Generating EMA samples... |
| [2026-02-28 21:13:28] Generating EMA samples done. |
| [2026-02-28 21:13:28] Starting evaluation... |
| [2026-02-28 21:18:18] eval_ema/fid: 247.4102015958618 |
| [2026-02-28 21:22:15] eval_model/fid: 247.78087604473595 |
| [2026-02-28 21:22:15] Evaluation done. |
| [2026-02-28 21:22:39] [Epoch 0 | Step 100] train/loss: 1.7093, train/lr: 0.0002 |
| [2026-02-28 21:23:03] [Epoch 0 | Step 200] train/loss: 1.2537, train/lr: 0.0002 |
| [2026-02-28 21:23:27] [Epoch 0 | Step 300] train/loss: 1.0328, train/lr: 0.0002 |
| [2026-02-28 21:23:51] [Epoch 0 | Step 400] train/loss: 0.9122, train/lr: 0.0002 |
| [2026-02-28 21:24:15] [Epoch 0 | Step 500] train/loss: 0.8334, train/lr: 0.0002 |
| [2026-02-28 21:24:39] [Epoch 0 | Step 600] train/loss: 0.7778, train/lr: 0.0002 |
| [2026-02-28 21:25:04] [Epoch 0 | Step 700] train/loss: 0.7388, train/lr: 0.0002 |
| [2026-02-28 21:25:28] [Epoch 0 | Step 800] train/loss: 0.7093, train/lr: 0.0002 |
| [2026-02-28 21:25:52] [Epoch 0 | Step 900] train/loss: 0.6856, train/lr: 0.0002 |
| [2026-02-28 21:26:16] [Epoch 0 | Step 1000] train/loss: 0.6611, train/lr: 0.0002 |
| [2026-02-28 21:26:40] [Epoch 0 | Step 1100] train/loss: 0.6447, train/lr: 0.0002 |
| [2026-02-28 21:27:04] [Epoch 0 | Step 1200] train/loss: 0.6331, train/lr: 0.0002 |
| [2026-02-28 21:27:16] [Epoch 0] epoch/loss: 0.8735 |
| [2026-02-28 21:27:31] [Epoch 1 | Step 1300] train/loss: 0.6225, train/lr: 0.0002 |
| [2026-02-28 21:27:55] [Epoch 1 | Step 1400] train/loss: 0.6095, train/lr: 0.0002 |
| [2026-02-28 21:28:19] [Epoch 1 | Step 1500] train/loss: 0.6028, train/lr: 0.0002 |
| [2026-02-28 21:28:43] [Epoch 1 | Step 1600] train/loss: 0.5906, train/lr: 0.0002 |
| [2026-02-28 21:29:07] [Epoch 1 | Step 1700] train/loss: 0.5826, train/lr: 0.0002 |
| [2026-02-28 21:29:31] [Epoch 1 | Step 1800] train/loss: 0.5765, train/lr: 0.0002 |
| [2026-02-28 21:29:55] [Epoch 1 | Step 1900] train/loss: 0.5681, train/lr: 0.0002 |
| [2026-02-28 21:30:22] [Epoch 1 | Step 2000] train/loss: 0.5693, train/lr: 0.0002 |
| [2026-02-28 21:30:46] [Epoch 1 | Step 2100] train/loss: 0.5579, train/lr: 0.0002 |
| [2026-02-28 21:31:11] [Epoch 1 | Step 2200] train/loss: 0.5580, train/lr: 0.0002 |
| [2026-02-28 21:31:39] [Epoch 1 | Step 2300] train/loss: 0.5523, train/lr: 0.0002 |
| [2026-02-28 21:32:08] [Epoch 1 | Step 2400] train/loss: 0.5478, train/lr: 0.0002 |
| [2026-02-28 21:32:34] [Epoch 1 | Step 2500] train/loss: 0.5440, train/lr: 0.0002 |
| [2026-02-28 21:32:35] [Epoch 1] epoch/loss: 0.5735 |
| [2026-02-28 21:33:01] [Epoch 2 | Step 2600] train/loss: 0.5384, train/lr: 0.0002 |
| [2026-02-28 21:33:25] [Epoch 2 | Step 2700] train/loss: 0.5326, train/lr: 0.0002 |
| [2026-02-28 21:33:49] [Epoch 2 | Step 2800] train/loss: 0.5309, train/lr: 0.0002 |
| [2026-02-28 21:34:13] [Epoch 2 | Step 2900] train/loss: 0.5298, train/lr: 0.0002 |
| [2026-02-28 21:34:37] [Epoch 2 | Step 3000] train/loss: 0.5258, train/lr: 0.0002 |
| [2026-02-28 21:35:01] [Epoch 2 | Step 3100] train/loss: 0.5222, train/lr: 0.0002 |
| [2026-02-28 21:35:25] [Epoch 2 | Step 3200] train/loss: 0.5188, train/lr: 0.0002 |
| [2026-02-28 21:35:50] [Epoch 2 | Step 3300] train/loss: 0.5153, train/lr: 0.0002 |
| [2026-02-28 21:36:14] [Epoch 2 | Step 3400] train/loss: 0.5152, train/lr: 0.0002 |
| [2026-02-28 21:36:38] [Epoch 2 | Step 3500] train/loss: 0.5160, train/lr: 0.0002 |
| [2026-02-28 21:37:02] [Epoch 2 | Step 3600] train/loss: 0.5091, train/lr: 0.0002 |
| [2026-02-28 21:37:26] [Epoch 2 | Step 3700] train/loss: 0.5069, train/lr: 0.0002 |
| [2026-02-28 21:37:38] [Epoch 2] epoch/loss: 0.5211 |
| [2026-02-28 21:37:52] [Epoch 3 | Step 3800] train/loss: 0.5050, train/lr: 0.0002 |
| [2026-02-28 21:38:16] [Epoch 3 | Step 3900] train/loss: 0.5042, train/lr: 0.0002 |
| [2026-02-28 21:38:41] [Epoch 3 | Step 4000] train/loss: 0.5015, train/lr: 0.0002 |
| [2026-02-28 21:39:05] [Epoch 3 | Step 4100] train/loss: 0.5016, train/lr: 0.0002 |
| [2026-02-28 21:39:29] [Epoch 3 | Step 4200] train/loss: 0.4952, train/lr: 0.0002 |
| [2026-02-28 21:39:54] [Epoch 3 | Step 4300] train/loss: 0.4958, train/lr: 0.0002 |
| [2026-02-28 21:40:19] [Epoch 3 | Step 4400] train/loss: 0.4948, train/lr: 0.0002 |
| [2026-02-28 21:40:44] [Epoch 3 | Step 4500] train/loss: 0.4954, train/lr: 0.0002 |
| [2026-02-28 21:41:09] [Epoch 3 | Step 4600] train/loss: 0.4937, train/lr: 0.0002 |
| [2026-02-28 21:41:35] [Epoch 3 | Step 4700] train/loss: 0.4902, train/lr: 0.0002 |
| [2026-02-28 21:42:02] [Epoch 3 | Step 4800] train/loss: 0.4911, train/lr: 0.0002 |
| [2026-02-28 21:42:28] [Epoch 3 | Step 4900] train/loss: 0.4896, train/lr: 0.0002 |
| [2026-02-28 21:42:52] [Epoch 3 | Step 5000] train/loss: 0.4878, train/lr: 0.0002 |
| [2026-02-28 21:42:53] [Epoch 3] epoch/loss: 0.4953 |
| [2026-02-28 21:43:19] [Epoch 4 | Step 5100] train/loss: 0.4857, train/lr: 0.0002 |
| [2026-02-28 21:43:43] [Epoch 4 | Step 5200] train/loss: 0.4842, train/lr: 0.0002 |
| [2026-02-28 21:44:07] [Epoch 4 | Step 5300] train/loss: 0.4843, train/lr: 0.0002 |
| [2026-02-28 21:44:31] [Epoch 4 | Step 5400] train/loss: 0.4785, train/lr: 0.0002 |
| [2026-02-28 21:44:55] [Epoch 4 | Step 5500] train/loss: 0.4794, train/lr: 0.0002 |
| [2026-02-28 21:45:20] [Epoch 4 | Step 5600] train/loss: 0.4779, train/lr: 0.0002 |
| [2026-02-28 21:45:44] [Epoch 4 | Step 5700] train/loss: 0.4795, train/lr: 0.0002 |
| [2026-02-28 21:46:08] [Epoch 4 | Step 5800] train/loss: 0.4769, train/lr: 0.0002 |
| [2026-02-28 21:46:32] [Epoch 4 | Step 5900] train/loss: 0.4770, train/lr: 0.0002 |
| [2026-02-28 21:46:56] [Epoch 4 | Step 6000] train/loss: 0.4749, train/lr: 0.0002 |
| [2026-02-28 21:47:20] [Epoch 4 | Step 6100] train/loss: 0.4746, train/lr: 0.0002 |
| [2026-02-28 21:47:44] [Epoch 4 | Step 6200] train/loss: 0.4717, train/lr: 0.0002 |
| [2026-02-28 21:47:57] [Epoch 4] epoch/loss: 0.4784 |
| [2026-02-28 21:48:11] [Epoch 5 | Step 6300] train/loss: 0.4702, train/lr: 0.0002 |
| [2026-02-28 21:48:35] [Epoch 5 | Step 6400] train/loss: 0.4722, train/lr: 0.0002 |
| [2026-02-28 21:48:59] [Epoch 5 | Step 6500] train/loss: 0.4704, train/lr: 0.0002 |
| [2026-02-28 21:49:24] [Epoch 5 | Step 6600] train/loss: 0.4677, train/lr: 0.0002 |
| [2026-02-28 21:49:49] [Epoch 5 | Step 6700] train/loss: 0.4698, train/lr: 0.0002 |
| [2026-02-28 21:50:14] [Epoch 5 | Step 6800] train/loss: 0.4685, train/lr: 0.0002 |
| [2026-02-28 21:50:39] [Epoch 5 | Step 6900] train/loss: 0.4693, train/lr: 0.0002 |
| [2026-02-28 21:51:06] [Epoch 5 | Step 7000] train/loss: 0.4690, train/lr: 0.0002 |
| [2026-02-28 21:51:35] [Epoch 5 | Step 7100] train/loss: 0.4675, train/lr: 0.0002 |
| [2026-02-28 21:51:59] [Epoch 5 | Step 7200] train/loss: 0.4635, train/lr: 0.0002 |
| [2026-02-28 21:52:24] [Epoch 5 | Step 7300] train/loss: 0.4629, train/lr: 0.0002 |
| [2026-02-28 21:52:48] [Epoch 5 | Step 7400] train/loss: 0.4635, train/lr: 0.0002 |
| [2026-02-28 21:53:12] [Epoch 5 | Step 7500] train/loss: 0.4628, train/lr: 0.0002 |
| [2026-02-28 21:53:14] [Epoch 5] epoch/loss: 0.4672 |
| [2026-02-28 21:53:39] [Epoch 6 | Step 7600] train/loss: 0.4607, train/lr: 0.0002 |
| [2026-02-28 21:54:03] [Epoch 6 | Step 7700] train/loss: 0.4621, train/lr: 0.0002 |
| [2026-02-28 21:54:27] [Epoch 6 | Step 7800] train/loss: 0.4604, train/lr: 0.0002 |
| [2026-02-28 21:54:51] [Epoch 6 | Step 7900] train/loss: 0.4614, train/lr: 0.0002 |
| [2026-02-28 21:55:16] [Epoch 6 | Step 8000] train/loss: 0.4583, train/lr: 0.0002 |
| [2026-02-28 21:55:40] [Epoch 6 | Step 8100] train/loss: 0.4587, train/lr: 0.0002 |
| [2026-02-28 21:56:04] [Epoch 6 | Step 8200] train/loss: 0.4593, train/lr: 0.0002 |
| [2026-02-28 21:56:28] [Epoch 6 | Step 8300] train/loss: 0.4582, train/lr: 0.0002 |
| [2026-02-28 21:56:52] [Epoch 6 | Step 8400] train/loss: 0.4572, train/lr: 0.0002 |
| [2026-02-28 21:57:16] [Epoch 6 | Step 8500] train/loss: 0.4566, train/lr: 0.0002 |
| [2026-02-28 21:57:40] [Epoch 6 | Step 8600] train/loss: 0.4552, train/lr: 0.0002 |
| [2026-02-28 21:58:04] [Epoch 6 | Step 8700] train/loss: 0.4569, train/lr: 0.0002 |
| [2026-02-28 21:58:18] [Epoch 6] epoch/loss: 0.4585 |
| [2026-02-28 21:58:31] [Epoch 7 | Step 8800] train/loss: 0.4528, train/lr: 0.0002 |
| [2026-02-28 21:58:56] [Epoch 7 | Step 8900] train/loss: 0.4479, train/lr: 0.0002 |
| [2026-02-28 21:59:21] [Epoch 7 | Step 9000] train/loss: 0.4524, train/lr: 0.0002 |
| [2026-02-28 21:59:46] [Epoch 7 | Step 9100] train/loss: 0.4486, train/lr: 0.0002 |
| [2026-02-28 22:00:11] [Epoch 7 | Step 9200] train/loss: 0.4519, train/lr: 0.0002 |
| [2026-02-28 22:00:37] [Epoch 7 | Step 9300] train/loss: 0.4502, train/lr: 0.0002 |
| [2026-02-28 22:01:03] [Epoch 7 | Step 9400] train/loss: 0.4493, train/lr: 0.0002 |
| [2026-02-28 22:01:32] [Epoch 7 | Step 9500] train/loss: 0.4500, train/lr: 0.0002 |
| [2026-02-28 22:01:57] [Epoch 7 | Step 9600] train/loss: 0.4487, train/lr: 0.0002 |
| [2026-02-28 22:02:21] [Epoch 7 | Step 9700] train/loss: 0.4480, train/lr: 0.0002 |
| [2026-02-28 22:02:45] [Epoch 7 | Step 9800] train/loss: 0.4475, train/lr: 0.0002 |
| [2026-02-28 22:03:09] [Epoch 7 | Step 9900] train/loss: 0.4484, train/lr: 0.0002 |
| [2026-02-28 22:03:33] [Epoch 7 | Step 10000] train/loss: 0.4455, train/lr: 0.0002 |
| [2026-02-28 22:03:33] Generating EMA samples... |
| [2026-02-28 22:03:34] Generating EMA samples done. |
| [2026-02-28 22:03:36] [Epoch 7] epoch/loss: 0.4491 |
| [2026-02-28 22:04:01] [Epoch 8 | Step 10100] train/loss: 0.4494, train/lr: 0.0002 |
| [2026-02-28 22:04:25] [Epoch 8 | Step 10200] train/loss: 0.4449, train/lr: 0.0002 |
| [2026-02-28 22:04:49] [Epoch 8 | Step 10300] train/loss: 0.4448, train/lr: 0.0002 |
| [2026-02-28 22:05:13] [Epoch 8 | Step 10400] train/loss: 0.4448, train/lr: 0.0002 |
| [2026-02-28 22:05:37] [Epoch 8 | Step 10500] train/loss: 0.4448, train/lr: 0.0002 |
| [2026-02-28 22:06:01] [Epoch 8 | Step 10600] train/loss: 0.4459, train/lr: 0.0002 |
| [2026-02-28 22:06:25] [Epoch 8 | Step 10700] train/loss: 0.4466, train/lr: 0.0002 |
| [2026-02-28 22:06:49] [Epoch 8 | Step 10800] train/loss: 0.4442, train/lr: 0.0002 |
| [2026-02-28 22:07:13] [Epoch 8 | Step 10900] train/loss: 0.4442, train/lr: 0.0002 |
| [2026-02-28 22:07:37] [Epoch 8 | Step 11000] train/loss: 0.4441, train/lr: 0.0002 |
| [2026-02-28 22:08:02] [Epoch 8 | Step 11100] train/loss: 0.4410, train/lr: 0.0002 |
| [2026-02-28 22:08:26] [Epoch 8 | Step 11200] train/loss: 0.4436, train/lr: 0.0002 |
| [2026-02-28 22:08:40] [Epoch 8] epoch/loss: 0.4448 |
| [2026-02-28 22:08:53] [Epoch 9 | Step 11300] train/loss: 0.4450, train/lr: 0.0002 |
| [2026-02-28 22:09:19] [Epoch 9 | Step 11400] train/loss: 0.4416, train/lr: 0.0002 |
| [2026-02-28 22:09:43] [Epoch 9 | Step 11500] train/loss: 0.4400, train/lr: 0.0002 |
| [2026-02-28 22:10:08] [Epoch 9 | Step 11600] train/loss: 0.4400, train/lr: 0.0002 |
| [2026-02-28 22:10:34] [Epoch 9 | Step 11700] train/loss: 0.4415, train/lr: 0.0002 |
| [2026-02-28 22:11:05] [Epoch 9 | Step 11800] train/loss: 0.4388, train/lr: 0.0002 |
| [2026-02-28 22:11:30] [Epoch 9 | Step 11900] train/loss: 0.4386, train/lr: 0.0002 |
| [2026-02-28 22:11:56] [Epoch 9 | Step 12000] train/loss: 0.4404, train/lr: 0.0002 |
| [2026-02-28 22:12:20] [Epoch 9 | Step 12100] train/loss: 0.4346, train/lr: 0.0002 |
| [2026-02-28 22:12:44] [Epoch 9 | Step 12200] train/loss: 0.4365, train/lr: 0.0002 |
| [2026-02-28 22:13:08] [Epoch 9 | Step 12300] train/loss: 0.4391, train/lr: 0.0002 |
| [2026-02-28 22:13:32] [Epoch 9 | Step 12400] train/loss: 0.4386, train/lr: 0.0002 |
| [2026-02-28 22:13:56] [Epoch 9 | Step 12500] train/loss: 0.4361, train/lr: 0.0002 |
| [2026-02-28 22:13:59] [Epoch 9] epoch/loss: 0.4390 |
| [2026-02-28 22:13:59] Saving checkpoint at epoch 10... |
| [2026-02-28 22:14:34] [Epoch 10 | Step 12600] train/loss: 0.4372, train/lr: 0.0002 |
| [2026-02-28 22:14:58] [Epoch 10 | Step 12700] train/loss: 0.4363, train/lr: 0.0002 |
| [2026-02-28 22:15:22] [Epoch 10 | Step 12800] train/loss: 0.4354, train/lr: 0.0002 |
| [2026-02-28 22:15:46] [Epoch 10 | Step 12900] train/loss: 0.4391, train/lr: 0.0002 |
| [2026-02-28 22:16:10] [Epoch 10 | Step 13000] train/loss: 0.4361, train/lr: 0.0002 |
| [2026-02-28 22:16:34] [Epoch 10 | Step 13100] train/loss: 0.4326, train/lr: 0.0002 |
| [2026-02-28 22:16:59] [Epoch 10 | Step 13200] train/loss: 0.4380, train/lr: 0.0002 |
| [2026-02-28 22:17:23] [Epoch 10 | Step 13300] train/loss: 0.4323, train/lr: 0.0002 |
| [2026-02-28 22:17:47] [Epoch 10 | Step 13400] train/loss: 0.4309, train/lr: 0.0002 |
| [2026-02-28 22:18:11] [Epoch 10 | Step 13500] train/loss: 0.4362, train/lr: 0.0002 |
| [2026-02-28 22:18:36] [Epoch 10 | Step 13600] train/loss: 0.4359, train/lr: 0.0002 |
| [2026-02-28 22:19:00] [Epoch 10 | Step 13700] train/loss: 0.4312, train/lr: 0.0002 |
| [2026-02-28 22:19:17] [Epoch 10] epoch/loss: 0.4348 |
| [2026-02-28 22:19:29] [Epoch 11 | Step 13800] train/loss: 0.4310, train/lr: 0.0002 |
| [2026-02-28 22:19:54] [Epoch 11 | Step 13900] train/loss: 0.4302, train/lr: 0.0002 |
| [2026-02-28 22:20:19] [Epoch 11 | Step 14000] train/loss: 0.4293, train/lr: 0.0002 |
| [2026-02-28 22:20:47] [Epoch 11 | Step 14100] train/loss: 0.4299, train/lr: 0.0002 |
| [2026-02-28 22:21:14] [Epoch 11 | Step 14200] train/loss: 0.4290, train/lr: 0.0002 |
| [2026-02-28 22:21:40] [Epoch 11 | Step 14300] train/loss: 0.4307, train/lr: 0.0002 |
| [2026-02-28 22:22:04] [Epoch 11 | Step 14400] train/loss: 0.4312, train/lr: 0.0002 |
| [2026-02-28 22:22:28] [Epoch 11 | Step 14500] train/loss: 0.4297, train/lr: 0.0002 |
| [2026-02-28 22:22:53] [Epoch 11 | Step 14600] train/loss: 0.4328, train/lr: 0.0002 |
| [2026-02-28 22:23:17] [Epoch 11 | Step 14700] train/loss: 0.4295, train/lr: 0.0002 |
| [2026-02-28 22:23:41] [Epoch 11 | Step 14800] train/loss: 0.4280, train/lr: 0.0002 |
| [2026-02-28 22:24:05] [Epoch 11 | Step 14900] train/loss: 0.4293, train/lr: 0.0002 |
| [2026-02-28 22:24:29] [Epoch 11 | Step 15000] train/loss: 0.4287, train/lr: 0.0002 |
| [2026-02-28 22:24:32] [Epoch 11] epoch/loss: 0.4300 |
| [2026-02-28 22:24:56] [Epoch 12 | Step 15100] train/loss: 0.4272, train/lr: 0.0002 |
| [2026-02-28 22:25:20] [Epoch 12 | Step 15200] train/loss: 0.4312, train/lr: 0.0002 |
| [2026-02-28 22:25:44] [Epoch 12 | Step 15300] train/loss: 0.4282, train/lr: 0.0002 |
| [2026-02-28 22:26:08] [Epoch 12 | Step 15400] train/loss: 0.4262, train/lr: 0.0002 |
| [2026-02-28 22:26:32] [Epoch 12 | Step 15500] train/loss: 0.4283, train/lr: 0.0002 |
| [2026-02-28 22:26:56] [Epoch 12 | Step 15600] train/loss: 0.4287, train/lr: 0.0002 |
| [2026-02-28 22:27:20] [Epoch 12 | Step 15700] train/loss: 0.4257, train/lr: 0.0002 |
| [2026-02-28 22:27:45] [Epoch 12 | Step 15800] train/loss: 0.4234, train/lr: 0.0002 |
| [2026-02-28 22:28:10] [Epoch 12 | Step 15900] train/loss: 0.4274, train/lr: 0.0002 |
| [2026-02-28 22:28:34] [Epoch 12 | Step 16000] train/loss: 0.4242, train/lr: 0.0002 |
| [2026-02-28 22:28:59] [Epoch 12 | Step 16100] train/loss: 0.4254, train/lr: 0.0002 |
| [2026-02-28 22:29:24] [Epoch 12 | Step 16200] train/loss: 0.4255, train/lr: 0.0002 |
| [2026-02-28 22:29:39] [Epoch 12] epoch/loss: 0.4267 |
| [2026-02-28 22:29:51] [Epoch 13 | Step 16300] train/loss: 0.4263, train/lr: 0.0002 |
| [2026-02-28 22:30:18] [Epoch 13 | Step 16400] train/loss: 0.4240, train/lr: 0.0002 |
| [2026-02-28 22:30:47] [Epoch 13 | Step 16500] train/loss: 0.4248, train/lr: 0.0002 |
| [2026-02-28 22:31:13] [Epoch 13 | Step 16600] train/loss: 0.4199, train/lr: 0.0002 |
| [2026-02-28 22:31:38] [Epoch 13 | Step 16700] train/loss: 0.4205, train/lr: 0.0002 |
| [2026-02-28 22:32:02] [Epoch 13 | Step 16800] train/loss: 0.4265, train/lr: 0.0002 |
| [2026-02-28 22:32:26] [Epoch 13 | Step 16900] train/loss: 0.4211, train/lr: 0.0002 |
| [2026-02-28 22:32:50] [Epoch 13 | Step 17000] train/loss: 0.4249, train/lr: 0.0002 |
| [2026-02-28 22:33:14] [Epoch 13 | Step 17100] train/loss: 0.4206, train/lr: 0.0002 |
| [2026-02-28 22:33:38] [Epoch 13 | Step 17200] train/loss: 0.4220, train/lr: 0.0002 |
| [2026-02-28 22:34:02] [Epoch 13 | Step 17300] train/loss: 0.4279, train/lr: 0.0002 |
| [2026-02-28 22:34:27] [Epoch 13 | Step 17400] train/loss: 0.4242, train/lr: 0.0002 |
| [2026-02-28 22:34:51] [Epoch 13 | Step 17500] train/loss: 0.4207, train/lr: 0.0002 |
| [2026-02-28 22:34:54] [Epoch 13] epoch/loss: 0.4232 |
| [2026-02-28 22:35:17] [Epoch 14 | Step 17600] train/loss: 0.4227, train/lr: 0.0002 |
| [2026-02-28 22:35:41] [Epoch 14 | Step 17700] train/loss: 0.4219, train/lr: 0.0002 |
| [2026-02-28 22:36:05] [Epoch 14 | Step 17800] train/loss: 0.4197, train/lr: 0.0002 |
| [2026-02-28 22:36:30] [Epoch 14 | Step 17900] train/loss: 0.4201, train/lr: 0.0002 |
| [2026-02-28 22:36:54] [Epoch 14 | Step 18000] train/loss: 0.4212, train/lr: 0.0002 |
| [2026-02-28 22:37:18] [Epoch 14 | Step 18100] train/loss: 0.4224, train/lr: 0.0002 |
| [2026-02-28 22:37:42] [Epoch 14 | Step 18200] train/loss: 0.4220, train/lr: 0.0002 |
| [2026-02-28 22:38:07] [Epoch 14 | Step 18300] train/loss: 0.4217, train/lr: 0.0002 |
| [2026-02-28 22:38:31] [Epoch 14 | Step 18400] train/loss: 0.4201, train/lr: 0.0002 |
| [2026-02-28 22:38:57] [Epoch 14 | Step 18500] train/loss: 0.4214, train/lr: 0.0002 |
| [2026-02-28 22:39:22] [Epoch 14 | Step 18600] train/loss: 0.4184, train/lr: 0.0002 |
| [2026-02-28 22:39:47] [Epoch 14 | Step 18700] train/loss: 0.4212, train/lr: 0.0002 |
| [2026-02-28 22:40:05] [Epoch 14] epoch/loss: 0.4210 |
| [2026-02-28 22:40:18] [Epoch 15 | Step 18800] train/loss: 0.4216, train/lr: 0.0002 |
| [2026-02-28 22:40:44] [Epoch 15 | Step 18900] train/loss: 0.4186, train/lr: 0.0002 |
| [2026-02-28 22:41:10] [Epoch 15 | Step 19000] train/loss: 0.4217, train/lr: 0.0002 |
| [2026-02-28 22:41:34] [Epoch 15 | Step 19100] train/loss: 0.4208, train/lr: 0.0002 |
| [2026-02-28 22:41:58] [Epoch 15 | Step 19200] train/loss: 0.4175, train/lr: 0.0002 |
| [2026-02-28 22:42:23] [Epoch 15 | Step 19300] train/loss: 0.4196, train/lr: 0.0002 |
| [2026-02-28 22:42:47] [Epoch 15 | Step 19400] train/loss: 0.4191, train/lr: 0.0002 |
| [2026-02-28 22:43:11] [Epoch 15 | Step 19500] train/loss: 0.4187, train/lr: 0.0002 |
| [2026-02-28 22:43:35] [Epoch 15 | Step 19600] train/loss: 0.4197, train/lr: 0.0002 |
| [2026-02-28 22:43:59] [Epoch 15 | Step 19700] train/loss: 0.4197, train/lr: 0.0002 |
| [2026-02-28 22:44:23] [Epoch 15 | Step 19800] train/loss: 0.4190, train/lr: 0.0002 |
| [2026-02-28 22:44:47] [Epoch 15 | Step 19900] train/loss: 0.4177, train/lr: 0.0002 |
| [2026-02-28 22:45:11] [Epoch 15 | Step 20000] train/loss: 0.4207, train/lr: 0.0002 |
| [2026-02-28 22:45:11] Generating EMA samples... |
| [2026-02-28 22:45:12] Generating EMA samples done. |
| [2026-02-28 22:45:16] [Epoch 15] epoch/loss: 0.4197 |
| [2026-02-28 22:45:39] [Epoch 16 | Step 20100] train/loss: 0.4160, train/lr: 0.0002 |
| [2026-02-28 22:46:03] [Epoch 16 | Step 20200] train/loss: 0.4153, train/lr: 0.0002 |
| [2026-02-28 22:46:27] [Epoch 16 | Step 20300] train/loss: 0.4144, train/lr: 0.0002 |
| [2026-02-28 22:46:51] [Epoch 16 | Step 20400] train/loss: 0.4152, train/lr: 0.0002 |
| [2026-02-28 22:47:15] [Epoch 16 | Step 20500] train/loss: 0.4181, train/lr: 0.0002 |
| [2026-02-28 22:47:40] [Epoch 16 | Step 20600] train/loss: 0.4145, train/lr: 0.0002 |
| [2026-02-28 22:48:04] [Epoch 16 | Step 20700] train/loss: 0.4156, train/lr: 0.0002 |
| [2026-02-28 22:48:29] [Epoch 16 | Step 20800] train/loss: 0.4179, train/lr: 0.0002 |
| [2026-02-28 22:48:54] [Epoch 16 | Step 20900] train/loss: 0.4145, train/lr: 0.0002 |
| [2026-02-28 22:49:19] [Epoch 16 | Step 21000] train/loss: 0.4180, train/lr: 0.0002 |
| [2026-02-28 22:49:44] [Epoch 16 | Step 21100] train/loss: 0.4121, train/lr: 0.0002 |
| [2026-02-28 22:50:13] [Epoch 16 | Step 21200] train/loss: 0.4128, train/lr: 0.0002 |
| [2026-02-28 22:50:30] [Epoch 16] epoch/loss: 0.4151 |
| [2026-02-28 22:50:42] [Epoch 17 | Step 21300] train/loss: 0.4132, train/lr: 0.0002 |
| [2026-02-28 22:51:07] [Epoch 17 | Step 21400] train/loss: 0.4135, train/lr: 0.0002 |
| [2026-02-28 22:51:31] [Epoch 17 | Step 21500] train/loss: 0.4140, train/lr: 0.0002 |
| [2026-02-28 22:51:55] [Epoch 17 | Step 21600] train/loss: 0.4120, train/lr: 0.0002 |
| [2026-02-28 22:52:19] [Epoch 17 | Step 21700] train/loss: 0.4175, train/lr: 0.0002 |
| [2026-02-28 22:52:43] [Epoch 17 | Step 21800] train/loss: 0.4147, train/lr: 0.0002 |
| [2026-02-28 22:53:08] [Epoch 17 | Step 21900] train/loss: 0.4128, train/lr: 0.0002 |
| [2026-02-28 22:53:32] [Epoch 17 | Step 22000] train/loss: 0.4102, train/lr: 0.0002 |
| [2026-02-28 22:53:56] [Epoch 17 | Step 22100] train/loss: 0.4141, train/lr: 0.0002 |
| [2026-02-28 22:54:20] [Epoch 17 | Step 22200] train/loss: 0.4128, train/lr: 0.0002 |
| [2026-02-28 22:54:44] [Epoch 17 | Step 22300] train/loss: 0.4152, train/lr: 0.0002 |
| [2026-02-28 22:55:08] [Epoch 17 | Step 22400] train/loss: 0.4118, train/lr: 0.0002 |
| [2026-02-28 22:55:32] [Epoch 17 | Step 22500] train/loss: 0.4103, train/lr: 0.0002 |
| [2026-02-28 22:55:36] [Epoch 17] epoch/loss: 0.4133 |
| [2026-02-28 22:55:59] [Epoch 18 | Step 22600] train/loss: 0.4135, train/lr: 0.0002 |
| [2026-02-28 22:56:23] [Epoch 18 | Step 22700] train/loss: 0.4127, train/lr: 0.0002 |
| [2026-02-28 22:56:47] [Epoch 18 | Step 22800] train/loss: 0.4108, train/lr: 0.0002 |
| [2026-02-28 22:57:11] [Epoch 18 | Step 22900] train/loss: 0.4136, train/lr: 0.0002 |
| [2026-02-28 22:57:36] [Epoch 18 | Step 23000] train/loss: 0.4113, train/lr: 0.0002 |
| [2026-02-28 22:58:00] [Epoch 18 | Step 23100] train/loss: 0.4123, train/lr: 0.0002 |
| [2026-02-28 22:58:25] [Epoch 18 | Step 23200] train/loss: 0.4132, train/lr: 0.0002 |
| [2026-02-28 22:58:51] [Epoch 18 | Step 23300] train/loss: 0.4123, train/lr: 0.0002 |
| [2026-02-28 22:59:17] [Epoch 18 | Step 23400] train/loss: 0.4116, train/lr: 0.0002 |
| [2026-02-28 22:59:44] [Epoch 18 | Step 23500] train/loss: 0.4119, train/lr: 0.0002 |
| [2026-02-28 23:00:12] [Epoch 18 | Step 23600] train/loss: 0.4127, train/lr: 0.0002 |
| [2026-02-28 23:00:38] [Epoch 18 | Step 23700] train/loss: 0.4097, train/lr: 0.0002 |
| [2026-02-28 23:00:55] [Epoch 18] epoch/loss: 0.4120 |
| [2026-02-28 23:01:05] [Epoch 19 | Step 23800] train/loss: 0.4089, train/lr: 0.0002 |
| [2026-02-28 23:01:29] [Epoch 19 | Step 23900] train/loss: 0.4099, train/lr: 0.0002 |
| [2026-02-28 23:01:53] [Epoch 19 | Step 24000] train/loss: 0.4116, train/lr: 0.0002 |
| [2026-02-28 23:02:17] [Epoch 19 | Step 24100] train/loss: 0.4119, train/lr: 0.0002 |
| [2026-02-28 23:02:41] [Epoch 19 | Step 24200] train/loss: 0.4104, train/lr: 0.0002 |
| [2026-02-28 23:03:05] [Epoch 19 | Step 24300] train/loss: 0.4111, train/lr: 0.0002 |
| [2026-02-28 23:03:29] [Epoch 19 | Step 24400] train/loss: 0.4101, train/lr: 0.0002 |
| [2026-02-28 23:03:53] [Epoch 19 | Step 24500] train/loss: 0.4102, train/lr: 0.0002 |
| [2026-02-28 23:04:17] [Epoch 19 | Step 24600] train/loss: 0.4110, train/lr: 0.0002 |
| [2026-02-28 23:04:42] [Epoch 19 | Step 24700] train/loss: 0.4078, train/lr: 0.0002 |
| [2026-02-28 23:05:05] [Epoch 19 | Step 24800] train/loss: 0.4119, train/lr: 0.0002 |
| [2026-02-28 23:05:30] [Epoch 19 | Step 24900] train/loss: 0.4088, train/lr: 0.0002 |
| [2026-02-28 23:05:54] [Epoch 19 | Step 25000] train/loss: 0.4077, train/lr: 0.0002 |
| [2026-02-28 23:05:59] [Epoch 19] epoch/loss: 0.4100 |
| [2026-02-28 23:05:59] Saving checkpoint at epoch 20... |
| [2026-02-28 23:06:29] [Epoch 20 | Step 25100] train/loss: 0.4101, train/lr: 0.0002 |
| [2026-02-28 23:06:53] [Epoch 20 | Step 25200] train/loss: 0.4044, train/lr: 0.0002 |
| [2026-02-28 23:07:18] [Epoch 20 | Step 25300] train/loss: 0.4106, train/lr: 0.0002 |
| [2026-02-28 23:07:42] [Epoch 20 | Step 25400] train/loss: 0.4079, train/lr: 0.0002 |
| [2026-02-28 23:08:07] [Epoch 20 | Step 25500] train/loss: 0.4096, train/lr: 0.0002 |
| [2026-02-28 23:08:31] [Epoch 20 | Step 25600] train/loss: 0.4073, train/lr: 0.0002 |
| [2026-02-28 23:08:57] [Epoch 20 | Step 25700] train/loss: 0.4075, train/lr: 0.0002 |
| [2026-02-28 23:09:23] [Epoch 20 | Step 25800] train/loss: 0.4069, train/lr: 0.0002 |
| [2026-02-28 23:09:52] [Epoch 20 | Step 25900] train/loss: 0.4128, train/lr: 0.0002 |
| [2026-02-28 23:10:18] [Epoch 20 | Step 26000] train/loss: 0.4095, train/lr: 0.0002 |
| [2026-02-28 23:10:43] [Epoch 20 | Step 26100] train/loss: 0.4067, train/lr: 0.0002 |
| [2026-02-28 23:11:07] [Epoch 20 | Step 26200] train/loss: 0.4089, train/lr: 0.0002 |
| [2026-02-28 23:11:24] [Epoch 20] epoch/loss: 0.4085 |
| [2026-02-28 23:11:34] [Epoch 21 | Step 26300] train/loss: 0.4082, train/lr: 0.0002 |
| [2026-02-28 23:11:58] [Epoch 21 | Step 26400] train/loss: 0.4092, train/lr: 0.0002 |
| [2026-02-28 23:12:22] [Epoch 21 | Step 26500] train/loss: 0.4093, train/lr: 0.0002 |
| [2026-02-28 23:12:46] [Epoch 21 | Step 26600] train/loss: 0.4067, train/lr: 0.0002 |
| [2026-02-28 23:13:11] [Epoch 21 | Step 26700] train/loss: 0.4070, train/lr: 0.0002 |
| [2026-02-28 23:13:35] [Epoch 21 | Step 26800] train/loss: 0.4068, train/lr: 0.0002 |
| [2026-02-28 23:13:59] [Epoch 21 | Step 26900] train/loss: 0.4047, train/lr: 0.0002 |
| [2026-02-28 23:14:23] [Epoch 21 | Step 27000] train/loss: 0.4087, train/lr: 0.0002 |
| [2026-02-28 23:14:47] [Epoch 21 | Step 27100] train/loss: 0.4071, train/lr: 0.0002 |
| [2026-02-28 23:15:12] [Epoch 21 | Step 27200] train/loss: 0.4071, train/lr: 0.0002 |
| [2026-02-28 23:15:36] [Epoch 21 | Step 27300] train/loss: 0.4050, train/lr: 0.0002 |
| [2026-02-28 23:16:00] [Epoch 21 | Step 27400] train/loss: 0.4079, train/lr: 0.0002 |
| [2026-02-28 23:16:24] [Epoch 21 | Step 27500] train/loss: 0.4100, train/lr: 0.0002 |
| [2026-02-28 23:16:29] [Epoch 21] epoch/loss: 0.4076 |
| [2026-02-28 23:16:51] [Epoch 22 | Step 27600] train/loss: 0.4056, train/lr: 0.0002 |
| [2026-02-28 23:17:16] [Epoch 22 | Step 27700] train/loss: 0.4043, train/lr: 0.0002 |
| [2026-02-28 23:17:40] [Epoch 22 | Step 27800] train/loss: 0.4041, train/lr: 0.0002 |
| [2026-02-28 23:18:05] [Epoch 22 | Step 27900] train/loss: 0.4056, train/lr: 0.0002 |
| [2026-02-28 23:18:31] [Epoch 22 | Step 28000] train/loss: 0.4057, train/lr: 0.0002 |
| [2026-02-28 23:18:58] [Epoch 22 | Step 28100] train/loss: 0.4073, train/lr: 0.0002 |
| [2026-02-28 23:19:26] [Epoch 22 | Step 28200] train/loss: 0.4065, train/lr: 0.0002 |
| [2026-02-28 23:19:54] [Epoch 22 | Step 28300] train/loss: 0.4071, train/lr: 0.0002 |
| [2026-02-28 23:20:20] [Epoch 22 | Step 28400] train/loss: 0.4047, train/lr: 0.0002 |
| [2026-02-28 23:20:44] [Epoch 22 | Step 28500] train/loss: 0.4018, train/lr: 0.0002 |
| [2026-02-28 23:21:08] [Epoch 22 | Step 28600] train/loss: 0.4062, train/lr: 0.0002 |
| [2026-02-28 23:21:32] [Epoch 22 | Step 28700] train/loss: 0.4070, train/lr: 0.0002 |
| [2026-02-28 23:21:50] [Epoch 22] epoch/loss: 0.4055 |
| [2026-02-28 23:21:59] [Epoch 23 | Step 28800] train/loss: 0.4064, train/lr: 0.0002 |
| [2026-02-28 23:22:23] [Epoch 23 | Step 28900] train/loss: 0.4050, train/lr: 0.0002 |
| [2026-02-28 23:22:47] [Epoch 23 | Step 29000] train/loss: 0.4033, train/lr: 0.0002 |
| [2026-02-28 23:23:12] [Epoch 23 | Step 29100] train/loss: 0.4032, train/lr: 0.0002 |
| [2026-02-28 23:23:36] [Epoch 23 | Step 29200] train/loss: 0.4079, train/lr: 0.0002 |
| [2026-02-28 23:24:00] [Epoch 23 | Step 29300] train/loss: 0.4008, train/lr: 0.0002 |
| [2026-02-28 23:24:24] [Epoch 23 | Step 29400] train/loss: 0.4032, train/lr: 0.0002 |
| [2026-02-28 23:24:48] [Epoch 23 | Step 29500] train/loss: 0.4046, train/lr: 0.0002 |
| [2026-02-28 23:25:12] [Epoch 23 | Step 29600] train/loss: 0.4067, train/lr: 0.0002 |
| [2026-02-28 23:25:37] [Epoch 23 | Step 29700] train/loss: 0.4034, train/lr: 0.0002 |
| [2026-02-28 23:26:01] [Epoch 23 | Step 29800] train/loss: 0.4047, train/lr: 0.0002 |
| [2026-02-28 23:26:25] [Epoch 23 | Step 29900] train/loss: 0.4045, train/lr: 0.0002 |
| [2026-02-28 23:26:50] [Epoch 23 | Step 30000] train/loss: 0.4082, train/lr: 0.0002 |
| [2026-02-28 23:26:50] Generating EMA samples... |
| [2026-02-28 23:26:51] Generating EMA samples done. |
| [2026-02-28 23:26:57] [Epoch 23] epoch/loss: 0.4046 |
| [2026-02-28 23:27:18] [Epoch 24 | Step 30100] train/loss: 0.4045, train/lr: 0.0002 |
| [2026-02-28 23:27:43] [Epoch 24 | Step 30200] train/loss: 0.4028, train/lr: 0.0002 |
| [2026-02-28 23:28:08] [Epoch 24 | Step 30300] train/loss: 0.4022, train/lr: 0.0002 |
| [2026-02-28 23:28:33] [Epoch 24 | Step 30400] train/loss: 0.3997, train/lr: 0.0002 |
| [2026-02-28 23:29:01] [Epoch 24 | Step 30500] train/loss: 0.4038, train/lr: 0.0002 |
| [2026-02-28 23:29:29] [Epoch 24 | Step 30600] train/loss: 0.4032, train/lr: 0.0002 |
| [2026-02-28 23:29:55] [Epoch 24 | Step 30700] train/loss: 0.4019, train/lr: 0.0002 |
| [2026-02-28 23:30:21] [Epoch 24 | Step 30800] train/loss: 0.3985, train/lr: 0.0002 |
| [2026-02-28 23:30:45] [Epoch 24 | Step 30900] train/loss: 0.4020, train/lr: 0.0002 |
| [2026-02-28 23:31:09] [Epoch 24 | Step 31000] train/loss: 0.4036, train/lr: 0.0002 |
| [2026-02-28 23:31:33] [Epoch 24 | Step 31100] train/loss: 0.4051, train/lr: 0.0002 |
| [2026-02-28 23:31:57] [Epoch 24 | Step 31200] train/loss: 0.4000, train/lr: 0.0002 |
| [2026-02-28 23:32:15] [Epoch 24] epoch/loss: 0.4021 |
| [2026-02-28 23:32:24] [Epoch 25 | Step 31300] train/loss: 0.4016, train/lr: 0.0002 |
| [2026-02-28 23:32:48] [Epoch 25 | Step 31400] train/loss: 0.3977, train/lr: 0.0002 |
| [2026-02-28 23:33:13] [Epoch 25 | Step 31500] train/loss: 0.4029, train/lr: 0.0002 |
| [2026-02-28 23:33:37] [Epoch 25 | Step 31600] train/loss: 0.4009, train/lr: 0.0002 |
| [2026-02-28 23:34:01] [Epoch 25 | Step 31700] train/loss: 0.4005, train/lr: 0.0002 |
| [2026-02-28 23:34:25] [Epoch 25 | Step 31800] train/loss: 0.4056, train/lr: 0.0002 |
| [2026-02-28 23:34:49] [Epoch 25 | Step 31900] train/loss: 0.3996, train/lr: 0.0002 |
| [2026-02-28 23:35:13] [Epoch 25 | Step 32000] train/loss: 0.4025, train/lr: 0.0002 |
| [2026-02-28 23:35:37] [Epoch 25 | Step 32100] train/loss: 0.4031, train/lr: 0.0002 |
| [2026-02-28 23:36:01] [Epoch 25 | Step 32200] train/loss: 0.3997, train/lr: 0.0002 |
| [2026-02-28 23:36:25] [Epoch 25 | Step 32300] train/loss: 0.4022, train/lr: 0.0002 |
| [2026-02-28 23:36:50] [Epoch 25 | Step 32400] train/loss: 0.4033, train/lr: 0.0002 |
| [2026-02-28 23:37:14] [Epoch 25 | Step 32500] train/loss: 0.4011, train/lr: 0.0002 |
| [2026-02-28 23:37:21] [Epoch 25] epoch/loss: 0.4016 |
| [2026-02-28 23:37:42] [Epoch 26 | Step 32600] train/loss: 0.4022, train/lr: 0.0002 |
| [[34m2026-02-28 23:38:08[0m] [Epoch 26 | Step 32700] train/loss: 0.4017, train/lr: 0.0002 |
| [[34m2026-02-28 23:38:34[0m] [Epoch 26 | Step 32800] train/loss: 0.3962, train/lr: 0.0002 |
| [[34m2026-02-28 23:39:02[0m] [Epoch 26 | Step 32900] train/loss: 0.4027, train/lr: 0.0002 |
| [[34m2026-02-28 23:39:30[0m] [Epoch 26 | Step 33000] train/loss: 0.3992, train/lr: 0.0002 |
| [[34m2026-02-28 23:39:55[0m] [Epoch 26 | Step 33100] train/loss: 0.4010, train/lr: 0.0002 |
| [[34m2026-02-28 23:40:19[0m] [Epoch 26 | Step 33200] train/loss: 0.3978, train/lr: 0.0002 |
| [[34m2026-02-28 23:40:44[0m] [Epoch 26 | Step 33300] train/loss: 0.3978, train/lr: 0.0002 |
| [[34m2026-02-28 23:41:08[0m] [Epoch 26 | Step 33400] train/loss: 0.4005, train/lr: 0.0002 |
| [[34m2026-02-28 23:41:32[0m] [Epoch 26 | Step 33500] train/loss: 0.3997, train/lr: 0.0002 |
| [[34m2026-02-28 23:41:56[0m] [Epoch 26 | Step 33600] train/loss: 0.3959, train/lr: 0.0002 |
| [[34m2026-02-28 23:42:20[0m] [Epoch 26 | Step 33700] train/loss: 0.4002, train/lr: 0.0002 |
| [[34m2026-02-28 23:42:39[0m] [Epoch 26] epoch/loss: 0.3995 |
| [[34m2026-02-28 23:42:47[0m] [Epoch 27 | Step 33800] train/loss: 0.3979, train/lr: 0.0002 |
| [[34m2026-02-28 23:43:11[0m] [Epoch 27 | Step 33900] train/loss: 0.3993, train/lr: 0.0002 |
| [[34m2026-02-28 23:43:35[0m] [Epoch 27 | Step 34000] train/loss: 0.3963, train/lr: 0.0002 |
| [[34m2026-02-28 23:43:59[0m] [Epoch 27 | Step 34100] train/loss: 0.4013, train/lr: 0.0002 |
| [[34m2026-02-28 23:44:23[0m] [Epoch 27 | Step 34200] train/loss: 0.4007, train/lr: 0.0002 |
| [[34m2026-02-28 23:44:47[0m] [Epoch 27 | Step 34300] train/loss: 0.4001, train/lr: 0.0002 |
| [[34m2026-02-28 23:45:11[0m] [Epoch 27 | Step 34400] train/loss: 0.3999, train/lr: 0.0002 |
| [[34m2026-02-28 23:45:35[0m] [Epoch 27 | Step 34500] train/loss: 0.3999, train/lr: 0.0002 |
| [[34m2026-02-28 23:45:59[0m] [Epoch 27 | Step 34600] train/loss: 0.3996, train/lr: 0.0002 |
| [[34m2026-02-28 23:46:24[0m] [Epoch 27 | Step 34700] train/loss: 0.3980, train/lr: 0.0002 |
| [[34m2026-02-28 23:46:48[0m] [Epoch 27 | Step 34800] train/loss: 0.3968, train/lr: 0.0002 |
| [[34m2026-02-28 23:47:13[0m] [Epoch 27 | Step 34900] train/loss: 0.4011, train/lr: 0.0002 |
| [[34m2026-02-28 23:47:38[0m] [Epoch 27 | Step 35000] train/loss: 0.3974, train/lr: 0.0002 |
| [[34m2026-02-28 23:47:45[0m] [Epoch 27] epoch/loss: 0.3991 |
| [[34m2026-02-28 23:48:07[0m] [Epoch 28 | Step 35100] train/loss: 0.3984, train/lr: 0.0002 |
| [[34m2026-02-28 23:48:33[0m] [Epoch 28 | Step 35200] train/loss: 0.3966, train/lr: 0.0002 |
| [[34m2026-02-28 23:49:02[0m] [Epoch 28 | Step 35300] train/loss: 0.3994, train/lr: 0.0002 |
| [[34m2026-02-28 23:49:27[0m] [Epoch 28 | Step 35400] train/loss: 0.3995, train/lr: 0.0002 |
| [[34m2026-02-28 23:49:53[0m] [Epoch 28 | Step 35500] train/loss: 0.3984, train/lr: 0.0002 |
| [[34m2026-02-28 23:50:17[0m] [Epoch 28 | Step 35600] train/loss: 0.4002, train/lr: 0.0002 |
| [[34m2026-02-28 23:50:41[0m] [Epoch 28 | Step 35700] train/loss: 0.3981, train/lr: 0.0002 |
| [[34m2026-02-28 23:51:05[0m] [Epoch 28 | Step 35800] train/loss: 0.3981, train/lr: 0.0002 |
| [[34m2026-02-28 23:51:29[0m] [Epoch 28 | Step 35900] train/loss: 0.3966, train/lr: 0.0002 |
| [[34m2026-02-28 23:51:53[0m] [Epoch 28 | Step 36000] train/loss: 0.3986, train/lr: 0.0002 |
| [[34m2026-02-28 23:52:17[0m] [Epoch 28 | Step 36100] train/loss: 0.3980, train/lr: 0.0002 |
| [[34m2026-02-28 23:52:41[0m] [Epoch 28 | Step 36200] train/loss: 0.3979, train/lr: 0.0002 |
| [[34m2026-02-28 23:53:00[0m] [Epoch 28] epoch/loss: 0.3981 |
| [[34m2026-02-28 23:53:08[0m] [Epoch 29 | Step 36300] train/loss: 0.3959, train/lr: 0.0002 |
| [[34m2026-02-28 23:53:32[0m] [Epoch 29 | Step 36400] train/loss: 0.3983, train/lr: 0.0002 |
| [[34m2026-02-28 23:53:56[0m] [Epoch 29 | Step 36500] train/loss: 0.3982, train/lr: 0.0002 |
| [[34m2026-02-28 23:54:20[0m] [Epoch 29 | Step 36600] train/loss: 0.3984, train/lr: 0.0002 |
| [[34m2026-02-28 23:54:44[0m] [Epoch 29 | Step 36700] train/loss: 0.3928, train/lr: 0.0002 |
| [[34m2026-02-28 23:55:08[0m] [Epoch 29 | Step 36800] train/loss: 0.3947, train/lr: 0.0002 |
| [[34m2026-02-28 23:55:32[0m] [Epoch 29 | Step 36900] train/loss: 0.3996, train/lr: 0.0002 |
| [[34m2026-02-28 23:55:56[0m] [Epoch 29 | Step 37000] train/loss: 0.3985, train/lr: 0.0002 |
| [[34m2026-02-28 23:56:21[0m] [Epoch 29 | Step 37100] train/loss: 0.3955, train/lr: 0.0002 |
| [[34m2026-02-28 23:56:45[0m] [Epoch 29 | Step 37200] train/loss: 0.3958, train/lr: 0.0002 |
| [[34m2026-02-28 23:57:10[0m] [Epoch 29 | Step 37300] train/loss: 0.3978, train/lr: 0.0002 |
| [[34m2026-02-28 23:57:35[0m] [Epoch 29 | Step 37400] train/loss: 0.4001, train/lr: 0.0002 |
| [[34m2026-02-28 23:58:01[0m] [Epoch 29 | Step 37500] train/loss: 0.4005, train/lr: 0.0002 |
| [[34m2026-02-28 23:58:11[0m] [Epoch 29] epoch/loss: 0.3974 |
| [[34m2026-02-28 23:58:11[0m] Saving checkpoint at epoch 30... |
| [[34m2026-02-28 23:58:40[0m] [Epoch 30 | Step 37600] train/loss: 0.3947, train/lr: 0.0002 |
| [[34m2026-02-28 23:59:07[0m] [Epoch 30 | Step 37700] train/loss: 0.3952, train/lr: 0.0002 |
| [[34m2026-02-28 23:59:33[0m] [Epoch 30 | Step 37800] train/loss: 0.3955, train/lr: 0.0002 |
| [[34m2026-02-28 23:59:58[0m] [Epoch 30 | Step 37900] train/loss: 0.3956, train/lr: 0.0002 |
| [[34m2026-03-01 00:00:22[0m] [Epoch 30 | Step 38000] train/loss: 0.3939, train/lr: 0.0002 |
| [[34m2026-03-01 00:00:46[0m] [Epoch 30 | Step 38100] train/loss: 0.3959, train/lr: 0.0002 |
| [[34m2026-03-01 00:01:10[0m] [Epoch 30 | Step 38200] train/loss: 0.3952, train/lr: 0.0002 |
| [[34m2026-03-01 00:01:34[0m] [Epoch 30 | Step 38300] train/loss: 0.3968, train/lr: 0.0002 |
| [[34m2026-03-01 00:01:58[0m] [Epoch 30 | Step 38400] train/loss: 0.3961, train/lr: 0.0002 |
| [[34m2026-03-01 00:02:22[0m] [Epoch 30 | Step 38500] train/loss: 0.3947, train/lr: 0.0002 |
| [[34m2026-03-01 00:02:47[0m] [Epoch 30 | Step 38600] train/loss: 0.3965, train/lr: 0.0002 |
| [[34m2026-03-01 00:03:11[0m] [Epoch 30 | Step 38700] train/loss: 0.3960, train/lr: 0.0002 |
| [[34m2026-03-01 00:03:30[0m] [Epoch 30] epoch/loss: 0.3954 |
| [[34m2026-03-01 00:03:37[0m] [Epoch 31 | Step 38800] train/loss: 0.3924, train/lr: 0.0002 |
| [[34m2026-03-01 00:04:01[0m] [Epoch 31 | Step 38900] train/loss: 0.3929, train/lr: 0.0002 |
| [[34m2026-03-01 00:04:25[0m] [Epoch 31 | Step 39000] train/loss: 0.3976, train/lr: 0.0002 |
| [[34m2026-03-01 00:04:49[0m] [Epoch 31 | Step 39100] train/loss: 0.3968, train/lr: 0.0002 |
| [[34m2026-03-01 00:05:14[0m] [Epoch 31 | Step 39200] train/loss: 0.3930, train/lr: 0.0002 |
| [[34m2026-03-01 00:05:38[0m] [Epoch 31 | Step 39300] train/loss: 0.3948, train/lr: 0.0002 |
| [[34m2026-03-01 00:06:02[0m] [Epoch 31 | Step 39400] train/loss: 0.3930, train/lr: 0.0002 |
| [[34m2026-03-01 00:06:27[0m] [Epoch 31 | Step 39500] train/loss: 0.3945, train/lr: 0.0002 |
| [[34m2026-03-01 00:06:51[0m] [Epoch 31 | Step 39600] train/loss: 0.3973, train/lr: 0.0002 |
| [[34m2026-03-01 00:07:16[0m] [Epoch 31 | Step 39700] train/loss: 0.3955, train/lr: 0.0002 |
| [[34m2026-03-01 00:07:41[0m] [Epoch 31 | Step 39800] train/loss: 0.3957, train/lr: 0.0002 |
| [[34m2026-03-01 00:08:08[0m] [Epoch 31 | Step 39900] train/loss: 0.3991, train/lr: 0.0002 |
| [[34m2026-03-01 00:08:36[0m] [Epoch 31 | Step 40000] train/loss: 0.3939, train/lr: 0.0002 |
| [[34m2026-03-01 00:08:36[0m] Generating EMA samples... |
| [[34m2026-03-01 00:08:37[0m] Generating EMA samples done. |
| [[34m2026-03-01 00:08:46[0m] [Epoch 31] epoch/loss: 0.3952 |
| [[34m2026-03-01 00:09:07[0m] [Epoch 32 | Step 40100] train/loss: 0.3934, train/lr: 0.0002 |
| [[34m2026-03-01 00:09:32[0m] [Epoch 32 | Step 40200] train/loss: 0.3936, train/lr: 0.0002 |
| [[34m2026-03-01 00:09:56[0m] [Epoch 32 | Step 40300] train/loss: 0.3932, train/lr: 0.0002 |
| [[34m2026-03-01 00:10:20[0m] [Epoch 32 | Step 40400] train/loss: 0.3945, train/lr: 0.0002 |
| [[34m2026-03-01 00:10:44[0m] [Epoch 32 | Step 40500] train/loss: 0.3954, train/lr: 0.0002 |
| [[34m2026-03-01 00:11:09[0m] [Epoch 32 | Step 40600] train/loss: 0.3939, train/lr: 0.0002 |
| [[34m2026-03-01 00:11:33[0m] [Epoch 32 | Step 40700] train/loss: 0.3913, train/lr: 0.0002 |
| [[34m2026-03-01 00:11:57[0m] [Epoch 32 | Step 40800] train/loss: 0.3950, train/lr: 0.0002 |
| [[34m2026-03-01 00:12:21[0m] [Epoch 32 | Step 40900] train/loss: 0.3897, train/lr: 0.0002 |
| [[34m2026-03-01 00:12:45[0m] [Epoch 32 | Step 41000] train/loss: 0.3939, train/lr: 0.0002 |
| [[34m2026-03-01 00:13:09[0m] [Epoch 32 | Step 41100] train/loss: 0.3947, train/lr: 0.0002 |
| [[34m2026-03-01 00:13:33[0m] [Epoch 32 | Step 41200] train/loss: 0.3908, train/lr: 0.0002 |
| [[34m2026-03-01 00:13:53[0m] [Epoch 32] epoch/loss: 0.3933 |
| [[34m2026-03-01 00:14:00[0m] [Epoch 33 | Step 41300] train/loss: 0.3942, train/lr: 0.0002 |
| [[34m2026-03-01 00:14:24[0m] [Epoch 33 | Step 41400] train/loss: 0.3946, train/lr: 0.0002 |
| [[34m2026-03-01 00:14:48[0m] [Epoch 33 | Step 41500] train/loss: 0.3931, train/lr: 0.0002 |
| [[34m2026-03-01 00:15:12[0m] [Epoch 33 | Step 41600] train/loss: 0.3927, train/lr: 0.0002 |
| [[34m2026-03-01 00:15:36[0m] [Epoch 33 | Step 41700] train/loss: 0.3921, train/lr: 0.0002 |
| [[34m2026-03-01 00:16:01[0m] [Epoch 33 | Step 41800] train/loss: 0.3921, train/lr: 0.0002 |
| [[34m2026-03-01 00:16:25[0m] [Epoch 33 | Step 41900] train/loss: 0.3887, train/lr: 0.0002 |
| [[34m2026-03-01 00:16:51[0m] [Epoch 33 | Step 42000] train/loss: 0.3944, train/lr: 0.0002 |
| [[34m2026-03-01 00:17:16[0m] [Epoch 33 | Step 42100] train/loss: 0.3937, train/lr: 0.0002 |
| [[34m2026-03-01 00:17:44[0m] [Epoch 33 | Step 42200] train/loss: 0.3937, train/lr: 0.0002 |
| [[34m2026-03-01 00:18:09[0m] [Epoch 33 | Step 42300] train/loss: 0.3926, train/lr: 0.0002 |
| [[34m2026-03-01 00:18:37[0m] [Epoch 33 | Step 42400] train/loss: 0.3931, train/lr: 0.0002 |
| [[34m2026-03-01 00:19:04[0m] [Epoch 33 | Step 42500] train/loss: 0.3901, train/lr: 0.0002 |
| [[34m2026-03-01 00:19:12[0m] [Epoch 33] epoch/loss: 0.3925 |
| [[34m2026-03-01 00:19:31[0m] [Epoch 34 | Step 42600] train/loss: 0.3907, train/lr: 0.0002 |
| [[34m2026-03-01 00:19:55[0m] [Epoch 34 | Step 42700] train/loss: 0.3914, train/lr: 0.0002 |
| [[34m2026-03-01 00:20:19[0m] [Epoch 34 | Step 42800] train/loss: 0.3911, train/lr: 0.0002 |
| [[34m2026-03-01 00:20:43[0m] [Epoch 34 | Step 42900] train/loss: 0.3923, train/lr: 0.0002 |
| [[34m2026-03-01 00:21:07[0m] [Epoch 34 | Step 43000] train/loss: 0.3953, train/lr: 0.0002 |
| [[34m2026-03-01 00:21:31[0m] [Epoch 34 | Step 43100] train/loss: 0.3903, train/lr: 0.0002 |
| [[34m2026-03-01 00:21:55[0m] [Epoch 34 | Step 43200] train/loss: 0.3927, train/lr: 0.0002 |
| [[34m2026-03-01 00:22:20[0m] [Epoch 34 | Step 43300] train/loss: 0.3880, train/lr: 0.0002 |
| [[34m2026-03-01 00:22:44[0m] [Epoch 34 | Step 43400] train/loss: 0.3912, train/lr: 0.0002 |
| [[34m2026-03-01 00:23:08[0m] [Epoch 34 | Step 43500] train/loss: 0.3893, train/lr: 0.0002 |
| [[34m2026-03-01 00:23:32[0m] [Epoch 34 | Step 43600] train/loss: 0.3917, train/lr: 0.0002 |
| [[34m2026-03-01 00:23:56[0m] [Epoch 34 | Step 43700] train/loss: 0.3945, train/lr: 0.0002 |
| [[34m2026-03-01 00:24:16[0m] [Epoch 34] epoch/loss: 0.3916 |
| [[34m2026-03-01 00:24:23[0m] [Epoch 35 | Step 43800] train/loss: 0.3920, train/lr: 0.0002 |
| [[34m2026-03-01 00:24:47[0m] [Epoch 35 | Step 43900] train/loss: 0.3932, train/lr: 0.0002 |
| [[34m2026-03-01 00:25:11[0m] [Epoch 35 | Step 44000] train/loss: 0.3911, train/lr: 0.0002 |
| [[34m2026-03-01 00:25:36[0m] [Epoch 35 | Step 44100] train/loss: 0.3925, train/lr: 0.0002 |
| [[34m2026-03-01 00:26:00[0m] [Epoch 35 | Step 44200] train/loss: 0.3902, train/lr: 0.0002 |
| [[34m2026-03-01 00:26:25[0m] [Epoch 35 | Step 44300] train/loss: 0.3898, train/lr: 0.0002 |
| [[34m2026-03-01 00:26:49[0m] [Epoch 35 | Step 44400] train/loss: 0.3948, train/lr: 0.0002 |
| [[34m2026-03-01 00:27:14[0m] [Epoch 35 | Step 44500] train/loss: 0.3924, train/lr: 0.0002 |
| [[34m2026-03-01 00:27:43[0m] [Epoch 35 | Step 44600] train/loss: 0.3884, train/lr: 0.0002 |
| [[34m2026-03-01 00:28:11[0m] [Epoch 35 | Step 44700] train/loss: 0.3899, train/lr: 0.0002 |
| [[34m2026-03-01 00:28:37[0m] [Epoch 35 | Step 44800] train/loss: 0.3948, train/lr: 0.0002 |
| [[34m2026-03-01 00:29:03[0m] [Epoch 35 | Step 44900] train/loss: 0.3905, train/lr: 0.0002 |
| [[34m2026-03-01 00:29:27[0m] [Epoch 35 | Step 45000] train/loss: 0.3927, train/lr: 0.0002 |
| [[34m2026-03-01 00:29:35[0m] [Epoch 35] epoch/loss: 0.3916 |
| [[34m2026-03-01 00:29:54[0m] [Epoch 36 | Step 45100] train/loss: 0.3905, train/lr: 0.0002 |
| [[34m2026-03-01 00:30:18[0m] [Epoch 36 | Step 45200] train/loss: 0.3915, train/lr: 0.0002 |
| [[34m2026-03-01 00:30:42[0m] [Epoch 36 | Step 45300] train/loss: 0.3887, train/lr: 0.0002 |
| [[34m2026-03-01 00:31:06[0m] [Epoch 36 | Step 45400] train/loss: 0.3880, train/lr: 0.0002 |
| [[34m2026-03-01 00:31:30[0m] [Epoch 36 | Step 45500] train/loss: 0.3927, train/lr: 0.0002 |
| [[34m2026-03-01 00:31:54[0m] [Epoch 36 | Step 45600] train/loss: 0.3920, train/lr: 0.0002 |
| [[34m2026-03-01 00:32:18[0m] [Epoch 36 | Step 45700] train/loss: 0.3914, train/lr: 0.0002 |
| [[34m2026-03-01 00:32:42[0m] [Epoch 36 | Step 45800] train/loss: 0.3930, train/lr: 0.0002 |
| [[34m2026-03-01 00:33:07[0m] [Epoch 36 | Step 45900] train/loss: 0.3900, train/lr: 0.0002 |
| [[34m2026-03-01 00:33:31[0m] [Epoch 36 | Step 46000] train/loss: 0.3881, train/lr: 0.0002 |
| [[34m2026-03-01 00:33:55[0m] [Epoch 36 | Step 46100] train/loss: 0.3912, train/lr: 0.0002 |
| [[34m2026-03-01 00:34:19[0m] [Epoch 36 | Step 46200] train/loss: 0.3909, train/lr: 0.0002 |
| [[34m2026-03-01 00:34:40[0m] [Epoch 36] epoch/loss: 0.3908 |
| [[34m2026-03-01 00:34:46[0m] [Epoch 37 | Step 46300] train/loss: 0.3908, train/lr: 0.0002 |
| [[34m2026-03-01 00:35:10[0m] [Epoch 37 | Step 46400] train/loss: 0.3898, train/lr: 0.0002 |
| [[34m2026-03-01 00:35:35[0m] [Epoch 37 | Step 46500] train/loss: 0.3894, train/lr: 0.0002 |
| [[34m2026-03-01 00:35:59[0m] [Epoch 37 | Step 46600] train/loss: 0.3913, train/lr: 0.0002 |
| [[34m2026-03-01 00:36:25[0m] [Epoch 37 | Step 46700] train/loss: 0.3901, train/lr: 0.0002 |
| [[34m2026-03-01 00:36:49[0m] [Epoch 37 | Step 46800] train/loss: 0.3899, train/lr: 0.0002 |
| [[34m2026-03-01 00:37:17[0m] [Epoch 37 | Step 46900] train/loss: 0.3922, train/lr: 0.0002 |
| [[34m2026-03-01 00:37:42[0m] [Epoch 37 | Step 47000] train/loss: 0.3902, train/lr: 0.0002 |
| [[34m2026-03-01 00:38:10[0m] [Epoch 37 | Step 47100] train/loss: 0.3907, train/lr: 0.0002 |
| [[34m2026-03-01 00:38:38[0m] [Epoch 37 | Step 47200] train/loss: 0.3893, train/lr: 0.0002 |
| [[34m2026-03-01 00:39:02[0m] [Epoch 37 | Step 47300] train/loss: 0.3925, train/lr: 0.0002 |
| [[34m2026-03-01 00:39:26[0m] [Epoch 37 | Step 47400] train/loss: 0.3880, train/lr: 0.0002 |
| [[34m2026-03-01 00:39:50[0m] [Epoch 37 | Step 47500] train/loss: 0.3889, train/lr: 0.0002 |
| [[34m2026-03-01 00:39:59[0m] [Epoch 37] epoch/loss: 0.3903 |
| [[34m2026-03-01 00:40:17[0m] [Epoch 38 | Step 47600] train/loss: 0.3911, train/lr: 0.0002 |
| [[34m2026-03-01 00:40:41[0m] [Epoch 38 | Step 47700] train/loss: 0.3885, train/lr: 0.0002 |
| [[34m2026-03-01 00:41:05[0m] [Epoch 38 | Step 47800] train/loss: 0.3891, train/lr: 0.0002 |
| [[34m2026-03-01 00:41:29[0m] [Epoch 38 | Step 47900] train/loss: 0.3897, train/lr: 0.0002 |
| [[34m2026-03-01 00:41:53[0m] [Epoch 38 | Step 48000] train/loss: 0.3894, train/lr: 0.0002 |
| [[34m2026-03-01 00:42:17[0m] [Epoch 38 | Step 48100] train/loss: 0.3917, train/lr: 0.0002 |
| [[34m2026-03-01 00:42:41[0m] [Epoch 38 | Step 48200] train/loss: 0.3895, train/lr: 0.0002 |
| [[34m2026-03-01 00:43:05[0m] [Epoch 38 | Step 48300] train/loss: 0.3893, train/lr: 0.0002 |
| [[34m2026-03-01 00:43:30[0m] [Epoch 38 | Step 48400] train/loss: 0.3857, train/lr: 0.0002 |
| [[34m2026-03-01 00:43:54[0m] [Epoch 38 | Step 48500] train/loss: 0.3914, train/lr: 0.0002 |
| [[34m2026-03-01 00:44:18[0m] [Epoch 38 | Step 48600] train/loss: 0.3925, train/lr: 0.0002 |
| [[34m2026-03-01 00:44:42[0m] [Epoch 38 | Step 48700] train/loss: 0.3886, train/lr: 0.0002 |
| [[34m2026-03-01 00:45:04[0m] [Epoch 38] epoch/loss: 0.3894 |
| [[34m2026-03-01 00:45:09[0m] [Epoch 39 | Step 48800] train/loss: 0.3878, train/lr: 0.0002 |
| [[34m2026-03-01 00:45:33[0m] [Epoch 39 | Step 48900] train/loss: 0.3880, train/lr: 0.0002 |
| [[34m2026-03-01 00:45:58[0m] [Epoch 39 | Step 49000] train/loss: 0.3883, train/lr: 0.0002 |
| [[34m2026-03-01 00:46:22[0m] [Epoch 39 | Step 49100] train/loss: 0.3883, train/lr: 0.0002 |
| [[34m2026-03-01 00:46:47[0m] [Epoch 39 | Step 49200] train/loss: 0.3891, train/lr: 0.0002 |
| [[34m2026-03-01 00:47:16[0m] [Epoch 39 | Step 49300] train/loss: 0.3885, train/lr: 0.0002 |
| [[34m2026-03-01 00:47:43[0m] [Epoch 39 | Step 49400] train/loss: 0.3889, train/lr: 0.0002 |
| [[34m2026-03-01 00:48:09[0m] [Epoch 39 | Step 49500] train/loss: 0.3900, train/lr: 0.0002 |
| [[34m2026-03-01 00:48:35[0m] [Epoch 39 | Step 49600] train/loss: 0.3886, train/lr: 0.0002 |
| [[34m2026-03-01 00:48:59[0m] [Epoch 39 | Step 49700] train/loss: 0.3915, train/lr: 0.0002 |
| [[34m2026-03-01 00:49:23[0m] [Epoch 39 | Step 49800] train/loss: 0.3905, train/lr: 0.0002 |
| [[34m2026-03-01 00:49:47[0m] [Epoch 39 | Step 49900] train/loss: 0.3919, train/lr: 0.0002 |
| [[34m2026-03-01 00:50:12[0m] [Epoch 39 | Step 50000] train/loss: 0.3847, train/lr: 0.0002 |
| [[34m2026-03-01 00:50:12[0m] Generating EMA samples... |
| [[34m2026-03-01 00:50:12[0m] Generating EMA samples done. |
| [[34m2026-03-01 00:50:22[0m] [Epoch 39] epoch/loss: 0.3892 |
| [[34m2026-03-01 00:50:22[0m] Saving checkpoint at epoch 40... |
| [[34m2026-03-01 00:50:47[0m] [Epoch 40 | Step 50100] train/loss: 0.3900, train/lr: 0.0002 |
| [[34m2026-03-01 00:51:11[0m] [Epoch 40 | Step 50200] train/loss: 0.3884, train/lr: 0.0002 |
| [[34m2026-03-01 00:51:35[0m] [Epoch 40 | Step 50300] train/loss: 0.3874, train/lr: 0.0002 |
| [[34m2026-03-01 00:51:59[0m] [Epoch 40 | Step 50400] train/loss: 0.3874, train/lr: 0.0002 |
| [[34m2026-03-01 00:52:23[0m] [Epoch 40 | Step 50500] train/loss: 0.3864, train/lr: 0.0002 |
| [[34m2026-03-01 00:52:47[0m] [Epoch 40 | Step 50600] train/loss: 0.3892, train/lr: 0.0002 |
| [[34m2026-03-01 00:53:11[0m] [Epoch 40 | Step 50700] train/loss: 0.3896, train/lr: 0.0002 |
| [[34m2026-03-01 00:53:36[0m] [Epoch 40 | Step 50800] train/loss: 0.3880, train/lr: 0.0002 |
| [[34m2026-03-01 00:54:00[0m] [Epoch 40 | Step 50900] train/loss: 0.3881, train/lr: 0.0002 |
| [[34m2026-03-01 00:54:24[0m] [Epoch 40 | Step 51000] train/loss: 0.3868, train/lr: 0.0002 |
| [[34m2026-03-01 00:54:48[0m] [Epoch 40 | Step 51100] train/loss: 0.3874, train/lr: 0.0002 |
| [[34m2026-03-01 00:55:13[0m] [Epoch 40 | Step 51200] train/loss: 0.3874, train/lr: 0.0002 |
| [[34m2026-03-01 00:55:35[0m] [Epoch 40] epoch/loss: 0.3879 |
| [[34m2026-03-01 00:55:40[0m] [Epoch 41 | Step 51300] train/loss: 0.3891, train/lr: 0.0002 |
| [[34m2026-03-01 00:56:05[0m] [Epoch 41 | Step 51400] train/loss: 0.3898, train/lr: 0.0002 |
| [[34m2026-03-01 00:56:30[0m] [Epoch 41 | Step 51500] train/loss: 0.3892, train/lr: 0.0002 |
| [[34m2026-03-01 00:56:56[0m] [Epoch 41 | Step 51600] train/loss: 0.3896, train/lr: 0.0002 |
| [[34m2026-03-01 00:57:22[0m] [Epoch 41 | Step 51700] train/loss: 0.3875, train/lr: 0.0002 |
| [[34m2026-03-01 00:57:50[0m] [Epoch 41 | Step 51800] train/loss: 0.3902, train/lr: 0.0002 |
| [[34m2026-03-01 00:58:17[0m] [Epoch 41 | Step 51900] train/loss: 0.3906, train/lr: 0.0002 |
| [[34m2026-03-01 00:58:41[0m] [Epoch 41 | Step 52000] train/loss: 0.3885, train/lr: 0.0002 |
| [[34m2026-03-01 00:59:05[0m] [Epoch 41 | Step 52100] train/loss: 0.3854, train/lr: 0.0002 |
| [[34m2026-03-01 00:59:29[0m] [Epoch 41 | Step 52200] train/loss: 0.3856, train/lr: 0.0002 |
| [[34m2026-03-01 00:59:54[0m] [Epoch 41 | Step 52300] train/loss: 0.3854, train/lr: 0.0002 |
| [[34m2026-03-01 01:00:18[0m] [Epoch 41 | Step 52400] train/loss: 0.3906, train/lr: 0.0002 |
| [[34m2026-03-01 01:00:42[0m] [Epoch 41 | Step 52500] train/loss: 0.3855, train/lr: 0.0002 |
| [[34m2026-03-01 01:00:52[0m] [Epoch 41] epoch/loss: 0.3882 |
| [[34m2026-03-01 01:01:09[0m] [Epoch 42 | Step 52600] train/loss: 0.3869, train/lr: 0.0002 |
| [[34m2026-03-01 01:01:33[0m] [Epoch 42 | Step 52700] train/loss: 0.3891, train/lr: 0.0002 |
| [[34m2026-03-01 01:01:57[0m] [Epoch 42 | Step 52800] train/loss: 0.3915, train/lr: 0.0002 |
| [[34m2026-03-01 01:02:21[0m] [Epoch 42 | Step 52900] train/loss: 0.3863, train/lr: 0.0002 |
| [[34m2026-03-01 01:02:45[0m] [Epoch 42 | Step 53000] train/loss: 0.3921, train/lr: 0.0002 |
| [[34m2026-03-01 01:03:09[0m] [Epoch 42 | Step 53100] train/loss: 0.3875, train/lr: 0.0002 |
| [[34m2026-03-01 01:03:33[0m] [Epoch 42 | Step 53200] train/loss: 0.3885, train/lr: 0.0002 |
| [[34m2026-03-01 01:03:57[0m] [Epoch 42 | Step 53300] train/loss: 0.3886, train/lr: 0.0002 |
| [[34m2026-03-01 01:04:21[0m] [Epoch 42 | Step 53400] train/loss: 0.3886, train/lr: 0.0002 |
| [[34m2026-03-01 01:04:46[0m] [Epoch 42 | Step 53500] train/loss: 0.3876, train/lr: 0.0002 |
| [[34m2026-03-01 01:05:10[0m] [Epoch 42 | Step 53600] train/loss: 0.3882, train/lr: 0.0002 |
| [[34m2026-03-01 01:05:35[0m] [Epoch 42 | Step 53700] train/loss: 0.3887, train/lr: 0.0002 |
| [[34m2026-03-01 01:05:58[0m] [Epoch 42] epoch/loss: 0.3885 |
| [[34m2026-03-01 01:06:03[0m] [Epoch 43 | Step 53800] train/loss: 0.3869, train/lr: 0.0002 |
| [[34m2026-03-01 01:06:27[0m] [Epoch 43 | Step 53900] train/loss: 0.3856, train/lr: 0.0002 |
| [[34m2026-03-01 01:06:53[0m] [Epoch 43 | Step 54000] train/loss: 0.3855, train/lr: 0.0002 |
| [[34m2026-03-01 01:07:22[0m] [Epoch 43 | Step 54100] train/loss: 0.3832, train/lr: 0.0002 |
| [[34m2026-03-01 01:07:49[0m] [Epoch 43 | Step 54200] train/loss: 0.3845, train/lr: 0.0002 |
| [[34m2026-03-01 01:08:14[0m] [Epoch 43 | Step 54300] train/loss: 0.3865, train/lr: 0.0002 |
| [[34m2026-03-01 01:08:38[0m] [Epoch 43 | Step 54400] train/loss: 0.3853, train/lr: 0.0002 |
| [[34m2026-03-01 01:09:02[0m] [Epoch 43 | Step 54500] train/loss: 0.3873, train/lr: 0.0002 |
| [[34m2026-03-01 01:09:27[0m] [Epoch 43 | Step 54600] train/loss: 0.3847, train/lr: 0.0002 |
| [[34m2026-03-01 01:09:51[0m] [Epoch 43 | Step 54700] train/loss: 0.3888, train/lr: 0.0002 |
| [[34m2026-03-01 01:10:15[0m] [Epoch 43 | Step 54800] train/loss: 0.3866, train/lr: 0.0002 |
| [[34m2026-03-01 01:10:39[0m] [Epoch 43 | Step 54900] train/loss: 0.3871, train/lr: 0.0002 |
| [[34m2026-03-01 01:11:03[0m] [Epoch 43 | Step 55000] train/loss: 0.3893, train/lr: 0.0002 |
| [[34m2026-03-01 01:11:14[0m] [Epoch 43] epoch/loss: 0.3862 |
| [[34m2026-03-01 01:11:30[0m] [Epoch 44 | Step 55100] train/loss: 0.3868, train/lr: 0.0002 |
| [[34m2026-03-01 01:11:54[0m] [Epoch 44 | Step 55200] train/loss: 0.3854, train/lr: 0.0002 |
| [[34m2026-03-01 01:12:18[0m] [Epoch 44 | Step 55300] train/loss: 0.3882, train/lr: 0.0002 |
| [[34m2026-03-01 01:12:42[0m] [Epoch 44 | Step 55400] train/loss: 0.3871, train/lr: 0.0002 |
| [[34m2026-03-01 01:13:06[0m] [Epoch 44 | Step 55500] train/loss: 0.3831, train/lr: 0.0002 |
| [[34m2026-03-01 01:13:31[0m] [Epoch 44 | Step 55600] train/loss: 0.3854, train/lr: 0.0002 |
| [[34m2026-03-01 01:13:55[0m] [Epoch 44 | Step 55700] train/loss: 0.3852, train/lr: 0.0002 |
| [[34m2026-03-01 01:14:19[0m] [Epoch 44 | Step 55800] train/loss: 0.3858, train/lr: 0.0002 |
| [[34m2026-03-01 01:14:44[0m] [Epoch 44 | Step 55900] train/loss: 0.3887, train/lr: 0.0002 |
| [[34m2026-03-01 01:15:08[0m] [Epoch 44 | Step 56000] train/loss: 0.3852, train/lr: 0.0002 |
| [[34m2026-03-01 01:15:34[0m] [Epoch 44 | Step 56100] train/loss: 0.3850, train/lr: 0.0002 |
| [[34m2026-03-01 01:15:58[0m] [Epoch 44 | Step 56200] train/loss: 0.3860, train/lr: 0.0002 |
| [[34m2026-03-01 01:16:23[0m] [Epoch 44] epoch/loss: 0.3857 |
| [[34m2026-03-01 01:16:27[0m] [Epoch 45 | Step 56300] train/loss: 0.3832, train/lr: 0.0002 |
| [[34m2026-03-01 01:16:54[0m] [Epoch 45 | Step 56400] train/loss: 0.3864, train/lr: 0.0002 |
| [[34m2026-03-01 01:17:21[0m] [Epoch 45 | Step 56500] train/loss: 0.3864, train/lr: 0.0002 |
| [[34m2026-03-01 01:17:47[0m] [Epoch 45 | Step 56600] train/loss: 0.3866, train/lr: 0.0002 |
| [[34m2026-03-01 01:18:12[0m] [Epoch 45 | Step 56700] train/loss: 0.3847, train/lr: 0.0002 |
| [[34m2026-03-01 01:18:36[0m] [Epoch 45 | Step 56800] train/loss: 0.3879, train/lr: 0.0002 |
| [[34m2026-03-01 01:19:00[0m] [Epoch 45 | Step 56900] train/loss: 0.3836, train/lr: 0.0002 |
| [[34m2026-03-01 01:19:24[0m] [Epoch 45 | Step 57000] train/loss: 0.3873, train/lr: 0.0002 |
| [[34m2026-03-01 01:19:48[0m] [Epoch 45 | Step 57100] train/loss: 0.3855, train/lr: 0.0002 |
| [[34m2026-03-01 01:20:13[0m] [Epoch 45 | Step 57200] train/loss: 0.3874, train/lr: 0.0002 |
| [[34m2026-03-01 01:20:37[0m] [Epoch 45 | Step 57300] train/loss: 0.3864, train/lr: 0.0002 |
| [[34m2026-03-01 01:21:01[0m] [Epoch 45 | Step 57400] train/loss: 0.3838, train/lr: 0.0002 |
| [[34m2026-03-01 01:21:25[0m] [Epoch 45 | Step 57500] train/loss: 0.3841, train/lr: 0.0002 |
| [[34m2026-03-01 01:21:36[0m] [Epoch 45] epoch/loss: 0.3857 |
| [[34m2026-03-01 01:21:52[0m] [Epoch 46 | Step 57600] train/loss: 0.3856, train/lr: 0.0002 |
| [[34m2026-03-01 01:22:16[0m] [Epoch 46 | Step 57700] train/loss: 0.3809, train/lr: 0.0002 |
| [[34m2026-03-01 01:22:40[0m] [Epoch 46 | Step 57800] train/loss: 0.3834, train/lr: 0.0002 |
| [[34m2026-03-01 01:23:04[0m] [Epoch 46 | Step 57900] train/loss: 0.3832, train/lr: 0.0002 |
| [[34m2026-03-01 01:23:29[0m] [Epoch 46 | Step 58000] train/loss: 0.3852, train/lr: 0.0002 |
| [[34m2026-03-01 01:23:53[0m] [Epoch 46 | Step 58100] train/loss: 0.3869, train/lr: 0.0002 |
| [[34m2026-03-01 01:24:18[0m] [Epoch 46 | Step 58200] train/loss: 0.3811, train/lr: 0.0002 |
| [[34m2026-03-01 01:24:42[0m] [Epoch 46 | Step 58300] train/loss: 0.3850, train/lr: 0.0002 |
| [[34m2026-03-01 01:25:07[0m] [Epoch 46 | Step 58400] train/loss: 0.3863, train/lr: 0.0002 |
| [[34m2026-03-01 01:25:32[0m] [Epoch 46 | Step 58500] train/loss: 0.3885, train/lr: 0.0002 |
| [[34m2026-03-01 01:25:57[0m] [Epoch 46 | Step 58600] train/loss: 0.3847, train/lr: 0.0002 |
| [[34m2026-03-01 01:26:24[0m] [Epoch 46 | Step 58700] train/loss: 0.3844, train/lr: 0.0002 |
| [[34m2026-03-01 01:26:50[0m] [Epoch 46] epoch/loss: 0.3848 |
| [[34m2026-03-01 01:26:53[0m] [Epoch 47 | Step 58800] train/loss: 0.3863, train/lr: 0.0002 |
| [[34m2026-03-01 01:27:22[0m] [Epoch 47 | Step 58900] train/loss: 0.3849, train/lr: 0.0002 |
| [[34m2026-03-01 01:27:47[0m] [Epoch 47 | Step 59000] train/loss: 0.3856, train/lr: 0.0002 |
| [[34m2026-03-01 01:28:11[0m] [Epoch 47 | Step 59100] train/loss: 0.3814, train/lr: 0.0002 |
| [[34m2026-03-01 01:28:35[0m] [Epoch 47 | Step 59200] train/loss: 0.3816, train/lr: 0.0002 |
| [[34m2026-03-01 01:28:59[0m] [Epoch 47 | Step 59300] train/loss: 0.3828, train/lr: 0.0002 |
| [[34m2026-03-01 01:29:23[0m] [Epoch 47 | Step 59400] train/loss: 0.3884, train/lr: 0.0002 |
| [[34m2026-03-01 01:29:47[0m] [Epoch 47 | Step 59500] train/loss: 0.3810, train/lr: 0.0002 |
| [[34m2026-03-01 01:30:12[0m] [Epoch 47 | Step 59600] train/loss: 0.3827, train/lr: 0.0002 |
| [[34m2026-03-01 01:30:36[0m] [Epoch 47 | Step 59700] train/loss: 0.3849, train/lr: 0.0002 |
| [[34m2026-03-01 01:31:00[0m] [Epoch 47 | Step 59800] train/loss: 0.3805, train/lr: 0.0002 |
| [[34m2026-03-01 01:31:24[0m] [Epoch 47 | Step 59900] train/loss: 0.3849, train/lr: 0.0002 |
| [[34m2026-03-01 01:31:48[0m] [Epoch 47 | Step 60000] train/loss: 0.3836, train/lr: 0.0002 |
| [[34m2026-03-01 01:31:48[0m] Generating EMA samples... |
| [[34m2026-03-01 01:31:49[0m] Generating EMA samples done. |
| [[34m2026-03-01 01:32:00[0m] [Epoch 47] epoch/loss: 0.3834 |
| [[34m2026-03-01 01:32:15[0m] [Epoch 48 | Step 60100] train/loss: 0.3812, train/lr: 0.0002 |
| [[34m2026-03-01 01:32:40[0m] [Epoch 48 | Step 60200] train/loss: 0.3855, train/lr: 0.0002 |
| [[34m2026-03-01 01:33:04[0m] [Epoch 48 | Step 60300] train/loss: 0.3820, train/lr: 0.0002 |
| [[34m2026-03-01 01:33:28[0m] [Epoch 48 | Step 60400] train/loss: 0.3833, train/lr: 0.0002 |
| [[34m2026-03-01 01:33:52[0m] [Epoch 48 | Step 60500] train/loss: 0.3841, train/lr: 0.0002 |
| [[34m2026-03-01 01:34:17[0m] [Epoch 48 | Step 60600] train/loss: 0.3841, train/lr: 0.0002 |
| [[34m2026-03-01 01:34:41[0m] [Epoch 48 | Step 60700] train/loss: 0.3833, train/lr: 0.0002 |
| [[34m2026-03-01 01:35:07[0m] [Epoch 48 | Step 60800] train/loss: 0.3850, train/lr: 0.0002 |
| [[34m2026-03-01 01:35:31[0m] [Epoch 48 | Step 60900] train/loss: 0.3831, train/lr: 0.0002 |
| [[34m2026-03-01 01:35:58[0m] [Epoch 48 | Step 61000] train/loss: 0.3835, train/lr: 0.0002 |
| [[34m2026-03-01 01:36:24[0m] [Epoch 48 | Step 61100] train/loss: 0.3839, train/lr: 0.0002 |
| [[34m2026-03-01 01:36:52[0m] [Epoch 48 | Step 61200] train/loss: 0.3847, train/lr: 0.0002 |
| [[34m2026-03-01 01:37:18[0m] [Epoch 48] epoch/loss: 0.3837 |
| [[34m2026-03-01 01:37:21[0m] [Epoch 49 | Step 61300] train/loss: 0.3831, train/lr: 0.0002 |
| [[34m2026-03-01 01:37:46[0m] [Epoch 49 | Step 61400] train/loss: 0.3824, train/lr: 0.0002 |
| [[34m2026-03-01 01:38:10[0m] [Epoch 49 | Step 61500] train/loss: 0.3848, train/lr: 0.0002 |
| [[34m2026-03-01 01:38:34[0m] [Epoch 49 | Step 61600] train/loss: 0.3821, train/lr: 0.0002 |
| [[34m2026-03-01 01:38:58[0m] [Epoch 49 | Step 61700] train/loss: 0.3828, train/lr: 0.0002 |
| [[34m2026-03-01 01:39:22[0m] [Epoch 49 | Step 61800] train/loss: 0.3833, train/lr: 0.0002 |
| [[34m2026-03-01 01:39:46[0m] [Epoch 49 | Step 61900] train/loss: 0.3830, train/lr: 0.0002 |
| [[34m2026-03-01 01:40:10[0m] [Epoch 49 | Step 62000] train/loss: 0.3855, train/lr: 0.0002 |
| [[34m2026-03-01 01:40:34[0m] [Epoch 49 | Step 62100] train/loss: 0.3823, train/lr: 0.0002 |
| [[34m2026-03-01 01:40:59[0m] [Epoch 49 | Step 62200] train/loss: 0.3842, train/lr: 0.0002 |
| [[34m2026-03-01 01:41:23[0m] [Epoch 49 | Step 62300] train/loss: 0.3837, train/lr: 0.0002 |
| [[34m2026-03-01 01:41:47[0m] [Epoch 49 | Step 62400] train/loss: 0.3850, train/lr: 0.0002 |
| [[34m2026-03-01 01:42:11[0m] [Epoch 49 | Step 62500] train/loss: 0.3804, train/lr: 0.0002 |
| [[34m2026-03-01 01:42:23[0m] [Epoch 49] epoch/loss: 0.3832 |
| [[34m2026-03-01 01:42:23[0m] Saving checkpoint at epoch 50... |
| [[34m2026-03-01 01:42:46[0m] [Epoch 50 | Step 62600] train/loss: 0.3826, train/lr: 0.0002 |
| [[34m2026-03-01 01:43:10[0m] [Epoch 50 | Step 62700] train/loss: 0.3820, train/lr: 0.0002 |
| [[34m2026-03-01 01:43:34[0m] [Epoch 50 | Step 62800] train/loss: 0.3834, train/lr: 0.0002 |
| [[34m2026-03-01 01:43:59[0m] [Epoch 50 | Step 62900] train/loss: 0.3808, train/lr: 0.0002 |
| [[34m2026-03-01 01:44:23[0m] [Epoch 50 | Step 63000] train/loss: 0.3849, train/lr: 0.0002 |
| [[34m2026-03-01 01:44:49[0m] [Epoch 50 | Step 63100] train/loss: 0.3853, train/lr: 0.0002 |
| [[34m2026-03-01 01:45:13[0m] [Epoch 50 | Step 63200] train/loss: 0.3807, train/lr: 0.0002 |
| [[34m2026-03-01 01:45:38[0m] [Epoch 50 | Step 63300] train/loss: 0.3819, train/lr: 0.0002 |
| [[34m2026-03-01 01:46:06[0m] [Epoch 50 | Step 63400] train/loss: 0.3813, train/lr: 0.0002 |
| [[34m2026-03-01 01:46:31[0m] [Epoch 50 | Step 63500] train/loss: 0.3869, train/lr: 0.0002 |
| [[34m2026-03-01 01:46:59[0m] [Epoch 50 | Step 63600] train/loss: 0.3837, train/lr: 0.0002 |
| [[34m2026-03-01 01:47:25[0m] [Epoch 50 | Step 63700] train/loss: 0.3845, train/lr: 0.0002 |
| [[34m2026-03-01 01:47:49[0m] [Epoch 50 | Step 63800] train/loss: 0.3841, train/lr: 0.0002 |
| [[34m2026-03-01 01:47:49[0m] [Epoch 50] epoch/loss: 0.3833 |
| [[34m2026-03-01 01:48:16[0m] [Epoch 51 | Step 63900] train/loss: 0.3844, train/lr: 0.0002 |
| [[34m2026-03-01 01:48:40[0m] [Epoch 51 | Step 64000] train/loss: 0.3819, train/lr: 0.0002 |
| [[34m2026-03-01 01:49:04[0m] [Epoch 51 | Step 64100] train/loss: 0.3820, train/lr: 0.0002 |
| [[34m2026-03-01 01:49:28[0m] [Epoch 51 | Step 64200] train/loss: 0.3845, train/lr: 0.0002 |
| [[34m2026-03-01 01:49:52[0m] [Epoch 51 | Step 64300] train/loss: 0.3814, train/lr: 0.0002 |
| [[34m2026-03-01 01:50:16[0m] [Epoch 51 | Step 64400] train/loss: 0.3819, train/lr: 0.0002 |
| [[34m2026-03-01 01:50:40[0m] [Epoch 51 | Step 64500] train/loss: 0.3801, train/lr: 0.0002 |
| [[34m2026-03-01 01:51:04[0m] [Epoch 51 | Step 64600] train/loss: 0.3804, train/lr: 0.0002 |
| [[34m2026-03-01 01:51:28[0m] [Epoch 51 | Step 64700] train/loss: 0.3822, train/lr: 0.0002 |
| [[34m2026-03-01 01:51:52[0m] [Epoch 51 | Step 64800] train/loss: 0.3850, train/lr: 0.0002 |
| [[34m2026-03-01 01:52:16[0m] [Epoch 51 | Step 64900] train/loss: 0.3816, train/lr: 0.0002 |
| [[34m2026-03-01 01:52:40[0m] [Epoch 51 | Step 65000] train/loss: 0.3807, train/lr: 0.0002 |
| [[34m2026-03-01 01:52:53[0m] [Epoch 51] epoch/loss: 0.3820 |
| [[34m2026-03-01 01:53:07[0m] [Epoch 52 | Step 65100] train/loss: 0.3787, train/lr: 0.0002 |
| [[34m2026-03-01 01:53:31[0m] [Epoch 52 | Step 65200] train/loss: 0.3842, train/lr: 0.0002 |
| [[34m2026-03-01 01:53:56[0m] [Epoch 52 | Step 65300] train/loss: 0.3796, train/lr: 0.0002 |
| [[34m2026-03-01 01:54:21[0m] [Epoch 52 | Step 65400] train/loss: 0.3802, train/lr: 0.0002 |
| [[34m2026-03-01 01:54:46[0m] [Epoch 52 | Step 65500] train/loss: 0.3788, train/lr: 0.0002 |
| [[34m2026-03-01 01:55:10[0m] [Epoch 52 | Step 65600] train/loss: 0.3847, train/lr: 0.0002 |
| [[34m2026-03-01 01:55:37[0m] [Epoch 52 | Step 65700] train/loss: 0.3827, train/lr: 0.0002 |
| [[34m2026-03-01 01:56:03[0m] [Epoch 52 | Step 65800] train/loss: 0.3814, train/lr: 0.0002 |
| [[34m2026-03-01 01:56:30[0m] [Epoch 52 | Step 65900] train/loss: 0.3806, train/lr: 0.0002 |
| [[34m2026-03-01 01:56:57[0m] [Epoch 52 | Step 66000] train/loss: 0.3816, train/lr: 0.0002 |
| [[34m2026-03-01 01:57:22[0m] [Epoch 52 | Step 66100] train/loss: 0.3815, train/lr: 0.0002 |
| [[34m2026-03-01 01:57:46[0m] [Epoch 52 | Step 66200] train/loss: 0.3811, train/lr: 0.0002 |
| [[34m2026-03-01 01:58:10[0m] [Epoch 52 | Step 66300] train/loss: 0.3833, train/lr: 0.0002 |
| [[34m2026-03-01 01:58:11[0m] [Epoch 52] epoch/loss: 0.3816 |
| [[34m2026-03-01 01:58:37[0m] [Epoch 53 | Step 66400] train/loss: 0.3816, train/lr: 0.0002 |
| [[34m2026-03-01 01:59:01[0m] [Epoch 53 | Step 66500] train/loss: 0.3786, train/lr: 0.0002 |
| [[34m2026-03-01 01:59:25[0m] [Epoch 53 | Step 66600] train/loss: 0.3837, train/lr: 0.0002 |
| [[34m2026-03-01 01:59:49[0m] [Epoch 53 | Step 66700] train/loss: 0.3815, train/lr: 0.0002 |
| [[34m2026-03-01 02:00:13[0m] [Epoch 53 | Step 66800] train/loss: 0.3790, train/lr: 0.0002 |
| [[34m2026-03-01 02:00:37[0m] [Epoch 53 | Step 66900] train/loss: 0.3800, train/lr: 0.0002 |
| [[34m2026-03-01 02:01:02[0m] [Epoch 53 | Step 67000] train/loss: 0.3835, train/lr: 0.0002 |
| [[34m2026-03-01 02:01:26[0m] [Epoch 53 | Step 67100] train/loss: 0.3839, train/lr: 0.0002 |
| [[34m2026-03-01 02:01:50[0m] [Epoch 53 | Step 67200] train/loss: 0.3783, train/lr: 0.0002 |
| [[34m2026-03-01 02:02:14[0m] [Epoch 53 | Step 67300] train/loss: 0.3789, train/lr: 0.0002 |
| [[34m2026-03-01 02:02:38[0m] [Epoch 53 | Step 67400] train/loss: 0.3826, train/lr: 0.0002 |
| [[34m2026-03-01 02:03:02[0m] [Epoch 53 | Step 67500] train/loss: 0.3835, train/lr: 0.0002 |
| [[34m2026-03-01 02:03:15[0m] [Epoch 53] epoch/loss: 0.3814 |
| [[34m2026-03-01 02:03:29[0m] [Epoch 54 | Step 67600] train/loss: 0.3835, train/lr: 0.0002 |
| [[34m2026-03-01 02:03:54[0m] [Epoch 54 | Step 67700] train/loss: 0.3817, train/lr: 0.0002 |
| [[34m2026-03-01 02:04:20[0m] [Epoch 54 | Step 67800] train/loss: 0.3791, train/lr: 0.0002 |
| [[34m2026-03-01 02:04:44[0m] [Epoch 54 | Step 67900] train/loss: 0.3798, train/lr: 0.0002 |
| [[34m2026-03-01 02:05:09[0m] [Epoch 54 | Step 68000] train/loss: 0.3822, train/lr: 0.0002 |
| [[34m2026-03-01 02:05:36[0m] [Epoch 54 | Step 68100] train/loss: 0.3830, train/lr: 0.0002 |
| [[34m2026-03-01 02:06:02[0m] [Epoch 54 | Step 68200] train/loss: 0.3840, train/lr: 0.0002 |
| [[34m2026-03-01 02:06:30[0m] [Epoch 54 | Step 68300] train/loss: 0.3835, train/lr: 0.0002 |
| [[34m2026-03-01 02:06:56[0m] [Epoch 54 | Step 68400] train/loss: 0.3784, train/lr: 0.0002 |
| [[34m2026-03-01 02:07:20[0m] [Epoch 54 | Step 68500] train/loss: 0.3816, train/lr: 0.0002 |
| [[34m2026-03-01 02:07:44[0m] [Epoch 54 | Step 68600] train/loss: 0.3829, train/lr: 0.0002 |
| [[34m2026-03-01 02:08:08[0m] [Epoch 54 | Step 68700] train/loss: 0.3817, train/lr: 0.0002 |
| [[34m2026-03-01 02:08:32[0m] [Epoch 54 | Step 68800] train/loss: 0.3803, train/lr: 0.0002 |
| [[34m2026-03-01 02:08:34[0m] [Epoch 54] epoch/loss: 0.3816 |
| [[34m2026-03-01 02:08:59[0m] [Epoch 55 | Step 68900] train/loss: 0.3817, train/lr: 0.0002 |
| [[34m2026-03-01 02:09:23[0m] [Epoch 55 | Step 69000] train/loss: 0.3787, train/lr: 0.0002 |
| [[34m2026-03-01 02:09:47[0m] [Epoch 55 | Step 69100] train/loss: 0.3798, train/lr: 0.0002 |
| [[34m2026-03-01 02:10:11[0m] [Epoch 55 | Step 69200] train/loss: 0.3810, train/lr: 0.0002 |
| [[34m2026-03-01 02:10:35[0m] [Epoch 55 | Step 69300] train/loss: 0.3827, train/lr: 0.0002 |
| [[34m2026-03-01 02:11:00[0m] [Epoch 55 | Step 69400] train/loss: 0.3803, train/lr: 0.0002 |
| [[34m2026-03-01 02:11:24[0m] [Epoch 55 | Step 69500] train/loss: 0.3817, train/lr: 0.0002 |
| [[34m2026-03-01 02:11:48[0m] [Epoch 55 | Step 69600] train/loss: 0.3805, train/lr: 0.0002 |
| [[34m2026-03-01 02:12:12[0m] [Epoch 55 | Step 69700] train/loss: 0.3859, train/lr: 0.0002 |
| [[34m2026-03-01 02:12:36[0m] [Epoch 55 | Step 69800] train/loss: 0.3803, train/lr: 0.0002 |
| [[34m2026-03-01 02:13:00[0m] [Epoch 55 | Step 69900] train/loss: 0.3812, train/lr: 0.0002 |
| [[34m2026-03-01 02:13:25[0m] [Epoch 55 | Step 70000] train/loss: 0.3812, train/lr: 0.0002 |
| [[34m2026-03-01 02:13:25[0m] Generating EMA samples... |
| [[34m2026-03-01 02:13:26[0m] Generating EMA samples done. |
| [[34m2026-03-01 02:13:39[0m] [Epoch 55] epoch/loss: 0.3812 |
| [[34m2026-03-01 02:13:53[0m] [Epoch 56 | Step 70100] train/loss: 0.3827, train/lr: 0.0002 |
| [[34m2026-03-01 02:14:18[0m] [Epoch 56 | Step 70200] train/loss: 0.3794, train/lr: 0.0002 |
| [[34m2026-03-01 02:14:42[0m] [Epoch 56 | Step 70300] train/loss: 0.3822, train/lr: 0.0002 |
| [[34m2026-03-01 02:15:09[0m] [Epoch 56 | Step 70400] train/loss: 0.3800, train/lr: 0.0002 |
| [[34m2026-03-01 02:15:35[0m] [Epoch 56 | Step 70500] train/loss: 0.3797, train/lr: 0.0002 |
| [[34m2026-03-01 02:16:02[0m] [Epoch 56 | Step 70600] train/loss: 0.3830, train/lr: 0.0002 |
| [[34m2026-03-01 02:16:29[0m] [Epoch 56 | Step 70700] train/loss: 0.3826, train/lr: 0.0002 |
| [[34m2026-03-01 02:16:54[0m] [Epoch 56 | Step 70800] train/loss: 0.3817, train/lr: 0.0002 |
| [[34m2026-03-01 02:17:18[0m] [Epoch 56 | Step 70900] train/loss: 0.3801, train/lr: 0.0002 |
| [[34m2026-03-01 02:17:42[0m] [Epoch 56 | Step 71000] train/loss: 0.3808, train/lr: 0.0002 |
| [[34m2026-03-01 02:18:06[0m] [Epoch 56 | Step 71100] train/loss: 0.3812, train/lr: 0.0002 |
| [[34m2026-03-01 02:18:30[0m] [Epoch 56 | Step 71200] train/loss: 0.3840, train/lr: 0.0002 |
| [[34m2026-03-01 02:18:54[0m] [Epoch 56 | Step 71300] train/loss: 0.3821, train/lr: 0.0002 |
| [[34m2026-03-01 02:18:56[0m] [Epoch 56] epoch/loss: 0.3816 |
| [[34m2026-03-01 02:19:21[0m] [Epoch 57 | Step 71400] train/loss: 0.3829, train/lr: 0.0002 |
| [[34m2026-03-01 02:19:46[0m] [Epoch 57 | Step 71500] train/loss: 0.3770, train/lr: 0.0002 |
| [[34m2026-03-01 02:20:10[0m] [Epoch 57 | Step 71600] train/loss: 0.3791, train/lr: 0.0002 |
| [[34m2026-03-01 02:20:34[0m] [Epoch 57 | Step 71700] train/loss: 0.3816, train/lr: 0.0002 |
| [[34m2026-03-01 02:20:58[0m] [Epoch 57 | Step 71800] train/loss: 0.3801, train/lr: 0.0002 |
| [[34m2026-03-01 02:21:22[0m] [Epoch 57 | Step 71900] train/loss: 0.3802, train/lr: 0.0002 |
| [[34m2026-03-01 02:21:46[0m] [Epoch 57 | Step 72000] train/loss: 0.3763, train/lr: 0.0002 |
| [[34m2026-03-01 02:22:10[0m] [Epoch 57 | Step 72100] train/loss: 0.3780, train/lr: 0.0002 |
| [[34m2026-03-01 02:22:34[0m] [Epoch 57 | Step 72200] train/loss: 0.3799, train/lr: 0.0002 |
| [[34m2026-03-01 02:22:59[0m] [Epoch 57 | Step 72300] train/loss: 0.3786, train/lr: 0.0002 |
| [[34m2026-03-01 02:23:23[0m] [Epoch 57 | Step 72400] train/loss: 0.3825, train/lr: 0.0002 |
| [[34m2026-03-01 02:23:47[0m] [Epoch 57 | Step 72500] train/loss: 0.3763, train/lr: 0.0002 |
| [[34m2026-03-01 02:24:03[0m] [Epoch 57] epoch/loss: 0.3793 |
| [[34m2026-03-01 02:24:16[0m] [Epoch 58 | Step 72600] train/loss: 0.3806, train/lr: 0.0002 |
| [[34m2026-03-01 02:24:41[0m] [Epoch 58 | Step 72700] train/loss: 0.3783, train/lr: 0.0002 |
| [[34m2026-03-01 02:25:08[0m] [Epoch 58 | Step 72800] train/loss: 0.3787, train/lr: 0.0002 |
| [[34m2026-03-01 02:25:34[0m] [Epoch 58 | Step 72900] train/loss: 0.3811, train/lr: 0.0002 |
| [[34m2026-03-01 02:26:02[0m] [Epoch 58 | Step 73000] train/loss: 0.3820, train/lr: 0.0002 |
| [[34m2026-03-01 02:26:27[0m] [Epoch 58 | Step 73100] train/loss: 0.3807, train/lr: 0.0002 |
| [[34m2026-03-01 02:26:51[0m] [Epoch 58 | Step 73200] train/loss: 0.3817, train/lr: 0.0002 |
| [[34m2026-03-01 02:27:15[0m] [Epoch 58 | Step 73300] train/loss: 0.3778, train/lr: 0.0002 |
| [[34m2026-03-01 02:27:40[0m] [Epoch 58 | Step 73400] train/loss: 0.3792, train/lr: 0.0002 |
| [[34m2026-03-01 02:28:04[0m] [Epoch 58 | Step 73500] train/loss: 0.3778, train/lr: 0.0002 |
| [[34m2026-03-01 02:28:28[0m] [Epoch 58 | Step 73600] train/loss: 0.3764, train/lr: 0.0002 |
| [[34m2026-03-01 02:28:52[0m] [Epoch 58 | Step 73700] train/loss: 0.3833, train/lr: 0.0002 |
| [[34m2026-03-01 02:29:16[0m] [Epoch 58 | Step 73800] train/loss: 0.3816, train/lr: 0.0002 |
| [[34m2026-03-01 02:29:18[0m] [Epoch 58] epoch/loss: 0.3800 |
| [[34m2026-03-01 02:29:43[0m] [Epoch 59 | Step 73900] train/loss: 0.3790, train/lr: 0.0002 |
| [[34m2026-03-01 02:30:07[0m] [Epoch 59 | Step 74000] train/loss: 0.3788, train/lr: 0.0002 |
| [[34m2026-03-01 02:30:31[0m] [Epoch 59 | Step 74100] train/loss: 0.3786, train/lr: 0.0002 |
| [[34m2026-03-01 02:30:55[0m] [Epoch 59 | Step 74200] train/loss: 0.3822, train/lr: 0.0002 |
| [[34m2026-03-01 02:31:19[0m] [Epoch 59 | Step 74300] train/loss: 0.3801, train/lr: 0.0002 |
| [[34m2026-03-01 02:31:43[0m] [Epoch 59 | Step 74400] train/loss: 0.3762, train/lr: 0.0002 |
| [[34m2026-03-01 02:32:08[0m] [Epoch 59 | Step 74500] train/loss: 0.3799, train/lr: 0.0002 |
| [[34m2026-03-01 02:32:32[0m] [Epoch 59 | Step 74600] train/loss: 0.3775, train/lr: 0.0002 |
| [[34m2026-03-01 02:32:56[0m] [Epoch 59 | Step 74700] train/loss: 0.3808, train/lr: 0.0002 |
| [[34m2026-03-01 02:33:21[0m] [Epoch 59 | Step 74800] train/loss: 0.3789, train/lr: 0.0002 |
| [[34m2026-03-01 02:33:45[0m] [Epoch 59 | Step 74900] train/loss: 0.3802, train/lr: 0.0002 |
| [[34m2026-03-01 02:34:10[0m] [Epoch 59 | Step 75000] train/loss: 0.3807, train/lr: 0.0002 |
| [[34m2026-03-01 02:34:26[0m] [Epoch 59] epoch/loss: 0.3792 |
| [[34m2026-03-01 02:34:26[0m] Saving checkpoint at epoch 60... |
| [[34m2026-03-01 02:34:47[0m] [Epoch 60 | Step 75100] train/loss: 0.3743, train/lr: 0.0002 |
| [[34m2026-03-01 02:35:13[0m] [Epoch 60 | Step 75200] train/loss: 0.3817, train/lr: 0.0002 |
| [[34m2026-03-01 02:35:39[0m] [Epoch 60 | Step 75300] train/loss: 0.3773, train/lr: 0.0002 |
| [[34m2026-03-01 02:36:07[0m] [Epoch 60 | Step 75400] train/loss: 0.3790, train/lr: 0.0002 |
| [[34m2026-03-01 02:36:32[0m] [Epoch 60 | Step 75500] train/loss: 0.3781, train/lr: 0.0002 |
| [[34m2026-03-01 02:36:56[0m] [Epoch 60 | Step 75600] train/loss: 0.3783, train/lr: 0.0002 |
| [[34m2026-03-01 02:37:20[0m] [Epoch 60 | Step 75700] train/loss: 0.3817, train/lr: 0.0002 |
| [[34m2026-03-01 02:37:44[0m] [Epoch 60 | Step 75800] train/loss: 0.3800, train/lr: 0.0002 |
| [[34m2026-03-01 02:38:08[0m] [Epoch 60 | Step 75900] train/loss: 0.3808, train/lr: 0.0002 |
| [[34m2026-03-01 02:38:32[0m] [Epoch 60 | Step 76000] train/loss: 0.3787, train/lr: 0.0002 |
| [[34m2026-03-01 02:38:57[0m] [Epoch 60 | Step 76100] train/loss: 0.3795, train/lr: 0.0002 |
| [[34m2026-03-01 02:39:21[0m] [Epoch 60 | Step 76200] train/loss: 0.3785, train/lr: 0.0002 |
| [[34m2026-03-01 02:39:45[0m] [Epoch 60 | Step 76300] train/loss: 0.3796, train/lr: 0.0002 |
| [[34m2026-03-01 02:39:48[0m] [Epoch 60] epoch/loss: 0.3791 |
| [[34m2026-03-01 02:40:12[0m] [Epoch 61 | Step 76400] train/loss: 0.3790, train/lr: 0.0002 |
| [[34m2026-03-01 02:40:36[0m] [Epoch 61 | Step 76500] train/loss: 0.3773, train/lr: 0.0002 |
| [[34m2026-03-01 02:41:00[0m] [Epoch 61 | Step 76600] train/loss: 0.3776, train/lr: 0.0002 |
| [[34m2026-03-01 02:41:24[0m] [Epoch 61 | Step 76700] train/loss: 0.3796, train/lr: 0.0002 |
| [[34m2026-03-01 02:41:48[0m] [Epoch 61 | Step 76800] train/loss: 0.3757, train/lr: 0.0002 |
| [[34m2026-03-01 02:42:12[0m] [Epoch 61 | Step 76900] train/loss: 0.3824, train/lr: 0.0002 |
| [[34m2026-03-01 02:42:36[0m] [Epoch 61 | Step 77000] train/loss: 0.3807, train/lr: 0.0002 |
| [[34m2026-03-01 02:43:01[0m] [Epoch 61 | Step 77100] train/loss: 0.3777, train/lr: 0.0002 |
| [[34m2026-03-01 02:43:26[0m] [Epoch 61 | Step 77200] train/loss: 0.3763, train/lr: 0.0002 |
| [[34m2026-03-01 02:43:51[0m] [Epoch 61 | Step 77300] train/loss: 0.3801, train/lr: 0.0002 |
| [[34m2026-03-01 02:44:16[0m] [Epoch 61 | Step 77400] train/loss: 0.3763, train/lr: 0.0002 |
| [[34m2026-03-01 02:44:44[0m] [Epoch 61 | Step 77500] train/loss: 0.3768, train/lr: 0.0002 |
| [[34m2026-03-01 02:44:59[0m] [Epoch 61] epoch/loss: 0.3785 |
| [[34m2026-03-01 02:45:12[0m] [Epoch 62 | Step 77600] train/loss: 0.3807, train/lr: 0.0002 |
| [[34m2026-03-01 02:45:39[0m] [Epoch 62 | Step 77700] train/loss: 0.3733, train/lr: 0.0002 |
| [[34m2026-03-01 02:46:06[0m] [Epoch 62 | Step 77800] train/loss: 0.3797, train/lr: 0.0002 |
| [[34m2026-03-01 02:46:30[0m] [Epoch 62 | Step 77900] train/loss: 0.3744, train/lr: 0.0002 |
| [[34m2026-03-01 02:46:54[0m] [Epoch 62 | Step 78000] train/loss: 0.3790, train/lr: 0.0002 |
| [[34m2026-03-01 02:47:18[0m] [Epoch 62 | Step 78100] train/loss: 0.3812, train/lr: 0.0002 |
| [[34m2026-03-01 02:47:42[0m] [Epoch 62 | Step 78200] train/loss: 0.3784, train/lr: 0.0002 |
| [[34m2026-03-01 02:48:06[0m] [Epoch 62 | Step 78300] train/loss: 0.3801, train/lr: 0.0002 |
| [[34m2026-03-01 02:48:30[0m] [Epoch 62 | Step 78400] train/loss: 0.3770, train/lr: 0.0002 |
| [[34m2026-03-01 02:48:55[0m] [Epoch 62 | Step 78500] train/loss: 0.3773, train/lr: 0.0002 |
| [[34m2026-03-01 02:49:19[0m] [Epoch 62 | Step 78600] train/loss: 0.3759, train/lr: 0.0002 |
| [[34m2026-03-01 02:49:43[0m] [Epoch 62 | Step 78700] train/loss: 0.3815, train/lr: 0.0002 |
| [[34m2026-03-01 02:50:07[0m] [Epoch 62 | Step 78800] train/loss: 0.3808, train/lr: 0.0002 |
| [[34m2026-03-01 02:50:10[0m] [Epoch 62] epoch/loss: 0.3783 |
| [[34m2026-03-01 02:50:34[0m] [Epoch 63 | Step 78900] train/loss: 0.3800, train/lr: 0.0002 |
| [[34m2026-03-01 02:50:58[0m] [Epoch 63 | Step 79000] train/loss: 0.3812, train/lr: 0.0002 |
| [[34m2026-03-01 02:51:22[0m] [Epoch 63 | Step 79100] train/loss: 0.3748, train/lr: 0.0002 |
| [[34m2026-03-01 02:51:46[0m] [Epoch 63 | Step 79200] train/loss: 0.3746, train/lr: 0.0002 |
| [[34m2026-03-01 02:52:10[0m] [Epoch 63 | Step 79300] train/loss: 0.3780, train/lr: 0.0002 |
| [[34m2026-03-01 02:52:34[0m] [Epoch 63 | Step 79400] train/loss: 0.3790, train/lr: 0.0002 |
| [[34m2026-03-01 02:52:59[0m] [Epoch 63 | Step 79500] train/loss: 0.3765, train/lr: 0.0002 |
| [[34m2026-03-01 02:53:24[0m] [Epoch 63 | Step 79600] train/loss: 0.3769, train/lr: 0.0002 |
| [[34m2026-03-01 02:53:49[0m] [Epoch 63 | Step 79700] train/loss: 0.3764, train/lr: 0.0002 |
| [[34m2026-03-01 02:54:16[0m] [Epoch 63 | Step 79800] train/loss: 0.3751, train/lr: 0.0002 |
| [[34m2026-03-01 02:54:40[0m] [Epoch 63 | Step 79900] train/loss: 0.3756, train/lr: 0.0002 |
| [[34m2026-03-01 02:55:08[0m] [Epoch 63 | Step 80000] train/loss: 0.3800, train/lr: 0.0002 |
| [[34m2026-03-01 02:55:08[0m] Generating EMA samples... |
| [[34m2026-03-01 02:55:09[0m] Generating EMA samples done. |
| [[34m2026-03-01 02:55:25[0m] [Epoch 63] epoch/loss: 0.3770 |
| [[34m2026-03-01 02:55:37[0m] [Epoch 64 | Step 80100] train/loss: 0.3753, train/lr: 0.0002 |
| [[34m2026-03-01 02:56:03[0m] [Epoch 64 | Step 80200] train/loss: 0.3787, train/lr: 0.0002 |
| [[34m2026-03-01 02:56:27[0m] [Epoch 64 | Step 80300] train/loss: 0.3770, train/lr: 0.0002 |
| [[34m2026-03-01 02:56:51[0m] [Epoch 64 | Step 80400] train/loss: 0.3806, train/lr: 0.0002 |
| [[34m2026-03-01 02:57:15[0m] [Epoch 64 | Step 80500] train/loss: 0.3804, train/lr: 0.0002 |
| [[34m2026-03-01 02:57:39[0m] [Epoch 64 | Step 80600] train/loss: 0.3804, train/lr: 0.0002 |
| [[34m2026-03-01 02:58:03[0m] [Epoch 64 | Step 80700] train/loss: 0.3782, train/lr: 0.0002 |
| [[34m2026-03-01 02:58:27[0m] [Epoch 64 | Step 80800] train/loss: 0.3777, train/lr: 0.0002 |
| [[34m2026-03-01 02:58:51[0m] [Epoch 64 | Step 80900] train/loss: 0.3782, train/lr: 0.0002 |
| [[34m2026-03-01 02:59:15[0m] [Epoch 64 | Step 81000] train/loss: 0.3780, train/lr: 0.0002 |
| [[34m2026-03-01 02:59:39[0m] [Epoch 64 | Step 81100] train/loss: 0.3772, train/lr: 0.0002 |
| [[34m2026-03-01 03:00:03[0m] [Epoch 64 | Step 81200] train/loss: 0.3791, train/lr: 0.0002 |
| [[34m2026-03-01 03:00:28[0m] [Epoch 64 | Step 81300] train/loss: 0.3768, train/lr: 0.0002 |
| [[34m2026-03-01 03:00:31[0m] [Epoch 64] epoch/loss: 0.3785 |
| [[34m2026-03-01 03:00:54[0m] [Epoch 65 | Step 81400] train/loss: 0.3759, train/lr: 0.0002 |
| [[34m2026-03-01 03:01:18[0m] [Epoch 65 | Step 81500] train/loss: 0.3767, train/lr: 0.0002 |
| [[34m2026-03-01 03:01:43[0m] [Epoch 65 | Step 81600] train/loss: 0.3784, train/lr: 0.0002 |
| [[34m2026-03-01 03:02:06[0m] [Epoch 65 | Step 81700] train/loss: 0.3776, train/lr: 0.0002 |
| [[34m2026-03-01 03:02:31[0m] [Epoch 65 | Step 81800] train/loss: 0.3728, train/lr: 0.0002 |
| [[34m2026-03-01 03:02:56[0m] [Epoch 65 | Step 81900] train/loss: 0.3770, train/lr: 0.0002 |
| [[34m2026-03-01 03:03:21[0m] [Epoch 65 | Step 82000] train/loss: 0.3767, train/lr: 0.0002 |
| [[34m2026-03-01 03:03:47[0m] [Epoch 65 | Step 82100] train/loss: 0.3775, train/lr: 0.0002 |
| [[34m2026-03-01 03:04:14[0m] [Epoch 65 | Step 82200] train/loss: 0.3776, train/lr: 0.0002 |
| [[34m2026-03-01 03:04:40[0m] [Epoch 65 | Step 82300] train/loss: 0.3752, train/lr: 0.0002 |
| [[34m2026-03-01 03:05:06[0m] [Epoch 65 | Step 82400] train/loss: 0.3789, train/lr: 0.0002 |
| [[34m2026-03-01 03:05:32[0m] [Epoch 65 | Step 82500] train/loss: 0.3748, train/lr: 0.0002 |
| [[34m2026-03-01 03:05:48[0m] [Epoch 65] epoch/loss: 0.3767 |
| [[34m2026-03-01 03:05:59[0m] [Epoch 66 | Step 82600] train/loss: 0.3753, train/lr: 0.0002 |
| [[34m2026-03-01 03:06:23[0m] [Epoch 66 | Step 82700] train/loss: 0.3761, train/lr: 0.0002 |
| [[34m2026-03-01 03:06:47[0m] [Epoch 66 | Step 82800] train/loss: 0.3783, train/lr: 0.0002 |
| [[34m2026-03-01 03:07:11[0m] [Epoch 66 | Step 82900] train/loss: 0.3810, train/lr: 0.0002 |
| [[34m2026-03-01 03:07:35[0m] [Epoch 66 | Step 83000] train/loss: 0.3777, train/lr: 0.0002 |
| [[34m2026-03-01 03:07:59[0m] [Epoch 66 | Step 83100] train/loss: 0.3785, train/lr: 0.0002 |
| [[34m2026-03-01 03:08:23[0m] [Epoch 66 | Step 83200] train/loss: 0.3781, train/lr: 0.0002 |
| [[34m2026-03-01 03:08:47[0m] [Epoch 66 | Step 83300] train/loss: 0.3784, train/lr: 0.0002 |
| [[34m2026-03-01 03:09:12[0m] [Epoch 66 | Step 83400] train/loss: 0.3760, train/lr: 0.0002 |
| [[34m2026-03-01 03:09:36[0m] [Epoch 66 | Step 83500] train/loss: 0.3764, train/lr: 0.0002 |
| [[34m2026-03-01 03:10:00[0m] [Epoch 66 | Step 83600] train/loss: 0.3774, train/lr: 0.0002 |
| [[34m2026-03-01 03:10:24[0m] [Epoch 66 | Step 83700] train/loss: 0.3763, train/lr: 0.0002 |
| [[34m2026-03-01 03:10:48[0m] [Epoch 66 | Step 83800] train/loss: 0.3774, train/lr: 0.0002 |
| [[34m2026-03-01 03:10:52[0m] [Epoch 66] epoch/loss: 0.3773 |
| [[34m2026-03-01 03:11:15[0m] [Epoch 67 | Step 83900] train/loss: 0.3755, train/lr: 0.0002 |
| [[34m2026-03-01 03:11:39[0m] [Epoch 67 | Step 84000] train/loss: 0.3778, train/lr: 0.0002 |
| [[34m2026-03-01 03:12:03[0m] [Epoch 67 | Step 84100] train/loss: 0.3751, train/lr: 0.0002 |
| [[34m2026-03-01 03:12:28[0m] [Epoch 67 | Step 84200] train/loss: 0.3772, train/lr: 0.0002 |
| [[34m2026-03-01 03:12:53[0m] [Epoch 67 | Step 84300] train/loss: 0.3732, train/lr: 0.0002 |
| [[34m2026-03-01 03:13:18[0m] [Epoch 67 | Step 84400] train/loss: 0.3763, train/lr: 0.0002 |
| [[34m2026-03-01 03:13:45[0m] [Epoch 67 | Step 84500] train/loss: 0.3770, train/lr: 0.0002 |
| [[34m2026-03-01 03:14:11[0m] [Epoch 67 | Step 84600] train/loss: 0.3782, train/lr: 0.0002 |
| [[34m2026-03-01 03:14:38[0m] [Epoch 67 | Step 84700] train/loss: 0.3764, train/lr: 0.0002 |
| [[34m2026-03-01 03:15:04[0m] [Epoch 67 | Step 84800] train/loss: 0.3748, train/lr: 0.0002 |
| [[34m2026-03-01 03:15:30[0m] [Epoch 67 | Step 84900] train/loss: 0.3785, train/lr: 0.0002 |
| [[34m2026-03-01 03:15:54[0m] [Epoch 67 | Step 85000] train/loss: 0.3770, train/lr: 0.0002 |
| [[34m2026-03-01 03:16:11[0m] [Epoch 67] epoch/loss: 0.3765 |
| [[34m2026-03-01 03:16:21[0m] [Epoch 68 | Step 85100] train/loss: 0.3756, train/lr: 0.0002 |
| [[34m2026-03-01 03:16:45[0m] [Epoch 68 | Step 85200] train/loss: 0.3746, train/lr: 0.0002 |
| [[34m2026-03-01 03:17:09[0m] [Epoch 68 | Step 85300] train/loss: 0.3750, train/lr: 0.0002 |
| [[34m2026-03-01 03:17:33[0m] [Epoch 68 | Step 85400] train/loss: 0.3752, train/lr: 0.0002 |
| [[34m2026-03-01 03:17:58[0m] [Epoch 68 | Step 85500] train/loss: 0.3753, train/lr: 0.0002 |
| [[34m2026-03-01 03:18:22[0m] [Epoch 68 | Step 85600] train/loss: 0.3764, train/lr: 0.0002 |
| [[34m2026-03-01 03:18:46[0m] [Epoch 68 | Step 85700] train/loss: 0.3766, train/lr: 0.0002 |
| [[34m2026-03-01 03:19:10[0m] [Epoch 68 | Step 85800] train/loss: 0.3776, train/lr: 0.0002 |
| [[34m2026-03-01 03:19:34[0m] [Epoch 68 | Step 85900] train/loss: 0.3778, train/lr: 0.0002 |
| [[34m2026-03-01 03:19:58[0m] [Epoch 68 | Step 86000] train/loss: 0.3779, train/lr: 0.0002 |
| [[34m2026-03-01 03:20:22[0m] [Epoch 68 | Step 86100] train/loss: 0.3783, train/lr: 0.0002 |
| [[34m2026-03-01 03:20:46[0m] [Epoch 68 | Step 86200] train/loss: 0.3729, train/lr: 0.0002 |
| [[34m2026-03-01 03:21:11[0m] [Epoch 68 | Step 86300] train/loss: 0.3753, train/lr: 0.0002 |
| [[34m2026-03-01 03:21:15[0m] [Epoch 68] epoch/loss: 0.3760 |
| [[34m2026-03-01 03:21:38[0m] [Epoch 69 | Step 86400] train/loss: 0.3734, train/lr: 0.0002 |
| [[34m2026-03-01 03:22:03[0m] [Epoch 69 | Step 86500] train/loss: 0.3745, train/lr: 0.0002 |
| [[34m2026-03-01 03:22:28[0m] [Epoch 69 | Step 86600] train/loss: 0.3769, train/lr: 0.0002 |
| [[34m2026-03-01 03:22:53[0m] [Epoch 69 | Step 86700] train/loss: 0.3748, train/lr: 0.0002 |
| [[34m2026-03-01 03:23:18[0m] [Epoch 69 | Step 86800] train/loss: 0.3730, train/lr: 0.0002 |
| [[34m2026-03-01 03:23:44[0m] [Epoch 69 | Step 86900] train/loss: 0.3731, train/lr: 0.0002 |
| [[34m2026-03-01 03:24:11[0m] [Epoch 69 | Step 87000] train/loss: 0.3779, train/lr: 0.0002 |
| [[34m2026-03-01 03:24:37[0m] [Epoch 69 | Step 87100] train/loss: 0.3767, train/lr: 0.0002 |
| [[34m2026-03-01 03:25:04[0m] [Epoch 69 | Step 87200] train/loss: 0.3750, train/lr: 0.0002 |
| [[34m2026-03-01 03:25:28[0m] [Epoch 69 | Step 87300] train/loss: 0.3749, train/lr: 0.0002 |
| [[34m2026-03-01 03:25:52[0m] [Epoch 69 | Step 87400] train/loss: 0.3764, train/lr: 0.0002 |
| [[34m2026-03-01 03:26:16[0m] [Epoch 69 | Step 87500] train/loss: 0.3769, train/lr: 0.0002 |
| [[34m2026-03-01 03:26:33[0m] [Epoch 69] epoch/loss: 0.3754 |
| [[34m2026-03-01 03:26:33[0m] Saving checkpoint at epoch 70... |
| [[34m2026-03-01 03:26:51[0m] [Epoch 70 | Step 87600] train/loss: 0.3751, train/lr: 0.0002 |
| [[34m2026-03-01 03:27:15[0m] [Epoch 70 | Step 87700] train/loss: 0.3750, train/lr: 0.0002 |
| [[34m2026-03-01 03:27:39[0m] [Epoch 70 | Step 87800] train/loss: 0.3758, train/lr: 0.0002 |
| [[34m2026-03-01 03:28:03[0m] [Epoch 70 | Step 87900] train/loss: 0.3765, train/lr: 0.0002 |
| [[34m2026-03-01 03:28:28[0m] [Epoch 70 | Step 88000] train/loss: 0.3766, train/lr: 0.0002 |
| [[34m2026-03-01 03:28:52[0m] [Epoch 70 | Step 88100] train/loss: 0.3742, train/lr: 0.0002 |
| [[34m2026-03-01 03:29:16[0m] [Epoch 70 | Step 88200] train/loss: 0.3744, train/lr: 0.0002 |
| [[34m2026-03-01 03:29:40[0m] [Epoch 70 | Step 88300] train/loss: 0.3730, train/lr: 0.0002 |
| [[34m2026-03-01 03:30:04[0m] [Epoch 70 | Step 88400] train/loss: 0.3757, train/lr: 0.0002 |
| [[34m2026-03-01 03:30:28[0m] [Epoch 70 | Step 88500] train/loss: 0.3772, train/lr: 0.0002 |
| [[34m2026-03-01 03:30:52[0m] [Epoch 70 | Step 88600] train/loss: 0.3770, train/lr: 0.0002 |
| [[34m2026-03-01 03:31:16[0m] [Epoch 70 | Step 88700] train/loss: 0.3754, train/lr: 0.0002 |
| [[34m2026-03-01 03:31:41[0m] [Epoch 70 | Step 88800] train/loss: 0.3784, train/lr: 0.0002 |
| [[34m2026-03-01 03:31:46[0m] [Epoch 70] epoch/loss: 0.3756 |
| [[34m2026-03-01 03:32:08[0m] [Epoch 71 | Step 88900] train/loss: 0.3752, train/lr: 0.0002 |
| [[34m2026-03-01 03:32:33[0m] [Epoch 71 | Step 89000] train/loss: 0.3724, train/lr: 0.0002 |
| [[34m2026-03-01 03:33:00[0m] [Epoch 71 | Step 89100] train/loss: 0.3759, train/lr: 0.0002 |
| [[34m2026-03-01 03:33:26[0m] [Epoch 71 | Step 89200] train/loss: 0.3762, train/lr: 0.0002 |
| [[34m2026-03-01 03:33:51[0m] [Epoch 71 | Step 89300] train/loss: 0.3723, train/lr: 0.0002 |
| [[34m2026-03-01 03:34:19[0m] [Epoch 71 | Step 89400] train/loss: 0.3715, train/lr: 0.0002 |
| [[34m2026-03-01 03:34:46[0m] [Epoch 71 | Step 89500] train/loss: 0.3738, train/lr: 0.0002 |
| [[34m2026-03-01 03:35:11[0m] [Epoch 71 | Step 89600] train/loss: 0.3706, train/lr: 0.0002 |
| [[34m2026-03-01 03:35:36[0m] [Epoch 71 | Step 89700] train/loss: 0.3764, train/lr: 0.0002 |
| [[34m2026-03-01 03:36:00[0m] [Epoch 71 | Step 89800] train/loss: 0.3737, train/lr: 0.0002 |
| [[34m2026-03-01 03:36:24[0m] [Epoch 71 | Step 89900] train/loss: 0.3757, train/lr: 0.0002 |
| [[34m2026-03-01 03:36:48[0m] [Epoch 71 | Step 90000] train/loss: 0.3708, train/lr: 0.0002 |
| [[34m2026-03-01 03:36:48[0m] Generating EMA samples... |
| [[34m2026-03-01 03:36:49[0m] Generating EMA samples done. |
| [[34m2026-03-01 03:37:06[0m] [Epoch 71] epoch/loss: 0.3738 |
| [[34m2026-03-01 03:37:16[0m] [Epoch 72 | Step 90100] train/loss: 0.3746, train/lr: 0.0002 |
| [[34m2026-03-01 03:37:40[0m] [Epoch 72 | Step 90200] train/loss: 0.3744, train/lr: 0.0002 |
| [[34m2026-03-01 03:38:04[0m] [Epoch 72 | Step 90300] train/loss: 0.3740, train/lr: 0.0002 |
| [[34m2026-03-01 03:38:28[0m] [Epoch 72 | Step 90400] train/loss: 0.3774, train/lr: 0.0002 |
| [[34m2026-03-01 03:38:52[0m] [Epoch 72 | Step 90500] train/loss: 0.3733, train/lr: 0.0002 |
| [[34m2026-03-01 03:39:16[0m] [Epoch 72 | Step 90600] train/loss: 0.3762, train/lr: 0.0002 |
| [[34m2026-03-01 03:39:40[0m] [Epoch 72 | Step 90700] train/loss: 0.3723, train/lr: 0.0002 |
| [[34m2026-03-01 03:40:04[0m] [Epoch 72 | Step 90800] train/loss: 0.3756, train/lr: 0.0002 |
| [[34m2026-03-01 03:40:28[0m] [Epoch 72 | Step 90900] train/loss: 0.3789, train/lr: 0.0002 |
| [[34m2026-03-01 03:40:52[0m] [Epoch 72 | Step 91000] train/loss: 0.3803, train/lr: 0.0002 |
| [[34m2026-03-01 03:41:17[0m] [Epoch 72 | Step 91100] train/loss: 0.3750, train/lr: 0.0002 |
| [[34m2026-03-01 03:41:41[0m] [Epoch 72 | Step 91200] train/loss: 0.3748, train/lr: 0.0002 |
| [[34m2026-03-01 03:42:06[0m] [Epoch 72 | Step 91300] train/loss: 0.3725, train/lr: 0.0002 |
| [[34m2026-03-01 03:42:12[0m] [Epoch 72] epoch/loss: 0.3754 |
| [[34m2026-03-01 03:42:34[0m] [Epoch 73 | Step 91400] train/loss: 0.3757, train/lr: 0.0002 |
| [[34m2026-03-01 03:43:00[0m] [Epoch 73 | Step 91500] train/loss: 0.3761, train/lr: 0.0002 |
| [[34m2026-03-01 03:43:27[0m] [Epoch 73 | Step 91600] train/loss: 0.3749, train/lr: 0.0002 |
| [[34m2026-03-01 03:43:53[0m] [Epoch 73 | Step 91700] train/loss: 0.3766, train/lr: 0.0002 |
| [[34m2026-03-01 03:44:19[0m] [Epoch 73 | Step 91800] train/loss: 0.3716, train/lr: 0.0002 |
| [[34m2026-03-01 03:44:45[0m] [Epoch 73 | Step 91900] train/loss: 0.3741, train/lr: 0.0002 |
| [[34m2026-03-01 03:45:10[0m] [Epoch 73 | Step 92000] train/loss: 0.3739, train/lr: 0.0002 |
| [[34m2026-03-01 03:45:34[0m] [Epoch 73 | Step 92100] train/loss: 0.3716, train/lr: 0.0002 |
| [[34m2026-03-01 03:45:58[0m] [Epoch 73 | Step 92200] train/loss: 0.3735, train/lr: 0.0002 |
| [[34m2026-03-01 03:46:22[0m] [Epoch 73 | Step 92300] train/loss: 0.3773, train/lr: 0.0002 |
| [[34m2026-03-01 03:46:46[0m] [Epoch 73 | Step 92400] train/loss: 0.3768, train/lr: 0.0002 |
| [[34m2026-03-01 03:47:11[0m] [Epoch 73 | Step 92500] train/loss: 0.3761, train/lr: 0.0002 |
| [[34m2026-03-01 03:47:28[0m] [Epoch 73] epoch/loss: 0.3747 |
| [[34m2026-03-01 03:47:37[0m] [Epoch 74 | Step 92600] train/loss: 0.3733, train/lr: 0.0002 |
| [[34m2026-03-01 03:48:01[0m] [Epoch 74 | Step 92700] train/loss: 0.3707, train/lr: 0.0002 |
| [[34m2026-03-01 03:48:26[0m] [Epoch 74 | Step 92800] train/loss: 0.3738, train/lr: 0.0002 |
| [[34m2026-03-01 03:48:50[0m] [Epoch 74 | Step 92900] train/loss: 0.3751, train/lr: 0.0002 |
| [[34m2026-03-01 03:49:14[0m] [Epoch 74 | Step 93000] train/loss: 0.3760, train/lr: 0.0002 |
| [[34m2026-03-01 03:49:38[0m] [Epoch 74 | Step 93100] train/loss: 0.3770, train/lr: 0.0002 |
| [[34m2026-03-01 03:50:02[0m] [Epoch 74 | Step 93200] train/loss: 0.3737, train/lr: 0.0002 |
| [[34m2026-03-01 03:50:26[0m] [Epoch 74 | Step 93300] train/loss: 0.3739, train/lr: 0.0002 |
| [[34m2026-03-01 03:50:50[0m] [Epoch 74 | Step 93400] train/loss: 0.3723, train/lr: 0.0002 |
| [[34m2026-03-01 03:51:14[0m] [Epoch 74 | Step 93500] train/loss: 0.3739, train/lr: 0.0002 |
| [[34m2026-03-01 03:51:38[0m] [Epoch 74 | Step 93600] train/loss: 0.3751, train/lr: 0.0002 |
| [[34m2026-03-01 03:52:03[0m] [Epoch 74 | Step 93700] train/loss: 0.3757, train/lr: 0.0002 |
| [[34m2026-03-01 03:52:29[0m] [Epoch 74 | Step 93800] train/loss: 0.3787, train/lr: 0.0002 |
| [[34m2026-03-01 03:52:36[0m] [Epoch 74] epoch/loss: 0.3746 |
| [[34m2026-03-01 03:52:59[0m] [Epoch 75 | Step 93900] train/loss: 0.3714, train/lr: 0.0002 |
| [[34m2026-03-01 03:53:24[0m] [Epoch 75 | Step 94000] train/loss: 0.3723, train/lr: 0.0002 |
| [[34m2026-03-01 03:53:53[0m] [Epoch 75 | Step 94100] train/loss: 0.3723, train/lr: 0.0002 |
| [[34m2026-03-01 03:54:19[0m] [Epoch 75 | Step 94200] train/loss: 0.3742, train/lr: 0.0002 |
| [[34m2026-03-01 03:54:44[0m] [Epoch 75 | Step 94300] train/loss: 0.3771, train/lr: 0.0002 |
| [[34m2026-03-01 03:55:08[0m] [Epoch 75 | Step 94400] train/loss: 0.3717, train/lr: 0.0002 |
| [[34m2026-03-01 03:55:32[0m] [Epoch 75 | Step 94500] train/loss: 0.3728, train/lr: 0.0002 |
| [[34m2026-03-01 03:55:56[0m] [Epoch 75 | Step 94600] train/loss: 0.3739, train/lr: 0.0002 |
| [[34m2026-03-01 03:56:20[0m] [Epoch 75 | Step 94700] train/loss: 0.3749, train/lr: 0.0002 |
| [[34m2026-03-01 03:56:44[0m] [Epoch 75 | Step 94800] train/loss: 0.3710, train/lr: 0.0002 |
| [[34m2026-03-01 03:57:08[0m] [Epoch 75 | Step 94900] train/loss: 0.3724, train/lr: 0.0002 |
| [[34m2026-03-01 03:57:32[0m] [Epoch 75 | Step 95000] train/loss: 0.3729, train/lr: 0.0002 |
| [[34m2026-03-01 03:57:51[0m] [Epoch 75] epoch/loss: 0.3734 |
| [[34m2026-03-01 03:57:59[0m] [Epoch 76 | Step 95100] train/loss: 0.3758, train/lr: 0.0002 |
| [[34m2026-03-01 03:58:23[0m] [Epoch 76 | Step 95200] train/loss: 0.3726, train/lr: 0.0002 |
| [[34m2026-03-01 03:58:48[0m] [Epoch 76 | Step 95300] train/loss: 0.3707, train/lr: 0.0002 |
| [[34m2026-03-01 03:59:12[0m] [Epoch 76 | Step 95400] train/loss: 0.3719, train/lr: 0.0002 |
| [[34m2026-03-01 03:59:36[0m] [Epoch 76 | Step 95500] train/loss: 0.3743, train/lr: 0.0002 |
| [[34m2026-03-01 04:00:00[0m] [Epoch 76 | Step 95600] train/loss: 0.3736, train/lr: 0.0002 |
| [[34m2026-03-01 04:00:24[0m] [Epoch 76 | Step 95700] train/loss: 0.3699, train/lr: 0.0002 |
| [[34m2026-03-01 04:00:48[0m] [Epoch 76 | Step 95800] train/loss: 0.3741, train/lr: 0.0002 |
| [[34m2026-03-01 04:01:12[0m] [Epoch 76 | Step 95900] train/loss: 0.3703, train/lr: 0.0002 |
| [[34m2026-03-01 04:01:37[0m] [Epoch 76 | Step 96000] train/loss: 0.3758, train/lr: 0.0002 |
| [[34m2026-03-01 04:02:02[0m] [Epoch 76 | Step 96100] train/loss: 0.3741, train/lr: 0.0002 |
| [[34m2026-03-01 04:02:28[0m] [Epoch 76 | Step 96200] train/loss: 0.3766, train/lr: 0.0002 |
| [[34m2026-03-01 04:02:54[0m] [Epoch 76 | Step 96300] train/loss: 0.3733, train/lr: 0.0002 |
| [[34m2026-03-01 04:03:01[0m] [Epoch 76] epoch/loss: 0.3731 |
| [[34m2026-03-01 04:03:23[0m] [Epoch 77 | Step 96400] train/loss: 0.3735, train/lr: 0.0002 |
| [[34m2026-03-01 04:03:49[0m] [Epoch 77 | Step 96500] train/loss: 0.3761, train/lr: 0.0002 |
| [[34m2026-03-01 04:04:15[0m] [Epoch 77 | Step 96600] train/loss: 0.3715, train/lr: 0.0002 |
| [[34m2026-03-01 04:04:40[0m] [Epoch 77 | Step 96700] train/loss: 0.3709, train/lr: 0.0002 |
| [[34m2026-03-01 04:05:04[0m] [Epoch 77 | Step 96800] train/loss: 0.3740, train/lr: 0.0002 |
| [[34m2026-03-01 04:05:28[0m] [Epoch 77 | Step 96900] train/loss: 0.3725, train/lr: 0.0002 |
| [[34m2026-03-01 04:05:53[0m] [Epoch 77 | Step 97000] train/loss: 0.3752, train/lr: 0.0002 |
| [[34m2026-03-01 04:06:17[0m] [Epoch 77 | Step 97100] train/loss: 0.3745, train/lr: 0.0002 |
| [[34m2026-03-01 04:06:41[0m] [Epoch 77 | Step 97200] train/loss: 0.3726, train/lr: 0.0002 |
| [[34m2026-03-01 04:07:05[0m] [Epoch 77 | Step 97300] train/loss: 0.3716, train/lr: 0.0002 |
| [[34m2026-03-01 04:07:29[0m] [Epoch 77 | Step 97400] train/loss: 0.3731, train/lr: 0.0002 |
| [[34m2026-03-01 04:07:53[0m] [Epoch 77 | Step 97500] train/loss: 0.3765, train/lr: 0.0002 |
| [[34m2026-03-01 04:08:12[0m] [Epoch 77] epoch/loss: 0.3734 |
| [[34m2026-03-01 04:08:20[0m] [Epoch 78 | Step 97600] train/loss: 0.3724, train/lr: 0.0002 |
| [[34m2026-03-01 04:08:44[0m] [Epoch 78 | Step 97700] train/loss: 0.3745, train/lr: 0.0002 |
| [[34m2026-03-01 04:09:08[0m] [Epoch 78 | Step 97800] train/loss: 0.3720, train/lr: 0.0002 |
| [[34m2026-03-01 04:09:32[0m] [Epoch 78 | Step 97900] train/loss: 0.3717, train/lr: 0.0002 |
| [[34m2026-03-01 04:09:57[0m] [Epoch 78 | Step 98000] train/loss: 0.3752, train/lr: 0.0002 |
| [[34m2026-03-01 04:10:21[0m] [Epoch 78 | Step 98100] train/loss: 0.3686, train/lr: 0.0002 |
| [[34m2026-03-01 04:10:45[0m] [Epoch 78 | Step 98200] train/loss: 0.3717, train/lr: 0.0002 |
| [[34m2026-03-01 04:11:09[0m] [Epoch 78 | Step 98300] train/loss: 0.3752, train/lr: 0.0002 |
| [[34m2026-03-01 04:11:34[0m] [Epoch 78 | Step 98400] train/loss: 0.3706, train/lr: 0.0002 |
| [[34m2026-03-01 04:12:00[0m] [Epoch 78 | Step 98500] train/loss: 0.3726, train/lr: 0.0002 |
| [[34m2026-03-01 04:12:26[0m] [Epoch 78 | Step 98600] train/loss: 0.3713, train/lr: 0.0002 |
| [[34m2026-03-01 04:12:52[0m] [Epoch 78 | Step 98700] train/loss: 0.3732, train/lr: 0.0002 |
| [[34m2026-03-01 04:13:20[0m] [Epoch 78 | Step 98800] train/loss: 0.3734, train/lr: 0.0002 |
| [[34m2026-03-01 04:13:28[0m] [Epoch 78] epoch/loss: 0.3726 |
| [[34m2026-03-01 04:13:50[0m] [Epoch 79 | Step 98900] train/loss: 0.3720, train/lr: 0.0002 |
| [[34m2026-03-01 04:14:14[0m] [Epoch 79 | Step 99000] train/loss: 0.3736, train/lr: 0.0002 |
| [[34m2026-03-01 04:14:38[0m] [Epoch 79 | Step 99100] train/loss: 0.3746, train/lr: 0.0002 |
| [[34m2026-03-01 04:15:02[0m] [Epoch 79 | Step 99200] train/loss: 0.3735, train/lr: 0.0002 |
| [[34m2026-03-01 04:15:26[0m] [Epoch 79 | Step 99300] train/loss: 0.3751, train/lr: 0.0002 |
| [[34m2026-03-01 04:15:50[0m] [Epoch 79 | Step 99400] train/loss: 0.3741, train/lr: 0.0002 |
| [[34m2026-03-01 04:16:15[0m] [Epoch 79 | Step 99500] train/loss: 0.3722, train/lr: 0.0002 |
| [[34m2026-03-01 04:16:39[0m] [Epoch 79 | Step 99600] train/loss: 0.3742, train/lr: 0.0002 |
| [[34m2026-03-01 04:17:03[0m] [Epoch 79 | Step 99700] train/loss: 0.3728, train/lr: 0.0002 |
| [[34m2026-03-01 04:17:27[0m] [Epoch 79 | Step 99800] train/loss: 0.3725, train/lr: 0.0002 |
| [[34m2026-03-01 04:17:51[0m] [Epoch 79 | Step 99900] train/loss: 0.3747, train/lr: 0.0002 |
| [[34m2026-03-01 04:18:15[0m] [Epoch 79 | Step 100000] train/loss: 0.3771, train/lr: 0.0002 |
| [[34m2026-03-01 04:18:15[0m] Generating EMA samples... |
| [[34m2026-03-01 04:18:16[0m] Generating EMA samples done. |
| [[34m2026-03-01 04:18:16[0m] Starting evaluation... |
| [[34m2026-03-01 04:21:22[0m] eval_ema/fid: 5.472469575562018 |
| [[34m2026-03-01 04:24:28[0m] eval_model/fid: 7.496063486116231 |
| [[34m2026-03-01 04:24:28[0m] Evaluation done. |
| [[34m2026-03-01 04:24:47[0m] [Epoch 79] epoch/loss: 0.3740 |
| [[34m2026-03-01 04:24:47[0m] Saving checkpoint at epoch 80... |
| [[34m2026-03-01 04:25:03[0m] [Epoch 80 | Step 100100] train/loss: 0.3760, train/lr: 0.0002 |
| [[34m2026-03-01 04:25:27[0m] [Epoch 80 | Step 100200] train/loss: 0.3721, train/lr: 0.0002 |
| [[34m2026-03-01 04:25:51[0m] [Epoch 80 | Step 100300] train/loss: 0.3724, train/lr: 0.0002 |
| [[34m2026-03-01 04:26:15[0m] [Epoch 80 | Step 100400] train/loss: 0.3718, train/lr: 0.0002 |
| [[34m2026-03-01 04:26:39[0m] [Epoch 80 | Step 100500] train/loss: 0.3751, train/lr: 0.0002 |
| [[34m2026-03-01 04:27:03[0m] [Epoch 80 | Step 100600] train/loss: 0.3683, train/lr: 0.0002 |
| [[34m2026-03-01 04:27:29[0m] [Epoch 80 | Step 100700] train/loss: 0.3722, train/lr: 0.0002 |
| [[34m2026-03-01 04:27:54[0m] [Epoch 80 | Step 100800] train/loss: 0.3693, train/lr: 0.0002 |
| [[34m2026-03-01 04:28:19[0m] [Epoch 80 | Step 100900] train/loss: 0.3747, train/lr: 0.0002 |
| [[34m2026-03-01 04:28:46[0m] [Epoch 80 | Step 101000] train/loss: 0.3734, train/lr: 0.0002 |
| [[34m2026-03-01 04:29:12[0m] [Epoch 80 | Step 101100] train/loss: 0.3695, train/lr: 0.0002 |
| [[34m2026-03-01 04:29:39[0m] [Epoch 80 | Step 101200] train/loss: 0.3712, train/lr: 0.0002 |
| [[34m2026-03-01 04:30:06[0m] [Epoch 80 | Step 101300] train/loss: 0.3718, train/lr: 0.0002 |
| [[34m2026-03-01 04:30:14[0m] [Epoch 80] epoch/loss: 0.3719 |
| [[34m2026-03-01 04:30:33[0m] [Epoch 81 | Step 101400] train/loss: 0.3745, train/lr: 0.0002 |
| [[34m2026-03-01 04:30:57[0m] [Epoch 81 | Step 101500] train/loss: 0.3743, train/lr: 0.0002 |
| [[34m2026-03-01 04:31:21[0m] [Epoch 81 | Step 101600] train/loss: 0.3683, train/lr: 0.0002 |
| [[34m2026-03-01 04:31:45[0m] [Epoch 81 | Step 101700] train/loss: 0.3730, train/lr: 0.0002 |
| [[34m2026-03-01 04:32:09[0m] [Epoch 81 | Step 101800] train/loss: 0.3715, train/lr: 0.0002 |
| [[34m2026-03-01 04:32:33[0m] [Epoch 81 | Step 101900] train/loss: 0.3708, train/lr: 0.0002 |
| [[34m2026-03-01 04:32:57[0m] [Epoch 81 | Step 102000] train/loss: 0.3713, train/lr: 0.0002 |
| [[34m2026-03-01 04:33:21[0m] [Epoch 81 | Step 102100] train/loss: 0.3693, train/lr: 0.0002 |
| [[34m2026-03-01 04:33:45[0m] [Epoch 81 | Step 102200] train/loss: 0.3734, train/lr: 0.0002 |
| [[34m2026-03-01 04:34:10[0m] [Epoch 81 | Step 102300] train/loss: 0.3732, train/lr: 0.0002 |
| [[34m2026-03-01 04:34:34[0m] [Epoch 81 | Step 102400] train/loss: 0.3701, train/lr: 0.0002 |
| [[34m2026-03-01 04:34:58[0m] [Epoch 81 | Step 102500] train/loss: 0.3696, train/lr: 0.0002 |
| [[34m2026-03-01 04:35:18[0m] [Epoch 81] epoch/loss: 0.3717 |
| [[34m2026-03-01 04:35:25[0m] [Epoch 82 | Step 102600] train/loss: 0.3750, train/lr: 0.0002 |
| [[34m2026-03-01 04:35:49[0m] [Epoch 82 | Step 102700] train/loss: 0.3728, train/lr: 0.0002 |
| [[34m2026-03-01 04:36:13[0m] [Epoch 82 | Step 102800] train/loss: 0.3687, train/lr: 0.0002 |
| [[34m2026-03-01 04:36:37[0m] [Epoch 82 | Step 102900] train/loss: 0.3697, train/lr: 0.0002 |
| [[34m2026-03-01 04:37:01[0m] [Epoch 82 | Step 103000] train/loss: 0.3758, train/lr: 0.0002 |
| [[34m2026-03-01 04:37:28[0m] [Epoch 82 | Step 103100] train/loss: 0.3734, train/lr: 0.0002 |
| [[34m2026-03-01 04:37:53[0m] [Epoch 82 | Step 103200] train/loss: 0.3747, train/lr: 0.0002 |
| [[34m2026-03-01 04:38:18[0m] [Epoch 82 | Step 103300] train/loss: 0.3711, train/lr: 0.0002 |
| [[34m2026-03-01 04:38:45[0m] [Epoch 82 | Step 103400] train/loss: 0.3746, train/lr: 0.0002 |
| [[34m2026-03-01 04:39:12[0m] [Epoch 82 | Step 103500] train/loss: 0.3732, train/lr: 0.0002 |
| [[34m2026-03-01 04:39:39[0m] [Epoch 82 | Step 103600] train/loss: 0.3749, train/lr: 0.0002 |
| [[34m2026-03-01 04:40:04[0m] [Epoch 82 | Step 103700] train/loss: 0.3732, train/lr: 0.0002 |
| [[34m2026-03-01 04:40:28[0m] [Epoch 82 | Step 103800] train/loss: 0.3738, train/lr: 0.0002 |
| [[34m2026-03-01 04:40:36[0m] [Epoch 82] epoch/loss: 0.3730 |
| [[34m2026-03-01 04:40:55[0m] [Epoch 83 | Step 103900] train/loss: 0.3685, train/lr: 0.0002 |
| [[34m2026-03-01 04:41:19[0m] [Epoch 83 | Step 104000] train/loss: 0.3716, train/lr: 0.0002 |
| [[34m2026-03-01 04:41:43[0m] [Epoch 83 | Step 104100] train/loss: 0.3719, train/lr: 0.0002 |
| [[34m2026-03-01 04:42:07[0m] [Epoch 83 | Step 104200] train/loss: 0.3733, train/lr: 0.0002 |
| [[34m2026-03-01 04:42:31[0m] [Epoch 83 | Step 104300] train/loss: 0.3729, train/lr: 0.0002 |
| [[34m2026-03-01 04:42:55[0m] [Epoch 83 | Step 104400] train/loss: 0.3710, train/lr: 0.0002 |
| [[34m2026-03-01 04:43:19[0m] [Epoch 83 | Step 104500] train/loss: 0.3714, train/lr: 0.0002 |
| [[34m2026-03-01 04:43:43[0m] [Epoch 83 | Step 104600] train/loss: 0.3701, train/lr: 0.0002 |
| [[34m2026-03-01 04:44:07[0m] [Epoch 83 | Step 104700] train/loss: 0.3700, train/lr: 0.0002 |
| [[34m2026-03-01 04:44:32[0m] [Epoch 83 | Step 104800] train/loss: 0.3713, train/lr: 0.0002 |
| [[34m2026-03-01 04:44:56[0m] [Epoch 83 | Step 104900] train/loss: 0.3703, train/lr: 0.0002 |
| [[34m2026-03-01 04:45:20[0m] [Epoch 83 | Step 105000] train/loss: 0.3713, train/lr: 0.0002 |
| [[34m2026-03-01 04:45:40[0m] [Epoch 83] epoch/loss: 0.3712 |
| [[34m2026-03-01 04:45:47[0m] [Epoch 84 | Step 105100] train/loss: 0.3728, train/lr: 0.0002 |
| [[34m2026-03-01 04:46:11[0m] [Epoch 84 | Step 105200] train/loss: 0.3707, train/lr: 0.0002 |
| [[34m2026-03-01 04:46:35[0m] [Epoch 84 | Step 105300] train/loss: 0.3714, train/lr: 0.0002 |
| [[34m2026-03-01 04:47:01[0m] [Epoch 84 | Step 105400] train/loss: 0.3710, train/lr: 0.0002 |
| [[34m2026-03-01 04:47:27[0m] [Epoch 84 | Step 105500] train/loss: 0.3721, train/lr: 0.0002 |
| [[34m2026-03-01 04:47:51[0m] [Epoch 84 | Step 105600] train/loss: 0.3684, train/lr: 0.0002 |
| [[34m2026-03-01 04:48:18[0m] [Epoch 84 | Step 105700] train/loss: 0.3705, train/lr: 0.0002 |
| [[34m2026-03-01 04:48:43[0m] [Epoch 84 | Step 105800] train/loss: 0.3742, train/lr: 0.0002 |
| [[34m2026-03-01 04:49:12[0m] [Epoch 84 | Step 105900] train/loss: 0.3730, train/lr: 0.0002 |
| [[34m2026-03-01 04:49:37[0m] [Epoch 84 | Step 106000] train/loss: 0.3714, train/lr: 0.0002 |
| [[34m2026-03-01 04:50:02[0m] [Epoch 84 | Step 106100] train/loss: 0.3722, train/lr: 0.0002 |
| [[34m2026-03-01 04:50:26[0m] [Epoch 84 | Step 106200] train/loss: 0.3700, train/lr: 0.0002 |
| [[34m2026-03-01 04:50:50[0m] [Epoch 84 | Step 106300] train/loss: 0.3726, train/lr: 0.0002 |
| [[34m2026-03-01 04:50:58[0m] [Epoch 84] epoch/loss: 0.3714 |
| [[34m2026-03-01 04:51:17[0m] [Epoch 85 | Step 106400] train/loss: 0.3697, train/lr: 0.0002 |
| [[34m2026-03-01 04:51:41[0m] [Epoch 85 | Step 106500] train/loss: 0.3686, train/lr: 0.0002 |
| [[34m2026-03-01 04:52:05[0m] [Epoch 85 | Step 106600] train/loss: 0.3711, train/lr: 0.0002 |
| [[34m2026-03-01 04:52:29[0m] [Epoch 85 | Step 106700] train/loss: 0.3724, train/lr: 0.0002 |
| [[34m2026-03-01 04:52:53[0m] [Epoch 85 | Step 106800] train/loss: 0.3722, train/lr: 0.0002 |
| [[34m2026-03-01 04:53:17[0m] [Epoch 85 | Step 106900] train/loss: 0.3690, train/lr: 0.0002 |
| [[34m2026-03-01 04:53:41[0m] [Epoch 85 | Step 107000] train/loss: 0.3731, train/lr: 0.0002 |
| [[34m2026-03-01 04:54:05[0m] [Epoch 85 | Step 107100] train/loss: 0.3731, train/lr: 0.0002 |
| [[34m2026-03-01 04:54:29[0m] [Epoch 85 | Step 107200] train/loss: 0.3746, train/lr: 0.0002 |
| [[34m2026-03-01 04:54:53[0m] [Epoch 85 | Step 107300] train/loss: 0.3705, train/lr: 0.0002 |
| [[34m2026-03-01 04:55:18[0m] [Epoch 85 | Step 107400] train/loss: 0.3723, train/lr: 0.0002 |
| [[34m2026-03-01 04:55:42[0m] [Epoch 85 | Step 107500] train/loss: 0.3688, train/lr: 0.0002 |
| [[34m2026-03-01 04:56:02[0m] [Epoch 85] epoch/loss: 0.3715 |
| [[34m2026-03-01 04:56:08[0m] [Epoch 86 | Step 107600] train/loss: 0.3747, train/lr: 0.0002 |
| [[34m2026-03-01 04:56:33[0m] [Epoch 86 | Step 107700] train/loss: 0.3719, train/lr: 0.0002 |
| [[34m2026-03-01 04:56:59[0m] [Epoch 86 | Step 107800] train/loss: 0.3673, train/lr: 0.0002 |
| [[34m2026-03-01 04:57:24[0m] [Epoch 86 | Step 107900] train/loss: 0.3733, train/lr: 0.0002 |
| [[34m2026-03-01 04:57:51[0m] [Epoch 86 | Step 108000] train/loss: 0.3698, train/lr: 0.0002 |
| [[34m2026-03-01 04:58:16[0m] [Epoch 86 | Step 108100] train/loss: 0.3721, train/lr: 0.0002 |
| [[34m2026-03-01 04:58:43[0m] [Epoch 86 | Step 108200] train/loss: 0.3707, train/lr: 0.0002 |
| [[34m2026-03-01 04:59:10[0m] [Epoch 86 | Step 108300] train/loss: 0.3700, train/lr: 0.0002 |
| [[34m2026-03-01 04:59:34[0m] [Epoch 86 | Step 108400] train/loss: 0.3736, train/lr: 0.0002 |
| [[34m2026-03-01 04:59:59[0m] [Epoch 86 | Step 108500] train/loss: 0.3681, train/lr: 0.0002 |
| [[34m2026-03-01 05:00:23[0m] [Epoch 86 | Step 108600] train/loss: 0.3733, train/lr: 0.0002 |
| [[34m2026-03-01 05:00:47[0m] [Epoch 86 | Step 108700] train/loss: 0.3682, train/lr: 0.0002 |
| [[34m2026-03-01 05:01:12[0m] [Epoch 86 | Step 108800] train/loss: 0.3735, train/lr: 0.0002 |
| [[34m2026-03-01 05:01:20[0m] [Epoch 86] epoch/loss: 0.3711 |
| [[34m2026-03-01 05:01:38[0m] [Epoch 87 | Step 108900] train/loss: 0.3709, train/lr: 0.0002 |
| [[34m2026-03-01 05:02:03[0m] [Epoch 87 | Step 109000] train/loss: 0.3708, train/lr: 0.0002 |
| [[34m2026-03-01 05:02:27[0m] [Epoch 87 | Step 109100] train/loss: 0.3698, train/lr: 0.0002 |
| [[34m2026-03-01 05:02:51[0m] [Epoch 87 | Step 109200] train/loss: 0.3727, train/lr: 0.0002 |
| [[34m2026-03-01 05:03:15[0m] [Epoch 87 | Step 109300] train/loss: 0.3693, train/lr: 0.0002 |
| [[34m2026-03-01 05:03:39[0m] [Epoch 87 | Step 109400] train/loss: 0.3752, train/lr: 0.0002 |
| [[34m2026-03-01 05:04:03[0m] [Epoch 87 | Step 109500] train/loss: 0.3679, train/lr: 0.0002 |
| [[34m2026-03-01 05:04:27[0m] [Epoch 87 | Step 109600] train/loss: 0.3670, train/lr: 0.0002 |
| [[34m2026-03-01 05:04:51[0m] [Epoch 87 | Step 109700] train/loss: 0.3687, train/lr: 0.0002 |
| [[34m2026-03-01 05:05:16[0m] [Epoch 87 | Step 109800] train/loss: 0.3695, train/lr: 0.0002 |
| [[34m2026-03-01 05:05:40[0m] [Epoch 87 | Step 109900] train/loss: 0.3679, train/lr: 0.0002 |
| [[34m2026-03-01 05:06:04[0m] [Epoch 87 | Step 110000] train/loss: 0.3690, train/lr: 0.0002 |
| [[34m2026-03-01 05:06:04[0m] Generating EMA samples... |
| [[34m2026-03-01 05:06:05[0m] Generating EMA samples done. |
| [[34m2026-03-01 05:06:28[0m] [Epoch 87] epoch/loss: 0.3699 |
| [[34m2026-03-01 05:06:33[0m] [Epoch 88 | Step 110100] train/loss: 0.3711, train/lr: 0.0002 |
| [[34m2026-03-01 05:06:59[0m] [Epoch 88 | Step 110200] train/loss: 0.3704, train/lr: 0.0002 |
| [[34m2026-03-01 05:07:24[0m] [Epoch 88 | Step 110300] train/loss: 0.3738, train/lr: 0.0002 |
| [[34m2026-03-01 05:07:51[0m] [Epoch 88 | Step 110400] train/loss: 0.3695, train/lr: 0.0002 |
| [[34m2026-03-01 05:08:16[0m] [Epoch 88 | Step 110500] train/loss: 0.3731, train/lr: 0.0002 |
| [[34m2026-03-01 05:08:43[0m] [Epoch 88 | Step 110600] train/loss: 0.3741, train/lr: 0.0002 |
| [[34m2026-03-01 05:09:08[0m] [Epoch 88 | Step 110700] train/loss: 0.3719, train/lr: 0.0002 |
| [[34m2026-03-01 05:09:33[0m] [Epoch 88 | Step 110800] train/loss: 0.3690, train/lr: 0.0002 |
| [[34m2026-03-01 05:09:57[0m] [Epoch 88 | Step 110900] train/loss: 0.3711, train/lr: 0.0002 |
| [[34m2026-03-01 05:10:22[0m] [Epoch 88 | Step 111000] train/loss: 0.3706, train/lr: 0.0002 |
| [[34m2026-03-01 05:10:46[0m] [Epoch 88 | Step 111100] train/loss: 0.3694, train/lr: 0.0002 |
| [[34m2026-03-01 05:11:10[0m] [Epoch 88 | Step 111200] train/loss: 0.3711, train/lr: 0.0002 |
| [[34m2026-03-01 05:11:34[0m] [Epoch 88 | Step 111300] train/loss: 0.3715, train/lr: 0.0002 |
| [[34m2026-03-01 05:11:43[0m] [Epoch 88] epoch/loss: 0.3712 |
| [[34m2026-03-01 05:12:01[0m] [Epoch 89 | Step 111400] train/loss: 0.3681, train/lr: 0.0002 |
| [[34m2026-03-01 05:12:25[0m] [Epoch 89 | Step 111500] train/loss: 0.3685, train/lr: 0.0002 |
| [[34m2026-03-01 05:12:49[0m] [Epoch 89 | Step 111600] train/loss: 0.3727, train/lr: 0.0002 |
| [[34m2026-03-01 05:13:13[0m] [Epoch 89 | Step 111700] train/loss: 0.3706, train/lr: 0.0002 |
| [[34m2026-03-01 05:13:37[0m] [Epoch 89 | Step 111800] train/loss: 0.3730, train/lr: 0.0002 |
| [[34m2026-03-01 05:14:01[0m] [Epoch 89 | Step 111900] train/loss: 0.3721, train/lr: 0.0002 |
| [[34m2026-03-01 05:14:25[0m] [Epoch 89 | Step 112000] train/loss: 0.3708, train/lr: 0.0002 |
| [[34m2026-03-01 05:14:49[0m] [Epoch 89 | Step 112100] train/loss: 0.3708, train/lr: 0.0002 |
| [[34m2026-03-01 05:15:14[0m] [Epoch 89 | Step 112200] train/loss: 0.3691, train/lr: 0.0002 |
| [[34m2026-03-01 05:15:38[0m] [Epoch 89 | Step 112300] train/loss: 0.3702, train/lr: 0.0002 |
| [[34m2026-03-01 05:16:03[0m] [Epoch 89 | Step 112400] train/loss: 0.3740, train/lr: 0.0002 |
| [[34m2026-03-01 05:16:28[0m] [Epoch 89 | Step 112500] train/loss: 0.3743, train/lr: 0.0002 |
| [[34m2026-03-01 05:16:51[0m] [Epoch 89] epoch/loss: 0.3713 |
| [[34m2026-03-01 05:16:51[0m] Saving checkpoint at epoch 90... |
| [[34m2026-03-01 05:17:05[0m] [Epoch 90 | Step 112600] train/loss: 0.3713, train/lr: 0.0002 |
| [[34m2026-03-01 05:17:31[0m] [Epoch 90 | Step 112700] train/loss: 0.3706, train/lr: 0.0002 |
| [[34m2026-03-01 05:17:57[0m] [Epoch 90 | Step 112800] train/loss: 0.3693, train/lr: 0.0002 |
| [[34m2026-03-01 05:18:23[0m] [Epoch 90 | Step 112900] train/loss: 0.3710, train/lr: 0.0002 |
| [[34m2026-03-01 05:18:51[0m] [Epoch 90 | Step 113000] train/loss: 0.3693, train/lr: 0.0002 |
| [[34m2026-03-01 05:19:15[0m] [Epoch 90 | Step 113100] train/loss: 0.3713, train/lr: 0.0002 |
| [[34m2026-03-01 05:19:40[0m] [Epoch 90 | Step 113200] train/loss: 0.3723, train/lr: 0.0002 |
| [[34m2026-03-01 05:20:04[0m] [Epoch 90 | Step 113300] train/loss: 0.3696, train/lr: 0.0002 |
| [[34m2026-03-01 05:20:28[0m] [Epoch 90 | Step 113400] train/loss: 0.3668, train/lr: 0.0002 |
| [[34m2026-03-01 05:20:52[0m] [Epoch 90 | Step 113500] train/loss: 0.3709, train/lr: 0.0002 |
| [[34m2026-03-01 05:21:16[0m] [Epoch 90 | Step 113600] train/loss: 0.3701, train/lr: 0.0002 |
| [[34m2026-03-01 05:21:40[0m] [Epoch 90 | Step 113700] train/loss: 0.3687, train/lr: 0.0002 |
| [[34m2026-03-01 05:22:04[0m] [Epoch 90 | Step 113800] train/loss: 0.3694, train/lr: 0.0002 |
| [[34m2026-03-01 05:22:14[0m] [Epoch 90] epoch/loss: 0.3699 |
| [[34m2026-03-01 05:22:31[0m] [Epoch 91 | Step 113900] train/loss: 0.3684, train/lr: 0.0002 |
| [[34m2026-03-01 05:22:55[0m] [Epoch 91 | Step 114000] train/loss: 0.3711, train/lr: 0.0002 |
| [[34m2026-03-01 05:23:19[0m] [Epoch 91 | Step 114100] train/loss: 0.3740, train/lr: 0.0002 |
| [[34m2026-03-01 05:23:43[0m] [Epoch 91 | Step 114200] train/loss: 0.3713, train/lr: 0.0002 |
| [[34m2026-03-01 05:24:07[0m] [Epoch 91 | Step 114300] train/loss: 0.3689, train/lr: 0.0002 |
| [[34m2026-03-01 05:24:31[0m] [Epoch 91 | Step 114400] train/loss: 0.3689, train/lr: 0.0002 |
| [[34m2026-03-01 05:24:55[0m] [Epoch 91 | Step 114500] train/loss: 0.3684, train/lr: 0.0002 |
| [[34m2026-03-01 05:25:19[0m] [Epoch 91 | Step 114600] train/loss: 0.3721, train/lr: 0.0002 |
| [[34m2026-03-01 05:25:43[0m] [Epoch 91 | Step 114700] train/loss: 0.3669, train/lr: 0.0002 |
| [[34m2026-03-01 05:26:09[0m] [Epoch 91 | Step 114800] train/loss: 0.3669, train/lr: 0.0002 |
| [[34m2026-03-01 05:26:35[0m] [Epoch 91 | Step 114900] train/loss: 0.3717, train/lr: 0.0002 |
| [[34m2026-03-01 05:27:00[0m] [Epoch 91 | Step 115000] train/loss: 0.3706, train/lr: 0.0002 |
| [[34m2026-03-01 05:27:23[0m] [Epoch 91] epoch/loss: 0.3699 |
| [[34m2026-03-01 05:27:30[0m] [Epoch 92 | Step 115100] train/loss: 0.3694, train/lr: 0.0002 |
| [[34m2026-03-01 05:27:54[0m] [Epoch 92 | Step 115200] train/loss: 0.3699, train/lr: 0.0002 |
| [[34m2026-03-01 05:28:21[0m] [Epoch 92 | Step 115300] train/loss: 0.3717, train/lr: 0.0002 |
| [[34m2026-03-01 05:28:48[0m] [Epoch 92 | Step 115400] train/loss: 0.3709, train/lr: 0.0002 |
| [[34m2026-03-01 05:29:13[0m] [Epoch 92 | Step 115500] train/loss: 0.3720, train/lr: 0.0002 |
| [[34m2026-03-01 05:29:37[0m] [Epoch 92 | Step 115600] train/loss: 0.3689, train/lr: 0.0002 |
| [[34m2026-03-01 05:30:01[0m] [Epoch 92 | Step 115700] train/loss: 0.3705, train/lr: 0.0002 |
| [[34m2026-03-01 05:30:25[0m] [Epoch 92 | Step 115800] train/loss: 0.3735, train/lr: 0.0002 |
| [[34m2026-03-01 05:30:49[0m] [Epoch 92 | Step 115900] train/loss: 0.3734, train/lr: 0.0002 |
| [[34m2026-03-01 05:31:14[0m] [Epoch 92 | Step 116000] train/loss: 0.3683, train/lr: 0.0002 |
| [[34m2026-03-01 05:31:38[0m] [Epoch 92 | Step 116100] train/loss: 0.3696, train/lr: 0.0002 |
| [[34m2026-03-01 05:32:02[0m] [Epoch 92 | Step 116200] train/loss: 0.3709, train/lr: 0.0002 |
| [[34m2026-03-01 05:32:26[0m] [Epoch 92 | Step 116300] train/loss: 0.3696, train/lr: 0.0002 |
| [[34m2026-03-01 05:32:36[0m] [Epoch 92] epoch/loss: 0.3705 |
| [[34m2026-03-01 05:32:53[0m] [Epoch 93 | Step 116400] train/loss: 0.3698, train/lr: 0.0002 |
| [[34m2026-03-01 05:33:17[0m] [Epoch 93 | Step 116500] train/loss: 0.3692, train/lr: 0.0002 |
| [[34m2026-03-01 05:33:41[0m] [Epoch 93 | Step 116600] train/loss: 0.3696, train/lr: 0.0002 |
| [[34m2026-03-01 05:34:05[0m] [Epoch 93 | Step 116700] train/loss: 0.3696, train/lr: 0.0002 |
| [[34m2026-03-01 05:34:29[0m] [Epoch 93 | Step 116800] train/loss: 0.3712, train/lr: 0.0002 |
| [[34m2026-03-01 05:34:54[0m] [Epoch 93 | Step 116900] train/loss: 0.3671, train/lr: 0.0002 |
| [[34m2026-03-01 05:35:18[0m] [Epoch 93 | Step 117000] train/loss: 0.3670, train/lr: 0.0002 |
| [[34m2026-03-01 05:35:43[0m] [Epoch 93 | Step 117100] train/loss: 0.3676, train/lr: 0.0002 |
| [[34m2026-03-01 05:36:08[0m] [Epoch 93 | Step 117200] train/loss: 0.3729, train/lr: 0.0002 |
| [[34m2026-03-01 05:36:33[0m] [Epoch 93 | Step 117300] train/loss: 0.3687, train/lr: 0.0002 |
| [[34m2026-03-01 05:36:58[0m] [Epoch 93 | Step 117400] train/loss: 0.3700, train/lr: 0.0002 |
| [[34m2026-03-01 05:37:25[0m] [Epoch 93 | Step 117500] train/loss: 0.3705, train/lr: 0.0002 |
| [[34m2026-03-01 05:37:50[0m] [Epoch 93] epoch/loss: 0.3697 |
| [[34m2026-03-01 05:37:54[0m] [Epoch 94 | Step 117600] train/loss: 0.3720, train/lr: 0.0002 |
| [[34m2026-03-01 05:38:21[0m] [Epoch 94 | Step 117700] train/loss: 0.3680, train/lr: 0.0002 |
| [[34m2026-03-01 05:38:46[0m] [Epoch 94 | Step 117800] train/loss: 0.3712, train/lr: 0.0002 |
| [[34m2026-03-01 05:39:12[0m] [Epoch 94 | Step 117900] train/loss: 0.3676, train/lr: 0.0002 |
| [[34m2026-03-01 05:39:36[0m] [Epoch 94 | Step 118000] train/loss: 0.3702, train/lr: 0.0002 |
| [[34m2026-03-01 05:40:00[0m] [Epoch 94 | Step 118100] train/loss: 0.3683, train/lr: 0.0002 |
| [[34m2026-03-01 05:40:24[0m] [Epoch 94 | Step 118200] train/loss: 0.3716, train/lr: 0.0002 |
| [[34m2026-03-01 05:40:48[0m] [Epoch 94 | Step 118300] train/loss: 0.3681, train/lr: 0.0002 |
| [[34m2026-03-01 05:41:12[0m] [Epoch 94 | Step 118400] train/loss: 0.3677, train/lr: 0.0002 |
| [[34m2026-03-01 05:41:36[0m] [Epoch 94 | Step 118500] train/loss: 0.3684, train/lr: 0.0002 |
| [[34m2026-03-01 05:42:00[0m] [Epoch 94 | Step 118600] train/loss: 0.3690, train/lr: 0.0002 |
| [[34m2026-03-01 05:42:24[0m] [Epoch 94 | Step 118700] train/loss: 0.3705, train/lr: 0.0002 |
| [[34m2026-03-01 05:42:48[0m] [Epoch 94 | Step 118800] train/loss: 0.3684, train/lr: 0.0002 |
| [[34m2026-03-01 05:42:59[0m] [Epoch 94] epoch/loss: 0.3693 |
| [[34m2026-03-01 05:43:15[0m] [Epoch 95 | Step 118900] train/loss: 0.3687, train/lr: 0.0002 |
| [[34m2026-03-01 05:43:39[0m] [Epoch 95 | Step 119000] train/loss: 0.3715, train/lr: 0.0002 |
| [[34m2026-03-01 05:44:03[0m] [Epoch 95 | Step 119100] train/loss: 0.3681, train/lr: 0.0002 |
| [[34m2026-03-01 05:44:27[0m] [Epoch 95 | Step 119200] train/loss: 0.3746, train/lr: 0.0002 |
| [[34m2026-03-01 05:44:51[0m] [Epoch 95 | Step 119300] train/loss: 0.3683, train/lr: 0.0002 |
| [[34m2026-03-01 05:45:16[0m] [Epoch 95 | Step 119400] train/loss: 0.3693, train/lr: 0.0002 |
| [[34m2026-03-01 05:45:41[0m] [Epoch 95 | Step 119500] train/loss: 0.3678, train/lr: 0.0002 |
| [[34m2026-03-01 05:46:07[0m] [Epoch 95 | Step 119600] train/loss: 0.3688, train/lr: 0.0002 |
| [[34m2026-03-01 05:46:32[0m] [Epoch 95 | Step 119700] train/loss: 0.3683, train/lr: 0.0002 |
| [[34m2026-03-01 05:46:57[0m] [Epoch 95 | Step 119800] train/loss: 0.3688, train/lr: 0.0002 |
| [[34m2026-03-01 05:47:24[0m] [Epoch 95 | Step 119900] train/loss: 0.3696, train/lr: 0.0002 |
| [[34m2026-03-01 05:47:50[0m] [Epoch 95 | Step 120000] train/loss: 0.3696, train/lr: 0.0002 |
| [[34m2026-03-01 05:47:50[0m] Generating EMA samples... |
| [[34m2026-03-01 05:47:51[0m] Generating EMA samples done. |
| [[34m2026-03-01 05:48:17[0m] [Epoch 95] epoch/loss: 0.3695 |
| [[34m2026-03-01 05:48:20[0m] [Epoch 96 | Step 120100] train/loss: 0.3699, train/lr: 0.0002 |
| [[34m2026-03-01 05:48:45[0m] [Epoch 96 | Step 120200] train/loss: 0.3667, train/lr: 0.0002 |
| [[34m2026-03-01 05:49:09[0m] [Epoch 96 | Step 120300] train/loss: 0.3700, train/lr: 0.0002 |
| [[34m2026-03-01 05:49:34[0m] [Epoch 96 | Step 120400] train/loss: 0.3682, train/lr: 0.0002 |
| [[34m2026-03-01 05:49:58[0m] [Epoch 96 | Step 120500] train/loss: 0.3674, train/lr: 0.0002 |
| [[34m2026-03-01 05:50:22[0m] [Epoch 96 | Step 120600] train/loss: 0.3683, train/lr: 0.0002 |
| [[34m2026-03-01 05:50:46[0m] [Epoch 96 | Step 120700] train/loss: 0.3694, train/lr: 0.0002 |
| [[34m2026-03-01 05:51:10[0m] [Epoch 96 | Step 120800] train/loss: 0.3680, train/lr: 0.0002 |
| [[34m2026-03-01 05:51:34[0m] [Epoch 96 | Step 120900] train/loss: 0.3675, train/lr: 0.0002 |
| [[34m2026-03-01 05:51:58[0m] [Epoch 96 | Step 121000] train/loss: 0.3687, train/lr: 0.0002 |
| [[34m2026-03-01 05:52:22[0m] [Epoch 96 | Step 121100] train/loss: 0.3722, train/lr: 0.0002 |
| [[34m2026-03-01 05:52:46[0m] [Epoch 96 | Step 121200] train/loss: 0.3705, train/lr: 0.0002 |
| [[34m2026-03-01 05:53:11[0m] [Epoch 96 | Step 121300] train/loss: 0.3708, train/lr: 0.0002 |
| [[34m2026-03-01 05:53:22[0m] [Epoch 96] epoch/loss: 0.3691 |
| [[34m2026-03-01 05:53:37[0m] [Epoch 97 | Step 121400] train/loss: 0.3708, train/lr: 0.0002 |
| [[34m2026-03-01 05:54:02[0m] [Epoch 97 | Step 121500] train/loss: 0.3689, train/lr: 0.0002 |
| [[34m2026-03-01 05:54:26[0m] [Epoch 97 | Step 121600] train/loss: 0.3668, train/lr: 0.0002 |
| [[34m2026-03-01 05:54:50[0m] [Epoch 97 | Step 121700] train/loss: 0.3636, train/lr: 0.0002 |
| [[34m2026-03-01 05:55:15[0m] [Epoch 97 | Step 121800] train/loss: 0.3690, train/lr: 0.0002 |
| [[34m2026-03-01 05:55:40[0m] [Epoch 97 | Step 121900] train/loss: 0.3681, train/lr: 0.0002 |
| [[34m2026-03-01 05:56:06[0m] [Epoch 97 | Step 122000] train/loss: 0.3658, train/lr: 0.0002 |
| [[34m2026-03-01 05:56:31[0m] [Epoch 97 | Step 122100] train/loss: 0.3665, train/lr: 0.0002 |
| [[34m2026-03-01 05:56:57[0m] [Epoch 97 | Step 122200] train/loss: 0.3685, train/lr: 0.0002 |
| [[34m2026-03-01 05:57:23[0m] [Epoch 97 | Step 122300] train/loss: 0.3665, train/lr: 0.0002 |
| [[34m2026-03-01 05:57:52[0m] [Epoch 97 | Step 122400] train/loss: 0.3695, train/lr: 0.0002 |
| [[34m2026-03-01 05:58:16[0m] [Epoch 97 | Step 122500] train/loss: 0.3710, train/lr: 0.0002 |
| [[34m2026-03-01 05:58:40[0m] [Epoch 97] epoch/loss: 0.3679 |
| [[34m2026-03-01 05:58:44[0m] [Epoch 98 | Step 122600] train/loss: 0.3700, train/lr: 0.0002 |
| [[34m2026-03-01 05:59:08[0m] [Epoch 98 | Step 122700] train/loss: 0.3713, train/lr: 0.0002 |
| [[34m2026-03-01 05:59:32[0m] [Epoch 98 | Step 122800] train/loss: 0.3671, train/lr: 0.0002 |
| [[34m2026-03-01 05:59:56[0m] [Epoch 98 | Step 122900] train/loss: 0.3669, train/lr: 0.0002 |
| [[34m2026-03-01 06:00:20[0m] [Epoch 98 | Step 123000] train/loss: 0.3662, train/lr: 0.0002 |
| [[34m2026-03-01 06:00:44[0m] [Epoch 98 | Step 123100] train/loss: 0.3709, train/lr: 0.0002 |
| [[34m2026-03-01 06:01:08[0m] [Epoch 98 | Step 123200] train/loss: 0.3669, train/lr: 0.0002 |
| [[34m2026-03-01 06:01:32[0m] [Epoch 98 | Step 123300] train/loss: 0.3684, train/lr: 0.0002 |
| [[34m2026-03-01 06:01:56[0m] [Epoch 98 | Step 123400] train/loss: 0.3666, train/lr: 0.0002 |
| [[34m2026-03-01 06:02:21[0m] [Epoch 98 | Step 123500] train/loss: 0.3687, train/lr: 0.0002 |
| [[34m2026-03-01 06:02:45[0m] [Epoch 98 | Step 123600] train/loss: 0.3706, train/lr: 0.0002 |
| [[34m2026-03-01 06:03:09[0m] [Epoch 98 | Step 123700] train/loss: 0.3680, train/lr: 0.0002 |
| [[34m2026-03-01 06:03:33[0m] [Epoch 98 | Step 123800] train/loss: 0.3679, train/lr: 0.0002 |
| [[34m2026-03-01 06:03:45[0m] [Epoch 98] epoch/loss: 0.3685 |
| [[34m2026-03-01 06:04:00[0m] [Epoch 99 | Step 123900] train/loss: 0.3683, train/lr: 0.0002 |
| [[34m2026-03-01 06:04:24[0m] [Epoch 99 | Step 124000] train/loss: 0.3663, train/lr: 0.0002 |
| [[34m2026-03-01 06:04:49[0m] [Epoch 99 | Step 124100] train/loss: 0.3676, train/lr: 0.0002 |
| [[34m2026-03-01 06:05:14[0m] [Epoch 99 | Step 124200] train/loss: 0.3704, train/lr: 0.0002 |
| [[34m2026-03-01 06:05:40[0m] [Epoch 99 | Step 124300] train/loss: 0.3701, train/lr: 0.0002 |
| [[34m2026-03-01 06:06:05[0m] [Epoch 99 | Step 124400] train/loss: 0.3665, train/lr: 0.0002 |
| [[34m2026-03-01 06:06:29[0m] [Epoch 99 | Step 124500] train/loss: 0.3678, train/lr: 0.0002 |
| [[34m2026-03-01 06:06:56[0m] [Epoch 99 | Step 124600] train/loss: 0.3712, train/lr: 0.0002 |
| [[34m2026-03-01 06:07:23[0m] [Epoch 99 | Step 124700] train/loss: 0.3688, train/lr: 0.0002 |
| [[34m2026-03-01 06:07:50[0m] [Epoch 99 | Step 124800] train/loss: 0.3668, train/lr: 0.0002 |
| [[34m2026-03-01 06:08:14[0m] [Epoch 99 | Step 124900] train/loss: 0.3699, train/lr: 0.0002 |
| [[34m2026-03-01 06:08:39[0m] [Epoch 99 | Step 125000] train/loss: 0.3716, train/lr: 0.0002 |
| [[34m2026-03-01 06:09:03[0m] [Epoch 99] epoch/loss: 0.3687 |
| [[34m2026-03-01 06:09:03[0m] Saving checkpoint at epoch 100... |
| [2026-03-01 06:09:14] [Epoch 100 | Step 125100] train/loss: 0.3697, train/lr: 0.0002 |
| [2026-03-01 06:09:38] [Epoch 100 | Step 125200] train/loss: 0.3675, train/lr: 0.0002 |
| [2026-03-01 06:10:02] [Epoch 100 | Step 125300] train/loss: 0.3680, train/lr: 0.0002 |
| [2026-03-01 06:10:26] [Epoch 100 | Step 125400] train/loss: 0.3722, train/lr: 0.0002 |
| [2026-03-01 06:10:50] [Epoch 100 | Step 125500] train/loss: 0.3697, train/lr: 0.0002 |
| [2026-03-01 06:11:14] [Epoch 100 | Step 125600] train/loss: 0.3664, train/lr: 0.0002 |
| [2026-03-01 06:11:39] [Epoch 100 | Step 125700] train/loss: 0.3668, train/lr: 0.0002 |
| [2026-03-01 06:12:03] [Epoch 100 | Step 125800] train/loss: 0.3704, train/lr: 0.0002 |
| [2026-03-01 06:12:27] [Epoch 100 | Step 125900] train/loss: 0.3674, train/lr: 0.0002 |
| [2026-03-01 06:12:51] [Epoch 100 | Step 126000] train/loss: 0.3695, train/lr: 0.0002 |
| [2026-03-01 06:13:15] [Epoch 100 | Step 126100] train/loss: 0.3675, train/lr: 0.0002 |
| [2026-03-01 06:13:39] [Epoch 100 | Step 126200] train/loss: 0.3667, train/lr: 0.0002 |
| [2026-03-01 06:14:03] [Epoch 100 | Step 126300] train/loss: 0.3674, train/lr: 0.0002 |
| [2026-03-01 06:14:16] [Epoch 100] epoch/loss: 0.3681 |
| [2026-03-01 06:14:30] [Epoch 101 | Step 126400] train/loss: 0.3674, train/lr: 0.0002 |
| [2026-03-01 06:14:56] [Epoch 101 | Step 126500] train/loss: 0.3681, train/lr: 0.0002 |
| [2026-03-01 06:15:20] [Epoch 101 | Step 126600] train/loss: 0.3690, train/lr: 0.0002 |
| [2026-03-01 06:15:46] [Epoch 101 | Step 126700] train/loss: 0.3664, train/lr: 0.0002 |
| [2026-03-01 06:16:11] [Epoch 101 | Step 126800] train/loss: 0.3693, train/lr: 0.0002 |
| [2026-03-01 06:16:37] [Epoch 101 | Step 126900] train/loss: 0.3680, train/lr: 0.0002 |
| [2026-03-01 06:17:03] [Epoch 101 | Step 127000] train/loss: 0.3677, train/lr: 0.0002 |
| [2026-03-01 06:17:30] [Epoch 101 | Step 127100] train/loss: 0.3673, train/lr: 0.0002 |
| [2026-03-01 06:17:56] [Epoch 101 | Step 127200] train/loss: 0.3700, train/lr: 0.0002 |
| [2026-03-01 06:18:21] [Epoch 101 | Step 127300] train/loss: 0.3691, train/lr: 0.0002 |
| [2026-03-01 06:18:45] [Epoch 101 | Step 127400] train/loss: 0.3673, train/lr: 0.0002 |
| [2026-03-01 06:19:09] [Epoch 101 | Step 127500] train/loss: 0.3660, train/lr: 0.0002 |
| [2026-03-01 06:19:33] [Epoch 101 | Step 127600] train/loss: 0.3652, train/lr: 0.0002 |
| [2026-03-01 06:19:34] [Epoch 101] epoch/loss: 0.3679 |
| [2026-03-01 06:20:00] [Epoch 102 | Step 127700] train/loss: 0.3667, train/lr: 0.0002 |
| [2026-03-01 06:20:24] [Epoch 102 | Step 127800] train/loss: 0.3652, train/lr: 0.0002 |
| [2026-03-01 06:20:48] [Epoch 102 | Step 127900] train/loss: 0.3655, train/lr: 0.0002 |
| [2026-03-01 06:21:12] [Epoch 102 | Step 128000] train/loss: 0.3672, train/lr: 0.0002 |
| [2026-03-01 06:21:36] [Epoch 102 | Step 128100] train/loss: 0.3676, train/lr: 0.0002 |
| [2026-03-01 06:22:01] [Epoch 102 | Step 128200] train/loss: 0.3705, train/lr: 0.0002 |
| [2026-03-01 06:22:25] [Epoch 102 | Step 128300] train/loss: 0.3645, train/lr: 0.0002 |
| [2026-03-01 06:22:49] [Epoch 102 | Step 128400] train/loss: 0.3660, train/lr: 0.0002 |
| [2026-03-01 06:23:13] [Epoch 102 | Step 128500] train/loss: 0.3693, train/lr: 0.0002 |
| [2026-03-01 06:23:37] [Epoch 102 | Step 128600] train/loss: 0.3672, train/lr: 0.0002 |
| [2026-03-01 06:24:01] [Epoch 102 | Step 128700] train/loss: 0.3704, train/lr: 0.0002 |
| [2026-03-01 06:24:26] [Epoch 102 | Step 128800] train/loss: 0.3654, train/lr: 0.0002 |
| [2026-03-01 06:24:39] [Epoch 102] epoch/loss: 0.3672 |
| [2026-03-01 06:24:54] [Epoch 103 | Step 128900] train/loss: 0.3691, train/lr: 0.0002 |
| [2026-03-01 06:25:19] [Epoch 103 | Step 129000] train/loss: 0.3669, train/lr: 0.0002 |
| [2026-03-01 06:25:45] [Epoch 103 | Step 129100] train/loss: 0.3686, train/lr: 0.0002 |
| [2026-03-01 06:26:10] [Epoch 103 | Step 129200] train/loss: 0.3674, train/lr: 0.0002 |
| [2026-03-01 06:26:37] [Epoch 103 | Step 129300] train/loss: 0.3683, train/lr: 0.0002 |
| [2026-03-01 06:27:02] [Epoch 103 | Step 129400] train/loss: 0.3638, train/lr: 0.0002 |
| [2026-03-01 06:27:29] [Epoch 103 | Step 129500] train/loss: 0.3659, train/lr: 0.0002 |
| [2026-03-01 06:27:54] [Epoch 103 | Step 129600] train/loss: 0.3661, train/lr: 0.0002 |
| [2026-03-01 06:28:19] [Epoch 103 | Step 129700] train/loss: 0.3697, train/lr: 0.0002 |
| [2026-03-01 06:28:43] [Epoch 103 | Step 129800] train/loss: 0.3687, train/lr: 0.0002 |
| [2026-03-01 06:29:07] [Epoch 103 | Step 129900] train/loss: 0.3697, train/lr: 0.0002 |
| [2026-03-01 06:29:31] [Epoch 103 | Step 130000] train/loss: 0.3692, train/lr: 0.0002 |
| [2026-03-01 06:29:31] Generating EMA samples... |
| [2026-03-01 06:29:32] Generating EMA samples done. |
| [2026-03-01 06:29:56] [Epoch 103 | Step 130100] train/loss: 0.3668, train/lr: 0.0002 |
| [2026-03-01 06:29:57] [Epoch 103] epoch/loss: 0.3677 |
| [2026-03-01 06:30:23] [Epoch 104 | Step 130200] train/loss: 0.3688, train/lr: 0.0002 |
| [2026-03-01 06:30:47] [Epoch 104 | Step 130300] train/loss: 0.3671, train/lr: 0.0002 |
| [2026-03-01 06:31:11] [Epoch 104 | Step 130400] train/loss: 0.3655, train/lr: 0.0002 |
| [2026-03-01 06:31:36] [Epoch 104 | Step 130500] train/loss: 0.3662, train/lr: 0.0002 |
| [2026-03-01 06:32:00] [Epoch 104 | Step 130600] train/loss: 0.3671, train/lr: 0.0002 |
| [2026-03-01 06:32:24] [Epoch 104 | Step 130700] train/loss: 0.3682, train/lr: 0.0002 |
| [2026-03-01 06:32:48] [Epoch 104 | Step 130800] train/loss: 0.3700, train/lr: 0.0002 |
| [2026-03-01 06:33:12] [Epoch 104 | Step 130900] train/loss: 0.3717, train/lr: 0.0002 |
| [2026-03-01 06:33:36] [Epoch 104 | Step 131000] train/loss: 0.3695, train/lr: 0.0002 |
| [2026-03-01 06:34:00] [Epoch 104 | Step 131100] train/loss: 0.3667, train/lr: 0.0002 |
| [2026-03-01 06:34:26] [Epoch 104 | Step 131200] train/loss: 0.3656, train/lr: 0.0002 |
| [2026-03-01 06:34:50] [Epoch 104 | Step 131300] train/loss: 0.3691, train/lr: 0.0002 |
| [2026-03-01 06:35:04] [Epoch 104] epoch/loss: 0.3679 |
| [2026-03-01 06:35:19] [Epoch 105 | Step 131400] train/loss: 0.3658, train/lr: 0.0002 |
| [2026-03-01 06:35:44] [Epoch 105 | Step 131500] train/loss: 0.3667, train/lr: 0.0002 |
| [2026-03-01 06:36:10] [Epoch 105 | Step 131600] train/loss: 0.3706, train/lr: 0.0002 |
| [2026-03-01 06:36:36] [Epoch 105 | Step 131700] train/loss: 0.3684, train/lr: 0.0002 |
| [2026-03-01 06:37:04] [Epoch 105 | Step 131800] train/loss: 0.3705, train/lr: 0.0002 |
| [2026-03-01 06:37:29] [Epoch 105 | Step 131900] train/loss: 0.3693, train/lr: 0.0002 |
| [2026-03-01 06:37:54] [Epoch 105 | Step 132000] train/loss: 0.3676, train/lr: 0.0002 |
| [2026-03-01 06:38:18] [Epoch 105 | Step 132100] train/loss: 0.3704, train/lr: 0.0002 |
| [2026-03-01 06:38:42] [Epoch 105 | Step 132200] train/loss: 0.3631, train/lr: 0.0002 |
| [2026-03-01 06:39:06] [Epoch 105 | Step 132300] train/loss: 0.3681, train/lr: 0.0002 |
| [2026-03-01 06:39:30] [Epoch 105 | Step 132400] train/loss: 0.3647, train/lr: 0.0002 |
| [2026-03-01 06:39:54] [Epoch 105 | Step 132500] train/loss: 0.3694, train/lr: 0.0002 |
| [2026-03-01 06:40:18] [Epoch 105 | Step 132600] train/loss: 0.3696, train/lr: 0.0002 |
| [2026-03-01 06:40:20] [Epoch 105] epoch/loss: 0.3680 |
| [2026-03-01 06:40:45] [Epoch 106 | Step 132700] train/loss: 0.3665, train/lr: 0.0002 |
| [2026-03-01 06:41:09] [Epoch 106 | Step 132800] train/loss: 0.3673, train/lr: 0.0002 |
| [2026-03-01 06:41:33] [Epoch 106 | Step 132900] train/loss: 0.3646, train/lr: 0.0002 |
| [2026-03-01 06:41:58] [Epoch 106 | Step 133000] train/loss: 0.3692, train/lr: 0.0002 |
| [2026-03-01 06:42:22] [Epoch 106 | Step 133100] train/loss: 0.3664, train/lr: 0.0002 |
| [2026-03-01 06:42:46] [Epoch 106 | Step 133200] train/loss: 0.3646, train/lr: 0.0002 |
| [2026-03-01 06:43:10] [Epoch 106 | Step 133300] train/loss: 0.3636, train/lr: 0.0002 |
| [2026-03-01 06:43:34] [Epoch 106 | Step 133400] train/loss: 0.3635, train/lr: 0.0002 |
| [2026-03-01 06:43:59] [Epoch 106 | Step 133500] train/loss: 0.3663, train/lr: 0.0002 |
| [2026-03-01 06:44:24] [Epoch 106 | Step 133600] train/loss: 0.3671, train/lr: 0.0002 |
| [2026-03-01 06:44:49] [Epoch 106 | Step 133700] train/loss: 0.3674, train/lr: 0.0002 |
| [2026-03-01 06:45:15] [Epoch 106 | Step 133800] train/loss: 0.3661, train/lr: 0.0002 |
| [2026-03-01 06:45:28] [Epoch 106] epoch/loss: 0.3659 |
| [2026-03-01 06:45:42] [Epoch 107 | Step 133900] train/loss: 0.3665, train/lr: 0.0002 |
| [2026-03-01 06:46:09] [Epoch 107 | Step 134000] train/loss: 0.3711, train/lr: 0.0002 |
| [2026-03-01 06:46:35] [Epoch 107 | Step 134100] train/loss: 0.3662, train/lr: 0.0002 |
| [2026-03-01 06:47:02] [Epoch 107 | Step 134200] train/loss: 0.3663, train/lr: 0.0002 |
| [2026-03-01 06:47:27] [Epoch 107 | Step 134300] train/loss: 0.3672, train/lr: 0.0002 |
| [2026-03-01 06:47:52] [Epoch 107 | Step 134400] train/loss: 0.3649, train/lr: 0.0002 |
| [2026-03-01 06:48:16] [Epoch 107 | Step 134500] train/loss: 0.3657, train/lr: 0.0002 |
| [2026-03-01 06:48:40] [Epoch 107 | Step 134600] train/loss: 0.3689, train/lr: 0.0002 |
| [2026-03-01 06:49:04] [Epoch 107 | Step 134700] train/loss: 0.3657, train/lr: 0.0002 |
| [2026-03-01 06:49:28] [Epoch 107 | Step 134800] train/loss: 0.3698, train/lr: 0.0002 |
| [2026-03-01 06:49:52] [Epoch 107 | Step 134900] train/loss: 0.3671, train/lr: 0.0002 |
| [2026-03-01 06:50:16] [Epoch 107 | Step 135000] train/loss: 0.3658, train/lr: 0.0002 |
| [2026-03-01 06:50:40] [Epoch 107 | Step 135100] train/loss: 0.3664, train/lr: 0.0002 |
| [2026-03-01 06:50:42] [Epoch 107] epoch/loss: 0.3672 |
| [2026-03-01 06:51:07] [Epoch 108 | Step 135200] train/loss: 0.3693, train/lr: 0.0002 |
| [2026-03-01 06:51:31] [Epoch 108 | Step 135300] train/loss: 0.3653, train/lr: 0.0002 |
| [2026-03-01 06:51:55] [Epoch 108 | Step 135400] train/loss: 0.3625, train/lr: 0.0002 |
| [2026-03-01 06:52:19] [Epoch 108 | Step 135500] train/loss: 0.3683, train/lr: 0.0002 |
| [2026-03-01 06:52:43] [Epoch 108 | Step 135600] train/loss: 0.3638, train/lr: 0.0002 |
| [2026-03-01 06:53:07] [Epoch 108 | Step 135700] train/loss: 0.3667, train/lr: 0.0002 |
| [2026-03-01 06:53:31] [Epoch 108 | Step 135800] train/loss: 0.3658, train/lr: 0.0002 |
| [2026-03-01 06:53:57] [Epoch 108 | Step 135900] train/loss: 0.3697, train/lr: 0.0002 |
| [2026-03-01 06:54:21] [Epoch 108 | Step 136000] train/loss: 0.3659, train/lr: 0.0002 |
| [2026-03-01 06:54:47] [Epoch 108 | Step 136100] train/loss: 0.3658, train/lr: 0.0002 |
| [2026-03-01 06:55:11] [Epoch 108 | Step 136200] train/loss: 0.3648, train/lr: 0.0002 |
| [2026-03-01 06:55:40] [Epoch 108 | Step 136300] train/loss: 0.3660, train/lr: 0.0002 |
| [2026-03-01 06:55:55] [Epoch 108] epoch/loss: 0.3663 |
| [2026-03-01 06:56:08] [Epoch 109 | Step 136400] train/loss: 0.3664, train/lr: 0.0002 |
| [2026-03-01 06:56:34] [Epoch 109 | Step 136500] train/loss: 0.3672, train/lr: 0.0002 |
| [2026-03-01 06:57:00] [Epoch 109 | Step 136600] train/loss: 0.3651, train/lr: 0.0002 |
| [2026-03-01 06:57:25] [Epoch 109 | Step 136700] train/loss: 0.3654, train/lr: 0.0002 |
| [2026-03-01 06:57:49] [Epoch 109 | Step 136800] train/loss: 0.3664, train/lr: 0.0002 |
| [2026-03-01 06:58:13] [Epoch 109 | Step 136900] train/loss: 0.3677, train/lr: 0.0002 |
| [2026-03-01 06:58:38] [Epoch 109 | Step 137000] train/loss: 0.3672, train/lr: 0.0002 |
| [2026-03-01 06:59:02] [Epoch 109 | Step 137100] train/loss: 0.3640, train/lr: 0.0002 |
| [2026-03-01 06:59:26] [Epoch 109 | Step 137200] train/loss: 0.3673, train/lr: 0.0002 |
| [2026-03-01 06:59:50] [Epoch 109 | Step 137300] train/loss: 0.3680, train/lr: 0.0002 |
| [2026-03-01 07:00:14] [Epoch 109 | Step 137400] train/loss: 0.3630, train/lr: 0.0002 |
| [2026-03-01 07:00:38] [Epoch 109 | Step 137500] train/loss: 0.3689, train/lr: 0.0002 |
| [2026-03-01 07:01:02] [Epoch 109 | Step 137600] train/loss: 0.3672, train/lr: 0.0002 |
| [2026-03-01 07:01:05] [Epoch 109] epoch/loss: 0.3663 |
| [2026-03-01 07:01:05] Saving checkpoint at epoch 110... |
| [2026-03-01 07:01:38] [Epoch 110 | Step 137700] train/loss: 0.3629, train/lr: 0.0002 |
| [2026-03-01 07:02:02] [Epoch 110 | Step 137800] train/loss: 0.3645, train/lr: 0.0002 |
| [2026-03-01 07:02:26] [Epoch 110 | Step 137900] train/loss: 0.3667, train/lr: 0.0002 |
| [2026-03-01 07:02:50] [Epoch 110 | Step 138000] train/loss: 0.3668, train/lr: 0.0002 |
| [2026-03-01 07:03:14] [Epoch 110 | Step 138100] train/loss: 0.3676, train/lr: 0.0002 |
| [2026-03-01 07:03:39] [Epoch 110 | Step 138200] train/loss: 0.3696, train/lr: 0.0002 |
| [2026-03-01 07:04:04] [Epoch 110 | Step 138300] train/loss: 0.3642, train/lr: 0.0002 |
| [2026-03-01 07:04:29] [Epoch 110 | Step 138400] train/loss: 0.3656, train/lr: 0.0002 |
| [2026-03-01 07:04:55] [Epoch 110 | Step 138500] train/loss: 0.3659, train/lr: 0.0002 |
| [2026-03-01 07:05:21] [Epoch 110 | Step 138600] train/loss: 0.3627, train/lr: 0.0002 |
| [2026-03-01 07:05:47] [Epoch 110 | Step 138700] train/loss: 0.3688, train/lr: 0.0002 |
| [2026-03-01 07:06:12] [Epoch 110 | Step 138800] train/loss: 0.3670, train/lr: 0.0002 |
| [2026-03-01 07:06:30] [Epoch 110] epoch/loss: 0.3659 |
| [2026-03-01 07:06:42] [Epoch 111 | Step 138900] train/loss: 0.3633, train/lr: 0.0002 |
| [2026-03-01 07:07:07] [Epoch 111 | Step 139000] train/loss: 0.3676, train/lr: 0.0002 |
| [2026-03-01 07:07:32] [Epoch 111 | Step 139100] train/loss: 0.3650, train/lr: 0.0002 |
| [2026-03-01 07:07:56] [Epoch 111 | Step 139200] train/loss: 0.3683, train/lr: 0.0002 |
| [2026-03-01 07:08:20] [Epoch 111 | Step 139300] train/loss: 0.3621, train/lr: 0.0002 |
| [2026-03-01 07:08:44] [Epoch 111 | Step 139400] train/loss: 0.3680, train/lr: 0.0002 |
| [2026-03-01 16:10:14] Experiment directory created at results-sh-ceph/stage2/rae_lightningdit_b |
| [2026-03-01 16:10:51] Model Parameters: 676.37M |
| [2026-03-01 16:11:16] Evaluation dataset loaded from /home/jovyan/liushanyuan-sh-ceph/project/sub_project/gongyue/data/ILSVRC2012/ILSVRC2012_validation/data, containing 50000 images. |
| [2026-03-01 16:11:16] Experiment resume checkpoint found at results-sh-ceph/stage2/rae_lightningdit_b/checkpoints/ep-0000110.pt, automatically resuming... |