nohup: ignoring input
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:70: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
self.scaler = GradScaler()
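
The fix the warning points at is a one-line change; a minimal sketch of the non-deprecated call (the surrounding training class is not shown in this log):

    import torch

    # Non-deprecated construction suggested by the FutureWarning above:
    scaler = torch.amp.GradScaler('cuda')   # replaces torch.cuda.amp.GradScaler()
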
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:116: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https:
self.embeddings = torch.load(combined_path, map_location=self.device)
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:180: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https:
self.compressor.load_state_dict(torch.load('final_compressor_model.pth', map_location=self.device))
/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:181: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https:
self.decompressor.load_state_dict(torch.load('final_decompressor_model.pth', map_location=self.device))
/data2/edwardsun/flow_home/cfg_dataset.py:253: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https:
self.embeddings = torch.load(combined_path, map_location='cpu')
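
All four `torch.load` warnings above have the same remedy; a minimal sketch, assuming the listed files contain only tensors and state_dicts rather than arbitrary pickled objects:

    import torch

    # Opt in to the safer behavior the warning describes:
    embeddings = torch.load('all_peptide_embeddings.pt',
                            map_location='cpu', weights_only=True)
    state_dict = torch.load('final_compressor_model.pth',
                            map_location='cpu', weights_only=True)
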
Starting optimized training with batch_size=96, epochs=6000
Using GPU 0 for optimized H100 training
Mixed precision: True
Batch size: 96
Target epochs: 6000
Learning rate: 0.0004 -> 0.0002
✅ Mixed precision training enabled (BF16)
Loading ALL AMP embeddings from /data2/edwardsun/flow_project/peptide_embeddings/...
Loading combined embeddings from /data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt...
✅ Loaded ALL embeddings: torch.Size([17968, 50, 1280])
Computing preprocessing statistics...
✅ Statistics computed and saved:
Total embeddings: 17,968
Mean: -0.0005 ± 0.0897
Std: 0.0869 ± 0.1168
Range: [-9.1738, 3.2894]
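
The script's statistics code is not shown; a hedged sketch of how summary numbers of this shape can be produced, assuming the "±" values are the spread of per-dimension means and stds across the 1280 embedding features:

    import torch

    # Stand-in tensor for illustration; the real one is [17968, 50, 1280].
    emb = torch.randn(128, 50, 1280)
    per_dim_mean = emb.mean(dim=(0, 1))   # one mean per embedding dimension
    per_dim_std = emb.std(dim=(0, 1))     # one std per embedding dimension
    print(f"Mean: {per_dim_mean.mean().item():.4f} ± {per_dim_mean.std().item():.4f}")
    print(f"Std: {per_dim_std.mean().item():.4f} ± {per_dim_std.std().item():.4f}")
    print(f"Range: [{emb.min().item():.4f}, {emb.max().item():.4f}]")
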
Initializing models...
✅ Model compiled with torch.compile for speedup
✅ Models initialized:
Compressor parameters: 78,817,360
Decompressor parameters: 39,458,720
Flow model parameters: 50,779,584
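
Parameter counts like the ones above are conventionally the sum of element counts over a module's parameters; a small sketch with a toy module (the actual architectures are not shown in the log):

    import torch

    def count_params(module: torch.nn.Module) -> int:
        return sum(p.numel() for p in module.parameters())

    toy = torch.nn.Linear(1280, 1280)
    print(f"{count_params(toy):,}")   # 1,639,680 for this toy layer
    # torch.compile wraps a module for speedup, as logged above:
    compiled = torch.compile(toy)
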
Initializing datasets with FULL data...
Loading AMP embeddings from /data2/edwardsun/flow_project/peptide_embeddings/...
Loading combined embeddings from /data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt (FULL DATA)...
✅ Loaded ALL embeddings: torch.Size([17968, 50, 1280])
Loading CFG data from FASTA: /home/edwardsun/flow/combined_final.fasta...
Parsing FASTA file: /home/edwardsun/flow/combined_final.fasta
Label assignment: >AP = AMP (0), >sp = Non-AMP (1)
✅ Parsed 6983 valid sequences from FASTA
AMP sequences: 3306
Non-AMP sequences: 3677
Masked for CFG: 698
Loaded 6983 CFG sequences
Label distribution: [3306 3677]
Masked 698 labels for CFG training
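
The header rule (>AP = AMP, >sp = Non-AMP) and the mask count (698 of 6983, about 10%) suggest label handling along the following lines; this is a hedged sketch, not the script's actual code, and the null-label convention is assumed:

    import random

    def parse_fasta_labels(path):
        """Assign 0 to >AP records and 1 to >sp records, per the logged rule."""
        labels = []
        with open(path) as fh:
            for line in fh:
                if line.startswith('>AP'):
                    labels.append(0)   # AMP
                elif line.startswith('>sp'):
                    labels.append(1)   # Non-AMP
        return labels

    def mask_for_cfg(labels, frac=0.1, null_label=-1):
        """Replace ~frac of labels with a null label for classifier-free guidance."""
        masked = list(labels)
        for i in random.sample(range(len(masked)), k=int(frac * len(masked))):
            masked[i] = null_label
        return masked   # int(0.1 * 6983) = 698 masked labels, matching the log
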
Aligning AMP embeddings with CFG data...
Aligned 6983 samples
CFG Flow Dataset initialized:
AMP embeddings: torch.Size([17968, 50, 1280])
CFG labels: 6983
Aligned samples: 6983
✅ Dataset initialized with FULL data:
Total samples: 6,983
Batch size: 96
Batches per epoch: 73
Total training steps: 438,000
Validation every: 10,000 steps
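
The step budget follows directly from these numbers:

    import math

    samples, batch_size, epochs = 6_983, 96, 6_000
    batches_per_epoch = math.ceil(samples / batch_size)   # 73
    total_steps = batches_per_epoch * epochs              # 438,000
    print(batches_per_epoch, total_steps)
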
Initializing optimizer and scheduler...
✅ Optimizer initialized:
Base LR: 0.0004
Min LR: 0.0002
Warmup steps: 5000
Weight decay: 0.01
Gradient clip norm: 1.0
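
A hedged sketch of an optimizer/scheduler setup consistent with the logged settings: AdamW is assumed, the warmup below starts at 0.1x the base LR (which matches the LR values logged over the first few thousand steps), and the post-warmup cosine decay toward the minimum LR is an assumption:

    import math
    import torch

    base_lr, min_lr = 4e-4, 2e-4
    warmup_steps, total_steps = 5_000, 438_000

    model = torch.nn.Linear(1280, 1280)   # placeholder module for illustration
    optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr, weight_decay=0.01)

    def lr_lambda(step):
        if step < warmup_steps:
            return 0.1 + 0.9 * step / warmup_steps          # linear warmup from 0.1x base
        t = (step - warmup_steps) / (total_steps - warmup_steps)
        cosine = 0.5 * (1.0 + math.cos(math.pi * t))        # 1 -> 0 over remaining steps
        return (min_lr + (base_lr - min_lr) * cosine) / base_lr

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    # Per step, clip gradients to the logged max norm before optimizer.step():
    # torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
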
✅ Optimized Single GPU training setup complete with FULL DATA!
🚀 Starting Optimized Single GPU Flow Matching Training with FULL DATA
GPU: 0
Total iterations: 6000
Batch size: 96
Total samples: 6,983
Mixed precision: True
Estimated time: ~8-10 hours (overnight training with ALL data)
============================================================
Training Flow Model: 0%| | 0/6000 [00:00<?, ?it/s]/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:392: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with autocast(dtype=torch.bfloat16):
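
As with the scaler, the warning names the replacement; a minimal sketch of the non-deprecated context manager with the BF16 dtype used here:

    import torch

    # Replaces `torch.cuda.amp.autocast(...)`:
    with torch.amp.autocast('cuda', dtype=torch.bfloat16):
        pass   # forward pass and loss computation go here
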
Training Flow Model: 0%| | 1/6000 [01:09<116:23:06, 69.84s/it]Epoch 0 | Step 1/438000 | Loss: 2.328033 | LR: 4.01e-05 | Speed: 0.0 steps/s | ETA: 3889.5h
Epoch 0 | Avg Loss: 0.950054 | LR: 4.53e-05 | Time: 69.8s | Samples: 6,983
Training Flow Model: 0%| | 2/6000 [01:15<53:24:52, 32.06s/it] Epoch 1 | Step 74/438000 | Loss: 0.629602 | LR: 4.53e-05 | Speed: 1.0 steps/s | ETA: 116.2h
Epoch 1 | Avg Loss: 0.415130 | LR: 5.05e-05 | Time: 5.6s | Samples: 6,983
Training Flow Model: 0%| | 3/6000 [01:18<31:05:28, 18.66s/it]Epoch 2 | Step 147/438000 | Loss: 0.304313 | LR: 5.06e-05 | Speed: 1.9 steps/s | ETA: 63.2h
Epoch 2 | Avg Loss: 0.227218 | LR: 5.58e-05 | Time: 2.7s | Samples: 6,983
Training Flow Model: 0%| | 4/6000 [01:20<20:29:56, 12.31s/it]Epoch 3 | Step 220/438000 | Loss: 0.210514 | LR: 5.58e-05 | Speed: 2.8 steps/s | ETA: 43.7h
Epoch 3 | Avg Loss: 0.178846 | LR: 6.10e-05 | Time: 2.6s | Samples: 6,983
Training Flow Model: 0%| | 5/6000 [01:23<14:48:54, 8.90s/it]Epoch 4 | Step 293/438000 | Loss: 0.182317 | LR: 6.11e-05 | Speed: 3.6 steps/s | ETA: 33.9h
Epoch 4 | Avg Loss: 0.148526 | LR: 6.63e-05 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 6/6000 [01:26<11:23:00, 6.84s/it]Epoch 5 | Step 366/438000 | Loss: 0.128248 | LR: 6.64e-05 | Speed: 4.3 steps/s | ETA: 28.1h
Epoch 5 | Avg Loss: 0.127575 | LR: 7.15e-05 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 7/6000 [01:29<9:07:19, 5.48s/it] Epoch 6 | Step 439/438000 | Loss: 0.105957 | LR: 7.16e-05 | Speed: 5.0 steps/s | ETA: 24.2h
Epoch 6 | Avg Loss: 0.109353 | LR: 7.68e-05 | Time: 2.7s | Samples: 6,983
Training Flow Model: 0%| | 8/6000 [01:31<7:38:51, 4.59s/it]Epoch 7 | Step 512/438000 | Loss: 0.087330 | LR: 7.69e-05 | Speed: 5.7 steps/s | ETA: 21.3h
Epoch 7 | Avg Loss: 0.101109 | LR: 8.20e-05 | Time: 2.7s | Samples: 6,983
Training Flow Model: 0%| | 9/6000 [01:34<6:44:23, 4.05s/it]Epoch 8 | Step 585/438000 | Loss: 0.081881 | LR: 8.21e-05 | Speed: 6.3 steps/s | ETA: 19.3h
Epoch 8 | Avg Loss: 0.089056 | LR: 8.73e-05 | Time: 2.9s | Samples: 6,983
Training Flow Model: 0%| | 10/6000 [01:37<6:07:40, 3.68s/it]Epoch 9 | Step 658/438000 | Loss: 0.085630 | LR: 8.74e-05 | Speed: 6.9 steps/s | ETA: 17.6h
Epoch 9 | Avg Loss: 0.083894 | LR: 9.26e-05 | Time: 2.9s | Samples: 6,983
Training Flow Model: 0%| | 11/6000 [01:40<5:42:16, 3.43s/it]Epoch 10 | Step 731/438000 | Loss: 0.081927 | LR: 9.26e-05 | Speed: 7.4 steps/s | ETA: 16.4h
Epoch 10 | Avg Loss: 0.077295 | LR: 9.78e-05 | Time: 2.9s | Samples: 6,983
Training Flow Model: 0%| | 12/6000 [01:43<5:23:05, 3.24s/it]Epoch 11 | Step 804/438000 | Loss: 0.068221 | LR: 9.79e-05 | Speed: 7.9 steps/s | ETA: 15.3h
Epoch 11 | Avg Loss: 0.072662 | LR: 1.03e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 13/6000 [01:46<5:12:15, 3.13s/it]Epoch 12 | Step 877/438000 | Loss: 0.079151 | LR: 1.03e-04 | Speed: 8.4 steps/s | ETA: 14.4h
Epoch 12 | Avg Loss: 0.069846 | LR: 1.08e-04 | Time: 2.9s | Samples: 6,983
Training Flow Model: 0%| | 14/6000 [01:48<4:58:17, 2.99s/it]Epoch 13 | Step 950/438000 | Loss: 0.074991 | LR: 1.08e-04 | Speed: 8.9 steps/s | ETA: 13.7h
Epoch 13 | Avg Loss: 0.064569 | LR: 1.14e-04 | Time: 2.7s | Samples: 6,983
Training Flow Model: 0%| | 15/6000 [01:51<4:51:40, 2.92s/it]Epoch 14 | Step 1023/438000 | Loss: 0.043908 | LR: 1.14e-04 | Speed: 9.3 steps/s | ETA: 13.0h
Epoch 14 | Avg Loss: 0.057743 | LR: 1.19e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 16/6000 [01:54<4:48:57, 2.90s/it]Epoch 15 | Step 1096/438000 | Loss: 0.048052 | LR: 1.19e-04 | Speed: 9.7 steps/s | ETA: 12.4h
Epoch 15 | Avg Loss: 0.058437 | LR: 1.24e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 17/6000 [01:57<4:44:23, 2.85s/it]Epoch 16 | Step 1169/438000 | Loss: 0.045587 | LR: 1.24e-04 | Speed: 10.1 steps/s | ETA: 12.0h
Epoch 16 | Avg Loss: 0.055771 | LR: 1.29e-04 | Time: 2.7s | Samples: 6,983
Training Flow Model: 0%| | 18/6000 [01:59<4:41:37, 2.82s/it]Epoch 17 | Step 1242/438000 | Loss: 0.053337 | LR: 1.29e-04 | Speed: 10.5 steps/s | ETA: 11.5h
Epoch 17 | Avg Loss: 0.053140 | LR: 1.35e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 19/6000 [02:02<4:42:38, 2.84s/it]Epoch 18 | Step 1315/438000 | Loss: 0.075343 | LR: 1.35e-04 | Speed: 10.9 steps/s | ETA: 11.1h
Epoch 18 | Avg Loss: 0.049295 | LR: 1.40e-04 | Time: 2.9s | Samples: 6,983
Training Flow Model: 0%| | 20/6000 [02:05<4:42:48, 2.84s/it]Epoch 19 | Step 1388/438000 | Loss: 0.043840 | LR: 1.40e-04 | Speed: 11.2 steps/s | ETA: 10.8h
Epoch 19 | Avg Loss: 0.049483 | LR: 1.45e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 21/6000 [02:08<4:41:33, 2.83s/it]Epoch 20 | Step 1461/438000 | Loss: 0.076462 | LR: 1.45e-04 | Speed: 11.6 steps/s | ETA: 10.5h
Epoch 20 | Avg Loss: 0.048242 | LR: 1.50e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 22/6000 [02:11<4:40:07, 2.81s/it]Epoch 21 | Step 1534/438000 | Loss: 0.039453 | LR: 1.50e-04 | Speed: 11.9 steps/s | ETA: 10.2h
Epoch 21 | Avg Loss: 0.047419 | LR: 1.56e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 23/6000 [02:13<4:40:29, 2.82s/it]Epoch 22 | Step 1607/438000 | Loss: 0.058766 | LR: 1.56e-04 | Speed: 12.2 steps/s | ETA: 10.0h
Epoch 22 | Avg Loss: 0.047794 | LR: 1.61e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 24/6000 [02:16<4:45:06, 2.86s/it]Epoch 23 | Step 1680/438000 | Loss: 0.038332 | LR: 1.61e-04 | Speed: 12.4 steps/s | ETA: 9.7h
Epoch 23 | Avg Loss: 0.047601 | LR: 1.66e-04 | Time: 3.0s | Samples: 6,983
Training Flow Model: 0%| | 25/6000 [02:19<4:45:46, 2.87s/it]Epoch 24 | Step 1753/438000 | Loss: 0.053138 | LR: 1.66e-04 | Speed: 12.7 steps/s | ETA: 9.5h
Epoch 24 | Avg Loss: 0.045266 | LR: 1.71e-04 | Time: 2.9s | Samples: 6,983
Training Flow Model: 0%| | 26/6000 [02:22<4:41:52, 2.83s/it]Epoch 25 | Step 1826/438000 | Loss: 0.045704 | LR: 1.71e-04 | Speed: 13.0 steps/s | ETA: 9.3h
Epoch 25 | Avg Loss: 0.044707 | LR: 1.77e-04 | Time: 2.7s | Samples: 6,983
Training Flow Model: 0%| | 27/6000 [02:25<4:38:38, 2.80s/it]Epoch 26 | Step 1899/438000 | Loss: 0.052826 | LR: 1.77e-04 | Speed: 13.2 steps/s | ETA: 9.1h
Epoch 26 | Avg Loss: 0.041951 | LR: 1.82e-04 | Time: 2.7s | Samples: 6,983
Training Flow Model: 0%| | 28/6000 [02:28<4:41:26, 2.83s/it]Epoch 27 | Step 1972/438000 | Loss: 0.030554 | LR: 1.82e-04 | Speed: 13.5 steps/s | ETA: 9.0h
Epoch 27 | Avg Loss: 0.044097 | LR: 1.87e-04 | Time: 2.9s | Samples: 6,983
Training Flow Model: 0%| | 29/6000 [02:31<4:41:46, 2.83s/it]Epoch 28 | Step 2045/438000 | Loss: 0.036556 | LR: 1.87e-04 | Speed: 13.7 steps/s | ETA: 8.8h
Epoch 28 | Avg Loss: 0.043588 | LR: 1.92e-04 | Time: 2.8s | Samples: 6,983
Training Flow Model: 0%| | 30/6000 [02:34<5:10:48, 3.12s/it]Epoch 29 | Step 2118/438000 | Loss: 0.036764 | LR: 1.92e-04 | Speed: 13.9 steps/s | ETA: 8.7h
Epoch 29 | Avg Loss: 0.042376 | LR: 1.98e-04 | Time: 3.8s | Samples: 6,983
Training Flow Model: 1%| | 31/6000 [02:38<5:33:49, 3.36s/it]Epoch 30 | Step 2191/438000 | Loss: 0.034607 | LR: 1.98e-04 | Speed: 14.1 steps/s | ETA: 8.6h
Epoch 30 | Avg Loss: 0.039175 | LR: 2.03e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 32/6000 [02:42<5:52:54, 3.55s/it]Epoch 31 | Step 2264/438000 | Loss: 0.026377 | LR: 2.03e-04 | Speed: 14.2 steps/s | ETA: 8.5h
Epoch 31 | Avg Loss: 0.041455 | LR: 2.08e-04 | Time: 4.0s | Samples: 6,983
Training Flow Model: 1%| | 33/6000 [02:46<6:02:28, 3.64s/it]Epoch 32 | Step 2337/438000 | Loss: 0.043802 | LR: 2.08e-04 | Speed: 14.3 steps/s | ETA: 8.5h
Epoch 32 | Avg Loss: 0.040566 | LR: 2.13e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 34/6000 [02:50<6:11:21, 3.73s/it]Epoch 33 | Step 2410/438000 | Loss: 0.041541 | LR: 2.14e-04 | Speed: 14.4 steps/s | ETA: 8.4h
Epoch 33 | Avg Loss: 0.038954 | LR: 2.19e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 35/6000 [02:54<6:14:18, 3.77s/it]Epoch 34 | Step 2483/438000 | Loss: 0.040879 | LR: 2.19e-04 | Speed: 14.5 steps/s | ETA: 8.4h
Epoch 34 | Avg Loss: 0.041221 | LR: 2.24e-04 | Time: 3.8s | Samples: 6,983
Training Flow Model: 1%| | 36/6000 [02:58<6:20:05, 3.82s/it]Epoch 35 | Step 2556/438000 | Loss: 0.043876 | LR: 2.24e-04 | Speed: 14.6 steps/s | ETA: 8.3h
Epoch 35 | Avg Loss: 0.039926 | LR: 2.29e-04 | Time: 4.0s | Samples: 6,983
Training Flow Model: 1%| | 37/6000 [03:02<6:24:48, 3.87s/it]Epoch 36 | Step 2629/438000 | Loss: 0.047236 | LR: 2.29e-04 | Speed: 14.7 steps/s | ETA: 8.2h
Epoch 36 | Avg Loss: 0.043514 | LR: 2.34e-04 | Time: 4.0s | Samples: 6,983
Training Flow Model: 1%| | 38/6000 [03:06<6:26:14, 3.89s/it]Epoch 37 | Step 2702/438000 | Loss: 0.030528 | LR: 2.35e-04 | Speed: 14.7 steps/s | ETA: 8.2h
Epoch 37 | Avg Loss: 0.037676 | LR: 2.40e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 39/6000 [03:10<6:23:45, 3.86s/it]Epoch 38 | Step 2775/438000 | Loss: 0.045154 | LR: 2.40e-04 | Speed: 14.8 steps/s | ETA: 8.1h
Epoch 38 | Avg Loss: 0.039012 | LR: 2.45e-04 | Time: 3.8s | Samples: 6,983
Training Flow Model: 1%| | 40/6000 [03:13<6:24:55, 3.88s/it]Epoch 39 | Step 2848/438000 | Loss: 0.041152 | LR: 2.45e-04 | Speed: 14.9 steps/s | ETA: 8.1h
Epoch 39 | Avg Loss: 0.037944 | LR: 2.50e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 41/6000 [03:17<6:23:10, 3.86s/it]Epoch 40 | Step 2921/438000 | Loss: 0.031573 | LR: 2.50e-04 | Speed: 15.0 steps/s | ETA: 8.1h
Epoch 40 | Avg Loss: 0.037019 | LR: 2.55e-04 | Time: 3.8s | Samples: 6,983
Training Flow Model: 1%| | 42/6000 [03:21<6:24:07, 3.87s/it]Epoch 41 | Step 2994/438000 | Loss: 0.031375 | LR: 2.56e-04 | Speed: 15.1 steps/s | ETA: 8.0h
Epoch 41 | Avg Loss: 0.036788 | LR: 2.61e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 43/6000 [03:25<6:23:46, 3.87s/it]Epoch 42 | Step 3067/438000 | Loss: 0.025271 | LR: 2.61e-04 | Speed: 15.1 steps/s | ETA: 8.0h
Epoch 42 | Avg Loss: 0.038254 | LR: 2.66e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 44/6000 [03:29<6:27:10, 3.90s/it]Epoch 43 | Step 3140/438000 | Loss: 0.059067 | LR: 2.66e-04 | Speed: 15.2 steps/s | ETA: 7.9h
Epoch 43 | Avg Loss: 0.037138 | LR: 2.71e-04 | Time: 4.0s | Samples: 6,983
Training Flow Model: 1%| | 45/6000 [03:33<6:24:59, 3.88s/it]Epoch 44 | Step 3213/438000 | Loss: 0.042951 | LR: 2.71e-04 | Speed: 15.3 steps/s | ETA: 7.9h
Epoch 44 | Avg Loss: 0.039265 | LR: 2.77e-04 | Time: 3.8s | Samples: 6,983
Training Flow Model: 1%| | 46/6000 [03:37<6:24:33, 3.88s/it]Epoch 45 | Step 3286/438000 | Loss: 0.058999 | LR: 2.77e-04 | Speed: 15.3 steps/s | ETA: 7.9h
Epoch 45 | Avg Loss: 0.036169 | LR: 2.82e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 47/6000 [03:41<6:24:28, 3.88s/it]Epoch 46 | Step 3359/438000 | Loss: 0.029517 | LR: 2.82e-04 | Speed: 15.4 steps/s | ETA: 7.8h
Epoch 46 | Avg Loss: 0.037829 | LR: 2.87e-04 | Time: 3.9s | Samples: 6,983
Training Flow Model: 1%| | 48/6000 [03:45<6:28:07, 3.91s/it]Epoch 47 | Step 3432/438000 | Loss: 0.037272 | LR: 2.87e-04 | Speed: 15.5 steps/s | ETA: 7.8h
Epoch 47 | Avg Loss: 0.038144 | LR: 2.92e-04 | Time: 4.0s | Samples: 6,983
Training Flow Model: 1%| | 49/6000 [03:48<6:27:19, 3.91s/it]Epoch 48 | Step 3505/438000 | Loss: 0.036242 | LR: 2.92e-04 | Speed: 15.5 steps/s | ETA: 7.8h
Epoch 48 | Avg Loss: 0.034156 | LR: 2.98e-04 | Time: 3.9s | Samples: 6,983
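
The ETA column is consistent with remaining steps divided by the running throughput; checking it against this last logged line:

    total_steps, step, steps_per_sec = 438_000, 3_505, 15.5
    eta_hours = (total_steps - step) / steps_per_sec / 3600
    print(f"{eta_hours:.1f} h")   # ~7.8 h, matching the logged ETA
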