Anima v1.0 4-step Distillation LoRAs (PCM + DMDX + DMD2) Released

#157

by darask0 - opened 16 days ago

I've released two distillation LoRAs for Anima v1.0 base, exploring different
distillation methods at 4-step inference, CFG=1.0. Both are publicly available
for research and study purposes, though neither matches or surpasses the official
Civitai Anima Turbo LoRA in practical quality.

TL;DR

Use the official Civitai Anima Turbo for production
The LoRAs in this repository are for research/educational purposes only
PCM yielded poor visual quality (not recommended)
DMDX was visually indistinguishable from its Civitai Turbo warm-start (no value over Turbo)

Repository

Weights: https://huggingface.co/darask0/anima-distill-loras
Training code: https://huggingface.co/darask0/rapid-anima / github.com/daraskme/rapid-anima

Released LoRAs (both non-recommended for production)

LoRA	Method	Init	Result
PCM	Phased Consistency Model (Wang et al. 2024)	Cold-start	⚠ Poor quality — not recommended
DMDX (ADM)	Adversarial Distribution Matching (arxiv 2507.18569)	Civitai Anima Turbo warm-start	⚠ Visually identical to Turbo (no added value)

DMD2 + TrigFlow training is underway and will be added when complete.

Technical Details

Base: anima-base-v1.0.safetensors
LoRA target: Wide LoRA covering AdaLN + attention + MLP linear layers (980 keys, rank 32)
Dataset: 5,000 Anima self-distillation samples (teacher 20-step CFG=4.5)
Hardware: NVIDIA B200 on Modal
Training cost: PCM ~$22 / DMDX ~$27

PCM (Not Recommended)

Phased Consistency Model with N=50 Euler timesteps split into K=4 phases
Pseudo-Huber loss with CFG-augmentation (w∈[4.0, 5.0]) embedded during training
Cold-start — converges numerically without warm-start
5,000 steps, ~3.4 hours, ~$22
Result: Visual quality fell short of the Anima Turbo baseline despite stable loss curves.
The cold-start approach failed to develop coherent style features expected from Anima distillation.
Numerical convergence ≠ generation quality.
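
To make the phase split and the loss above concrete, here is a minimal sketch. It is not the repository's training code: it assumes the N=50 timesteps are cut into K=4 contiguous phases of near-equal size and that the Pseudo-Huber loss takes the usual form sqrt(d^2 + c^2) - c. The function names and the constant c are illustrative assumptions.

```python
import math

def phase_edges(num_timesteps: int = 50, num_phases: int = 4) -> list:
    """Boundary indices that split [0, num_timesteps] into contiguous phases.

    Assumption: phases are as equal in size as integer division allows.
    """
    return [k * num_timesteps // num_phases for k in range(num_phases + 1)]

def phase_of(step_index: int, edges: list) -> int:
    """Index of the phase that contains a given timestep index."""
    for k in range(len(edges) - 1):
        if edges[k] <= step_index < edges[k + 1]:
            return k
    raise ValueError("step_index is outside the schedule")

def pseudo_huber(pred: list, target: list, c: float = 0.001) -> float:
    """Pseudo-Huber distance: sqrt(||pred - target||^2 + c^2) - c.

    Behaves like L2 near zero and like L1 for large errors.
    The value of c here is a placeholder, not the value used in training.
    """
    squared = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(squared + c * c) - c

edges = phase_edges()          # [0, 12, 25, 37, 50]
print(edges, phase_of(30, edges))
```

With K=4 phases, each of the 4 inference steps corresponds to one phase, which is why the phase count matches the step count.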

DMDX (ADM, Equivalent to Turbo)

Implementation of the ADM portion from arxiv 2507.18569 (Lu et al., ByteDance Seed Vision)
Replaces DMD2's reverse-KL gradient trick with a learnable discriminator + hinge GAN (TVD minimization)
LADD-style discriminator: teacher MiniTrainDIT (frozen) backbone + 5 spectral-norm heads
Cubic time schedule + teacher Δt evolution (Δt = T/64)
5,000 outer steps, ~4.3 hours, ~$27
Result: The student LoRA barely diverged from the Civitai Anima Turbo warm-start state
even after the full 5,000 outer training steps. Output was visually indistinguishable from Turbo.
GAN dynamics were healthy (D vs G oscillation, no mode collapse), but the weight updates
were too conservative to depart from the warm-start trajectory.
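
For readers unfamiliar with the hinge GAN objective mentioned above, the standard form can be sketched as follows. This is a generic illustration on plain lists of discriminator logits, not the repository's implementation: how the logits are produced (the frozen teacher backbone and the 5 spectral-norm heads) is left out, and the function names are assumptions.

```python
def relu(x: float) -> float:
    return x if x > 0.0 else 0.0

def mean(values: list) -> float:
    return sum(values) / len(values)

def hinge_d_loss(real_logits: list, fake_logits: list) -> float:
    """Discriminator hinge loss.

    Penalizes real logits below +1 and fake logits above -1.
    Samples already on the correct side of the margin contribute zero.
    """
    real_term = mean([relu(1.0 - r) for r in real_logits])
    fake_term = mean([relu(1.0 + f) for f in fake_logits])
    return real_term + fake_term

def hinge_g_loss(fake_logits: list) -> float:
    """Generator hinge loss: push the discriminator's logits on fakes upward."""
    return -mean(fake_logits)

# Toy logits: one real sample is inside the margin, one fake sits at zero.
print(hinge_d_loss([2.0, 0.5], [-2.0, 0.0]))  # 0.75
print(hinge_g_loss([-2.0, 0.0]))              # 1.0
```

Logging both values per outer step is one way to watch for the D vs G oscillation described above.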

Key Lessons Learned

Lesson 1: PCM Cold-Start is Risky on Anima

PCM trained with stable loss but produced degraded visual quality. Numerical metrics
(loss curves, no NaN, no divergence) gave a false sense of success. Visual inspection
at intermediate checkpoints would have caught this earlier.

→ Always verify with visual generation tests at intermediate checkpoints, not just loss curves.

Lesson 2: Warm-Start Dominance in DMDX

DMDX with strong warm-start (Civitai Turbo) failed to develop independent distillation
behavior. Possible causes:

Hinge GAN updates are conservative when D approaches perfect classification (G's gradients saturate)
The warm-start already worked well at the target 4-step CFG=1.0, leaving little room for G to improve
recon_weight=0 (pure ADM) provided no soft guidance signal

→ Future attempts might try cold-start, higher lr_gen, or recon_weight > 0 (Smooth-L1
anchor borrowed from LADD).
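
As a rough illustration of the recon_weight > 0 idea, the sketch below adds a Smooth-L1 anchor term to an adversarial generator loss. It assumes the anchor target is the teacher's output for the same input, and uses the common Smooth-L1 definition with threshold beta; `generator_objective` and its arguments other than recon_weight are hypothetical names, not the repository's API.

```python
def smooth_l1(pred: list, target: list, beta: float = 1.0) -> float:
    """Mean Smooth-L1: quadratic for |d| < beta, linear beyond it."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)

def generator_objective(adv_loss: float, student_out: list,
                        teacher_out: list, recon_weight: float = 0.0) -> float:
    """Adversarial loss plus an optional reconstruction anchor.

    recon_weight = 0.0 reproduces the pure-ADM setting described above.
    """
    return adv_loss + recon_weight * smooth_l1(student_out, teacher_out)

print(generator_objective(1.0, [3.0], [0.0], recon_weight=0.0))  # 1.0
print(generator_objective(1.0, [3.0], [0.0], recon_weight=0.1))  # 1.25
```

With recon_weight = 0 the second term vanishes, so the generator receives only the discriminator's signal.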

Lesson 3: Beating the Official Distill LoRA is Hard

CircleStone Labs trained the official Anima Turbo with vastly more compute and iteration
than is feasible on a hobby budget (~$50-300 per attempt here, versus a likely $10k+ for
the official run). Beating the official Turbo on a hobby budget is not realistic; aim for
understanding and method comparison instead.

ComfyUI Usage (if you still want to try)

ModelSamplingAuraFlow: sigma_shift = 3.0
LoraLoaderModelOnly: strength_model = 1.0
KSampler: steps = 4, cfg = 1.0, sampler = er_sde, scheduler = simple

PCM was tested with both er_sde + simple and res_multistep + beta; DMDX was tested with
er_sde + simple. The official Civitai Anima Turbo remains the recommended option.
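
To give a feel for what a shift of 3.0 does to a 4-step schedule, here is a rough sketch. It assumes the common flow-matching time-shift formula sigma' = shift * sigma / (1 + (shift - 1) * sigma) applied to evenly spaced sigmas; ComfyUI's actual discretization may differ slightly, and the function names are illustrative.

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Assumed flow-matching time shift; larger shift keeps sigmas higher for longer."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)

def shifted_schedule(steps: int = 4, shift: float = 3.0) -> list:
    """Evenly spaced sigmas from 1.0 down to 0.0, then shifted."""
    base = [1.0 - i / steps for i in range(steps + 1)]
    return [shift_sigma(s, shift) for s in base]

# Unshifted: 1.0, 0.75, 0.5, 0.25, 0.0
# Shifted:   approximately 1.0, 0.9, 0.75, 0.5, 0.0
print([round(s, 3) for s in shifted_schedule()])
```

Under this assumption, three of the four steps are spent at sigma 0.5 or above, so most of the small step budget goes to the high-noise region.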

Conclusion

While these LoRAs are technically functional, practical users should stick with the
official Civitai Anima Turbo. This repository's value is in:

Documented distillation method comparisons (PCM, DMDX, DMD2)
Source code + Modal pipeline for experimentation
Honest failure analysis (docs/dmdx.md)

Feedback welcome.

License

LoRA weights: Apache-2.0
Base model (Anima v1.0): CircleStone Labs Non-Commercial License + NVIDIA Open Model License
Non-commercial use only
