Anima v1.0 4-step Distillation LoRAs (PCM + DMDX + DMD2) Released
I've released two distillation LoRAs for Anima v1.0 base, exploring different
distillation methods at 4-step inference, CFG=1.0. Both are publicly available
for research and study purposes, though neither matches or surpasses the official
Civitai Anima Turbo LoRA in practical quality.
TL;DR
- Use the official Civitai Anima Turbo for production
- The LoRAs in this repository are for research/educational purposes only
- PCM yielded poor visual quality (not recommended)
- DMDX was visually indistinguishable from its Civitai Turbo warm-start (no value over Turbo)
Repository
- Weights: https://huggingface.co/darask0/anima-distill-loras
- Training code: https://huggingface.co/darask0/rapid-anima / github.com/daraskme/rapid-anima
Released LoRAs (neither recommended for production)
| LoRA | Method | Init | Result |
|---|---|---|---|
| PCM | Phased Consistency Model (Wang et al. 2024) | Cold-start | Poor quality; not recommended |
| DMDX (ADM) | Adversarial Distribution Matching (arxiv 2507.18569) | Civitai Anima Turbo warm-start | Visually identical to Turbo (no added value) |
DMD2 + TrigFlow training is underway and will be added when complete.
Technical Details
- Base: anima-base-v1.0.safetensors
- LoRA target: Wide LoRA covering AdaLN + attention + MLP linear layers (980 keys, rank 32)
- Dataset: 5,000 Anima self-distillation samples (teacher 20-step CFG=4.5)
- Hardware: NVIDIA B200 on Modal
- Training cost: PCM ~$22 / DMDX ~$27
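As a rough sanity check on these costs, the sketch below combines them with the run lengths reported in the per-method sections (5,000 steps each, ~3.4 h for PCM and ~4.3 h for DMDX). The dictionary layout and the `rates` helper are illustrative only, and the derived rates are approximations from rounded figures, not billed prices.

```python
# Reported figures from this post (rounded); the structure is illustrative.
runs = {
    "PCM": {"usd": 22.0, "hours": 3.4, "steps": 5000},
    "DMDX": {"usd": 27.0, "hours": 4.3, "steps": 5000},
}


def rates(run: dict) -> dict:
    """Derive implied cost per hour and per 1,000 training steps."""
    return {
        "usd_per_hour": run["usd"] / run["hours"],
        "usd_per_1k_steps": run["usd"] / (run["steps"] / 1000),
    }


for name, run in runs.items():
    r = rates(run)
    print(f"{name}: ~${r['usd_per_hour']:.2f}/h, ~${r['usd_per_1k_steps']:.2f} per 1k steps")
# PCM: ~$6.47/h, ~$4.40 per 1k steps
# DMDX: ~$6.28/h, ~$5.40 per 1k steps
```

Both runs land at a similar implied hourly rate, so the cost difference comes mostly from DMDX taking longer per step.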
PCM (Not Recommended)
- Phased Consistency Model with N=50 Euler timesteps split into K=4 phases
- Pseudo-Huber loss with CFG-augmentation (w ∈ [4.0, 5.0]) embedded during training
- Cold-start: converges numerically without warm-start
- 5,000 steps, ~3.4 hours, ~$22
- Result: Visual quality fell short of the Anima Turbo baseline despite stable loss curves.
The cold-start approach failed to develop coherent style features expected from Anima distillation.
Numerical convergence ≠ generation quality.
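The pieces named in the list above can be sketched in plain Python. This is a minimal illustration and not the repository's training code. Assumptions: step index 0 is the clean end and 50 the pure-noise end; 50 does not divide evenly by 4, so the boundaries use integer division; the Pseudo-Huber form is the common `sqrt(||x - y||^2 + c^2) - c`; `c = 0.01` is a placeholder; and "CFG-augmentation" is read as applying standard classifier-free guidance to the teacher prediction with `w` drawn from [4.0, 5.0]. All function names are hypothetical.

```python
import math
import random


def phase_edges(n_steps: int = 50, n_phases: int = 4) -> list[int]:
    """Integer phase boundaries over solver step indices 0..n_steps (assumed split)."""
    return [(i * n_steps) // n_phases for i in range(n_phases + 1)]


def target_edge(step: int, edges: list[int]) -> int:
    """Boundary a noisy step is trained to map onto: the low-noise end of its phase."""
    if not 0 < step <= edges[-1]:
        raise ValueError("step must lie in (0, n_steps]")
    # Largest boundary strictly below the step index.
    return max(e for e in edges if e < step)


def pseudo_huber(pred: list[float], target: list[float], c: float = 0.01) -> float:
    """Pseudo-Huber distance: quadratic near zero, close to the L2 norm for large errors."""
    sq = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(sq + c * c) - c


def cfg_combine(cond: list[float], uncond: list[float], w: float) -> list[float]:
    """Standard classifier-free guidance: uncond + w * (cond - uncond)."""
    return [u + w * (c_ - u) for c_, u in zip(cond, uncond)]


edges = phase_edges()
print(edges)                   # [0, 12, 25, 37, 50]
print(target_edge(30, edges))  # 25
w = random.uniform(4.0, 5.0)   # guidance scale sampled per training example (assumed)
```

With K=4 phases the student only has to be self-consistent inside each phase, which is what makes one model evaluation per phase (4 steps total) the natural inference setting.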
DMDX (ADM, Equivalent to Turbo)
- Implementation of the ADM portion from arxiv 2507.18569 (Lu et al., ByteDance Seed Vision)
- Replaces DMD2's reverse-KL gradient trick with a **learnable discriminator + hinge GAN (TVD minimization)**
- LADD-style discriminator: teacher MiniTrainDIT (frozen) backbone + 5 spectral-norm heads
- Cubic time schedule + teacher Δt evolution (Δt = T/64)
- 5,000 outer steps, ~4.3 hours, ~$27
- Result: Student LoRA barely diverged from the Civitai Anima Turbo warm-start state
even after the full 5,000 outer steps. Output was visually indistinguishable from Turbo.
GAN dynamics were healthy (D vs G oscillation, no mode collapse), but weight updates were
too conservative to depart from the warm-start trajectory.
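For readers unfamiliar with the hinge GAN objective mentioned above, here is a minimal sketch of the standard hinge losses on raw discriminator logits. It is not the repository's implementation: the function names are hypothetical, "real" stands for teacher-side samples and "fake" for student samples, and how the 5 heads are combined is not specified in this post, so the sketch takes a flat list of logits.

```python
def relu(x: float) -> float:
    return max(0.0, x)


def d_hinge_loss(real_logits: list[float], fake_logits: list[float]) -> float:
    """Discriminator hinge loss: push real logits above +1 and fake logits below -1."""
    real_term = sum(relu(1.0 - r) for r in real_logits) / len(real_logits)
    fake_term = sum(relu(1.0 + f) for f in fake_logits) / len(fake_logits)
    return real_term + fake_term


def g_hinge_loss(fake_logits: list[float]) -> float:
    """Generator (student) loss: raise the discriminator's score on student samples."""
    return -sum(fake_logits) / len(fake_logits)


real = [2.0, 0.5]   # teacher-side logits (made-up numbers)
fake = [-2.0, 0.0]  # student-side logits (made-up numbers)
print(d_hinge_loss(real, fake))  # 0.75
print(g_hinge_loss(fake))        # 1.0
```

Samples already on the correct side of the margin (real above +1, fake below -1) contribute zero to the discriminator loss, so only samples near the decision boundary drive its updates.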
Key Lessons Learned
Lesson 1: PCM Cold-Start is Risky on Anima
PCM trained with stable loss but produced degraded visual quality. Numerical metrics
(loss curves, no NaN, no divergence) gave a false sense of success. Visual inspection
at intermediate checkpoints would have caught this earlier.
→ Always verify with visual generation tests at intermediate checkpoints, not just loss curves.
Lesson 2: Warm-Start Dominance in DMDX
DMDX with strong warm-start (Civitai Turbo) failed to develop independent distillation
behavior. Possible causes:
- Hinge GAN updates are conservative when D approaches perfect classification
(G's gradients saturate)
- Warm-start already worked well at the target 4-step CFG=1.0, leaving little room
for G to improve
- recon_weight=0 (pure ADM) provided no soft guidance signal
→ Future attempts might try cold-start, a higher lr_gen, or recon_weight > 0 (Smooth-L1
anchor borrowed from LADD).
Lesson 3: Beating the Official Distill LoRA is Hard
CircleStone Labs trained the official Anima Turbo with vastly more compute and iteration
than is feasible on a hobby budget (~$50-300 per attempt, versus likely $10k+ for the
official run). Beating the official Turbo at that scale is not realistic; aim for
understanding and method comparison instead.
ComfyUI Usage (if you still want to try)
ModelSamplingAuraFlow: sigma_shift = 3.0
LoraLoaderModelOnly: strength_model = 1.0
KSampler: steps = 4, cfg = 1.0,
sampler = er_sde, scheduler = simple
PCM was tested with both er_sde + simple and res_multistep + beta. DMDX was tested with er_sde + simple. The official Civitai Anima Turbo
remains the recommended option.
Conclusion
While these LoRAs are technically functional, practical users should stick with the
official Civitai Anima Turbo. This repository's value is in:
- Documented distillation method comparisons (PCM, DMDX, DMD2)
- Source code + Modal pipeline for experimentation
- Honest failure analysis (docs/dmdx.md)
Feedback welcome.
License
- LoRA weights: Apache-2.0
- Base model (Anima v1.0): CircleStone Labs Non-Commercial License + NVIDIA Open Model License
- Non-commercial use only