Spaces: optiviseapp/fnmodel (Paused)


47.5 kB

2 contributors

History: 36 commits

aeb56

Monkey-patch transformers to disable flash attention via wrapper script

2900b36 about 2 months ago

| File | Size | Last commit | Updated |
|---|---|---|---|
| .gitattributes | 1.52 kB | initial commit | about 2 months ago |
| .gitignore | 543 Bytes | Initial commit: LoRA model merger | about 2 months ago |
| Dockerfile | 1.1 kB | Switch to vLLM for high-performance, stable inference | about 2 months ago |
| README.md | 4.47 kB | Aggressive memory cleanup: 5s wait, env vars, optional model loading | about 2 months ago |
| README_inference.md | 2.66 kB | Transform Space into professional inference UI for fine-tuned model | about 2 months ago |
| app.py | 20.2 kB | Monkey-patch transformers to disable flash attention via wrapper script | about 2 months ago |
| inference_app.py | 11.9 kB | Transform Space into professional inference UI for fine-tuned model | about 2 months ago |
| merge_script.py | 4.8 kB | Implement manual LoRA merging to fix PEFT key naming conflicts | about 2 months ago |
| requirements.txt | 356 Bytes | Workaround flash-attn: create fake module with PyTorch fallback attention | about 2 months ago |