| --- |
| language: |
| - en |
| license: apache-2.0 |
| library_name: transformers |
| tags: |
| - medical |
| - spinal-cord-injury |
| - healthcare |
| - disability |
| - accessibility |
| - fine-tuned |
| - lora |
| - mistral |
| base_model: teknium/OpenHermes-2.5-Mistral-7B |
| pipeline_tag: text-generation |
| widget: |
| - text: "What is autonomic dysreflexia?" |
|   example_title: "Medical Question" |
| - text: "How can I transfer from my wheelchair to a car?" |
|   example_title: "Daily Living" |
| - text: "What exercises are good for someone with paraplegia?" |
|   example_title: "Exercise & Rehabilitation" |
| model-index: |
| - name: sci-assistant |
|   results: [] |
| --- |
| |
| # SCI Assistant - Spinal Cord Injury Specialized AI Assistant |
|
|
| An AI assistant fine-tuned for people with spinal cord injuries (SCI). The model is based on OpenHermes-2.5-Mistral-7B and was trained in two phases with LoRA (Low-Rank Adaptation) to give contextually appropriate, medically informed responses for the SCI community. |
|
|
| ## Model Description |
|
|
| This model was fine-tuned using a two-phase training approach: |
| 1. **Phase 1**: Domain pretraining on SCI-related medical texts and resources |
| 2. **Phase 2**: Instruction tuning on conversational SCI-focused Q&A pairs |
|
|
| The model is tuned to account for the challenges, medical realities, and daily-life considerations of people living with spinal cord injuries. |
|
|
| ## Training Details |
|
|
| - **Base Model**: teknium/OpenHermes-2.5-Mistral-7B |
| - **Training Method**: QLoRA (4-bit quantization with LoRA adapters) |
| - **Training Data**: roughly 119,000 total entries (35,779 domain text + 83,337 instruction pairs) |
| - **Hardware**: RTX 4070 Super (12GB VRAM) |
| - **Training Time**: ~20 hours total (Phase 1 + Phase 2) |
|
|
| ## Usage |
|
|
| This repository contains both the LoRA adapter and the full merged model. Choose the option that works best for you: |
|
|
| ### Option 1: Use the Full Merged Model (Recommended) |
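| ```python |
| import torch |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| |
| # Load the merged weights in half precision; device_map="auto" requires the accelerate package |
| model = AutoModelForCausalLM.from_pretrained( |
|     "basiphobe/sci-assistant", |
|     torch_dtype=torch.float16, |
|     device_map="auto", |
| ) |
| tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant") |
| |
| # Example usage |
| prompt = "What are the signs of autonomic dysreflexia?" |
| # Move the tokenized inputs to the same device as the model |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
| # max_new_tokens bounds only the generated text; max_length would also count the prompt |
| outputs = model.generate(**inputs, max_new_tokens=200) |
| response = tokenizer.decode(outputs[0], skip_special_tokens=True) |
| print(response) |
| ``` |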
|
|
| ### Option 2: Use the LoRA Adapter (Smaller Download) |
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig |
| from peft import PeftModel |
| import torch |
| |
| # Load the base model in 4-bit precision |
| bnb_config = BitsAndBytesConfig( |
|     load_in_4bit=True, |
|     bnb_4bit_compute_dtype=torch.float16, |
| ) |
| |
| base_model = AutoModelForCausalLM.from_pretrained( |
|     "teknium/OpenHermes-2.5-Mistral-7B", |
|     quantization_config=bnb_config, |
|     device_map="auto", |
| ) |
| |
| # Attach the LoRA adapter from this repository |
| model = PeftModel.from_pretrained(base_model, "basiphobe/sci-assistant") |
| tokenizer = AutoTokenizer.from_pretrained("basiphobe/sci-assistant") |
| |
| # Format prompt with SCI context |
| system_context = "You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI." |
| your_question = "What are the signs of autonomic dysreflexia?" |
| |
| prompt = f"{system_context}\n\n### Instruction:\n{your_question}\n\n### Response:\n" |
| |
| # Generate response (do_sample=True is needed for temperature to take effect) |
| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
| outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7) |
| # Decode only the newly generated tokens, skipping the prompt |
| response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True) |
| print(response) |
| ``` |
|
|
| ## Files in this Repository |
|
|
| - **Full Merged Model**: Ready-to-use model files (`model-*.safetensors`, `config.json`, etc.) |
| - **LoRA Adapter**: Smaller adapter files (`adapter_model.safetensors`, `adapter_config.json`) |
| - **Tokenizer**: Shared tokenizer files for both options |
|
|
| ## GGUF Format Models |
|
|
| This repository also includes GGUF format models for use with **llama.cpp**, **Ollama**, and other GGUF-compatible inference engines. GGUF files run on CPU or GPU without a Python environment, and the quantized variants trade some quality for a much smaller memory footprint. |
|
|
| ### Available GGUF Models |
|
|
| | File | Size | Format | Use Case | RAM Required | |
| |------|------|--------|----------|--------------| |
| | `merged-sci-model.gguf` | 14GB | F16 | Maximum quality inference | ~16GB | |
| | `merged-sci-model-q6_k.gguf` | 5.6GB | Q6_K | High quality with good compression | ~8GB | |
| | `merged-sci-model-q5_k_m.gguf` | 4.8GB | Q5_K_M | Excellent quality/size balance | ~7GB | |
| | `merged-sci-model-q5_k_s.gguf` | 4.7GB | Q5_K_S | Good quality, slightly smaller | ~7GB | |
| | `merged-sci-model-q4_k_m.gguf` | 4.1GB | Q4_K_M | Balanced quality/performance | ~6GB | |
| |
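As a rough aid for choosing among the files in the table above, the following sketch picks the highest-quality variant whose listed RAM requirement fits a given budget. The file names and RAM figures are copied from the table; the helper name `pick_gguf` is illustrative, and the RAM values are the card's approximations rather than measured limits.

```python
from typing import Optional

# (file name, approximate RAM required in GB), copied from the table above
# and ordered from highest to lowest quality.
GGUF_VARIANTS = [
    ("merged-sci-model.gguf", 16),
    ("merged-sci-model-q6_k.gguf", 8),
    ("merged-sci-model-q5_k_m.gguf", 7),
    ("merged-sci-model-q5_k_s.gguf", 7),
    ("merged-sci-model-q4_k_m.gguf", 6),
]


def pick_gguf(available_ram_gb: float) -> Optional[str]:
    """Return the first (highest-quality) file that fits the RAM budget, or None."""
    for filename, ram_gb in GGUF_VARIANTS:
        if ram_gb <= available_ram_gb:
            return filename
    return None


print(pick_gguf(8))  # merged-sci-model-q6_k.gguf
```

A return value of `None` means that none of the listed files fits the stated budget.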
| ### Usage with Ollama |
| |
| **1. Download and create Modelfile:** |
| ```bash |
| # Download the Q5_K_M model (recommended balance of quality/size) |
| wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf |
| |
| # Create Modelfile |
| cat > Modelfile << 'EOF' |
| FROM ./merged-sci-model-q5_k_m.gguf |
| TEMPLATE """<|im_start|>system |
| You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI.<|im_end|> |
| <|im_start|>user |
| {{ .Prompt }}<|im_end|> |
| <|im_start|>assistant |
| """ |
| PARAMETER stop "<|im_start|>" |
| PARAMETER stop "<|im_end|>" |
| PARAMETER temperature 0.7 |
| PARAMETER top_p 0.9 |
| EOF |
| ``` |
| |
| **2. Create and run the model:** |
| ```bash |
| ollama create sci-assistant -f Modelfile |
| ollama run sci-assistant "What are the signs of autonomic dysreflexia?" |
| ``` |
| |
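Once `ollama create` has run, the model can also be queried from code through Ollama's local HTTP API. The sketch below is a minimal client using only the standard library. It assumes the Ollama server is running on its default port (11434) and that the model was created under the name `sci-assistant` as shown above; the helper names `build_request` and `ask` are illustrative.

```python
import json
import urllib.request

# Default local endpoint of the Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(question: str, model: str = "sci-assistant") -> urllib.request.Request:
    """Build a non-streaming generate request for the given question."""
    payload = {"model": model, "prompt": question, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, model: str = "sci-assistant", timeout: float = 120.0) -> str:
    """Send the question to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(question, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `print(ask("What are the signs of autonomic dysreflexia?"))` would print the model's reply. The system prompt comes from the Modelfile, so it does not need to be sent with each request.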
| ### Usage with llama.cpp |
| |
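| **1. Install and set up:**
| ```bash
| # Clone and build llama.cpp.
| # Note: the commands in this section assume an older checkout that still ships a Makefile
| # and the `main` binary; recent releases build with CMake and name the binary `llama-cli`.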
| git clone https://github.com/ggerganov/llama.cpp |
| cd llama.cpp |
| make |
| |
| # Download model |
| wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf |
| ``` |
| |
| **2. Interactive chat:** |
| ```bash |
| ./main -m merged-sci-model-q5_k_m.gguf \ |
| --temp 0.7 \ |
| --repeat_penalty 1.1 \ |
| -c 4096 \ |
| --interactive \ |
| --in-prefix "<|im_start|>user\n" \ |
| --in-suffix "<|im_end|>\n<|im_start|>assistant\n" |
| ``` |
| |
| **3. Single prompt:** |
| ```bash |
| ./main -m merged-sci-model-q5_k_m.gguf \ |
| --temp 0.7 \ |
| -c 2048 \ |
| -p "<|im_start|>system\nYou are a specialized medical assistant for people with spinal cord injuries.<|im_end|>\n<|im_start|>user\nWhat exercises are good for someone with paraplegia?<|im_end|>\n<|im_start|>assistant\n" |
| ``` |
| |
| ### Performance Comparison |
| |
| - **F16 Model** (`merged-sci-model.gguf`): Maximum quality, largest memory footprint |
| - **Q6_K Model** (`merged-sci-model-q6_k.gguf`): Near-maximum quality with 60% size reduction |
| - **Q5_K_M Model** (`merged-sci-model-q5_k_m.gguf`): Excellent quality retention, good balance |
| - **Q5_K_S Model** (`merged-sci-model-q5_k_s.gguf`): Very good quality, slightly more compressed |
| - **Q4_K_M Model** (`merged-sci-model-q4_k_m.gguf`): Good quality, smallest size, recommended for resource-constrained environments |
| |
| All models use the **ChatML** template format and support up to **32K context length**. |
| |
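For callers that assemble prompts themselves, the ChatML layout used in the Modelfile and llama.cpp examples above can be sketched as a small helper. The function name `build_chatml_prompt` is illustrative; the system prompt text is the one used in the Modelfile.

```python
SYSTEM_PROMPT = (
    "You are a specialized medical assistant for people with spinal cord injuries. "
    "Your responses should always consider the unique needs, challenges, and medical "
    "realities of individuals living with SCI."
)


def build_chatml_prompt(question: str, system: str = SYSTEM_PROMPT) -> str:
    """Wrap a system prompt and one user question in ChatML markers.

    The prompt ends with an open assistant turn so the model writes the reply.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(build_chatml_prompt("What exercises are good for someone with paraplegia?"))
```

Generation should stop at `<|im_end|>`, which is why the Modelfile registers it as a stop sequence.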
| ## Intended Use |
| |
| This model is designed to: |
| - Provide SCI-specific information and guidance |
| - Answer questions about daily life with spinal cord injuries |
| - Offer practical advice for common SCI challenges |
| - Support the SCI community with contextually appropriate responses |
| |
| ## Limitations |
| |
| - This model is for informational purposes only and should not replace professional medical advice |
| - Always consult with healthcare providers for medical decisions |
| - The model may not have information about the latest medical developments |
| - Responses should be verified with medical professionals when making health-related decisions |
| |
| ## Direct Use |
| |
| This model can be used directly for: |
| - Educational purposes about spinal cord injuries |
| - Providing general information and support to the SCI community |
| - Research into specialized medical AI assistants |
| - Personal use by individuals seeking SCI-related information |
| |
| The model is designed to provide contextually appropriate responses that consider the unique challenges and medical realities of spinal cord injuries. |
| |
| ### Downstream Use |
| |
| This model can be fine-tuned further for: |
| - Integration into healthcare applications |
| - Specialized medical chatbots for rehabilitation centers |
| - Educational platforms for SCI awareness and training |
| - Research applications in medical AI |
| - Custom applications for SCI support organizations |
| |
| When used in downstream applications, implementers should: |
| - Maintain the medical disclaimer requirements |
| - Ensure proper supervision by medical professionals |
| - Implement appropriate safety measures and content filtering |
| - Validate outputs for medical accuracy in their specific use case |
| |
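The implementer guidance above (keep the disclaimer, add safety measures) could be wired into an application along the following lines. This is a minimal sketch, not a vetted safety layer: the disclaimer wording, the trigger phrases in `EMERGENCY_TERMS`, and the function names are all hypothetical, and a real deployment would need a clinically reviewed list and professional oversight.

```python
DISCLAIMER = (
    "This information is for educational purposes only and does not replace "
    "professional medical advice."
)

# Hypothetical trigger phrases for illustration only.
EMERGENCY_TERMS = ("emergency", "can't breathe", "chest pain", "unconscious")
EMERGENCY_MESSAGE = (
    "This may be an emergency. Seek immediate professional medical help."
)


def is_possible_emergency(question: str) -> bool:
    """Crude keyword check; flags questions that mention any trigger phrase."""
    text = question.lower()
    return any(term in text for term in EMERGENCY_TERMS)


def wrap_response(question: str, model_answer: str) -> str:
    """Route possible emergencies away from the model; otherwise append the disclaimer."""
    if is_possible_emergency(question):
        return EMERGENCY_MESSAGE
    return f"{model_answer.strip()}\n\n{DISCLAIMER}"


print(wrap_response("How do I transfer to a car?", "Position the chair close to the seat."))
```

Keyword matching of this kind misses paraphrases and produces false positives, so it should be treated as a first filter rather than a replacement for review by medical professionals.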
| ### Out-of-Scope Use |
| |
| This model should NOT be used for: |
| - **Medical diagnosis or treatment decisions** - Always consult healthcare professionals |
| - **Emergency medical situations** - Seek immediate professional medical help |
| - **Legal or financial advice** related to SCI cases |
| - **Replacement for professional medical consultation** |
| - **Clinical decision-making** without physician oversight |
| - **Applications targeting vulnerable populations** without proper safeguards |
| - **Commercial medical applications** without appropriate medical validation and oversight |
| |
| ## Bias, Risks, and Limitations |
| |
| ### Medical Limitations |
| - **Not a substitute for medical professionals**: All medical advice should be verified with qualified healthcare providers |
| - **Training data limitations**: May not include the most recent medical research or treatments |
| - **Individual variation**: SCI affects individuals differently; responses may not apply to all cases |
| - **Geographic bias**: Training data may be biased toward certain healthcare systems or regions |
| |
| ### Technical Limitations |
| - **Hallucination risk**: Like all language models, may generate plausible-sounding but incorrect information |
| - **Context limitations**: Limited by input context window and may not retain information across long conversations |
| - **Language limitations**: Primarily trained on English content |
| - **Update lag**: Cannot access real-time medical research or current events |
| |
| ### Bias Considerations |
| - **Training data bias**: Reflects biases present in source medical literature and online content |
| - **Demographic representation**: May not equally represent all demographics within the SCI community |
| - **Healthcare access bias**: May reflect biases toward certain types of healthcare systems |
| - **Severity bias**: May be more informed about certain types or severities of SCI |
| |
| ### Risk Mitigation |
| - Always include medical disclaimers when using this model |
| - Implement content filtering for harmful or dangerous advice |
| - Regular evaluation by medical professionals is recommended |
| - Monitor outputs for accuracy and appropriateness |
| |
| ## Recommendations |
| |
| Users should be aware of the following recommendations: |
| |
| **For Direct Users:** |
| - Always verify medical information with qualified healthcare professionals |
| - Use responses as educational/informational starting points, not definitive advice |
| - Be aware that individual SCI experiences vary significantly |
| - Seek immediate professional help for urgent medical concerns |
| |
| **For Developers/Implementers:** |
| - Implement clear medical disclaimers in any application using this model |
| - Provide easy access to professional medical resources alongside model responses |
| - Consider implementing content filtering for potentially harmful advice |
| - Regular review by medical professionals is strongly recommended |
| - Ensure compliance with relevant healthcare regulations (HIPAA, etc.) |
| |
| **For Healthcare Organizations:** |
| - Professional medical oversight is essential when implementing in clinical settings |
| - Regular validation of model outputs against current medical standards |
| - Integration should complement, not replace, professional medical consultation |
| - Staff training on AI limitations and appropriate use cases |
| |
| ## Training Details |
| |
| ### Training Data |
| |
| The training dataset consisted of roughly 119,000 curated entries focused on spinal cord injury information:
| |
| **Domain Pretraining Data (35,779 entries):** |
| - Medical literature and research papers on SCI |
| - Educational materials from reputable SCI organizations |
| - Clinical guidelines and treatment protocols |
| - Rehabilitation and therapy documentation |
| - Patient education resources |
| |
| **Instruction Tuning Data (83,337 entries):** |
| - SCI-focused question-answer pairs |
| - Conversational examples with appropriate medical context |
| - Real-world scenarios and practical advice situations |
| - Educational Q&A formatted for instruction following |
| |
| All training data was filtered and curated to ensure: |
| - Sources from reputable medical organizations and healthcare professionals |
| - Content originally created or reviewed by medical professionals in the SCI field |
| - Appropriate tone and sensitivity for SCI community |
| - Removal of potentially harmful or dangerous advice |
| - Proper medical disclaimers and context |
| |
| **Note**: While the source materials were created by medical professionals, this model itself has not undergone independent medical validation. |
| |
| ### Training Procedure |
| |
| The model was trained using a two-phase approach with QLoRA (Quantized Low-Rank Adaptation): |
| |
| **Phase 1 - Domain Pretraining:** |
| - Focus: Medical terminology and SCI-specific knowledge |
| - Duration: 2 epochs (~8 hours) |
| - Data: 35,779 domain text entries |
| - Objective: Adapt base model to SCI medical domain |
| |
| **Phase 2 - Instruction Tuning:** |
| - Focus: Conversational abilities and response formatting |
| - Duration: 2 epochs (~12 hours) |
| - Data: 83,337 instruction-response pairs |
| - Objective: Teach appropriate response patterns and tone |
| |
| #### Preprocessing |
| |
| Training data underwent extensive preprocessing: |
| - Content sourced from materials created by healthcare professionals |
| - Sensitive content filtering and safety checks |
| - Standardized formatting for instruction-following |
| - Quality filtering to remove low-quality or inappropriate content |
| - Tokenization optimization for efficient training |
| |
| #### Training Hyperparameters |
| |
| - **Training regime:** 4-bit quantization with LoRA adapters (QLoRA) |
| - **Learning rate:** 2e-4 with cosine scheduling |
| - **LoRA rank:** 16 |
| - **LoRA alpha:** 32 |
| - **LoRA dropout:** 0.05 |
| - **Target modules:** q_proj, v_proj |
| - **Batch size:** 4 with gradient accumulation |
| - **Max sequence length:** 512 tokens |
| - **Optimizer:** AdamW with weight decay |
| |
| #### Speeds, Sizes, Times |
| |
| - **Total training time:** ~20 hours (8h Phase 1 + 12h Phase 2) |
| - **Hardware:** RTX 4070 Super (12GB VRAM) |
| - **Final model size:** 30MB (LoRA adapter only) |
| - **Base model size:** 7B parameters (not included in adapter) |
| - **Training throughput:** ~3.5 samples/second average |
| - **Memory usage:** 6-7GB VRAM during training |
| |
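The adapter size listed above can be sanity-checked from the LoRA hyperparameters. The sketch below counts the parameters that rank-16 adapters add to `q_proj` and `v_proj`. The layer dimensions are assumptions taken from the public Mistral-7B configuration rather than from this card, and the size estimate assumes the adapter is stored in 32-bit floats with nothing else saved alongside it.

```python
# Assumed Mistral-7B dimensions (public base-model config, not stated in this card).
HIDDEN_SIZE = 4096   # model width
NUM_LAYERS = 32      # transformer blocks
NUM_KV_HEADS = 8     # grouped-query attention uses 8 key/value heads
HEAD_DIM = 128       # 4096 / 32 attention heads
LORA_RANK = 16       # from the hyperparameters above


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """A LoRA pair adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


q_proj = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)              # 131,072
v_proj = lora_params(HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM, LORA_RANK)  # 81,920
total = NUM_LAYERS * (q_proj + v_proj)
size_mb_fp32 = total * 4 / 1e6  # 4 bytes per float32 parameter

print(f"Trainable LoRA parameters: {total:,}")      # 6,815,744
print(f"Approx. size in fp32: {size_mb_fp32:.1f} MB")  # 27.3 MB
```

Under these assumptions the adapter comes to about 27 MB, in line with the ~30MB figure reported above.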
| ## Evaluation |
| |
| ### Testing Data, Factors & Metrics |
| |
| #### Testing Data |
| |
| The model was evaluated using: |
| - Held-out test set of SCI-related questions (500 samples) |
| - Manual review of response quality and appropriateness |
| - Comparative analysis against general-purpose models on SCI topics |
| - Assessment of domain-specific knowledge retention |
| |
| **Note**: Evaluation was conducted by the model developer, not independent medical professionals. |
| |
| #### Factors |
| |
| Evaluation considered multiple factors: |
| - **Medical accuracy**: Correctness of SCI-related information |
| - **Appropriateness**: Sensitivity and tone for SCI community |
| - **Contextual relevance**: Understanding of SCI-specific challenges |
| - **Safety**: Avoidance of harmful or dangerous advice |
| - **Completeness**: Comprehensive responses to complex questions |
| |
| #### Metrics |
| |
| - **Medical accuracy score**: Based on consistency with source medical literature (not independently validated) |
| - **Appropriateness rating**: Developer assessment of tone and sensitivity (4.2/5.0 subjective rating) |
| - **Response relevance**: SCI-specific context understanding (82% relevance score) |
| - **Safety compliance**: No obviously harmful medical advice detected in test samples |
| - **Response quality**: Perplexity improvements over base model for SCI domain |
| |
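For readers unfamiliar with the perplexity metric listed above, the following sketch shows how it is conventionally computed from per-token log-probabilities, and one common way to express an improvement relative to a base model. The card does not state how its own figure was calculated, so the formula for relative improvement and the example numbers here are assumptions for illustration only.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """exp of the mean negative log-likelihood per token (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def relative_improvement(base_ppl: float, tuned_ppl: float) -> float:
    """Fractional reduction in perplexity relative to the base model."""
    return (base_ppl - tuned_ppl) / base_ppl


# Illustrative values, not results from this model.
uniform_over_two = [math.log(0.5)] * 4  # every token predicted with probability 0.5
print(perplexity(uniform_over_two))
print(relative_improvement(base_ppl=10.0, tuned_ppl=6.0))
```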
| ### Results |
| |
| **Quantitative Results:** |
| - 40% improvement in SCI domain perplexity over base model |
| - Responses demonstrate consistency with source medical literature |
| - 95% safety compliance (no obviously harmful medical advice detected) |
| - 82% average relevance score for SCI-specific contexts |
| |
| **Qualitative Results:** |
| - Responses demonstrate clear understanding of SCI terminology and concepts |
| - Appropriate tone and sensitivity for disability community |
| - Consistent inclusion of medical disclaimers |
| - Good balance between being helpful and cautious about medical advice |
| |
| **Limitations of Evaluation:** |
| - Evaluation conducted by model developer, not independent medical experts |
| - No formal clinical validation or testing with SCI patients |
| - Results based on consistency with training sources, not independent medical verification |
| |
| ## Environmental Impact |
| |
| Training carbon emissions estimated using energy consumption data: |
| |
| - **Hardware Type:** RTX 4070 Super (12GB VRAM) |
| - **Hours used:** ~20 hours total training time |
| - **Cloud Provider:** Local training (personal hardware) |
| - **Compute Region:** North America |
| - **Carbon Emitted:** Approximately 2.1 kg CO2eq (estimated based on local energy grid) |
| |
| The use of QLoRA significantly reduced training time and energy consumption compared to full fine-tuning methods, making this a relatively efficient training approach. |
| |
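An estimate of this kind is usually the product of average power draw, training time, and the carbon intensity of the local grid. The sketch below shows that arithmetic and backs out the grid intensity implied by the reported figure. The 220 W value is an assumption (the nominal board power of an RTX 4070 Super, ignoring the rest of the system), and the card does not state which power or grid figures were actually used.

```python
def estimate_emissions_kg(avg_power_watts: float, hours: float, grid_kg_per_kwh: float) -> float:
    """Emissions = energy (kWh) x grid carbon intensity (kg CO2eq per kWh)."""
    energy_kwh = avg_power_watts / 1000.0 * hours
    return energy_kwh * grid_kg_per_kwh


ASSUMED_GPU_POWER_W = 220.0  # assumption: nominal GPU board power only
TRAINING_HOURS = 20.0        # from the card
REPORTED_KG_CO2EQ = 2.1      # from the card

gpu_energy_kwh = ASSUMED_GPU_POWER_W / 1000.0 * TRAINING_HOURS
implied_intensity = REPORTED_KG_CO2EQ / gpu_energy_kwh

print(f"GPU energy: {gpu_energy_kwh:.1f} kWh")
print(f"Implied grid intensity: {implied_intensity:.2f} kg CO2eq/kWh")
```

If whole-system power draw was higher than the assumed GPU figure, the implied grid intensity would be correspondingly lower.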
| ## Technical Specifications |
| |
| ### Model Architecture and Objective |
| |
| - **Base Architecture:** Mistral 7B transformer model |
| - **Adaptation Method:** QLoRA (Quantized Low-Rank Adaptation) |
| - **Objective:** Causal language modeling with SCI domain specialization |
| - **Quantization:** 4-bit precision for memory efficiency |
| - **LoRA Configuration:** Rank-16 adapters on attention projection layers |
| |
| ### Compute Infrastructure |
| |
| #### Hardware |
| |
| - **GPU:** NVIDIA RTX 4070 Super (12GB VRAM) |
| - **CPU:** Modern multi-core processor |
| - **RAM:** 32GB system memory |
| - **Storage:** NVMe SSD for fast data loading |
| |
| #### Software |
| |
| - **Framework:** Transformers 4.36+, PEFT 0.16.0 |
| - **Training:** QLoRA with bitsandbytes quantization |
| - **Environment:** Python 3.10+, PyTorch 2.0+, CUDA 12.1 |
| |
| ## Citation |
| |
| If you use this model in your research or applications, please cite: |
| |
| **BibTeX:** |
| ```bibtex |
| @misc{sci_assistant_2025, |
| title={SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support}, |
| author={basiphobe}, |
| year={2025}, |
| howpublished={Hugging Face Model Repository}, |
| url={https://huggingface.co/basiphobe/sci-assistant} |
| } |
| ``` |
| |
| **APA:** |
| basiphobe. (2025). *SCI Assistant: A Specialized AI Assistant for Spinal Cord Injury Support*. Hugging Face. https://huggingface.co/basiphobe/sci-assistant |
| |
| ## Glossary |
| |
| **SCI**: Spinal Cord Injury - damage to the spinal cord that results in temporary or permanent changes in function |
| |
| **QLoRA**: Quantized Low-Rank Adaptation - an efficient fine-tuning method that reduces memory requirements |
| |
| **Domain Pretraining**: Training phase focused on learning domain-specific terminology and knowledge |
| |
| **Instruction Tuning**: Training phase focused on learning conversational patterns and response formatting |
| |
| **Perplexity**: A metric measuring how well a language model predicts text (lower is better) |
| |
| **LoRA**: Low-Rank Adaptation - parameter-efficient fine-tuning technique |
| |
| ## Model Card Authors |
| |
| - **Primary Author:** basiphobe
| - **Model Development:** Individual research project for SCI community support
| - **Data Sources:** Curated from medical literature and educational materials created by healthcare professionals
| - **Validation Status:** Model has not undergone independent medical professional validation
| |
| ## Model Card Contact |
| |
| For questions, issues, or feedback regarding this model: |
| - **Hugging Face:** https://huggingface.co/basiphobe/sci-assistant |
| - **Issues:** Please report issues through Hugging Face model repository |
| - **Medical Concerns:** Always consult qualified healthcare professionals |
| |
| **Important Note:** This model is provided for educational and informational purposes. Always seek professional medical advice for health-related questions and decisions. |
| |
| ### Framework versions |
| |
| - PEFT 0.16.0 |
| |