flan-t5-base – Alzheimer Ultra-Safe Summarizer
Model summary
This repository contains a fine-tuned version of google/flan-t5-base for results- and conclusions-focused summarization of Alzheimer’s disease–related scientific abstracts.
- Base model: google/flan-t5-base (≈250M parameters, encoder–decoder, Apache-2.0)
- Task: Text-to-text summarization of biomedical abstracts
- Domain: Alzheimer’s disease, dementia, and related neurodegenerative / neuroimmunology literature
- Input: Full abstract (usually from PubMed or similar sources)
- Output: 1–3 sentence summary, biased towards the main results and conclusions
⚠️ Important: This model is intended only for research, education, and literature exploration.
It must not be used as a standalone tool for diagnosis, treatment decisions, or any clinical workflow.
Intended use
Primary use case
- Summarizing Alzheimer’s-related scientific abstracts into short, results-oriented summaries that are easier to scan.
- Supporting:
- literature review,
- dataset curation,
- building search / indexing tools,
- rapid exploration of Alzheimer’s disease research.
The model tends to emphasize:
- key findings (e.g., “X polymorphism is associated with AD risk”),
- high-level conclusions,
- sometimes sample characteristics (N, cohort description) when present in the abstract.
Supported languages
- English only.
- The base model is multilingual, but this fine-tuning was performed only on English biomedical abstracts.
- Using it on other languages is out of distribution and may produce poor or incorrect summaries.
Non-goals / out-of-scope
This model is not designed or validated for:
- Patient-level clinical decision support
- Prognosis estimation or risk scoring
- Generating treatment recommendations
- Legal, regulatory, or billing decisions
- Summarizing layperson health information for patients
How it was trained
Base model
google/flan-t5-base (Apache-2.0 licensed, instruction-tuned T5-base).
Training data (high-level)
The underlying dataset itself is not included in this repository. This section only documents how the data was used.
- ~9.6k abstracts related to:
- Alzheimer’s disease (AD),
- dementia,
- neurodegeneration,
- neuroinflammation / neuroimmunology,
- related biomarkers and imaging studies.
- Abstracts were retrieved programmatically from PubMed-like sources using Alzheimer’s-related queries.
- Each abstract is paired with a “teacher summary”, constructed heuristically by selecting sentences that:
  - contain section markers like RESULTS: and/or CONCLUSIONS: (if present),
  - or otherwise capture the core result statement of the study.
In other words, training labels are extractive, results-focused summaries derived from the abstracts themselves, not human-written abstractive summaries.
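For illustration, the sketch below shows what such an extractive heuristic could look like. The exact labeling script is not part of this repository; the sentence splitting and the fallback rule here are assumptions.

```python
import re

def make_teacher_summary(abstract: str, max_sentences: int = 3) -> str:
    """Illustrative extractive heuristic, not the exact labeling script."""
    # Naive sentence split on sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", abstract.strip())

    # Prefer sentences carrying a RESULTS:/CONCLUSIONS: section marker.
    picked = [
        s for s in sentences
        if re.search(r"\b(RESULTS?|CONCLUSIONS?)\s*:", s, re.IGNORECASE)
    ]

    # Fallback (assumed): use the final sentences, which usually state the core finding.
    if not picked:
        picked = sentences[-max_sentences:]

    return " ".join(picked[:max_sentences])
```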
Objective
- Text-to-text supervised fine-tuning:
  - Input: the full abstract (often with a task prefix like "summarize:" or a short instruction).
  - Target: the corresponding teacher_summary (1–3 sentences, mostly extractive).
This encourages the model to:
- focus on the result/conclusion region of the abstract,
- avoid over-emphasizing background and methods,
- stay within the factual space of the original text.
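Concretely, a single training example under this objective might look like the sketch below; the field names and the exact prefix wording are assumptions.

```python
# One illustrative (input, target) pair; field names and prefix wording are assumptions.
abstract = (
    "BACKGROUND: ... METHODS: ... "
    "RESULTS: <main result sentence>. "
    "CONCLUSIONS: <main conclusion sentence>."
)

example = {
    "input_text": "summarize: " + abstract,  # full abstract with a task prefix
    "target_text": "<main result sentence>. <main conclusion sentence>.",  # extractive, results-focused target
}
```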
Training setup (approximate)
- Framework: PyTorch + transformers
- Model class: AutoModelForSeq2SeqLM
- Tokenizer: AutoTokenizer for google/flan-t5-base
- Train/validation split: ~90% / 10% on the Alzheimer abstracts
- Hyperparameters (typical configuration used in this project):
- Epochs: 5
- Optimizer: AdamW
- Learning rate: ~1e-4
- Weight decay: ~0.01
- LR schedule: linear decay with ~10% warmup
- Batch size: effective batch size increased via gradient accumulation
- Max input length: 512 tokens
- Max target length: ≈128 tokens
- Loss: standard cross-entropy on decoder outputs with padding tokens masked
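For reference, the sketch below shows how a configuration like this could be expressed with transformers' Seq2SeqTrainer. The per-device batch size, accumulation steps, output path, and placeholder dataset are assumptions; the actual training script is not shipped with this repository.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_id = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

# `pairs` stands in for the ~9.6k (abstract, teacher_summary) examples,
# which are not distributed with this repository.
pairs = Dataset.from_list([
    {"input_text": "summarize: <abstract text>", "target_text": "<teacher summary>"},
    # ...
])
splits = pairs.train_test_split(test_size=0.1, seed=42)  # ~90% / 10%

def preprocess(batch):
    # 512-token inputs and 128-token targets, as described above.
    model_inputs = tokenizer(batch["input_text"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["target_text"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = splits.map(preprocess, batched=True, remove_columns=["input_text", "target_text"])

args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-alzheimer-summarizer",  # assumed output path
    num_train_epochs=5,
    learning_rate=1e-4,
    weight_decay=0.01,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
    per_device_train_batch_size=4,   # per-device size is an assumption
    gradient_accumulation_steps=4,   # effective batch size of 16 (assumption)
    logging_steps=50,
)

# The collator pads labels with -100 so padding tokens are masked out of the loss.
collator = DataCollatorForSeq2Seq(tokenizer, model=model, label_pad_token_id=-100)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
)
trainer.train()
```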
Training dynamics (example)
Observed loss over 5 epochs (representative run):
- Epoch 1 – Train loss ≈ 0.32 | Val loss ≈ 0.18
- Epoch 5 – Train loss ≈ 0.16 | Val loss ≈ 0.16
Combined with qualitative inspection, this indicates:
- Stable training (no divergence / NaNs)
- Reasonable convergence without strong overfitting
- Good alignment to the teacher summaries.
How to use the model
🔎 Note: The raw model is a standard seq2seq model.
For extra safety, you may want to wrap it with an overlap-based filter that removes sentences not grounded in the abstract (described later under “Safety & hallucination”; a minimal sketch is also shown after the basic usage example below).
Basic usage (raw summarization)
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_id = "ffurkandemir/flan-t5-base-alzheimer-ultra-safe" # or your actual repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
abstract = """
Alzheimer's disease (AD) is a neurodegenerative disorder...
RESULTS: Patients with moderate-severe periodontitis had a higher risk...
CONCLUSIONS: Our findings suggest that periodontal disease may be associated with...
"""
prompt = (
"Summarize the following abstract in 2-3 sentences, focusing on the main "
"results and conclusions:\n\n" + abstract
)
inputs = tokenizer(
prompt,
return_tensors="pt",
truncation=True,
max_length=512,
)
outputs = model.generate(
**inputs,
max_new_tokens=256, # higher limit to avoid truncation
num_beams=4,
no_repeat_ngram_size=3,
early_stopping=True,
)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
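As noted above, a simple overlap-based filter can then be applied to the generated summary to drop sentences that are not grounded in the abstract. The sketch below reuses the `summary` and `abstract` variables from the basic usage example; the word-overlap metric and the 0.6 threshold are illustrative assumptions, not the exact “ultra-safe” wrapper.

```python
import re

def grounded_sentences(summary: str, abstract: str, min_overlap: float = 0.6) -> str:
    """Keep only summary sentences whose tokens mostly appear in the abstract.

    Illustrative overlap filter; the metric and threshold are assumptions.
    """
    abstract_tokens = set(re.findall(r"[a-z0-9]+", abstract.lower()))
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", summary.strip()):
        tokens = re.findall(r"[a-z0-9]+", sentence.lower())
        if not tokens:
            continue
        overlap = sum(t in abstract_tokens for t in tokens) / len(tokens)
        if overlap >= min_overlap:
            kept.append(sentence)
    return " ".join(kept)

safe_summary = grounded_sentences(summary, abstract)
print(safe_summary)
```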