linh101201/scibert-concept-annotation

Text Classification

concept-annotation

sequence-classification


linh101201 committed 14 days ago

Commit f284577 · verified · 1 Parent(s): b4e1e67

Update README.md

Files changed (1)

README.md +47 -4

README.md CHANGED

@@ -1,11 +1,54 @@
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
 import torch
-model = AutoModelForSequenceClassification.from_pretrained("linh101201/scibert-concept-annotation", num_labels=2).to("cuda")
-tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
-inputs = tokenizer("Large Language Model in Law Documents Hub", "natural language processing", return_tensors="pt").to("cuda")
 with torch.no_grad():
     logits = model(**inputs).logits
-    print(logits)

+---
+language: en
+license: apache-2.0
+library_name: transformers
+tags:
+- scibert
+- concept-annotation
+- nlp
+- sequence-classification
+metrics:
+- accuracy
+pipeline_tag: text-classification
+---
+# SciBERT Concept Annotation
+This model is a fine-tuned version of SciBERT for **Concept Annotation**. It takes a document text and a concept (term) as a sentence pair and classifies the relationship between them with a two-label sequence classification head.
+## Model Description
+- **Model type:** SciBERT (BERT-based)
+- **Language(s):** English
+- **License:** Apache 2.0
+- **Fine-tuned from model:** `allenai/scibert_scivocab_uncased`
+## Usage
+You can run this model with a short custom inference script. The model weights are hosted in this repository, but the model is designed to work with the `allenai/scibert_scivocab_uncased` tokenizer, so load the tokenizer from that repository.
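The example in this README prints raw logits and their softmax probabilities. As a minimal sketch of how a two-logit output becomes a decision, the snippet below applies a softmax and picks the most probable class in plain Python. The logit values are hypothetical, not real model output, and the helper names `softmax` and `predict` are illustrative. The README does not state which class index means that the concept applies, so the mapping from index to meaning is left open here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return (index of the most probable class, its probability)."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, probs[index]


# Hypothetical logits for one (text, concept) pair; NOT real model output.
example_logits = [-1.2, 2.3]
label, confidence = predict(example_logits)
print(f"Predicted class index: {label}, probability: {confidence:.3f}")
```

With real model output, the same step is `torch.nn.functional.softmax(logits, dim=-1)` followed by `argmax(dim=-1)`. What each index means depends on how the labels were encoded during fine-tuning.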
+### Example Code
+```python
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
 import torch
+# Load model and tokenizer
+model_id = "linh101201/scibert-concept-annotation"
+tokenizer_id = "allenai/scibert_scivocab_uncased"
+model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2).to("cuda" if torch.cuda.is_available() else "cpu")
+tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
+# Example inputs: the document text and the concept to annotate, encoded as a sentence pair
+text = "Large Language Model in Law Documents Hub"
+concept = "natural language processing"
+inputs = tokenizer(text, concept, return_tensors="pt").to(model.device)
 with torch.no_grad():
     logits = model(**inputs).logits
+    # Apply softmax to get probabilities
+    probs = torch.nn.functional.softmax(logits, dim=-1)
+    print(f"Logits: {logits}")
+    print(f"Probabilities: {probs}")