How to use raynardj/ner-chemical-bionlp-bc5cdr-pubmed with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("token-classification", model="raynardj/ner-chemical-bionlp-bc5cdr-pubmed")

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("raynardj/ner-chemical-bionlp-bc5cdr-pubmed")
model = AutoModelForTokenClassification.from_pretrained("raynardj/ner-chemical-bionlp-bc5cdr-pubmed")
NER to find Chemicals
The model was trained on the BioNLP and BC5CDR datasets, starting from a PubMed-pretrained RoBERTa model.
All the labels, i.e. the possible token classes:
{
  "label2id": {
    "O": 0,
    "Chemical": 1
  }
}
Note that we removed the 'B-', 'I-' etc. prefixes from the data labels. 🗡
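To make the label simplification concrete, here is a minimal sketch of how prefixed tags could be mapped onto the two classes above. The helper name `strip_bio_prefix`, the set of prefixes it handles (B, I, E, S), and the example tags are assumptions for illustration, not the preprocessing code actually used for training.

```python
def strip_bio_prefix(tag: str) -> str:
    """Drop a leading 'B-' / 'I-' style prefix, e.g. 'B-Chemical' -> 'Chemical'."""
    # Assumption: prefixes are a single letter (B, I, E or S) followed by '-'
    if len(tag) > 2 and tag[1] == "-" and tag[0] in "BIES":
        return tag[2:]
    return tag


# Hypothetical tags in BIO format, as found in many NER datasets
raw_tags = ["O", "B-Chemical", "I-Chemical", "O"]
plain_tags = [strip_bio_prefix(t) for t in raw_tags]

# Same mapping as the label2id listed in this card
label2id = {"O": 0, "Chemical": 1}
label_ids = [label2id[t] for t in plain_tags]

print(plain_tags)
print(label_ids)
```

With the prefixes gone, the model only decides whether a token is part of a chemical mention, not where the mention starts or ends.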
This is the template we suggest for using the model
Of course I'm well aware of the aggregation_strategy argument offered by hf. However, because of the way this model was trained, the cross-entropy loss for trailing subword tokens is discarded: only the label of the 1st subword token of each word is not -100. After a lot of searching, I couldn't find a way to reproduce that with the default pipeline, so I wrote an inference class myself.
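The first-subword labelling scheme described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual training code: `align_labels` is a hypothetical helper, and `word_ids` is assumed to have the shape returned by a fast tokenizer's `word_ids()` method (one word index per token, `None` for special tokens).

```python
from typing import List, Optional

# Positions labelled -100 are skipped by PyTorch's CrossEntropyLoss by default
IGNORE_INDEX = -100


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Give each word's label to its first subword token only; mask everything else."""
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            # Special token (start/end of sequence, padding)
            aligned.append(IGNORE_INDEX)
        elif wid != previous:
            # First subword of a new word keeps the word's label
            aligned.append(word_labels[wid])
        else:
            # Trailing subword of the same word: no loss is computed here
            aligned.append(IGNORE_INDEX)
        previous = wid
    return aligned


# Hypothetical example: "aspirin reduces fever", with "aspirin" split into two subwords
word_ids = [None, 0, 0, 1, 2, None]
word_labels = [1, 0, 0]  # Chemical, O, O

aligned = align_labels(word_ids, word_labels)
print(aligned)  # [-100, 1, -100, 0, 0, -100]
```

Because the trailing subwords never contribute to the loss, the model's outputs at those positions are unconstrained, which is why they should not be averaged or voted over at inference time.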
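# Install the forgebox helper library (notebook shell command)
!pip install forgebox
from forgebox.hf.train import NERInference

# Load the model together with the custom inference wrapper
ner = NERInference.from_pretrained("raynardj/ner-chemical-bionlp-bc5cdr-pubmed")
# Run prediction on a list of input texts
a_df = ner.predict(["text1", "text2"])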
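For readers curious what first-subword decoding looks like in isolation, the following is a minimal sketch of the idea. It is not a description of what `NERInference` does internally: `first_subword_labels` is a hypothetical helper, and the token-level predictions are made-up values standing in for the argmax of the model's logits.

```python
from typing import List, Optional, Tuple

id2label = {0: "O", 1: "Chemical"}


def first_subword_labels(
    word_ids: List[Optional[int]], token_pred_ids: List[int]
) -> List[str]:
    """Return one label per word, read from the word's first subword token."""
    labels = []
    previous = None
    for wid, pred in zip(word_ids, token_pred_ids):
        # Skip special tokens (None) and trailing subwords (same word index as before)
        if wid is not None and wid != previous:
            labels.append(id2label[pred])
        previous = wid
    return labels


# Hypothetical example: "aspirin" is split into two subword tokens
words = ["aspirin", "reduces", "fever"]
word_ids = [None, 0, 0, 1, 2, None]
token_pred_ids = [0, 1, 0, 0, 0, 0]  # made-up per-token predictions

result: List[Tuple[str, str]] = list(
    zip(words, first_subword_labels(word_ids, token_pred_ids))
)
print(result)
```

Note that the prediction on the second subword of "aspirin" is ignored entirely, mirroring how that position was masked with -100 during training.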
check our NER model on