Model:
optimum/bert-base-NER
bert-base-NER is a fine-tuned BERT model that is ready to use for Named Entity Recognition (NER) and reports strong performance on the task (an F1 of 91.3 on the CoNLL-2003 test set). It has been trained to recognize four types of entities: locations (LOC), organizations (ORG), persons (PER), and miscellaneous (MISC).
Specifically, this model is a bert-base-cased model that was fine-tuned on the English version of the standard CoNLL-2003 Named Entity Recognition dataset.
If you'd like to use a larger BERT-large model fine-tuned on the same dataset, a bert-large-NER version is also available.
You can use this model with the Transformers pipeline for NER:
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline
tokenizer = AutoTokenizer.from_pretrained("dslim/bert-base-NER")
model = AutoModelForTokenClassification.from_pretrained("dslim/bert-base-NER")
nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "My name is Wolfgang and I live in Berlin"
ner_results = nlp(example)
print(ner_results)
Limitations and bias
This model is limited by its training dataset of entity-annotated news articles from a specific span of time, so it may not generalize well to other domains. Furthermore, the model occasionally tags subword tokens as entities, and post-processing of the results may be necessary to handle those cases.
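The subword post-processing mentioned above can be sketched as follows. This is a minimal illustration, not the author's method: the helper name `merge_wordpieces` is hypothetical, it assumes the pipeline returns one dict per tagged token with `word`, `index`, `start`, `end`, and `entity` keys, and it assumes continuation pieces carry BERT's `##` prefix. The sample in the usage note is hand-written rather than real model output.

```python
def merge_wordpieces(results):
    """Merge '##' continuation pieces into the entry of the preceding token.

    `results` is assumed to be a list of dicts shaped like the token-level
    output of the NER pipeline. The merged entry keeps the label of its
    first piece.
    """
    merged = []
    for item in results:
        prev = merged[-1] if merged else None
        is_continuation = (
            item["word"].startswith("##")
            and prev is not None
            # Only merge pieces that directly follow the previous one.
            and item["index"] == prev["_last_index"] + 1
        )
        if is_continuation:
            prev["word"] += item["word"][2:]  # drop the '##' prefix
            prev["end"] = item["end"]         # extend the character span
            prev["_last_index"] = item["index"]
        else:
            entry = dict(item)                # copy, leave the input untouched
            entry["_last_index"] = item["index"]
            merged.append(entry)
    # Remove the bookkeeping key before returning.
    for entry in merged:
        del entry["_last_index"]
    return merged
```

For example, given a hypothetical split of "Wolfgang" into `Wolf` and `##gang`:

```python
sample = [
    {"entity": "B-PER", "index": 1, "word": "Wolf", "start": 0, "end": 4},
    {"entity": "I-PER", "index": 2, "word": "##gang", "start": 4, "end": 8},
    {"entity": "B-LOC", "index": 5, "word": "Berlin", "start": 18, "end": 24},
]
print([e["word"] for e in merge_wordpieces(sample)])
```

The Transformers pipeline also accepts an `aggregation_strategy` argument that groups tokens for you, which may be preferable to hand-written merging.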
This model was fine-tuned on the English version of the standard CoNLL-2003 Named Entity Recognition dataset.
The training dataset distinguishes between the beginning and continuation of an entity so that if there are back-to-back entities of the same type, the model can output where the second entity begins. As in the dataset, each token will be classified as one of the following classes:
| Abbreviation | Description | 
|---|---|
| O | Outside of a named entity | 
| B-MISC | Beginning of a miscellaneous entity right after another miscellaneous entity |
| I-MISC | Miscellaneous entity |
| B-PER | Beginning of a person’s name right after another person’s name | 
| I-PER | Person’s name | 
| B-ORG | Beginning of an organization right after another organization | 
| I-ORG | Organization |
| B-LOC | Beginning of a location right after another location | 
| I-LOC | Location | 
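
The tagging scheme in the table can be decoded into entity spans with a short loop. The sketch below uses a hypothetical helper, `extract_entities`, and assumes the tokens are already whole words with one tag each. A new entity starts on any `B-` tag or when the entity type changes, so it also accepts tag sequences in which every entity begins with `B-`.

```python
def extract_entities(tokens, tags):
    """Group (token, tag) pairs into (entity_type, text) spans."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Emit the entity collected so far, if any.
        if current_tokens:
            entities.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag == "O":
            flush()
            current_type, current_tokens = None, []
            continue
        prefix, entity_type = tag.split("-", 1)
        if prefix == "B" or entity_type != current_type:
            # A B- tag, or a change of type, begins a new entity.
            flush()
            current_type, current_tokens = entity_type, [token]
        else:
            # I- tag of the same type continues the current entity.
            current_tokens.append(token)
    flush()
    return entities


tokens = ["Wolfgang", "visited", "New", "York", "Boston"]
tags = ["I-PER", "O", "I-LOC", "I-LOC", "B-LOC"]
print(extract_entities(tokens, tags))
# [('PER', 'Wolfgang'), ('LOC', 'New York'), ('LOC', 'Boston')]
```

In the example, the `B-LOC` tag on "Boston" is what separates it from the adjacent location "New York".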
This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper.
Number of examples per entity type:

| Dataset | LOC | MISC | ORG | PER |
|---|---|---|---|---|
| Train | 7140 | 3438 | 6321 | 6600 | 
| Dev | 1837 | 922 | 1341 | 1842 | 
| Test | 1668 | 702 | 1661 | 1617 | 

Number of articles, sentences, and tokens per dataset:

| Dataset | Articles | Sentences | Tokens |
|---|---|---|---|
| Train | 946 | 14,987 | 203,621 | 
| Dev | 216 | 3,466 | 51,362 | 
| Test | 231 | 3,684 | 46,435 | 
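
As a rough sanity check on these counts, the average sentence length and article length per split can be derived directly from the table. The snippet below is only illustrative arithmetic on the figures above; the names `SPLITS` and `split_averages` are made up for this example.

```python
# (articles, sentences, tokens) per split, copied from the table above.
SPLITS = {
    "Train": (946, 14987, 203621),
    "Dev": (216, 3466, 51362),
    "Test": (231, 3684, 46435),
}


def split_averages(splits):
    """Return average sentences per article and tokens per sentence."""
    averages = {}
    for name, (articles, sentences, tokens) in splits.items():
        averages[name] = {
            "sentences_per_article": sentences / articles,
            "tokens_per_sentence": tokens / sentences,
        }
    return averages


for name, stats in split_averages(SPLITS).items():
    print(
        f"{name}: {stats['sentences_per_article']:.1f} sentences/article, "
        f"{stats['tokens_per_sentence']:.1f} tokens/sentence"
    )
```

By these figures, sentences average roughly 13 to 15 tokens in every split.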
This model was trained on a single NVIDIA V100 GPU using the hyperparameters recommended in the original BERT paper, which trained and evaluated the model on the CoNLL-2003 NER task.
| metric | dev | test | 
|---|---|---|
| f1 | 95.1 | 91.3 | 
| precision | 95.0 | 90.7 | 
| recall | 95.3 | 91.9 | 
The test metrics are slightly lower than the official Google BERT results, which encoded document context and experimented with a CRF. More on replicating the original results here.
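
The F1 values in the table follow from the precision and recall rows, since F1 is their harmonic mean. The short check below reproduces them; `f1_score` here is a local helper written for this example, not a library call.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the table above, in percent.
print(f"dev  F1: {f1_score(95.0, 95.3):.1f}")  # 95.1
print(f"test F1: {f1_score(90.7, 91.9):.1f}")  # 91.3
```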
@article{DBLP:journals/corr/abs-1810-04805,
  author    = {Jacob Devlin and
               Ming{-}Wei Chang and
               Kenton Lee and
               Kristina Toutanova},
  title     = {{BERT:} Pre-training of Deep Bidirectional Transformers for Language
               Understanding},
  journal   = {CoRR},
  volume    = {abs/1810.04805},
  year      = {2018},
  url       = {http://arxiv.org/abs/1810.04805},
  archivePrefix = {arXiv},
  eprint    = {1810.04805},
  timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1810-04805.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
@inproceedings{tjong-kim-sang-de-meulder-2003-introduction,
    title = "Introduction to the {C}o{NLL}-2003 Shared Task: Language-Independent Named Entity Recognition",
    author = "Tjong Kim Sang, Erik F.  and
      De Meulder, Fien",
    booktitle = "Proceedings of the Seventh Conference on Natural Language Learning at {HLT}-{NAACL} 2003",
    year = "2003",
    url = "https://www.aclweb.org/anthology/W03-0419",
    pages = "142--147",
}