Few-Shot Named Entity Recognition using Span Markers

SpanMarker for Named Entity Recognition

SpanMarker is a framework for training powerful Named Entity Recognition models using familiar encoders such as BERT, RoBERTa and DeBERTa. Tightly implemented on top of the 🤗 Transformers library, SpanMarker can take advantage of its valuable functionality.

Based on the PL-Marker paper, SpanMarker breaks the mold through its accessibility and ease of use. Crucially, SpanMarker works out of the box with many common encoders such as bert-base-cased and roberta-large, and automatically works with datasets using the IOB, IOB2, BIOES, BILOU or no label annotation scheme.
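To make the annotation schemes concrete, here is a minimal sketch of how IOB2 token tags correspond to labelled entity spans. The helper `iob2_to_spans` is hypothetical and written only for illustration; it is not part of the span_marker API and does not claim to reflect how SpanMarker handles labels internally.

```python
from typing import List, Optional, Tuple


def iob2_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert IOB2 tags into (label, start, end) spans, with `end` exclusive.

    Hypothetical helper for illustration; not part of span_marker.
    """
    spans: List[Tuple[str, int, int]] = []
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags):
        # An "I-X" tag continues the open span only if the label matches.
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is None:
            # Be lenient: a stray "I-X" opens a new span, as in the IOB scheme.
            start, label = i, tag[2:]
    if start is not None:
        spans.append((label, start, len(tags)))
    return spans


tokens = ["Amelia", "Earhart", "flew", "to", "Paris"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
for label, start, end in iob2_to_spans(tags):
    print(label, tokens[start:end])
```

The other schemes differ only in how span boundaries are marked (for example, BIOES and BILOU add explicit end and single-token tags), so they describe the same underlying spans.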

Installation

You may install the span_marker Python module via pip like so:

pip install span_marker

Quick Start

Please have a look at our Getting Started Jupyter notebook for details on how SpanMarker is commonly used. That notebook explains the following snippet in more detail.

from datasets import load_dataset
from span_marker import SpanMarkerModel, Trainer
from transformers import TrainingArguments

dataset = load_dataset("DFKI-SLT/few-nerd", "supervised")
labels = dataset["train"].features["ner_tags"].feature.names

model_name = "bert-base-cased"
model = SpanMarkerModel.from_pretrained(model_name, labels=labels)

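args = TrainingArguments(
    output_dir="my_span_marker_model",
    learning_rate=5e-5,
    gradient_accumulation_steps=2,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=1,
    # eval_steps only takes effect when evaluation runs on a step schedule;
    # newer transformers releases name this argument eval_strategy.
    evaluation_strategy="steps",
    save_strategy="steps",
    eval_steps=200,
    logging_steps=50,
    fp16=True,  # mixed precision; requires a CUDA GPU, set to False on CPU
    warmup_ratio=0.1,
)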

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"].select(range(8000)),
    eval_dataset=dataset["validation"].select(range(2000)),
)

trainer.train()
trainer.save_model("my_span_marker_model/checkpoint-final")

metrics = trainer.evaluate()
print(metrics)
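As a rough sanity check on the training arguments above, the following sketch estimates the effective batch size, the number of optimizer steps and the number of warmup steps. It assumes a single device and one training sample per dataset row; that second assumption may not hold if the library splits rows into several samples during preprocessing, so treat the numbers as an estimate only. The function `estimate_schedule` is a hypothetical helper, not part of span_marker or transformers.

```python
import math
from typing import Tuple


def estimate_schedule(
    num_samples: int,
    per_device_batch_size: int,
    grad_accum_steps: int,
    num_epochs: int,
    warmup_ratio: float,
    num_devices: int = 1,
) -> Tuple[int, int, int]:
    """Return (effective_batch_size, total_optimizer_steps, warmup_steps).

    Hypothetical helper; an approximation, not the Trainer's exact logic.
    """
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    # Batches the data loader yields per epoch across all devices.
    batches_per_epoch = math.ceil(num_samples / (per_device_batch_size * num_devices))
    # One optimizer step happens every `grad_accum_steps` batches.
    steps_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    total_steps = steps_per_epoch * num_epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return effective_batch, total_steps, warmup_steps


# Values taken from the snippet above: 8000 training rows, batch size 4,
# gradient accumulation 2, one epoch, warmup ratio 0.1.
print(estimate_schedule(8000, 4, 2, 1, 0.1))
```

Under these assumptions, an eval_steps value of 200 would trigger only a handful of evaluations during the single epoch.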

Because this work is based on PL-Marker, you may expect results similar to those on its Papers with Code leaderboard. Tests, documentation and further information on expected performance will follow soon.

Pretrained Models

Download files


Source Distribution

span_marker-0.1.1.tar.gz (17.9 kB)

Built Distribution

span_marker-0.1.1-py3-none-any.whl (18.8 kB)
