
Few-Shot Named Entity Recognition using Span Markers


SpanMarker for Named Entity Recognition

SpanMarker is a framework for training powerful Named Entity Recognition models using familiar encoders such as BERT, RoBERTa and DeBERTa. Because it is implemented directly on top of the 🤗 Transformers library, SpanMarker can reuse that library's functionality, such as the TrainingArguments configuration used in the Quick Start.

SpanMarker is based on the PL-Marker paper and focuses on accessibility and ease of use. It works out of the box with many common encoders, such as bert-base-cased and roberta-large, and automatically handles datasets annotated with the IOB, IOB2, BIOES or BILOU scheme, as well as datasets with no label annotation scheme.
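
To make the annotation schemes mentioned above more concrete, here is a minimal sketch of how IOB2 tags map onto labeled entity spans. The function name iob2_to_spans and the example sentence are hypothetical and exist only for illustration; this is not SpanMarker's own implementation or API, which handles the schemes for you.

```python
from typing import List, Tuple


def iob2_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert IOB2 tags into (start, end_exclusive, label) spans.

    Hypothetical helper for illustration only, not part of span_marker.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # An open span continues only on an "I-" tag with the same label.
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        # "B-" always opens a span; a stray "I-" is leniently treated as an opener.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    # Close a span that runs until the end of the sentence.
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


tokens = ["Amelia", "Earhart", "flew", "to", "Paris"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
for start, end, label in iob2_to_spans(tags):
    print(label, tokens[start:end])
```

The other schemes encode the same spans with different boundary tags, for example explicit end or single-token tags in BIOES and BILOU.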

Documentation

Feel free to have a look at the documentation.

Installation

You may install the span_marker Python module via pip like so:

pip install span_marker
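
After installing, it can be useful to confirm which version ended up in the active environment. The following is a small sketch using only the Python standard library; the helper name installed_version is hypothetical and not part of span_marker.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is not installed.
print(installed_version("span_marker"))
```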

Quick Start

Please have a look at our Getting Started notebook for details on how SpanMarker is commonly used. It explains the following snippet in more detail.

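from datasets import load_dataset
from span_marker import SpanMarkerModel, Trainer
from transformers import TrainingArguments

# Load the supervised configuration of Few-NERD and read its label names
dataset = load_dataset("DFKI-SLT/few-nerd", "supervised")
labels = dataset["train"].features["ner_tags"].feature.names

# Initialize a SpanMarker model on top of a pretrained encoder
model_name = "bert-base-cased"
model = SpanMarkerModel.from_pretrained(model_name, labels=labels)

# Standard 🤗 Transformers training configuration
args = TrainingArguments(
    output_dir="my_span_marker_model",
    learning_rate=5e-5,
    gradient_accumulation_steps=2,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=1,
    save_strategy="steps",
    # Note: eval_steps only takes effect when step-based evaluation is enabled
    # (evaluation_strategy="steps", named eval_strategy in recent transformers releases)
    eval_steps=200,
    logging_steps=50,
    fp16=True,  # mixed precision training; requires a CUDA-capable GPU
    warmup_ratio=0.1,
)

# Train and evaluate on subsets of the data to keep this example quick
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"].select(range(8000)),
    eval_dataset=dataset["validation"].select(range(2000)),
)

trainer.train()
trainer.save_model("my_span_marker_model/checkpoint-final")

# Compute and print metrics on the evaluation subset
metrics = trainer.evaluate()
print(metrics)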
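
As a rough guide to what the training arguments in the snippet above imply, the sketch below estimates the effective batch size, the number of optimizer steps and the number of warmup steps. It assumes a single device and exactly one training sample per dataset row; the second assumption may not hold in practice, since preprocessing can change the number of samples the trainer actually sees. The helper estimate_schedule is hypothetical and not part of span_marker or transformers.

```python
import math


def estimate_schedule(
    num_samples: int,
    per_device_batch_size: int,
    grad_accum_steps: int,
    num_epochs: int,
    warmup_ratio: float,
    num_devices: int = 1,
) -> dict:
    """Rough estimate of optimizer steps; real trainers may round differently."""
    # Samples consumed per optimizer update
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = steps_per_epoch * num_epochs
    # Learning rate warmup covers this many of the total steps
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch_size": effective_batch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Values taken from the Quick Start snippet: 8000 rows, batch size 4,
# gradient accumulation 2, one epoch, warmup ratio 0.1.
print(estimate_schedule(8000, 4, 2, 1, 0.1))
```

Under these assumptions the estimate is an effective batch size of 8, about 1000 optimizer steps and about 100 warmup steps.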

Because this work is based on PL-Marker, you can expect results similar to those on its Papers with Code leaderboard. Tests, documentation and further information on expected performance will come soon.

Pretrained Models

Changelog

See CHANGELOG.md for news on all SpanMarker versions.
