
An insanely fast whisper CLI

Project description

Insanely Fast Whisper

Powered by 🤗 Transformers, Optimum & flash-attn

TL;DR - Transcribe 300 minutes (5 hours) of audio in less than 5 minutes - with OpenAI's Whisper Large v2. Blazingly fast transcription is now a reality!⚡️

Not convinced? Here are some benchmarks we ran on a free Google Colab T4 GPU! 👇

Optimisation type                                                          Time to Transcribe (150 mins of Audio)
Transformers (fp32)                                                        ~31 min (31 min 1 sec)
Transformers (fp32 + batching [8])                                         ~13 min (13 min 19 sec)
Transformers (fp16 + batching [24] + bettertransformer)                    ~5 min (5 min 2 sec)
Transformers (distil-whisper) (fp16 + batching [24] + bettertransformer)   ~3 min (3 min 16 sec)
Faster Whisper (fp16 + beam_size [1])                                      ~9 min (9 min 23 sec)
Faster Whisper (8-bit + beam_size [1])                                     ~8 min (8 min 15 sec)
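For reference, here's a minimal sketch of the fastest configuration in the table above (distil-whisper + fp16 + batching [24] + BetterTransformer). The distil-whisper/distil-large-v2 checkpoint and the 30-second chunk length are assumptions here, not settings taken from the benchmark run:

import torch
from transformers import pipeline

# Assumed checkpoint: distil-whisper/distil-large-v2 (distilled, faster Whisper variant)
pipe = pipeline("automatic-speech-recognition",
                "distil-whisper/distil-large-v2",
                torch_dtype=torch.float16,
                device="cuda:0")

# Swap in BetterTransformer kernels (requires the `optimum` package)
pipe.model = pipe.model.to_bettertransformer()

# Assumed 30-second chunks; batch size matches the [24] used in the benchmark row
outputs = pipe("<FILE_NAME>",
               chunk_length_s=30,
               batch_size=24,
               return_timestamps=True)

outputs["text"]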

🆕 You can now access blazingly fast transcriptions via your terminal! ⚡️

We've added a CLI to enable fast transcriptions. Here's how you can use it:

Transcribe your audio

Install insanely-fast-whisper with pipx:

pipx install insanely-fast-whisper

Run inference from any path on your computer:

insanely-fast-whisper --file-name <filename or URL>

Don't want to install? Just use pipx run:

pipx run insanely-fast-whisper --file-name <filename or URL>

Note: The CLI is opinionated and currently only works for Nvidia GPUs. Make sure to check out the defaults and the list of options you can play around with to maximise your transcription throughput. Run insanely-fast-whisper --help or pipx run insanely-fast-whisper --help to get all the CLI arguments and defaults.

How to use it without the CLI?

For older GPUs, all you need to run is:

import torch
from transformers import pipeline

# Load Whisper large-v2 in half precision on the first CUDA device
pipe = pipeline("automatic-speech-recognition",
                "openai/whisper-large-v2",
                torch_dtype=torch.float16,
                device="cuda:0")

# Swap in BetterTransformer kernels (requires the `optimum` package)
pipe.model = pipe.model.to_bettertransformer()

# Transcribe in 30-second chunks, 24 chunks per batch, with timestamps
outputs = pipe("<FILE_NAME>",
               chunk_length_s=30,
               batch_size=24,
               return_timestamps=True)

# The full transcription
outputs["text"]

For newer GPUs (A10, A100, H100), use Flash Attention 2:

import torch
from transformers import pipeline

# Load Whisper large-v2 in half precision with Flash Attention 2 enabled
pipe = pipeline("automatic-speech-recognition",
                "openai/whisper-large-v2",
                torch_dtype=torch.float16,
                model_kwargs={"use_flash_attention_2": True},
                device="cuda:0")

# Transcribe in 30-second chunks, 24 chunks per batch, with timestamps
outputs = pipe("<FILE_NAME>",
               chunk_length_s=30,
               batch_size=24,
               return_timestamps=True)

# The full transcription
outputs["text"]

Roadmap

  • Add benchmarks for Whisper.cpp
  • Add benchmarks for 4-bit inference
  • Add a light CLI script
  • Deployment script with Inference API

Community showcase

@ochen1 created a brilliant MVP for a CLI here: https://github.com/ochen1/insanely-fast-whisper-cli (Try it out now!)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

insanely_fast_whisper-0.0.3.tar.gz (2.9 kB)

Uploaded: Source

Built Distribution

insanely_fast_whisper-0.0.3-py3-none-any.whl (3.9 kB)

Uploaded: Python 3

File details

Details for the file insanely_fast_whisper-0.0.3.tar.gz.

File metadata

File hashes

Hashes for insanely_fast_whisper-0.0.3.tar.gz

Algorithm     Hash digest
SHA256        9ff41da586b293768c585089f4bf2827fe8f433e6486aa8c375630c13d7c03b9
MD5           8d6224937332d3235f87b155f07a5a41
BLAKE2b-256   6733b41538ac36e514bd428aef35574cc69526b51eeb5ad53b76ccc879a10789


File details

Details for the file insanely_fast_whisper-0.0.3-py3-none-any.whl.

File metadata

File hashes

Hashes for insanely_fast_whisper-0.0.3-py3-none-any.whl

Algorithm     Hash digest
SHA256        4aa7a00c0750588820623f3de424c9153278e5fb255e3c8a909a0c940f29a6e8
MD5           b8cdafd822a991f538bbba096e8581ee
BLAKE2b-256   110f9754980b6c81fa713b83f8fb50406c3a03a5a040b4217cec45b5431c7bc5

