Async Python SDK for Amazon Transcribe Streaming
Amazon Transcribe Streaming SDK
The Amazon Transcribe Streaming SDK lets users interface directly with the Amazon Transcribe Streaming service from their Python programs. The goal of the project is to enable integration with Amazon Transcribe using nothing more than a stream of audio bytes and a basic handler.
This project is still in early alpha, so the interface is subject to change and may see rapid iteration. Pinning strict dependency versions is highly advised if you use it outside of local testing.
Installation
To install from pip:
python -m pip install amazon-transcribe
To install from GitHub:
git clone https://github.com/awslabs/amazon-transcribe-streaming-sdk.git
cd amazon-transcribe-streaming-sdk
python -m pip install .
To use from your Python application, add amazon-transcribe as a dependency in your requirements.txt file.
NOTE: This SDK is built on top of the AWS Common Runtime (CRT), a collection of C libraries we interact with through bindings. The CRT is available on PyPI (awscrt) as precompiled wheels for common platforms (Linux, macOS, Windows). Users on other platforms may need to compile these libraries themselves.
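Because the interface may change between releases and the CRT bindings ship as a separate package, it can help to confirm what is actually installed before running anything. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `matches_pin` are hypothetical and not part of the SDK, and the pin value you compare against is whatever you chose in your own requirements.txt.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(dist_name: str, pinned: str) -> bool:
    """Return True only when the distribution is installed at exactly `pinned`."""
    return installed_version(dist_name) == pinned


# Report what is present for the SDK and its CRT bindings.
for name in ("amazon-transcribe", "awscrt"):
    print(name, installed_version(name) or "not installed")
```

If `awscrt` reports as not installed on your platform, that is a hint that no precompiled wheel was available and a source build may be needed.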
Usage
Setup for this SDK will require either live or prerecorded audio. Full details on the audio input requirements can be found in the Amazon Transcribe Streaming documentation.
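Before streaming a prerecorded file, it may be worth checking that its header matches the parameters you intend to pass to the service. The sketch below uses the standard library `wave` module to read the sample rate, channel count, and sample width of a WAV file, and to work out how much audio a chunk of a given size carries. The names `AudioInfo`, `inspect_wav`, `chunk_duration_seconds`, and `make_silent_wav` are hypothetical helpers, and the 16 kHz mono 16-bit values are assumptions taken from the example parameters in this README rather than a statement of the service's requirements.

```python
import io
import wave
from typing import NamedTuple


class AudioInfo(NamedTuple):
    sample_rate_hz: int
    channels: int
    sample_width_bytes: int


def inspect_wav(source) -> AudioInfo:
    """Read the header of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return AudioInfo(wav.getframerate(), wav.getnchannels(), wav.getsampwidth())


def chunk_duration_seconds(info: AudioInfo, chunk_size: int) -> float:
    """Seconds of audio carried by `chunk_size` bytes of raw PCM."""
    bytes_per_second = info.sample_rate_hz * info.channels * info.sample_width_bytes
    return chunk_size / bytes_per_second


def make_silent_wav(sample_rate_hz: int = 16000, seconds: float = 1.0) -> bytes:
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * int(sample_rate_hz * seconds))
    return buffer.getvalue()


info = inspect_wav(io.BytesIO(make_silent_wav()))
print(info)
print(chunk_duration_seconds(info, 1024 * 16))
```

With these assumed parameters, a 16 KiB chunk holds roughly half a second of audio. For a real file, pass its path to `inspect_wav` and compare `sample_rate_hz` with the `media_sample_rate_hz` value you plan to send.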
Here's an example to get started:
import asyncio

# This example uses aiofile for asynchronous file reads.
# It's not a dependency of the project but can be installed
# with `pip install aiofile`.
import aiofile

from amazon_transcribe.client import TranscribeStreamingClient
from amazon_transcribe.handlers import TranscriptResultStreamHandler
from amazon_transcribe.model import TranscriptEvent

"""
Here's an example of a custom event handler you can extend to
process the returned transcription results as needed. This
handler will simply print the text out to your interpreter.
"""


class MyEventHandler(TranscriptResultStreamHandler):
    async def handle_transcript_event(self, transcript_event: TranscriptEvent):
        # This handler can be implemented to handle transcriptions as needed.
        # Here's an example to get started.
        results = transcript_event.transcript.results
        for result in results:
            for alt in result.alternatives:
                print(alt.transcript)


async def basic_transcribe():
    # Set up our client with our chosen AWS region
    client = TranscribeStreamingClient(region="us-west-2")

    # Start transcription to generate our async stream
    stream = await client.start_stream_transcription(
        language_code="en-US",
        media_sample_rate_hz=16000,
        media_encoding="pcm",
    )

    async def write_chunks():
        # An example file can be found at tests/integration/assets/test.wav
        async with aiofile.AIOFile('tests/integration/assets/test.wav', 'rb') as afp:
            reader = aiofile.Reader(afp, chunk_size=1024 * 16)
            async for chunk in reader:
                await stream.input_stream.send_audio_event(audio_chunk=chunk)
        # Signal the end of the audio once every chunk has been sent
        await stream.input_stream.end_stream()

    # Instantiate our handler and start processing events
    handler = MyEventHandler(stream.output_stream)
    await asyncio.gather(write_chunks(), handler.handle_events())


# asyncio.run creates the event loop, runs the coroutine, and closes the loop.
asyncio.run(basic_transcribe())
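The handler above prints every alternative of every result, which for a streaming service usually includes interim text that is later revised. One possible refinement is to keep only finalized results. The sketch below assumes each result object exposes a boolean `is_partial` attribute alongside `alternatives`; treat that attribute name as an assumption to confirm against the SDK's model classes. The helper `final_transcripts` is hypothetical, and the demonstration uses `SimpleNamespace` stand-ins rather than real SDK objects so it runs without AWS credentials.

```python
from types import SimpleNamespace
from typing import Iterable, List


def final_transcripts(results: Iterable) -> List[str]:
    """Return the first alternative's text for each non-partial result."""
    finals = []
    for result in results:
        # Assumption: results carry a boolean `is_partial` attribute.
        # A missing attribute is treated as a finalized result.
        if getattr(result, "is_partial", False):
            continue
        if result.alternatives:
            finals.append(result.alternatives[0].transcript)
    return finals


# Stand-in objects shaped like the results used in the handler above.
demo_results = [
    SimpleNamespace(
        is_partial=True,
        alternatives=[SimpleNamespace(transcript="hello wor")],
    ),
    SimpleNamespace(
        is_partial=False,
        alternatives=[SimpleNamespace(transcript="hello world")],
    ),
]
print(final_transcripts(demo_results))
```

Inside `handle_transcript_event`, the same helper could be called with `transcript_event.transcript.results` in place of the nested loops.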
Security
See CONTRIBUTING for more information.
License
This project is licensed under the Apache-2.0 License.