An open source implementation of the audio server part of the Hermes protocol

Hermes Audio Server

Hermes Audio Server implements the audio server part of the Hermes protocol defined by Snips.

It's meant to be used with Rhasspy, an offline, multilingual voice assistant toolkit that works with Home Assistant and is completely open source.

With Hermes Audio Server, you can use the microphone and speaker of your computer (such as a Raspberry Pi) as remote audio input and output for a Rhasspy system.

System requirements

Hermes Audio Server requires Python 3. It has been tested on a Raspberry Pi running Raspbian 9.8, but in principle it should be cross-platform. Please open an issue on GitHub if you encounter problems on your platform.

Installation

You can install Hermes Audio Server and its dependencies like this:

sudo apt install portaudio19-dev
sudo pip3 install hermes-audio-server

Note: this installs Hermes Audio Server globally. If you want to install Hermes Audio Server in a Python virtual environment, drop the sudo.

Configuration

Hermes Audio Server is configured in the JSON file /etc/hermes-audio-server.json, which has the following format:

{
    "site": "default",
    "mqtt": {
        "host": "localhost",
        "port": 1883,
        "authentication": {
            "username": "foobar",
            "password": "secretpassword"
        },
        "tls": {
            "ca_certificates": "",
            "client_certificate": "",
            "client_key": ""
        },
        "vad": {
            "mode": 0,
            "silence": 2,
            "status_messages": true
        }
    }
}

All keys are optional. By default, Hermes Audio Server connects to localhost:1883 without authentication or TLS and uses default as the site ID.
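
The defaults described above can be sketched in Python as follows. This is only an illustration of the documented behaviour, not the project's actual code: the names DEFAULTS, apply_defaults and load_config are hypothetical, and the host broker.local in the usage lines is a made-up example.

```python
import json
from copy import deepcopy

# Defaults as described in the text: localhost:1883, site ID "default",
# no authentication and no TLS. (Hypothetical names, for illustration only.)
DEFAULTS = {
    "site": "default",
    "mqtt": {"host": "localhost", "port": 1883},
}


def apply_defaults(raw):
    """Return a copy of DEFAULTS overridden by the keys present in `raw`."""
    config = deepcopy(DEFAULTS)
    config["site"] = raw.get("site", config["site"])
    # Copy every "mqtt" subkey that was given; absent subkeys keep defaults.
    config["mqtt"].update(raw.get("mqtt", {}))
    return config


def load_config(path="/etc/hermes-audio-server.json"):
    """Read the JSON file at `path` and fill in the documented defaults."""
    with open(path, encoding="utf-8") as config_file:
        return apply_defaults(json.load(config_file))


if __name__ == "__main__":
    # An empty object is valid because all keys are optional.
    print(apply_defaults({}))
    # Only the host is overridden; the port keeps its default.
    print(apply_defaults({"mqtt": {"host": "broker.local"}}))
```

Merging over a deep copy keeps the defaults untouched between calls, so a partial configuration such as only a host name still ends up with a complete set of connection settings.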

Currently Hermes Audio Server uses the system's default microphone and speaker. In the next version this will be configurable.

Voice Activity Detection

Voice Activity Detection is an experimental feature in Hermes Audio Server and is disabled by default. It is based on py-webrtcvad and tries to suppress sending audio frames when there is no speech. How well this works depends heavily on your microphone, your environment and your configuration of the VAD feature. Voice Activity Detection in Hermes Audio Server should therefore not be considered a privacy feature, but a feature to save network bandwidth. If you really don't want to send audio frames on your network except when giving voice commands, you should run a wake word service on your device and only start streaming audio to your Rhasspy server after the wake word is detected, until the end of the command.

If the vad key is not specified in the configuration file, Voice Activity Detection is not enabled and all recorded audio frames are streamed continuously on the network. If you don't want this, specify the vad key to only stream audio when voice activity is detected. You can configure the VAD feature with the following subkeys:

  • mode: This should be an integer between 0 and 3. 0 is the least aggressive about filtering out non-speech, 3 is the most aggressive. Defaults to 0.
  • silence: This defines how much silence (no speech detected) in seconds has to go by before Hermes Audio Recorder considers it the end of a voice message. Defaults to 2. Make sure that this value is higher than or equal to min_sec in the configuration of WebRTCVAD for the command listener of Rhasspy, otherwise the audio stream for the command listener could be aborted too soon.
  • status_messages: This is a boolean: true or false. Specifies whether or not Hermes Audio Recorder sends messages on MQTT when it detects the start or end of a voice message. Defaults to false. This is useful for debugging, when you want to find the right values for mode and silence.
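
To make the interplay of these subkeys concrete, here is a minimal Python sketch of how a silence timeout could gate the audio stream. It is not the code Hermes Audio Recorder uses: VadSettings and gate_frames are hypothetical names, and the per-frame speech flags are assumed to come from a detector such as py-webrtcvad configured with the chosen mode.

```python
from dataclasses import dataclass


@dataclass
class VadSettings:
    """Hypothetical container for the three documented subkeys."""
    mode: int = 0
    silence: float = 2.0
    status_messages: bool = False

    def __post_init__(self):
        if self.mode not in (0, 1, 2, 3):
            raise ValueError("mode must be an integer between 0 and 3")
        if self.silence < 0:
            raise ValueError("silence must not be negative")


def gate_frames(speech_flags, frame_seconds, settings):
    """Yield ("start" | "frame" | "end", index) events for a flag sequence.

    speech_flags: one boolean per audio frame (True = speech detected).
    frame_seconds: duration of a single frame in seconds.
    """
    in_message = False
    silent_for = 0.0
    for index, is_speech in enumerate(speech_flags):
        if is_speech:
            if not in_message:
                in_message = True
                yield ("start", index)
            silent_for = 0.0
        elif in_message:
            silent_for += frame_seconds
            if silent_for >= settings.silence:
                # Enough silence has passed: the voice message is over.
                in_message = False
                yield ("end", index)
                continue
        if in_message:
            # Short pauses inside a message are still streamed.
            yield ("frame", index)
```

In this sketch the "frame" events mark the frames that would be streamed, while the "start" and "end" events correspond to the moments at which a status message would be sent if status_messages is true. A larger silence value keeps the stream open across longer pauses, which is why it should not be lower than the end-of-command timeout on the Rhasspy side.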

Running Hermes Audio Server

Hermes Audio Server consists of two commands: Hermes Audio Player, which receives WAV files on MQTT and plays them on the speaker, and Hermes Audio Recorder, which records audio from the microphone and sends it as WAV audio frames on MQTT.

You can run the Hermes Audio Player like this:

hermes-audio-player

You can run the Hermes Audio Recorder like this:

hermes-audio-recorder

You can run both, or only one of them if you only want to use the speaker or microphone.
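
As an illustration of what an audio frame on MQTT can look like, the following Python sketch wraps raw PCM samples in an in-memory WAV container and builds a topic name for it. Several things here are assumptions that this README does not state: the topic layout hermes/audioServer/<siteId>/audioFrame follows the Hermes protocol as commonly documented by Snips, the audio format (16 kHz, mono, 16-bit) is a typical choice rather than a confirmed setting, and the function names are hypothetical.

```python
import io
import wave


def audio_frame_topic(site_id):
    """Topic for microphone frames (assumed Hermes protocol layout)."""
    return "hermes/audioServer/{}/audioFrame".format(site_id)


def pcm_to_wav_bytes(pcm, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def wav_bytes_to_pcm(payload):
    """Return (sample_rate, raw PCM bytes) from a WAV payload."""
    with wave.open(io.BytesIO(payload), "rb") as wav_file:
        rate = wav_file.getframerate()
        return rate, wav_file.readframes(wav_file.getnframes())


if __name__ == "__main__":
    chunk = b"\x00\x00" * 256  # 256 silent 16-bit samples
    payload = pcm_to_wav_bytes(chunk)
    print(audio_frame_topic("default"), len(payload), "bytes")
```

A recorder would publish each payload on the topic for its site ID, and a player would apply the reverse step to the WAV data it receives before handing the samples to the speaker.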

Usage

Both commands accept the --help option, which lists the options they recognize. For instance:

usage: hermes-audio-player [-h] [-v] [-V] [-c CONFIG]

hermes-audio-player is an audio server implementing the playback part of
    the Hermes protocol.

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         use verbose output
  -V, --version         print version information and exit
  -c CONFIG, --config CONFIG
                        configuration file [default: /etc/hermes-audio-
                        server.json]
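
A command-line interface like the one shown above can be declared with Python's argparse module. The sketch below is an assumption about how such a parser might look, not the project's source: build_parser is a hypothetical name and the version string format is made up.

```python
import argparse

DEFAULT_CONFIG = "/etc/hermes-audio-server.json"


def build_parser(prog="hermes-audio-player"):
    """Build a parser with the options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog=prog,
        description="{} is an audio server implementing the playback part "
                    "of the Hermes protocol.".format(prog),
    )
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="use verbose output")
    # The version string below is a placeholder format, not the real output.
    parser.add_argument("-V", "--version", action="version",
                        version="%(prog)s 0.1.0",
                        help="print version information and exit")
    parser.add_argument("-c", "--config", default=DEFAULT_CONFIG,
                        help="configuration file [default: %(default)s]")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.config, args.verbose)
```

Because -c has a default, running the command without arguments falls back to /etc/hermes-audio-server.json, which matches the behaviour described in the Configuration section.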

TODO list

The following features will be developed soon:

  • Add logging
  • Make it possible to run the commands as daemons (and add systemd unit files)
  • Add an option to let the user choose the audio devices
  • Add more documentation

Changelog

  • 0.1.0 (2019-05-16): Added Voice Activity Detection option.
  • 0.0.2 (2019-05-11): First public version.

Other interesting projects

If you find Hermes Audio Server interesting, also have a look at the following projects:

  • Rhasspy: An offline, multilingual voice assistant toolkit that works with Home Assistant and is completely open source.
  • Snips Led Control: An easy way to control the LEDs of your Snips-compatible device, with LED patterns for when the hotword is detected and when the device is listening, speaking, idle, ...
  • Matrix-Voice-ESP32-MQTT-Audio-Streamer: The equivalent of Hermes Audio Server for a Matrix Voice ESP32 board, including LED control and OTA updates.
  • OpenSnips: A collection of open source projects related to the Snips voice platform.

License

This project is provided by Koen Vervloesem as open source software with the MIT license. See the LICENSE file for more information.
