QFC - Quantized Fourier Compression of Timeseries Data with Application to Electrophysiology

Overview

As extracellular electrophysiology datasets grow, efficient compression of multi-channel time series data becomes essential. Lossless methods are desirable because they preserve the original signal exactly, but their compression ratios usually range only from 2x to 4x. Ratios on the order of 10x to 30x are needed, which leads us to consider lossy methods.

Here, we introduce a simple lossy compression method, inspired by the Discrete Cosine Transform (DCT) and the quantization steps of JPEG compression for images. The method comprises the following steps:

  • Compute the Discrete Fourier Transform (DFT) of the time series data along the time axis.
  • Quantize the Fourier coefficients to achieve a target entropy (the entropy determines the theoretically achievable compression ratio). This is done by multiplying by a normalization factor and then rounding to the nearest integer.
  • Compress the reduced-entropy quantized Fourier coefficients using ZLIB (other methods could also be used).
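
The compression steps above can be sketched with NumPy and the standard-library zlib module. This is a minimal illustration, not the qfc implementation: the function names sketch_compress and entropy_bits_per_value, the real-input FFT, the int32 storage type, and the stacking of real and imaginary parts are all assumptions made for the example.

```python
import zlib
import numpy as np


def sketch_compress(x, normalization_factor, zlib_level=6):
    """Illustrative DFT -> quantize -> ZLIB pipeline (hypothetical, not qfc's code)."""
    # Real-input DFT along the time axis (axis 0): shape (n // 2 + 1, channels).
    coeffs = np.fft.rfft(x.astype(np.float64), axis=0)
    # Keep real and imaginary parts as separate real-valued rows.
    stacked = np.concatenate([coeffs.real, coeffs.imag], axis=0)
    # Quantize: multiply by the normalization factor, round to nearest integer.
    quantized = np.round(stacked * normalization_factor).astype(np.int32)
    return zlib.compress(quantized.tobytes(), zlib_level), quantized


def entropy_bits_per_value(q):
    """Shannon entropy (bits per value) of the empirical distribution of q."""
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


rng = np.random.default_rng(0)
x = (rng.standard_normal((2000, 4)) * 50).astype(np.int16)
for factor in (1.0, 0.01):
    blob, q = sketch_compress(x, factor)
    print(f"factor={factor}: {len(blob)} bytes, "
          f"{entropy_bits_per_value(q):.2f} bits/value")
```

A smaller normalization factor coarsens the quantization, which lowers the entropy and shrinks the ZLIB output at the cost of a larger reconstruction error.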

To decompress:

  • Unzip the quantized Fourier coefficients.
  • Divide by the normalization factor.
  • Compute the Inverse Discrete Fourier Transform (IDFT) to obtain the reconstructed time series data.
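
The decompression steps can be sketched in the same spirit. The helper name sketch_decompress, the int32 storage type, and the layout (real parts stacked above imaginary parts) are assumptions for this example and are not taken from the qfc source. The forward pass is inlined so that the sketch runs on its own.

```python
import zlib
import numpy as np


def sketch_decompress(blob, normalization_factor, original_shape):
    """Illustrative unzip -> rescale -> inverse DFT (hypothetical, not qfc's code)."""
    n_samples, n_channels = original_shape
    n_freqs = n_samples // 2 + 1
    # Unzip and restore the (2 * n_freqs, n_channels) integer array.
    quantized = np.frombuffer(zlib.decompress(blob), dtype=np.int32)
    quantized = quantized.reshape(2 * n_freqs, n_channels)
    # Divide by the normalization factor, then rebuild complex coefficients.
    stacked = quantized / normalization_factor
    coeffs = stacked[:n_freqs] + 1j * stacked[n_freqs:]
    # Inverse real DFT along the time axis.
    return np.fft.irfft(coeffs, n=n_samples, axis=0)


# Forward pass (assumed layout): DFT, scale, round, ZLIB.
rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 4)) * 50
normalization_factor = 0.05
c = np.fft.rfft(x, axis=0)
stacked = np.concatenate([c.real, c.imag], axis=0)
q = np.round(stacked * normalization_factor).astype(np.int32)
blob = zlib.compress(q.tobytes())

x_rec = sketch_decompress(blob, normalization_factor, x.shape)
print("Std. dev. of residual:", np.std(x - x_rec))
```

The only information lost is the rounding error in each coefficient, so the residual shrinks as the normalization factor grows.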

This method is particularly well-suited for data that has been bandpass-filtered, as the suppressed Fourier coefficients yield an especially low entropy of the quantized signal.
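
To illustrate the effect of filtering, the sketch below compares the entropy of the quantized coefficients for white noise and for the same noise after a low-pass filter. The helper quantized_entropy, the Gaussian low-pass window (used here in place of a bandpass filter), and the chosen normalization factor are assumptions for the example.

```python
import numpy as np


def quantized_entropy(x, normalization_factor):
    """Entropy (bits per value) of the rounded, scaled DFT coefficients of x."""
    c = np.fft.rfft(x, axis=0)
    stacked = np.concatenate([c.real, c.imag], axis=0)
    q = np.round(stacked * normalization_factor).astype(np.int64)
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


rng = np.random.default_rng(0)
sampling_frequency = 30000
raw = rng.standard_normal((5000, 4)) * 50

# Gaussian low-pass in the frequency domain (cutoff near 6 kHz).
freqs = np.fft.rfftfreq(raw.shape[0], d=1 / sampling_frequency)
window = np.exp(-np.square(freqs) / (2 * (6000 / 3) ** 2))
filtered = np.fft.irfft(
    np.fft.rfft(raw, axis=0) * window[:, None], n=raw.shape[0], axis=0
)

h_raw = quantized_entropy(raw, 0.01)
h_filtered = quantized_entropy(filtered, 0.01)
print(f"raw: {h_raw:.2f} bits/value, filtered: {h_filtered:.2f} bits/value")
```

Coefficients in the suppressed band round to zero, so a single symbol dominates the quantized data and the entropy drops.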

For a comparison of various lossy and lossless compression schemes, see "Compression strategies for large-scale electrophysiology data" (Buccino et al.).

For application to real data, see this notebook.

Installation

pip install qfc

Usage

# See the examples directory

from matplotlib import pyplot as plt
import numpy as np
from qfc import qfc_compress, qfc_decompress, qfc_estimate_normalization_factor


def main():
    sampling_frequency = 30000
    y = np.random.randn(5000, 10) * 50
    y = lowpass_filter(y, sampling_frequency, 6000)
    y = y.astype(np.int16)
    target_compression_ratio = 15

    ############################################################
    normalization_factor = qfc_estimate_normalization_factor(
        y,
        target_compression_ratio=target_compression_ratio
    )
    compressed_bytes = qfc_compress(
        y,
        normalization_factor=normalization_factor
    )
    y_decompressed = qfc_decompress(
        compressed_bytes,
        normalization_factor=normalization_factor,
        original_shape=y.shape
    )
    ############################################################

    y_resid = y - y_decompressed
    original_size = y.nbytes
    compressed_size = len(compressed_bytes)
    compression_ratio = original_size / compressed_size
    print(f"Original size: {original_size} bytes")
    print(f"Compressed size: {compressed_size} bytes")
    print(f'Target compression ratio: {target_compression_ratio}')
    print(f"Actual compression ratio: {compression_ratio}")
    print(f'Std. dev. of residual: {np.std(y_resid):.2f}')

    xgrid = np.arange(y.shape[0]) / sampling_frequency
    ch = 3  # select a channel to plot
    plt.figure()
    plt.plot(xgrid, y[:, ch], label="Original")
    plt.plot(xgrid, y_decompressed[:, ch], label="Decompressed")
    plt.plot(xgrid, y_resid[:, ch], label="Residual")
    plt.xlabel("Time (s)")
    plt.title(f'QFC compression ratio: {compression_ratio:.2f}')
    plt.legend()
    plt.show()


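def lowpass_filter(input_array, sampling_frequency, cutoff_frequency):
    """Smooth low-pass filter along axis 0 using a Gaussian window in frequency."""
    F = np.fft.fft(input_array, axis=0)
    N = input_array.shape[0]
    freqs = np.fft.fftfreq(N, d=1/sampling_frequency)
    # Gaussian roll-off: the cutoff frequency sits at 3 standard deviations
    sigma = cutoff_frequency / 3
    window = np.exp(-np.square(freqs) / (2 * sigma**2))
    F_filtered = F * window[:, None]  # apply the same window to every channel
    filtered_array = np.fft.ifft(F_filtered, axis=0)
    return np.real(filtered_array)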


if __name__ == "__main__":
    main()
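
In the example above, qfc_estimate_normalization_factor chooses the normalization factor for a target compression ratio. How the package does this is not described here. One plausible approach, shown purely as an assumption, is to bisect on the factor until the compressed size meets the target. The helpers compression_ratio and search_normalization_factor below are hypothetical and are not part of the qfc API.

```python
import zlib
import numpy as np


def compression_ratio(x, normalization_factor):
    """Raw bytes divided by ZLIB bytes for an illustrative DFT/quantize pipeline."""
    c = np.fft.rfft(x.astype(np.float64), axis=0)
    stacked = np.concatenate([c.real, c.imag], axis=0)
    q = np.round(stacked * normalization_factor).astype(np.int32)
    return x.nbytes / len(zlib.compress(q.tobytes()))


def search_normalization_factor(x, target_ratio, lo=1e-6, hi=1.0, iterations=30):
    """Bisect in log space for a large factor that still meets target_ratio.

    Assumes the ratio at `lo` exceeds the target and that the ratio broadly
    decreases as the factor grows (finer quantization compresses less).
    """
    for _ in range(iterations):
        mid = float(np.sqrt(lo * hi))  # geometric midpoint
        if compression_ratio(x, mid) > target_ratio:
            lo = mid  # still compresses enough: try a finer quantization
        else:
            hi = mid  # too fine: back off
    return lo


rng = np.random.default_rng(0)
y = (rng.standard_normal((5000, 4)) * 50).astype(np.int16)
factor = search_normalization_factor(y, target_ratio=10)
print(f"factor={factor:.3g}, ratio={compression_ratio(y, factor):.2f}")
```

Searching in log space is a design choice for this sketch: useful factors can span several orders of magnitude, depending on the signal amplitude and the number of samples.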

License

This code is provided under the Apache License, Version 2.0.

Author

Jeremy Magland, Center for Computational Mathematics, Flatiron Institute
