quak /kwæk/

an anywidget for data that talks like a duck 🦆

quak is a scalable data profiler for quickly scanning large tables, capturing interactions as executable SQL queries.

  • interactive 🖱️ mouse over column summaries, cross-filter, sort, and slice rows.
  • fast ⚡ built with Mosaic; views are expressed as SQL queries lazily executed by DuckDB.
  • flexible 🔄 supports many data types and formats via Apache Arrow and the dataframe interchange protocol.
  • reproducible 📓 a UI for building complex SQL queries; materialize views in the kernel for further analysis.
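To illustrate the cross-filtering mentioned above, here is a minimal sketch in which each column's summary is computed with every filter applied except the one set on that column itself. This is one common way to define cross-filtering and is not taken from quak's source; `summary_query`, the sample table, and the use of the standard library's sqlite3 in place of DuckDB are all assumptions made so the snippet runs without extra dependencies.

```python
import sqlite3
from typing import Dict


def summary_query(table: str, column: str, filters: Dict[str, str]) -> str:
    """Build a value-count query for `column`, skipping that column's own filter."""
    predicates = [f"({pred})" for col, pred in filters.items() if col != column]
    where = f" WHERE {' AND '.join(predicates)}" if predicates else ""
    return (
        f'SELECT "{column}", COUNT(*) FROM "{table}"{where} '
        f'GROUP BY "{column}" ORDER BY "{column}"'
    )


# sqlite3 stands in for DuckDB here (an assumption, chosen for portability).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE athletes (sport TEXT, sex TEXT)")
con.executemany(
    "INSERT INTO athletes VALUES (?, ?)",
    [("rowing", "female"), ("rowing", "male"), ("judo", "female"), ("judo", "female")],
)

# Hypothetical UI state: the user selected "female" in the sex column summary.
filters = {"sex": "sex = 'female'"}

# The sport summary is filtered by the sex selection ...
sport_counts = con.execute(summary_query("athletes", "sport", filters)).fetchall()
# ... while the sex summary ignores its own selection and still shows every category.
sex_counts = con.execute(summary_query("athletes", "sex", filters)).fetchall()
```

Because every summary is just a query string, nothing is computed until a view actually needs it, which is the sense in which the views are lazily executed.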

install

Warning: quak is a prototype exploring a high-performance data profiler based on anywidget. It is not production-ready. Expect bugs. Open-sourced for SciPy 2024.

pip install quak

usage

The easiest way to get started with quak is to load its IPython extension with the %load_ext magic.

%load_ext quak
import polars as pl

df = pl.read_parquet("https://github.com/uwdata/mosaic/raw/main/data/athletes.parquet")
df

quak hooks into Jupyter's display mechanism to automatically render any dataframe-like object (implementing the Python dataframe interchange protocol) using quak.Widget instead of the default display.
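As a rough sketch of the kind of check such a display hook could perform, the snippet below relies only on what the dataframe interchange protocol specifies: conforming objects expose a `__dataframe__` method. The helper `is_dataframe_like` and the two toy classes are hypothetical and are not part of quak's API.

```python
def is_dataframe_like(obj: object) -> bool:
    """Return True if obj advertises the dataframe interchange protocol."""
    return callable(getattr(obj, "__dataframe__", None))


class ToyFrame:
    """Hypothetical stand-in for a polars or pandas dataframe."""

    def __dataframe__(self, nan_as_null: bool = False, allow_copy: bool = True):
        # A real library would return an interchange object here.
        raise NotImplementedError


class NotAFrame:
    """Hypothetical object that does not implement the protocol."""


# Only the object exposing __dataframe__ is treated as dataframe-like.
checks = [
    is_dataframe_like(ToyFrame()),
    is_dataframe_like(NotAFrame()),
    is_dataframe_like([1, 2, 3]),
]
```

Checking for the method rather than for a concrete class is what lets a single hook cover several dataframe libraries without importing any of them.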

Alternatively, you can use quak.Widget directly:

import polars as pl
import quak

df = pl.read_parquet("https://github.com/uwdata/mosaic/raw/main/data/athletes.parquet")
widget = quak.Widget(df)
widget

interacting with the data

quak is a UI for quickly scanning and exploring large tables, but it is more than a viewer. A side effect of its Mosaic-based architecture is that every user interaction is captured as a SQL query.

At any point, table state can be accessed as a query,

widget.sql # SELECT * FROM df WHERE ...

which for convenience can be executed in the kernel to materialize the view for further analysis:

widget.data() # returns duckdb.DuckDBPyRelation object

By representing UI state as SQL, quak makes it easy to generate complex queries via interactions that would be challenging to write manually, while keeping them reproducible.
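To make the idea concrete, here is a minimal sketch of rendering a table view's state as a SQL string. It is not quak's or Mosaic's implementation: `ViewState`, `to_sql`, and the sample table are hypothetical names, and the standard library's sqlite3 stands in for DuckDB so the snippet runs without extra dependencies.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ViewState:
    """Hypothetical container for what a user has done in the table UI."""

    table: str
    filters: List[str] = field(default_factory=list)  # SQL predicates from selections
    order_by: Optional[str] = None  # column chosen by clicking a header
    descending: bool = False

    def to_sql(self) -> str:
        """Render the current state as a single SELECT statement."""
        query = f'SELECT * FROM "{self.table}"'
        if self.filters:
            query += " WHERE " + " AND ".join(f"({p})" for p in self.filters)
        if self.order_by is not None:
            direction = "DESC" if self.descending else "ASC"
            query += f' ORDER BY "{self.order_by}" {direction}'
        return query


# sqlite3 stands in for DuckDB here (an assumption, chosen for portability).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE athletes (name TEXT, sport TEXT, height REAL)")
con.executemany(
    "INSERT INTO athletes VALUES (?, ?, ?)",
    [("A", "rowing", 1.90), ("B", "rowing", 1.75), ("C", "judo", 1.82)],
)

state = ViewState("athletes")
state.filters.append("sport = 'rowing'")  # user selects a category
state.filters.append("height >= 1.8")     # user brushes a numeric range
state.order_by, state.descending = "height", True  # user sorts by height

sql = state.to_sql()
rows = con.execute(sql).fetchall()
```

Each interaction only appends to or replaces a piece of state, and the full query is regenerated from that state, so the same string can be saved and rerun later to reproduce the view.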

contributing

Contributors welcome! Check the Contributors Guide to get started. Note: I'm wrapping up my PhD, so I might be slow to respond. Please open an issue before contributing a new feature.

references

quak pieces together many important ideas from the web and Python data science ecosystems. It serves as an example of what you can achieve by embracing these platforms for their strengths.

  • Observable's data table: Inspiration for the UI design and user interactions.
  • Mosaic: The foundation for linking databases and interactive table views.
  • Apache Arrow: Support for various data types and efficient data interchange between JS/Python.
  • DuckDB: An amazingly engineered piece of software that makes SQL go vroom.
