Extremely lightweight compatibility layer between pandas, Polars, cuDF, and Modin

Narwhals

Extremely lightweight compatibility layer between Polars, pandas, Modin, and cuDF (and possibly more?).

Seamlessly support all, without depending on any!

  • ✅ Just use a subset of the Polars API, no need to learn anything new
  • ✅ No dependencies (not even Polars), keep your library lightweight
  • ✅ Separate lazy and eager APIs
  • ✅ Use Polars Expressions

Note: this is a work in progress and a bit of an experiment, so don't take it too seriously.

Installation

pip install narwhals

Or just vendor it, it's only a bunch of pure-Python files.

Usage

There are three steps to writing dataframe-agnostic code using Narwhals:

  1. use narwhals.LazyFrame or narwhals.DataFrame to wrap a pandas or Polars DataFrame/LazyFrame in a Narwhals class

  2. use the subset of the Polars API supported by Narwhals. Just like in Polars, some methods (e.g. to_numpy) are only available for DataFrame, not LazyFrame

  3. use narwhals.to_native to return an object to the user in its original dataframe flavour. For example:

    • if you started with pandas, you'll get pandas back
    • if you started with Polars, you'll get Polars back
    • if you started with Modin, you'll get Modin back (and compute will be distributed)
    • if you started with cuDF, you'll get cuDF back (and compute will happen on GPU)
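
The three steps above can be pictured with a toy, dependency-free sketch. This is not how Narwhals is implemented: `ToyFrame`, this `to_native`, `keep_city`, and the two stand-in "native" classes are hypothetical names invented here purely to show the wrap, operate, unwrap round-trip.

```python
# Toy illustration only: none of these classes exist in Narwhals.
from dataclasses import dataclass


@dataclass
class NativeA:
    """Stand-in for one dataframe library (hypothetical)."""

    columns: dict


@dataclass
class NativeB:
    """Stand-in for a different dataframe library (hypothetical)."""

    columns: dict


class ToyFrame:
    """Step 1: wrap a native object and remember its flavour."""

    def __init__(self, native):
        self._native = native

    def select(self, *names):
        """Step 2: one shared API, regardless of the backend."""
        kept = {name: self._native.columns[name] for name in names}
        # Rebuild with the same native class so the flavour is preserved.
        return ToyFrame(type(self._native)(kept))


def to_native(frame):
    """Step 3: hand the original flavour back to the caller."""
    return frame._native


def keep_city(native_frame):
    # Wrap, operate through the shared API, then unwrap.
    return to_native(ToyFrame(native_frame).select("city"))


data = {"s": ["S1", "S2"], "city": ["London", "Paris"]}
result_a = keep_city(NativeA(data))
result_b = keep_city(NativeB(data))
print(type(result_a).__name__, type(result_b).__name__)
```

Whatever flavour goes into `keep_city` is the flavour that comes out, which is the property the list above describes for pandas, Polars, Modin, and cuDF.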

Example

Here's an example of a dataframe-agnostic function:

import pandas as pd
import polars as pl

import narwhals as nw


def my_agnostic_function(
    suppliers_native,
    parts_native,
):
    # Step 1: wrap the native objects in Narwhals classes.
    suppliers = nw.LazyFrame(suppliers_native)
    parts = nw.LazyFrame(parts_native)

    # Step 2: express the query with the supported subset of the Polars API.
    result = (
        suppliers.join(parts, left_on="city", right_on="city")
        .filter(nw.col("weight") > 10)
        .group_by("s")
        .agg(
            weight_mean=nw.col("weight").mean(),
            weight_max=nw.col("weight").max(),
        )
    )
    # Step 3: return the result in the caller's original dataframe flavour.
    return nw.to_native(result)

You can pass in a pandas or Polars dataframe, and the result will be the same (only the row order may differ)! Let's try it out:

suppliers = {
    "s": ["S1", "S2", "S3", "S4", "S5"],
    "sname": ["Smith", "Jones", "Blake", "Clark", "Adams"],
    "status": [20, 10, 30, 20, 30],
    "city": ["London", "Paris", "Paris", "London", "Athens"],
}
parts = {
    "p": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "pname": ["Nut", "Bolt", "Screw", "Screw", "Cam", "Cog"],
    "color": ["Red", "Green", "Blue", "Red", "Blue", "Red"],
    "weight": [12.0, 17.0, 17.0, 14.0, 12.0, 19.0],
    "city": ["London", "Paris", "Oslo", "London", "Paris", "London"],
}

print("pandas output:")
print(
    my_agnostic_function(
        pd.DataFrame(suppliers),
        pd.DataFrame(parts),
    )
)
print("\nPolars output:")
print(
    my_agnostic_function(
        pl.LazyFrame(suppliers),
        pl.LazyFrame(parts),
    ).collect()
)

pandas output:
    s  weight_mean  weight_max
0  S1         15.0        19.0
1  S2         14.5        17.0
2  S3         14.5        17.0
3  S4         15.0        19.0

Polars output:
shape: (4, 3)
┌─────┬─────────────┬────────────┐
│ s   ┆ weight_mean ┆ weight_max │
│ --- ┆ ---         ┆ ---        │
│ str ┆ f64         ┆ f64        │
╞═════╪═════════════╪════════════╡
│ S2  ┆ 14.5        ┆ 17.0       │
│ S3  ┆ 14.5        ┆ 17.0       │
│ S4  ┆ 15.0        ┆ 19.0       │
│ S1  ┆ 15.0        ┆ 19.0       │
└─────┴─────────────┴────────────┘

Magic! 🪄
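
As a sanity check on the numbers above, the same join, filter, group-by, and aggregation can be reproduced in plain Python. This sketch does not use Narwhals at all: `expected_result` is a hypothetical helper name, and the data is copied from the example, keeping only the columns the query touches. Results are keyed by supplier, since the pandas and Polars outputs above list the same rows in different orders.

```python
from collections import defaultdict
from statistics import mean

suppliers = {
    "s": ["S1", "S2", "S3", "S4", "S5"],
    "city": ["London", "Paris", "Paris", "London", "Athens"],
}
parts = {
    "weight": [12.0, 17.0, 17.0, 14.0, 12.0, 19.0],
    "city": ["London", "Paris", "Oslo", "London", "Paris", "London"],
}


def expected_result(suppliers, parts):
    """Inner join on city, keep weight > 10, aggregate weights per supplier."""
    weights_by_supplier = defaultdict(list)
    for s, s_city in zip(suppliers["s"], suppliers["city"]):
        for weight, p_city in zip(parts["weight"], parts["city"]):
            if s_city == p_city and weight > 10:
                weights_by_supplier[s].append(weight)
    # Suppliers with no matching part (here S5 in Athens) never get a key.
    return {
        s: {"weight_mean": mean(ws), "weight_max": max(ws)}
        for s, ws in weights_by_supplier.items()
    }


result = expected_result(suppliers, parts)
for s in sorted(result):
    print(s, result[s])
```

S5 is absent because no part is located in Athens, which matches the four-row outputs shown above.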

Scope

  • Do you maintain a dataframe-consuming library?
  • Is there a Polars function which you'd like Narwhals to have, which would make your job easier?

If so, I'd love to hear from you!

Note: You might suspect that this is a secret ploy to infiltrate the Polars API everywhere. Indeed, you may suspect that.

Why "Narwhals"?

Because they are so awesome.

Thanks to Olha Urdeichuk for the illustration!
