Skip to main content

Extremely lightweight compatibility layer between pandas, Polars, cuDF, and Modin

Project description

Narwhals

narwhals_small

Extremely lightweight compatibility layer between Polars, pandas, and more.

Seamlessly support both, without depending on either!

  • Just use a subset of the Polars API, no need to learn anything new
  • No dependencies (not even Polars), keep your library lightweight
  • ✅ Separate Lazy and Eager APIs
  • ✅ Use Polars Expressions

Note: this is work-in-progress, and a bit of an experiment, don't take it too seriously.

Installation

pip install narwhals

Or just vendor it, it's only a bunch of pure-Python files.

Usage

There are three steps to writing dataframe-agnostic code using Narwhals:

  1. use narwhals.DataFrame to wrap a pandas or Polars DataFrame to a Narwhals DataFrame

  2. use the subset of the Polars API supported by Narwhals Some methods are only available if you called narwhals.translate_frame with is_eager=True

  3. use narwhals.to_native to return an object to the user in its original dataframe flavour. For example:

    • if you started with pandas, you'll get pandas back
    • if you started with Polars, you'll get Polars back

Example

Here's an example of a dataframe agnostic function:

from typing import Any
import pandas as pd
import polars as pl

import narwhals as nw


def my_agnostic_function(
    suppliers_native,
    parts_native,
):
    suppliers = nw.DataFrame(suppliers_native)
    parts = nw.DataFrame(parts_native)

    result = (
        suppliers.join(parts, left_on="city", right_on="city")
        .filter(
            nw.col("color").is_in(["Red", "Green"]),
            nw.col("weight") > 14,
        )
        .group_by("s", "p")
        .agg(
            weight_mean=nw.col("weight").mean(),
            weight_max=nw.col("weight").max(),
        )
    ).with_columns(nw.col("weight_max").cast(nw.Int64))
    return nw.to_native(result)

You can pass in a pandas or Polars dataframe, the output will be the same! Let's try it out:

suppliers = {
    "s": ["S1", "S2", "S3", "S4", "S5"],
    "sname": ["Smith", "Jones", "Blake", "Clark", "Adams"],
    "status": [20, 10, 30, 20, 30],
    "city": ["London", "Paris", "Paris", "London", "Athens"],
}
parts = {
    "p": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "pname": ["Nut", "Bolt", "Screw", "Screw", "Cam", "Cog"],
    "color": ["Red", "Green", "Blue", "Red", "Blue", "Red"],
    "weight": [12.0, 17.0, 17.0, 14.0, 12.0, 19.0],
    "city": ["London", "Paris", "Oslo", "London", "Paris", "London"],
}

print("pandas output:")
print(
    my_agnostic_function(
        pd.DataFrame(suppliers),
        pd.DataFrame(parts),
    )
)
print("\nPolars output:")
print(
    my_agnostic_function(
        pl.DataFrame(suppliers),
        pl.DataFrame(parts),
    )
)
print("\nPolars lazy output:")
print(
    my_agnostic_function(
        pl.LazyFrame(suppliers),
        pl.LazyFrame(parts),
    ).collect()
)
pandas output:
    s   p  weight_mean
0  S1  P6         19.0
1  S2  P2         17.0
2  S3  P2         17.0
3  S4  P6         19.0

Polars output:
shape: (4, 3)
┌─────┬─────┬─────────────┐
│ s   ┆ p   ┆ weight_mean │
│ --- ┆ --- ┆ ---         │
│ str ┆ str ┆ f64         │
╞═════╪═════╪═════════════╡
│ S1  ┆ P6  ┆ 19.0        │
│ S3  ┆ P2  ┆ 17.0        │
│ S4  ┆ P6  ┆ 19.0        │
│ S2  ┆ P2  ┆ 17.0        │
└─────┴─────┴─────────────┘

Polars lazy output:
shape: (4, 3)
┌─────┬─────┬─────────────┐
│ s   ┆ p   ┆ weight_mean │
│ --- ┆ --- ┆ ---         │
│ str ┆ str ┆ f64         │
╞═════╪═════╪═════════════╡
│ S1  ┆ P6  ┆ 19.0        │
│ S3  ┆ P2  ┆ 17.0        │
│ S4  ┆ P6  ┆ 19.0        │
│ S2  ┆ P2  ┆ 17.0        │
└─────┴─────┴─────────────┘

Magic! 🪄

Scope

  • Do you maintain a dataframe-consuming library?
  • Is there a Polars function which you'd like Narwhals to have, which would make your job easier?

If, I'd love to hear from you!

Note: You might suspect that this is a secret ploy to infiltrate the Polars API everywhere. Indeed, you may suspect that.

Why "Narwhals"?

Because they are so awesome.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

narwhals-0.4.0.tar.gz (131.5 kB view details)

Uploaded Source

Built Distribution

narwhals-0.4.0-py3-none-any.whl (24.9 kB view details)

Uploaded Python 3

File details

Details for the file narwhals-0.4.0.tar.gz.

File metadata

  • Download URL: narwhals-0.4.0.tar.gz
  • Upload date:
  • Size: 131.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for narwhals-0.4.0.tar.gz
Algorithm Hash digest
SHA256 48e08d46b61c67907c20f3c8bd1df8bfc664ddf7d682713392047ae84a67b92e
MD5 84102e0f61adf887720a5535f1002ec8
BLAKE2b-256 f7503dd8e977a895c8a3136b9ddf85174ef3628631d73d6e956bf799e6e6a3e9

See more details on using hashes here.

File details

Details for the file narwhals-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: narwhals-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 24.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for narwhals-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d870f083dc942e907ed4a3f421fc08630ee5d873b75b5a4efaf31d18ccd51c64
MD5 23a662c91757448a528d32eaed31299a
BLAKE2b-256 93892157747e0c4d3fa1caab06b47c70f084c5b5c92ca11f1c2978b3df744d9d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page