Automatically upgrade Polars code to the latest version.

polars-upgrade

Automatically upgrade your Polars code so it's compatible with future versions.

Installation

Easy:

pip install -U polars-upgrade

Usage (command-line)

Run

polars-upgrade my_project --target-version=0.20.4

from the command line. Replace 0.20.4 with your Polars version and my_project with the name of your directory.

NOTE: this tool will modify your code! Stage (or commit) your files before running it, so that you can review the changes and revert them if needed.
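
Because files are rewritten in place, it can help to look at a diff of a change before accepting it. The following is a minimal sketch using only the standard library's difflib. `preview_diff` is a hypothetical helper, not part of polars-upgrade, and the before/after strings are the ones from the library example in this README.

```python
import difflib


def preview_diff(before: str, after: str, filename: str = "example.py") -> str:
    """Return a unified diff between the original and the rewritten source."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    return "".join(diff)


before = "import polars as pl\ndf.select(pl.count())\n"
after = "import polars as pl\ndf.select(pl.len())\n"
print(preview_diff(before, after))
```

If the files were staged with git beforehand, running `git diff` after the tool shows the same kind of output for the whole project.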

Usage (pre-commit hook)

-   repo: https://github.com/MarcoGorelli/polars-upgrade
    rev: 0.3.1  # polars-upgrade version goes here
    hooks:
    -   id: polars-upgrade
        args: [--target-version=0.20.0]  # Polars version goes here

Usage (Jupyter Notebooks)

Install nbqa and then run

nbqa polars_upgrade my_project --target-version=0.20.4
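
nbqa takes care of the notebook handling for you. Purely to illustrate the idea, here is a rough sketch of applying a source-to-source rewrite to the code cells of a notebook. It is not how nbqa is implemented and it ignores details such as cell magics. `upgrade_notebook` and `fake_rewrite` are hypothetical names; `fake_rewrite` stands in for the real rewrite so that the sketch runs without polars-upgrade installed.

```python
import json
from typing import Callable


def upgrade_notebook(nb_json: str, rewrite_fn: Callable[[str], str]) -> str:
    """Apply rewrite_fn to every code cell of a notebook given as JSON text."""
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # leave markdown and raw cells untouched
        source = cell["source"]
        # The notebook format allows a single string or a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        new_text = rewrite_fn(text)
        if isinstance(source, str):
            cell["source"] = new_text
        else:
            cell["source"] = new_text.splitlines(keepends=True)
    return json.dumps(nb, indent=1)


def fake_rewrite(src: str) -> str:
    # Stand-in for the real rewrite: handles one rename by plain text replacement.
    return src.replace("pl.count()", "pl.len()")


notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["pl.count() in prose\n"]},
        {"cell_type": "code", "source": ["import polars as pl\n", "df.select(pl.count())"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})
upgraded = json.loads(upgrade_notebook(notebook, fake_rewrite))
print(upgraded["cells"][1]["source"])
```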

Usage (library)

In a Python script:

from polars_upgrade import rewrite, Settings

src = """\
import polars as pl
df.select(pl.count())
"""
settings = Settings(target_version=(0, 20, 4))
output = rewrite(src, settings=settings)
print(output)

Output:

import polars as pl
df.select(pl.len())
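
To apply the library call to many files, one option is to walk a directory and write back only the files whose contents changed. A sketch, assuming the rewrite is passed in as a plain function from source text to source text. `upgrade_tree` is a hypothetical helper and not part of polars-upgrade, and `fake_rewrite` is a stand-in so that the sketch runs without the package installed.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def upgrade_tree(root: Path, rewrite_fn: Callable[[str], str]) -> List[Path]:
    """Rewrite every .py file under root in place; return the files that changed."""
    changed = []
    for path in sorted(root.rglob("*.py")):
        src = path.read_text(encoding="utf-8")
        new_src = rewrite_fn(src)
        if new_src != src:
            path.write_text(new_src, encoding="utf-8")
            changed.append(path)
    return changed


def fake_rewrite(src: str) -> str:
    # Stand-in for the real rewrite: handles one rename by plain text replacement.
    return src.replace("pl.count()", "pl.len()")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("import polars as pl\ndf.select(pl.count())\n", encoding="utf-8")
    (root / "b.py").write_text("print('nothing to do')\n", encoding="utf-8")
    changed_names = [p.name for p in upgrade_tree(root, fake_rewrite)]
    a_after = (root / "a.py").read_text(encoding="utf-8")

print(changed_names)
```

With the package installed, `rewrite_fn` could be `lambda s: rewrite(s, settings=settings)`, using the names from the example above.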

If your snippet does not include import polars or import polars as pl, then you also need to pass pl and/or polars as aliases; otherwise polars-upgrade will not perform the rewrite. Example:

from polars_upgrade import rewrite, Settings

src = """\
df.select(pl.count())
"""
settings = Settings(target_version=(0, 20, 4))
output = rewrite(src, settings=settings, aliases={'pl'})
print(output)

Output:

df.select(pl.len())
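
Presumably, aliases are needed because a rewrite is only safe when a name such as pl is known to refer to Polars. A rough sketch of how such names could be collected from import statements with the standard ast module follows. `find_polars_aliases` is a hypothetical function, and this is not necessarily how polars-upgrade works internally.

```python
import ast
from typing import Set


def find_polars_aliases(src: str) -> Set[str]:
    """Collect the names that `import polars` statements bind in src."""
    aliases = set()
    for node in ast.walk(ast.parse(src)):
        if isinstance(node, ast.Import):
            for name in node.names:
                if name.name == "polars":
                    # `import polars as pl` binds "pl"; `import polars` binds "polars".
                    aliases.add(name.asname or name.name)
    return aliases


print(find_polars_aliases("import polars as pl\ndf.select(pl.count())\n"))  # {'pl'}
print(find_polars_aliases("df.select(pl.count())\n"))  # set()
```

The second call finds nothing, which corresponds to the case where the caller has to supply the alias explicitly.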

Supported rewrites

Version 0.18.12+

- pl.avg
+ pl.mean

Version 0.19.0+

- df.groupby_dynamic
+ df.group_by_dynamic
- df.groupby_rolling
+ df.rolling
- df.rolling('ts', period='3d').apply
+ df.rolling('ts', period='3d').map_groups
- pl.col('a').rolling_apply
+ pl.col('a').rolling_map
- pl.col('a').apply
+ pl.col('a').map_elements
- pl.col('a').map
+ pl.col('a').map_batches
- pl.map
+ pl.map_batches
- pl.apply
+ pl.map_groups
- pl.col('a').any(drop_nulls=True)
+ pl.col('a').any(ignore_nulls=True)
- pl.col('a').all(drop_nulls=True)
+ pl.col('a').all(ignore_nulls=True)
- pl.col('a').value_counts(multithreaded=True)
+ pl.col('a').value_counts(parallel=True)

Version 0.19.2+

- pl.col('a').is_not
+ pl.col('a').not_

Version 0.19.3+

- pl.enable_string_cache(True)
+ pl.enable_string_cache()
- pl.enable_string_cache(False)
+ pl.disable_string_cache()
- pl.col('a').list.count_match
+ pl.col('a').list.count_matches
- pl.col('a').is_last
+ pl.col('a').is_last_distinct
- pl.col('a').is_first
+ pl.col('a').is_first_distinct
- pl.col('a').str.strip
+ pl.col('a').str.strip_chars
- pl.col('a').str.lstrip
+ pl.col('a').str.strip_chars_start
- pl.col('a').str.rstrip
+ pl.col('a').str.strip_chars_end
- pl.col('a').str.count_match
+ pl.col('a').str.count_matches
- pl.col("dt").dt.offset_by("1mo_saturating")
+ pl.col("dt").dt.offset_by("1mo")

Version 0.19.4+

- df.group_by_dynamic('ts', every='3d', truncate=True)
+ df.group_by_dynamic('ts', every='3d', label='left')
- df.group_by_dynamic('ts', every='3d', truncate=False)
+ df.group_by_dynamic('ts', every='3d', label='datapoint')

Version 0.19.8+

- pl.col('a').list.lengths
+ pl.col('a').list.len
- pl.col('a').str.lengths
+ pl.col('a').str.len_bytes
- pl.col('a').str.n_chars
+ pl.col('a').str.len_chars

Version 0.19.11+

- pl.col('a').shift(periods=4)
+ pl.col('a').shift(n=4)
- pl.col('a').shift_and_fill(periods=4)
+ pl.col('a').shift_and_fill(n=4)
- pl.col('a').list.shift(periods=4)
+ pl.col('a').list.shift(n=4)
- pl.col('a').map_dict(remapping={1: 2})
+ pl.col('a').map_dict(mapping={1: 2})

Version 0.19.12+

- pl.col('a').keep_name
+ pl.col('a').name.keep
- pl.col('a').suffix
+ pl.col('a').name.suffix
- pl.col('a').prefix
+ pl.col('a').name.prefix
- pl.col('a').map_alias
+ pl.col('a').name.map
- pl.col('a').str.ljust
+ pl.col('a').str.pad_end
- pl.col('a').str.rjust
+ pl.col('a').str.pad_start
- pl.col('a').zfill(alignment=3)
+ pl.col('a').zfill(length=3)
- pl.col('a').ljust(width=3)
+ pl.col('a').ljust(length=3)
- pl.col('a').rjust(width=3)
+ pl.col('a').rjust(length=3)

Version 0.19.13+

- pl.col('a').dt.milliseconds
+ pl.col('a').dt.total_milliseconds
- pl.col('a').dt.microseconds
+ pl.col('a').dt.total_microseconds
- pl.col('a').dt.nanoseconds
+ pl.col('a').dt.total_nanoseconds

(and so on for other units)

Version 0.19.14+

- pl.col('a').list.take
+ pl.col('a').list.gather
- pl.col('a').cumcount
+ pl.col('a').cum_count
- pl.col('a').cummax
+ pl.col('a').cum_max
- pl.col('a').cummin
+ pl.col('a').cum_min
- pl.col('a').cumprod
+ pl.col('a').cum_prod
- pl.col('a').cumsum
+ pl.col('a').cum_sum
- pl.col('a').take
+ pl.col('a').gather
- pl.col('a').take_every
+ pl.col('a').gather_every
- pl.cumsum
+ pl.cum_sum
- pl.cumfold
+ pl.cum_fold
- pl.cumreduce
+ pl.cum_reduce
- pl.cumsum_horizontal
+ pl.cum_sum_horizontal
- pl.col('a').list.take(index=[1, 2])
+ pl.col('a').list.take(indices=[1, 2])
- pl.col('a').str.parse_int(radix=1)
+ pl.col('a').str.parse_int(base=1)

Version 0.19.15+

- pl.col('a').str.json_extract
+ pl.col('a').str.json_decode

Version 0.19.16+

- pl.col('a').map_dict({'a': 'b'})
+ pl.col('a').replace({'a': 'b'}, default=None)
- pl.col('a').map_dict({'a': 'b'}, default='c')
+ pl.col('a').replace({'a': 'b'}, default='c')

Version 0.20.0+

- df.write_database(table_name='foo', if_exists="append")
+ df.write_database(table_name='foo', if_table_exists="append")

Version 0.20.4+

- pl.col('a').where
+ pl.col('a').filter
- pl.count()
+ pl.len()
- df.with_row_count('row_number')
+ df.with_row_index('row_number')
- pl.scan_ndjson(source, row_count_name='foo', row_count_offset=3)
+ pl.scan_ndjson(source, row_index_name='foo', row_index_offset=3)
[...and similarly for `read_csv`, `read_csv_batched`, `scan_csv`, `read_ipc`, `read_ipc_stream`, `scan_ipc`, `read_parquet`, `scan_parquet`]

Version 0.20.5+

- df.pivot(index=index, values=values, columns=columns, aggregate_function='count')
+ df.pivot(index=index, values=values, columns=columns, aggregate_function='len')

Version 0.20.6+

- pl.read_excel(source, xlsx2csv_options=options, read_csv_options=read_options)
+ pl.read_excel(source, engine_options=options, read_options=read_options)

Version 0.20.7+

- pl.threadpool_size
+ pl.thread_pool_size

Version 0.20.8+

- df.pivot(a, b, c)
+ df.pivot(values=a, index=b, columns=c)

Version 0.20.11+

- pl.col('a').meta.write_json
+ pl.col('a').meta.serialize
- lf.approx_n_unique()
+ lf.select(pl.all().approx_n_unique())
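
The version headings above can be read as minimum target versions: a rewrite is a candidate only when the target version is at least the version it is listed under. This reading is an assumption, suggested by the + suffix on several headings. A sketch with a handful of the renames, comparing version tuples in the same form as Settings(target_version=(0, 20, 4)); `RENAMES`, `parse_version` and `applicable_renames` are hypothetical names.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]

# A few renames from the lists above, with the version they are listed under.
RENAMES: List[Tuple[Version, str, str]] = [
    ((0, 18, 12), "pl.avg", "pl.mean"),
    ((0, 19, 0), "df.groupby_dynamic", "df.group_by_dynamic"),
    ((0, 19, 14), "pl.cumsum", "pl.cum_sum"),
    ((0, 20, 4), "pl.count", "pl.len"),
    ((0, 20, 7), "pl.threadpool_size", "pl.thread_pool_size"),
]


def parse_version(text: str) -> Version:
    """Turn '0.20.4' into (0, 20, 4) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def applicable_renames(target: Version) -> Dict[str, str]:
    """Return the renames whose minimum version is <= the target version."""
    return {old: new for minimum, old, new in RENAMES if minimum <= target}


print(applicable_renames(parse_version("0.19.0")))
# {'pl.avg': 'pl.mean', 'df.groupby_dynamic': 'df.group_by_dynamic'}
```

Tuples are used rather than strings because string comparison would place "0.19.2" after "0.19.11".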

Notes

This work is derivative of pyupgrade - many parts have been lifted verbatim. As required, I've included pyupgrade's license.
