View PyPI download statistics with ease.

pypinfo is a simple CLI to access PyPI download statistics via Google’s BigQuery.

Installation

pypinfo is distributed on PyPI as a universal wheel. It is available on Linux, macOS, and Windows, and supports Python 3.5+ and PyPy.

This is relatively painless, I swear.

  1. Go to https://bigquery.cloud.google.com.

  2. Sign up if you haven’t already. The first TB of queried data each month is free. Each additional TB is $5.

  3. Go to https://console.developers.google.com/cloud-resource-manager and create a new project if you don’t already have one. Any name is fine, but I recommend you choose something to do with PyPI like pypinfo. This way you know what the project is designated for.

  4. Go to https://console.cloud.google.com/apis/api/bigquery-json.googleapis.com/overview and make sure the correct project is chosen using the drop-down on top. Click the button on top to enable.

  5. Follow https://cloud.google.com/storage/docs/authentication#generating-a-private-key to create credentials in JSON format. During creation, choose BigQuery User as the role. (If BigQuery is not an option in the list, wait 15-20 minutes and try creating the credentials again.) After creation, note the download location; you can move the file wherever you want.

  6. pip install pypinfo

  7. Run pypinfo --auth path/to/your_credentials.json, or set the environment variable GOOGLE_APPLICATION_CREDENTIALS so that it points to the file.
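
Before running the first query, it can help to confirm that the credentials file from the last two steps is actually reachable. The following is a minimal sketch, not part of pypinfo: find_credentials is a hypothetical helper, and the rule that an explicit path takes priority over GOOGLE_APPLICATION_CREDENTIALS is an assumption made for illustration.

```python
import json
import os
from pathlib import Path


def find_credentials(auth_path=None, environ=None):
    """Return a usable credentials path, or None if nothing suitable is found.

    Hypothetical pre-flight helper (not pypinfo's own logic): an explicit
    path, like the one passed to --auth, is tried first; otherwise the
    GOOGLE_APPLICATION_CREDENTIALS environment variable is used.
    """
    environ = os.environ if environ is None else environ
    candidate = auth_path or environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not candidate:
        return None
    path = Path(candidate).expanduser()
    if not path.is_file():
        return None
    try:
        # Only check that the file parses as a JSON object; the keys
        # inside are not inspected.
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        return None
    return path if isinstance(data, dict) else None
```

Calling find_credentials() with no arguments inspects the current environment; a None result means neither option from step 7 is set up yet.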

Usage

$ pypinfo
Usage: pypinfo [OPTIONS] [PROJECT] [FIELDS]... COMMAND [ARGS]...

  Valid fields are:

  project | version | pyversion | percent3 | percent2 | impl | impl-version |

  openssl | date | month | year | country | installer | installer-version |

  setuptools-version | system | system-release | distro | distro-version | cpu

Options:
  -a, --auth TEXT         Path to Google credentials JSON file.
  --run / --test          --test simply prints the query.
  -j, --json              Print data as JSON.
  -t, --timeout INTEGER   Milliseconds. Default: 120000 (2 minutes)
  -l, --limit TEXT        Maximum number of query results. Default: 20
  -d, --days TEXT         Number of days in the past to include. Default: 30
  -sd, --start-date TEXT  Must be negative. Default: -31
  -ed, --end-date TEXT    Must be negative. Default: -1
  -w, --where TEXT        WHERE conditional. Default: file.project = "project"
  -o, --order TEXT        Field to order by. Default: download_count
  --version               Show the version and exit.
  --help                  Show this message and exit.
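
The defaults in the help text (--days 30, --start-date -31, --end-date -1) suggest that a --days value of N becomes a window from N + 1 days ago to 1 day ago. The sketch below spells out that arithmetic. It is an inference from the help output and from the --test example later in this document, not a copy of pypinfo's code, and date_offsets is a hypothetical name.

```python
def date_offsets(days=30):
    """Map a --days value to (start_offset, end_offset), both in days.

    Assumption: inferred from the documented defaults
    (--days 30, --start-date -31, --end-date -1).
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    end_offset = -1                    # window ends one day in the past
    start_offset = end_offset - days   # and reaches back `days` more days
    return start_offset, end_offset


print(date_offsets())      # (-31, -1)
print(date_offsets(365))   # (-366, -1)
```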

pypinfo accepts 0 or more options, followed by exactly 1 project, followed by 0 or more fields. By default only the last 30 days are queried. Let’s take a look at some examples!
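
For scripted use, the argument order described above (options, then one project, then fields) can be assembled programmatically. This is a minimal sketch: build_pypinfo_command is a hypothetical helper, and it uses only flags that appear in the help output.

```python
import shlex


def build_pypinfo_command(project, fields=(), days=None, limit=None,
                          as_json=False, test=False):
    """Build an argv list: options first, then one project, then fields."""
    argv = ["pypinfo"]
    if test:
        argv.append("--test")      # print the query instead of sending it
    if as_json:
        argv.append("--json")
    if days is not None:
        argv += ["--days", str(days)]
    if limit is not None:
        argv += ["--limit", str(limit)]
    argv.append(project)           # exactly one project; "" means all projects
    argv.extend(fields)            # zero or more fields
    return argv


argv = build_pypinfo_command("", ["project", "percent3"],
                             days=365, limit=100, test=True)
print(shlex.join(argv))
# pypinfo --test --days 365 --limit 100 '' project percent3
```

The resulting list can be handed to subprocess.run(argv) once pypinfo is installed and credentials are configured. shlex.join requires Python 3.8 or newer.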

Tip: If queries result in NoneType errors, increase the timeout with --timeout.

Downloads for a project

$ pypinfo requests
download_count
--------------
11033343

All downloads

$ pypinfo ""
download_count
--------------
662834133

Downloads for a project by Python version

$ pypinfo django pyversion
python_version download_count
-------------- --------------
2.7            788060
3.5            400008
3.6            169665
3.4            134378
None           59415
2.6            8276
3.3            4831
3.7            2680
3.2            1560
1.17           41
2.5            15
2.4            15
3.1            6

All downloads by country code

$ pypinfo "" country
country download_count
------- --------------
US      427837633
None    26184466
IE      25595967
CN      19682726
DE      17338740
GB      16848703
AU      12201849
CA      9828255
FR      9780133
BR      9276365
JP      9247794
RU      8758959
IL      7578813
IN      7468363
KR      6809831
NL      6120287
SG      5882292
TW      3961899
CZ      2352650
PL      2270622

Downloads for a project by system and distribution

$ pypinfo cryptography system distro
system_name distro_name                     download_count
----------- ------------------------------- --------------
Linux       Ubuntu                          1226983
Linux       None                            701829
Linux       CentOS Linux                    254488
Linux       Debian GNU/Linux                207352
Linux       debian                          205485
Linux       CentOS                          195178
None        None                            179178
Windows     None                            126962
Darwin      macOS                           123389
Darwin      OS X                            51606
Linux       Amazon Linux AMI                43192
Linux       Red Hat Enterprise Linux Server 39157
Linux       Alpine Linux                    37721
Linux       Fedora                          25036
Linux       Virtuozzo                       10302
Linux       Raspbian GNU/Linux              4261
Linux       Linux                           4162
Linux       Oracle Linux Server             3754
FreeBSD     None                            3513
Linux       Debian                          3479

Percentage of Python 3 downloads of the top 100 projects in the past year

Let’s use --test to only see the query instead of sending it.

$ pypinfo --test --days 365 --limit 100 "" project percent3
SELECT
  file.project as project,
  ROUND(100 * SUM(CASE WHEN REGEXP_EXTRACT(details.python, r"^([^\.]+)") = "3" THEN 1 ELSE 0 END) / COUNT(*), 1) as percent_3,
  COUNT(*) as download_count,
FROM
  TABLE_DATE_RANGE(
    [the-psf:pypi.downloads],
    DATE_ADD(CURRENT_TIMESTAMP(), -366, "day"),
    DATE_ADD(CURRENT_TIMESTAMP(), -1, "day")
  )
GROUP BY
  project,
ORDER BY
  download_count DESC
LIMIT 100
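
To make the percent_3 expression in the query easier to follow, here is a rough Python equivalent of that one column. It is only an illustration: the sample version strings are made up, returning 0.0 for an empty input is a choice made here, and Python's round may treat exact halves differently from BigQuery's ROUND.

```python
import re

# Same pattern as in the query: everything before the first dot.
MAJOR_VERSION = re.compile(r"^([^\.]+)")


def percent_3(python_versions):
    """Share of downloads whose Python major version is "3", to one decimal.

    Entries without a version (None) still count toward the total,
    as COUNT(*) does in the query.
    """
    if not python_versions:
        return 0.0
    py3 = 0
    for version in python_versions:
        match = MAJOR_VERSION.match(version) if version else None
        if match and match.group(1) == "3":
            py3 += 1
    return round(100 * py3 / len(python_versions), 1)


# Made-up sample: three Python 3 downloads out of six.
sample = ["2.7.13", "3.5.2", "3.6.1", None, "3.4.3", "2.7.12"]
print(percent_3(sample))   # 50.0
```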

Credits

Changelog

Important changes are emphasized.

3.0.1

  • Fix: project names are now normalized to adhere to PEP 503.
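
For reference, PEP 503 normalization collapses runs of "-", "_" and "." into a single "-" and lowercases the result. The sketch below uses the regular expression given in PEP 503; whether pypinfo uses exactly this code is not stated here.

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize("Django_REST.framework"))   # django-rest-framework
```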

3.0.0

  • Breaking: --json option is now just a flag and prints output as prettified JSON.

2.0.0

  • Added --json path option.

1.0.0

  • Initial release
