View PyPI download statistics with ease.
Project description
pypinfo is a simple CLI to access PyPI download statistics via Google’s BigQuery.
Installation
pypinfo is distributed on PyPI as a universal wheel and is available on Linux/macOS and Windows and supports Python 3.7+.
This is relatively painless, I swear.
Create project
Sign up if you haven’t already. The first TB of queried data each month is free. Each additional TB is $5.
Go to https://console.developers.google.com/cloud-resource-manager and click CREATE PROJECT if you don’t already have one:
This takes you to https://console.developers.google.com/projectcreate. Fill out the form and click CREATE. Any name is fine, but I recommend you choose something to do with PyPI like pypinfo. This way you know what the project is designated for:
The next page should show your new project. If not, reload the page and select from the top menu:
Enable BigQuery API
Go to https://console.cloud.google.com/apis/api/bigquery-json.googleapis.com/overview and make sure the correct project is chosen using the drop-down on top. Click the ENABLE button:
After enabling, click CREATE CREDENTIALS:
Choose the “BigQuery API” and “No, I’m not using them”:
Fill in a name, and select role “BigQuery User” (if the “BigQuery” is not an option in the list, wait 15-20 minutes and try creating the credentials again), and select a JSON key:
Click continue and the JSON will download to your computer. Note the download location. Move the file wherever you want:
pip install pypinfo
pypinfo --auth path/to/your_credentials.json, or set an environment variable GOOGLE_APPLICATION_CREDENTIALS that points to the file.
Usage
$ pypinfo
Usage: pypinfo [OPTIONS] [PROJECT] [FIELDS]... COMMAND [ARGS]...
Valid fields are:
project | version | file | pyversion | percent3 | percent2 | impl | impl-version |
openssl | date | month | year | country | installer | installer-version |
setuptools-version | system | system-release | distro | distro-version | cpu |
libc | libc-version
Options:
-a, --auth TEXT Path to Google credentials JSON file.
--run / --test --test simply prints the query.
-j, --json Print data as JSON, with keys `rows` and `query`.
-i, --indent INTEGER JSON indentation level.
-t, --timeout INTEGER Milliseconds. Default: 120000 (2 minutes)
-l, --limit TEXT Maximum number of query results. Default: 10
-d, --days TEXT Number of days in the past to include. Default: 30
-sd, --start-date TEXT Must be negative or YYYY-MM[-DD]. Default: -31
-ed, --end-date TEXT Must be negative or YYYY-MM[-DD]. Default: -1
-m, --month TEXT Shortcut for -sd & -ed for a single YYYY-MM month.
-w, --where TEXT WHERE conditional. Default: file.project = "project"
-o, --order TEXT Field to order by. Default: download_count
--all Show downloads by all installers, not only pip.
-pc, --percent Print percentages.
-md, --markdown Output as Markdown.
-v, --verbose Print debug messages to stderr.
--version Show the version and exit.
-h, --help Show this message and exit.
pypinfo accepts 0 or more options, followed by exactly 1 project, followed by 0 or more fields. By default only the last 30 days are queried. Let’s take a look at some examples!
Tip: If queries are resulting in NoneType errors, increase timeout.
Downloads for a project
$ pypinfo requests
Served from cache: False
Data processed: 2.83 GiB
Data billed: 2.83 GiB
Estimated cost: $0.02
| download_count |
| -------------- |
| 116,353,535 |
All downloads
$ pypinfo ""
Served from cache: False
Data processed: 116.15 GiB
Data billed: 116.15 GiB
Estimated cost: $0.57
| download_count |
| -------------- |
| 8,642,447,168 |
Downloads for a project by Python version
$ pypinfo django pyversion
Served from cache: False
Data processed: 967.33 MiB
Data billed: 968.00 MiB
Estimated cost: $0.01
| python_version | download_count |
| -------------- | -------------- |
| 3.8 | 1,735,967 |
| 3.6 | 1,654,871 |
| 3.7 | 1,326,423 |
| 2.7 | 876,621 |
| 3.9 | 524,570 |
| 3.5 | 258,609 |
| 3.4 | 12,769 |
| 3.10 | 3,050 |
| 3.3 | 225 |
| 2.6 | 158 |
| Total | 6,393,263 |
All downloads by country code
$ pypinfo "" country
Served from cache: False
Data processed: 150.40 GiB
Data billed: 150.40 GiB
Estimated cost: $0.74
| country | download_count |
| ------- | -------------- |
| US | 6,614,473,568 |
| IE | 336,037,059 |
| IN | 192,914,402 |
| DE | 186,968,946 |
| NL | 182,691,755 |
| None | 141,753,357 |
| BE | 111,234,463 |
| GB | 109,539,219 |
| SG | 106,375,274 |
| FR | 86,036,896 |
| Total | 8,068,024,939 |
Downloads for a project by system and distribution
$ pypinfo cryptography system distro
Served from cache: False
Data processed: 2.52 GiB
Data billed: 2.52 GiB
Estimated cost: $0.02
| system_name | distro_name | download_count |
| ----------- | ------------------------------- | -------------- |
| Linux | Ubuntu | 19,524,538 |
| Linux | Debian GNU/Linux | 11,662,104 |
| Linux | Alpine Linux | 3,105,553 |
| Linux | Amazon Linux AMI | 2,427,975 |
| Linux | Amazon Linux | 2,374,869 |
| Linux | CentOS Linux | 1,955,181 |
| Windows | None | 1,522,069 |
| Linux | CentOS | 568,370 |
| Darwin | macOS | 489,859 |
| Linux | Red Hat Enterprise Linux Server | 296,858 |
| Total | | 43,927,376 |
Most popular projects in the past year
$ pypinfo --days 365 "" project
Served from cache: False
Data processed: 1.69 TiB
Data billed: 1.69 TiB
Estimated cost: $8.45
| project | download_count |
| --------------- | -------------- |
| urllib3 | 1,382,528,406 |
| six | 1,172,798,441 |
| botocore | 1,053,169,690 |
| requests | 995,387,353 |
| setuptools | 992,794,567 |
| certifi | 948,518,394 |
| python-dateutil | 934,709,454 |
| idna | 929,781,443 |
| s3transfer | 877,565,186 |
| chardet | 854,744,674 |
| Total | 10,141,997,608 |
Downloads between two YYYY-MM-DD dates
$ pypinfo --start-date 2018-04-01 --end-date 2018-04-30 setuptools
Served from cache: False
Data processed: 571.37 MiB
Data billed: 572.00 MiB
Estimated cost: $0.01
| download_count |
| -------------- |
| 8,972,826 |
Downloads between two YYYY-MM dates
A yyyy-mm --start-date defaults to the first day of the month
A yyyy-mm --end-date defaults to the last day of the month
$ pypinfo --start-date 2018-04 --end-date 2018-04 setuptools
Served from cache: False
Data processed: 571.37 MiB
Data billed: 572.00 MiB
Estimated cost: $0.01
| download_count |
| -------------- |
| 8,972,826 |
Downloads for a single YYYY-MM month
$ pypinfo --month 2018-04 setuptools
Served from cache: False
Data processed: 571.37 MiB
Data billed: 572.00 MiB
Estimated cost: $0.01
| download_count |
| -------------- |
| 8,972,826 |
Percentage of Python 3 downloads of the top 100 projects in the past year
Let’s use --test to only see the query instead of sending it.
$ pypinfo --test --days 365 --limit 100 "" project percent3
SELECT
file.project as project,
ROUND(100 * SUM(CASE WHEN REGEXP_EXTRACT(details.python, r"^([^\.]+)") = "3" THEN 1 ELSE 0 END) / COUNT(*), 1) as percent_3,
COUNT(*) as download_count,
FROM `bigquery-public-data.pypi.file_downloads`
WHERE timestamp BETWEEN TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -366 DAY) AND TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY)
AND details.installer.name = "pip"
GROUP BY
project
ORDER BY
download_count DESC
LIMIT 100
Downloads for a given version
pypinfo supports PEP 440 version matching.
We can use it to query stats on a given major version.
$ pypinfo -pc 'pip==21.*' pyversion version
Served from cache: False
Data processed: 34.45 MiB
Data billed: 35.00 MiB
Estimated cost: $0.01
| python_version | version | percent | download_count |
| -------------- | ------- | ------- | -------------- |
| 3.6 | 21.3.1 | 78.74% | 10,430 |
| 3.8 | 21.3.1 | 7.81% | 1,034 |
| 3.7 | 21.2.1 | 3.59% | 476 |
| 3.7 | 21.3.1 | 2.60% | 345 |
| 3.7 | 21.0.1 | 2.25% | 298 |
| 3.8 | 21.0.1 | 1.58% | 209 |
| 3.8 | 21.2.1 | 1.42% | 188 |
| 3.7 | 21.1.2 | 0.81% | 107 |
| 3.9 | 21.3.1 | 0.69% | 92 |
| 3.8 | 21.1.1 | 0.51% | 67 |
| Total | | | 13,246 |
We can also use it to query stats on an exact version:
$ pypinfo -pc 'numpy==1.23rc3' pyversion version
Served from cache: False
Data processed: 34.01 MiB
Data billed: 35.00 MiB
Estimated cost: $0.01
| python_version | version | percent | download_count |
| -------------- | --------- | ------- | -------------- |
| 3.9 | 1.23.0rc3 | 63.33% | 38 |
| 3.8 | 1.23.0rc3 | 28.33% | 17 |
| 3.10 | 1.23.0rc3 | 8.33% | 5 |
| Total | | | 60 |
Credits
Donald Stufft for maintaining PyPI all these years.
Paul Kehrer for his awesome blog post.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file pypinfo-21.0.0.tar.gz
.
File metadata
- Download URL: pypinfo-21.0.0.tar.gz
- Upload date:
- Size: 23.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 4a500a989977de2f2ea2b05e4d84a5a0caef7d43dffbe40ab581cbf60cc78293 |
|
MD5 | d7acb0cb9396e0cc5c5cb291ef542dbe |
|
BLAKE2b-256 | cdd745384ec5fd5a458c1e1078350c59c9697df88a65adf2abe4af27d0fad413 |
File details
Details for the file pypinfo-21.0.0-py3-none-any.whl
.
File metadata
- Download URL: pypinfo-21.0.0-py3-none-any.whl
- Upload date:
- Size: 14.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 1b37ecee9d187b9d5e2b3e3389ddba29884ba02fbd1a61303ea739c9f23dbce7 |
|
MD5 | eb583d3da770a50ad3d13ffbd033c02e |
|
BLAKE2b-256 | 4e6f7f19d72ece54dcefa9d9bd4631f30904ba0b9fd638d30c0ebe7ea15c8c23 |