Download remote data (HTTP, FTP, SFTP) and store locally for data pipeline
Project description
fetch-data
[![Documentation Status](https://readthedocs.org/projects/fetch-data/badge/?version=latest)](https://fetch-data.readthedocs.io/en/latest/?badge=latest)
This package was created out of frustration with how difficult it is to download data easily with intake. fetch-data is a mash-up of fsspec and pooch that makes it easy to download multiple files and store all the associated information, which makes it well suited to data pipeline applications.
Installation
Currently, this package is installed directly from GitHub:
pip install git+https://github.com/lukegre/fetch-data.git
Basic usage
Use the download function directly:
import fetch_data as fd
flist = fd.download(url)
The file will be downloaded to the current directory, which will also be populated with a README file, a cached file list, and logging information.
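As a rough sketch of how the call above might sit in a pipeline step, the helper below loops over several URLs and collects the returned file lists. `fetch_all`, `fake_download`, and the example URLs are hypothetical, and the code assumes that `download` returns a list of local paths, which the name `flist` suggests but the source does not state. The downloader is passed in as an argument so the helper can be tried without network access; in real use one would presumably pass `fd.download` instead.

```python
def fetch_all(urls, download):
    """Call download(url) for each URL and map url -> list of local files."""
    results = {}
    for url in urls:
        flist = download(url)
        # Assumption: accept a single path as well as a list of paths.
        results[url] = [flist] if isinstance(flist, str) else list(flist)
    return results


def fake_download(url):
    """Hypothetical stand-in for fd.download: returns a name, fetches nothing."""
    return [url.rsplit("/", 1)[-1]]


files = fetch_all(
    ["https://example.com/a.nc", "https://example.com/b.nc"],
    fake_download,
)
print(files)
```

Keeping the mapping from URL to local files makes it straightforward for later pipeline steps to find what each source produced.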
Using with YAML catalogs
Read a YAML catalog and pass one of its entries to the download function:
import fetch_data as fd
cat = fd.read_catalog(cat_fname)
flist = fd.download(**cat['entry_name'])
The catalog should be structured as shown below:
entry_name:
    url: remote path to file/s. Can contain *
    dest: where the file/s will be stored - can have optional {} placeholders that will be replaced
    meta:  # this will be written to the README file
        doi: url to the data source
        description: info about the data
        citation: how to cite this dataset
    placeholder: value  # optional will replace values in dest
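To make the structure concrete, here is a minimal sketch of how an entry like the one above maps onto keyword arguments. It does not call fetch-data: `resolve_entry` is a hypothetical stand-in, the entry name, URL, and paths are invented, and two behaviours are assumptions rather than documented facts: that `{}` placeholders in `dest` are filled like `str.format` fields, and that `*` in `url` acts like a shell-style wildcard.

```python
from fnmatch import fnmatch

# A catalog as it might look after parsing the YAML into a dict.
# The entry name, URL, and paths are invented for illustration.
catalog = {
    "sst_monthly": {
        "url": "https://example.com/data/sst_*.nc",
        "dest": "data/{product}/raw",
        "meta": {
            "doi": "https://example.com/doi-landing-page",
            "description": "Example sea surface temperature files",
            "citation": "Placeholder citation text",
        },
        "product": "sst",  # extra key, used to fill {product} in dest
    }
}


def resolve_entry(url, dest, meta=None, **placeholders):
    """Hypothetical stand-in for a downloader: fill dest, fetch nothing."""
    # Assumption: {} placeholders behave like str.format fields.
    return {"url": url, "dest": dest.format(**placeholders), "meta": meta or {}}


# ** unpacking turns each key of the entry into a keyword argument,
# mirroring the fd.download(**cat['entry_name']) call pattern.
resolved = resolve_entry(**catalog["sst_monthly"])
print(resolved["dest"])

# Assumption: * in url is matched like a shell-style wildcard.
remote_names = ["sst_2019.nc", "sst_2020.nc", "readme.txt"]
pattern = resolved["url"].rsplit("/", 1)[-1]
matches = [name for name in remote_names if fnmatch(name, pattern)]
print(matches)
```

Because every top-level key of the entry becomes a keyword argument, any key other than `url`, `dest`, and `meta` ends up in `placeholders` in this sketch, which is one plausible reading of the optional `placeholder: value` line.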
Project based on the cookiecutter science project template.
File details
Details for the file fetch_data-0.2.1.tar.gz.
File metadata
- Download URL: fetch_data-0.2.1.tar.gz
- Upload date:
- Size: 25.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | c0bd077ea79577c5bbdde944a152a278c5b5cc5d094eac70c73752fc05f653ff
MD5 | 221939769ea0d3c0b23390a2bddb367f
BLAKE2b-256 | d472c3ea4589ecef90e302418b558db68a28c65844ff667c945a262fca9d6263
File details
Details for the file fetch_data-0.2.1-py3-none-any.whl.
File metadata
- Download URL: fetch_data-0.2.1-py3-none-any.whl
- Upload date:
- Size: 14.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 64c12c4788709273266948add1841891ddb7554e48382d0303131ff165db8ded
MD5 | 39f048f0f2f1bb744c47512f33be8374
BLAKE2b-256 | 98fb787f722a0aeba36ee6f67a8adb4a9fe07bd2adba9f1cd17ad62691bee768