

fetch-data

License: MIT

Documentation: https://fetch-data.readthedocs.io/en/latest/

Download remote data (HTTP, FTP, SFTP) and store locally for data pipeline.

This package grew out of frustration with how difficult it is to download data with intake. fetch-data combines fsspec and pooch to make it easy to download multiple files and to store the accompanying information (a readme, the cached file list, and logs), which suits data pipeline applications.

Installation

Currently, this package can be installed directly from GitHub:

pip install git+https://github.com/lukegre/fetch-data.git

Basic usage

Use the download function directly:

import fetch_data as fd
flist = fd.download(url)  # url: remote path (HTTP, FTP or SFTP) to the file/s

The file is downloaded to the current directory, which is also populated with a readme file, a cached file list, and logging information.

Using with YAML catalogs

Read a YAML catalog and pass one of its entries to the download function:

import fetch_data as fd
cat = fd.read_catalog(cat_fname)
flist = fd.download(**cat['entry_name'])
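The `**` in the call above unpacks a catalog entry into keyword arguments. The following is a minimal sketch of that mechanism only. `download_stub` is a hypothetical stand-in, not fetch-data's real function; its parameter names simply mirror the catalog keys documented in this README, the real signature may differ, and the URL is a placeholder.

```python
# Hypothetical stand-in for fd.download. It performs no network access and
# only records the keyword arguments it receives.
def download_stub(url, dest="./", meta=None, **placeholders):
    return {
        "url": url,
        "dest": dest,
        "meta": meta or {},
        "placeholders": placeholders,
    }


# A catalog as it might look once the YAML file has been parsed into a dict.
cat = {
    "entry_name": {
        "url": "https://example.com/data/file_*.nc",  # placeholder URL
        "dest": "./data/{version}/",
        "meta": {"description": "example dataset"},
        "version": "v1",  # extra key, collected into **placeholders
    }
}

# Equivalent to download_stub(url=..., dest=..., meta=..., version="v1")
call = download_stub(**cat["entry_name"])
print(call["placeholders"])
```

Keys that are not named parameters (here `version`) end up in the catch-all keyword dictionary. This is one plausible way optional placeholder values could travel alongside `url`, `dest`, and `meta`.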

The catalog should be structured as shown below:

entry_name:
    url: remote path to file/s. Can contain *
    dest: where the file/s will be stored - can have optional {} placeholders that will be replaced
    meta:  # this will be written to the README file
        doi: url to the data source
        description: info about the data
        citation: how to cite this dataset
    placeholder: value  # optional; replaces the matching {placeholder} in dest
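To illustrate how `{}` placeholders in `dest` could be filled from the other keys of an entry, here is a minimal sketch using Python's built-in `str.format`. The helper `fill_dest`, the key names `version` and `region`, and the URL are all hypothetical; fetch-data's own substitution logic is not shown in this README and may work differently.

```python
# Keys with a fixed meaning in the catalog structure; everything else is
# treated as a placeholder value (an assumption made for this sketch).
RESERVED_KEYS = {"url", "dest", "meta"}


def fill_dest(entry):
    """Return entry['dest'] with {name} placeholders replaced by extra keys."""
    values = {k: v for k, v in entry.items() if k not in RESERVED_KEYS}
    return entry["dest"].format(**values)


entry = {
    "url": "https://example.com/data/file_*.nc",  # placeholder URL
    "dest": "./data/{version}/{region}/",
    "meta": {"description": "example dataset"},
    "version": "v1",
    "region": "global",
}

resolved = fill_dest(entry)
print(resolved)  # ./data/v1/global/
```

With this approach, a placeholder in `dest` that has no matching key raises a `KeyError`, which surfaces typos in a catalog early.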

Project based on the cookiecutter science project template.


