Tools for scraping personal financial data.

Python package for scraping personal financial data from financial institutions.

License: GPL v2

This package may be useful on its own, but it is specifically designed to be used with beancount-import.

Supported data sources

Setup

To install the most recent published package from PyPI, run:

pip install finance-dl

To install from a clone of the repository, type:

python setup.py install

or for development:

python setup.py develop

Configuration

Create a Python file like example_finance_dl_config.py.

Refer to the documentation of the individual scraper modules for details.

Basic Usage

You can run a scraping configuration named myconfig as follows:

python -m finance_dl.cli --config-module example_finance_dl_config --config myconfig

The configuration myconfig refers to a function named CONFIG_myconfig in the configuration module.
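
The name-to-function mapping described above can be pictured with a small sketch. It is not finance-dl's actual code: the helper resolve_config and the dictionary key returned by CONFIG_myconfig are hypothetical, and a stand-in module object takes the place of a real configuration file.

```python
import types


def resolve_config(config_module, name):
    """Call CONFIG_<name> in config_module and return its result (hypothetical helper)."""
    func = getattr(config_module, "CONFIG_" + name, None)
    if func is None:
        raise KeyError(f"no function CONFIG_{name} in {config_module.__name__}")
    return func()


# Stand-in for a configuration module such as example_finance_dl_config.
demo_module = types.ModuleType("demo_finance_dl_config")


def CONFIG_myconfig():
    # Placeholder key; the real keys depend on the scraper module being configured.
    return {"output_directory": "data/myconfig"}


demo_module.CONFIG_myconfig = CONFIG_myconfig

config = resolve_config(demo_module, "myconfig")
print(config)
```

Refer to the individual scraper modules for the keys each one actually expects.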

Make sure that your configuration module is accessible in your Python sys.path. Since sys.path includes the current directory by default, you can simply run this command from the directory that contains your configuration module.
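
If the command cannot find your configuration module, a quick check like the following may help. This is a minimal sketch using only the standard library; the module name hypothetical_finance_dl_config and the helper is_importable are made up for illustration, and a temporary directory stands in for the directory that holds your real configuration file.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def is_importable(module_name):
    """Return True if module_name can be located on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


with tempfile.TemporaryDirectory() as config_dir:
    # Write a throwaway configuration module into the temporary directory.
    path = os.path.join(config_dir, "hypothetical_finance_dl_config.py")
    with open(path, "w") as f:
        f.write("def CONFIG_myconfig():\n    return {}\n")

    # Not found yet: the directory is not on sys.path.
    found_before = is_importable("hypothetical_finance_dl_config")

    # Adding the directory to sys.path makes the module discoverable.
    sys.path.insert(0, config_dir)
    importlib.invalidate_caches()
    found_after = is_importable("hypothetical_finance_dl_config")
    sys.path.remove(config_dir)

print(found_before, found_after)
```

Setting the PYTHONPATH environment variable to the directory that contains the module has the same effect as the sys.path.insert call, without changing any code.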

By default, the scrapers run fully automatically, and the ones based on selenium and chromedriver run in headless mode. If the initial attempt for a selenium-based scraper fails, it is automatically retried with the browser window visible. This allows you to manually complete the login process and enter any multi-factor authentication code that is required.

To debug a scraper, you can run it in interactive mode by specifying the -i command-line argument. This runs an interactive IPython shell that lets you manually invoke parts of the scraping process.

Automatic Usage

To run multiple configurations at once, and keep track of when each configuration was last updated, you can use the finance_dl.update tool.

To display the update status, first create a logs directory and run:

python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs status

Initially, this will indicate that none of the configurations have been updated. To update a single configuration myconfig, run:

python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update myconfig

With a single configuration specified, this does the same thing as the finance_dl.cli tool, except that the log messages are written to logs/myconfig.txt and a logs/myconfig.lastupdate file is created if the update succeeds.
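
To see which configurations have a .lastupdate file, you could inspect the log directory yourself. The sketch below is an illustration, not how the status command is implemented: the helper last_update_times is hypothetical, and because the contents of the .lastupdate file are not described here, it relies only on the file's modification time.

```python
import os
import tempfile
from datetime import datetime


def last_update_times(log_dir):
    """Map configuration name -> modification time of <name>.lastupdate in log_dir."""
    result = {}
    for entry in sorted(os.listdir(log_dir)):
        name, ext = os.path.splitext(entry)
        if ext == ".lastupdate":
            mtime = os.path.getmtime(os.path.join(log_dir, entry))
            result[name] = datetime.fromtimestamp(mtime)
    return result


# Demonstration with a temporary directory standing in for logs/.
with tempfile.TemporaryDirectory() as log_dir:
    for filename in ("myconfig.txt", "myconfig.lastupdate", "other.txt"):
        open(os.path.join(log_dir, filename), "w").close()
    updates = last_update_times(log_dir)

for name, when in updates.items():
    print(name, when.isoformat())
```

In this demonstration only myconfig is reported, because other has a log file but no .lastupdate file.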

If multiple configurations are specified, as in:

python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update myconfig1 myconfig2

then all specified configurations are run in parallel.

To update all configurations, run:

python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update --all
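
If you want to trigger updates from a scheduled job, a thin wrapper can assemble the command line for you. This is a minimal sketch under two assumptions: that the finance_dl.update module named at the start of this section is the entry point, and that the flags are exactly the ones shown in this section. The helper build_update_command is hypothetical and not part of finance-dl.

```python
import sys


def build_update_command(config_module, log_dir, configs=None):
    """Build the argument list for an update run.

    If configs is empty or None, --all is passed instead of configuration names.
    """
    cmd = [
        sys.executable, "-m", "finance_dl.update",
        "--config-module", config_module,
        "--log-dir", log_dir,
        "update",
    ]
    cmd += list(configs) if configs else ["--all"]
    return cmd


cmd = build_update_command(
    "example_finance_dl_config", "logs", ["myconfig1", "myconfig2"]
)
print(" ".join(cmd[1:]))

# To actually run it (requires finance-dl and your configuration module):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues if a configuration name or path contains spaces.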

License

Copyright (C) 2014-2018 Jeremy Maitin-Shepard.

Distributed under the GNU General Public License, Version 2.0 only. See LICENSE file for details.

