Tools for scraping personal financial data.
Python package for scraping personal financial data from financial institutions.
This package may be useful on its own, but is specifically designed to be used with beancount-import.
Supported data sources
- finance_dl.ofx: uses ofxclient to download data using the OFX protocol.
- finance_dl.mint: uses mintapi to download data from the Mint.com website.
- finance_dl.venmo: downloads transaction and balance information from the Venmo.com website.
- finance_dl.paypal: downloads transactions from the Paypal.com website.
- finance_dl.amazon: downloads order invoices from the Amazon website.
- finance_dl.healthequity: downloads transaction history and balance information from the HealthEquity website.
- finance_dl.google_purchases: downloads purchases that Google has heuristically extracted from Gmail messages.
- finance_dl.stockplanconnect: downloads PDF documents (including release and trade confirmations) from the Morgan Stanley Stockplanconnect website.
- finance_dl.pge: downloads Pacific Gas & Electric (PG&E) PDF bills.
- finance_dl.comcast: downloads Comcast PDF bills.
- finance_dl.ebmud: downloads East Bay Municipal Utility District (EBMUD) water bills.
- finance_dl.anthem: downloads Anthem BlueCross insurance claim statements.
- finance_dl.waveapps: downloads receipt images and extracted transaction data from Wave, which is a free receipt-scanning website/mobile app.
- finance_dl.ultipro_google: downloads Google employee payroll statements in PDF format from Ultipro.
Setup
To install the most recent published package from PyPI, type:
pip install finance-dl
To install from a clone of the repository, type:
pip install .
or for development:
pip install -e .
Configuration
Create a Python file like example_finance_dl_config.py.
Refer to the documentation of the individual scraper modules for details.
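As a rough illustration, a configuration module might look like the sketch below. The CONFIG_ prefix on the function name follows the convention described under Basic Usage; the dictionary keys (module, credentials, output_directory), the DATA_DIR helper, and the placeholder values are assumptions made for illustration, so check each scraper module's documentation for the parameters it actually expects.

```python
import os

# Hypothetical base directory for downloaded data; adjust to taste.
DATA_DIR = os.path.join(os.path.expanduser("~"), "finance-data")


def CONFIG_myconfig():
    """Return the settings for the configuration named ``myconfig``."""
    return dict(
        # Assumed key: the scraper module this configuration uses.
        module="finance_dl.venmo",
        # Assumed key: placeholder credentials; never commit real ones.
        credentials={"username": "XXXXXX", "password": "XXXXXX"},
        # Assumed key: directory where the scraper writes its output.
        output_directory=os.path.join(DATA_DIR, "venmo"),
    )
```

Returning the settings from a function, rather than defining them at import time, means values such as passwords only need to be looked up when that particular configuration is run.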
Basic Usage
You can run a scraping configuration named myconfig as follows:
python -m finance_dl.cli --config-module example_finance_dl_config --config myconfig
The configuration myconfig refers to a function named CONFIG_myconfig
in the configuration module.
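The name lookup described above can be sketched in a few lines. This is not the actual finance_dl.cli implementation, only an illustration of the naming convention; load_config and the in-memory demo_finance_dl_config module are hypothetical names introduced for the example.

```python
import importlib
import sys
import types


def load_config(config_module, config_name):
    """Import config_module and call its CONFIG_<config_name> function."""
    module = importlib.import_module(config_module)
    factory = getattr(module, "CONFIG_" + config_name)
    return factory()


# In-memory stand-in for a real configuration file (hypothetical).
demo = types.ModuleType("demo_finance_dl_config")
demo.CONFIG_myconfig = lambda: {"module": "finance_dl.venmo"}
sys.modules["demo_finance_dl_config"] = demo

print(load_config("demo_finance_dl_config", "myconfig"))
```

If the function does not exist, getattr raises AttributeError, which is a quick way to spot a misspelled configuration name.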
Make sure that your configuration module is accessible in your Python
sys.path. Since sys.path includes the current directory by default,
you can simply run this command from the directory that contains your
configuration module.
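If the command fails to find your configuration module, a quick check like the following can confirm whether Python can locate it. The helper name can_import and its extra_dir parameter are hypothetical; the sketch relies only on the standard importlib.util.find_spec function and is meant for top-level module names.

```python
import importlib.util
import sys


def can_import(module_name, extra_dir=None):
    """Report whether a top-level module can be found on sys.path."""
    # Optionally make one more directory searchable first.
    if extra_dir is not None and extra_dir not in sys.path:
        sys.path.insert(0, extra_dir)
    return importlib.util.find_spec(module_name) is not None
```

For example, can_import("example_finance_dl_config") should return True when run from the directory that contains example_finance_dl_config.py.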
By default, the scrapers run fully automatically, and the ones based
on selenium and chromedriver run in headless mode. If the initial
attempt for a selenium-based scraper fails, it is automatically
retried with the browser window visible. This allows you to manually
complete the login process and enter any multi-factor authentication
code that is required.
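The headless-first, visible-on-failure behavior can be pictured with the sketch below. It is a simplified model rather than the package's own code: run_with_visible_fallback and fake_scrape are hypothetical names, and the scraper is assumed to be any callable that accepts a headless keyword argument.

```python
def run_with_visible_fallback(scrape):
    """Call scrape(headless=True); on failure, retry once with headless=False."""
    try:
        return scrape(headless=True)
    except Exception as exc:
        print("Headless attempt failed (%s); retrying with a visible browser" % exc)
        return scrape(headless=False)


# Stand-in scraper that only succeeds when the browser is visible,
# mimicking a login that needs a manually entered verification code.
def fake_scrape(headless):
    if headless:
        raise RuntimeError("login requires a verification code")
    return "done"


print(run_with_visible_fallback(fake_scrape))
```

A failure in the second, visible attempt is not caught, so it surfaces to the caller as a normal exception.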
To debug a scraper, you can run it in interactive mode by specifying
the -i command-line argument. This runs an interactive IPython shell
that lets you manually invoke parts of the scraping process.
Automatic Usage
To run multiple configurations at once, and keep track of when each
configuration was last updated, you can use the finance_dl.update tool.
To display the update status, first create a logs directory, then run:
python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs status
Initially, this will indicate that none of the configurations have
been updated. To update a single configuration myconfig, run:
python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update myconfig
With a single configuration specified, this does the same thing as the
finance_dl.cli tool, except that the log messages are written to
logs/myconfig.txt and a logs/myconfig.lastupdate file is created if
the run succeeds.
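The log-and-marker bookkeeping can be sketched as follows. This is an illustrative model, not the finance_dl.update source: run_and_record is a hypothetical helper, the run argument stands in for whatever performs the scrape, and writing a timestamp into the .lastupdate file is an assumption (the README only says the file is created).

```python
import os
import time


def run_and_record(name, run, log_dir):
    """Run one configuration, log its messages, and mark success."""
    log_path = os.path.join(log_dir, name + ".txt")
    marker_path = os.path.join(log_dir, name + ".lastupdate")
    with open(log_path, "a") as log:
        try:
            run(log)
        except Exception as exc:
            log.write("update failed: %r\n" % (exc,))
            return False
    # Only a successful run creates or refreshes the marker file.
    with open(marker_path, "w") as marker:
        marker.write(time.strftime("%Y-%m-%d %H:%M:%S\n"))
    return True
```

Because the marker is only touched on success, a status report can tell stale configurations apart from recently updated ones by checking which .lastupdate files exist and how old they are.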
If multiple configurations are specified, as in:
python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update myconfig1 myconfig2
then all specified configurations are run in parallel.
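Running several configurations concurrently can be modeled with a thread pool, as in the sketch below. The real tool may well use separate processes or another mechanism; update_in_parallel and the update_one callback are hypothetical names used only to show the idea that one failing configuration does not stop the others.

```python
from concurrent.futures import ThreadPoolExecutor


def update_in_parallel(names, update_one):
    """Run update_one(name) for every name concurrently; return {name: ok}."""
    def safe(name):
        # Convert exceptions into a False result so other runs continue.
        try:
            update_one(name)
            return True
        except Exception:
            return False

    if not names:
        return {}
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map yields results in the same order as names.
        return dict(zip(names, pool.map(safe, names)))
```

The returned dictionary maps each configuration name to True or False, which is enough to print a short summary once all runs have finished.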
To update all configurations, run:
python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update --all
License
Copyright (C) 2014-2018 Jeremy Maitin-Shepard.
Distributed under the GNU General Public License, Version 2.0 only. See LICENSE file for details.