Geocode addresses from New York State's SPARCS data

Project description

geocode_sparcs is a program for geocoding health data from New York State’s Statewide Planning and Research Cooperative System (SPARCS) with a local installation of Pelias. It might also be useful as a Pelias wrapper for other regions and datasets, but for now at least, the focus is on SPARCS. The various kinds of fallback logic and string munging the program implements are specifically to improve performance on SPARCS.

Install geocode_sparcs via pip with the command pip install geocode_sparcs. Python dependencies are installed automatically, but you’ll need to set up Pelias separately by following the instructions for Pelias on Docker. You can use the provided Pelias project directory; just be sure to set DATA_DIR in .env to the directory where you want to store all the data. Setting up Pelias with this configuration can take a few hours of downloading and processing.

Once Pelias is up, you can geocode with the command python3 -m geocode_sparcs, passing in addresses to geocode through standard input. Each address should be a JSON object on its own line (per JSON Lines) with the keys line1, city, and zip. The values should all be strings (even zip) and are presumed to come from the columns PAT_ADDR_LINE1, PAT_ADDR_CITY, and PAT_ADDR_ZIP5 in a SPARCS_LOCATION file; it is also assumed that you have already checked that PAT_ADDR_ST is equal to NY for every record. Here’s an example (with addresses that aren’t actually from SPARCS, since that’s protected health information):

$ echo '{"line1": "405 East 42nd St", "city": "New York", "zip": "10017"}' >>input.txt
$ echo '{"line1": "351 Northern Blvd", "city": "Albany", "zip": "12204"}' >>input.txt
$ python3 -m geocode_sparcs <input.txt
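
If your SPARCS_LOCATION data is available as rows keyed by column name, the input lines described above could be built with something like the following. This is only a sketch: the function name sparcs_rows_to_jsonl is hypothetical and not part of geocode_sparcs, and reading the data as a CSV file with a header row is an assumption, since the actual file layout isn't described here.

```python
import csv
import io
import json


def sparcs_rows_to_jsonl(rows):
    """Yield one JSON Lines string per row whose PAT_ADDR_ST is NY.

    `rows` is any iterable of dicts keyed by SPARCS column name
    (for example, the rows produced by csv.DictReader).
    Hypothetical helper, not part of geocode_sparcs.
    """
    for row in rows:
        # Skip anything outside New York, per the assumption stated above.
        if (row.get("PAT_ADDR_ST") or "").strip().upper() != "NY":
            continue
        record = {
            "line1": (row.get("PAT_ADDR_LINE1") or "").strip(),
            "city": (row.get("PAT_ADDR_CITY") or "").strip(),
            # Keep the ZIP code as a string so leading zeros survive.
            "zip": str(row.get("PAT_ADDR_ZIP5") or "").strip(),
        }
        yield json.dumps(record)


if __name__ == "__main__":
    # Made-up sample data; assumes a CSV layout with a header row.
    sample = io.StringIO(
        "PAT_ADDR_LINE1,PAT_ADDR_CITY,PAT_ADDR_ST,PAT_ADDR_ZIP5\n"
        "405 East 42nd St,New York,NY,10017\n"
        "1 Main St,Newark,NJ,07102\n"
    )
    for line in sparcs_rows_to_jsonl(csv.DictReader(sample)):
        print(line)
```

Writing the yielded lines to a file gives you something you can redirect into python3 -m geocode_sparcs in the same way as input.txt above.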

The output is also in JSON Lines. By default, the first entry of the features array that Pelias returns for each input is written out without further processing. See python3 -m geocode_sparcs --help for command-line options.
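
To pull coordinates out of the default output, a reader along these lines may be enough. It is a sketch under two assumptions that aren't confirmed here: that each output line is a GeoJSON Feature as Pelias normally returns it (a Point geometry with coordinates in longitude, latitude order), and that an input with no match yields a line that isn't such an object, such as null. The function name feature_to_lon_lat is hypothetical.

```python
import json


def feature_to_lon_lat(line):
    """Return (lon, lat) from one output line, or None if unavailable.

    Assumes the line holds a GeoJSON Feature with a Point geometry;
    anything else (for example, null) is treated as "no match".
    Hypothetical helper, not part of geocode_sparcs.
    """
    feature = json.loads(line)
    if not isinstance(feature, dict):
        return None
    geometry = feature.get("geometry") or {}
    coords = geometry.get("coordinates")
    if geometry.get("type") != "Point" or not coords or len(coords) < 2:
        return None
    # GeoJSON stores positions as [longitude, latitude].
    return float(coords[0]), float(coords[1])


if __name__ == "__main__":
    # Made-up feature for illustration; not real geocoder output.
    sample = (
        '{"type": "Feature", '
        '"geometry": {"type": "Point", "coordinates": [-73.97, 40.75]}, '
        '"properties": {}}'
    )
    print(feature_to_lon_lat(sample))
    print(feature_to_lon_lat("null"))
```

If you pass command-line options that change the output, check the actual shape of the lines before relying on this.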

Download files

Source Distribution

geocode_sparcs-0.1.0.tar.gz (5.2 kB view details)

Uploaded Source

File details

Details for the file geocode_sparcs-0.1.0.tar.gz.

File metadata

  • Download URL: geocode_sparcs-0.1.0.tar.gz
  • Size: 5.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.7

File hashes

Hashes for geocode_sparcs-0.1.0.tar.gz
Algorithm Hash digest
SHA256 f1ac06d3afa0b637051cf4da9d10199cfda2d50aa25a0b6e9a3a37529dc971c2
MD5 8df970418d1910006115fcc8f62bafa9
BLAKE2b-256 7d37a25aa3d9f3c86d23c86fe6bc3a82af48f3f15337b100c514d9959938136f
