Make data extracts from OSM data.

OSM RawData

A python module for accessing OSM data in a postgres database.

📖 Documentation: https://hotosm.github.io/osm-rawdata/

🖥️ Source Code: https://github.com/hotosm/osm-rawdata


This is a module for working with OpenStreetMap data using postgres and a custom database schema. The code is derived from the HOT Export Tool, osm-fieldwork, Underpass, and the Raw Data API, which is the new FastAPI backend for the HOT Export Tool.

Since multiple projects need to do data extracts from OpenStreetMap in a flexible way, this was designed to have a single body of code to maintain.

Installation

To install osm-rawdata, you can use pip. Here are three options:

  • Directly from the main branch: pip install git+https://github.com/hotosm/osm-rawdata.git

  • Latest on PyPI: pip install osm-rawdata

  • Including the packages required for importer.py: pip install osm-rawdata[importer]

Note that importer.py will not work unless the extra dependencies are installed using osm-rawdata[importer].

Using the Container Image

  • osm-rawdata scripts can be used via the pre-built container images.
  • These images come with all dependencies bundled, so are simple to run.

Run a specific command:

docker run --rm -v $PWD:/data ghcr.io/hotosm/osm-rawdata:latest geofabrik <flags>

Run interactively (to use multiple commands):

docker run --rm -it -v $PWD:/data ghcr.io/hotosm/osm-rawdata:latest

Note: the output directory should always be /data/... to persist data.

The Database Schema

This project depends heavily on postgres and postgis. The schema is optimized for data analysis rather than for display. The traditional schema used for OSM shows how it has evolved over the years: some tags are columns (usually empty), and others are put into an hstore tags column where they have to be accessed directly. One big change in this database schema is that all the tags are in a single column, which reduces the data size considerably and makes the data easier to query in a consistent manner. In the past a developer had to keep track of which tags were columns and which were in the tags column, which was inefficient.
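The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration, not code from this module: the set of column names is a made-up subset, the function names are hypothetical, and the single tags column is assumed to be JSONB (the text above only says that all tags share one column).

```python
# Hypothetical subset of tag keys that are real columns in a traditional schema.
TRADITIONAL_COLUMNS = {"amenity", "building", "highway", "name"}


def traditional_tag_expr(key: str) -> str:
    """SQL expression for a tag in a traditional schema.

    The caller has to know whether the key is a column or lives in hstore.
    """
    if key in TRADITIONAL_COLUMNS:
        return f'"{key}"'
    return f"tags -> '{key}'"


def single_column_tag_expr(key: str) -> str:
    """SQL expression when every tag is in one column (JSONB is an assumption)."""
    return f"tags->>'{key}'"


# Keys are interpolated directly for illustration only; real code should
# validate them or pass them as query parameters.
for key in ("amenity", "opening_hours"):
    print(key, "|", traditional_tag_expr(key), "|", single_column_tag_expr(key))
```

With a single tags column the same expression works for every key, so no lookup table of column names is needed.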

This schema has 4 tables, similar to the traditional ones. OSM data is imported using osm2pgsql, with a lua script that creates the custom schema. This module only reads from the database, as Underpass can keep the raw data updated every minute, and we just want to access that data.

Things get more interesting as this module supports both a local database and a remote one, and they use different query languages. To simplify this, the query is described in a configuration file, and the proper query syntax is generated from it.

The Config File

This reads in two different formats that describe the eventual SQL query. The YAML format was originally used by the Export Tool, but later abandoned for a JSON format. The YAML format was adopted by the osm-fieldwork project before this transition happened, so an enhanced version of it is used to define the queries.

The JSON format is also supported, both for parsing a JSON config file and for generating a JSON query from a YAML config file.
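The idea of one config producing two query syntaxes can be sketched as follows. Everything here is an assumption made for illustration: the config keys (select, from, where), the function names, and the shape of the remote payload are all hypothetical and are not the formats this module or the Raw Data API actually use. The SQL assumes a JSONB tags column and psycopg2-style %s placeholders.

```python
import json

# Hypothetical, simplified config; the real config formats are richer.
CONFIG_JSON = """
{
  "select": ["osm_id", "tags"],
  "from": "nodes",
  "where": {"amenity": "school"}
}
"""


def parse_config(text: str) -> dict:
    """Parse the JSON text and check that the expected keys are present."""
    config = json.loads(text)
    missing = {"select", "from", "where"} - config.keys()
    if missing:
        raise ValueError(f"config is missing keys: {sorted(missing)}")
    return config


def build_sql(config: dict) -> tuple[str, list[str]]:
    """Return a parameterized SQL string and its parameters (local database)."""
    columns = ", ".join(config["select"])
    clauses, params = [], []
    for key, value in config["where"].items():
        clauses.append("tags->>%s = %s")
        params.extend([key, value])
    sql = f"SELECT {columns} FROM {config['from']}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def build_remote_query(config: dict) -> str:
    """Return a JSON payload for a remote service (made-up payload shape)."""
    return json.dumps({"filters": {"tags": config["where"]}}, sort_keys=True)


config = parse_config(CONFIG_JSON)
print(build_sql(config))
print(build_remote_query(config))
```

Keeping the parsed config as a plain data structure means each backend only needs its own small generator function.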

The Files

geofabrik.py

This is a simple utility to download a file from Geofabrik.
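As a rough sketch of what such a download involves (not the implementation in geofabrik.py), the following builds the URL of a region's latest extract on the Geofabrik download server and fetches it with the standard library. The function names and the region path argument are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

BASE_URL = "https://download.geofabrik.de"


def extract_url(region_path: str) -> str:
    """Build the URL of the latest .osm.pbf extract for a path like 'africa/kenya'."""
    path = quote(region_path.strip("/"))
    return f"{BASE_URL}/{path}-latest.osm.pbf"


def download_extract(region_path: str, outfile: str) -> str:
    """Download the extract to outfile and return the local file name."""
    urlretrieve(extract_url(region_path), outfile)
    return outfile


print(extract_url("africa/kenya"))
```

Calling download_extract("africa/kenya", "kenya-latest.osm.pbf") would then save the file locally; extracts can be large, so this is left out of the example run.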

config.py

This class parses a config file in either JSON or YAML format, and creates a data structure used later to generate the database query.

postgres.py

This class handles working with the postgres database. It sets up the connections, and handles processing the results from the queries.
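The result-processing step can be illustrated with a small helper that works on any Python DB-API connection. This is a sketch and not the class in postgres.py: the fetch_rows name is hypothetical, and sqlite3 stands in for postgres here only so the example runs without a database server. With postgres, a driver connection such as one from psycopg2 would be passed instead, using %s placeholders rather than ?.

```python
import sqlite3


def fetch_rows(connection, sql, params=()):
    """Run a query and return each row as a dict keyed by column name."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql, params)
        names = [column[0] for column in cursor.description]
        return [dict(zip(names, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


# In-memory stand-in table; the tags value is stored as plain JSON text here.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE nodes (osm_id INTEGER, tags TEXT)")
connection.execute("INSERT INTO nodes VALUES (?, ?)", (1, '{"amenity": "school"}'))

rows = fetch_rows(connection, "SELECT osm_id, tags FROM nodes WHERE osm_id = ?", (1,))
print(rows)
```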
