A simple package designed to collect the edit histories of Wikipedia pages
Project description
Wikipedia Histories
A simple tool to pull the complete edit history of a Wikipedia page in a variety of formats: as JSON, as a DataFrame, or directly as a list of revision objects.
>>> import wikipedia_histories as wh
# Generate a list of revisions for a specified page
>>> golden_swallow = wh.get_history('Golden swallow')
# Show the revision IDs for every edit
>>> golden_swallow
# [130805848, 162259515, 167233740, 195388442, ...
# Show the user who made a specific edit
>>> golden_swallow[16].user
# u'Snowmanradio'
# Show the text of the page at the time of a specific edit
>>> golden_swallow[16].text
# u'The Golden Swallow (Tachycineta euchrysea) is a swallow. The Golden Swallow formerly'...
>>> golden_swallow[200].text
# u'The golden swallow (Tachycineta euchrysea) is a passerine in the swallow family'...
# Generate a DataFrame with text and metadata from the list of revisions
>>> wh.build_df(golden_swallow)
# Generate JSON with text and metadata from the list of revisions
>>> wh.build_json(golden_swallow)
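Because each revision exposes `.user` and `.text`, as in the session above, the list can be summarized with ordinary Python. The following is a minimal sketch and not part of the package: `Revision` is a hypothetical stand-in for the objects returned by `wh.get_history`, the sample data is made up, and `edits_per_user` and `length_changes` are helper names invented for this illustration.

```python
from collections import Counter
from typing import List, NamedTuple


class Revision(NamedTuple):
    """Hypothetical stand-in exposing only the .user and .text attributes."""
    user: str
    text: str


def edits_per_user(revisions) -> Counter:
    """Count how many revisions each user made."""
    return Counter(rev.user for rev in revisions)


def length_changes(revisions) -> List[int]:
    """Change in text length (in characters) between consecutive revisions."""
    lengths = [len(rev.text) for rev in revisions]
    return [later - earlier for earlier, later in zip(lengths, lengths[1:])]


# Made-up placeholder data, not real Wikipedia revisions
sample = [
    Revision("Alice", "first draft"),
    Revision("Bob", "first draft, expanded"),
    Revision("Alice", "first draft"),
]

print(edits_per_user(sample).most_common(1))
print(length_changes(sample))
```

With real data, the list returned by `wh.get_history` would presumably be passed in place of `sample`, provided its items carry the same two attributes.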
Installation
To install Wikipedia Histories, simply run:
$ pip install wikipedia-histories
Wikipedia Histories is compatible with Python 3.6+.
Hashes for wikipedia_histories-0.0.10.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 11a5e6b6f06f6c28bb003d74d9baa6befbce42593e26a43d0a52b4db6f2e7741
MD5 | 5d906f229339a763f77b2455c4aec2e9
BLAKE2b-256 | 99aa02b1f89378dded5660cd7ba4935f87b27b3acb9a2ca9cfe99c09b04168b4
Hashes for wikipedia_histories-0.0.10-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0e7fa4c428285c99ecc0a7dd3bfe582d1b4df088836cfb786b2c7be0577919d7
MD5 | 11ca20939e97ee6623dcf0db447e09d8
BLAKE2b-256 | 2bb4f1de867dd267f2f5242234c851bed6d2996d4e7abc630c6b082f83e91e14
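The digests listed above can be used to check a manually downloaded file. The sketch below uses only the standard library's `hashlib`. The helper names `sha256_of_file` and `matches_published_hash` are invented for this illustration, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_hash` with the path to the downloaded `wikipedia_histories-0.0.10.tar.gz` and the SHA256 value from its table should return `True` if the file is intact.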