Unshorten the URLs in your Twitter archive
twitter-archive-unshorten
Twitter's archive download includes shortened t.co URLs instead of the original URLs that you tweeted. If Twitter ever goes away, the server at t.co won't be available to respond to requests.
twitter-archive-unshorten is a small Python program that will examine all the JavaScript files in the archive download and rewrite the t.co short URLs to their original full URL form. This means the context for your archived tweets will make a little more sense after Twitter is gone. Maybe you can look up those URLs in the Internet Archive if they are no longer available. This would be impossible if all you had was the short URL.
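To make the rewriting step concrete, here is a minimal sketch of the idea, not the program's actual code. The regular expression, the function names `resolve_with_http` and `rewrite_short_urls`, and the in-memory cache are all assumptions made for illustration. The resolver is passed in as a parameter so the rewriting logic can be tried without touching the network.

```python
import re
import urllib.request
from typing import Callable, Dict

# Assumed shape of a t.co short link; a real archive may contain variants.
TCO_PATTERN = re.compile(r"https?://t\.co/[A-Za-z0-9]+")


def resolve_with_http(short_url: str) -> str:
    """Hypothetical resolver: follow redirects and return the final URL."""
    request = urllib.request.Request(short_url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.geturl()


def rewrite_short_urls(
    text: str, resolve: Callable[[str], str], cache: Dict[str, str]
) -> str:
    """Replace each t.co URL in text with the URL that resolve() returns.

    Results are stored in cache so a short URL is resolved only once.
    """

    def substitute(match: "re.Match") -> str:
        short = match.group(0)
        if short not in cache:
            cache[short] = resolve(short)
        return cache[short]

    return TCO_PATTERN.sub(substitute, text)


if __name__ == "__main__":
    # A fake resolver stands in for the network in this demonstration.
    fake = {"https://t.co/abc123": "https://example.com/article"}
    seen: Dict[str, str] = {}
    print(rewrite_short_urls('"url": "https://t.co/abc123"', fake.__getitem__, seen))
```

Passing `resolve_with_http` instead of the fake lookup would contact t.co for each new short URL, which is why a cache is worth having when the same link appears in many tweets.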
Install
$ pip3 install twitter-archive-unshorten
Run
- Unzip your Twitter archive zip file.
- Open a terminal window and run:
twitter-archive-unshorten /path/to/your/archive/directory/
It might take a while, depending on how many tweets you have. Once it's finished you should be able to open your archive and interact with it without the t.co URLs.
The mapping of short URLs to long URLs that was used is saved in your archive directory as data/shorturls.json.
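If you want to use the saved mapping yourself, something like the following sketch may help. It assumes that data/shorturls.json holds a single JSON object whose keys are short URLs and whose values are long URLs; that layout is not spelled out above, so check your own file first. The helper names `load_url_map` and `wayback_url` are made up for this example, and the Wayback Machine address form is an assumption about how the Internet Archive serves its most recent capture.

```python
import json
from pathlib import Path
from typing import Dict


def load_url_map(archive_dir: str) -> Dict[str, str]:
    """Read data/shorturls.json from an unzipped archive directory.

    Assumes the file is one JSON object of short URL -> long URL.
    """
    path = Path(archive_dir) / "data" / "shorturls.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def wayback_url(long_url: str) -> str:
    """Build an Internet Archive address for a long URL (assumed URL form)."""
    return "https://web.archive.org/web/" + long_url


if __name__ == "__main__":
    # Replace this path with your own archive directory.
    url_map = load_url_map("/path/to/your/archive/directory/")
    for short, long_url in url_map.items():
        print(short, "->", wayback_url(long_url))
```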
Test
If you'd like to develop further you can run the existing tests:
$ pip3 install pytest
$ pytest test.py
Hashes for twitter_archive_unshorten-0.0.10.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | 48aebf116b1e8ff57f584d286caa7af38ad33e51ae82a2158588b67f3a807023 |
MD5 | e1e9b85811699ea8ecf80726c1a7fb4f |
BLAKE2b-256 | 03ae98dd4d1f2ea99a0c9c9ce3b923ed9fdcd8ca6fb86b0dff6f3c5a5ab4f6a3 |
Hashes for twitter_archive_unshorten-0.0.10-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 15d9ed41a76cf60b6a2df67f62a3984d3934da56594b341e0433168c45e0caf3 |
MD5 | cd685026601b15081cfdb2fa90fad6b2 |
BLAKE2b-256 | 687d6c4cd1d069d8ca1417713f3cbac4ec748137b15a4314af4c6b6ab00ea156 |