Unshorten the URLs in your Twitter archive

twitter-archive-unshorten

Twitter's archive download includes shortened t.co URLs instead of the original URLs that you tweeted. If Twitter ever goes away, the server at t.co won't be available to respond to requests, and the short URLs will no longer resolve to anything.

twitter-archive-unshorten is a small Python program that will examine all the JavaScript files in the archive download and rewrite the t.co short URLs to their original full URL form. This means the context for your archived tweets will make a little more sense after Twitter is gone. Maybe you can look up those URLs in the Internet Archive if they are no longer available. This would be impossible if all you had was the short URL.
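The rewriting step described above can be sketched roughly as follows. This is a minimal illustration, not the program's actual source: the regular expression and the function names `find_short_urls` and `rewrite_short_urls` are assumptions made for the example, and the mapping from short to long URLs is taken as given rather than fetched from t.co.

```python
import re

# Assumed pattern for t.co short links; the real tool may match them differently.
TCO_PATTERN = re.compile(r"https?://t\.co/[A-Za-z0-9]+")


def find_short_urls(text):
    """Return the distinct t.co URLs in text, in order of first appearance."""
    seen = {}
    for match in TCO_PATTERN.finditer(text):
        seen.setdefault(match.group(0), None)
    return list(seen)


def rewrite_short_urls(text, mapping):
    """Replace each t.co URL that has a known long form; leave the rest as-is."""
    return TCO_PATTERN.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)
```

Short URLs that are missing from the mapping are left untouched, so a failed lookup never destroys the original link.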

Install

$ pip3 install twitter-archive-unshorten

Run

  1. Unzip your Twitter archive zip file.
  2. Open a terminal window and run: twitter-archive-unshorten /path/to/your/archive/directory/
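
If you would rather script the two steps above, something like the following could work. It is only a sketch: the helper names `extract_archive` and `unshorten_archive` are made up for this example, and it assumes the `twitter-archive-unshorten` command is already installed and on your PATH.

```python
import subprocess
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Unzip the Twitter archive into dest_dir and return that directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


def unshorten_archive(zip_path, dest_dir):
    """Extract the archive, then run the installed command on the result."""
    dest = extract_archive(zip_path, dest_dir)
    # Same command as step 2; check=True raises if it exits with an error.
    subprocess.run(["twitter-archive-unshorten", str(dest)], check=True)
    return dest
```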

It might take a while, depending on how many tweets you have. Once it's finished you should be able to open your archive and interact with it without the t.co URLs.

The mapping of short URLs to long URLs that was used is saved in your archive directory as data/shorturls.json.
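
The saved mapping can be handy on its own, for example to check where a short URL from an old backup pointed. The sketch below assumes data/shorturls.json is a flat JSON object whose keys are short URLs and whose values are long URLs; that layout is a guess based on the description above, so inspect the file before relying on it. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_short_urls(archive_dir):
    """Load data/shorturls.json from the archive directory as a dict."""
    path = Path(archive_dir) / "data" / "shorturls.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def lookup(mapping, short_url):
    """Return the long URL for short_url, or None if it was never resolved."""
    # Assumes the flat {short: long} layout described above.
    return mapping.get(short_url)
```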

Test

If you'd like to develop further you can run the existing tests:

$ pip3 install pytest
$ pytest test.py

Download files

Source Distribution

twitter_archive_unshorten-0.0.10.tar.gz (4.5 kB)

Uploaded Source

Built Distribution

twitter_archive_unshorten-0.0.10-py3-none-any.whl (5.2 kB)

Uploaded Python 3

File details

Details for the file twitter_archive_unshorten-0.0.10.tar.gz.

File metadata

  • Download URL: twitter_archive_unshorten-0.0.10.tar.gz
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.10.1

File hashes

Hashes for twitter_archive_unshorten-0.0.10.tar.gz
Algorithm    Hash digest
SHA256       48aebf116b1e8ff57f584d286caa7af38ad33e51ae82a2158588b67f3a807023
MD5          e1e9b85811699ea8ecf80726c1a7fb4f
BLAKE2b-256  03ae98dd4d1f2ea99a0c9c9ce3b923ed9fdcd8ca6fb86b0dff6f3c5a5ab4f6a3
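
To check a downloaded file against the SHA256 digest listed for it, you can compute the digest locally with Python's standard hashlib module. This is a small sketch; the function name `sha256_of_file` and the chunk size are arbitrary choices for the example.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string with the SHA256 value published for the file you downloaded; if they differ, the download is corrupt or is not the file that was published.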

File details

Details for the file twitter_archive_unshorten-0.0.10-py3-none-any.whl.

File metadata

  • Download URL: twitter_archive_unshorten-0.0.10-py3-none-any.whl
  • Size: 5.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.10.1

File hashes

Hashes for twitter_archive_unshorten-0.0.10-py3-none-any.whl
Algorithm    Hash digest
SHA256       15d9ed41a76cf60b6a2df67f62a3984d3934da56594b341e0433168c45e0caf3
MD5          cd685026601b15081cfdb2fa90fad6b2
BLAKE2b-256  687d6c4cd1d069d8ca1417713f3cbac4ec748137b15a4314af4c6b6ab00ea156
