A command line tool and library to get data from the Social Media Analysis Toolkit (SMAT).

SMAT-CLI

Provides command line tools for getting data from the Social Media Analysis Toolkit (SMAT) as well as a library for interacting with SMAT from your own code.

The Social Media Analysis Toolkit is a resource that allows activists, journalists, researchers, and other social good organizations to collect information about hate, mis/disinformation, and extremism from a variety of online platforms. The folks at SMAT are providing an amazing service and deserve your support! Go to their Open Collective page to support them if you're able.

SMAT-CLI is a tool that makes getting that information from the API easy, either from your terminal or as part of your own application.

Installation

OS X & Linux:

pip install smat-cli

Though I recommend using Pipx to install it as a system tool.

pipx install smat-cli

Windows:

Coming soon!

Usage Examples

Let's say you want to collect 1000 Telegram posts mentioning Trump that were posted between Jan 6 and March 1, 2021. You can do that with the content command like this.

smat content -s telegram -l 1000 --since 2021-01-06 --until 2021-03-01 trump
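
If you would rather drive the CLI from a script, the same invocation can be assembled in Python. This is only a minimal sketch, assuming smat is installed and on your PATH. The helper names build_content_command and run_content are hypothetical and not part of SMAT-CLI, and only the flags shown in the command above are used.

```python
import subprocess


def build_content_command(term, site, limit, since, until):
    """Build the argument list for `smat content` from the flags shown above."""
    return [
        "smat", "content",
        "-s", site,
        "-l", str(limit),
        "--since", since,
        "--until", until,
        term,
    ]


def run_content(term, site, limit, since, until):
    """Run the command and return its stdout as text (assumes `smat` is on PATH)."""
    cmd = build_content_command(term, site, limit, since, until)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when a search term contains spaces.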

If you want aggregated data instead, you can use the timeseries command. The following fetches a count of posts mentioning Trump from Jan 6 to March 1, 2021, aggregated into daily buckets.

smat timeseries -s telegram -i day --since 2021-01-06 --until 2021-03-01 trump
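
To make "daily buckets" concrete, here is a small local illustration of the idea. It is not how SMAT computes its timeseries. It assumes you already have a list of ISO 8601 timestamp strings (a format chosen for this example, not something the API is confirmed to return), and count_by_day is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime


def count_by_day(timestamps):
    """Count ISO 8601 timestamps per calendar day, returned in date order."""
    counts = Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts in timestamps
    )
    return dict(sorted(counts.items()))
```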

You can also aggregate by any arbitrary key present in the data for the site. To get an idea of which keys are available, you can examine the results of a content command. Once you know the key you want to aggregate on, you can use activity to, for example, count the number of posts containing the term Trump in each Telegram channel from Jan 6 to March 1, 2021.

smat activity -s telegram -a channelusername --since 2021-01-06 --until 2021-03-01 trump
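
If you already have content results on hand, you can explore the available keys and try out an aggregation locally before calling activity. This is a rough sketch that assumes each record is a flat Python dict. The helpers available_keys and count_by_key are hypothetical, and the counting happens on your machine rather than in SMAT.

```python
from collections import Counter


def available_keys(records):
    """Return the sorted set of top-level keys seen across all records."""
    keys = set()
    for record in records:
        keys.update(record.keys())
    return sorted(keys)


def count_by_key(records, key):
    """Count records per value of `key`; records missing the key are skipped."""
    return Counter(record[key] for record in records if key in record)
```

For example, count_by_key(records, "channelusername") gives a per-channel count for whatever records you pass in.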

All of the above commands print their output to stdout as JSON, one record per line, so you can redirect the results to a file in the normal way.

smat content -s telegram -l 1000 --since 2021-01-06 --until 2021-03-01 trump >> data.ndjson
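
A file written this way can be read back one record at a time. The sketch below uses only the standard library. parse_ndjson and load_ndjson are hypothetical helper names, and data.ndjson is the file name from the command above.

```python
import json


def parse_ndjson(lines):
    """Yield one parsed object per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def load_ndjson(path):
    """Read a whole NDJSON file into a list of parsed records."""
    with open(path, encoding="utf-8") as f:
        return list(parse_ndjson(f))
```

Because parse_ndjson is a generator, you can also feed it sys.stdin to process results as they stream in from a pipe.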

You can also specify different formats for the output. For example, if you'd like the results in JSON instead of JSON Lines, you can pass --format json ahead of the subcommand. This is currently the only way to format changepoint data when using timeseries.

smat --format json content -s telegram -l 1000 --since 2021-01-06 --until 2021-03-01 trump > data.json

In addition, this package can be used in another application by importing Smat from smat_cli and using it to query the API from inside your program.

from smat_cli import Smat
api = Smat()
data = api.content(term="trump", site="telegram", ...)
for d in data:
    print(d["message"])
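
If some records might lack the field you are after, a small wrapper keeps the loop from raising KeyError. This sketch assumes the results are an iterable of dicts, as the loop above suggests. iter_messages is a hypothetical helper, not part of smat_cli.

```python
def iter_messages(records, key="message"):
    """Yield the `key` field of each record, skipping records that lack it."""
    for record in records:
        if key in record:
            yield record[key]
```

You would pass it the value returned by api.content and iterate over the result in place of the raw data.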

Development Setup

This project uses Poetry and the code is formatted with Black.

Tests can be run via Tox, which runs them against Python versions 3.7, 3.8, 3.9, and 3.10. For this to work properly, all of these Python versions must be installed. A Dockerfile is included in the test_runner directory if you need it.

poetry install
tox

Release History

  • 0.1.0
    • Initial release
  • 0.1.1
    • Updates to package info and README
  • 0.1.2
    • Updates to package info and README
    • Add formatters and JsonFormatter

Me

Daniel Hosterman – @dhosterman – daniel@danielhosterman.com

Distributed under the Unlicense license. See LICENSE for more information.

https://gitlab.com/dhosterman

Contributing

  1. Fork it (https://github.com/yourname/yourproject/fork)
  2. Create your feature branch (git checkout -b feature/fooBar)
  3. Format your code with Black
  4. Ensure there are tests for your changes and that they pass
  5. Commit your changes (git commit -am 'Add some fooBar')
  6. Push to the branch (git push origin feature/fooBar)
  7. Create a new Pull Request
