Skip to main content

Python interface to filter emails on Google Mail.

Project description

Manage your emails in Gmail with Python

Python package Coverage Status Code style: black

The pydatamail_google is a python module to automate the filtering of emails on Gmail using the Gmail API. You can either write your own python script to combine the different functions or use the JSON based input or the command line input, all three provide acccess to the same functionality and are explained in more detail below.

Configuration

The pydatamail_google stores the configuration files in the users home directory ~/.pydatamail. This folder contains:

  • config.json the JSON configuration file for JSON based input, which is explained in more detial below.
  • credentials.json the authentication credentials for the Google API, which at least requires access to Gmail and additional access to Google Drive in case you want to store your attachments on Google drive.
  • token_files the token directory is used to store the active token for accessing the APIs, these are created automatically, there should be no need for the user to modify these.

Installation

Install the package from github using pip:

pip install git+https://github.com/pyscioffice/pydatamail_google.git

Finally setup the credentials.json in your Google Apps and store it in ~/.pydatamail/credentials.json.

Python interface

Import the pydatamail_google module

from pydatamail_google import Gmail

Initialize pydatamail_google

Create a gmail object from the Gmail() class

gmail = Gmail()

For testing purposes you can use the optimal client_service_file parameter to specify the location of the authentication credentials in case they are not stored in ~/.pydatamail/credentials.json.

List Labels

List the available labels in your Gmail account:

gmail.labels

Returns a list of email labels as you defined them in your email client. This is in contrast to the Gmail API which typically returns the label IDs rather than the user defined label names.

Filter Emails

Filter a set of emails in a specific label using a predefined list of dictionaries:

gmail.filter_label_by_sender(label, filter_dict_lst)

The label can be any email label and the filter_dict_lst is a list of email filters defined as dictionary. A typical email filter list might look like this:

[{"from": "my_email@provider.com", "label": "my_special_label"},
 {"to": "spam@google.com", "label": "another_email_label"},
 {"subject": "you won", "label": "success_story"}]

At the current stage only one of the three fields from, to or subject can be validated per filter and all filters are applied as "is in" rather than an exact match.

Search for Emails

Search emails either by a specific query or optionally limit your search to a list of labels.

gmail.search_email(query_string="", label_lst=[], only_message_ids=False)

The query_string supports all the functionality the gmail search has to offer, for example you can search for emails with attachments using the query "has:attachment". In addition with the option only_message_ids the return values can be reduced to just a list of email ids, otherwise both the email ids and the thread ids are returned.

Remove Labels

As Gmail provides a set of smart labels which are accessible on the web interface but typically hidden in the mobile application many people want to remove these labels. Still this functionality is more general and can be applied to any list of labels, so be warned when using it.

gmail.remove_labels_from_emails(label_lst)

To remove the Gmail smart labels just set the label_lst to ["CATEGORY_FORUMS", "CATEGORY_UPDATES", "CATEGORY_PROMOTIONS", "CATEGORY_SOCIAL"].

Load Tasks from JSON file

This is the function for the file based interface, which is explained below in a separate section.

gmail.load_json_tasks(config_json=None)

By default the json config file is expected to be located in ~/.pydatamail/config.json.

Save attachments for a specific label

Save all attachments of emails marked with a selected label to a specific folder on Google drive. This requires Google drive authorisation to be included in the authentication credentials.

gmail.save_attachments_of_label(label, path)

The label is given by its label name rather than the google internal label ID and the path has to be a relative path starting at the root of your google drive, for example backup/emails. In this path a new subfolder is created with the name of the label.

Download messages to pandas Dataframe

For offline processing it is helpful to download messages in bulk to pandas dataframes:

gmail.download_messages_to_dataframe(message_id_lst)

The message_id_lst is a list of message ids, this can be obtained from gmail.search_email().

Get email content as dictionary

The content of the email rendered as python dictionary for further postprocessing:

gmail.get_email_dict(message_id)

The message_id can be derived from a function like gmail.search_email().

Update database

Update local database stored in ~/.pydatamail/email.db:

gmail.update_database()

Command Line interface

The command line interface is currently rather limited, it supports the following options:

  • pydatamail_google run the tasks defined in ~/.pydatamail/config.json.
  • pydatamail_google --file ~/.pydatamail/config.json run the tasks defined in a user specific task file.
  • pydatamail_google --labels list all labels of your Gmail account.
  • pydatamail_google --database update local database.

File based interface

Currently the file based interface only supports two functions:

  • remove_labels_from_emails to remove specific labels from all emails on your account.
  • filter_label_by_sender to filter emails using the filter dictionary list

Both functions are explained in more detail above in the python interface section. Below is an example configuration file which would be located at ~/.pydatamail/config.json:

{
    "database": "sqlite:////~/.pydatamail/email.db",
    "remove_labels_from_emails": 
    ["CATEGORY_FORUMS", "CATEGORY_UPDATES", "CATEGORY_PROMOTIONS", "CATEGORY_SOCIAL"], 
    "filter_label_by_sender": {
        "label": "my_other_email_provider", 
        "filter_dict_lst": [
            {"from": "my_email@provider.com", "label": "my_special_label"},
            {"to": "spam@google.com", "label": "another_email_label"},
            {"subject": "you won", "label": "success_story"}
        ]
    }
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pydatamail_google-0.0.6.tar.gz (32.6 kB view details)

Uploaded Source

Built Distribution

pydatamail_google-0.0.6-py3-none-any.whl (22.4 kB view details)

Uploaded Python 3

File details

Details for the file pydatamail_google-0.0.6.tar.gz.

File metadata

  • Download URL: pydatamail_google-0.0.6.tar.gz
  • Upload date:
  • Size: 32.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for pydatamail_google-0.0.6.tar.gz
Algorithm Hash digest
SHA256 139e3197d02743e3826c6f18bd254039fbe846b84f86625b6ac323f45d3a4078
MD5 8ea0e8d6162db2ad403ac422196da768
BLAKE2b-256 2dd4b1fab710cd568a3ad757c9417c22bbfeb52f038f6b8d444581a50b13b612

See more details on using hashes here.

File details

Details for the file pydatamail_google-0.0.6-py3-none-any.whl.

File metadata

File hashes

Hashes for pydatamail_google-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 1a6dbe31b9bf8794749b8c3c172c214ee1e9078a45b5da35596faf712226163e
MD5 4308fe9706f99e97084b59302943d11d
BLAKE2b-256 03f0dc4be1fed2b6061ee321acdabd87474db64f5c81e00823d4b232b8606751

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page