A Python package and CLI for parsing aggregate and forensic DMARC reports

Reason this release was yanked: IMAP bugs

Project description

A screenshot of DMARC summary charts in Kibana

parsedmarc is a Python module and CLI utility for parsing DMARC reports. When used with Elasticsearch and Kibana (or Splunk), it works as a self-hosted open source alternative to commercial DMARC report processing services such as Agari Brand Protection, Dmarcian, OnDMARC, Proofpoint Email Fraud Defense, and Valimail.
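
When used as a library, individual report files can also be parsed directly from Python. The snippet below is a minimal sketch: the function name parse_report_file and the "report_type"/"report" keys of the returned dictionary are assumptions about the package's public API and may differ between versions.

import json

import parsedmarc

# Parse a single aggregate or forensic report (XML, .eml, or .mbox file).
result = parsedmarc.parse_report_file("dmarc-aggregate-report.xml")

# The parser is assumed to return a dict containing the detected report
# type and the normalized report data.
print(result["report_type"])
print(json.dumps(result["report"], indent=2))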

Features

  • Parses draft and 1.0 standard aggregate/rua reports

  • Parses forensic/failure/ruf reports

  • Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API

  • Transparently handles gzip or zip compressed reports

  • Consistent data structures

  • Simple JSON and/or CSV output

  • Optionally email the results

  • Optionally send the results to Elasticsearch and/or Splunk, for use with premade dashboards

  • Optionally send reports to Apache Kafka

Resources

DMARC guides

SPF and DMARC record validation

If you are looking for SPF and DMARC record validation and parsing, check out the sister project, checkdmarc.
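
As an illustration, checkdmarc can be run against one or more domains from the command line (the domain below is a placeholder; see the checkdmarc documentation for its full option list):

checkdmarc example.com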

Lookalike domains

DMARC protects against domain spoofing, not lookalike domains. For open source lookalike domain monitoring, check out DomainAware.

CLI help

usage: parsedmarc [-h] [-c CONFIG_FILE] [--strip-attachment-payloads] [-o OUTPUT]
                  [--aggregate-json-filename AGGREGATE_JSON_FILENAME]
                  [--forensic-json-filename FORENSIC_JSON_FILENAME]
                  [--aggregate-csv-filename AGGREGATE_CSV_FILENAME]
                  [--forensic-csv-filename FORENSIC_CSV_FILENAME]
                  [-n NAMESERVERS [NAMESERVERS ...]] [-t DNS_TIMEOUT] [--offline]
                  [-s] [--verbose] [--debug] [--log-file LOG_FILE] [-v]
                  [file_path ...]

Parses DMARC reports

positional arguments:
  file_path             one or more paths to aggregate or forensic report
                        files, emails, or mbox files

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config-file CONFIG_FILE
                        a path to a configuration file (--silent implied)
  --strip-attachment-payloads
                        remove attachment payloads from forensic report output
  -o OUTPUT, --output OUTPUT
                        write output files to the given directory
  --aggregate-json-filename AGGREGATE_JSON_FILENAME
                        filename for the aggregate JSON output file
  --forensic-json-filename FORENSIC_JSON_FILENAME
                        filename for the forensic JSON output file
  --aggregate-csv-filename AGGREGATE_CSV_FILENAME
                        filename for the aggregate CSV output file
  --forensic-csv-filename FORENSIC_CSV_FILENAME
                        filename for the forensic CSV output file
  -n NAMESERVERS [NAMESERVERS ...], --nameservers NAMESERVERS [NAMESERVERS ...]
                        nameservers to query
  -t DNS_TIMEOUT, --dns_timeout DNS_TIMEOUT
                        number of seconds to wait for an answer from DNS
                        (default: 2.0)
  --offline             do not make online queries for geolocation or DNS
  -s, --silent          only print errors and warnings
  --verbose             more verbose output
  --debug               print debugging information
  --log-file LOG_FILE   output logging to a file
  -v, --version         show program's version number and exit
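
For example, the following invocation parses a set of saved report emails and writes JSON and CSV output to a chosen directory (the input and output paths are placeholders):

parsedmarc reports/*.eml -o output/ --aggregate-csv-filename dmarc_aggregate.csv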

Configuration file

parsedmarc can be configured by supplying the path to an INI file:

parsedmarc -c /etc/parsedmarc.ini

For example:

# This is an example comment

[general]
save_aggregate = True
save_forensic = True

[imap]
host = imap.example.com
user = dmarcreports@example.com
password = $uperSecure

[mailbox]
watch = True
delete = False

[elasticsearch]
hosts = 127.0.0.1:9200
ssl = False

[splunk_hec]
url = https://splunkhec.example.com
token = HECTokenGoesHere
index = email

[s3]
bucket = my-bucket
path = parsedmarc

[syslog]
server = localhost
port = 514

[gmail_api]
credentials_file = /path/to/credentials.json # Get this file from console.google.com. See https://developers.google.com/identity/protocols/oauth2
token_file = /path/to/token.json             # This file will be generated automatically
scopes = https://mail.google.com/
include_spam_trash=True

The full set of configuration options is listed below. A further example configuration using the msgraph, elasticsearch, and kafka sections follows the list.

  • general
    • save_aggregate - bool: Save aggregate report data to Elasticsearch, Splunk and/or S3

    • save_forensic - bool: Save forensic report data to Elasticsearch, Splunk and/or S3

    • strip_attachment_payloads - bool: Remove attachment payloads from results

    • output - str: Directory to place JSON and CSV files in

    • aggregate_json_filename - str: filename for the aggregate JSON output file

    • forensic_json_filename - str: filename for the forensic JSON output file

    • ip_db_path - str: An optional custom path to an MMDB file from MaxMind or DBIP

    • offline - bool: Do not use online queries for geolocation or DNS

    • nameservers - str: A comma separated list of DNS resolvers (Default: Cloudflare’s public resolvers)

    • dns_timeout - float: DNS timeout period

    • debug - bool: Print debugging messages

    • silent - bool: Only print errors (Default: True)

    • log_file - str: Write log messages to a file at this path

    • n_procs - int: Number of processes to run in parallel when parsing in CLI mode (Default: 1)

    • chunk_size - int: Number of files to give to each process when running in parallel.

  • mailbox
    • reports_folder - str: The mailbox folder (or label for Gmail) where the incoming reports can be found (Default: INBOX)

    • archive_folder - str: The mailbox folder (or label for Gmail) to sort processed emails into (Default: Archive)

    • watch - bool: Use the IMAP IDLE command to process messages as they arrive or poll MS Graph for new messages

    • delete - bool: Delete messages after processing them, instead of archiving them

    • test - bool: Do not move or delete messages

    • batch_size - int: Number of messages to read and process before saving. Defaults to all messages if not set.

  • imap
    • host - str: The IMAP server hostname or IP address

    • port - int: The IMAP server port (Default: 993).

    • ssl - bool: Use an encrypted SSL/TLS connection (Default: True)

    • skip_certificate_verification - bool: Skip certificate verification (not recommended)

    • user - str: The IMAP user

    • password - str: The IMAP password

  • msgraph
    • user - str: The M365 user

    • password - str: The user password

    • client_id - str: The app registration’s client ID

    • client_secret - str: The app registration’s secret

    • mailbox - str: The mailbox name. This defaults to the user that is logged in, but could be a shared mailbox if the user has access to the mailbox

  • elasticsearch
    • hosts - str: A comma separated list of hostnames and ports or URLs (e.g. 127.0.0.1:9200 or https://user:secret@localhost)

    • ssl - bool: Use an encrypted SSL/TLS connection (Default: True)

    • cert_path - str: Path to trusted certificates

    • index_suffix - str: A suffix to apply to the index names

    • monthly_indexes - bool: Use monthly indexes instead of daily indexes

    • number_of_shards - int: The number of shards to use when creating the index (Default: 1)

    • number_of_replicas - int: The number of replicas to use when creating the index (Default: 1)

  • splunk_hec
    • url - str: The URL of the Splunk HTTP Events Collector (HEC)

    • token - str: The HEC token

    • index - str: The Splunk index to use

    • skip_certificate_verification - bool: Skip certificate verification (not recommended)

  • kafka
    • hosts - str: A comma separated list of Kafka hosts

    • user - str: The Kafka user

    • password - str: The Kafka password

    • ssl - bool: Use an encrypted SSL/TLS connection (Default: True)

    • skip_certificate_verification - bool: Skip certificate verification (not recommended)

    • aggregate_topic - str: The Kafka topic for aggregate reports

    • forensic_topic - str: The Kafka topic for forensic reports

  • smtp
    • host - str: The SMTP hostname

    • port - int: The SMTP port (Default: 25)

    • ssl - bool: Require SSL/TLS instead of using STARTTLS

    • skip_certificate_verification - bool: Skip certificate verification (not recommended)

    • user - str: The SMTP username

    • password - str: The SMTP password

    • from - str: The From header to use in the email

    • to - list: A list of email addresses to send to

    • subject - str: The Subject header to use in the email (Default: parsedmarc report)

    • attachment - str: The ZIP attachment filename

    • message - str: The email message (Default: Please see the attached parsedmarc report.)

  • s3
    • bucket - str: The S3 bucket name

    • path - str: The path to upload reports to (Default: /)

  • syslog
    • server - str: The Syslog server name or IP address

    • port - int: The UDP port to use (Default: 514)

  • gmail_api
    • credentials_file - str: Path to the file containing the credentials, None to disable (Default: None)

    • token_file - str: Path to save the token file (Default: .token)

    • include_spam_trash - bool: Include messages in Spam and Trash when searching for reports (Default: False)

    • scopes - str: Comma separated list of scopes to use when acquiring credentials (Default: https://www.googleapis.com/auth/gmail.modify)
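
As an illustration, a configuration that reads reports from a Microsoft 365 mailbox over Microsoft Graph and saves the results to Elasticsearch and Kafka might look like the following. All hostnames, IDs, and secrets are placeholders, and only options documented above are used; adjust them for your environment.

[general]
save_aggregate = True
save_forensic = True

[msgraph]
client_id = 00000000-0000-0000-0000-000000000000
client_secret = ClientSecretGoesHere
user = dmarc@example.com
password = $uperSecure
mailbox = dmarcreports@example.com

[mailbox]
watch = True
archive_folder = Archive

[elasticsearch]
hosts = https://user:secret@localhost:9200
ssl = True
monthly_indexes = True

[kafka]
hosts = kafka-1.example.com:9092,kafka-2.example.com:9092
ssl = True
user = parsedmarc
password = KafkaPasswordGoesHere
aggregate_topic = dmarc_aggregate
forensic_topic = dmarc_forensic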

Sample aggregate report output

Here are the results from parsing the example report from the dmarc.org wiki. It’s actually an older draft of the 1.0 report schema standardized in RFC 7489 Appendix C. This draft schema is still in wide use.

parsedmarc produces consistent, normalized output, regardless of the report schema.
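
When using the Python API, a parsed aggregate report can be flattened into the same layout as the CSV sample below. The helper name parsed_aggregate_reports_to_csv and the input it accepts are assumptions about the package's public API:

import parsedmarc

result = parsedmarc.parse_report_file("dmarc-aggregate-report.xml")

# Assumed to accept a single parsed aggregate report (or a list of them)
# and return CSV text with the columns shown in the sample below.
csv_text = parsedmarc.parsed_aggregate_reports_to_csv(result["report"])
print(csv_text)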

JSON

{
  "xml_schema": "draft",
  "report_metadata": {
    "org_name": "acme.com",
    "org_email": "noreply-dmarc-support@acme.com",
    "org_extra_contact_info": "http://acme.com/dmarc/support",
    "report_id": "9391651994964116463",
    "begin_date": "2012-04-27 20:00:00",
    "end_date": "2012-04-28 19:59:59",
    "errors": []
  },
  "policy_published": {
    "domain": "example.com",
    "adkim": "r",
    "aspf": "r",
    "p": "none",
    "sp": "none",
    "pct": "100",
    "fo": "0"
  },
  "records": [
    {
      "source": {
        "ip_address": "72.150.241.94",
        "country": "US",
        "reverse_dns": "adsl-72-150-241-94.shv.bellsouth.net",
        "base_domain": "bellsouth.net"
      },
      "count": 2,
      "alignment": {
        "spf": true,
        "dkim": false,
        "dmarc": true
      },
      "policy_evaluated": {
        "disposition": "none",
        "dkim": "fail",
        "spf": "pass",
        "policy_override_reasons": []
      },
      "identifiers": {
        "header_from": "example.com",
        "envelope_from": "example.com",
        "envelope_to": null
      },
      "auth_results": {
        "dkim": [
          {
            "domain": "example.com",
            "selector": "none",
            "result": "fail"
          }
        ],
        "spf": [
          {
            "domain": "example.com",
            "scope": "mfrom",
            "result": "pass"
          }
        ]
      }
    }
  ]
}

CSV

xml_schema,org_name,org_email,org_extra_contact_info,report_id,begin_date,end_date,errors,domain,adkim,aspf,p,sp,pct,fo,source_ip_address,source_country,source_reverse_dns,source_base_domain,count,spf_aligned,dkim_aligned,dmarc_aligned,disposition,policy_override_reasons,policy_override_comments,envelope_from,header_from,envelope_to,dkim_domains,dkim_selectors,dkim_results,spf_domains,spf_scopes,spf_results
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-27 20:00:00,2012-04-28 19:59:59,,example.com,r,r,none,none,100,0,72.150.241.94,US,adsl-72-150-241-94.shv.bellsouth.net,bellsouth.net,2,True,False,True,none,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass

Sample forensic report output

Thanks to GitHub user xennn for the anonymized forensic report email sample.

JSON

{
     "feedback_type": "auth-failure",
     "user_agent": "Lua/1.0",
     "version": "1.0",
     "original_mail_from": "sharepoint@domain.de",
     "original_rcpt_to": "peter.pan@domain.de",
     "arrival_date": "Mon, 01 Oct 2018 11:20:27 +0200",
     "message_id": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
     "authentication_results": "dmarc=fail (p=none, dis=none) header.from=domain.de",
     "delivery_result": "policy",
     "auth_failure": [
       "dmarc"
     ],
     "reported_domain": "domain.de",
     "arrival_date_utc": "2018-10-01 09:20:27",
     "source": {
       "ip_address": "10.10.10.10",
       "country": null,
       "reverse_dns": null,
       "base_domain": null
     },
     "authentication_mechanisms": [],
     "original_envelope_id": null,
     "dkim_domain": null,
     "sample_headers_only": false,
     "sample": "Received: from Servernameone.domain.local (Servernameone.domain.local [10.10.10.10])\n\tby  mailrelay.de (mail.DOMAIN.de) with SMTP id 38.E7.30937.BD6E1BB5; Mon,  1 Oct 2018 11:20:27 +0200 (CEST)\nDate: 01 Oct 2018 11:20:27 +0200\nMessage-ID: <38.E7.30937.BD6E1BB5@ mailrelay.de>\nTo: <peter.pan@domain.de>\nfrom: \"=?utf-8?B?SW50ZXJha3RpdmUgV2V0dGJld2VyYmVyLcOcYmVyc2ljaHQ=?=\" <sharepoint@domain.de>\nSubject: Subject\nMIME-Version: 1.0\nX-Mailer: Microsoft SharePoint Foundation 2010\nContent-Type: text/html; charset=utf-8\nContent-Transfer-Encoding: quoted-printable\n\n<html><head><base href=3D'\nwettbewerb' /></head><body><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\"=\n><HTML><HEAD><META NAME=3D\"Generator\" CONTENT=3D\"MS Exchange Server version=\n 08.01.0240.003\"></html>\n",
     "parsed_sample": {
       "from": {
         "display_name": "Interaktive Wettbewerber-Übersicht",
         "address": "sharepoint@domain.de",
         "local": "sharepoint",
         "domain": "domain.de"
       },
       "to_domains": [
         "domain.de"
       ],
       "to": [
         {
           "display_name": null,
           "address": "peter.pan@domain.de",
           "local": "peter.pan",
           "domain": "domain.de"
         }
       ],
       "subject": "Subject",
       "timezone": "+2",
       "mime-version": "1.0",
       "date": "2018-10-01 09:20:27",
       "content-type": "text/html; charset=utf-8",
       "x-mailer": "Microsoft SharePoint Foundation 2010",
       "body": "<html><head><base href='\nwettbewerb' /></head><body><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\"><HTML><HEAD><META NAME=\"Generator\" CONTENT=\"MS Exchange Server version 08.01.0240.003\"></html>",
       "received": [
         {
           "from": "Servernameone.domain.local Servernameone.domain.local 10.10.10.10",
           "by": "mailrelay.de mail.DOMAIN.de",
           "with": "SMTP id 38.E7.30937.BD6E1BB5",
           "date": "Mon, 1 Oct 2018 11:20:27 +0200 CEST",
           "hop": 1,
           "date_utc": "2018-10-01 09:20:27",
           "delay": 0
         }
       ],
       "content-transfer-encoding": "quoted-printable",
       "message-id": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
       "has_defects": false,
       "headers": {
         "Received": "from Servernameone.domain.local (Servernameone.domain.local [10.10.10.10])\n\tby  mailrelay.de (mail.DOMAIN.de) with SMTP id 38.E7.30937.BD6E1BB5; Mon,  1 Oct 2018 11:20:27 +0200 (CEST)",
         "Date": "01 Oct 2018 11:20:27 +0200",
         "Message-ID": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
         "To": "<peter.pan@domain.de>",
         "from": "\"Interaktive Wettbewerber-Übersicht\" <sharepoint@domain.de>",
         "Subject": "Subject",
         "MIME-Version": "1.0",
         "X-Mailer": "Microsoft SharePoint Foundation 2010",
         "Content-Type": "text/html; charset=utf-8",
         "Content-Transfer-Encoding": "quoted-printable"
       },
       "reply_to": [],
       "cc": [],
       "bcc": [],
       "attachments": [],
       "filename_safe_subject": "Subject"
     }
   }

CSV

feedback_type,user_agent,version,original_envelope_id,original_mail_from,original_rcpt_to,arrival_date,arrival_date_utc,subject,message_id,authentication_results,dkim_domain,source_ip_address,source_country,source_reverse_dns,source_base_domain,delivery_result,auth_failure,reported_domain,authentication_mechanisms,sample_headers_only
auth-failure,Lua/1.0,1.0,,sharepoint@domain.de,peter.pan@domain.de,"Mon, 01 Oct 2018 11:20:27 +0200",2018-10-01 09:20:27,Subject,<38.E7.30937.BD6E1BB5@ mailrelay.de>,"dmarc=fail (p=none, dis=none) header.from=domain.de",,10.10.10.10,,,,policy,dmarc,domain.de,,False

Bug reports

Please report bugs on the GitHub issue tracker:

https://github.com/domainaware/parsedmarc/issues

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
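
Most installations come from PyPI via pip rather than a manual download. Because this release was yanked, pip will only select it when the version is pinned exactly:

pip install parsedmarc==8.0.2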

Source Distribution

parsedmarc-8.0.2.tar.gz (3.6 MB)

Built Distribution

parsedmarc-8.0.2-py3-none-any.whl (3.7 MB)

File details

Details for the file parsedmarc-8.0.2.tar.gz.

File metadata

  • Download URL: parsedmarc-8.0.2.tar.gz
  • Upload date:
  • Size: 3.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.8.2 requests/2.27.1 setuptools/44.1.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/2.7.18

File hashes

Hashes for parsedmarc-8.0.2.tar.gz

  • SHA256: b64797e6c3bc1c29d9d50f2210c01dee8b9bd68776ffd6ed46f41f306869468a
  • MD5: d2f39c85d74a0183f738ee3e31a24bed
  • BLAKE2b-256: a822f5580947ae1e3215062aa525b7d9aa39b8ff0c0502821a05ffa17a798824

File details

Details for the file parsedmarc-8.0.2-py3-none-any.whl.

File metadata

  • Download URL: parsedmarc-8.0.2-py3-none-any.whl
  • Upload date:
  • Size: 3.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.8.2 requests/2.27.1 setuptools/44.1.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/2.7.18

File hashes

Hashes for parsedmarc-8.0.2-py3-none-any.whl

  • SHA256: 2d9af731eefc753783e101ad20ff012c826830332a2414097a68a465bc3765f5
  • MD5: 6c1d78d31d5ffc847365e912f91daef9
  • BLAKE2b-256: 52db5cc35326b67d2a4f58570a3889a80492d3ba332fd880b2cff378355579d3
