A Python package and CLI for parsing aggregate and forensic DMARC reports
Reason this release was yanked:
IMAP bugs
Project description
parsedmarc is a Python module and CLI utility for parsing DMARC reports. When used with Elasticsearch and Kibana (or Splunk), it works as a self-hosted open source alternative to commercial DMARC report processing services such as Agari Brand Protection, Dmarcian, OnDMARC, ProofPoint Email Fraud Defense, and Valimail.
Features
Parses draft and 1.0 standard aggregate/rua reports
Parses forensic/failure/ruf reports
Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API
Transparently handles gzip or zip compressed reports
Consistent data structures
Simple JSON and/or CSV output
Optionally email the results
Optionally send the results to Elasticsearch and/or Splunk, for use with premade dashboards
Optionally send reports to Apache Kafka
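As an illustration of the transparent gzip/zip handling listed above, here is a minimal sketch using only the Python standard library. The function name `extract_report_bytes` and the magic-byte detection are assumptions made for this example, not parsedmarc's actual implementation.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"
ZIP_MAGIC = b"PK\x03\x04"


def extract_report_bytes(payload: bytes) -> bytes:
    """Return raw report bytes, decompressing gzip or zip input if needed."""
    if payload.startswith(GZIP_MAGIC):
        return gzip.decompress(payload)
    if payload.startswith(ZIP_MAGIC):
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            # Assumption: the archive holds a single report file.
            first_member = archive.namelist()[0]
            return archive.read(first_member)
    return payload  # treat anything else as already uncompressed


# Build the same placeholder report in three encodings and compare results.
xml = b"<feedback></feedback>"
zipped = io.BytesIO()
with zipfile.ZipFile(zipped, "w") as archive:
    archive.writestr("report.xml", xml)

for blob in (xml, gzip.compress(xml), zipped.getvalue()):
    print(extract_report_bytes(blob) == xml)
```

Checking the leading bytes rather than the file extension means a mislabeled attachment is still handled.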
Resources
DMARC guides
Demystifying DMARC - A complete guide to SPF, DKIM, and DMARC
SPF and DMARC record validation
If you are looking for SPF and DMARC record validation and parsing, check out the sister project, checkdmarc.
Lookalike domains
DMARC protects against domain spoofing, not lookalike domains. For open source lookalike domain monitoring, check out DomainAware.
CLI help
usage: parsedmarc [-h] [-c CONFIG_FILE] [--strip-attachment-payloads] [-o OUTPUT]
                  [--aggregate-json-filename AGGREGATE_JSON_FILENAME]
                  [--forensic-json-filename FORENSIC_JSON_FILENAME]
                  [--aggregate-csv-filename AGGREGATE_CSV_FILENAME]
                  [--forensic-csv-filename FORENSIC_CSV_FILENAME]
                  [-n NAMESERVERS [NAMESERVERS ...]] [-t DNS_TIMEOUT] [--offline]
                  [-s] [--verbose] [--debug] [--log-file LOG_FILE] [-v]
                  [file_path ...]

Parses DMARC reports

positional arguments:
  file_path             one or more paths to aggregate or forensic report
                        files, emails, or mbox files'

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config-file CONFIG_FILE
                        a path to a configuration file (--silent implied)
  --strip-attachment-payloads
                        remove attachment payloads from forensic report output
  -o OUTPUT, --output OUTPUT
                        write output files to the given directory
  --aggregate-json-filename AGGREGATE_JSON_FILENAME
                        filename for the aggregate JSON output file
  --forensic-json-filename FORENSIC_JSON_FILENAME
                        filename for the forensic JSON output file
  --aggregate-csv-filename AGGREGATE_CSV_FILENAME
                        filename for the aggregate CSV output file
  --forensic-csv-filename FORENSIC_CSV_FILENAME
                        filename for the forensic CSV output file
  -n NAMESERVERS [NAMESERVERS ...], --nameservers NAMESERVERS [NAMESERVERS ...]
                        nameservers to query
  -t DNS_TIMEOUT, --dns_timeout DNS_TIMEOUT
                        number of seconds to wait for an answer from DNS
                        (default: 2.0)
  --offline             do not make online queries for geolocation or DNS
  -s, --silent          only print errors and warnings
  --verbose             more verbose output
  --debug               print debugging information
  --log-file LOG_FILE   output logging to a file
  -v, --version         show program's version number and exit
Configuration file
parsedmarc can be configured by supplying the path to an INI file:
parsedmarc -c /etc/parsedmarc.ini
An example configuration file:
# This is an example comment
[general]
save_aggregate = True
save_forensic = True
[imap]
host = imap.example.com
user = dmarcreports@example.com
password = $uperSecure
[mailbox]
watch = True
delete = False
[elasticsearch]
hosts = 127.0.0.1:9200
ssl = False
[splunk_hec]
url = https://splunkhec.example.com
token = HECTokenGoesHere
index = email
[s3]
bucket = my-bucket
path = parsedmarc
[syslog]
server = localhost
port = 514
[gmail_api]
# Get this file from console.google.com. See https://developers.google.com/identity/protocols/oauth2
credentials_file = /path/to/credentials.json
# This file will be generated automatically
token_file = /path/to/token.json
scopes = https://mail.google.com/
include_spam_trash = True
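A minimal sketch of how an INI file like the example can be read with Python's standard `configparser` module. That parsedmarc loads its configuration with a default `ConfigParser` is an assumption here, and the `# trailing note` text is made up for the demonstration. The `gmail_api` entry shows that a default `ConfigParser` keeps any text after a value, including a trailing `#` note, as part of the value, so comments are safest on their own lines.

```python
from configparser import ConfigParser

EXAMPLE_INI = """
[general]
save_aggregate = True

[mailbox]
watch = True
delete = False

[gmail_api]
token_file = /path/to/token.json # trailing note
"""

# Assumption: default settings, so no inline comment prefixes are recognized.
config = ConfigParser()
config.read_string(EXAMPLE_INI)

# getboolean() accepts True/False, yes/no, on/off and 1/0 (case-insensitive).
save_aggregate = config["general"].getboolean("save_aggregate")
delete_after = config["mailbox"].getboolean("delete")

# The trailing note is NOT stripped from the value.
token_file = config["gmail_api"]["token_file"]

print(save_aggregate, delete_after)
print(repr(token_file))
```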
The full set of configuration options is:
- general
save_aggregate - bool: Save aggregate report data to Elasticsearch, Splunk and/or S3
save_forensic - bool: Save forensic report data to Elasticsearch, Splunk and/or S3
strip_attachment_payloads - bool: Remove attachment payloads from results
output - str: Directory to place JSON and CSV files in
aggregate_json_filename - str: filename for the aggregate JSON output file
forensic_json_filename - str: filename for the forensic JSON output file
ip_db_path - str: An optional custom path to a MMDB file from MaxMind or DBIP
offline - bool: Do not use online queries for geolocation or DNS
nameservers - str: A comma separated list of DNS resolvers (Default: Cloudflare’s public resolvers)
dns_timeout - float: DNS timeout period
debug - bool: Print debugging messages
silent - bool: Only print errors (Default: True)
log_file - str: Write log messages to a file at this path
n_procs - int: Number of processes to run in parallel when parsing in CLI mode (Default: 1)
chunk_size - int: Number of files to give to each process when running in parallel.
- mailbox
reports_folder - str: The mailbox folder (or label for Gmail) where the incoming reports can be found (Default: INBOX)
archive_folder - str: The mailbox folder (or label for Gmail) to sort processed emails into (Default: Archive)
watch - bool: Use the IMAP IDLE command to process messages as they arrive or poll MS Graph for new messages
delete - bool: Delete messages after processing them, instead of archiving them
test - bool: Do not move or delete messages
batch_size - int: Number of messages to read and process before saving. Defaults to all messages if not set.
- imap
host - str: The IMAP server hostname or IP address
port - int: The IMAP server port (Default: 993).
ssl - bool: Use an encrypted SSL/TLS connection (Default: True)
skip_certificate_verification - bool: Skip certificate verification (not recommended)
user - str: The IMAP user
password - str: The IMAP password
- msgraph
user - str: The M365 user
password - str: The user password
client_id - str: The app registration’s client ID
client_secret - str: The app registration’s secret
mailbox - str: The mailbox name. This defaults to the user that is logged in, but could be a shared mailbox if the user has access to the mailbox
- elasticsearch
hosts - str: A comma separated list of hostnames and ports or URLs (e.g. 127.0.0.1:9200 or https://user:secret@localhost)
ssl - bool: Use an encrypted SSL/TLS connection (Default: True)
cert_path - str: Path to a trusted certificate file
index_suffix - str: A suffix to apply to the index names
monthly_indexes - bool: Use monthly indexes instead of daily indexes
number_of_shards - int: The number of shards to use when creating the index (Default: 1)
number_of_replicas - int: The number of replicas to use when creating the index (Default: 1)
- splunk_hec
url - str: The URL of the Splunk HTTP Events Collector (HEC)
token - str: The HEC token
index - str: The Splunk index to use
skip_certificate_verification - bool: Skip certificate verification (not recommended)
- kafka
hosts - str: A comma separated list of Kafka hosts
user - str: The Kafka user
password - str: The Kafka password
ssl - bool: Use an encrypted SSL/TLS connection (Default: True)
skip_certificate_verification - bool: Skip certificate verification (not recommended)
aggregate_topic - str: The Kafka topic for aggregate reports
forensic_topic - str: The Kafka topic for forensic reports
- smtp
host - str: The SMTP hostname
port - int: The SMTP port (Default: 25)
ssl - bool: Require SSL/TLS instead of using STARTTLS
skip_certificate_verification - bool: Skip certificate verification (not recommended)
user - str: the SMTP username
password - str: the SMTP password
from - str: The From header to use in the email
to - list: A list of email addresses to send to
subject - str: The Subject header to use in the email (Default: parsedmarc report)
attachment - str: The ZIP attachment filenames
message - str: The email message (Default: Please see the attached parsedmarc report.)
- s3
bucket - str: The S3 bucket name
path - str: The path to upload reports to (Default: /)
- syslog
server - str: The Syslog server name or IP address
port - int: The UDP port to use (Default: 514)
- gmail_api
credentials_file - str: Path to file containing the credentials, None to disable (Default: None)
token_file - str: Path to save the token file (Default: .token)
include_spam_trash - bool: Include messages in Spam and Trash when searching reports (Default: False)
scopes - str: Comma separated list of scopes to use when acquiring credentials (Default: https://www.googleapis.com/auth/gmail.modify)
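The documented defaults and comma separated values can be modeled with `configparser` fallbacks. This is a sketch under assumptions: the helper `split_list` is hypothetical, and the code only mirrors the defaults listed above (IMAP port 993, SSL enabled) rather than reproducing parsedmarc's own loader.

```python
from configparser import ConfigParser
from typing import List

config = ConfigParser()
config.read_string("""
[imap]
host = imap.example.com

[elasticsearch]
hosts = 127.0.0.1:9200, https://user:secret@localhost
""")

imap = config["imap"]
# Options missing from the file fall back to the documented defaults.
imap_port = imap.getint("port", fallback=993)
imap_ssl = imap.getboolean("ssl", fallback=True)


def split_list(value: str) -> List[str]:
    """Split a comma separated option into a list of trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


es_hosts = split_list(config["elasticsearch"]["hosts"])

print(imap_port, imap_ssl)
print(es_hosts)
```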
Sample aggregate report output
Here are the results from parsing the example report from the dmarc.org wiki. It’s actually an older draft of the 1.0 report schema standardized in RFC 7489 Appendix C. This draft schema is still in wide use.
parsedmarc produces consistent, normalized output, regardless of the report schema.
JSON
{
  "xml_schema": "draft",
  "report_metadata": {
    "org_name": "acme.com",
    "org_email": "noreply-dmarc-support@acme.com",
    "org_extra_contact_info": "http://acme.com/dmarc/support",
    "report_id": "9391651994964116463",
    "begin_date": "2012-04-27 20:00:00",
    "end_date": "2012-04-28 19:59:59",
    "errors": []
  },
  "policy_published": {
    "domain": "example.com",
    "adkim": "r",
    "aspf": "r",
    "p": "none",
    "sp": "none",
    "pct": "100",
    "fo": "0"
  },
  "records": [
    {
      "source": {
        "ip_address": "72.150.241.94",
        "country": "US",
        "reverse_dns": "adsl-72-150-241-94.shv.bellsouth.net",
        "base_domain": "bellsouth.net"
      },
      "count": 2,
      "alignment": {
        "spf": true,
        "dkim": false,
        "dmarc": true
      },
      "policy_evaluated": {
        "disposition": "none",
        "dkim": "fail",
        "spf": "pass",
        "policy_override_reasons": []
      },
      "identifiers": {
        "header_from": "example.com",
        "envelope_from": "example.com",
        "envelope_to": null
      },
      "auth_results": {
        "dkim": [
          {
            "domain": "example.com",
            "selector": "none",
            "result": "fail"
          }
        ],
        "spf": [
          {
            "domain": "example.com",
            "scope": "mfrom",
            "result": "pass"
          }
        ]
      }
    }
  ]
}
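Because the structure is consistent, downstream scripts can rely on fixed keys. The sketch below tallies message counts by DMARC alignment from a trimmed copy of the report above; the function name `summarize` and the "aligned"/"unaligned" labels are choices made for this example.

```python
from collections import Counter

# Trimmed copy of the sample aggregate report, keeping only the keys used here.
report = {
    "policy_published": {"domain": "example.com", "p": "none"},
    "records": [
        {
            "source": {
                "ip_address": "72.150.241.94",
                "base_domain": "bellsouth.net",
            },
            "count": 2,
            "alignment": {"spf": True, "dkim": False, "dmarc": True},
        }
    ],
}


def summarize(report: dict) -> dict:
    """Sum message counts per DMARC alignment outcome."""
    totals = Counter()
    for record in report["records"]:
        outcome = "aligned" if record["alignment"]["dmarc"] else "unaligned"
        totals[outcome] += record["count"]
    return dict(totals)


print(summarize(report))  # {'aligned': 2}
```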
CSV
xml_schema,org_name,org_email,org_extra_contact_info,report_id,begin_date,end_date,errors,domain,adkim,aspf,p,sp,pct,fo,source_ip_address,source_country,source_reverse_dns,source_base_domain,count,spf_aligned,dkim_aligned,dmarc_aligned,disposition,policy_override_reasons,policy_override_comments,envelope_from,header_from,envelope_to,dkim_domains,dkim_selectors,dkim_results,spf_domains,spf_scopes,spf_results
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-27 20:00:00,2012-04-28 19:59:59,,example.com,r,r,none,none,100,0,72.150.241.94,US,adsl-72-150-241-94.shv.bellsouth.net,bellsouth.net,2,True,False,True,none,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass
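Every CSV field arrives as text, so counts and alignment flags need converting before analysis. A minimal sketch with the standard `csv` module, using a subset of the columns shown above; the helper name `load_rows` is hypothetical.

```python
import csv
import io

# A subset of the aggregate CSV columns, with the values from the sample row.
CSV_TEXT = (
    "source_ip_address,count,spf_aligned,dkim_aligned,dmarc_aligned\n"
    "72.150.241.94,2,True,False,True\n"
)


def load_rows(text: str) -> list:
    """Read aggregate CSV rows, converting the count and boolean columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["count"] = int(row["count"])
        for column in ("spf_aligned", "dkim_aligned", "dmarc_aligned"):
            # Booleans are written as the strings "True" and "False".
            row[column] = row[column] == "True"
        rows.append(row)
    return rows


rows = load_rows(CSV_TEXT)
print(rows[0]["source_ip_address"], rows[0]["count"], rows[0]["dkim_aligned"])
```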
Sample forensic report output
Thanks to GitHub user xennn for the anonymized forensic report email sample.
JSON
{
  "feedback_type": "auth-failure",
  "user_agent": "Lua/1.0",
  "version": "1.0",
  "original_mail_from": "sharepoint@domain.de",
  "original_rcpt_to": "peter.pan@domain.de",
  "arrival_date": "Mon, 01 Oct 2018 11:20:27 +0200",
  "message_id": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
  "authentication_results": "dmarc=fail (p=none, dis=none) header.from=domain.de",
  "delivery_result": "policy",
  "auth_failure": [
    "dmarc"
  ],
  "reported_domain": "domain.de",
  "arrival_date_utc": "2018-10-01 09:20:27",
  "source": {
    "ip_address": "10.10.10.10",
    "country": null,
    "reverse_dns": null,
    "base_domain": null
  },
  "authentication_mechanisms": [],
  "original_envelope_id": null,
  "dkim_domain": null,
  "sample_headers_only": false,
  "sample": "Received: from Servernameone.domain.local (Servernameone.domain.local [10.10.10.10])\n\tby mailrelay.de (mail.DOMAIN.de) with SMTP id 38.E7.30937.BD6E1BB5; Mon, 1 Oct 2018 11:20:27 +0200 (CEST)\nDate: 01 Oct 2018 11:20:27 +0200\nMessage-ID: <38.E7.30937.BD6E1BB5@ mailrelay.de>\nTo: <peter.pan@domain.de>\nfrom: \"=?utf-8?B?SW50ZXJha3RpdmUgV2V0dGJld2VyYmVyLcOcYmVyc2ljaHQ=?=\" <sharepoint@domain.de>\nSubject: Subject\nMIME-Version: 1.0\nX-Mailer: Microsoft SharePoint Foundation 2010\nContent-Type: text/html; charset=utf-8\nContent-Transfer-Encoding: quoted-printable\n\n<html><head><base href=3D'\nwettbewerb' /></head><body><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\"=\n><HTML><HEAD><META NAME=3D\"Generator\" CONTENT=3D\"MS Exchange Server version=\n 08.01.0240.003\"></html>\n",
  "parsed_sample": {
    "from": {
      "display_name": "Interaktive Wettbewerber-Übersicht",
      "address": "sharepoint@domain.de",
      "local": "sharepoint",
      "domain": "domain.de"
    },
    "to_domains": [
      "domain.de"
    ],
    "to": [
      {
        "display_name": null,
        "address": "peter.pan@domain.de",
        "local": "peter.pan",
        "domain": "domain.de"
      }
    ],
    "subject": "Subject",
    "timezone": "+2",
    "mime-version": "1.0",
    "date": "2018-10-01 09:20:27",
    "content-type": "text/html; charset=utf-8",
    "x-mailer": "Microsoft SharePoint Foundation 2010",
    "body": "<html><head><base href='\nwettbewerb' /></head><body><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\"><HTML><HEAD><META NAME=\"Generator\" CONTENT=\"MS Exchange Server version 08.01.0240.003\"></html>",
    "received": [
      {
        "from": "Servernameone.domain.local Servernameone.domain.local 10.10.10.10",
        "by": "mailrelay.de mail.DOMAIN.de",
        "with": "SMTP id 38.E7.30937.BD6E1BB5",
        "date": "Mon, 1 Oct 2018 11:20:27 +0200 CEST",
        "hop": 1,
        "date_utc": "2018-10-01 09:20:27",
        "delay": 0
      }
    ],
    "content-transfer-encoding": "quoted-printable",
    "message-id": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
    "has_defects": false,
    "headers": {
      "Received": "from Servernameone.domain.local (Servernameone.domain.local [10.10.10.10])\n\tby mailrelay.de (mail.DOMAIN.de) with SMTP id 38.E7.30937.BD6E1BB5; Mon, 1 Oct 2018 11:20:27 +0200 (CEST)",
      "Date": "01 Oct 2018 11:20:27 +0200",
      "Message-ID": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
      "To": "<peter.pan@domain.de>",
      "from": "\"Interaktive Wettbewerber-Übersicht\" <sharepoint@domain.de>",
      "Subject": "Subject",
      "MIME-Version": "1.0",
      "X-Mailer": "Microsoft SharePoint Foundation 2010",
      "Content-Type": "text/html; charset=utf-8",
      "Content-Transfer-Encoding": "quoted-printable"
    },
    "reply_to": [],
    "cc": [],
    "bcc": [],
    "attachments": [],
    "filename_safe_subject": "Subject"
  }
}
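Two of the normalized fields above can be reproduced with the Python standard library, which helps when checking output by hand. This sketch is not parsedmarc's own code, and the helper names are made up for the example: it converts `arrival_date` to the `arrival_date_utc` format and decodes the RFC 2047 encoded display name found in the raw sample.

```python
from datetime import timezone
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime


def to_utc_string(rfc2822_date: str) -> str:
    """Convert an email Date header value to 'YYYY-MM-DD HH:MM:SS' in UTC."""
    parsed = parsedate_to_datetime(rfc2822_date)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def decode_display_name(encoded_word: str) -> str:
    """Decode an RFC 2047 encoded word such as '=?utf-8?B?...?='."""
    return str(make_header(decode_header(encoded_word)))


print(to_utc_string("Mon, 01 Oct 2018 11:20:27 +0200"))
# 2018-10-01 09:20:27
print(decode_display_name(
    "=?utf-8?B?SW50ZXJha3RpdmUgV2V0dGJld2VyYmVyLcOcYmVyc2ljaHQ=?="
))
# Interaktive Wettbewerber-Übersicht
```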
CSV
feedback_type,user_agent,version,original_envelope_id,original_mail_from,original_rcpt_to,arrival_date,arrival_date_utc,subject,message_id,authentication_results,dkim_domain,source_ip_address,source_country,source_reverse_dns,source_base_domain,delivery_result,auth_failure,reported_domain,authentication_mechanisms,sample_headers_only
auth-failure,Lua/1.0,1.0,,sharepoint@domain.de,peter.pan@domain.de,"Mon, 01 Oct 2018 11:20:27 +0200",2018-10-01 09:20:27,Subject,<38.E7.30937.BD6E1BB5@ mailrelay.de>,"dmarc=fail (p=none, dis=none) header.from=domain.de",,10.10.10.10,,,,policy,dmarc,domain.de,,False
Bug reports
Please report bugs on the GitHub issue tracker.
Download files
Source Distribution
Built Distribution
File details
Details for the file parsedmarc-8.0.2.tar.gz.
File metadata
- Download URL: parsedmarc-8.0.2.tar.gz
- Upload date:
- Size: 3.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.15.0 pkginfo/1.8.2 requests/2.27.1 setuptools/44.1.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/2.7.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | b64797e6c3bc1c29d9d50f2210c01dee8b9bd68776ffd6ed46f41f306869468a
MD5 | d2f39c85d74a0183f738ee3e31a24bed
BLAKE2b-256 | a822f5580947ae1e3215062aa525b7d9aa39b8ff0c0502821a05ffa17a798824
File details
Details for the file parsedmarc-8.0.2-py3-none-any.whl.
File metadata
- Download URL: parsedmarc-8.0.2-py3-none-any.whl
- Upload date:
- Size: 3.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.15.0 pkginfo/1.8.2 requests/2.27.1 setuptools/44.1.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/2.7.18
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2d9af731eefc753783e101ad20ff012c826830332a2414097a68a465bc3765f5
MD5 | 6c1d78d31d5ffc847365e912f91daef9
BLAKE2b-256 | 52db5cc35326b67d2a4f58570a3889a80492d3ba332fd880b2cff378355579d3