DMARC XML aggregate reports at scale: mailbox sizing, compression handling (zip/gzip), and data retention practices


The first few DMARC aggregate reports usually feel harmless.

Then a domain starts sending more mail, more receivers participate, forwarding paths multiply, and suddenly the rua mailbox turns into an operations problem instead of a reporting checkbox.

At that point the question is not "how do we get DMARC XML at all?" It is "how do we keep receiving it without letting storage, parsing, and retention drift into a mess?"

This post is about the practical side of operating DMARC aggregate reporting at volume: mailbox sizing, handling both .zip and .gz attachments safely, and deciding how long to keep the data.

If the XML structure itself is still new, start with DMARC reporting 101 and DMARC aggregate reports explained. This article assumes the basic DMARC report format is already familiar.

Why DMARC report operations get harder as volume grows

DMARC aggregate reports are designed to scale better than per-message failure reporting, but "scalable" does not mean "operationally free."

Volume grows from a few different directions at once:

  • more participating receivers send daily summaries
  • large receivers may report separately for different source ranges or report generators
  • one domain can have many legitimate platforms, each generating its own rows and identifiers in the XML
  • forwarding, abuse, and long-tail spoofing add extra records even when the domain owner did nothing new

RFC 7489 explicitly positions DMARC as something that must work at Internet scale and defines aggregate feedback as the normal reporting mechanism for that reason. It also allows domain owners to publish destination URIs with optional maximum-size hints such as mailto:reports@example.com!50m for report delivery requests.

That is useful context, but it does not solve the mailbox engineering for you.
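For reference, the size hint in that URI can be split off mechanically. The sketch below is illustrative only: the function name is made up for this post, and treating k/m/g/t as powers of 1024 is an assumption about how the units should be read.

```python
import re

# Assumption: units are interpreted as powers of 1024.
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
_DMARC_URI = re.compile(
    r"^(?P<uri>[^!]+)(?:!(?P<count>\d+)(?P<unit>[kmgt]?))?$",
    re.IGNORECASE,
)


def split_rua_uri(value: str):
    """Split 'mailto:addr!50m' into (uri, max_bytes), or (uri, None) if no hint."""
    match = _DMARC_URI.match(value.strip())
    if match is None:
        raise ValueError(f"not a recognizable DMARC URI: {value!r}")
    if match["count"] is None:
        return match["uri"], None
    return match["uri"], int(match["count"]) * _UNITS[match["unit"].lower()]


uri, max_bytes = split_rua_uri("mailto:reports@example.com!50m")
```

The hint tells report generators what you would prefer to receive. It says nothing about how much your own mailbox can actually hold.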

Start by sizing for compressed input and expanded output separately

This is the mistake that causes the most avoidable pain.

Teams often size only for the attachment as delivered over email. What actually matters to downstream processing is the expanded XML size and the number of reports arriving per day.

Those are not the same number.

For example:

  1. A compressed report arrives as a 200 KB .gz attachment.
  2. After decompression, the XML is 3 MB.
  3. After parsing and normalizing, the stored records plus metadata are larger again.
  4. If the original attachment is retained too, the same report now exists in multiple forms.

The practical outcome is simple:

  • size the inbound mailbox for compressed attachments and temporary delivery spikes
  • size the processing pipeline and storage layer for decompressed XML and normalized data
  • size retention policy for the form you actually need long-term, not for every intermediate artifact forever

Compressed attachment size is not a safe proxy for final storage footprint. Plan for the expanded XML and the parsed dataset, not just the email attachment.

A safer mailbox sizing model

There is no universal number that fits every domain, but a planning model works well.

Estimate these four things:

  1. Daily report count: how many distinct aggregate reports arrive per day.
  2. Average compressed attachment size: what the mailbox receives.
  3. Average decompressed XML size: what your parser must actually read.
  4. Burst margin: room for retries, delayed deliveries, provider changes, and temporary processing failures.

Then plan mailbox capacity around at least several days of buffered intake, not just one day.

Why several days? Because the real outage pattern is usually this:

  • parsing job fails on Friday
  • mailbox keeps receiving reports over the weekend
  • Monday starts with both the backlog and the new day arriving together

For many teams, a reasonable operational target is:

  • enough mailbox capacity for multiple days of compressed attachments
  • separate filesystem or object storage sized for decompressed processing artifacts if those are retained temporarily
  • monitoring on message count, total mailbox size, and oldest unprocessed message age

The last metric is the most important one. A mailbox can still look "small enough" while processing is already falling behind.
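The four estimates can be turned into a rough capacity figure with a few lines of arithmetic. This is a planning sketch, not a formula from any standard: the class and function names are invented here, the example numbers are placeholders, and it ignores MIME overhead (base64 encoding alone makes a stored message roughly a third larger than its attachment).

```python
from dataclasses import dataclass


@dataclass
class IntakeEstimate:
    reports_per_day: int          # distinct aggregate reports arriving per day
    avg_compressed_bytes: int     # average attachment size as delivered
    avg_decompressed_bytes: int   # average XML size after decompression
    burst_margin: float           # extra headroom, e.g. 0.5 for 50 percent


def mailbox_bytes_needed(est: IntakeEstimate, buffer_days: int) -> int:
    """Compressed bytes the mailbox should hold for buffer_days of intake."""
    daily = est.reports_per_day * est.avg_compressed_bytes
    return int(daily * buffer_days * (1 + est.burst_margin))


def parser_bytes_per_day(est: IntakeEstimate) -> int:
    """Decompressed bytes the parser has to read on a normal day."""
    return est.reports_per_day * est.avg_decompressed_bytes


# Placeholder inputs; replace them with measurements from your own mailbox.
est = IntakeEstimate(
    reports_per_day=400,
    avg_compressed_bytes=200_000,
    avg_decompressed_bytes=3_000_000,
    burst_margin=0.5,
)
mailbox = mailbox_bytes_needed(est, buffer_days=5)
parser_daily = parser_bytes_per_day(est)
```

Keeping the two results separate is the point: one number sizes the mailbox, the other sizes the processing and storage layer.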

Prefer automated ingestion over treating the mailbox as the archive

The rua mailbox should be an intake queue, not the permanent system of record.

That distinction matters because mailboxes are usually weak at all of the things DMARC operations eventually need:

  • deduplication
  • structured search by report metadata
  • retention by policy
  • downstream parsing retries
  • controlled access for security or privacy review

The better pattern is:

  1. Receive reports in a dedicated mailbox.
  2. Pull them into a parser or processing job quickly.
  3. Store only the forms you actually need for operations and history.
  4. Age out raw mailbox content on a short schedule once ingestion is confirmed.

That keeps the mailbox from becoming a second unmanaged archive.

Handle .gz and .zip as normal, expected report formats

At scale, compressed reports are not an edge case. They are normal.

In practice, DMARC aggregate XML often arrives as one of these:

  • plain XML attachment
  • gzip-compressed XML such as report.xml.gz
  • ZIP archive containing one or more XML files

Your ingestion logic should treat gzip and ZIP handling as baseline functionality, not as optional polish.

The operational differences matter:

Gzip is usually a single compressed stream

The gzip format is straightforward and common for DMARC reporting. Tooling such as Python's gzip module reads and decompresses gzip content directly, including multi-member gzip data.

For DMARC operations, gzip is usually the simpler case:

  • one attachment
  • one compressed stream
  • typically one XML payload

ZIP needs archive-aware handling

ZIP is more flexible, which also means more things can vary:

  • there may be multiple members in one archive
  • member names may not be consistent
  • compression methods can differ
  • archive metadata can be odd even when the XML itself is fine

That does not make ZIP unsafe by definition, but it does mean your code should inspect the archive rather than assuming a single expected filename.

Python's zipfile documentation also calls out decompression pitfalls such as invalid archives, unsupported compression methods, and resource exhaustion from oversized extraction. That is directly relevant to DMARC pipelines that process attachments automatically.

Compression-handling rules worth enforcing in production

This is the boring part that saves incidents.

1. Detect format from content, not just filename

Filenames help, but they are not a trust boundary.

A report named something.xml.gz might not actually be valid gzip. A .zip file might contain unexpected members. Validate the attachment format before processing it.
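A minimal way to do that is to look at the leading bytes of the payload instead of the filename. The function name below is hypothetical, and a matching signature is only a first check: the real decompression step still has to succeed.

```python
GZIP_MAGIC = b"\x1f\x8b"      # gzip member header
ZIP_MAGIC = b"PK\x03\x04"     # ZIP local file header


def sniff_format(payload: bytes) -> str:
    """Classify an attachment by its leading bytes, ignoring the filename."""
    if payload.startswith(GZIP_MAGIC):
        return "gzip"
    if payload.startswith(ZIP_MAGIC):
        return "zip"
    # Skip a UTF-8 byte order mark and leading whitespace before checking for XML.
    head = payload[:64].lstrip(b"\xef\xbb\xbf").lstrip()
    if head.startswith(b"<"):
        return "xml"
    return "unknown"
```

Anything classified as "unknown" is a good candidate for a quarantine folder rather than a silent drop, so odd senders stay visible.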

2. Set decompression limits

Do not allow an attachment with a modest compressed size to expand without bounds.

Even if the sender is a legitimate receiver, corrupt files and accidental oversized payloads still happen. Put limits around:

  • maximum compressed attachment size accepted
  • maximum decompressed size per report
  • maximum number of ZIP members processed
  • maximum processing time or memory use per attachment

This is not paranoia. It is standard hygiene for any automated archive handling.
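For gzip, one way to enforce a decompressed-size limit is to read at most one byte more than the limit and reject the payload if that extra byte shows up. The 50 MiB figure and the exception name below are assumptions for illustration, not recommended values.

```python
import gzip
import io

MAX_DECOMPRESSED_BYTES = 50 * 1024 * 1024  # assumed limit; tune to your reports


class ReportTooLarge(Exception):
    """Raised when a payload expands past the configured limit."""


def gunzip_bounded(payload: bytes, limit: int = MAX_DECOMPRESSED_BYTES) -> bytes:
    """Decompress gzip data, refusing to return more than `limit` bytes."""
    with gzip.GzipFile(fileobj=io.BytesIO(payload)) as stream:
        # Ask for one byte past the limit: if it arrives, the payload is too
        # big, and we never hold much more than `limit` bytes in memory.
        data = stream.read(limit + 1)
    if len(data) > limit:
        raise ReportTooLarge(f"decompressed size exceeds {limit} bytes")
    return data
```

Corrupt input surfaces as an exception from the gzip module itself, which the caller can count as a parse failure rather than a size violation.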

3. Expect one XML, but verify it

Most DMARC aggregate reports contain one XML document. Still, the parser should verify:

  • whether the archive contains XML at all
  • whether there is exactly one relevant XML file or several
  • whether the XML is well formed before deeper parsing begins

If the archive contains multiple XML files, decide deliberately whether to reject, process all, or process only known-valid members.
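Those checks can be sketched with Python's zipfile and ElementTree modules, reading members in memory instead of extracting them to disk. The limits and the function name are assumptions, and because the standard library's XML documentation warns about maliciously constructed input, a hardened parser may be the better choice in production.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

MAX_MEMBERS = 10                       # assumed limit
MAX_MEMBER_BYTES = 50 * 1024 * 1024    # assumed limit


def xml_members(payload: bytes) -> list[bytes]:
    """Return the well-formed XML members of a ZIP archive, within limits."""
    found = []
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        infos = archive.infolist()
        if len(infos) > MAX_MEMBERS:
            raise ValueError("too many members in archive")
        for info in infos:
            if info.is_dir():
                continue
            # Bound the read instead of trusting the size in the ZIP header.
            with archive.open(info) as member:
                data = member.read(MAX_MEMBER_BYTES + 1)
            if len(data) > MAX_MEMBER_BYTES:
                raise ValueError(f"member {info.filename!r} is too large")
            try:
                ET.fromstring(data)
            except ET.ParseError:
                continue  # not well-formed XML, so not a usable report
            found.append(data)
    return found
```

The caller then sees zero, one, or several XML documents and can apply whichever policy was chosen: reject, process all, or process only the valid ones.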

4. Preserve enough metadata to deduplicate

At minimum, store enough report-level metadata to avoid processing the same report repeatedly.

Useful keys usually include:

  • report generator or reporting organization
  • report ID
  • covered date range
  • header-from or target domain
  • attachment checksum

That matters because retries, forwarding, mailbox rules, and manual re-imports all create duplicate-processing risk.
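As an illustration, a deduplication key can be built from the report metadata plus a checksum of the attachment as received. The sketch assumes the element names from the RFC 7489 aggregate schema without an XML namespace; a report that declares a namespace would need namespace-aware paths. The function names are made up for this example.

```python
import hashlib
import xml.etree.ElementTree as ET


def dedup_key(xml_bytes: bytes) -> tuple[str, str, str, str, str]:
    """Build a report-level key from metadata fields in the aggregate XML."""
    root = ET.fromstring(xml_bytes)

    def text(path: str) -> str:
        # Missing elements become empty strings rather than raising.
        return (root.findtext(path) or "").strip()

    return (
        text("report_metadata/org_name"),
        text("report_metadata/report_id"),
        text("report_metadata/date_range/begin"),
        text("report_metadata/date_range/end"),
        text("policy_published/domain"),
    )


def payload_checksum(raw_attachment: bytes) -> str:
    """SHA-256 of the attachment exactly as it was received."""
    return hashlib.sha256(raw_attachment).hexdigest()
```

The metadata key catches the same report arriving in a different wrapper, while the checksum catches byte-identical redeliveries cheaply, before any parsing happens.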

5. Keep the raw payload for a short troubleshooting window

Throwing away raw attachments immediately can make parser debugging painful.

Keeping them forever is usually unnecessary.

The balanced pattern is short-lived raw retention for troubleshooting, with longer retention applied only to parsed or normalized report data.

Mailbox operations that age well

If the rua mailbox is growing fast, a few habits make a disproportionate difference.

Use a dedicated address just for aggregate reports

Do not mix DMARC reports with human support mail or abuse workflows.

A dedicated address makes it much easier to:

  • measure real report volume
  • apply simple mailbox rules
  • automate ingestion safely
  • rotate credentials or mailbox access without collateral impact

If you use an external destination, DMARC external report destinations covers the authorization side.

Monitor backlog, not just quota

Mailbox quota alerts are too late.

Add monitoring for:

  • count of unread or unprocessed report messages
  • age of oldest unprocessed report
  • number of parse failures by day
  • compressed bytes received per day
  • decompressed bytes processed per day

Those metrics show whether the system is keeping up.
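The backlog-age check in particular is cheap to compute once the ingestion job records when each unprocessed message was received. A minimal sketch, assuming timezone-aware timestamps and a hypothetical 24-hour threshold:

```python
from datetime import datetime, timedelta, timezone


def backlog_status(received_times, now, max_age=timedelta(hours=24)):
    """Summarize unprocessed reports: how many, and how old the oldest is."""
    if not received_times:
        return {"count": 0, "oldest_age": timedelta(0), "behind": False}
    oldest_age = now - min(received_times)
    return {
        "count": len(received_times),
        "oldest_age": oldest_age,
        "behind": oldest_age > max_age,  # alert on age, not on quota
    }


# Example: a Monday morning after a parser failure over the weekend.
now = datetime(2024, 1, 8, 9, 0, tzinfo=timezone.utc)
pending = [now - timedelta(hours=60), now - timedelta(hours=2)]
status = backlog_status(pending, now)
```

Here the mailbox holds only two messages, which would never trip a quota alert, yet the oldest one is already 60 hours old.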

Separate ingest failures from policy failures

A broken parser can make a healthy DMARC deployment look blind.

If the issue is mailbox growth or decompression failure, that is an ingestion problem, not evidence that SPF, DKIM, or DMARC alignment suddenly got worse.

Keep those dashboards separate.

What to retain, and for how long

Retention policy should follow operational need and privacy discipline, not vague instinct.

RFC 7489's privacy considerations exist for a reason: even aggregate reports are still operational telemetry about who is sending with your domain, from which source IPs, and with what authentication outcomes.

That does not mean aggregate reports are too sensitive to keep.

It means they should be retained intentionally.

A practical retention split

For most teams, it helps to think in three layers:

Layer 1: raw email message and attachment

Keep short-term.

Purpose:

  • parser troubleshooting
  • evidence for malformed or duplicate deliveries
  • reprocessing after temporary bugs

This is usually the layer to expire first.

Layer 2: decompressed raw XML

Keep only if there is a concrete operational reason.

Purpose:

  • parser regression testing
  • re-ingestion after schema or mapping changes
  • audit trail for disputed normalization results

Many teams can keep this for less time than normalized results, or avoid keeping it at all after successful ingestion.

Layer 3: normalized report data

Keep the longest.

Purpose:

  • trend analysis
  • sender inventory history
  • enforcement decisions
  • provider-specific troubleshooting over time

This is generally the form that delivers the real operational value.

Choosing a retention window

The right window depends on how the data is used, but these questions help:

  1. How far back does the team need to compare authentication behavior when investigating a new issue?
  2. How long does it take to notice missing authorized senders or new spoofing patterns?
  3. Are there internal data-minimization or compliance rules that should shorten retention?
  4. Is long-term value coming from raw XML, or only from normalized aggregates and trends?

In practice, many teams benefit from keeping normalized DMARC reporting data longer than the raw attachments.

That keeps operational history while reducing clutter and unnecessary copies of the same information in different formats.
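One way to keep the three layers honest is to write the windows down as data and let a cleanup job apply them. The durations below are placeholders for illustration, not recommendations, and the layer names are invented for this sketch.

```python
from datetime import datetime, timedelta, timezone

# Placeholder windows: set these from your own operational and compliance needs.
RETENTION = {
    "raw_message": timedelta(days=14),        # layer 1: email and attachment
    "decompressed_xml": timedelta(days=30),   # layer 2: raw XML
    "normalized": timedelta(days=395),        # layer 3: parsed report data
}


def is_expired(layer: str, stored_at: datetime, now: datetime) -> bool:
    """True when an item in the given layer has outlived its window."""
    return now - stored_at > RETENTION[layer]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
stored = now - timedelta(days=20)
```

With these placeholder values, an item stored 20 days ago has expired as a raw message but is still well within the window for normalized data.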

For the privacy angle in more depth, DMARC report privacy & compliance is the companion article.

A simple operating policy that works well

If the current state is "reports go into a mailbox and someone hopes the parser keeps up," start here:

  1. Use a dedicated rua mailbox.
  2. Budget capacity for several days of compressed intake, not one day.
  3. Support both gzip and ZIP attachments as standard inputs.
  4. Validate archive type and XML structure before deep parsing.
  5. Apply decompression and resource limits.
  6. Deduplicate on report metadata plus attachment fingerprinting.
  7. Retain raw attachments briefly for troubleshooting.
  8. Retain normalized data based on operational trend needs.
  9. Monitor backlog age and parser failures, not just mailbox size.

That is enough to move from "DMARC reporting exists" to "DMARC reporting is actually operable at scale."

Bottom line

At scale, DMARC aggregate reporting is partly an email-authentication topic and partly a data-ingestion topic.

The domains that stay sane operationally are the ones that treat the rua mailbox as a short-lived intake point, handle .gz and .zip as first-class formats, and keep only the data forms that continue to provide value.

If mailbox growth is becoming noticeable, that is the right time to tighten the pipeline. Waiting until the mailbox is full usually means the parser, retention policy, and observability were already behind.
