This is a continuation post covering a process to set up DMARC on active domains. See A (sane) DMARC setup process for busy email domains to learn more about this post series.
Depending on the volume of emails you send, how old your domain is, and how much exposure it has, your DMARC reports can be very long. Although DMARC tools make it very easy to browse and filter DMARC records, there is always a chance that a server was missed in the previous stage.
To avoid that, we operate the domain in test/sampling mode in this stage.
We do this by applying a more restrictive DMARC policy to just a sample of the domain’s emails. The objective is to trigger feedback from domain users.
More specifically, we set the policy to quarantine emails (p=quarantine), which instructs email providers to send the messages to spam/junk folders (or to give them some other special treatment instead of simply delivering them to the user mailbox).
Applying this policy at once would cause all emails from a forgotten server to be quarantined. This can make a lot of users upset! Luckily, DMARC provides a sampling mechanism to allow the policy to be applied to just a percentage of all emails. This is done by adding a pct=X tag to the DMARC record, where X is a number between 0 and 100.
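As an illustration of where the pct tag sits in a record, here is a minimal Python sketch that assembles a DMARC TXT value. The function name, the 10% value, and the reporting address dmarc-reports@example.com are assumptions made for this example, not values from this series.

```python
def build_dmarc_record(policy: str, pct: int, rua: str) -> str:
    """Assemble a DMARC TXT record value that includes a sampling percentage."""
    if policy not in {"none", "quarantine", "reject"}:
        raise ValueError(f"unknown DMARC policy: {policy!r}")
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    # rua is the mailbox that receives aggregate reports (hypothetical here).
    return f"v=DMARC1; p={policy}; pct={pct}; rua=mailto:{rua}"


# Hypothetical values, for illustration only.
record = build_dmarc_record("quarantine", 10, "dmarc-reports@example.com")
print(record)
```

The resulting string is what would be published as the TXT record at the _dmarc subdomain of the domain being protected.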
The exact percentage to use depends on domain email usage. There’s no one-size-fits-all value. Our recommendation is that you never sample more than 20% of the emails that fail DMARC checks, and that you cap the total at 200 quarantined emails per week.
For example, if you see 1,500 failed DMARC checks in a week, sampling at 20% would quarantine 300 emails, which is too many. For this domain you would choose a percentage of 200 ÷ 1,500 ≈ 13%.
Another step in this stage is to warn users beforehand that the domain will enter this stage of DMARC deployment, during which a small number of emails may end up being routed differently.
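The rule of thumb above can be sketched as a small Python helper. The function name and its defaults are just a restatement of the 20% and 200-per-week recommendation; rounding down is an assumption chosen so the weekly cap is never exceeded.

```python
def sampling_pct(weekly_failures: int, max_pct: int = 20, weekly_cap: int = 200) -> int:
    """Return a pct value that quarantines at most weekly_cap failing emails."""
    if weekly_failures <= 0:
        # Nothing is failing, so the upper bound applies.
        return max_pct
    # Integer division rounds down, keeping the result under the weekly cap.
    capped = (100 * weekly_cap) // weekly_failures
    return min(max_pct, capped)


print(sampling_pct(1500))  # 13
print(sampling_pct(500))   # 20
```

For very large failure volumes the result can reach 0, in which case the cap cannot be met with whole percentages and the limits need to be reconsidered for that domain.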
After updating your DMARC record, you should keep monitoring your domain for a while, looking at failed DMARC checks, and making adjustments.
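If you prefer to inspect raw aggregate reports yourself, the monitoring step could look roughly like the sketch below. It assumes a report in the standard aggregate XML layout (row, source_ip, count, policy_evaluated) without XML namespaces; the sample report and IP addresses are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def failures_by_source(report_xml: str) -> Counter:
    """Count messages that failed DMARC, grouped by sending IP address."""
    root = ET.fromstring(report_xml)
    failures = Counter()
    for row in root.iter("row"):
        evaluated = row.find("policy_evaluated")
        if evaluated is None:
            continue
        dkim = evaluated.findtext("dkim", default="fail")
        spf = evaluated.findtext("spf", default="fail")
        # DMARC fails only when neither aligned DKIM nor aligned SPF passes.
        if dkim != "pass" and spf != "pass":
            source = row.findtext("source_ip", default="unknown")
            failures[source] += int(row.findtext("count", default="0"))
    return failures


# Made-up report fragment using documentation IP ranges.
SAMPLE = """<feedback>
  <record><row><source_ip>192.0.2.10</source_ip><count>3</count>
    <policy_evaluated><disposition>quarantine</disposition>
      <dkim>fail</dkim><spf>fail</spf></policy_evaluated></row></record>
  <record><row><source_ip>198.51.100.7</source_ip><count>40</count>
    <policy_evaluated><disposition>none</disposition>
      <dkim>pass</dkim><spf>fail</spf></policy_evaluated></row></record>
</feedback>"""

print(failures_by_source(SAMPLE))  # Counter({'192.0.2.10': 3})
```

Sources that keep appearing in this count are the candidates for a forgotten server.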
In parallel, you should stay in touch with your users, and try to gather reports of emails being sent to spam/junk folders, when previously they would be sent to user inboxes directly.
By analyzing those reports — more specifically by checking email headers related to SPF, DKIM, and DMARC, such as the Received chain, DKIM signatures, authentication results, etc. — hopefully you can pinpoint missing servers and add them to your SPF and/or DKIM records.
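When a user forwards a suspicious message, the verdicts can be pulled from its headers with a few lines of Python. This is a rough sketch using the standard email module and a simple regular expression; the sample message and its domains are invented, and real Authentication-Results headers can be more varied than this pattern handles.

```python
import re
from email import message_from_string


def auth_results(raw_message: str) -> dict[str, str]:
    """Extract the first spf/dkim/dmarc verdicts from Authentication-Results."""
    msg = message_from_string(raw_message)
    results: dict[str, str] = {}
    for header in msg.get_all("Authentication-Results", []):
        for method, verdict in re.findall(
            r"\b(spf|dkim|dmarc)=(\w+)", header, flags=re.IGNORECASE
        ):
            # Keep the first verdict seen for each method.
            results.setdefault(method.lower(), verdict.lower())
    return results


# Invented message for illustration only.
sample = (
    "Authentication-Results: mx.example.net;\n"
    " spf=pass smtp.mailfrom=bounces.example.org;\n"
    " dkim=pass header.d=example.org;\n"
    " dmarc=fail header.from=example.com\n"
    "From: user@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

print(auth_results(sample))  # {'spf': 'pass', 'dkim': 'pass', 'dmarc': 'fail'}
```

A message where SPF and DKIM pass but DMARC fails, as in this sample, usually points to an alignment problem rather than a missing record.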
After enough time has passed in this stage, and once you are confident that all legitimate servers have been configured, you will be ready for the next stage: full domain protection.
We’re ready to change example.com’s DMARC policy to a stricter quarantine policy, while sampling emails to try to catch any missed SPF/DKIM server configuration.
Using DMARCPal we see that Example Inc. has on average 3,000 failed messages per week. If we quarantine 20% of this volume we would be affecting 600 emails per week, which is too many. We want to stay below 200 emails, therefore the sampling value we use is 200 ÷ 3,000 ≈ 6% (rounded down).
With the sampling percentage defined, we are ready to update example.com’s DMARC record.
However, before we do that, there is a very important step we need to do: notify domain users.
We then send an email to users explaining that a stricter policy is going to be enabled on example.com. We make sure to communicate that the objective of the change is to improve security by preventing forged emails. We also ask them to contact the IT team if they detect any anomaly in their email, such as recipients complaining about emails being sent to spam/junk folders, so we can debug the issue.
After notifying the users, we modify example.com’s DMARC record. The final record looks like this:
A couple of days later, we go back to DMARCPal to check the DMARC records received after the change above.
We see that indeed the new policy is being applied. So far, no issue is reported by anyone. Everything looks great.
Near the middle of Stage 3 we receive a report from a user saying that she had received complaints that her Google Calendar invitation replies never reached the intended recipients. After talking to some other Google Calendar users, we discover that they too had received some complaints.
We decide to test the process and indeed the reply never reaches the invitee's mailbox.
We then go back to DMARCPal to look at Google records. After a few minutes, we confirm that indeed some Google Calendar invitation replies are being quarantined.
As we can see in the record above, SPF is passing for calendar-server.bounces.google.com, which is the domain used by Google Calendar emails in the Envelope From.
However, although we’re getting a pass from SPF, the domain is not aligned with the “From” header, so we can’t rely on SPF. The DKIM record in the DNS is the one copied from Google Workspace; however, we see that the DKIM domain used for signing is also not the same as the one in the “From” header. Because of these two conditions, DMARC fails.
To make the user report easier to follow, public names and addresses in the screenshot above weren’t masked. The client domain has been blurred instead, for privacy reasons.
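The alignment failure described above can be sketched in Python. This is a simplified model: the organizational domain is approximated by the last two labels (real implementations consult the Public Suffix List), relaxed alignment is assumed, and the DKIM signing domain signer.example.net is a placeholder because the actual value is only visible in the screenshot.

```python
def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels (an approximation)."""
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_aligned(auth_domain: str, from_domain: str) -> bool:
    """Relaxed alignment: both domains share the same organizational domain."""
    return org_domain(auth_domain) == org_domain(from_domain)


def dmarc_passes(spf_pass: bool, spf_domain: str,
                 dkim_pass: bool, dkim_domain: str,
                 from_domain: str) -> bool:
    """DMARC passes when SPF or DKIM both passes and aligns with From."""
    spf_ok = spf_pass and is_aligned(spf_domain, from_domain)
    dkim_ok = dkim_pass and is_aligned(dkim_domain, from_domain)
    return spf_ok or dkim_ok


# SPF passes for Google's bounce domain, but neither identifier aligns.
print(dmarc_passes(
    spf_pass=True, spf_domain="calendar-server.bounces.google.com",
    dkim_pass=True, dkim_domain="signer.example.net",  # placeholder
    from_domain="example.com",
))  # False
```

Once the message is signed with a DKIM domain that matches the “From” domain, the same check returns True even though SPF stays unaligned.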
Fortunately we have seen this before — and we know how to solve it.
This situation is caused by a partially configured domain in Google Workspace. See Google Calendar invites failing DMARC checks for more details.
After solving the Google Calendar issue, and waiting a day for the changes to propagate, we do the same tests we did before. Now the emails are delivered normally.
On the next day we check new DMARC records and can confirm that calendar-server.bounces.google.com emails aren’t failing DMARC anymore.
Since this Google Workspace issue is well known, we decide not to extend Stage 3. We keep monitoring the domain and checking in with users about any unexpected email behaviour.
We then reach the end of Stage 3 with no other reports from users. Now we’re ready to strengthen Example Inc.’s email security by changing DMARC to a 100% strict policy.
Next: Stage 4 — Protect