SPF `~all` vs `-all`: rollout strategy to move to hard fail without breaking legitimate senders

If a domain is still publishing ~all, the usual reason is not that softfail is inherently better. It is that the team does not yet trust its sender inventory enough to say, with confidence, "everything not listed is unauthorized."

That instinct is healthy.

Moving from ~all to -all should be treated as an inventory and rollout project, not as a one-line DNS cleanup. The safe goal is simple: get to hard fail only after every legitimate sender is known, aligned where needed, and tested under real traffic.

What `~all` and `-all` actually mean

In RFC 7208, the SPF qualifiers map to different results:

~all returns softfail for any sender that no earlier mechanism matched
-all returns fail for any sender that no earlier mechanism matched

The same RFC describes softfail as a weak statement that the host is probably not authorized, while fail is an explicit statement that the host is not authorized.

That difference matters operationally even though receivers still apply their own local policy. A hard-fail SPF record is the domain owner saying: "this list is complete enough that everything else should be treated as unauthorized."
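As a rough illustration of that mapping, the four SPF qualifiers can be written as a lookup table. This is only a sketch: the helper name `result_for_all` is hypothetical, and it inspects the terminal `all` term of a record string rather than performing a real SPF evaluation.

```python
from typing import Optional

# Qualifier-to-result mapping from RFC 7208; a term with no qualifier defaults to "+".
QUALIFIER_RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def result_for_all(record: str) -> Optional[str]:
    """Return the result the `all` term would produce, or None if the record has none."""
    for term in record.split()[1:]:  # skip the "v=spf1" version tag
        if term[0] in QUALIFIER_RESULTS:
            qualifier, mechanism = term[0], term[1:]
        else:
            qualifier, mechanism = "+", term
        if mechanism.lower() == "all":
            return QUALIFIER_RESULTS[qualifier]
    return None
```

With this sketch, `result_for_all("v=spf1 include:_spf.example.net ~all")` gives `"softfail"`, and the same record ending in `-all` gives `"fail"`.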

The first important nuance: DMARC does not treat softfail as a pass

This is where many rollout discussions drift off course.

DMARC only passes when at least one aligned authentication mechanism produces a pass result, per RFC 7489. A message with SPF softfail does not have an SPF pass for DMARC purposes. A message with SPF fail also does not.

So if the question is "will switching from ~all to -all make DMARC start passing?" the answer is no.

The real difference is:

~all gives receivers a weaker SPF signal for non-authorized sources
-all gives receivers a definitive SPF fail for non-authorized sources
both still require sender inventory discipline if legitimate mail lacks aligned DKIM
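The DMARC rule described above can be sketched in a few lines. The `AuthResult` type and `dmarc_passes` function are hypothetical names for illustration, and alignment is assumed to have been computed elsewhere; the point is only that softfail and fail are equivalent as far as this check is concerned.

```python
from dataclasses import dataclass


@dataclass
class AuthResult:
    spf: str            # SPF result, e.g. "pass", "softfail", "fail"
    spf_aligned: bool   # 5321.MailFrom domain aligns with the From domain
    dkim: str           # DKIM result, e.g. "pass", "fail", "none"
    dkim_aligned: bool  # DKIM d= domain aligns with the From domain


def dmarc_passes(r: AuthResult) -> bool:
    """DMARC passes only if at least one mechanism both passes and aligns."""
    spf_ok = r.spf == "pass" and r.spf_aligned
    dkim_ok = r.dkim == "pass" and r.dkim_aligned
    return spf_ok or dkim_ok
```

A message with SPF softfail and no aligned DKIM fails this check, exactly as a message with SPF fail would. Only a real pass on an aligned identifier changes the outcome.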

If DMARC alignment needs a refresher, "Return-Path vs From: practical implications" is the companion read.

Why many domains start with `~all`

There is a practical reason Google documentation still shows ~all in many setup examples, while Microsoft documentation recommends -all for Microsoft 365 domains once the setup is understood.

Those docs are solving slightly different problems:

Google's setup guidance assumes many admins are still identifying all senders and should avoid breaking mail while that discovery work is incomplete.
Microsoft's guidance assumes you should get to a definitive authorization boundary, especially when DKIM and DMARC are also in place.

Neither position is irrational. They reflect different stages of maturity.

That is the right mental model for rollout too: ~all is often a transitional state, not the destination.

What actually breaks when teams switch too early

Changing ~all to -all does not usually break well-configured mail streams that already have aligned DKIM and correct SPF authorization.

What it exposes is all the mail the organization forgot about.

Typical casualties are:

old CRM or ticketing systems still sending from the main domain
web forms using the organizational domain from an app server not in SPF
printers, scanners, and appliances relaying directly
regional business tools bought outside central IT
vendors that send with the right visible From but the wrong bounce domain
sources that were surviving only because receivers were lenient about softfail

This is why "Building a sender inventory with DMARC reports" should usually happen before the SPF qualifier change, not after it.

The safe rollout strategy

The shortest good strategy is:

inventory every sender
fix alignment and authorization gaps
separate risky streams onto subdomains where appropriate
validate with live traffic and reports
only then change ~all to -all

The rest of this post expands each step.

Step 1: Build a real sender inventory

Do not trust tribal knowledge here.

Ask a narrower question than "what systems send email?" Ask: "what systems send mail that uses this exact domain in the SMTP envelope sender or the visible From header?"

Sources to check:

DMARC aggregate reports
existing SPF include: and ip4: or ip6: terms
ESP and CRM admin panels
application configs and SMTP relays
support platforms, billing systems, and forms
security devices and multifunction printers
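For the DMARC aggregate report source in particular, a small script can turn a report into a per-source tally. The sketch below is hedged: `tally_sources` is a hypothetical helper, it assumes the report has already been decompressed into an XML string, and it assumes the elements carry no XML namespace (some reporters add one, which would need extra handling). The element names follow the aggregate report format in RFC 7489.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tally_sources(report_xml: str) -> Counter:
    """Sum message counts per (source_ip, evaluated SPF, evaluated DKIM)."""
    totals = Counter()
    root = ET.fromstring(report_xml)
    for record in root.iter("record"):
        row = record.find("row")
        if row is None:
            continue  # skip malformed records rather than crash
        key = (
            row.findtext("source_ip"),
            row.findtext("policy_evaluated/spf"),
            row.findtext("policy_evaluated/dkim"),
        )
        totals[key] += int(row.findtext("count", default="0"))
    return totals
```

Sources that show up with both SPF and DKIM evaluated as `fail` are the ones to explain before any qualifier change: they are either forgotten legitimate senders or unauthorized traffic.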

If the apex domain SPF record already feels crowded or unclear, that is usually a sign the domain is carrying too many unrelated streams. In that case, "SPF flattening vs includes: tradeoffs, failure modes, and safer alternatives" and "Transactional vs marketing email separation" are directly relevant.

Step 2: For each sender, decide how it is supposed to pass DMARC

This is the step teams skip when they focus only on SPF syntax.

Every live sender should have an intended authentication path:

SPF pass and aligned 5321.MailFrom
DKIM pass and aligned d= domain
ideally both

If a sender can only survive because receivers are tolerant of SPF softfail, that sender is already fragile.

Common examples:

Sender A: properly authorized platform

example.com. IN TXT "v=spf1 include:_spf.example.net ~all"

The sender is authorized by the include and signs with aligned DKIM. Changing the terminal qualifier to -all is probably safe for this source because authorized mail will still SPF-pass before evaluation reaches all.
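The "passes before evaluation reaches all" behavior comes from SPF's left-to-right, first-match evaluation. The sketch below illustrates it with a deliberately reduced evaluator: the `evaluate` function is hypothetical, it understands only `ip4`, `ip6`, and `all` terms, and it silently skips `include`, `a`, `mx`, and everything else that would need DNS. It is not a substitute for a real SPF library.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def evaluate(record: str, client_ip: str) -> str:
    """Evaluate ip4/ip6/all terms left to right; the first match decides the result."""
    ip = ipaddress.ip_address(client_ip)
    for term in record.split()[1:]:  # skip the "v=spf1" version tag
        qualifier = term[0] if term[0] in QUALIFIERS else "+"
        name, _, value = term.lstrip("+-~?").partition(":")
        if name == "all":
            return QUALIFIERS[qualifier]
        if name in ("ip4", "ip6") and value:
            if ip in ipaddress.ip_network(value, strict=False):
                return QUALIFIERS[qualifier]
    return "neutral"  # nothing matched and there is no `all` term
```

For a record such as `v=spf1 ip4:192.0.2.0/24 ~all`, a client at 192.0.2.25 gets `pass` whether the record ends in `~all` or `-all`. Only a client outside the listed range sees the difference: `softfail` in one case, `fail` in the other.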

Sender B: visible From looks right, bounce path is wrong

Mail is sent as billing@example.com, but the actual 5321.MailFrom domain is mailer.vendor-example.net, and there is no aligned DKIM.

That sender was never in a healthy state. Moving to -all does not create the problem. It merely removes the ambiguity around it.

Step 3: Remove stale senders before adding new exceptions

A common mistake is to discover uncertainty and respond by stuffing more includes into the main SPF record "just in case."

That usually creates two new problems:

more lookup pressure toward the SPF 10-lookup limit
a permanent record full of legacy authorizations nobody wants to revisit

If an old platform has not sent legitimate mail in months, delete it rather than preserving it as a hypothetical future need.

If lookup budgeting is already tight, "SPF 10-DNS-lookup limit: why it happens, how to audit includes, and mitigation patterns" is the better next step than adding more guesswork.
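A quick way to see lookup pressure is to count the terms in a record that trigger DNS queries. The sketch below uses a hypothetical `count_lookup_terms` helper and counts only the terms in the one record it is given: the `include`, `a`, `mx`, `ptr`, and `exists` mechanisms plus the `redirect` modifier. It does not follow nested includes, so the real total against the 10-lookup limit can be higher than the number it returns.

```python
LOOKUP_MECHANISMS = ("include", "a", "mx", "ptr", "exists")


def count_lookup_terms(record: str) -> int:
    """Count terms in one record that trigger a DNS lookup (nested includes not followed)."""
    count = 0
    for term in record.split()[1:]:  # skip the "v=spf1" version tag
        term = term.lstrip("+-~?").lower()
        if term.startswith("redirect="):
            count += 1
            continue
        # Strip any ":domain" argument and "/cidr" suffix to get the mechanism name.
        name = term.split(":", 1)[0].split("/", 1)[0]
        if name in LOOKUP_MECHANISMS:
            count += 1
    return count
```

Every stale include removed lowers this count directly, which is one more reason to delete dead senders before adding new ones.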

Step 4: Move third-party or higher-risk streams to subdomains

This is often the cleanest way to reach -all on the main domain faster.

For example:

example.com for employee and core transactional mail
marketing.example.com for campaign traffic
support.example.com for ticketing systems
alerts.example.com for product notifications

Microsoft explicitly recommends subdomains for email services that are not under your direct control. Operationally, that advice is solid even outside Microsoft 365.

It reduces blast radius and lets the main domain reach a tighter SPF posture without waiting for every external platform to become equally well managed.

Step 5: Keep DKIM from being the hidden blocker

Many teams frame this migration as an SPF-only change, but the dangerous cases are usually hybrid ones:

SPF is incomplete
DKIM is absent or not aligned
DMARC reports are not being watched closely

If a sender has aligned DKIM, then switching from ~all to -all is much less likely to affect legitimate delivery. If a sender has no aligned DKIM and marginal SPF, then the environment is already brittle.

That is one reason Microsoft says to deploy DKIM and DMARC alongside SPF, and why Google's sender guidance also expects authenticated mail rather than SPF alone.

Step 6: Roll out with a transition window, not a cliff

RFC 7208 explicitly warns that when SPF records change, there should be a transition period so the old policy remains valid long enough for legitimate mail already in transit to be checked under the expected policy.

In practice, that means:

avoid making the change during an outage or major campaign launch
allow for DNS TTL and queued-mail lag
watch DMARC aggregate data and support tickets after the cutover
keep rollback simple if a forgotten sender appears

The DNS edit is small. The monitoring window after it is the real rollout.

A practical sequence that works well

For a busy production domain, this pattern is usually safer than debating ~all versus -all in the abstract:

Confirm exactly one SPF record exists.
Remove obviously dead includes or IPs.
Confirm aligned DKIM on critical employee, billing, and transactional streams.
Move marketing or vendor-heavy streams to dedicated subdomains where possible.
Review DMARC aggregate reports until unknown sources are explained.
Change the terminal qualifier from ~all to -all.
Monitor for a few days. If a missed sender appears, fix it rather than reverting immediately; revert only if legitimate mail is actively being affected.
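The first item in that sequence is easy to automate. The sketch below assumes the domain's TXT values have already been fetched and joined into plain strings by some other tool; the helper names `spf_records` and `has_single_spf` are hypothetical.

```python
from typing import List


def spf_records(txt_values: List[str]) -> List[str]:
    """Return the TXT values that are SPF records (they start with the v=spf1 tag)."""
    found = []
    for value in txt_values:
        lowered = value.strip().lower()
        # The version tag must be exactly "v=spf1", followed by a space or the end.
        if lowered == "v=spf1" or lowered.startswith("v=spf1 "):
            found.append(value)
    return found


def has_single_spf(txt_values: List[str]) -> bool:
    """True only when exactly one SPF record is published at the name."""
    return len(spf_records(txt_values)) == 1
```

Two SPF records at the same name is worth catching before touching the qualifier, because RFC 7208 treats multiple records as a permerror regardless of what either one says.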

That sequence is boring. Boring is good here.

Example: safe before-and-after design

Before

example.com. IN TXT "v=spf1 include:_spf.google.com include:spf.protection.outlook.com include:mailer.vendor-example.net ~all"

Problems:

employee mail and vendor mail are mixed together
the vendor may not be under direct operational control
the team is using ~all because they do not trust that the record is complete

Better target state

example.com. IN TXT "v=spf1 include:_spf.google.com include:spf.protection.outlook.com -all"
marketing.example.com. IN TXT "v=spf1 include:mailer.vendor-example.net -all"

Why this is better:

the main domain gets a clearer authorization boundary
the vendor stream has its own policy surface
troubleshooting becomes much easier
reputation and breakage are isolated by stream

When it is still reasonable to stay on `~all`

Staying on ~all is defensible for a while if any of these are true:

the sender inventory is still incomplete
multiple business units can launch mail without central review
key senders still lack aligned DKIM
a domain migration or ESP migration is in progress
DMARC reporting is not yet giving enough visibility

But treat that as temporary technical debt, not a best practice to preserve forever.

When `-all` is the right move

Moving to -all is usually justified when:

legitimate senders are known and documented
stale authorizations have been removed
important streams have aligned DKIM
third-party streams are segmented where practical
DMARC reports have stopped revealing surprises

At that point, -all is not an aggressive setting. It is simply an honest one.

Softfail is often a discovery posture. Hard fail is an enforcement posture. The operational question is not which one feels safer. It is whether the sender inventory is complete enough for enforcement.

Bottom line

The move from SPF ~all to -all should happen after inventory, segmentation, and validation, not before.

If a domain still depends on ~all to avoid breaking legitimate senders, the real work is to identify those senders, fix their SPF or DKIM path, and move high-variance services onto their own subdomains. Once that is done, -all becomes the natural end state rather than a risky leap.
