The PI Earns His Keep: Tuning MDI, Taming False Positives, and Making Your Detection Stack Actually Livable

(Or: How to Stop the Noise Without Going Blind)

Let’s set the scene.

You’ve deployed MDI. The sensors are on every domain controller. It’s talking to Defender XDR. Sentinel is listening. Your Conditional Access policies are wired to respond to risk signals. Your playbooks are humming. You did the work, and you did it right.

And now your phone will not stop going off.

Unknown account enumeration. Suspicious LDAP query. Anomalous authentication behavior. Reconnaissance using DNS. Unusual Kerberos activity. It’s 7 AM on a Monday and you’ve already got fourteen alerts and a Teams message from an analyst asking if something is actually wrong or if the system is just “doing the thing again.”

Welcome to week two of your MDI deployment. Population: everyone who’s ever done this.

Here’s what I want to say clearly before anything else: this is not a failure. This is MDI working exactly as designed — watching everything, flagging anything that deviates from established norms, building its understanding of your environment in real time. The noise isn’t a sign that something is broken. It’s a sign that the baselines aren’t fully formed yet and that you haven’t told the system what “normal” looks like in your specific environment.

The difference between organizations that come out the other side of this phase with a well-tuned, trusted detection system and organizations that quietly dial the sensitivity down to “low” and go back to flying blind is simple: one treats the noise as a problem to be solved. The other treats it as a reason to give up.

This post is about solving it.

The 30-Day Learning Period: What’s Actually Happening

MDI doesn’t arrive knowing what normal looks like in your environment. It learns. For roughly the first 30 days after deployment, MDI is building behavioral baselines for every entity it’s watching — every user account, every device, every domain controller. It’s learning which accounts do LDAP queries and how many, which service accounts authenticate to which systems, what Kerberos traffic looks like on a typical Tuesday, what your IT admin’s behavior looks like when they’re doing routine maintenance versus when something unusual is happening.

During this period, some detections are limited by design. MDI won’t fire behavioral anomaly alerts on things it hasn’t yet established a baseline for. But it will fire on known attack technique signatures — the DCSync attempts, the Skeleton Key patterns, the explicit Kerberoasting sequences — because those don’t require a baseline. Those are just wrong, full stop, baseline or not.

What this means practically: in the first 30 days, your alert volume will be a mix of real detections, learning-period noise, and legitimate activity that looks suspicious because the context isn’t established yet. Your job during this period is not to tune aggressively — it’s to observe. Review the alerts. Understand what’s firing and why. Start building your mental map of what legitimate looks like in your environment, because that map is what you’ll use to make good tuning decisions later.

Resist the urge to start suppressing alerts in week one. The suppressions you create before you understand the environment are the blind spots you’ll regret later.

Understanding What You’re Actually Looking At

Before we talk about tuning, let’s talk about alert anatomy — because good tuning decisions start with understanding what MDI is actually telling you.

Every MDI alert has a few key components worth paying attention to:

Confidence level. High, medium, or low. This is MDI’s assessment of how certain it is that what it’s seeing is malicious rather than legitimate. High confidence means the pattern closely matches known attack behavior. Low confidence means it’s anomalous, worth looking at, but could have a legitimate explanation. Your tuning approach should differ significantly based on this.

Severity. High, medium, or low. Reflects the potential impact if the detection is accurate. A high-severity, high-confidence alert is a different conversation than a low-severity, low-confidence one.

Evidence. What specifically triggered the alert — the account, the machine, the traffic pattern, the specific behavior MDI flagged. This is where you determine whether you’re looking at Kevin or your IT admin running an authorized script.

Entity timeline. Every account and device in MDI has a timeline — a history of alerts, behaviors, and activities associated with that entity. Before you tune anything, pull the entity timeline. Is this account a repeat offender, or is this a one-time anomaly? Context changes everything.

The investigation question before any tuning decision is always: is this a false positive, or is this a true detection on behavior that happens to be authorized? Those are not the same thing, and they require different responses.

A false positive is MDI getting it wrong — flagging something as suspicious that is neither suspicious nor authorized to be suspicious. A true detection on authorized behavior is MDI getting it right — the behavior genuinely matches an attack pattern, but in this case it’s your IT admin doing something legitimate that looks like an attack. The difference matters because a false positive might warrant a tuning adjustment, while a true detection on authorized behavior warrants a process change — doing the legitimate activity differently so it doesn’t look like an attack in the first place.

Building Exclusions: The Right Way and the Wrong Way

Exclusions are how you tell MDI “this specific thing, in this specific context, is known and authorized — don’t alert on it.” Used correctly, they’re essential. Used carelessly, they’re how you manually recreate the blind spots you deployed MDI to eliminate.

The wrong way to build exclusions:

Broad exclusions on accounts, machines, or subnets because they’re generating a lot of alerts. “The IT admin account keeps triggering reconnaissance alerts, so we excluded the IT admin account from reconnaissance detections.” Congratulations — you’ve just told MDI to ignore everything the IT admin account does that looks like reconnaissance. Including the day someone compromises the IT admin account and actually runs reconnaissance.

Exclusions built during the first two weeks — before baselines are established — are premature at best and dangerous at worst. The alert that looks like noise in week one might look very different in week four once MDI has context. Build exclusions based on understood, validated behavior — not on volume.

The right way to build exclusions:

Targeted and specific. Exclude the specific behavior, from the specific account or machine, in the specific context you’ve validated as legitimate. “This service account runs scheduled LDAP queries against these specific OUs as part of this specific authorized process” is a good exclusion. “This account does AD stuff” is not.

Documented. Every exclusion should have a record: what it covers, why it was created, who authorized it, and when it should be reviewed. Exclusions without documentation are future blind spots waiting to be discovered by your next penetration test.

Time-bounded where possible. If a legitimate administrative activity is going to generate alerts for a defined period — a migration, a large-scale password reset, an AD cleanup project — create a temporary exclusion with an expiration date rather than a permanent one. Permanent exclusions have a way of outlasting the activity that justified them.

Reviewed regularly. Build exclusion review into your quarterly security review cycle. Ask: is this exclusion still valid? Is the activity it covers still happening? Is it still authorized? Is the scope still appropriate?

Alert Fatigue: The Political Problem Is As Real As the Technical One

Here’s something that doesn’t get discussed enough in MDI deployment guides: analyst trust is a resource, and alert fatigue burns through it fast.

When an analyst investigates fourteen alerts in a morning and twelve of them turn out to be noise, they start developing a prior. The prior is: most alerts are probably noise. And once that prior is established, real alerts start getting the same skeptical, low-effort treatment as the noise — because the brain is efficient and pattern-matching is how humans cope with volume.

This is how things get missed. Not through malice or incompetence, but through a perfectly rational response to an environment that’s been crying wolf.

The tuning work we’ve been talking about is partly technical — getting the detection quality to a place where alerts are meaningful. But it’s also partly organizational — restoring analyst confidence in the system so that when a real alert fires, it gets the attention it deserves.

A few things that help on the organizational side:

Establish a feedback loop. When an analyst closes an alert as a false positive, capture why. Build a log of false positive patterns. Use that log to drive tuning priorities — address the highest-volume false positive sources first, because that’s where you’ll get the most analyst confidence back per unit of tuning effort.

Track your signal-to-noise ratio over time. What percentage of your MDI alerts result in confirmed detections versus false positives? Trend that number. If it’s improving, the tuning is working. If it’s not moving, the approach needs to change. Make the metric visible to the team — progress is motivating, and analysts who can see the noise decreasing trust the system more.

Close the loop on real detections. When MDI catches something real — when Kevin actually shows up and the PI actually called it correctly — make sure the team knows. Post-incident reviews that specifically call out “MDI caught this, here’s the alert, here’s the timeline” build trust in the system far more effectively than any amount of documentation.

What Good Looks Like: The PI, Fully Operational

After the learning period. After the exclusion work. After the tuning iterations and the analyst feedback loops and the signal-to-noise improvements. After the automation is calibrated and the playbooks are running and the Conditional Access policies are actually acting on risk signals.

What does good look like?

It looks like an alert queue where most things that fire are worth looking at. It looks like analysts who trust the system enough to take high-severity alerts seriously without second-guessing whether it’s noise. It looks like Kevin having a very bad night — every time, automatically, whether anyone’s watching the board or not. It looks like incident timelines that tell the whole story instead of leaving the middle chapters blank.

It looks like the PI doing his job — not perfectly, because nothing is perfect — but reliably. Consistently. As a trusted member of the team rather than that guy in the basement whose reports everyone ignores because there are too many of them and half of them are wrong.

Getting there takes time. It takes iteration. It takes the willingness to do the unsexy work of reviewing exclusions and tracking false positive rates and closing the loop with your analysts. None of that is glamorous. None of it makes for a great conference talk.

But it’s the work that makes the difference between a tool you bought and a capability you actually have.

A Few Environments That Deserve Their Own Conversation

One thing worth acknowledging as we close this series: everything we’ve covered assumes a reasonably standard hybrid environment. On-prem AD, Azure, a mix of domain-joined endpoints, service accounts doing service account things.

Not every environment is that.

If you’re running MDI in a pharmaceutical company with GxP-validated systems, or in a manufacturing environment with OT infrastructure and PLCs that talk to AD, or in a healthcare organization where clinical systems are domain-joined and change control is a formal process — the tuning conversation gets more complicated. The exclusion decisions carry more weight. The automation thresholds need more thought. The balance between security responsiveness and operational stability tilts differently.

Those environments exist. A lot of them are running MDI right now, figuring it out as they go. The fundamentals we’ve covered here still apply — but the application of those fundamentals looks different when a false automation trigger can affect a validated system or an operational technology environment.

That’s a conversation worth having separately, with the specificity it deserves.

The Case File Is Closed

Four posts. One PI. A complete picture of what Microsoft Defender for Identity actually does, how it works in a real attack scenario, how to make it respond automatically when it catches something, and how to tune it into a detection system your team actually trusts.

The PI came up from the basement. He got his close-up. He worked the case, called it in, and — after a few weeks of everyone learning to read his reports correctly — became one of the most reliable members of the whole security operation.

Not bad for the guy everyone forgot was down there.

Running MDI in a complex or regulated environment? Got tuning wins or war stories you’re willing to share? Drop them in the comments — the more specific the better. That’s where the real learning happens.

The PI Earns His Keep: Tuning MDI, Taming False Positives, and Making Your Detection Stack Actually Livable

Share this:

Leave a comment Cancel reply