Terraform Drift Detection — Know When Someone Bypasses Your Code

Someone logged into the AWS console and changed a security group rule. Someone used the CLI to resize an RDS instance during an incident. Someone modified a Lambda function's environment variables for debugging and forgot to revert. Your Terraform state now disagrees with reality, and terraform plan will show changes you did not expect. We set up automated drift detection that catches these discrepancies before they cause problems.

Why Drift Happens and Why It Matters

Drift is inevitable. Even teams with strict IaC discipline occasionally make manual changes during incidents, debugging sessions, or when a quick fix is needed and the Terraform workflow feels too slow. The problem is not that drift happens — it is that drift goes undetected.

Undetected drift creates three risks. First, security gaps: a security group opened for debugging stays open indefinitely. Second, unexpected plan changes: the next terraform apply reverts the manual change, potentially causing an outage if the change was intentional. Third, state confusion: when state and reality diverge, plan output mixes the changes you intended with reversions of manual edits, which makes plans harder to review and erodes trust in the tool.

Manual drift detection — running terraform plan periodically and eyeballing the output — does not scale. A large infrastructure with 500+ resources generates plan output that nobody reads carefully. Automated detection with filtering, severity classification, and targeted alerts is the only approach that works long-term.

Our Drift Detection Implementation

We implement drift detection as a scheduled pipeline that runs terraform plan against every root module on a daily or weekly cadence. The plan output is parsed programmatically to extract changed resources, and the results are sent to Slack, PagerDuty, or your ticketing system depending on severity.
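As a rough illustration of the parsing step, the sketch below reads Terraform's JSON plan representation and lists the resources that would change. It assumes the pipeline has already run terraform plan -out=tfplan and then terraform show -json tfplan > plan.json; the file name and the helper names load_plan and changed_resources are ours, not part of Terraform. A pipeline can also run terraform plan -detailed-exitcode first, which exits with code 2 when the plan contains changes.

```python
import json


def load_plan(path: str) -> dict:
    """Load a plan previously exported with `terraform show -json`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def changed_resources(plan: dict) -> list:
    """Return one record per resource whose planned action is a real change."""
    changes = []
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        actions = change.get("actions", [])
        # "no-op" means state and config agree; "read" is a data source refresh.
        if not actions or actions in (["no-op"], ["read"]):
            continue
        before = change.get("before") or {}
        after = change.get("after") or {}
        # Top-level attributes whose values differ between the two snapshots.
        attributes = sorted(
            key for key in set(before) | set(after)
            if before.get(key) != after.get(key)
        )
        changes.append({
            "address": rc["address"],
            "type": rc["type"],
            "actions": actions,
            "attributes": attributes,
        })
    return changes
```

Calling changed_resources(load_plan("plan.json")) once per root module gives a flat list that later steps can classify and route. The attribute comparison here is top-level only; nested blocks show up as a single changed attribute.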

Not all drift is equal. A changed security group rule is critical. A modified tag is low priority. We configure drift severity rules that classify changes by resource type and attribute. Security-related drift (security groups, IAM policies, encryption settings) triggers immediate alerts. Cosmetic drift (tags, descriptions) creates a low-priority ticket. This prevents alert fatigue while ensuring critical drift is never missed.
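A severity rule set along these lines can be sketched in a few lines of Python. The resource types and attribute names below are illustrative examples drawn from the AWS provider, and the three labels match the critical, important, and low-priority categories; the sets, the classify function, and the route function are hypothetical and would be tuned per environment.

```python
# Illustrative rule sets (assumptions): extend these for your own estate.
CRITICAL_TYPES = {
    "aws_security_group",
    "aws_security_group_rule",
    "aws_iam_policy",
    "aws_iam_role_policy",
}
SECURITY_ATTRIBUTES = {"encrypted", "storage_encrypted", "kms_key_id"}
COSMETIC_ATTRIBUTES = {"tags", "tags_all", "description"}


def classify(resource_type: str, changed_attributes) -> str:
    """Map one drifted resource to 'critical', 'important', or 'low'."""
    attrs = set(changed_attributes)
    # Any change to an encryption setting is critical regardless of type.
    if attrs & SECURITY_ATTRIBUTES:
        return "critical"
    # Drift limited to tags or descriptions is cosmetic, even on sensitive types.
    if attrs and attrs <= COSMETIC_ATTRIBUTES:
        return "low"
    if resource_type in CRITICAL_TYPES:
        return "critical"
    return "important"


def route(severity: str) -> str:
    """Critical drift raises an alert; everything else becomes a ticket."""
    return "alert" if severity == "critical" else "ticket"
```

Checking cosmetic attributes before the resource type is a deliberate choice in this sketch: a retagged security group lands in the low-priority queue instead of paging someone.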

For remediation, teams choose between two approaches: auto-correct (run terraform apply to revert the drift and restore the declared state) or codify (update the Terraform code to match the manual change and commit it as the new desired state). The choice depends on whether the manual change was a mistake or an intentional fix that should be preserved.
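The two remediation paths can be expressed as a small helper that either builds a Terraform command or hands the work to a human. This is a minimal sketch: remediation_command is a hypothetical name, and scoping the apply with -target is one possible design that limits the revert to the drifted resource. Terraform's own guidance treats -target as an exceptional tool, so a full apply after review may suit some teams better.

```python
def remediation_command(address: str, mode: str):
    """Return the command for an auto-correct, or None when a human must codify.

    `mode` is either "auto-correct" or "codify".
    """
    if mode == "auto-correct":
        # Re-apply the declared configuration to the drifted resource only.
        return ["terraform", "apply", "-auto-approve", f"-target={address}"]
    if mode == "codify":
        # No command to run: someone updates the .tf code and opens a change.
        return None
    raise ValueError(f"unknown remediation mode: {mode!r}")
```

The returned list can be passed to subprocess.run from the working directory of the affected root module.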

We also configure AWS Config Rules or CloudTrail alerts as a real-time complement to the scheduled Terraform scans. These catch manual changes within minutes rather than waiting for the next scheduled scan. The alerts include the IAM principal that made the change, making it easy to follow up and codify the change if it was intentional.
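For the CloudTrail side, the sketch below shows how an alert handler might pull the IAM principal out of an event record and skip changes made by Terraform itself. The field names (userIdentity, eventName, eventSource, eventTime, readOnly, userAgent) come from the CloudTrail record format. Matching on the substring "Terraform" in the user agent is an assumption about how your pipeline's requests are labeled, and manual_change is a hypothetical helper name.

```python
def manual_change(event: dict, terraform_marker: str = "Terraform"):
    """Return a summary of a write made outside Terraform, else None."""
    # Read-only API calls cannot cause drift.
    if event.get("readOnly") is True:
        return None
    # Assumption: requests from the Terraform pipeline carry this marker.
    if terraform_marker in event.get("userAgent", ""):
        return None
    identity = event.get("userIdentity", {})
    return {
        "principal": identity.get("arn", "unknown"),
        "action": event.get("eventName"),
        "service": event.get("eventSource"),
        "time": event.get("eventTime"),
    }
```

The returned summary is what goes into the Slack or PagerDuty message, so whoever follows up knows who to ask about the change.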

What You Get

A complete drift detection and remediation system:

Scheduled drift scans — daily or weekly terraform plan across all root modules
Severity classification — critical, important, and low-priority drift categories
Targeted alerts — Slack/PagerDuty for critical drift, tickets for everything else
Remediation workflows — auto-correct or codify options per drift category
Real-time detection — CloudTrail/Config Rules for immediate manual change alerts
Drift dashboard — summary view of drift status across all environments
Runbook — procedures for investigating and resolving drift of each severity level

Why Anubiz Engineering

100% async — no calls, no meetings

Delivered in days, not weeks

Full documentation included

Production-grade from day one

Security-first approach

Post-delivery support included
