Infrastructure as Code

Terraform State Management — The Part Everyone Gets Wrong

Terraform state is the most critical and most misunderstood part of any Terraform setup. It maps your code to real cloud resources. Corrupt it and Terraform loses track of your infrastructure. Store it locally and you cannot collaborate. Forget to lock it and concurrent applies corrupt it silently. We set up bulletproof state management and fix existing state problems so you never lose sleep over terraform.tfstate again.


Why State Management Matters

Terraform state is a JSON file that maps every resource in your code to a real resource in your cloud account. When you run terraform plan, Terraform reads the state to determine what exists, compares it to your code, and calculates the diff. Without accurate state, Terraform cannot manage your infrastructure — it will try to create resources that already exist or fail to update resources it does not know about.

By default, Terraform writes state to a local file in your working directory. This is fine for learning but dangerous for production: the file lives on one machine, so teammates work from stale or missing copies, and nothing coordinates their runs. When state is shared without locking and two engineers run terraform apply at the same time, both read the same state, both make changes, and the second apply overwrites the state written by the first, silently losing track of the resources the first apply created. This is not a theoretical risk; it happens regularly on teams without state locking.

State also contains sensitive data. Database passwords, private keys, and other secrets passed as Terraform variables end up in the state file in plaintext once they are assigned to resource attributes. Marking a variable as sensitive only hides its value from CLI output; it does not encrypt it in state. If your state is stored in an unencrypted S3 bucket or committed to Git, those secrets are exposed. State encryption at rest is not optional; it is a security requirement.
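To make the exposure concrete, here is a minimal sketch of how a secret reaches state. The variable name, database identifier, and instance settings are illustrative assumptions, not taken from a real project.

```hcl
# Hypothetical example: names and sizes are placeholders.
variable "db_password" {
  type      = string
  sensitive = true # redacts the value in plan/apply output only
}

resource "aws_db_instance" "example" {
  identifier          = "example-db"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "app"
  password            = var.db_password # written to state as a plain string
  skip_final_snapshot = true
}
```

After an apply, the password attribute of this resource is readable by anyone who can read the state file, which is why the backend itself has to be encrypted and access-controlled.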

We configure remote state backends with encryption, locking, and access controls as the foundation of every Terraform engagement. For existing setups with state problems, we perform state surgery to fix corruption, migrate between backends, and restructure state to match your evolving architecture.

Remote Backend Setup

For AWS, we configure an S3 backend with DynamoDB locking. The state bucket has versioning enabled (so you can recover previous state versions), server-side encryption with KMS, and a bucket policy that restricts access to specific IAM roles. The DynamoDB table handles state locking and needs a string partition key named LockID. Terraform 1.10 and later can also lock natively in S3 through the backend's use_lockfile option, which removes the need for the table. We create this infrastructure using a bootstrap module that is the only Terraform code you run manually; everything else goes through CI/CD.
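The bootstrap described above might look roughly like the following sketch. It assumes AWS provider v4 or later; the bucket name, table name, and region are placeholders, and the bucket policy restricting access to specific IAM roles is left out for brevity.

```hcl
# Hypothetical bootstrap configuration: all names are placeholders.
provider "aws" {
  region = "us-east-1"
}

resource "aws_kms_key" "state" {
  description         = "Encrypts Terraform state objects"
  enable_key_rotation = true
}

resource "aws_s3_bucket" "state" {
  bucket = "example-tfstate-bucket"
}

# Versioning lets you roll back to an earlier state object.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Default encryption: every state object is encrypted with the KMS key.
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.state.arn
    }
  }
}

# Lock table: the S3 backend expects a string partition key named LockID.
resource "aws_dynamodb_table" "locks" {
  name         = "example-tfstate-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Each configuration that stores its state there would then declare a backend block along these lines, in its own directory and with its own key (the key path shown is an assumed layout):

```hcl
terraform {
  backend "s3" {
    bucket         = "example-tfstate-bucket"
    key            = "prod/network/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "example-tfstate-locks"
  }
}
```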

For GCP, the backend uses a GCS bucket with object versioning and uniform bucket-level access. Locking is built into the GCS backend natively. Authentication uses Workload Identity Federation for CI/CD and application-default credentials for local development.
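A GCS backend block is short, since locking needs no extra resources. This is a minimal sketch; the bucket name and prefix are placeholders, and the bucket is assumed to exist already with versioning and uniform bucket-level access enabled.

```hcl
terraform {
  backend "gcs" {
    bucket = "example-tfstate-bucket" # placeholder bucket name
    prefix = "prod/network"           # state objects are stored under this prefix
  }
}
```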

For Azure, we use an Azure Storage Account with blob versioning and a storage account key stored in Key Vault. State locking uses the Azure Blob lease mechanism, which is automatic when using the azurerm backend.
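An azurerm backend block could look like the sketch below. The resource group, storage account, container, and key names are placeholders. The access key is deliberately absent from the file; one option is to read it from Key Vault in the pipeline and supply it through the ARM_ACCESS_KEY environment variable, which the backend reads.

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "example-tfstate-rg"
    storage_account_name = "exampletfstate"
    container_name       = "tfstate"
    key                  = "prod/network.terraform.tfstate"
    # No access_key here: supplied at run time via ARM_ACCESS_KEY.
  }
}
```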

Regardless of provider, we configure the backend following the principle of least privilege. The CI/CD service account can read and write state. Developers can read state (for terraform plan) but cannot write state directly, only through the CI/CD pipeline. Because terraform plan takes a state lock by default, read-only users either get write access to the lock alone or run plan with -lock=false. This prevents accidental local applies that bypass the review process.
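For the AWS case, the read/write split can be sketched as two IAM policy documents. The bucket and table ARNs, the account ID, and the document names are placeholders; with a KMS-encrypted bucket, both roles would also need the matching KMS permissions, which are omitted here.

```hcl
provider "aws" {
  region = "us-east-1"
}

# Developers: read state only. With no access to the lock table,
# they would run "terraform plan -lock=false".
data "aws_iam_policy_document" "state_read_only" {
  statement {
    sid       = "ListStateBucket"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::example-tfstate-bucket"]
  }

  statement {
    sid       = "ReadStateObjects"
    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::example-tfstate-bucket/*"]
  }
}

# CI/CD: read and write state, acquire and release locks.
data "aws_iam_policy_document" "state_read_write" {
  statement {
    sid       = "ListStateBucket"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::example-tfstate-bucket"]
  }

  statement {
    sid       = "ReadWriteStateObjects"
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["arn:aws:s3:::example-tfstate-bucket/*"]
  }

  statement {
    sid = "ManageStateLocks"
    actions = [
      "dynamodb:DescribeTable",
      "dynamodb:GetItem",
      "dynamodb:PutItem",
      "dynamodb:DeleteItem",
    ]
    resources = ["arn:aws:dynamodb:us-east-1:111122223333:table/example-tfstate-locks"]
  }
}
```

Each document's json attribute can then be attached to the corresponding role with an aws_iam_policy or aws_iam_role_policy resource.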

State Surgery and Migration

Existing Terraform setups often need state surgery. Common scenarios we handle:

State migration: moving from local state to a remote backend, or between remote backends (e.g., Terraform Cloud to S3). We use terraform init -migrate-state and validate the migration with a plan that shows zero changes.
State splitting: a monolithic state file managing 200+ resources needs to be split into smaller, focused state files. We use terraform state mv, typically against pulled local copies when the backend is remote, to move resources between states without destroying and recreating them.
Resource renaming: when you refactor your code (moving resources into modules, renaming resources), Terraform plans a destroy followed by a create. We use moved blocks (Terraform 1.1+) or terraform state mv to update the state without affecting real resources.
Import missing resources: resources created manually that need to be brought into state. We use terraform import or import blocks (Terraform 1.5+) to add them without modifying the real resources.
State corruption recovery: when state is corrupted by concurrent applies or interrupted operations, we restore from versioned backups and reconcile with the actual cloud state.

Every state operation is performed in a maintenance window with a backup taken first. We verify the result with terraform plan — a successful surgery shows zero planned changes.
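The renaming and import scenarios above can be expressed declaratively. The sketch below assumes a bucket that was previously declared as aws_s3_bucket.logs and a second bucket that was created by hand; all resource and bucket names are hypothetical.

```hcl
# Rename: the resource used to be declared as aws_s3_bucket.logs.
# The moved block tells Terraform to update the address in state
# instead of planning a destroy and a create.
moved {
  from = aws_s3_bucket.logs
  to   = aws_s3_bucket.access_logs
}

resource "aws_s3_bucket" "access_logs" {
  bucket = "example-access-logs"
}

# Import: adopt a manually created bucket into state (Terraform 1.5+).
# For aws_s3_bucket, the import ID is the bucket name.
import {
  to = aws_s3_bucket.manual
  id = "example-manually-created-bucket"
}

resource "aws_s3_bucket" "manual" {
  bucket = "example-manually-created-bucket"
}
```

Because both blocks live in code, the change shows up in terraform plan and can go through the same review as any other change. Once the configuration matches the real resources, a follow-up plan should report no further changes.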

Why Anubiz Engineering

100% async — no calls, no meetings

Delivered in days, not weeks

Full documentation included

Production-grade from day one

Security-first approach

Post-delivery support included

Ready to get started?

Skip the research. Tell us what you need, and we'll scope it, implement it, and hand it back — fully documented and production-ready.
