Infrastructure as Code Patterns
Modern infrastructure is no longer about provisioning servers; it's about designing scalable, reusable, and governed systems. As organizations grow, Terraform codebases tend to sprawl, teams step on each other's toes, and deployments become risky.
This post dives into battle-tested Infrastructure as Code (IaC) patterns that help you scale Terraform in enterprise environments.
1. Module Composition (Not Copy-Paste)
At scale, Terraform modules are your unit of abstraction.
Anti-pattern
- Copying .tf files across environments
- Hardcoding environment-specific values
- Monolithic root modules
Pattern: Composable Modules
Structure modules like building blocks:
modules/
  vpc/
  eks/
  rds/
  iam/
Then compose them:
module "vpc" {
  source     = "../modules/vpc"
  cidr_block = var.cidr_block
}

module "eks" {
  source = "../modules/eks"
  vpc_id = module.vpc.id
}
Key principle: Modules should be small, focused, and reusable, not environment-aware.
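As a minimal sketch of what "not environment-aware" can look like inside a module, the vpc module below takes everything it needs through inputs and exposes an output named id, matching the module.vpc.id reference used above. The file layout, the tags variable, and the validation rule are assumptions for illustration, and the sketch assumes the AWS provider is configured by the caller.

```hcl
# modules/vpc/variables.tf (hypothetical layout)
variable "cidr_block" {
  description = "CIDR range for the VPC, supplied by the calling stack."
  type        = string

  validation {
    # cidrhost() errors on malformed input; can() turns that error into false.
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "The cidr_block value must be a valid CIDR range, such as 10.0.0.0/16."
  }
}

variable "tags" {
  description = "Tags passed in by the caller; the module hardcodes none."
  type        = map(string)
  default     = {}
}

# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
  tags       = var.tags
}

# modules/vpc/outputs.tf
output "id" {
  description = "ID of the created VPC, consumed by callers as module.vpc.id."
  value       = aws_vpc.this.id
}
```

Nothing in the module mentions dev, staging, or prod; those differences arrive only through variable values.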
2. Remote State Isolation
State is Terraform's source of truth; treat it like production data.
Anti-pattern
- Shared state across environments
- Local state files
- No locking
Pattern: Isolated Remote State
Use separate state per:
- Environment (dev, staging, prod)
- Region (if needed)
- Critical components (network vs compute)
Example (S3 backend):
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/network/terraform.tfstate"
    region         = "ap-south-1"
    dynamodb_table = "terraform-locks"
  }
}
Why it matters:
- Prevents accidental cross-environment changes
- Enables parallel team workflows
- Reduces blast radius
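Once network and compute live in separate state files, the compute stack still needs values such as the VPC ID. One common way to bridge isolated states is the terraform_remote_state data source, sketched below. It reuses the bucket, key, and region from the backend example above. The output name vpc_id is an assumption: the network stack would have to declare a root-level output with that name.

```hcl
# live/prod/apps/main.tf (hypothetical location of the compute stack)

# Read-only view of the network stack's state; this stack cannot modify it.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "my-terraform-state"
    key    = "prod/network/terraform.tfstate"
    region = "ap-south-1"
  }
}

module "eks" {
  source = "../modules/eks"

  # Assumes the network stack declares: output "vpc_id" { ... }
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

Only root-level outputs of the other stack are visible this way, which keeps the contract between stacks explicit and small.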
3. Environment Promotion Flow
Avoid "terraform apply in prod" chaos.
Anti-pattern
- Direct changes in production
- No promotion pipeline
- Drift between environments
Pattern: Progressive Promotion
dev -> staging -> prod
Each stage:
- Same code
- Different variables
- Validated before promotion
Example variable separation:
envs/
  dev.tfvars
  staging.tfvars
  prod.tfvars
CI/CD flow:
terraform plan -var-file=envs/dev.tfvars -out=dev.tfplan
terraform apply dev.tfplan
Key principle: Promote artifacts (plans or commits), not ad-hoc changes.
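To make "same code, different variables" concrete, here is a minimal sketch. The variable names (environment, instance_count) and their values are hypothetical; the point is that the declarations are shared and only the .tfvars file changes between stages.

```hcl
# variables.tf (identical for every environment)
variable "environment" {
  description = "Deployment stage this run targets."
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment value must be one of dev, staging, or prod."
  }
}

variable "instance_count" {
  description = "Number of instances to run in this environment."
  type        = number
}
```

```hcl
# envs/prod.tfvars (dev.tfvars and staging.tfvars set the same keys with smaller values)
environment    = "prod"
instance_count = 3
```

Because every stage validates against the same declarations, a typo such as environment = "production" fails at plan time instead of creating a differently named stack.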
4. Drift Detection
Infrastructure drift is silent but dangerous.
What causes drift?
- Manual console changes
- Hotfixes in production
- External automation
Pattern: Continuous Drift Detection
Run scheduled checks:
terraform plan -detailed-exitcode
Exit codes:
- 0: No changes
- 1: Error (the plan itself failed)
- 2: Changes present, which in a scheduled run means drift
Automation example:
- Nightly GitHub Actions / GitLab CI job
- Alert on drift
Bonus:
- Integrate with Slack alerts
- Add visibility in observability dashboards
5. Governance Gates
As teams scale, you need guardrails, not just trust.
Anti-pattern
- Anyone can apply anything
- No policy enforcement
- Security checks after deployment
Pattern: Policy-as-Code
Use tools like:
- Sentinel (Terraform Cloud)
- OPA (Open Policy Agent)
Example rules:
- No public S3 buckets
- Enforce tagging
- Restrict instance types
Pseudo-policy:
deny if resource.aws_s3_bucket.public == true
CI/CD gate:
terraform validate -> security scan -> policy check -> apply
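The three example rules above can also be partly encoded in the Terraform code itself, as a first line of defense before a policy engine runs. The sketch below is not Sentinel or OPA; it is a complementary set of in-code guardrails using standard AWS provider features. The allowed instance types, variable names, and bucket name are hypothetical.

```hcl
variable "owner" {
  type = string
}

variable "environment" {
  type = string
}

# Restrict instance types: rejected at plan time, before any policy check runs.
variable "instance_type" {
  type = string

  validation {
    condition     = contains(["t3.micro", "t3.small", "t3.medium"], var.instance_type)
    error_message = "The instance_type value must be one of t3.micro, t3.small, or t3.medium."
  }
}

# Enforce tagging: default_tags applies to every taggable resource from this provider.
provider "aws" {
  region = "ap-south-1"

  default_tags {
    tags = {
      owner       = var.owner
      environment = var.environment
    }
  }
}

resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket" # hypothetical name
}

# No public S3 buckets: block every form of public access on the bucket.
resource "aws_s3_bucket_public_access_block" "logs" {
  bucket                  = aws_s3_bucket.logs.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

Guardrails like these only cover code that opts into them, so a policy check in the pipeline is still needed to catch stacks that do not.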
6. Reusable Infrastructure Blueprints
Think beyond modules: build platform-level abstractions.
Pattern: Opinionated Blueprints
Example:
blueprints/
  eks-platform/
  microservice-stack/
  data-platform/
Each blueprint includes:
- Networking
- IAM roles
- Monitoring
- Logging
- Security defaults
Usage:
module "service" {
  source       = "../blueprints/microservice-stack"
  service_name = "payments"
  environment  = "prod"
}
Outcome:
- Faster onboarding
- Consistent architecture
- Reduced cognitive load
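Internally, a blueprint is typically just a module that wires smaller modules and resources together with opinionated defaults. The sketch below shows what a slice of microservice-stack might contain, covering only IAM and logging. The inputs of the iam module (name, tags), the log group naming scheme, and the 30-day retention are assumptions for illustration.

```hcl
# blueprints/microservice-stack/main.tf (hypothetical internals)
variable "service_name" {
  type = string
}

variable "environment" {
  type = string
}

locals {
  # One naming and tagging convention, applied to everything the blueprint creates.
  name_prefix = "${var.service_name}-${var.environment}"
  common_tags = {
    service     = var.service_name
    environment = var.environment
    managed_by  = "terraform"
  }
}

# Reuses a small building-block module; its inputs here are hypothetical.
module "iam" {
  source = "../../modules/iam"
  name   = local.name_prefix
  tags   = local.common_tags
}

# Logging default baked into the blueprint so service teams do not have to choose.
resource "aws_cloudwatch_log_group" "service" {
  name              = "/services/${local.name_prefix}"
  retention_in_days = 30
  tags              = local.common_tags
}

output "log_group_name" {
  value = aws_cloudwatch_log_group.service.name
}
```

The caller only supplies service_name and environment, as in the usage example above; naming, tagging, and retention decisions stay inside the blueprint.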
7. Team Collaboration Model
Terraform isn't just code; it's a shared responsibility.
Recommended workflow
- Git-driven changes
  - PR-based workflow
  - Mandatory reviews
- Plan visibility
  - Post terraform plan output in PR comments
- Controlled applies
  - Only via CI/CD
  - No local applies in production
- Ownership boundaries
  - Teams own specific modules or stacks
Reference Architecture
A scalable Terraform repo might look like:
terraform/
  modules/
  blueprints/
  envs/
    dev/
    staging/
    prod/
  live/
    dev/
      network/
      apps/
    prod/
      network/
      apps/
Failure Scenarios (and How These Patterns Help)
| Problem | Without Patterns | With Patterns |
|---|---|---|
| Accidental prod change | High risk | Isolated state + gated CI |
| Drift | Undetected | Automated detection |
| Team conflicts | Frequent | State + ownership boundaries |
| Slow delivery | Reinventing infra | Reusable blueprints |
Cost vs Complexity Trade-off
| Pattern | Complexity | Benefit |
|---|---|---|
| Modules | Low | Reuse |
| Remote State | Medium | Safety |
| Promotion Flow | Medium | Reliability |
| Governance | High | Compliance |
| Blueprints | High | Scale |
Final Thoughts
Terraform at small scale is easy.
Terraform at enterprise scale is a systems design problem.
The shift is:
- From writing resources -> designing platforms
- From applying changes -> governing change
- From scripts -> infrastructure products
If you get these patterns right, your IaC becomes:
- Predictable
- Auditable
- Scalable
And most importantly, boring in production (which is exactly what you want).
Next Steps
If you're evolving your platform:
- Start by isolating state
- Introduce module boundaries
- Add drift detection early
- Gradually layer governance
You don't need all patterns at once, but you will need them eventually.