Your first instinct — especially if you've come through courses, reference architectures, or cloud conferences — is usually the same:
It sounds smart. It looks professional. And it is one of the most dangerous decisions a platform engineer can make.
Why "Just Rewrite It" Is a Trap
That platform you're frustrated with is not theoretical. It has been running real production workloads for months — sometimes years. It survived:
- Incidents and outages
- Compliance audits
- Security exceptions
- Scaling challenges
- Real customer pressure
Every ugly conditional, strange module boundary, or pipeline workaround likely exists because something broke in the past. That platform is the reason applications deploy today, teams deliver features, and the business generates revenue.
Engineering Maturity Starts With Respect
When I mentor platform engineers struggling with legacy Infrastructure as Code or DevOps setups, the first thing I do is stop the wrong kind of enthusiasm. Not because improvement is bad — but because reckless improvement is expensive.
Real expertise is improving a live platform without breaking the business.
The "Big Rewrite" Fallacy
A full rewrite assumes you can:
- Rediscover years of hidden business rules
- Pause feature delivery without business impact
- Not miss edge cases that were learned the hard way
- Deliver something strictly better — not just prettier
In reality, big rewrites often result in lost functionality, new security gaps, more incidents, and — most painfully — lost trust from stakeholders.
How Strong Platform Engineers Handle Legacy Platforms
Understand Before You Judge — Chesterton's Fence
Before removing anything, ask why it exists. That strange Terraform condition or pipeline exception is probably there for a reason:
- A regulatory requirement someone negotiated under pressure
- A workaround for a critical enterprise customer
- A scar from a historical incident that took the platform down
Use the Strangler Pattern for Platforms
You don't rebuild a landing zone overnight. Instead:
- Introduce new, well-designed Terraform modules alongside the old ones
- Create new pipelines with better standards for new workloads
- Let new workloads adopt the new patterns from day one
Apply the Boy Scout Rule to IaC
Every time you touch the platform — even for a small fix — improve something:
- Fix a variable name
- Extract a reusable module
- Remove duplication
- Add a missing comment or README entry
Build Safety Nets Before You Refactor
In platform engineering, safety nets are non-negotiable before touching anything:
terraform planvalidations in every pipeline- Policy-as-Code to catch drift and violations
- Drift detection on live environments
- Integration and canary environments to validate changes safely
The Real Takeaway
Engineering maturity is not about hating legacy platforms. It's about respecting what kept the company alive — and evolving it carefully.
This mindset shift is one of the hardest lessons in platform engineering:
From "I know best practices" → To "I know how to apply them in production."
That shift separates engineers who can design platforms from engineers who can own and evolve them responsibly. And that is what real Platform Engineering is about.

RSS Feed