Updated 7 May 2026 • 5 min read

Tagging is intuitive but it scales poorly. As infrastructure grows, the cost of maintaining tag coverage grows faster than the value tagging produces, creating the Tagging Tax. We define the curve, walk through the four failure modes we see most often, and present a tagless allocation framework that uses ownership graphs, deployment metadata, and behavioural signals to allocate cost without depending on tag hygiene.
If you have ever sat through a cost allocation steering committee, you already know the conversation. Someone will say "we just need better tags." Someone else will agree. A policy will be drafted. Six months later, tag coverage will still be at 64 percent, the Athena queries will still be brittle, and the showback report will still have an Untagged bucket eating 30 percent of the bill. The committee will reconvene and say "we just need better tags."
We have watched this loop play out in dozens of organisations, from twenty-person startups to global enterprises with thousands of accounts. The pattern is consistent enough that we gave it a name: the Tagging Tax Curve. This article unpacks why the curve exists, why it gets worse with scale rather than better, and what the alternative looks like. The short version is that tagging is not the foundation of cost allocation. It is one signal among many, and treating it as the foundation is exactly why so many programs stall.
The Tagging Tax is the gap between the cost of maintaining tag coverage and the value tag coverage produces. In small environments, the cost is low and the value is high, so tagging works. As environments grow, three things happen simultaneously. Resource volume grows linearly. The number of teams creating resources grows linearly. The number of resource types and IaC patterns grows non-linearly because of new services, new accounts, and new acquisitions.
Maintaining tag coverage in this environment requires policy enforcement, automated remediation, exception handling, audit cycles, and continuous education. The cost grows faster than linearly. The value, meanwhile, plateaus, because once you have allocated 90 percent of spend, the remaining 10 percent is the hardest and least valuable to chase.
That crossover point, where maintenance cost exceeds incremental value, is the knee of what we call the Tagging Tax Curve. We have measured it in real environments. It typically arrives around the 600-account or 50,000-resource mark, though the exact threshold depends on team structure and IaC maturity. After that point, tagging programs feel exhausting because they are exhausting. The math has flipped against you.
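As a toy illustration of why the curve flips, consider a super-linear maintenance-cost function crossed with a saturating value function. Every parameter below is invented for the sketch, not a measured value; the point is only the shape of the two curves.

```python
import math

def maintenance_cost(resources: int, base: float = 0.8) -> float:
    """Toy super-linear cost of keeping tags accurate: policy
    enforcement, remediation, audits, continuous education."""
    return base * resources * math.log(max(resources, 2))

def allocation_value(resources: int, cap: float = 500_000.0) -> float:
    """Toy value curve: saturates once most spend is already allocated,
    because the last few percent are the hardest to chase."""
    return cap * (1.0 - math.exp(-resources / 20_000))

def crossover(step: int = 1_000, limit: int = 200_000) -> int:
    """First resource count at which maintenance cost exceeds value."""
    for n in range(step, limit + 1, step):
        if maintenance_cost(n) > allocation_value(n):
            return n
    return -1

# With these toy parameters the flip lands in the tens of thousands
# of resources, matching the order of magnitude described above.
print(f"curve flips around {crossover():,} resources")
```

Tuning the constants moves the knee earlier or later, which is exactly what team structure and IaC maturity do in real environments.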
Every failed tagging program we have audited fits into one of four buckets. Understanding which bucket you are in matters because the remediation is different for each.
The first failure mode is tag drift. Teams agree on a taxonomy, document it, and then the taxonomy drifts as services evolve, ownership changes, and new patterns emerge. Six months in, "team" means three different things in three different parts of the organisation, and joins across the dataset stop working.
The second failure mode is untaggable resources. Network traffic, data transfer, support charges, RI and Savings Plan amortisation, marketplace subscriptions, and shared services like KMS and CloudTrail simply cannot be tagged at the source. They show up in the bill, they cost real money, and they are invisible to any tag-based allocation. This category alone often represents 25 to 40 percent of spend. Our breakdown of shared cost allocation patterns covers this in depth.
The third failure mode is late tagging. Resources get created during incidents, hackathons, migrations, and PoCs without tags. They run for weeks or months before anyone notices. Retroactive tagging is possible but expensive, and the cost data for the untagged window is permanently ambiguous.
The fourth failure mode is tag conflict. Two systems tag the same resource differently. CI/CD pipelines tag with one schema, Terraform modules with another, manual operators with a third. The cost data ends up with three competing answers to "who owns this?", and finance has to pick one, usually arbitrarily.
The shift we advocate is conceptual. Stop treating tags as the source of truth for ownership. Treat tags as one input signal among several, and reconstruct ownership from a richer context graph.
The signals we combine are these:
The deployment graph tells us which IaC repository, pipeline, and commit created each resource. This is observable from CloudTrail, Terraform state, and Git metadata, with no dependency on the tag being correct. If a resource was deployed by the payments-service Terraform module, we know it belongs to the payments team regardless of whether the tag was applied.
The behavioural graph tells us which workloads talk to which other workloads. VPC flow logs, service mesh telemetry, and database connection metadata reveal the actual blast radius of each resource. A database that is only queried by the checkout service is, in practice, a checkout service resource.
The identity graph tells us which IAM role, SSO group, or human user is operating each resource. This is observable from CloudTrail and access logs, and it is far more stable than a tag because it reflects actual usage rather than declared intent.
The organisational graph tells us how teams map to services, codebases, on-call rotations, and budget owners. This is usually maintained outside the cloud, in tools like Backstage, Opsgenie, or simple spreadsheets, and it can be joined back to the cloud signals.
Combining these four graphs produces an allocation answer for every dollar of spend, including the dollars that tags could never reach. Untaggable categories like data transfer can be allocated by tracing them through the behavioural graph. Shared services can be allocated by usage rather than by even split. Resources created during incidents can be attributed by the IAM identity that created them.
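A minimal sketch of how such a combination might score ownership candidates. The `Signal` type, the graph weights, and the example values are illustrative assumptions, not a description of a production allocation engine:

```python
from dataclasses import dataclass

@dataclass
class Signal:
    graph: str         # "deployment", "behavioural", "identity", or "org"
    owner: str         # team this graph attributes the resource to
    confidence: float  # 0..1 strength of the observation

# Hypothetical trust weights: deployment evidence counts most,
# the static org mapping least.
GRAPH_WEIGHTS = {"deployment": 1.0, "identity": 0.8,
                 "behavioural": 0.6, "org": 0.4}

def allocate(signals: list[Signal]) -> str:
    """Return the owner backed by the most combined signal weight."""
    scores: dict[str, float] = {}
    for s in signals:
        scores[s.owner] = scores.get(s.owner, 0.0) + GRAPH_WEIGHTS[s.graph] * s.confidence
    return max(scores, key=scores.get)

signals = [
    Signal("deployment", "payments", 0.9),   # created by a payments Terraform module
    Signal("identity", "payments", 0.7),     # operated by a payments IAM role
    Signal("behavioural", "checkout", 0.5),  # mostly queried by the checkout service
]
print(allocate(signals))  # → payments
```

The useful property is that no single missing or wrong signal breaks the answer; the graphs corroborate or outvote each other.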
| Dimension | Tag-Based Allocation | Tagless Allocation |
|---|---|---|
| Source of ownership | Manually applied tags | Deployment, behavioural, identity, and org graphs |
| Coverage at scale | Plateaus at 70 to 90 percent | Approaches 100 percent, including untaggable spend |
| Maintenance cost | Grows non-linearly with environment size | Largely automated once instrumented |
| Resilience to drift | Low, drift accumulates silently | High, signals are observed continuously |
| Handles untaggable resources | No | Yes, through behavioural attribution |
| Handles shared services | Only by manual rules | Yes, through usage-based attribution |
| Time to first useful allocation | Months of tag remediation | Days, using existing telemetry |
| Engineering burden | Continuous, distributed across teams | Centralised in the allocation engine |
The difference is not subtle. Once we make the switch, organisations stop running tag remediation programs and start running allocation programs. The conversation moves from "why is your tag coverage low?" to "here is your team's spend, here is how it changed, here is what is driving it." This is the conversation FinOps is supposed to enable.
The Tagging Tax Curve is real, and most cost allocation programs run head-first into it without realising what they are fighting. The good news is that the curve is not a law of physics. It is a consequence of treating one signal as the entire foundation. When we widen the foundation to include deployment, behaviour, identity, and organisational context, allocation becomes a continuous observation problem rather than a continuous policy enforcement problem. The work shifts from chasing engineers for tags to delivering insight to engineers about their spend. That is the version of FinOps that actually works at scale, and it is the version we build for our customers every day.
**Does tagless allocation mean we should stop tagging?**

No. Tagless allocation uses tags as one input among many. Your existing tags continue to add value. The difference is that allocation no longer breaks when tag coverage is incomplete.
**How long does it take to see results?**

For most environments, the first useful allocation report can be produced within two to three weeks. Refinement continues over the following quarter as edge cases are addressed.
**What about shared resources like data lakes?**

Tagless allocation handles these by attributing usage based on observed access patterns. The data lake cost gets distributed across the teams whose queries actually consumed it, weighted by query volume or scanned bytes.
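A usage-weighted split like this reduces to simple proportional allocation. A minimal sketch, with hypothetical team names and figures:

```python
def split_shared_cost(total_cost: float,
                      usage_by_team: dict[str, float]) -> dict[str, float]:
    """Distribute a shared bill (e.g. a data lake) in proportion to
    observed usage, such as bytes scanned per team."""
    total_usage = sum(usage_by_team.values())
    return {team: total_cost * usage / total_usage
            for team, usage in usage_by_team.items()}

# Hypothetical month: a $12,000 data lake bill, split by bytes scanned.
# checkout scanned half the bytes, so it bears half the cost.
shares = split_shared_cost(12_000, {"checkout": 6e12,
                                    "fraud": 3e12,
                                    "analytics": 3e12})
print(shares)
```

Weighting by query volume instead of scanned bytes only changes the `usage_by_team` input, not the mechanism.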
**Will this break our existing reporting?**

Not at all. The allocation engine produces a unified ownership view that can be joined back into Athena, QuickSight, or any existing reporting tool. Tag-based filters continue to work alongside the graph-based attribution.