Updated 1 Jun 2026 • 8 mins read

AI teams are moving training workloads to Nebius for the 57% price gap. That spend never reaches AWS Cost Explorer or legacy FinOps tools, breaking chargeback and budget governance. This guide covers the price math, why tooling breaks, and a governance-safe migration and unified showback playbook.
The price gap is the entire reason this migration is happening. Here is the verified per-GPU-hour comparison on identical NVIDIA H100 80GB hardware, pulled from each provider's live public pricing on May 18, 2026.
| Provider | Instance (H100 80GB) | On-Demand ($/GPU-hr) | vs AWS | Source |
|---|---|---|---|---|
| Nebius | NVIDIA HGX H100 (8-GPU) | $2.95 | 57.1% cheaper | nebius.com/prices |
| Nebius (Preemptible) | NVIDIA HGX H100 (8-GPU) | $1.25 | 81.8% cheaper | nebius.com/prices |
| Lambda Labs | H100 SXM (8x) | $3.99 | 42.0% cheaper | lambda.ai |
| Crusoe | H100 80GB HGX | $3.90 | 43.3% cheaper | crusoe.ai |
| CoreWeave | HGX H100 (8-GPU) | $6.155 | 10.5% cheaper | coreweave.com |
| AWS | p5.48xlarge ($55.04/hr ÷ 8) | $6.88 | Baseline | AWS Price List API |
| Azure | Standard_ND96isr_H100_v5 | $12.29 | 78.6% more | Azure Retail Prices API |
GCP is excluded from the table because the H100 a3-highgpu-8g price is JavaScript-rendered on the GCP pricing page and not exposed via the public Cloud Billing Catalog without an API key. We refuse to publish unverified numbers.
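The "vs AWS" column can be reproduced from the listed rates. The following is a minimal sketch that recomputes each percentage against the AWS per-GPU-hour baseline; the function name `pct_vs_baseline` is illustrative, and GCP is left out because no verified price is available.

```python
# On-demand list prices from the table above, in USD per GPU-hour.
PRICES = {
    "Nebius": 2.95,
    "Nebius (Preemptible)": 1.25,
    "Lambda Labs": 3.99,
    "Crusoe": 3.90,
    "CoreWeave": 6.155,
    "AWS": 55.04 / 8,  # p5.48xlarge hourly rate spread across its 8 GPUs
    "Azure": 12.29,
}


def pct_vs_baseline(price: float, baseline: float) -> float:
    """Signed percentage difference from the baseline; negative means cheaper."""
    return round((price - baseline) / baseline * 100, 1)


baseline = PRICES["AWS"]
for provider, price in PRICES.items():
    delta = pct_vs_baseline(price, baseline)
    if provider == "AWS":
        label = "baseline"
    elif delta < 0:
        label = f"{-delta}% cheaper"
    else:
        label = f"{delta}% more"
    print(f"{provider:<22} ${price:.3f}  {label}")
```

Keeping the baseline as `55.04 / 8` rather than a hard-coded `6.88` makes the per-instance to per-GPU conversion explicit.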
The $2.95 vs $6.88 comparison is real, but it is only the GPU compute line. The total cost of a training run also includes storage, egress, networking, idle time, and engineering hours. Here is the honest cost stack for a 30-day Llama 3 70B fine-tune on 8x H100:
| Cost Component | AWS p5.48xlarge | Nebius HGX H100 | Notes |
|---|---|---|---|
| GPU Compute (720 hr × 8 GPU) | $39,629 | $16,992 | $6.88 vs $2.95 per GPU-hr, verified |
| Object Storage (5 TB dataset + checkpoints) | S3: ~$115/mo | Published rate applies | Both under $200/mo at this scale |
| Egress to Fetch Base Model Weights Once | $0 (internal) | $0 if uploaded directly | Only matters if you move data mid-run |
| Cross-Cloud Egress (if data lives in AWS) | $0 | ~$460 one-time | The hidden tax of moving compute, not data |
| Engineering Hours to Set Up Nebius | $0 (done) | 20–40 hrs, first project | Amortized across future runs |
| Net Delta for a Single 30-Day Run | Baseline | ~$22,000 cheaper | Even after egress and setup, Nebius wins ~55% |
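The compute and egress lines of this cost stack are plain arithmetic, sketched below under the table's own inputs (720 hours, 8 GPUs, the two list rates, and the one-time egress estimate). Storage is left out because it is comparable on both sides, and engineering hours are left out because the table gives them no dollar value. The function and variable names are illustrative.

```python
HOURS = 30 * 24  # 30-day run
GPUS = 8


def gpu_compute_cost(rate_per_gpu_hr: float, hours: int = HOURS, gpus: int = GPUS) -> float:
    """Total GPU compute cost in USD for a run of the given length."""
    return round(rate_per_gpu_hr * hours * gpus, 2)


aws_compute = gpu_compute_cost(6.88)
nebius_compute = gpu_compute_cost(2.95)
cross_cloud_egress = 460.00  # one-time estimate from the table

# Savings from running on Nebius after paying the one-time egress.
net_delta = round(aws_compute - (nebius_compute + cross_cloud_egress), 2)

print(f"AWS compute:    ${aws_compute:,.2f}")
print(f"Nebius compute: ${nebius_compute:,.2f}")
print(f"Net delta:      ${net_delta:,.2f}")
```

Changing `HOURS` or `cross_cloud_egress` shows how quickly the one-time egress cost is dwarfed by the compute difference on longer runs.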
The setup cost matters for the first run and disappears for runs two through twenty. The egress cost matters if training data lives in AWS S3 and you forget to mirror it to neocloud storage. Both are recoverable. The structural problem, the one this article is about, is not the egress bill. It is what happens to allocation, chargeback, and budget governance once GPU spend leaves the AWS ledger.
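Price is the obvious answer, but it is not the only one. Four reasons drive the migration: lower per-GPU-hour price, faster H100 and H200 availability, InfiniBand by default for multi-node training, and simpler procurement that does not require EDP negotiation.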
Shadow GPU spend: AI compute purchased outside the company's primary cloud billing and FinOps tooling, typically on AI-native clouds. Engineering expenses it directly. Finance has no consolidated view of GPU spend across providers.
The legacy FinOps stack (CloudZero, Vantage, Apptio Cloudability, AWS Cost Explorer) was built around three assumptions that all break on Nebius:
1. There is no AWS CUR equivalent for Nebius. Nebius publishes invoices and a billing portal but does not produce an AWS Cost and Usage Report. That means Nebius spend never reaches Cost Explorer, Cost Categories, Savings Plans coverage dashboards, or any of the BI integrations finance teams have wired into CUR.
2. There are no AWS-style resource tags. Nebius lets you label VMs, but the labels do not map to the tag-based allocation logic CloudZero and Vantage use to slice spend by team, product, or customer. Teams adopting Nebius typically have 40 to 60% of GPU spend untagged from a chargeback perspective in the first quarter of adoption.
3. Billing lives in a separate portal. Nebius invoices arrive on a different schedule, in a different format, from a different vendor. Finance reconciliation becomes a manual CSV join: AWS CUR plus Azure cost export plus Nebius invoice plus internal allocation key. By the time the join is done, the month is over.
To understand why this is hard to retrofit, look at what each tool ingests. None of these schemas have a place to put Nebius data:
| Tool | Primary Ingest Format | Native Nebius Support (Mid-2026) | Workaround |
|---|---|---|---|
| AWS Cost Explorer / Cost Categories | AWS CUR (Parquet, hourly) | None. CUR is AWS-only by definition. | Not retrofittable. You need a separate ledger. |
| CloudZero | AWS CUR + Azure + GCP + Snowflake | No public Nebius connector | Manual CSV upload, then CostFormation rules |
| Vantage | AWS CUR + Azure + GCP + ~30 SaaS | Datadog, Snowflake, MongoDB yes. Nebius no. | Custom integration via Vantage API |
| Apptio Cloudability | Hyperscaler billing exports | No native Nebius support | CSV import with manual mapping |
| Kubecost | Prometheus + cloud billing APIs | Works on Nebius K8s, not unified with CUR | Cluster-level only; not cross-provider |
The pattern is consistent. Every tool above was architected on the assumption that “cloud spend” means “hyperscaler spend.” Neocloud invoices are a different shape (no instance-level hourly granularity in some cases, no resource tags, different invoice cadence). Retrofitting a Nebius feed into CloudZero is not a configuration change. It is an architectural change to how the tool partitions, tags, and reconciles spend.
We call this the 3 Allocation Questions framework. Every CFO and FinOps lead loses the ability to answer all three the moment a meaningful chunk of AI workload moves to Nebius (or CoreWeave, Lambda, RunPod, or Crusoe).
Without unified tags and a unified ledger, you cannot answer this. Engineering says “the AI team.” The AI team says “experimentation.” Finance writes it off as R&D and loses chargeback discipline.
AWS Budgets and Azure Cost Management only see their own provider. A team that is 30% under AWS budget and 200% over Nebius budget shows green in two dashboards and red in a third no one is watching.
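A small sketch of the rollup that per-provider budget tools cannot do. The figures are hypothetical and chosen to match the scenario above: the team is 30% under its AWS budget and 200% over its Nebius budget. The function name `team_rollup` and the data shapes are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical month-to-date figures in USD, keyed by (team, provider).
budgets = {("ml-research", "aws"): 100_000, ("ml-research", "nebius"): 20_000}
actuals = {("ml-research", "aws"): 70_000, ("ml-research", "nebius"): 60_000}


def team_rollup(budgets: dict, actuals: dict) -> dict:
    """Sum budget and actual spend per team across all providers."""
    totals = defaultdict(lambda: {"budget": 0.0, "actual": 0.0})
    for (team, _provider), amount in budgets.items():
        totals[team]["budget"] += amount
    for (team, _provider), amount in actuals.items():
        totals[team]["actual"] += amount
    return {
        team: {**t, "over_budget": t["actual"] > t["budget"]}
        for team, t in totals.items()
    }


for team, summary in team_rollup(budgets, actuals).items():
    print(team, summary)
```

With these inputs the AWS view alone looks healthy, while the combined view shows the team over its total budget.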
If 40% of your Nebius H100s sit at 20% GPU utilization while AWS p5 capacity is queue-bound, no FinOps tool today will surface that cross-provider arbitrage. The data lives in two separate Prometheus stacks.
Nebius is not strictly better than AWS for every AI workload. Six characteristics of the workload determine the right answer. This is the matrix we use internally when advising teams:
| Workload Characteristic | Nebius Advantage | AWS Advantage |
|---|---|---|
| Pre-training a Foundation Model (Multi-Week) | Strong. 57% per-GPU-hour savings compound. InfiniBand by default. | Weak. Capacity holds and EDP negotiation slow iteration. |
| Fine-Tuning Open-Source Weights (3–30 Days) | Strong. Cost-dominant. Preemptible H100 at $1.25/GPU-hr. | Weak unless EDP rates approach Nebius list pricing. |
| Production Inference (Latency-Sensitive, SLA-Bound) | Mixed. Lower $/hr, but residency and SLA requirements must be validated. | Strong. Multi-AZ and multi-region SLA support is mature. |
| Regulated Data (HIPAA, SOC 2, GDPR Residency) | Weak by default. Verify Nebius compliance for each workload. | Strong. AWS compliance documentation and certifications are extensive. |
| Tightly Integrated with AWS (S3, Bedrock, SageMaker) | Weak. Cross-cloud egress and latency can offset savings. | Strong. Co-location with S3, Bedrock, and SageMaker provides operational benefits. |
| Bursty Experimentation by a Small AI Team | Strong. Preemptible pricing and immediate availability. | Weak. P5 reservations are often excessive for this usage pattern. |
The honest answer for most enterprises is hybrid: pre-training and fine-tuning go to Nebius, while production inference and regulated workloads stay on the hyperscaler closest to the source data. That is exactly why allocation across providers becomes the dominant FinOps problem.
If you are about to migrate an AI workload to Nebius, run this sequence before the first VM is provisioned. Skipping any step is how the 4-month chargeback gap scenario happens.
Run a 30-day reconciliation drill. Before promoting Nebius from pilot to production, do an end-to-end finance close that includes Nebius spend in the chargeback report. Find the broken joins, the missing labels, and the manual steps before they compound.
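One way to mechanize part of the drill is to compare what the ledger holds for each provider against the invoice total and to surface spend with no allocation label. The sketch below assumes ledger rows are dictionaries with `provider`, `allocation_key`, and `net_cost_usd` fields; the `reconcile` function and the sample figures are hypothetical.

```python
def reconcile(ledger_rows: list, invoice_totals: dict, tolerance: float = 0.01) -> dict:
    """Per provider: gap between invoice and ledger, plus unlabeled spend."""
    findings = {}
    for provider, invoiced in invoice_totals.items():
        rows = [r for r in ledger_rows if r["provider"] == provider]
        ledger_total = sum(r["net_cost_usd"] for r in rows)
        unlabeled = sum(r["net_cost_usd"] for r in rows if not r.get("allocation_key"))
        findings[provider] = {
            "gap_usd": round(invoiced - ledger_total, 2),
            "unlabeled_usd": round(unlabeled, 2),
            "ok": abs(invoiced - ledger_total) <= tolerance and unlabeled == 0,
        }
    return findings


# Hypothetical sample: one Nebius line arrived without a team label.
ledger = [
    {"provider": "nebius", "allocation_key": "ml-research", "net_cost_usd": 16992.00},
    {"provider": "nebius", "allocation_key": "", "net_cost_usd": 460.00},
    {"provider": "aws", "allocation_key": "platform", "net_cost_usd": 115.00},
]
invoices = {"nebius": 17452.00, "aws": 115.00}

for provider, result in reconcile(ledger, invoices).items():
    print(provider, result)
```

A provider passes only when the totals match and every dollar carries a label, which are the two failure modes the drill is meant to expose.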
Across enterprise FinOps programs we have audited, four failure patterns recur when AI teams adopt neocloud GPUs without an allocation strategy:
Commitment planning regresses. Without a combined view, teams over-commit on AWS p5 Savings Plans while paying on-demand Nebius rates for the same workload type.
This is solvable. The fix is architectural, not vendor-specific. Three components:
Pick one (project ID, team code, or product line) and enforce it as a required label on every Nebius VM, AWS EC2 instance, Azure VM, GCP project, and CoreWeave cluster. No label, no provision. This is the single highest-leverage control in unified FinOps.
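The "no label, no provision" rule can be expressed as a guard that runs before any provider-specific provisioning call. This is a sketch only: the label name `team`, the exception class, and the function are assumptions, and the actual enforcement point (policy engine, CI check, or provisioning wrapper) will differ per provider.

```python
REQUIRED_LABEL = "team"  # hypothetical choice of allocation key


class MissingAllocationKey(ValueError):
    """Raised when a resource request lacks the required allocation label."""


def require_allocation_key(labels: dict[str, str], key: str = REQUIRED_LABEL) -> str:
    """Return the allocation key value, or refuse the request if it is absent or blank."""
    value = (labels.get(key) or "").strip()
    if not value:
        raise MissingAllocationKey(f"refusing to provision: label '{key}' is required")
    return value


# Usage: call the guard before handing the request to any provider SDK.
print(require_allocation_key({"team": "ml-research", "env": "train"}))
try:
    require_allocation_key({"env": "train"})
except MissingAllocationKey as err:
    print(err)
```

Treating a blank value the same as a missing label closes the common loophole of satisfying the check with an empty string.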
Ingest AWS CUR, Azure Cost Management API exports, GCP Billing Export, and Nebius CSV invoices into a single warehouse (Snowflake, BigQuery, or Redshift). Reconcile on the allocation key. This is the minimum viable cross-provider view.
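Before rows can be reconciled on the allocation key, each provider's export has to be mapped into one row shape. The sketch below shows that mapping for two providers. The input dictionaries are simplified: the AWS field names follow the ones this article uses, and the Nebius column names (`period_start`, `label_team`, `line_total_usd`) are hypothetical and must be mapped to whatever the real invoice export contains.

```python
from datetime import date


def from_aws_cur(row: dict) -> dict:
    """Map a simplified AWS CUR line item to the shared row shape."""
    return {
        "usage_date": date.fromisoformat(row["UsageStartDate"][:10]),
        "provider": "aws",
        "allocation_key": row.get("resourceTags/user:team", ""),
        "net_cost_usd": round(float(row["UnblendedCost"]), 4),
    }


def from_nebius_invoice(row: dict) -> dict:
    """Map a Nebius invoice line to the shared row shape (hypothetical columns)."""
    return {
        "usage_date": date.fromisoformat(row["period_start"]),
        "provider": "nebius",
        "allocation_key": row.get("label_team", ""),
        "net_cost_usd": round(float(row["line_total_usd"]), 4),
    }


aws_row = from_aws_cur({
    "UsageStartDate": "2026-05-01T00:00:00Z",
    "resourceTags/user:team": "ml-research",
    "UnblendedCost": "55.04",
})
nebius_row = from_nebius_invoice({
    "period_start": "2026-05-01",
    "label_team": "ml-research",
    "line_total_usd": "23.60",
})
print(aws_row)
print(nebius_row)
```

Rows with an empty `allocation_key` are kept rather than dropped, so unlabeled spend stays visible in the warehouse.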
Most FinOps tools were architected for hyperscaler-only environments. Opslyft FinOps360 ingests hyperscaler plus AI-native cloud spend natively and produces per-team, per-product, and per-customer reports across all providers in one view. Read more on cost allocation without perfect tagging and the broader FinOps practice playbook.
Concretely, the unified ledger needs five columns. This is the smallest schema that answers all 3 Allocation Questions across hyperscalers and neoclouds:
| Column | Type | Source per Provider |
|---|---|---|
| usage_date | date | AWS CUR UsageStartDate, Azure usageDateTime, Nebius invoice period |
| provider | string | Literal value: aws, azure, gcp, nebius, coreweave, or lambda |
| allocation_key | string | AWS resourceTags/user:team, Azure tags.team, Nebius VM label team |
| service_category | string | Normalized values: gpu_compute, storage, egress, or other |
| net_cost_usd | numeric(12,4) | AWS UnblendedCost, Azure costInBillingCurrency, Nebius line total |
With that schema in a single warehouse table, the chargeback query becomes a single GROUP BY across provider and allocation_key. Without it, finance is in CSV-join purgatory. The hard part is not the SQL. It is enforcing the allocation_key at provisioning time on every provider.
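To make the single GROUP BY concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for the warehouse. The table name `unified_ledger` and the sample rows are hypothetical; the five columns are the ones in the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE unified_ledger (
        usage_date       TEXT,
        provider         TEXT,
        allocation_key   TEXT,
        service_category TEXT,
        net_cost_usd     NUMERIC
    )
""")

# Hypothetical one-day sample spanning two providers and two teams.
rows = [
    ("2026-05-01", "aws", "ml-research", "gpu_compute", 1320.96),
    ("2026-05-01", "nebius", "ml-research", "gpu_compute", 566.40),
    ("2026-05-01", "nebius", "ml-research", "egress", 460.00),
    ("2026-05-01", "aws", "platform", "storage", 3.83),
]
conn.executemany("INSERT INTO unified_ledger VALUES (?, ?, ?, ?, ?)", rows)

# The chargeback report: spend per team, per provider.
chargeback = conn.execute("""
    SELECT allocation_key, provider, ROUND(SUM(net_cost_usd), 2) AS total_usd
    FROM unified_ledger
    GROUP BY allocation_key, provider
    ORDER BY allocation_key, provider
""").fetchall()

for allocation_key, provider, total_usd in chargeback:
    print(f"{allocation_key:<12} {provider:<8} {total_usd:>10.2f}")
```

The same query shape should carry over to Snowflake, BigQuery, or Redshift with only dialect-level changes; adding `service_category` to the GROUP BY splits GPU compute from egress and storage.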
Opslyft was built for the exact problem this article describes: cloud spend that no longer lives in one place. As AI teams spread training across AWS, Azure, GCP, Nebius, and other neoclouds, Opslyft FinOps360 brings every dollar back into one allocation view. Instead of stitching CSV exports together at month-end, finance and engineering see unified GPU spend in real time.
The platform supports teams across the full FinOps lifecycle:
The goal is simple: keep the 57% savings from Nebius without losing the cost discipline that finance depends on. See FinOps360 or book a 20-minute demo.
The Nebius price gap is real, and the migration off hyperscalers is rational. The risk is not the egress bill. It is the moment GPU spend leaves the AWS ledger and finance loses ownership, budget, and utilization visibility.
Capture the savings, but lock the allocation key and unified ledger first. In a multi-cloud AI world, visibility is the new cost lever.
Nebius lists NVIDIA HGX H100 at $2.95 per GPU-hour on-demand versus AWS p5.48xlarge at $6.88 per GPU-hour, a 57.1% discount. Preemptible Nebius H100 capacity drops to $1.25 per GPU-hour, an 82% discount versus AWS list.
Lower per-GPU-hour price, faster H100 and H200 availability, InfiniBand by default for multi-node training, and simpler procurement that does not require EDP negotiation. The trade-off is finance visibility.
No. Nebius does not produce an AWS Cost and Usage Report and does not flow into AWS Cost Explorer. Spend lives in the Nebius billing portal with its own invoice schedule and must be ingested into your FinOps stack separately.
As of mid-2026, none publish a native Nebius connector. Customers using these tools export Nebius CSV invoices manually and join them to internal allocation keys outside the platform.
Among verified public list prices on May 18, 2026: Nebius preemptible at $1.25 per GPU-hour is the lowest, followed by Nebius on-demand at $2.95, Crusoe at $3.90, Lambda Labs at $3.99, CoreWeave at $6.155, AWS at $6.88, and Azure at $12.29. Preemptible and spot capacity comes with interruption risk that not every workload tolerates.