Guide
AI Has Broken SaaS Unit Economics
The interesting question in SaaS used to be CAC. That era is over.
When AI inference runs at 30–50% of revenue, the management challenge is no longer acquisition. It is gross margin per customer. A $50K ARR contract with $30K of AI spend is a fundamentally different business outcome than a $50K ARR contract with $5K of AI spend. Classical SaaS finance cannot tell you which one you have. It was never designed to.
This guide covers the framework that replaces it: cost per request, contribution margin per customer, and what "healthy" looks like when inference is a line item.
What classical SaaS assumed
The classical unit-economics model worked because it described a business with a specific structure:
- Variable cost per customer was dominated by hosting and support, both scaling sublinearly with revenue.
- Gross margin was structural: set by the product architecture, stable once achieved, and high enough to make the cost side a solved problem.
- The interesting question was acquisition. CAC payback, churn, expansion. That is what the whole canon (LTV/CAC, payback period, magic number) was built to answer.
Those formulas are not wrong. They describe the business that existed before AI inference was a cost of revenue.
What changes with AI
Three things break the classical model.
Variable cost is now significant and customer-specific. A heavy AI user can cost 50% of their MRR in inference. A light user costs 5%. Per-seat or per-account averaging hides which customers are actually profitable.
Gross margin is dynamic, not structural. In classical SaaS, gross margin improves slowly with scale. In AI-native SaaS, it moves week to week with model price changes, traffic mix shifts, agent behavior, and prompt engineering. A new model release can move gross margin two points in a quarter. A misconfigured agent can move it five points in a week. Quarterly review is too slow.
The interesting question shifts from CAC to gross margin per customer. When variable cost is large and customer-specific, "is this customer profitable?" stops being trivial. The number that matters for valuation, retention strategy, and pricing is contribution margin per customer, not ARR. And contribution margin per customer requires per-request cost attribution, which classical SaaS finance never had to build. (See Per-customer attribution.)
The new unit-economics formulas
The replacement framework has four primitives.
Cost per request
The atomic unit. Every LLM call has a cost:
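cost_per_request = (input_tokens × p_in) + (output_tokens × p_out)
where p_in and p_out are the per-token prices for input and output tokens on the model that served the call.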
For agentic workflows, "request" is the user-initiated action, not the individual LLM call. A single user prompt that triggers 200 internal calls has a cost that is the sum of all 200. Cost per request is the only number you can instrument cleanly at the source. Everything else aggregates from it.
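A minimal sketch of that roll-up, assuming each provider call is logged with its token counts and per-token prices. The `LLMCall` type, the function names, and the prices are hypothetical placeholders, not any provider's actual rate card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMCall:
    """One provider call made while serving a user-initiated request."""
    input_tokens: int
    output_tokens: int
    p_in: float   # assumed price in dollars per input token
    p_out: float  # assumed price in dollars per output token


def call_cost(call: LLMCall) -> float:
    """Cost of a single LLM call: tokens times per-token price."""
    return call.input_tokens * call.p_in + call.output_tokens * call.p_out


def cost_per_request(calls: list[LLMCall]) -> float:
    """Cost of the user-initiated request: the sum over every internal call."""
    return sum(call_cost(c) for c in calls)


# Hypothetical prices: $3 per million input tokens, $15 per million output tokens.
P_IN = 3 / 1_000_000
P_OUT = 15 / 1_000_000

# One user prompt that fans out into 200 internal calls.
agent_run = [LLMCall(2_000, 500, P_IN, P_OUT) for _ in range(200)]
print(f"${cost_per_request(agent_run):.2f}")  # $2.70
```

Each call here costs just over a cent, which is easy to ignore in isolation. The request-level sum is the number that shows up in the margin.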
Cost per customer
Sum of cost-per-request over all requests attributable to that customer in a period:
cost_per_customer_month = sum of cost_per_request for that customer in the month
The dimension that matters is customer, not user, not seat. A 50-seat account with one heavy power user can cost more than five light 50-seat accounts combined.
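As a sketch, the aggregation is a group-by on customer and month. The record fields (`customer_id`, `user_id`, `month`, `cost_usd`) are assumed names for illustration. Note that `user_id` is carried on the record but deliberately not part of the key.

```python
from collections import defaultdict


def cost_per_customer_month(records):
    """Sum request cost by (customer, month), ignoring user and seat."""
    totals = defaultdict(float)
    for r in records:
        totals[(r["customer_id"], r["month"])] += r["cost_usd"]
    return dict(totals)


# Hypothetical request-level records.
records = [
    {"customer_id": "acme", "user_id": "u1", "month": "2025-06", "cost_usd": 4.20},
    {"customer_id": "acme", "user_id": "u2", "month": "2025-06", "cost_usd": 0.35},
    {"customer_id": "acme", "user_id": "u1", "month": "2025-06", "cost_usd": 7.10},
    {"customer_id": "globex", "user_id": "u9", "month": "2025-06", "cost_usd": 0.90},
]

for (customer, month), total in sorted(cost_per_customer_month(records).items()):
    print(customer, month, f"${total:.2f}")
```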
Contribution margin per customer
contribution_margin_per_customer = MRR - cost_per_customer_month - other_variable_cost_per_customer
contribution_margin_% = contribution_margin_per_customer / MRR
This is the number that says whether each customer is profitable. Aggregate it across the customer base and you get gross margin in the classical sense, but the useful analysis is the distribution, not the average.
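To make the distribution point concrete, here is a small sketch with three hypothetical customers. The names and dollar figures are invented for illustration only.

```python
from statistics import median


def contribution_margin(mrr, ai_cost, other_variable_cost=0.0):
    """Return (dollars, fraction of MRR) for one customer-month."""
    dollars = mrr - ai_cost - other_variable_cost
    return dollars, dollars / mrr


# Hypothetical customers: (MRR, AI cost, other variable cost) for one month.
customers = {
    "heavy": (20_000, 15_000, 1_000),
    "mid": (10_000, 4_000, 500),
    "light": (5_000, 250, 250),
}

margins = {name: contribution_margin(*row) for name, row in customers.items()}
pct = {name: frac for name, (_, frac) in margins.items()}

# Blended margin: total contribution dollars over total MRR.
blended = sum(d for d, _ in margins.values()) / sum(
    mrr for mrr, _, _ in customers.values()
)

for name, frac in sorted(pct.items(), key=lambda kv: kv[1]):
    print(f"{name:6s} {frac:.0%}")
print(f"median {median(pct.values()):.0%}, blended {blended:.0%}")
```

The blended figure comes out at 40% while the individual customers range from 20% to 90%. That spread is what the average hides.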
Cost per feature / per workflow
cost_per_feature_month = sum of cost_per_request tagged to that feature
contribution_margin_per_feature = feature_revenue_month - cost_per_feature_month - other_variable_cost_per_feature
Feature-level unit economics tells you which features to invest in, which to deprecate, and which to price. Without it, every feature is a black box that may or may not be paying for itself.
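A sketch of the feature-level view, assuming every request carries a `feature` tag and that revenue has already been allocated to features. How to allocate revenue is a modeling choice this guide does not prescribe, so `feature_revenue` is supplied as an input. Feature names and figures are hypothetical.

```python
from collections import defaultdict


def feature_unit_economics(records, feature_revenue, other_variable=None):
    """Per-feature monthly AI cost and contribution margin in dollars."""
    other_variable = other_variable or {}
    cost = defaultdict(float)
    for r in records:
        cost[r["feature"]] += r["cost_usd"]

    result = {}
    for feature, revenue in feature_revenue.items():
        c = cost.get(feature, 0.0)
        result[feature] = {
            "cost": c,
            "contribution_margin": revenue - c - other_variable.get(feature, 0.0),
        }
    return result


# Hypothetical tagged cost records and revenue allocation for one month.
records = [
    {"feature": "summarize", "cost_usd": 1_200.0},
    {"feature": "summarize", "cost_usd": 800.0},
    {"feature": "autodraft", "cost_usd": 3_500.0},
]
feature_revenue = {"summarize": 9_000.0, "autodraft": 3_000.0}

for name, row in feature_unit_economics(records, feature_revenue).items():
    print(name, row)
```

In this invented data, `autodraft` costs more than the revenue allocated to it, which is the kind of feature the analysis is meant to surface.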
A worked example
Take a hypothetical AI-native B2B SaaS company. $32M ARR, 240 customers, average $11K MRR. Looks healthy on the top line.
| Customer cohort | Customers | Avg MRR | Avg AI cost (monthly) | Contribution margin |
|---|---|---|---|---|
| Top decile (heavy users) | 24 | $24,500 | $19,400 | 21% |
| 2nd–4th decile | 72 | $14,800 | $7,200 | 51% |
| 5th–8th decile | 96 | $7,500 | $1,500 | 80% |
| Bottom two deciles (light users) | 48 | $4,200 | $300 | 93% |

The top decile generates about 23% of revenue but only about 9% of contribution. The bottom cohort is small in revenue (about 8%) but runs at a 93% margin. The same business looks like a high-growth SaaS company at the bottom and a commodity reseller at the top.
Three things become obvious from this table that are invisible in aggregate:
- The top decile has a pricing problem, not a cost problem. They are using the product as designed; the pricing is wrong for their usage pattern. The fix is contract restructuring or usage-based pricing, not engineering optimization.
- The middle two cohorts are the strategic core. Profitable, scaling, and the place where retention investment pays off most.
- The aggregate gross margin (about 56% in this example, weighting each cohort by its revenue) hides everything important. It is a useful headline number for investors. It is a useless number for management decisions.
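The cohort shares and the aggregate margin can be recomputed directly from the table rows. The sketch below copies the four rows as given and assumes AI cost is the only variable cost, which is what the table's margin column implies.

```python
# (label, customers, avg MRR, avg monthly AI cost), copied from the table.
cohorts = [
    ("Top decile", 24, 24_500, 19_400),
    ("2nd-4th decile", 72, 14_800, 7_200),
    ("5th-8th decile", 96, 7_500, 1_500),
    ("Bottom cohort", 48, 4_200, 300),
]

total_revenue = sum(n * mrr for _, n, mrr, _ in cohorts)
total_contribution = sum(n * (mrr - cost) for _, n, mrr, cost in cohorts)

for label, n, mrr, cost in cohorts:
    revenue = n * mrr
    contribution = n * (mrr - cost)
    print(
        f"{label:15s} margin {contribution / revenue:4.0%}  "
        f"revenue share {revenue / total_revenue:4.0%}  "
        f"contribution share {contribution / total_contribution:4.0%}"
    )

print(f"aggregate margin {total_contribution / total_revenue:.1%}")
```

Running the same calculation on real cohort data is usually the fastest way to check whether a headline margin and a cohort table tell the same story.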
What "healthy" looks like
No single benchmark fits every company. ICONIQ's 2025 data showed a median AI spend at roughly 30% of revenue, with the heavy tail going above 50%. The ranges below are calibration points, not targets, drawn from what we see across AI-native B2B SaaS today.
| Metric | Concerning | Watching | Healthy |
|---|---|---|---|
| Aggregate gross margin (AI as COGS) | < 40% | 40–55% | 55–75% |
| Top-decile customer contribution margin | < 10% | 10–25% | 25–40% |
| Bottom-decile customer contribution margin | < 70% | 70–85% | 85–95% |
| AI spend as % of revenue | > 50% | 30–50% | 15–30% |
| AI spend growth rate vs. revenue growth rate | AI > 1.5x revenue | AI = 1.0–1.5x revenue | AI < revenue |
The last row is the most important leading indicator. If AI spend is growing faster than revenue, gross margin is compressing every quarter. If AI spend growth is below revenue growth, you are gaining operating leverage on the variable cost side and the business is improving.
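A sketch of that leading indicator, using the thresholds from the last row of the table. The function names are hypothetical, and the sketch assumes revenue is growing, since the ratio is not meaningful otherwise.

```python
def growth_rate(previous: float, current: float) -> float:
    """Period-over-period growth as a fraction (0.20 means +20%)."""
    return (current - previous) / previous


def classify_ai_spend_growth(ai_growth: float, revenue_growth: float) -> str:
    """Bucket AI spend growth relative to revenue growth."""
    if revenue_growth <= 0:
        raise ValueError("ratio is only meaningful when revenue is growing")
    ratio = ai_growth / revenue_growth
    if ratio > 1.5:
        return "concerning"
    if ratio >= 1.0:
        return "watching"
    return "healthy"


# Hypothetical quarter: AI spend $300K -> $390K, revenue $2.0M -> $2.4M.
ai = growth_rate(300_000, 390_000)
rev = growth_rate(2_000_000, 2_400_000)
print(classify_ai_spend_growth(ai, rev))  # watching
```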
The value capture question
Beneath the math is a strategic question every AI-native founder has to answer: how much of the value your product creates gets captured by you versus by the inference provider?
Three patterns are emerging.
The thin-wrapper problem. If your product is mostly a prompt template plus a chat UI, the customer is paying for a marginal value-add over what they could build with an OpenAI key. Gross margin will compress to roughly the value of that marginal effort. The wrapper either thickens or the business commodifies.
The proprietary-data moat. If your product depends on data the customer cannot replicate, you can charge a premium that LLM cost does not erode. Most AI-native B2B winners are following this pattern.
The orchestration moat. If your product wraps LLM calls in workflow orchestration, integrations, and operational knowledge that takes time to replicate, you are charging for the orchestration, not the inference. Gross margins recover as you automate the orchestration cost down.
Unit economics analysis shows which pattern you are in. Top-decile customers at 20% contribution margin means value is not flowing to you fast enough. Top-decile customers at 50% means the math is working.
How to instrument
The data requirements for this framework are not optional. To compute the formulas above, you need:
- Per-request cost record with token counts, model identifier, latency, status, and cost in dollars
- Customer dimension on every request so cost can be aggregated to the customer level
- Feature/workflow dimension on every request for product analysis
- Revenue mapping per customer per period (typically from the billing system)
- Reconciliation against provider invoices so the per-request total matches the bill
Most companies have the per-request cost record somewhere, usually in observability tools. The customer and feature dimensions are the gap; engineering teams have to instrument them deliberately. Revenue mapping and invoice reconciliation are the finance integration that closes the loop, and they only work if the close is operationalized monthly. (See The AI month close.)
The instrumentation is the work. Once it exists, the unit economics analysis is mechanical. Without it, every analysis is a multi-week reconstruction project from logs.
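One way to picture the minimum record and the reconciliation step is sketched below. The field names and the 1% tolerance are assumptions for illustration, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestCostRecord:
    """Hypothetical per-request record carrying the dimensions listed above."""
    request_id: str
    customer_id: str   # customer dimension
    feature: str       # feature/workflow dimension
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: int
    status: str
    cost_usd: float


def reconcile(records, invoice_total_usd, tolerance=0.01):
    """Compare attributed cost with the provider invoice.

    Returns (unattributed_gap_usd, within_tolerance), where tolerance is
    an assumed fraction of the invoice total.
    """
    attributed = sum(r.cost_usd for r in records)
    gap = invoice_total_usd - attributed
    return gap, abs(gap) <= tolerance * invoice_total_usd


records = [
    RequestCostRecord("r1", "acme", "summarize", "model-a", 2_000, 500, 840, "ok", 60.0),
    RequestCostRecord("r2", "globex", "autodraft", "model-b", 9_000, 1_200, 2_100, "ok", 38.0),
]

print(reconcile(records, invoice_total_usd=100.0))  # (2.0, False)
```

A gap outside tolerance usually means some traffic is reaching the provider without the customer or feature tag, which is exactly the attribution hole the close is meant to catch.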
Review cadence
Unit economics in classical SaaS is a quarterly artifact. In AI-native SaaS, it needs to be monthly at minimum and weekly where possible. Model pricing changes happen with little notice and shift gross margin materially. A customer onboarding to a new feature can move their cost profile inside a billing cycle. Agentic workflows have non-deterministic cost profiles: an agent that took 12 steps yesterday may take 30 tomorrow after a prompt or model change.
The companies handling this well have moved AI unit economics into the same dashboard tier as DAU and revenue, with weekly review as standard and monthly reporting to the board.
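A weekly review can start as a simple check for week-over-week margin movement. The sketch below flags any week where gross margin moves by at least a chosen number of points. The two-point threshold and the weekly figures are assumptions for illustration.

```python
def margin_points(revenue: float, ai_cost: float) -> float:
    """Gross margin in percentage points, treating AI cost as the only COGS."""
    return 100 * (revenue - ai_cost) / revenue


def flag_margin_shifts(weeks, threshold_points=2.0):
    """Return (week, delta) for each week whose margin moved by the threshold or more."""
    flagged = []
    for prev, cur in zip(weeks, weeks[1:]):
        delta = margin_points(cur["revenue"], cur["ai_cost"]) - margin_points(
            prev["revenue"], prev["ai_cost"]
        )
        if abs(delta) >= threshold_points:
            flagged.append((cur["week"], delta))
    return flagged


# Hypothetical weekly totals: flat revenue, rising AI cost.
weeks = [
    {"week": "2025-W01", "revenue": 600_000, "ai_cost": 240_000},
    {"week": "2025-W02", "revenue": 600_000, "ai_cost": 246_000},
    {"week": "2025-W03", "revenue": 600_000, "ai_cost": 276_000},
]

print(flag_margin_shifts(weeks))  # [('2025-W03', -5.0)]
```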
What this changes for the business
If you do this analysis for the first time, three things tend to emerge:
- Some customers move to renegotiation. Customers consuming more AI than they pay for are running an arbitrage on you. Once you can name them, you can fix them.
- Some features move to deprecation. Features with negative contribution margin and small revenue are net costs. Retiring them, or moving them behind a paywall, lifts margin.
- Pricing strategy changes. Once you understand cost-per-customer, you can build tiers that align with cost. Most AI-native companies arrive at some form of usage-based or hybrid pricing within 18 months of doing this analysis seriously.
Every AI-native company that has scaled past $20M ARR has had to confront these numbers. The teams that did it early preserved their margins. The teams that delayed are now reverse-engineering data from logs and trying to reprice contracts mid-flight.
A closing note for finance leaders
If your company has not done this analysis, the practical first step is small: pick your top 20 customers by ARR, get the AI cost attributable to them for the last full month, and compute contribution margin for each. The exercise takes a week if your data is clean and a month if it is not. Either way, the picture reshapes how you think about the business.
That picture cannot come from provider invoices or quarterly spreadsheets. It requires request-level attribution, revenue mapping, and month-end reconciliation. AI-native SaaS companies do not just need better dashboards. They need a new financial operating layer for inference.
Spendline captures cost at the request level as traffic flows through the proxy and aggregates contribution margin per customer on demand. No log reconstruction, no end-of-quarter spreadsheet rebuild. If you are computing these numbers from logs today, request a pilot and we will set you up with a working model in two weeks.