The Paradox of Technological Deflation: Historical Cost Trajectories and the Economic Scaling of Generative Artificial Intelligence

The historical progression of technological advancement is frequently characterized by a singular, seductive narrative: that through the relentless application of human ingenuity and industrial scale, capability inevitably becomes cheaper over time. This observation, grounded in the exponential growth of the semiconductor industry and the rapid democratization of digital communication, has shaped the strategic expectations of modern economies. However, an exhaustive analysis of cross-sector cost curves reveals that this deflationary path is far from universal. While digital logic has followed the predictable descent of Moore’s Law, other critical technologies—ranging from nuclear energy and pharmaceutical discovery to heavy transportation infrastructure—have exhibited "negative learning," where costs escalate despite technical maturation. As generative artificial intelligence (AI) transitions from an experimental novelty to the foundational architecture of global productivity, it stands at a precarious intersection of these two conflicting economic realities. By 2026, the industry faces an "inference iceberg," where the precipitous decline in per-token pricing is counteracted by a structural surge in physical resource demands, regulatory overhead, and the diminishing marginal returns of data acquisition.

The Theoretical Foundations of Technological Cost Reduction

To evaluate whether generative AI will follow the historical pattern of technology becoming cheaper, one must first deconstruct the mechanisms that drive deflation in other sectors. The primary drivers of cost reduction are typically categorized into two distinct but related empirical frameworks: Moore’s Law and Wright’s Law.

Moore’s Law and the Experience Curve in Digital Logic

Moore’s Law is the observation that the number of transistors on an integrated circuit (IC) doubles approximately every two years.1 Gordon Moore first articulated the trend in 1965, when he projected a doubling every year, and revised the cadence to every two years in 1975; the projection rested on a log-linear relationship between device complexity and time.1 This observation is fundamentally an "experience curve" effect, quantifying efficiency gains from learned experience in production.1 Between 1960 and 1975, Moore calculated that components per chip increased by a factor of 65,000, driven by a combination of shrinking transistor dimensions (Dennard scaling), increased chip area, and "cleverness" in architectural design.2

The economic implication of Moore’s Law was profound: as density increased, the cost per function declined as the inverse of the number of devices per chip, until limited by manufacturing yields.2 By the mid-1970s, David House deduced that computer chip performance would double every 18 months, not only increasing capability but also improving energy efficiency.1 However, this trend began to deviate from its historical cadence around 2010 due to escalating technical challenges and the end of Dennard scaling, which had previously ensured that power consumption per unit area remained constant as transistors shrank.1
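The 65,000-fold increase quoted above can be turned into an implied doubling time with a few lines of arithmetic. The sketch below is only an illustration; the helper name `implied_doubling_time` is introduced here and does not come from the cited sources.

```python
import math

def implied_doubling_time(growth_factor: float, years: float) -> float:
    """Return the doubling time (in years) implied by a total growth factor."""
    doublings = math.log2(growth_factor)
    return years / doublings

# Figures quoted in the text: a 65,000x increase between 1960 and 1975.
doublings = math.log2(65_000)
period = implied_doubling_time(65_000, 1975 - 1960)

print(f"doublings: {doublings:.2f}")
print(f"implied doubling time: {period:.2f} years")
```

Roughly sixteen doublings in fifteen years works out to about one doubling per year, noticeably faster than the two-year cadence usually quoted for later decades.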

Wright’s Law and the Learning Rate

While Moore’s Law focuses on the passage of time, Wright’s Law (or the "learning curve" effect) posits that cost reduction is a function of cumulative production volume. Discovered by aeronautical engineer Theodore Paul Wright in 1936, the law observes that for every doubling of the total quantity of products produced, the unit cost falls by a fixed proportion.3 Wright initially observed this in aircraft production, where the labor required for each unit was reduced by approximately 20 percent with each doubling of experience.3

Wright’s Law is considered more universally applicable than Moore’s Law because it ties cost to cumulative production rather than to the mere passage of time.3 Technologies that follow Wright’s Law, such as solar panels, batteries, and semiconductors, have exhibited a roughly constant "learning rate" over decades.4

| Technology Category | Metric for Doubling | Empirical Learning Rate (%) | Historical Persistence | Source |
| --- | --- | --- | --- | --- |
| Solar Photovoltaics (PV) | Cumulative Installed Capacity | 20.2% - 24% | 40+ Years | 4 |
| Lithium-ion Batteries | Cumulative Battery Production | 7.5% - 19% | Driven by EVs | 4 |
| Utility-Scale Wind | Cumulative Installed Capacity | 15% | Variable by epoch | 6 |
| Semiconductors (Transistors) | Cumulative Transistor Count | ~40% | Moore's Law proxy | 4 |
| Internet Transit | Cumulative Traffic/Port Density | 25% - 50% | Annual decline | 9 |

The durability of these learning rates suggests that modular, mass-produced technologies are the most likely to become cheaper in the long term. However, the exact rate differs based on geographic location, time-span, and the chosen proxy for experience.4
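Wright’s Law can be written as a power law in cumulative output: if each doubling cuts unit cost by a fraction r, the cost of the x-th unit is the first-unit cost multiplied by x raised to log2(1 - r). The following is a minimal sketch of that relationship; the function name and the starting cost of 100 are illustrative assumptions, while the learning rates are taken from the table above.

```python
import math

def wright_unit_cost(first_unit_cost: float, cumulative_units: float,
                     learning_rate: float) -> float:
    """Unit cost after `cumulative_units` under Wright's Law.

    learning_rate is the fractional cost drop per doubling (0.20 means 20%).
    """
    exponent = math.log2(1.0 - learning_rate)  # negative for positive learning rates
    return first_unit_cost * cumulative_units ** exponent

# Ten doublings of cumulative output (2**10 = 1024 units), hypothetical start cost of 100.
for name, rate in [("solar PV", 0.20), ("wind", 0.15), ("semiconductors", 0.40)]:
    cost = wright_unit_cost(100.0, 2 ** 10, rate)
    print(f"{name:<15} learning rate {rate:.0%}: unit cost {cost:.2f}")
```

At a 20 percent learning rate, ten doublings leave the unit cost at about 11 percent of its starting value, which is why the choice of experience proxy matters so much when comparing technologies.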

Counter-Examples to the Deflationary Rule: Rock’s Law and Eroom’s Law

The assumption that all technology becomes cheaper is challenged by the "darker side" of Moore’s Law, known as Rock’s Law. This principle holds that as chips get denser, the cost of the manufacturing equipment and facilities required to produce them rises exponentially.10 In its usual formulation, the cost of a semiconductor fabrication plant doubles roughly every four years. While the price to the consumer falls, the capital expenditure required from the producer follows an opposite trend. Leading-edge fabrication facilities (fabs) now cost between $10 billion and $20 billion, with the high-NA EUV scanners at the heart of modern lithography costing north of $400 million each.10

Eroom’s Law and the Crisis of Pharmaceutical Productivity

The most prominent example of technology becoming more expensive over time is found in the pharmaceutical industry. Eroom’s Law (Moore’s Law spelled backward) describes the observation that the cost of developing a new drug doubles approximately every nine years—a trend that has persisted since the 1950s.11 Despite exponential improvements in high-throughput screening, biotechnology, and computational drug design, fewer drugs make it to market per billion dollars spent.11 By 2024, the average cost to bring a new asset to market had risen to $2.23 billion.13
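A nine-year doubling time compounds quickly. The short sketch below shows the multiplier implied by the figure quoted above; the function name is a label chosen for this example, and the calculation assumes the doubling time stays constant over the whole period.

```python
def cost_multiplier(years: float, doubling_time_years: float = 9.0) -> float:
    """Growth in cost after `years` if cost doubles every `doubling_time_years`."""
    return 2.0 ** (years / doubling_time_years)

for span in (9, 18, 36, 72):
    print(f"{span:>2} years -> {cost_multiplier(span):.0f}x")
```

Under that assumption, costs rise 16-fold over 36 years and 256-fold over 72 years.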

The drivers of Eroom’s Law provide a sobering parallel for the future of AI:

  1. The "Better than the Beatles" Problem: New innovations must compete against existing, highly effective products (such as off-patent generic drugs like Lipitor) that already have excellent safety records and low prices.11
  2. The "Cautious Regulator" Problem: Increasing risk intolerance by regulatory bodies (e.g., following safety crises like Thalidomide or Vioxx) has progressively raised the bar for approval, mandating larger and more expensive clinical trials.11
  3. The Exhaustion of Low-Hanging Fruit: Many of the most accessible drug targets have been exploited, forcing researchers to tackle increasingly complex and higher-risk biological pathways.11

Baumol’s Cost Disease and the Service Sector Stagnation

A further constraint on cost reduction is Baumol’s Cost Disease, which explains why costs rise in labor-intensive sectors such as healthcare, education, and the performing arts.15 In these sectors, human labor is the end product; a string quartet still requires four musicians and nine minutes to perform Beethoven, just as it did in the 19th century.15 While manufacturing productivity explodes, these stagnant sectors must still increase wages to compete for labor, leading to costs that outpace inflation.15

The Historical Cost Trajectory of Digital Infrastructure

To understand the context in which generative AI is scaling, it is necessary to examine the long-term price curves of its underlying "bones": cloud storage and internet bandwidth.

Cloud Storage: The 15-Year Descent of Amazon S3

When Amazon S3 (Simple Storage Service) launched in 2006, it offered a revolutionary price of 15 cents per gigabyte per month.18 Over the subsequent decade, intensive competition, economies of scale, and the falling hardware costs associated with Moore’s Law drove prices down by approximately 85 percent.20

| Date | AWS S3 Tier | Storage Price ($/GB per month) | Cumulative Change |
| --- | --- | --- | --- |
| March 2006 | Standard (Launch) | $0.150 | - |
| Nov 2008 | First 50TB | $0.150 (tiers introduced) | - |
| Nov 2010 | First 1TB | $0.140 | -6.7% |
| Feb 2012 | First 1TB | $0.125 | -16.7% |
| April 2014 | Standard | $0.030 | -80% |
| Dec 2016 | Standard | $0.023 | -84.7% |
| Jan 2021 | Standard | $0.023 | Stagnation (source 19) |

Despite this dramatic decline, recent evidence suggests a "silicon plateau" in cloud storage. The price for S3 Standard has remained largely unchanged for nearly eight years.21 While the cost of underlying hard disk drives (HDDs) has continued to fall by approximately 13 percent annually, AWS has arguably lacked the competitive incentive to pass these savings to consumers, instead focusing on "Intelligent Tiering" to optimize margins.20
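The cumulative-change column in the S3 table can be reproduced from the listed prices. This is a small sketch for checking the arithmetic; the constant and function names are introduced here for illustration.

```python
# Price points copied from the S3 table ($ per GB per month).
LAUNCH_PRICE = 0.150
PRICE_POINTS = [
    ("Nov 2010", 0.140),
    ("Feb 2012", 0.125),
    ("April 2014", 0.030),
    ("Dec 2016", 0.023),
]

def cumulative_change_pct(price: float, baseline: float = LAUNCH_PRICE) -> float:
    """Percentage change of `price` relative to the launch price."""
    return (price / baseline - 1.0) * 100.0

for date, price in PRICE_POINTS:
    print(f"{date:<11} ${price:.3f}  {cumulative_change_pct(price):+.1f}%")
```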

Internet Transit and Bandwidth

The cost of internet transit has followed a similar "gravitational pull" downward. In 1998, internet transit cost approximately $1,200 per Mbps; by 2015, this had fallen to $0.63 per Mbps.9 This deflation was driven by a massive increase in global traffic: the entire internet carried roughly 15 gigabytes per month in 1984, whereas by 2014 the average user alone accounted for 15 gigabytes per month.22

| Year | Transit Price (per Mbps) | Annual % Decline |
| --- | --- | --- |
| 1998 | $1,200.00 | - |
| 2002 | $200.00 | 50% (max single year) |
| 2006 | $50.00 | 33% |
| 2010 | $5.00 | 44% |
| 2014 | $0.94 | 40% (source 9) |

This consistent decline led to the commoditization of the CDN and transit markets, where margins tended toward zero, forcing providers to seek value in "add-on" services like reliability guarantees and consumption data analytics.9
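The single-year declines in the transit table can be compared with the compound rate over the whole period. The sketch below computes that average from the 1998 and 2014 endpoints; the function name is an illustrative choice, and a constant yearly rate is assumed purely for the calculation.

```python
def annualized_decline(start_price: float, end_price: float, years: float) -> float:
    """Constant yearly fractional decline that takes start_price to end_price."""
    return 1.0 - (end_price / start_price) ** (1.0 / years)

rate = annualized_decline(1200.00, 0.94, 2014 - 1998)
print(f"average annual decline 1998-2014: {rate:.1%}")
```

The compound rate comes out at roughly 36 percent per year, within the 33 to 50 percent range of the year-specific figures.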

The Economic Architecture of Generative AI: 2023–2026

Generative AI is unique because it combines the extreme capital intensity of the semiconductor industry (training) with the variable operational costs of a utility (inference). As the landscape transitions from experimental prototyping in 2024 to sustained, industrial-scale deployment in 2026, the industry's economic model is undergoing a radical realignment.23

The Training Factory vs. The Inference Engine

The "Training Factory" phase is defined by massive, one-time capital expenditures (CapEx) required to teach a Large Language Model (LLM) how to think.23 As of 2024, estimated training costs for frontier models like GPT-4 ranged from $78 million to $100 million, while Google’s Gemini Ultra 1.0 cost an estimated $191 million.25 Doubling a model's size more than doubles its training cost due to the necessity of multi-GPU parallelism, longer convergence times, and the larger volume of training data required.25

However, the industry’s focus is shifting to the "Inference Engine." By 2026, inference workloads—the ongoing operating cost of running AI in the real world—are projected to account for two-thirds of all compute.24 This shift is critical because training creates capability, but inference determines profitability.24

Token Economics: The New KPI

In 2026, the primary metric for AI success has evolved from raw FLOPS (Floating Point Operations Per Second) to "Tokens Per Second per Dollar" (TPS/$).23 Cost per token represents the total cost required to generate a unit of AI output, capturing compute consumption, energy usage, cooling overhead, and infrastructure amortization.27

| Model Class | Representative Model | Input Price ($/1M tokens) | Output Price ($/1M tokens) |
| --- | --- | --- | --- |
| Budget Tier | Gemini Flash-Lite 3.1 | $0.075 | $0.30 |
| Budget Tier | Llama 3.2 3B | $0.06 | $0.06 |
| Mid-Tier | DeepSeek R1 | $0.55 | $2.19 |
| Mid-Tier | Claude 3.5 Sonnet | $3.00 | $15.00 |
| Frontier | Claude 4.5 Opus | $5.00 | $25.00 |
| Frontier (2023) | GPT-4 (Initial) | $30.00 | $60.00 |

Data compiled from source 28.

While headlines celebrate a 10x annual decline in token prices—faster than the deflation of PC compute or dotcom bandwidth—the total bills for enterprises are climbing.28 This is the "Token Consumption Paradox": as per-token prices drop, the number of tokens consumed by modern "reasoning" models is exploding. Models like the OpenAI o1 series may consume 100x more internal "thinking" tokens than they output, creating a scenario where cheaper unit prices lead to higher total invoices.28
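The paradox is easy to reproduce with per-request arithmetic. The sketch below is a hypothetical illustration: it assumes hidden reasoning tokens are billed at the output rate, and it pairs a mid-tier price point from the pricing table with the 100x thinking-token ratio mentioned above, a combination chosen for the example rather than a measurement of any specific model.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float,
                 thinking_multiplier: float = 0.0) -> float:
    """Dollar cost of one request.

    thinking_multiplier is the number of hidden reasoning tokens generated per
    visible output token. Assumption: they are billed at the output rate.
    """
    billed_output = output_tokens * (1.0 + thinking_multiplier)
    total = input_tokens * input_price_per_m + billed_output * output_price_per_m
    return total / 1_000_000

# Same hypothetical request: 1,000 tokens in, 500 visible tokens out.
legacy = request_cost(1_000, 500, 30.00, 60.00)  # 2023 frontier pricing, no hidden tokens
reasoning = request_cost(1_000, 500, 0.55, 2.19, thinking_multiplier=100)

print(f"2023 frontier request:    ${legacy:.4f}")
print(f"cheap reasoning request:  ${reasoning:.4f}")
```

Under these assumptions the request on the cheaper model costs about $0.11 against $0.06 on the older, far more expensive price list, because the billed output grows from 500 to 50,500 tokens.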

The Physical Wall: Energy and Infrastructure Constraints

The most significant threat to the continued cheapening of AI technology is the "Shift from Silicon to Watts".30 By 2026, the constraints on the AI boom have shifted from the availability of chips to the availability of electricity and grid capacity.30

The Energy Shortfall and Grid Dysfunction

Modern AI data centers operate far more like industrial-scale power consumers than traditional office-server facilities.31 A single AI-focused data center can demand 50 to 100 megawatts of electricity—comparable to the load of a manufacturing plant or a small city.31

| Metric | 2024 Value | 2026 Projection | 2030 Projection |
| --- | --- | --- | --- |
| Global DC Power Use | ~1.5% of total | ~2% (>500 TWh) | - |
| US DC Power Demand | 25 GW | 45 GW | 74 - 120 GW |
| Projected US Shortfall | - | Emerging | 49 GW |

Data derived from source 32.
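The figures above mix power (megawatts, gigawatts) and energy (terawatt-hours), so a unit conversion helps when comparing them. The sketch below assumes a constant load running all year, which is an upper-bound simplification; the function names are introduced for this example.

```python
HOURS_PER_YEAR = 8_760  # 365 days x 24 hours

def annual_energy_twh(average_load_mw: float) -> float:
    """Energy used in a year by a constant load, in terawatt-hours."""
    return average_load_mw * HOURS_PER_YEAR / 1_000_000  # MWh -> TWh

def average_load_gw(annual_twh: float) -> float:
    """Constant load, in gigawatts, that would consume annual_twh in a year."""
    return annual_twh * 1_000 / HOURS_PER_YEAR  # TWh -> GWh, then divide by hours

print(f"100 MW site, full load all year: {annual_energy_twh(100):.3f} TWh")
print(f"500 TWh per year as constant load: {average_load_gw(500):.1f} GW")
```

On this basis a 100 MW facility uses a little under 0.9 TWh per year, and 500 TWh per year corresponds to an average draw of about 57 GW.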

The "temporal mismatch" between data center construction (under two years) and transmission infrastructure permitting/construction (15 to 30 years) has created grid dysfunction.31 In the PJM region of the U.S., data center demand has increased energy market costs by $9.3 billion, translating into an additional $18 per month on some household electricity bills.32 To avoid this "power cliff," hyperscalers like Microsoft, Google, and Meta are scrambling to secure long-term power purchase agreements (PPAs) and are increasingly pursuing a "self-generation model" (BYOG - Bring Your Own Generator) involving natural gas, solar, and nuclear power.30

The Resource Entropy of Data

A second physical limit is the "entropy of internet text." Research suggests that internet text contains approximately 1.82 bits of information per token.36 As models improve, the gap between their current performance and this "irreducible loss" (E) shrinks, leading to diminishing returns.36 When model loss falls close to the entropy of the dataset, there is less signal available for the model to learn from, making further scaling exponentially more expensive in terms of both compute and data acquisition.36
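The diminishing returns described here can be illustrated with a power-law loss curve of the kind used in published scaling-law fits, L(N, D) = E + A / N^alpha + B / D^beta, where N is parameter count and D is training tokens. In the sketch below only E = 1.82 comes from the text; the functional form is an assumption, and A, B, alpha, and beta are hypothetical placeholder values chosen for illustration, not fitted results.

```python
E = 1.82                  # irreducible loss quoted in the text
A, ALPHA = 400.0, 0.34    # hypothetical placeholder coefficients (parameter term)
B, BETA = 400.0, 0.28     # hypothetical placeholder coefficients (data term)

def predicted_loss(params: float, tokens: float) -> float:
    """Loss under the assumed form L(N, D) = E + A / N**alpha + B / D**beta."""
    return E + A / params ** ALPHA + B / tokens ** BETA

def reducible_loss(params: float, tokens: float) -> float:
    """Gap between the predicted loss and the irreducible floor E."""
    return predicted_loss(params, tokens) - E

# Scale parameters and tokens together by 10x at each step.
for scale in (1, 10, 100, 1000):
    n, d = 1e9 * scale, 2e10 * scale
    print(f"N={n:.0e} D={d:.0e} reducible loss={reducible_loss(n, d):.3f}")
```

Each tenfold increase in both model size and data removes a smaller absolute slice of loss than the previous one, while the compute required grows roughly a hundredfold per step.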

Software and Architectural Mitigation Strategies

In response to these physical and economic constraints, the industry is pivoting toward "advanced packaging" and software optimization to sustain performance gains.

Advanced Packaging and Chiplets

As transistor scaling approaches its practical limits around the 3nm node, the industry has shifted from pure transistor scaling to a system-level approach.37 "Advanced packaging" technologies, such as TSMC's CoWoS (Chip-on-Wafer-on-Substrate), allow multiple specialized "chiplets" to be integrated within a single package.37 This replaces slower board-level connections between separate chips with short in-package links, reducing latency while delivering reported performance scaling from 1x to over 40x.37

The P-KD-Q Optimization Sequence

Enterprises are increasingly adopting the "P-KD-Q" (Pruning → Knowledge Distillation → Quantization) sequence to reduce the Total Cost of Ownership (TCO) of AI deployments.38

  1. Pruning: Removes redundant parameters to achieve 50-60 percent sparsity with minimal accuracy loss.38
  2. Knowledge Distillation: A large "teacher" model trains a smaller "student" model to mimic its logic. Well-distilled models (7B–20B parameters) can solve up to 80-90 percent of reasoning queries previously sent to 70B+ models.38
  3. Quantization: Reduces the precision of weights (e.g., from FP16 to INT4). This can cut inference costs by 75 percent while maintaining 95 percent of model quality.40
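The quantization step in the list above can be sketched in a few lines. This is a minimal illustration of symmetric per-tensor INT4 quantization, one common scheme among several; production toolchains typically use per-channel or group-wise scales and calibration data, and the function names here are assumptions made for the example.

```python
def quantize_int4(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor quantization of floats to signed 4-bit integers."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0  # map the largest magnitude to 7
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]

def memory_saving(bits_before: int, bits_after: int) -> float:
    """Fraction of weight memory saved by lowering the bit width."""
    return 1.0 - bits_after / bits_before

weights = [0.70, -0.35, 0.10, 0.00, -0.70]
codes, scale = quantize_int4(weights)
restored = dequantize(codes, scale)

print("codes:   ", codes)
print("restored:", [round(r, 2) for r in restored])
print(f"weight memory saved FP16 -> INT4: {memory_saving(16, 4):.0%}")
```

Moving from 16-bit to 4-bit weights cuts weight storage by 75 percent, which lines up with the cost figure quoted above only to the extent that inference cost tracks memory footprint.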

When combined with "speculative decoding"—using a smaller model to predict the likely output of a larger one—these techniques can reduce latency by 2x to 3x and energy usage by up to 73 percent.29
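The control flow of speculative decoding can be shown with toy stand-ins for the two models. The sketch below covers only the greedy variant, in which the output is identical to what the large model would produce on its own; `target_model` and `draft_model` are hypothetical deterministic rules invented for this example, not real language models, and no latency is measured.

```python
from typing import Callable, List

Model = Callable[[List[int]], int]  # maps a token sequence to the next token (greedy)

def target_model(tokens: List[int]) -> int:
    """Hypothetical 'large' model: a toy deterministic rule standing in for an LLM."""
    return (tokens[-1] * 3 + len(tokens)) % 11

def draft_model(tokens: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except at every fourth position."""
    guess = target_model(tokens)
    return guess if len(tokens) % 4 else (guess + 1) % 11

def greedy_decode(model: Model, prompt: List[int], new_tokens: int) -> List[int]:
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(new_tokens):
        tokens.append(model(tokens))
    return tokens

def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       new_tokens: int, lookahead: int = 4) -> List[int]:
    """Draft proposes `lookahead` tokens; the target keeps the matching prefix."""
    tokens = list(prompt)
    goal = len(prompt) + new_tokens
    while len(tokens) < goal:
        # 1. The cheap model drafts a short continuation.
        proposal = list(tokens)
        for _ in range(lookahead):
            proposal.append(draft(proposal))
        # 2. The target checks each drafted position. A real system scores all
        #    positions in one batched forward pass; here it is a plain loop.
        for i in range(len(tokens), len(proposal)):
            expected = target(proposal[:i])
            tokens.append(expected)  # always keep the target's own token
            if expected != proposal[i] or len(tokens) == goal:
                break  # first mismatch: discard the rest of the draft
    return tokens

prompt = [1, 2, 3]
print(speculative_decode(target_model, draft_model, prompt, 20))
```

The speed-up in practice comes from accepting several drafted tokens per pass of the large model, so the gain depends on how often the small model agrees with it.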

Regulatory and Environmental Friction in 2026

The year 2026 marks the arrival of the "regulatory invoice" for AI. The European Union AI Act entered into force in August 2024, and most of its provisions become applicable in August 2026, with some obligations phased in earlier and others later.42

The EU AI Act Risk Hierarchy

The Act imposes a risk-based regulatory framework that significantly impacts operational costs:

  • Unacceptable Risk: Banned applications, including social scoring and harmful manipulation.43
  • High Risk: Systems used in critical infrastructure, education, and employment. These must be registered in an EU database and undergo strict conformity assessments.42
  • General-Purpose AI (GPAI): Foundational models like GPT-4 face specific transparency obligations and must report energy consumption and technical data.42

| Compliance Requirement | Sector | Cost Implication | Source |
| --- | --- | --- | --- |
| AI Literacy Training | All EU Firms | Applies from Feb 2025 | 43 |
| High-Risk Registration | Infrastructure/Med | Operational Overhead | 42 |
| Energy Transparency | GPAI Providers | Mandatory Audits | 45 |
| Non-Compliance Penalty | All | Up to 7% of Turnover | 43 |

The "Subsidy Cliff" and Energy Accountability

In the United States, legislation like the PRICE Act (New Jersey/Texas) proposes requiring data centers to generate their own renewable energy and transition to 100 percent carbon-free sources by 2040.46 State regulators are shifting from a model of "unconditional incentives" to "accountability," where data center developers must pay for the grid infrastructure upgrades their facilities necessitate.47 This transition creates a "subsidy cliff," fundamentally altering the internal rate of return (IRR) for new AI infrastructure projects.47

Synthesis: Will Generative AI Follow the Deflationary Pattern?

The convergence of historical patterns and current trends suggests that generative AI is experiencing a bifurcation in its cost structure.

The Path of Digital Logic (Cheaper in the Long Term)

The "raw" unit of intelligence—the individual token generated—is following a classic deflationary path driven by Wright’s Law and intensive market competition. The 10x annual drop in token pricing suggests that for basic, high-volume tasks (summarization, simple coding, translation), AI will become as cheap and ubiquitous as internet bandwidth or cloud storage.48 The "democratization of intelligence" is real, as open-source models through providers like Together.ai achieve performance comparable to 2023's frontier models at 1/1000th of the cost.29

The Path of Heavy Infrastructure (More Expensive in the Long Term)

Conversely, the "frontier" of AI capability is following the path of the nuclear industry and heavy infrastructure. The "Physical Wall" of energy, the "Rock’s Law" of semiconductor manufacturing, and the "Eroom-like" diminishing returns of data entropy mean that leading-edge capability is becoming structurally more expensive to produce.10

  1. The "Inference Iceberg": The total spend for enterprises is rising despite lower unit costs because the complexity of "agentic" and "reasoning" workloads requires exponentially more tokens.24
  2. The "Siphon Effect": Just as high-speed rail draws resources to major cities at the expense of rural counties, AI investment is concentrating in geographic "hotspots" where power is available, creating a new class of digital inequality.47
  3. The "Regulatory Paradox": Transparency and safety requirements, while necessary for "human-centric AI," introduce the same bureaucratic friction that has slowed the pharmaceutical and construction industries.11

Conclusion: The New Economic Equilibrium

Technology does not always become cheaper; it only becomes cheaper when it is modular, mass-produced, and operating within a regime of high productivity gains. Generative AI is currently the most dynamic technology in history because it acts as a bridge between the deflationary digital world and the stagnant physical world. By 2026, the industry will have moved past the initial hype to a "fundamentals-based" economy.30 The "AI bill" will come due for customer experience leaders, who must navigate usage-based volatility and premium "gated" intelligence tiers.24 While the cost per token will likely continue its descent toward the marginal cost of energy, the "total cost of intelligence" for a meaningfully transformed enterprise will remain a substantial, and potentially escalating, capital commitment. Success in this new era will depend less on advances in pure computing and more on the ability to modernize the legal, institutional, and energy frameworks that underpin the global power system.31

Sources

  1. Moore's law - Wikipedia
  2. Chapter: Moore's Law and the Economics of Semiconductor Price Trends - National Academies of Sciences, Engineering, and Medicine
  3. Wright's Law - why Moore's Law is outdated - Systems Engineering Trends
  4. Learning curves: What does it mean for a technology to follow Wright's Law? - Our World in Data
  5. learningCurve
  6. New study refocuses learning curve analysis on LCOE rather than up-front installed costs in order to provide a more-holistic view of technology advancement | Energy Markets & Planning - Lawrence Berkeley National Laboratory
  7. Drive Down the Cost: Learning by Doing and Government Policies in the Global EV Battery Industry
  8. Levelized cost-based learning analysis of utility-scale wind and solar in the United States
  9. DrPeering White Paper - Internet Transit Prices - Historical and Projections
  10. From Moore's Law to Market Rivalry: The Economic Forces That Shape the Semiconductor Manufacturing Industry
  11. Eroom's law - Wikipedia
  12. Moore's and Eroom's Law in a Graph - Skyrocketing Pharma R&D Costs Despite Quantum Leaps in Technology | by Bálint Botz | Medium
  13. Eroom's Law in the Pharmaceutical Industry – And How AI Can Beat It - Quantiphi
  14. Stagnation, Drugs and Eroom's law - Lindus Health
  15. Baumol's cost disease: long-term economic implications where machines - UNESCO
  16. Can AI cure healthcare's Baumol's cost disease? - IG&H
  17. Information technology and Baumol's cost disease in healthcare services: a research agenda - Emerald Publishing
  18. AWS History and Timeline regarding Amazon S3 - Focusing on the evolution of features, roles, and prices beyond mere storage
  19. Cloud Providers Pricing Over Time - DataHub
  20. Twenty years of Amazon S3 and building what's next | AWS News Blog
  21. S3 last lowered its price 8 years ago : r/aws - Reddit
  22. The History and Future of Internet Traffic - Cisco Blogs
  23. On-Premise vs Cloud: Generative AI Total Cost of Ownership (2026 ...
  24. The AI Bill Comes Due: Will Costs Derail CX Innovation in 2026?
  25. Cost of Training LLM From Scratch in 2026: Real Numbers
  26. Why AI's next phase will likely demand more ... - Deloitte
  27. The Economics of AI Compute: Why Cost Per Token Is the New KPI - Datacenters.com
  28. LLM Inference Cost 2026: Complete Pricing Guide
  29. Inference Unit Economics: The True Cost Per Million Tokens | Introl Blog
  30. The AI market is shifting from 'silicon' to 'physical limits... - moomoo Community
  31. AI Data Centers and the Looming Energy Crisis in the United States
  32. Growing Energy Demand of AI - Data Centers 2024–2026 - TTMS
  33. Energy Markets Race to Solve the AI Power Bottleneck | Morgan Stanley
  34. AI scale and climate commitments: A 2026 outlook | Carbon Direct
  35. The Past, Present, and Future of Nuclear Technology - Excelsior University
  36. On AI Scaling — LessWrong
  37. AI's Semiconductor Revolution: Beyond Moore's Law - Montaka Global
  38. Model distillation for LLMs: A practical guide to smaller, faster AI - Redis
  39. LLM Inference Optimization Techniques - Redwerk
  40. How AI Inference Costs Are Reshaping The Cloud Economy - Forbes
  41. Best Tools for Managing AI Inference Costs in 2025 - Flexprice
  42. EU Artificial Intelligence (AI) Act - Government of Ireland
  43. 2026 Guide to AI Regulations and Policies in the US, UK, and EU
  44. EU AI Act - Updates, Compliance, Training
  45. Aligning AI Adoption with ESG and Environmental Impact - CMS
  46. Menendez, Casar Introduce Groundbreaking Legislation to Protect Americans From Financial and Environmental Impacts of AI Data Centers
  47. Data Center Regulation 2026: Why States Demand Accountability - EnkiAI
  48. The cost of AI is decreasing - Ramp
  49. The Falling Cost of AI Favors Software Companies - Harding Loevner
  50. The Impact of High-Speed Rail on Economic Development: A County-Level Analysis - MDPI
  51. Cost overruns and delays in infrastructure projects: the case of Stuttgart 21

Researched with Google Gemini Deep Research, prompted and edited by Giorgio Polvara.