We’ve spent six weeks tracing AI chips from raw silicon to packaged perfection. But don’t relax yet — the hardest part is still ahead.
Final assembly — a brutal gate standing between AI ambition and actual revenue — is one of the strongest limiting factors in AI capacity growth.
From Silicon to Sellable (If Everything Goes Right)
Following fabrication and packaging, the GPU undergoes final assembly, where it is combined with high-bandwidth memory, power management systems, thermal solutions, and networking interfaces.
At the board level, GPUs are mounted on accelerator modules built on PCBs with more than 20 layers. These modules are then integrated into server baseboards, often exceeding 26 layers, that can host up to eight GPUs per system.
A single AI rack is valued at over $3 million. A failure late in testing turns real money into ashes.
The Yield Math
The truth is that not every assembled accelerator becomes a flagship product — even when wafers and memory are available.
Yield is crucial, and the math is brutal.
Let’s be optimistic and assume each chiplet in a design has a 95% yield. Then a package with:
5 chiplets delivers only ~77% usable packages
10 chiplets delivers only ~60% usable packages
These losses occur before system-level testing even begins.
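The compounding behind those numbers is simple: a package is good only if every chiplet in it is good. A minimal sketch, using the illustrative 95% per-chiplet yield from above (an assumption for the example, not a measured figure):

```python
def package_yield(chiplet_yield: float, num_chiplets: int) -> float:
    """Probability that all chiplets in a package are defect-free."""
    # Each chiplet must be good independently, so yields multiply.
    return chiplet_yield ** num_chiplets

for n in (5, 10):
    print(f"{n} chiplets: {package_yield(0.95, n):.0%} usable packages")
# 5 chiplets:  77% usable packages
# 10 chiplets: 60% usable packages
```

Every additional chiplet multiplies in another 5% loss, which is why large multi-die packages bleed yield so quickly.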
Further attrition hits high-end accelerators during burn-in and system tests. Failures in power stability, thermal tolerance, or multi-GPU coordination result in chips being downgraded — or scrapped entirely.
No wonder headline wafer capacity doesn’t translate directly into shipped accelerators. The conversion rate is far lower than most realize.
Memory: Still the Primary Chokepoint
Memory remains the most significant constraint at final assembly.
All major suppliers — including Micron and Samsung — have fully allocated HBM capacity through 2026.
HBM3 and HBM3E yields remain constrained, with usable output often in the 40–60% range. This caps GPU shipments even when the logic silicon is ready.
To understand the scarcity, just look at pricing:
HBM prices: Up nearly 20% year-over-year
Server DRAM prices: Up 50%+ in some segments
Hyperscalers: Receiving only ~70% of requested DRAM volumes
GDDR7 shortages: Cutting gaming GPU production by nearly 40%
It’s not just AI taking the hit. Memory constraints are impacting entire markets, forcing brutal resource-allocation decisions.
Packaging: The Sequential Trap
Packaging has become a major contributor to painful bottlenecks.
Demand for TSMC’s CoWoS packaging grew more than 100% year-over-year in 2025. Capacity is fully booked through mid-2026, while equipment lead times for advanced packaging tools exceed 12 months.
This creates a sequential gap: GPUs wait for packaging slots even when wafers and memory are available, compounding delays and pushing shipments into later quarters.
A delay at one stage becomes a delay at every stage.
The Market Is Trying to Break Through
Let’s look at market forecasts constrained by physical throughput:
2026 AI accelerator market estimates: $13.9B–$39B (methodology dependent)
CAGR through the early 2030s: Consistently 23–30%+
End-of-decade projections: $40B–$90B
These figures assume gradual relief in memory and packaging constraints.
If that relief doesn’t materialize, shipments will fall short of these projections — because physical throughput, not demand, sets the ceiling.
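As a rough sanity check, compounding the 2026 estimates at the quoted CAGRs lands close to the end-of-decade band. Pairing the low base with the high rate (and vice versa) is my assumption for illustration, not the analysts’ methodology:

```python
def project(base_billions: float, cagr: float, years: int) -> float:
    """Compound a market-size estimate forward at a constant annual growth rate."""
    return base_billions * (1 + cagr) ** years

low = project(13.9, 0.30, 4)   # low 2026 base, high quoted CAGR, 2026 -> 2030
high = project(39.0, 0.23, 4)  # high 2026 base, low quoted CAGR
print(f"2030 sanity check: ${low:.0f}B - ${high:.0f}B")  # roughly $40B - $89B
```

The compounded range ($40B–$89B) roughly reproduces the article’s $40B–$90B end-of-decade projections, which suggests the forecasts are internally consistent rather than independently derived.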
The New Constraints: Power and Cooling
Assembly constraints don’t stop at chips and memory — they now extend to infrastructure.
Modern AI racks draw 50–100 kW each, making liquid cooling a necessity. Yet the supply chain is unprepared to scale advanced liquid-cooling systems at speed.
Shortages persist in:
Power shelves
Busbars
High-voltage transformers
A rack can be fully assembled and still sit idle if data centers lack certified cooling infrastructure.
If you can’t cool it, you can’t control it.
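The arithmetic scales up quickly at the facility level. A rough sketch, assuming an 80 kW rack and a power usage effectiveness (PUE) of 1.2 — both hypothetical round numbers, not figures from the article:

```python
def racks_supported(site_mw: float, kw_per_rack: float, pue: float = 1.2) -> int:
    """Racks a site can power once cooling/distribution overhead (PUE) is included."""
    it_power_kw = site_mw * 1000 / pue   # power remaining for the IT load itself
    return int(it_power_kw // kw_per_rack)

# A hypothetical 100 MW campus with 80 kW racks and PUE 1.2:
print(racks_supported(100, 80))
```

At roughly a thousand racks per 100 MW, and over $3 million per rack, a single campus represents billions of dollars of hardware that can sit idle waiting on power and cooling infrastructure.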
Who Controls the Final Gate
Design Control — Where the Money Is
NVIDIA: Estimated 80–92% share of the AI accelerator market
AMD / Intel: Not immediate challengers
Hyperscalers: Designing custom accelerators but reliant on the same ecosystem
Manufacturing — Where the Margin Isn’t
Contract assemblers: Foxconn, Quanta, Wistron
OEMs: Dell, HPE
Component specialists: Vertiv, Amphenol
Profitability follows design, not assembly — and scarcity beats volume.
Your Investment Cheat Sheet
The Core Truth
Memory yields, packaging slots, power hardware, and testing throughput make final assembly the last hard gate between AI ambition and AI revenue.
Three Critical Signals to Watch
Memory Pricing & Allocation
- Continued HBM sell-outs through 2026 signal sustained pricing power
- Any easing in allocation is bearish for memory suppliers

Packaging Utilization
- CoWoS expansion announcements indicate future shipment relief
- Reduced oversubscription may signal a shifting bottleneck

Margin Divergence
- Rising revenue but falling gross margins at assemblers reflect bottleneck pressure
- Margin compression offers no structural advantage — even amid growth
Where to Invest
Winners
Designers like NVIDIA with platform lock-in
Scarce-input suppliers such as HBM makers and substrate providers
Component specialists like Vertiv and Amphenol
Losers
Volume assemblers (including Super Micro and contract manufacturers)
Server OEMs without differentiation
Companies without secured component allocation
Warning Signs
Margin compression at chip designers
Inventory build-ups at assemblers
Extended testing delays
Bullish Signals
Accelerating HBM yield improvements
Packaging capacity additions outpacing demand growth
Liquid-cooling infrastructure scaling faster than expected
The Bottom Line
Each of the seven stages we’ve examined over the past weeks has revealed where value truly lies.
In final assembly, the truth is undeniable: designers and scarce-input suppliers capture the profits. Assemblers capture volume — but not attractive margins.
Before investing, focus on shipment numbers, not wafer starts. Constraints matter more than demand headlines.
In AI growth, yield percentages, lead times, and megawatts are what truly matter.
Ultimately, the companies that control the final gate decide the winners.
Important disclosures: This newsletter is provided for informational purposes only and does not constitute investment advice. All investments involve risk, including possible loss of principal. Please consult with your financial advisor before making investment decisions.
