Monolithic 3D AI Chip with 330 GB On-Die DRAM Challenges HBM Dominance
PhantaField unveils a monolithic 3D AI ASIC with 330 GB on-die DRAM, claiming up to 53x faster inference than HBM-bound GPUs. Meanwhile, Tesla's modular data center kit hits trademark and competition snags.
PhantaField has released the whitepaper for its Sophon PFG-1, a monolithic 3D AI ASIC that packs 330 GB of on-die DRAM and delivers 2,100 TFLOPS of BF16 compute, aiming to eliminate the memory wall that plagues HBM-bound GPUs. The chip, built on a 28 nm CMOS base with 32 tiers of 2D-TMD transistors, uses compute-in-memory architecture to keep weights and optimizer state on-die, enabling both training and inference on a single die.
According to the whitepaper, the Sophon PFG-1 achieves 7,219 tokens per second in BF16 inference on an 80 billion parameter model, compared to roughly 150 tokens per second for an NVIDIA Rubin GPU at low batch sizes. This represents a 48x to 53x improvement in single-stream FP8 decode throughput, driven by an on-die bandwidth that exceeds HBM4 by over 190x.
Memory wall breakthrough in AI accelerators
The chip's design addresses a fundamental bottleneck: inference is read-dominated and bound by weight memory bandwidth, while training requires writable memory with endurance beyond what non-volatile alternatives offer. The 2T0C gain-cell DRAM stores data for seconds without refresh, drawing only about 3 W at idle, and supports unlimited write cycles at 20 fJ per bit.
- PhantaField claims a peak efficiency of 3.72 TFLOPS/W in BF16 training, compared to approximately 1 TFLOPS/W for current HBM4 GPUs, due to lower energy per MAC and reduced power overhead.
- The die size is 750 mm², with the entire memory and compute integrated on a single substrate, eliminating HBM packages and their associated cost, power, and bandwidth constraints.
- Morgan Stanley estimates a single NVIDIA Rubin NVL72 rack costs $7.8 million, with HBM memory accounting for $2 million (25.7%). The Sophon PFG-1's BOM is projected at $8,358, offering a potential 9.9x hardware cost reduction.
Market headwinds for new infrastructure
While PhantaField's chip promises to reshape AI hardware, other trends reveal friction in the data center ecosystem. Tesla's Megapod modular AI data center kit, proposed by Elon Musk, faces trademark conflicts with existing products and competition from established modular builders, complicating its path to market. Separately, Musk's acquisition of Mesh Optical, an optics transceiver business founded by SpaceX employees, received FTC approval, signaling continued investment in data center interconnect technology.
DeepSeek detailed its DSpark speculative decoding framework for V4 models, claiming up to 85% faster inference. Tested on Gemma and Qwen models, the framework reduces token latency by generating candidate tokens in parallel, a technique that complements hardware improvements like those in Sophon.
On the macroeconomic side, Japan saw only 18 IPOs in H1 2026, the lowest since 2011, partly due to a scarcity of AI, data center, and chip startups. The lack of new ventures in these high-growth sectors dampens the IPO market despite surging stock prices.
The PhantaField design moves to tape-out later this year, with initial samples expected in 2027. Industry analysts caution that scaling monolithic 3D production and competing with entrenched GPU ecosystems will be its largest hurdles.
Fact check
-
PhantaField's Sophon PFG-1 achieves 7,219 tokens per second in BF16 inference on an 80B parameter model.
reported · source
-
The chip has 330 GB of on-die DRAM and uses a 2T0C gain-cell architecture.
verified · source
-
Tesla's Megapod modular AI data center kit faces trademark conflicts and established competition.
reported · source
-
DeepSeek's DSpark framework speeds up AI inference by up to 85%.
reported · source
-
Japan saw only 18 IPOs in H1 2026, partly due to a lack of AI, data center, and chip startups.
reported · source
Source reporting (5)
- Hacker News Front Page · Sophon PFG-1: a monolithic-3D AI ASIC with 330 GB of on-die DRAM and no HBM
- TechRadar Pro · Megapod is the modular AI data center kit that Elon Musk's Tesla wants to sell — but there's a tiny problem (actually, three)
- Techmeme · DeepSeek details DSpark, a speculative decoding framework for its V4 models, saying it speeds up AI inference by up to 85% and was tested on Gemma and Qwen (Ben Jiang/South China Morning Post)
- Data Center Dynamics · Elon Musk gets FTC nod to acquire data center optics business Mesh Optical
- Techmeme · Dealogic: Japan saw 18 IPOs in H1 2026, the lowest since 2011, despite stock market surges, partly due to Japan's lack of AI, data center, and chip startups (David Keohane/Financial Times)
Join the conversation
You need to be registered and logged in to comment on blog articles.
0 Comments
No comments yet
Be the first to share your thoughts on this article.