News Article · Jun 8, 2026 at 1:49 PM
DigitalOcean, Broadcom, FuriosaAI Target Cost and Placement Bottlenecks in AI Inference

DigitalOcean's Inference Router and Batch Inference aim to cut AI inference costs by routing each request to the right model and by handling asynchronous workloads separately. Meanwhile, Broadcom and FuriosaAI are introducing Ethernet-based rack-scale hardware, and Akamai argues that inference placement is the next major infrastructure constraint.

DigitalOcean, the Broadcom and FuriosaAI partnership, and Akamai each announced new products or analyses this month that target the rising cost and complexity of AI inference. DigitalOcean released an Inference Router and a Batch Inference service, Broadcom and FuriosaAI unveiled an Ethernet-based rack-scale inference platform, and Akamai published a report arguing that compute placement has become the primary bottleneck for distributed inference.

DigitalOcean's Inference Router, now in public preview, lets developers set up a single endpoint that analyzes each request and routes it to the cheapest model capable of completing the task. The company said that without such routing, coding agents often send trivial requests to expensive frontier models, driving up token usage and inference costs. Batch Inference, also announced at Deploy 2026, processes high-volume asynchronous workloads at a fraction of the cost of synchronous requests. It targets use cases such as data transformation, content generation, and offline evaluations.
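To make the "cheapest capable model" idea concrete, here is a minimal Python sketch. It is not DigitalOcean's API or algorithm. The model names, prices, capability tiers, and the keyword heuristic in `estimate_difficulty` are all hypothetical stand-ins for whatever request analysis a real router performs.

```python
import re
from dataclasses import dataclass


@dataclass
class Model:
    name: str                  # hypothetical model label
    cost_per_1k_tokens: float  # illustrative price, not a real rate
    capability: int            # illustrative tier: 1 = simple, 3 = frontier


# Hypothetical catalog, invented for this example.
MODELS = [
    Model("small", 0.10, 1),
    Model("medium", 0.50, 2),
    Model("frontier", 5.00, 3),
]


def estimate_difficulty(prompt: str) -> int:
    """Toy heuristic standing in for real request analysis."""
    words = re.findall(r"[a-z]+", prompt.lower())
    if "refactor" in words or "architecture" in words:
        return 3
    if "debug" in words or len(words) > 40:
        return 2
    return 1


def route(prompt: str, models=MODELS) -> Model:
    """Return the cheapest model whose tier meets the estimated difficulty."""
    needed = estimate_difficulty(prompt)
    capable = [m for m in models if m.capability >= needed]
    if not capable:
        raise ValueError("no model can handle this request")
    return min(capable, key=lambda m: m.cost_per_1k_tokens)
```

Under these assumptions, a trivial request such as `route("Rename this variable")` lands on the cheapest tier, while a request containing "refactor" is escalated to the most capable one. The point of the sketch is the filter-then-minimize step, not the heuristic itself.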

Ethernet and Chiplets Versus Proprietary Fabrics

Broadcom and FuriosaAI are betting that AI inference infrastructure will shift away from proprietary fabrics such as InfiniBand toward standard Ethernet. Their jointly developed rack-scale platform uses FuriosaAI's RNGD chips, which employ a chiplet architecture, and Broadcom's Jericho3-AI Ethernet switches. The platform aims to reduce latency and power consumption compared with GPU-heavy alternatives. FuriosaAI claims its chips deliver high token-generation throughput per watt, a metric that matters more for inference workloads than raw training performance does.

  • DigitalOcean's Inference Router supports presets and custom routing rules via API or UI. No GPU management is required.
  • Batch Inference enables asynchronous processing of large datasets without hitting synchronous rate limits.
  • OpenCode, an open-source coding agent with over 160,000 GitHub stars, now supports DigitalOcean's Inference Router natively.
  • Broadcom and FuriosaAI's platform is designed for rack-scale deployments and targets power efficiency as a differentiator.
  • Akamai cites data showing that even modest distribution of inference workloads across 10 to 20 edge locations can reduce response times by 40 percent, but only if the placement logic accounts for network latency and data locality.

Placement Becomes the New Constraint

Akamai's analysis argues that as inference moves to edge and distributed environments, the bottleneck shifts from raw compute to how and where models are placed. The company's report notes that many developers still provision inference capacity centrally, ignoring factors such as user geography, data gravity, and dynamic load patterns. Akamai recommends embedding placement optimization into the deployment pipeline, rather than treating it as an operational afterthought.
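As a rough illustration of placement logic that weighs network latency, data locality, and load, the Python sketch below scores candidate sites and picks the lowest score. It is not taken from Akamai's report. The site names, round-trip times, penalty values, and the scoring formula are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str       # hypothetical site label
    rtt_ms: dict    # user region -> round-trip time in ms (illustrative)
    has_data: bool  # whether the data the model needs is already local
    load: float     # current utilization, 0.0 to 1.0


def placement_score(site: Site, region: str,
                    remote_data_penalty_ms: float = 30.0,
                    load_weight_ms: float = 50.0) -> float:
    """Lower is better: network latency plus assumed penalties
    for fetching remote data and for running on a busy site."""
    data_penalty = 0.0 if site.has_data else remote_data_penalty_ms
    return site.rtt_ms[region] + data_penalty + load_weight_ms * site.load


def choose_site(sites, region: str) -> Site:
    """Pick the site with the lowest score for users in `region`."""
    return min(sites, key=lambda s: placement_score(s, region))


# Hypothetical inventory: one central region and one edge location.
SITES = [
    Site("central", {"eu": 120.0, "us": 20.0}, has_data=True, load=0.5),
    Site("edge-eu", {"eu": 15.0, "us": 110.0}, has_data=False, load=0.2),
]
```

With these made-up numbers, European users are served from `edge-eu` even though it must fetch data remotely, while US users stay on `central`. Running a check like this in the deployment pipeline, rather than after the fact, is one way to read the report's recommendation.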

DigitalOcean's router addresses part of this problem by directing requests based on latency, cost, and quality tradeoffs. Broadcom and FuriosaAI address hardware density and power. Taken together, the announcements suggest that the next phase of AI infrastructure will involve not just faster chips but also smarter routing and placement orchestration. The Inference Router and Batch Inference are available to DigitalOcean customers immediately, Akamai's analysis is already published, and Broadcom and FuriosaAI's platform targets deployments in the second half of 2026.

Fact check

  • DigitalOcean Inference Router is now in public preview. (verified)
  • Batch Inference processes high-volume asynchronous workloads at a fraction of the cost of synchronous requests. (reported)
  • Broadcom and FuriosaAI are building a rack-scale platform using FuriosaAI's RNGD chips and Broadcom's Jericho3-AI Ethernet switches. (verified)
  • OpenCode has over 160,000 GitHub stars and now supports DigitalOcean's Inference Router. (verified)
  • Akamai's analysis shows that distributing inference across 10 to 20 edge locations can reduce response times by 40 percent. (reported)
