News Article · Jun 12, 2026 at 6:43 PM

Databricks OpenSharing, Google DiffusionGemma, and DigitalOcean AMD Inference Lead AI Infrastructure Push

Databricks launches OpenSharing for AI agent data sharing, Google unveils 4x faster DiffusionGemma, and DigitalOcean optimizes AMD GPUs for inference performance.

Databricks, Google, and DigitalOcean each announced new AI platform capabilities this week, targeting distinct bottlenecks in model development and deployment. Databricks introduced OpenSharing, a data-sharing protocol for AI agents. Google released DiffusionGemma, a text diffusion model claiming 4x speed gains over prior Gemma models. DigitalOcean explained how it optimizes AMD GPUs for large language model inference.

Databricks launched OpenSharing on Wednesday as the successor to its open source Delta Sharing protocol. OpenSharing adds support for Apache Iceberg and Unity Catalog, enabling AI agents to directly access structured and unstructured data without file attachments or complex integration code.

OpenSharing targets the "email me a file" friction for AI agents

Databricks aims to solve data portability for AI agents. OpenSharing allows agents to retrieve data on demand from shared catalogs, tables, and volumes. The protocol builds on REST APIs and supports SQL queries, file access, and streaming ingestion. Initial users report reduced integration overhead for multi-agent workflows.

OpenSharing extends Delta Sharing with Iceberg and Unity Catalog support
Agents can query shared tables and volumes via REST without custom connectors
Databricks positions the protocol as a standard for agent-to-data communication
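
To make "query shared tables via REST" concrete, here is a minimal sketch of how an agent might build such a request. The article does not describe OpenSharing's endpoints, so the path layout and the limitHint field are borrowed from the existing Delta Sharing protocol as an assumption. The helper name, host, share, schema, table, and token are all hypothetical, and the request is built but never sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_table_query(endpoint, token, share, schema, table, limit=None):
    """Build (but do not send) a Delta Sharing-style table query request.

    Assumption: the path follows the Delta Sharing protocol layout,
    POST {endpoint}/shares/{share}/schemas/{schema}/tables/{table}/query.
    OpenSharing's actual endpoints may differ.
    """
    path = "/shares/{}/schemas/{}/tables/{}/query".format(
        quote(share, safe=""), quote(schema, safe=""), quote(table, safe="")
    )
    body = {}
    if limit is not None:
        body["limitHint"] = limit  # optional row-count hint in Delta Sharing
    return Request(
        url=endpoint.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
req = build_table_query(
    "https://sharing.example.com/delta-sharing/",
    token="<bearer-token>",
    share="sales_share",
    schema="default",
    table="orders",
    limit=100,
)
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the sketch easy to inspect without network access; a real agent would pass the request to an HTTP client and parse the response.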

Google DiffusionGemma speeds text generation with diffusion approach

Google released DiffusionGemma, a family of text diffusion models that generate output 4x faster than its standard autoregressive Gemma models, according to internal benchmarks. The models use iterative denoising rather than token-by-token prediction, allowing parallel token generation. Google demonstrated the approach at I/O last year but went quiet on the technology until this week. DiffusionGemma is available on Hugging Face and Google Cloud Vertex AI.
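
The difference between token-by-token prediction and iterative denoising can be illustrated with a toy sketch that only counts model calls. This is not Google's implementation: the "model" below is a stand-in that already knows the target text, the function names are hypothetical, and the step count is chosen for illustration rather than taken from any benchmark.

```python
import math

MASK = "_"


def autoregressive_decode(predict_next, length):
    """Generate one token per model call, left to right."""
    tokens, calls = [], 0
    while len(tokens) < length:
        tokens.append(predict_next(tokens))
        calls += 1
    return tokens, calls


def denoising_decode(predict_all, length, steps):
    """Start fully masked; each model call fills in a batch of positions."""
    tokens, calls = [MASK] * length, 0
    per_step = math.ceil(length / steps)
    while MASK in tokens:
        guesses = predict_all(tokens)  # one call scores every position
        calls += 1
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        for i in masked[:per_step]:
            tokens[i] = guesses[i]
    return tokens, calls


# Stand-in "model": it simply knows the target text. A real model would
# predict tokens from learned weights and choose positions by confidence.
TARGET = "agents can query shared tables without custom connectors".split()

ar_tokens, ar_calls = autoregressive_decode(
    lambda prefix: TARGET[len(prefix)], len(TARGET)
)
diff_tokens, diff_calls = denoising_decode(
    lambda tokens: TARGET, len(TARGET), steps=2
)

print(ar_calls, diff_calls)  # prints: 8 2
```

Both decoders produce the same eight tokens, but the denoising loop needs only as many model calls as it has steps, which is where the potential speedup comes from. Real speed gains depend on how many steps are needed to preserve quality and on the cost of each call.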

DiffusionGemma targets latency-sensitive applications where autoregressive generation is too slow. The model has 1.5B parameters and is optimized for text completion, summarization, and structured output tasks. Google claims the diffusion approach maintains output quality while reducing time to first token by roughly 75 percent.
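
As a quick check on the numbers, a k-fold speedup cuts elapsed time by 1 - 1/k, so a 4x speedup and a 75 percent time reduction agree when they describe the same measurement. The article attaches the two figures to different metrics (generation speed and time to first token), so the sketch below shows only the arithmetic; the function name is hypothetical.

```python
def time_reduction(speedup):
    """Fraction of elapsed time saved by a given speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


print(f"{time_reduction(4):.0%}")  # prints: 75%
```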

DigitalOcean pushes AMD GPU performance for frontier models

DigitalOcean published a detailed technical post explaining how it achieves inference speed improvements on AMD GPUs for frontier open-weight models like Llama 3 and Mixtral. The company frames inference performance as a systems-level challenge spanning runtime execution, memory hierarchy, scheduling, and decoding strategy. DigitalOcean says it finds significant "performance alpha" through speculative decoding and custom memory management.
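
For readers unfamiliar with speculative decoding, the following is a minimal greedy sketch of the general technique, not DigitalOcean's implementation. A cheap draft model proposes several tokens, and the larger target model checks them in one pass, keeping the matching prefix. The toy next-token functions and all names are hypothetical, and the sketch omits the bonus token that full implementations emit when every draft token is accepted.

```python
def speculative_decode(target_next, draft_next, prompt, max_new, k=4):
    """Greedy speculative decoding with toy next-token functions.

    Returns the generated tokens and the number of target verification
    passes. Output always matches plain greedy decoding with target_next.
    """
    out, passes = list(prompt), 0
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The cheap draft model proposes up to k tokens.
        draft = []
        for _ in range(min(k, goal - len(out))):
            draft.append(draft_next(out + draft))
        # 2. The target checks all proposals in one (simulated) pass.
        passes += 1
        accepted = []
        for tok in draft:
            expected = target_next(out + accepted)
            if tok != expected:
                accepted.append(expected)  # replace first mismatch, stop
                break
            accepted.append(tok)
        out.extend(accepted)
    return out[len(prompt):], passes


TEXT = list("inference on amd gpus")


def target_next(ctx):
    # Toy target model: always produces the next character of TEXT.
    return TEXT[len(ctx)]


def draft_next(ctx):
    # Toy draft model: agrees with the target except on spaces.
    tok = TEXT[len(ctx)]
    return "?" if tok == " " else tok


tokens, passes = speculative_decode(target_next, draft_next, [], len(TEXT), k=4)
print("".join(tokens), passes, "target passes for", len(TEXT), "tokens")
```

Because the target model still decides every token, the output is identical to plain greedy decoding; the gain is that one target pass can confirm several tokens whenever the draft guesses correctly.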

DigitalOcean hosts frontier LLMs on AMD MI250 and MI300X GPUs. The company says peak output speed depends on tight integration between model architecture and runtime optimization, not just raw hardware specs. DigitalOcean plans to publish benchmark results comparing its AMD inference stack with standard deployment configurations.

The three announcements reflect a broader shift: AI infrastructure providers are racing to reduce friction at the data integration layer, accelerate generation speed, and optimize for alternative GPU architectures. Enterprises evaluating AI platforms now have more options for agent data sharing, faster model inference, and non-Nvidia GPU deployment.

Fact check

Databricks launched OpenSharing on Wednesday as the successor to its open source Delta Sharing protocol.
verified

Google DiffusionGemma generates output 4x faster than its standard autoregressive Gemma models.
reported

DiffusionGemma uses 1.5B parameters.
reported

DigitalOcean hosts frontier LLMs on AMD MI250 and MI300X GPUs.
reported

DigitalOcean says speculative decoding and custom memory management improve inference speed.
reported
