Databricks OpenSharing, Google DiffusionGemma, and DigitalOcean AMD Inference Lead AI Infrastructure Push
Databricks launches OpenSharing for AI agent data sharing, Google unveils 4x faster DiffusionGemma, and DigitalOcean optimizes AMD GPUs for inference performance.
Databricks, Google, and DigitalOcean each announced new AI platform capabilities this week, targeting distinct bottlenecks in model development and deployment. Databricks introduced OpenSharing, a data sharing protocol aimed at AI agents. Google released DiffusionGemma, a text diffusion model that it claims is 4x faster than its autoregressive Gemma models. DigitalOcean described how it optimizes AMD GPUs for large language model inference.
Databricks launched OpenSharing on Wednesday as the successor to its open source Delta Sharing protocol. OpenSharing adds support for Apache Iceberg and Unity Catalog, enabling AI agents to directly access structured and unstructured data without file attachments or complex integration code.
OpenSharing targets the "email me a file" friction for AI agents
Databricks aims to solve data portability for AI agents. OpenSharing allows agents to retrieve data on demand from shared catalogs, tables, and volumes. The protocol builds on REST APIs and supports SQL queries, file access, and streaming ingestion. Initial users report reduced integration overhead for multi-agent workflows.
- OpenSharing extends Delta Sharing with Iceberg and Unity Catalog support
- Agents can query shared tables and volumes via REST without custom connectors
- Databricks positions the protocol as a standard for agent-to-data communication
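To make the "query via REST without custom connectors" idea concrete, here is a minimal sketch of what such a request could look like. It assumes OpenSharing keeps the endpoint shape of its predecessor, Delta Sharing (a bearer token plus a POST to a table's `query` path); the article does not confirm that. The endpoint URL, token, and share, schema, and table names are hypothetical, and the function only builds the request without sending it.

```python
import json
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import quote


@dataclass
class SharedTableRequest:
    method: str
    url: str
    headers: Dict[str, str]
    body: str


def build_table_query(endpoint: str, token: str, share: str, schema: str,
                      table: str, limit_hint: Optional[int] = None) -> SharedTableRequest:
    """Build (but do not send) a Delta Sharing-style table query request.

    Assumption: OpenSharing accepts the same path layout and bearer-token
    header as the Delta Sharing REST protocol it succeeds.
    """
    # Percent-encode each path segment so names with spaces or slashes stay valid.
    parts = ["shares", share, "schemas", schema, "tables", table, "query"]
    path = "/".join(quote(p, safe="") for p in parts)
    payload = {}
    if limit_hint is not None:
        # limitHint is advisory: the server may return more rows than requested.
        payload["limitHint"] = limit_hint
    return SharedTableRequest(
        method="POST",
        url=endpoint.rstrip("/") + "/" + path,
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        body=json.dumps(payload),
    )


if __name__ == "__main__":
    # Hypothetical endpoint and object names, for illustration only.
    req = build_table_query("https://sharing.example.com/api", "TOKEN",
                            "sales", "default", "orders", limit_hint=100)
    print(req.method, req.url)
    print(req.body)
```

An agent would hand the resulting method, URL, headers, and body to any HTTP client, which is the sense in which no custom connector is required.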
Google DiffusionGemma speeds text generation with diffusion approach
Google released DiffusionGemma, a family of text diffusion models that, according to the company's internal benchmarks, generate output 4x faster than its standard autoregressive Gemma models. The models use iterative denoising rather than token-by-token prediction, which lets them generate multiple tokens in parallel. Google demonstrated the approach at I/O last year but went quiet on the technology until this week. DiffusionGemma is available on Hugging Face and Google Cloud Vertex AI.
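The difference between token-by-token prediction and iterative denoising can be illustrated with a toy sketch. This is not Google's algorithm: it assumes a masked-diffusion style decoder that starts from an all-masked sequence and commits the most confident positions on each pass. The `toy_model` function, its confidence scores, and the `per_step` parameter are invented for illustration, and the "model" simply knows the target sentence.

```python
from typing import Dict, List, Optional, Tuple

TARGET = "the quick brown fox jumps over lazy dogs".split()  # 8 tokens
MASK = None  # placeholder for a position that has not been decided yet


def toy_model(seq: List[Optional[str]]) -> Dict[int, Tuple[str, float]]:
    """Stand-in for a network: predict (token, confidence) for every masked slot."""
    n = len(seq)
    return {i: (TARGET[i], ((i * 3) % n + 1) / n)
            for i, tok in enumerate(seq) if tok is MASK}


def autoregressive_decode() -> Tuple[List[str], int]:
    """One model pass per token, strictly left to right."""
    seq: List[str] = []
    passes = 0
    while len(seq) < len(TARGET):
        seq.append(TARGET[len(seq)])
        passes += 1
    return seq, passes


def diffusion_decode(per_step: int) -> Tuple[List[Optional[str]], int]:
    """Each pass scores all masked slots at once and commits the top `per_step`."""
    seq: List[Optional[str]] = [MASK] * len(TARGET)
    passes = 0
    while any(tok is MASK for tok in seq):
        preds = toy_model(seq)
        passes += 1
        # Commit the most confident predictions; the rest stay masked.
        best = sorted(preds, key=lambda i: preds[i][1], reverse=True)[:per_step]
        for i in best:
            seq[i] = preds[i][0]
    return seq, passes


if __name__ == "__main__":
    print(autoregressive_decode())
    print(diffusion_decode(per_step=2))
```

The ratio of passes here is set entirely by the chosen `per_step`, so it says nothing about DiffusionGemma's measured speed. It only shows why committing several positions per pass reduces the number of sequential model calls.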
DiffusionGemma targets latency-sensitive applications where autoregressive generation is too slow. The model has 1.5B parameters and is optimized for text completion, summarization, and structured output tasks. Google claims the diffusion approach maintains output quality while reducing time to first token by roughly 75 percent.
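As a sanity check on the headline figures, a speedup factor converts to a reduction in time as 1 - 1/speedup. The sketch below shows that a 4x speedup in end-to-end generation corresponds to a 75 percent reduction in generation time. That is a statement about total generation time. Time to first token is a separate metric, and the article does not say how Google measured it.

```python
def time_reduction(speedup: float) -> float:
    """Fraction of time saved when a task runs `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    # A 4x speedup means the job takes 1/4 of the original time.
    print(f"{time_reduction(4.0):.0%}")  # prints "75%"
```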
DigitalOcean pushes AMD GPU performance for frontier models
DigitalOcean published a detailed technical post explaining how it achieves inference speed improvements on AMD GPUs for frontier open-weight models like Llama 3 and Mixtral. The company frames inference performance as a systems-level challenge spanning runtime execution, memory hierarchy, scheduling, and decoding strategy. DigitalOcean says it finds significant "performance alpha" through speculative decoding and custom memory management.
DigitalOcean hosts frontier LLMs on AMD MI250 and MI300X GPUs. The company says peak output speed depends on tight integration between model architecture and runtime optimization, not just raw hardware specs. DigitalOcean plans to publish benchmark results comparing its AMD inference stack with standard deployment configurations.
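For readers unfamiliar with speculative decoding, the following toy sketch shows the greedy variant of the general technique: a cheap draft model proposes several tokens, and the expensive target model checks them in one verification pass, keeping the longest matching prefix plus one token of its own. It is not DigitalOcean's implementation, which the article does not describe. The `draft_next` and `target_next` functions are invented arithmetic stand-ins for real models, and production systems that sample rather than decode greedily use a probabilistic accept/reject rule instead of exact matching.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def target_next(ctx: List[int]) -> int:
    """Toy 'large' model: a deterministic function of the last token."""
    return (ctx[-1] * 3 + 1) % 11


def draft_next(ctx: List[int]) -> int:
    """Toy 'small' model: agrees with the target except after tokens divisible by 4."""
    guess = target_next(ctx)
    return guess if ctx[-1] % 4 != 0 else (guess + 1) % 11


def greedy_decode(prompt: List[int], n_new: int, model: NextToken) -> List[int]:
    """Baseline: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(model(seq))
    return seq


def speculative_decode(prompt: List[int], n_new: int, k: int,
                       draft: NextToken, target: NextToken) -> Tuple[List[int], int]:
    """Return the decoded sequence and the number of target verification passes."""
    seq = list(prompt)
    target_passes = 0
    while len(seq) - len(prompt) < n_new:
        # 1. Draft proposes k tokens autoregressively (cheap).
        ctx = list(seq)
        proposal = []
        for _ in range(k):
            proposal.append(draft(ctx))
            ctx.append(proposal[-1])
        # 2. Target verifies all proposed positions; on real hardware this is
        #    one batched forward pass, so it is counted once here.
        target_passes += 1
        ctx = list(seq)
        for tok in proposal:
            expected = target(ctx)
            ctx.append(expected)  # keep the target's token either way
            if tok != expected:
                break  # first mismatch: discard the rest of the draft
        else:
            ctx.append(target(ctx))  # every draft token accepted: bonus token
        seq = ctx
    return seq[:len(prompt) + n_new], target_passes


if __name__ == "__main__":
    out, passes = speculative_decode([1], 20, 4, draft_next, target_next)
    print(out == greedy_decode([1], 20, target_next), passes)
```

Because every kept token is the one the target model would have chosen, the output matches plain greedy decoding with the target alone. The gain comes from needing fewer sequential target passes when the draft guesses well.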
The three announcements reflect a broader shift: AI infrastructure providers are racing to reduce friction at the data integration layer, accelerate generation speed, and optimize for alternative GPU architectures. Enterprises evaluating AI platforms now have more options for agent data sharing, faster model inference, and non-Nvidia GPU deployment.
Fact check
- Databricks launched OpenSharing on Wednesday as the successor to its open source Delta Sharing protocol. (verified)
- Google DiffusionGemma generates output 4x faster than its standard autoregressive Gemma models. (reported)
- DiffusionGemma uses 1.5B parameters. (reported)
- DigitalOcean hosts frontier LLMs on AMD MI250 and MI300X GPUs. (reported)
- DigitalOcean says speculative decoding and custom memory management improve inference speed. (reported)
Source reporting (5)
- The New Stack · Google’s DiffusionGemma is 4x faster than its other Gemma models
- The New Stack · Databricks wants to kill the “email me a file” problem for AI agent skills
- Cloudflare Blog · Route public traffic to private applications with Cloudflare
- DigitalOcean blog · The Inference Alpha: Maximizing Frontier Models on AMD
- The Register · Memory and personalization make AI more likely to tell you what you want to hear