News Article · Jun 15, 2026 at 3:40 PM

Industry #Anthropic #India #AI coding agents #KPMG #hallucinations #SWE-Explore #IQM #quantum computing

AI coding agents and consulting reports both face reliability crises, new studies show

A new benchmark finds AI coding agents reliably find files but miss the exact lines that need fixing. Separately, KPMG pulled a report containing fabricated AI case studies, highlighting a broader reliability problem in AI-generated content.

Two new studies published this week expose significant reliability gaps in AI systems used for coding and consulting, raising questions about how ready these tools are for critical tasks. The findings come as companies race to deploy AI agents in production environments.

The SWE-Explore benchmark, the first to test code search separately from repair, found that AI coding agents such as Claude Code and OpenAI Codex correctly identify the target file in most cases but miss the exact lines that need modification in 60 percent of attempts. Without precise line-level context, even the best automated fix fails.

Fabricated case studies erode trust in consulting AI

In a separate incident, KPMG published and then pulled a report titled "Redefining excellence in the age of agentic AI" after GPTZero and the Financial Times discovered it contained fabricated case studies. The report falsely claimed that UBS, the UK's National Health Service, Swiss Federal Railways, and Transport for London had deployed AI in specific ways. All four organizations denied the claims.

GPTZero CEO Edward Tian warned that flawed reports from major consulting firms spread what he calls "secondary hallucinations": because such reports are considered highly credible, their errors get recycled by both AI systems and human readers. GPTZero also flagged sloppy sourcing, including citations that were loose paraphrases of real sources or had no matching original at all. GPTZero calls this "vibe citing," a problem that also plagues Google's AI Overviews.

KPMG has removed the report from multiple websites.
The incident is doubly embarrassing for KPMG: it spread misinformation and demonstrated it cannot handle the AI tools it sells to clients.
GPTZero identified the errors using its own detection tools; the Financial Times independently verified them.

Access and hardware hurdles compound the picture

Meanwhile, Anthropic suspended access to its newest AI models in India, sparking debate among Indian tech leaders about the country's AI sovereignty and dependence on foreign models. The move underscores how geopolitical and regulatory factors can disrupt access to the very tools companies are being urged to adopt.

On the hardware front, Finnish quantum computing company IQM appointed Barbara Venneman, a Vanguard board director, to its board as it approaches what would be the first Nasdaq listing by a European quantum computing firm. Quantum computing could eventually accelerate AI training and inference, but the technology remains years from mainstream use.

The convergence of reliability failures in both coding agents and consulting reports, combined with access restrictions and nascent hardware, suggests that AI coding and agentic systems still face substantial hurdles before they can be trusted in high-stakes environments. Companies deploying these tools today must verify outputs manually and maintain human oversight.

Fact check

The SWE-Explore benchmark found that AI coding agents miss the exact lines that need fixing in 60 percent of attempts.
reported · source

KPMG published a report containing fabricated case studies about UBS, the NHS, Swiss Federal Railways, and Transport for London.
verified · source

Anthropic suspended access to its newest AI models in India.
reported · source

IQM appointed Barbara Venneman to its board as it approaches a Nasdaq listing.
reported · source
