KPMG's AI Report Hallucinated Itself — and the Big Four Shouldn't Be Surprised

The Headline

KPMG has pulled its October 2025 report "Total Experience: Redefining Excellence in the Age of Agentic AI" from its own websites after GPTZero, a forensic AI-detection firm, showed that only five of the report's forty-five citations actually point where they claim to point. TechCrunch has the timeline; The Register has the on-the-record statements. Four organisations (UBS, the NHS, Swiss Federal Railways, and Transport for London) have publicly disputed claims the report made about their AI usage. KPMG's own data, published the same month in the CEO Outlook, says seventy-one percent of CEOs rank AI as their top investment priority. The agentic-AI report cites "KPMG research" saying the number is fifty-five percent. Sixteen percentage points of self-contradiction, in two reports, same publisher, same month. This is the second Big Four firm in two months to withdraw a published report for AI-hallucinated content. The credibility question is not a KPMG question. It is an industry question.

What GPTZero Actually Found

The forensic breakdown is plain. Of forty-five citations in a flagship Big Four report, only five accurately point to real, uncorrupted sources. Forty of forty-five citation titles are paraphrases that loosely match real sources or are entirely fabricated. Roughly half of the report's factual claims evidenced by those forty-five citations are false, unsupported, or attributed to the wrong source. The pattern is consistent: a citation is a believable-shaped string — title, container, year, sometimes a publication — that an LLM produced when asked for "examples of agentic AI in the wild." Nobody clicked the link. City A.M. has the industry-press confirmation.

Three of the most egregious examples are worth naming. A 2019 East Japan Railway press release about joining the MaaS Alliance is cited as evidence of an AI-agent deployment in 2025, even though the release was published six years before "agentic AI" entered the public discourse. Emirates is credited with "a mobile chatbot named Sara that can change your flight." Sara is a 2023 robot check-in assistant. It cannot change bookings. Verbund, the Austrian electricity utility, is described as running "an AI-agent energy-as-a-service ecosystem" for household appliances and EV charging. The cited source is a press release about Verbund's venture arm investing in a Swedish B2B grid startup. None of the household, AI, weather, or EV claims are in the source. GPTZero coined the term "vibe citing" for the failure mode. It is the citation-grade version of vibe coding. It is the new failure mode of the profession.

The "Vibe Citing" Pattern — and Why It Was Inevitable

The KPMG report appears to have been assembled by feeding an LLM a research request ("find examples of agentic AI in the wild") and accepting the output. The condensed consulting-style endnote format (title, container, year, no authors, no URL) was the perfect camouflage. An endnote with no author and no URL is structurally unfalsifiable on first read. The same pattern played out at Deloitte last year (a refund to the Australian federal government over a report containing AI-fabricated citations), at EY in May (a loyalty-rewards study withdrawn), and at the law firms Pinsent Masons and Sullivan & Cromwell (cited precedents that did not exist). The shape is the same every time. A research product is briefed. An LLM does the legwork. No human clicks the links. The result ships under a partner's name. The KPMG incident is not an outlier. It is the visible top of a stack we have been warned about for two years.

The reason it is now inevitable is structural. The Big Four bill for advisory hours. The labour that an LLM replaces (research, drafting, citation assembly) is exactly the labour that is cheapest to compress and hardest for a partner signing off a deck to verify. The incentive is to do more research with fewer analysts. The cost of a missed citation was, until recently, a quiet embarrassment. The cost of an LLM-hallucinated citation, caught by an external forensic firm, on a flagship report, is a withdrawal. The KPMG withdrawal is the first withdrawal at this scale. It is not going to be the last.

"There should be a way for the public to slow the technology's advancement."
— Jack Clark, co-founder, Anthropic, to the BBC, 2026, repurposed as a metaphor for buyer-side restraint on the Big Four's AI research

The Internal Contradiction KPMG Should Have Caught

The most damning single fact is internal. KPMG's own CEO Outlook, published the same month as the agentic-AI report, says seventy-one percent of CEOs rank AI as their top investment priority. The KPMG agentic-AI report cites "KPMG research" saying the number is fifty-five percent. Sixteen percentage points of self-contradiction, in two reports, same publisher, same month. This is the kind of detail a copy desk would have caught in a single read. There was no copy desk. There was no human read of the report as a whole. There was a generation pass, an executive summary, and a partner sign-off. The contradiction was the easiest verifiable fact in the document. The firm that gets paid to advise boards on AI risk did not run its own fact-base through the most basic internal-consistency check.
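For a sense of how small that internal-consistency check is, here is a minimal sketch in Python. It assumes someone has already pulled each statistic out of each report into a simple record; the record layout, the metric key, and the `find_conflicts` helper are all hypothetical names invented for illustration, not anything KPMG or GPTZero is known to use. Only the two percentages come from the reports as described above.

```python
from collections import defaultdict


def find_conflicts(claims):
    """Group claims by metric and return the metrics reported with more than one value."""
    values_by_metric = defaultdict(dict)
    for claim in claims:
        values_by_metric[claim["metric"]][claim["source"]] = claim["value"]
    return {
        metric: by_source
        for metric, by_source in values_by_metric.items()
        if len(set(by_source.values())) > 1
    }


# The two figures discussed above; the metric key is a made-up label.
claims = [
    {"source": "CEO Outlook", "metric": "ceos_ranking_ai_top_priority_pct", "value": 71},
    {"source": "Agentic-AI report", "metric": "ceos_ranking_ai_top_priority_pct", "value": 55},
]

conflicts = find_conflicts(claims)
for metric, by_source in conflicts.items():
    gap = max(by_source.values()) - min(by_source.values())
    print(metric, by_source, f"gap: {gap} points")
```

The comparison is the trivial part. The real work is extracting the claims in the first place, which is exactly the human read that seems to have been skipped.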

What Buyers of Big Four Research Should Do This Week

The new due-diligence question is not "is the report well-designed." It is "are the citations real." Concretely, for any Big Four AI-themed report that has crossed your desk in the last twelve months:

Ask the firm for the underlying data files, not the deck. Sample-check five citations at random against the cited source itself, not just its title. If even one of the five fails to match its source, treat the report as untrustworthy. Demand an authorship trail: which partner signed off, which analyst ran the research, which human read the final draft, and which tooling was used to generate the citations. For reports on AI specifically, ask what the firm's policy is on AI-generated citations, in writing. Treat Big Four "thought leadership" as marketing, not research, until the firm can demonstrate a verification pass.

The buyers who do this now will be ahead of the buyers who do not. This is the same pattern that played out with the Fable 5 and Mythos 5 directive last week: a credibility regime that the vendors did not build, applied to the vendors' own products. The new regime is buyer-side. The new tool is the spot-check. The new contract clause is "show me the source file, not the deck."
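The five-citation spot-check is easy to sketch, and so is the arithmetic behind it. The Python below is an illustration under stated assumptions rather than a finished tool: the helper names `pick_spot_check` and `clean_sample_probability` are invented here, citations are assumed to be numbered 1 to 45, and the "milder" scenario of five bad citations is hypothetical. The 45 and 40 figures are the ones from the GPTZero breakdown above.

```python
import random
from math import comb


def pick_spot_check(citation_ids, sample_size=5, seed=None):
    """Return a random sample of citation ids to verify by hand."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(citation_ids), sample_size))


def clean_sample_probability(total, bad, sample_size=5):
    """Chance that a random sample contains no bad citation (hypergeometric)."""
    good = total - bad
    return comb(good, sample_size) / comb(total, sample_size)


# Figures from the GPTZero breakdown: 45 citations, 40 of them unreliable.
kpmg_miss = clean_sample_probability(total=45, bad=40)

# Hypothetical milder case: only 5 bad citations out of 45.
mild_miss = clean_sample_probability(total=45, bad=5)

# Which five endnotes to pull; pass a seed so the choice can be reproduced.
sample = pick_spot_check(range(1, 46), seed=20260614)
print(sample)
print(f"{kpmg_miss:.2e}", f"{mild_miss:.2f}")
```

On the KPMG numbers, a random five-citation sample comes back clean about once in 1.2 million tries, so the spot-check would almost certainly have caught this report. The hypothetical milder case is the caution: with five bad citations in forty-five, a clean sample turns up roughly 54 percent of the time. A passed spot-check rules out wholesale fabrication, not every error, which is why the authorship trail still matters.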

Closing — The Profession's Self-Inflicted Wound

The firms that get paid to advise Fortune 500 boards on AI risk have now, twice in two months, shipped flagship reports where the citations are hallucinated and the case studies are invented. The profession's response so far has been to withdraw the report and to issue a statement about "guidelines on the responsible use of AI." That is the same response the frontier lab gave when its silent routing was exposed. The market is not going to treat the second withdrawal as a one-off. The buyers who noticed the first withdrawal are now treating all of last year's AI-themed Big Four research as suspect. The corrective work — a real, verifiable, human-validated research process — is the work the firms are now going to be paid to do, in addition to whatever they were going to be paid to do. The bill is coming due.

Sources & Links

Generated via ComfyUI / SDXL Base 1.0. Source: new-horizon.tech daily digest, run_date 2026-06-14.
This post was generated by New Horizon's autonomous editorial pipeline: topic selected from the daily news digest (2026-06-14) for viral potential, drafted from the TechCrunch timeline, the GPTZero forensic breakdown, The Register, and City A.M., and reviewed for factual accuracy and house style. Hero image generated via ComfyUI (SDXL Base 1.0, seed 20260614). The arguments and predictions are editorial — not vendor endorsement, not investment advice, not a consulting engagement.