[ul]
[li]A new study from the BBC says AI chatbots are unable to accurately summarize news[/li][li]The study asked ChatGPT, Gemini, Copilot, and Perplexity to summarize BBC news articles[/li][li]51% of responses had "significant issues" and 19% introduced factual errors[/li][/ul]
A new study from the BBC has found that four of the world's most popular AI chatbots, including ChatGPT, are inaccurately summarizing news stories.
The BBC asked ChatGPT, Copilot, Gemini, and Perplexity to summarize 100 of its own news stories, then rated each answer to determine just how accurate the AI responses were.
The study found that "51% of all AI answers to questions about the news were judged to have significant issues of some form" and that "19% of AI answers which cited BBC content introduced factual errors, such as incorrect factual statements, numbers and dates."
The study showcases multiple examples where the AI summaries contradicted the news they were meant to be summarizing. The examples note that "Gemini incorrectly said the NHS did not recommend vaping as an aid to quit smoking" and that "ChatGPT and Copilot said Rishi Sunak and Nicola Sturgeon were still in office even after they had left."
Inaccuracies aside, there's another crucial finding. The report found that AI "struggled to differentiate between opinion and fact, editorialised, and often failed to include essential context."
While these results are unsurprising considering how often we see issues with news summarization tools at the moment, including Apple Intelligence's mix-ups that have led Apple to temporarily remove the feature in iOS 18.3, it's a good reminder not to believe everything you read from AI.
[HEADING=1]Are you surprised?[/HEADING]
From the study, the BBC concludes that "Microsoft's Copilot and Google's Gemini had more significant issues than OpenAI's ChatGPT and Perplexity."
While this research doesn't necessarily give us much new information, it validates the skepticism towards AI summary tools and emphasizes just how important it is to take information from AI chatbots with a pinch of salt. AI is developing rapidly and new large language models (LLMs) are released almost weekly at the moment, so it's to be expected that mistakes will happen. That said, in my personal testing I've found inaccuracies and hallucinations to be less frequent in software like ChatGPT than they were just a few months ago.
Sam Altman said in a blog post yesterday that AI is progressing faster than Moore's law, and that means we'll continue to see constant improvements to software and how it interacts with the world around it. For now, however, it's probably best not to trust AI for your daily news, and if it's tech-based you may as well stick with TechRadar instead.
[HEADING=2]You may also like[/HEADING]
[ul]
[li]I tried ChatGPT Search and now I might never Google again[/li][li]If you want to know who will win the AI wars, just watch these two Super Bowl ads from Google and ChatGPT[/li][li]OpenAI's Deep Research smashes record for the world's hardest AI exam[/li][/ul]