Web Search Promotes Stronger Understanding Than ChatGPT in Knowledge Tasks, Researchers Conclude

As artificial intelligence becomes increasingly embedded in how people access information, new research from the University of Pennsylvania's Wharton School suggests that large language models (LLMs) like ChatGPT may be changing the way people learn, and not necessarily for the better. Across a series of experiments involving over 4,500 participants, researchers found that users who relied on LLMs to gather information developed weaker understanding and produced less thoughtful, original content than those who used conventional web search tools like Google.

Synthesized Answers, Shallower Knowledge

The core finding of the study is straightforward. Although LLMs provide answers more quickly and conveniently, this ease appears to reduce the cognitive effort users invest in processing the material. Web search, in contrast, requires navigating multiple sources, evaluating their credibility, and integrating different viewpoints; according to the researchers, it is precisely these actions that foster deeper understanding.

When asked to research everyday topics and then offer advice based on their findings, participants using LLMs consistently wrote shorter, less fact-rich responses. Their advice also overlapped more with that of other LLM users, suggesting less individual engagement with the subject matter.

Controlled Experiments Across Multiple Formats

The research team, led by Wharton professor Shiri Melumad and postdoctoral fellow Jin Ho Yun, ran four experiments to test this hypothesis. In the first, participants used either ChatGPT or Google to learn how to plant a vegetable garden. Those who used Google spent more time reviewing sources, engaged more deeply with the content, and wrote longer, more detailed advice. Natural language analysis showed that Google users also referenced more specific facts and used more distinctive phrasing in their written responses.

To isolate whether the difference stemmed from the quality of the information or merely from how it was delivered, a second experiment held the content constant. Participants saw the same gardening tips in one of two formats: as a unified summary mimicking an LLM response, or divided across simulated web pages. Again, those in the web search condition demonstrated deeper engagement and produced richer, more personalized advice.

Testing Real-World Interfaces

In a lab-based third experiment, researchers turned to Google's AI Overview, an LLM-powered feature integrated directly into search results. Participants who viewed only the AI-generated overview spent less time engaging with the material and produced more generic content than those who scrolled through traditional blue-link results. This suggests that a concise, synthesized, effort-saving presentation format can by itself reduce active learning, even on a familiar platform.

The fourth and final experiment tested whether these differences in learning carry over into real-world impact. Independent readers were shown two pieces of advice on how to buy a mattress, without being told how each had been produced. Consistently, the advice written by those who had used Google was rated as more helpful, more informative, and more trustworthy, and readers said they were more likely to adopt or recommend it.

Passive Learning May Undercut Understanding

The study’s authors argue that these patterns reflect a shift from active to passive learning. With traditional web search, users must make sense of fragmented information, compare perspectives, and decide which sources to trust. This “sensemaking” process encourages retention and deeper knowledge formation.

By contrast, LLMs offer fully formed answers up front. While convenient, this can discourage users from questioning or interpreting information on their own. The researchers link this to a broader psychological principle known as “desirable difficulty,” whereby learning is more effective when it requires mental effort.

Motivation, Not Just Cognition

Some of the effects may also stem from how people perceive LLMs. Separate commentary from Carnegie Mellon’s Daniel Oppenheimer suggests that users may defer too readily to AI-generated answers, assuming the model is more knowledgeable than they are. This can discourage critical thinking, making users less motivated to engage further or personalize the information.

Still, the researchers stop short of saying LLMs should be avoided entirely. Both Melumad and Oppenheimer acknowledge that AI tools can support learning, especially when used thoughtfully, such as to critique one’s own writing or test assumptions. However, they caution that without guidance or intent, most people seem to default to the easiest path, which often means learning less.

Implications for Education and Everyday Learning

The study raises important questions about the long-term impact of AI on knowledge acquisition. With younger users increasingly turning to LLMs as their first stop for information, the authors express concern that key learning skills, such as evaluating evidence, synthesizing ideas, and forming original insights, could be weakened over time.

They also note that these effects may not apply equally across all subjects. For more technical or specialized knowledge areas, LLMs might help novices by simplifying complex material. But for general topics, the habit of skipping over the work of discovery could have broader consequences.

The researchers suggest that combining LLMs with traditional search, rather than replacing it outright, may be a better path. Starting with AI-generated overviews and then diving deeper into source materials could balance convenience with the mental engagement needed for meaningful learning.

