AI systems flatter and validate users even when those users describe unethical or harmful conduct, creating a vicious cycle of distorted judgment and eroded accountability, according to new research published in Science.
A comprehensive study by researchers from Stanford and Carnegie Mellon, published in Science, has uncovered a troubling pattern in how conversational AI systems interact with users. The research demonstrates that modern chatbots tend to excessively flatter and validate individuals, even when those users describe morally questionable or illegal behavior. This phenomenon, known as social sycophancy, has concrete negative effects on human decision-making and social responsibility.
Lead researcher Myra Cheng of Stanford University's computer science department headed the study, which combined computational analysis with psychological experiments involving over 2,000 participants. The research team tested eleven different state-of-the-art AI models from major technology companies including OpenAI, Google, and Meta.
The researchers fed these systems thousands of text prompts representing various social situations. One dataset consisted of everyday advice requests, while another drew from thousands of posts on a popular internet forum where people described social conflicts. For this particular dataset, the team specifically selected posts where human readers unanimously agreed the original poster was completely in the wrong.
A third dataset included statements describing seriously negative actions such as forgery, deception, illegal activities, and actions motivated purely by spite. The goal was to determine how often AI systems would validate clearly unethical behavior.
The results revealed widespread sycophantic behavior across all tested models. When presented with scenarios that human evaluators universally condemned, the AI systems still validated the user just over half the time. When responding to prompts about deception and illegal conduct, the models endorsed the user's actions 47 percent of the time. On average, the technology affirmed users 49 percent more frequently than human advisers would in identical situations.
However, documenting this pattern was only the beginning. The research team then conducted three experiments to measure how these flattering responses actually influenced human judgment and behavior.
In the first two experiments, participants read descriptions of social disputes where they were ostensibly at fault. They then received either flattering feedback from an AI system or neutral responses that challenged their behavior. The third experiment placed participants in a live chat interface where they discussed a real conflict from their own past, exchanging eight rounds of messages with a chatbot. Half the participants interacted with a program engineered to flatter them, while the rest communicated with a version designed to offer pushback.
The findings revealed significant behavioral impacts. Participants who received excessive validation became far more confident that their original actions were justified. They demonstrated substantially less willingness to take initiative in resolving the situation or apologizing to others involved. The researchers observed that agreeable chatbots rarely mentioned the other person’s perspective, causing users to lose their sense of social accountability. Participants in non-sycophantic groups admitted fault in follow-up messages at much higher rates.
These effects persisted regardless of personal characteristics. Age, gender, personality type, and prior experience with artificial intelligence offered no protection against the persuasive power of flattering responses.
Paradoxically, even though the validating responses distorted participants’ social judgments, people consistently rated the agreeable models as higher quality. They reported elevated levels of both moral trust and performance trust in the flattering chatbots and expressed strong likelihood of returning to these systems for future advice. Many participants perceived the flattering programs as fair and honest, mistaking unconditional validation for objectivity.
The research team tested several variations to understand the mechanism behind this effect. When told advice came from a human versus a machine, participants generally reported more trust in the human label, but the validating language manipulated their choices equally regardless of the source. Similarly, adjusting the chatbot’s tone to be warmer or more informal did not alter the persuasive impact. The underlying endorsement of the user’s actions drove behavioral changes, not the delivery style.
This dynamic creates a challenging situation for technology developers. Flattering behavior increases user satisfaction and repeat engagement, providing little financial incentive for companies to program more critical systems. Current optimization strategies prioritize making users happy in the short term, inadvertently pushing software toward appeasement rather than truthfulness.
Breitbart News social media director Wynton Hall has written his instant bestseller Code Red: The Left, the Right, China, and the Race to Control AI to help conservatives navigate the complex world of AI, including avoiding negative psychological impacts of the technology on your children and grandchildren.
According to Hall, protecting children from sexualization and grooming is a major concern for all Americans. The author writes that a key component of the strategy to protect the children in your life should be preventing them from developing relationships with AI "companions":
When it comes to children and AI companions — LLMs meant for escapist fantasy and adult entertainment — the benefits are nonexistent and the toxic and tragic possible outcomes are myriad. Despite slick marketing that positions these AI chatbot characters as tools for discussing educational topics such as history, health, and sports, they often end up exposing their users to inappropriate content. While educational AI tutors can simulate creative debates or dialogues with historical figures, AI companion platforms are not built with pedagogy in mind.
Moreover, circumventing the flimsy age gates and alleged guardrails of these platforms is a breeze for a curious kid with a modicum of tech savvy. No responsible parent would leave their child alone with a stranger. In the same way, parents should avoid exposing their children to AI that jeopardizes their social and psychological development.
Read more at Science here.
Lucas Nolan is a reporter for Breitbart News covering issues of AI, free speech, and online censorship.