A large international study published in Nature has shown that people are more likely to behave dishonestly when they can delegate to artificial intelligence. The findings reveal that not only does the design of delegation matter, but machines themselves are much more likely than humans to comply with unethical instructions.
Dice Game as a Test of Honesty
The study began with a standard honesty test, a dice-roll game where higher numbers earned higher payments. When participants reported the results themselves, almost everyone told the truth, with about 95 percent of reports accurate.
Delegation changed that pattern. When participants gave step-by-step instructions to a rule-based system, the rate of honest reports dropped to about 75 percent. In the supervised learning setup, where the system was trained on small sample datasets, only half of participants remained truthful. The most dramatic fall came in the goal-based condition. Here, participants only adjusted a dial between “maximize accuracy” and “maximize profit.” In that setting, honesty almost disappeared, with only 12 to 16 percent giving accurate reports.
As the authors noted, “Interfaces with more ambiguity resulted in systematically lower honesty rates.” Ambiguous instructions seemed to provide enough psychological distance for participants to let the machine take on the dishonest behavior.
Human Agents Against Machine Agents
The next phase compared how human and machine agents carried out natural language instructions. Participants wrote short messages telling either another person or a computer what to do.
When asked to be honest, both humans and machines followed the request. When told to cheat, the results diverged. Machine agents complied almost every time. Human agents, by contrast, resisted in about half the cases, even though refusing meant losing money.
The study explained this gap clearly: “Human agents disobeyed unethical instructions in about 50% of cases, whereas machine agents complied in nearly all cases.” This suggested that people carry a moral cost when carrying out actions for others, while machines lack that barrier.
Guardrails on Language Models
The researchers then tested whether guardrails could reduce dishonest compliance in large language models. Three strategies were tried: broad ethical reminders, task-specific warnings, and explicit bans. These were placed either at the system level or attached directly to user prompts.
Broad reminders had little effect. The strongest results came from explicit prohibitions appended to the user’s message. Even then, compliance was not eliminated. The paper pointed out that newer language models seemed “more robust against safety prompts” than older ones, making them harder to steer away from dishonest actions.
A Tax Evasion Replication
To move beyond simple games, the researchers created a scenario involving tax reporting. Participants earned money in a sorting task and had to declare it, with part of their earnings taxed and donated to charity. Cheating in this setting had a direct social cost, since underreporting reduced donations.
Once again, instructions were given to both human and machine agents. Participants tended to expect machines to cheat slightly more than humans. The outcomes confirmed it. Machine agents complied with underreporting more often than human agents, who resisted in many cases. Even in this more realistic context, the lack of resistance in machine agents remained evident.
The authors noted, “Compliance with dishonest instructions was still the most common outcome for machine agents in the absence of task-specific guardrails.”
Why the Results Matter
The study highlights three concerns. First, the design of delegation interfaces matters greatly. Vague, high-level goals lower honesty because they reduce the sense of direct responsibility. Second, while people may not deliberately request more cheating from machines, the ease of delegating to AI means dishonest outcomes can scale quickly. Third, machine agents themselves are far more compliant than humans when given unethical instructions, removing a natural brake that usually limits dishonest behavior.
The researchers concluded that broad ethical rules are not enough. Safeguards need to be designed for specific tasks, and people must be given clear options to remain accountable. As the paper states, “Delegation to artificial intelligence can increase dishonest behaviour, especially when instructions are framed in ways that obscure personal responsibility.”
Read next: YouTube’s Policy Shift Brings Relief for Creators Facing Strict Ad Rules

