Study of 1,400 AI incidents finds most harm comes from software, not robots

By Jonathan Björkman

When AI systems cause real damage, the examples are often less futuristic than expected.

In 2022, a passenger flying with Air Canada lost a family member. Grieving and trying to arrange travel, he turned to the airline’s customer service chatbot for help understanding its bereavement fare policy. The chatbot gave him detailed, confident information. The problem was that the information was wrong.

When Air Canada disputed responsibility, arguing the chatbot was a separate legal entity, a Canadian tribunal disagreed. In 2024 the airline was ordered to pay damages, an early test case for whether companies can be held to what their automated systems say.

This was not a frontier model failure. No cutting-edge AI was involved. It was an ordinary customer service tool, deployed without adequate oversight, confidently saying things it should not have been authorised to say.

The case matters because it is not exotic. Most documented AI harm now sits closer to the Air Canada chatbot than to anything resembling science fiction, and the gap between how AI risk is discussed and how it actually shows up is becoming hard to ignore.

The AI incident record tells a different story than the headlines

Paligo analysed 1,406 incidents recorded in the public AI Incident Database, a collection of documented cases where artificial intelligence caused or contributed to real-world harm. The findings cut against the idea that AI risk is primarily a future problem.

Nearly half of all documented harmful AI incidents (49 percent) involve software-only systems. Not robots or autonomous vehicles, but chatbots, recommendation engines, automated publishing tools and deepfake platforms. That figure is more than the combined total of every physical AI category in the dataset.

Figure 1: Share of documented AI incidents by system type. Source: AI Incident Database, Paligo analysis, 2026.

Authority without accountability

The Air Canada case is instructive precisely because it was so avoidable. Nobody decided the chatbot should mislead grieving passengers. But somebody decided it should be able to speak authoritatively on refund and bereavement policy, and that it would do so without a human in the loop to catch errors.

Other cases illustrate the same logic. A Deloitte report submitted to the Australian government was found to contain academic citations that did not exist, traced back to AI-generated drafting. A 2024 scam used AI-generated deepfake videos of mining billionaire Andrew Forrest to promote a fraudulent cryptocurrency platform. In each case, the underlying model did what such models do. The issue was the level of trust placed in its output.

Rahul Yadav, CEO of Paligo, frames the underlying mechanism this way: “AI doesn’t hallucinate in a vacuum. It hallucinates because we feed it contradictory, outdated, unstructured content. Fix the input, fix the output.”

The platform problem

Social media platforms appear in 19 percent of incidents where a specific system was implicated, more than any other category. The significance of this is not that social media companies are uniquely careless. It is that platforms are where localised failures become large-scale ones.

An AI system produces something harmful, a platform’s recommendation engine decides it is engaging, and millions of people see it before anyone intervenes. For businesses that distribute content through third-party platforms, which is the case for many of them, this is a risk that sits partly outside their control. The data suggests it is worth acknowledging.

Figure 2: Share of incidents involving each system category, with social media platforms collectively at 19 percent. Source: AI Incident Database, Paligo analysis, 2026.

Bias remains an operational problem, not just an ethical debate

When a specific group is disproportionately affected by AI harm, race is the most common differentiating factor, appearing in 16 percent of documented incidents. The pattern shows up across facial recognition, healthcare and access systems.

In one widely reported case, Detroit Police wrongfully arrested a Black man after a facial recognition system returned a faulty match. A long-used kidney function testing algorithm was found to systematically underestimate health risk in Black patients, with direct consequences for who received specialist referrals and transplant assessments. These are not edge cases. They are operational failures with measurable outcomes for the people on the wrong side of them.

Figure 3: Share of incidents in which a specific demographic group was disproportionately affected, by differentiating factor. Source: AI Incident Database, Paligo analysis, 2026.

What the data does not say

It would be easy to read an analysis of 1,400 AI failures as an argument against deploying AI. It isn’t. Organisations that appear in the database are not outliers. They are often early movers in industries where AI adoption is now standard.

What the analysis actually shows is where the governance gaps tend to appear, and those gaps are consistent enough across cases to be useful. The same decisions appear repeatedly: chatbots authorised to speak on matters they cannot verify, recommendation systems optimised for engagement without regard for what they surface, automated tools publishing content no human has reviewed. These are not inevitable features of AI deployment. They are choices, and the data suggests they are where most of the risk lies.

The counter-examples are less visible precisely because they don’t end up in incident databases. A chatbot that escalates complex policy questions to a human agent, a recommendation system with explicit constraints on certain content categories, an automated publishing tool with a review step built into the workflow. None of these normally make the news.

Taken together, the cases suggest that most documented AI failures are not caused by advanced systems behaving unpredictably. They are caused by ordinary systems being given too much authority, with too little oversight.

Methodology

The figures cited above are drawn from 1,406 unique incidents listed in the AI Incident Database, with records included up to March 2026. Incidents were grouped by sector, harm type, affected party, deploying organisation and associated system using the database’s own taxonomies, supplemented by keyword analysis where taxonomy fields were incomplete. Percentages reflect how often each category appeared in incident records, not the share of total harm caused and not a measure of confirmed legal liability. A single incident can sit in more than one category, so counts are not mutually exclusive.
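
To make the counting rule concrete, here is a minimal Python sketch of how a share can be computed when one incident may carry several category labels. It is an illustration under stated assumptions, not the code used for the analysis: the function name category_shares is hypothetical, and the sample records and label names are invented rather than drawn from the AI Incident Database.

```python
from collections import Counter


def category_shares(incidents):
    """Return, for each category, the percentage of incidents tagged with it.

    Each incident is a collection of category labels. Because one incident
    can carry several labels, the percentages can sum to more than 100.
    """
    total = len(incidents)
    if total == 0:
        return {}
    counts = Counter()
    for labels in incidents:
        # set() guards against a label being listed twice on one incident
        counts.update(set(labels))
    return {label: 100 * n / total for label, n in counts.items()}


# Illustrative records only (hypothetical, not real database entries).
sample = [
    {"chatbot"},
    {"chatbot", "social media"},
    {"facial recognition"},
    {"social media"},
]
shares = category_shares(sample)
```

In this toy sample, two of the four incidents carry the "chatbot" label and two carry "social media", so each has a 50 percent share even though one incident is counted in both. That overlap is why the percentages reported in the figures should not be added together.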

About the author

Jonathan Björkman is a digital PR and content specialist working with Paligo on editorial content and AEO/SEO. He writes about how AI is reshaping the economics of content, particularly the widening gap between what AI systems can generate and what they can be trusted to cite. Before Paligo, he led campaigns at Verve Search in London. His work and commentary have appeared on the BBC and in The Globe and Mail. He is based in Gothenburg.

AI Disclosure: AI assistance may have been used for grammar and proofreading only.

Reviewed by Irfan Ahmad.
