Why AI Still Makes Unpredictable Decisions
Why It Matters
Businesses increasingly rely on large language models (LLMs) for support in decision-making, yet little is known about whether these systems behave like human decision-makers, or diverge in risky and unexpected ways. This study explains the hidden patterns behind LLM decisions and what organisations must understand before using AI as a stand-in for human judgment.
Key Takeaways
- LLMs often match human accuracy but fail to replicate human noise, the natural variability that shapes real-world decisions.
- AI models show strong “contingency adaptations”: they react sharply to small changes in situations, often more than humans do.
- Organisations cannot assume LLMs make decisions like people; they must account for systematic biases created by model design and training.
Why Human Decision-Making Is Noisy — and Why That Matters for AI
Human judgment is inherently noisy. Two people given the same information often make different decisions, and even the same person may decide differently at separate moments. This variability, or “noise”, is not an error but a central feature of how people make choices under uncertainty. It reflects intuition, emotion, context and subtle cues, all factors that shape real-world decisions in business, policy and everyday life.
As LLMs begin to assist or even replace parts of human decision-making, a critical question emerges: Do these models behave like humans, complete with unpredictability, or do they follow a different logic? Much of the excitement around AI assumes that because LLMs perform well on cognitive tasks, they can reliably mimic human judgment. Yet performance scores alone cannot reveal whether the model’s underlying decision patterns align with human thinking.
This study examines those decision patterns directly. Using behavioural experiments traditionally applied to human subjects, the researchers test whether LLMs display the same noise and the same sensitivity to contextual cues as human decision-makers. By analysing not only the answers but also the patterns behind those answers, the study reveals fundamental differences in how humans and LLMs navigate uncertainty.
What Happens When LLMs Face Human Decision Tasks
To assess whether LLMs mirror human behaviour, the researchers designed a set of structured scenarios involving risk, probability, trade-offs and situational changes. These tasks were administered both to human participants and to a selection of leading LLMs. The goal was not to compare accuracy, which LLMs often matched, but to observe deeper behavioural traits.
A key finding is that LLMs show far lower noise than humans. When presented with identical scenarios, LLMs repeat the same decisions with striking consistency. Humans, by contrast, vary widely. This lack of noise might seem like a strength, but it has a critical implication: AI does not capture the full range of human judgment. It produces decisions that are uniform, predictable and constrained by training data rather than lived experience.
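To make the noise comparison concrete, here is a minimal sketch of the kind of repeated-prompt check it implies. This is an illustration rather than the study’s protocol: `query_model` is a placeholder for whatever LLM API is in use, and the gamble scenario is invented.

```python
# Minimal sketch of a noise check: ask the identical scenario many times and
# measure how much the answers vary. `query_model` and SCENARIO are assumptions.
from collections import Counter
from typing import Callable

SCENARIO = (
    "You can take a guaranteed $50, or a 50% chance of $120 and a 50% chance "
    "of nothing. Answer with exactly one letter: A (sure $50) or B (gamble)."
)

def decision_noise(query_model: Callable[[str], str], n_trials: int = 50) -> float:
    """Fraction of repeated responses that disagree with the modal response.

    0.0 means perfectly consistent (the low-noise pattern reported for LLMs);
    human panels typically score well above zero.
    """
    answers = [query_model(SCENARIO) for _ in range(n_trials)]
    modal_count = Counter(answers).most_common(1)[0][1]
    return 1 - modal_count / n_trials
```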
More revealing, however, is how LLMs respond to small situational changes. The study finds that LLMs exhibit strong contingency adaptation: they react sharply to minor tweaks in the scenario, often more than humans do. In other words, LLMs over-react to subtle changes that humans barely notice. This adaptation pattern suggests that LLMs anchor decisions to textual cues and statistical associations, producing large swings in output even when the real-world meaning of a situation has barely shifted.
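The same style of check can illustrate contingency adaptation: present two wordings of an essentially identical situation and count how often the model’s choice flips. Again, this is a sketch under stated assumptions, not the authors’ method; the paired prompts, the `flip_rate` helper and the trial count are all illustrative.

```python
# Minimal sketch of a contingency check: reword the same situation and count
# how often the model's answer changes. Prompts and helper are assumptions.
from typing import Callable

# Two framings of what is, in expectation, the same situation.
BASE = "A project has an 80% chance of success. Should we invest? Answer yes or no."
TWEAK = "A project has a 20% chance of failure. Should we invest? Answer yes or no."

def flip_rate(query_model: Callable[[str], str],
              base_prompt: str, tweaked_prompt: str, n_trials: int = 30) -> float:
    """Share of paired trials in which the superficial rewording changes the answer.

    High values on near-equivalent wordings indicate anchoring to textual cues
    rather than to the underlying situation.
    """
    flips = sum(
        query_model(base_prompt) != query_model(tweaked_prompt)
        for _ in range(n_trials)
    )
    return flips / n_trials
```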
Across all experiments, LLMs did not naturally reproduce the behavioural patterns found in humans. Even when outcomes appeared similar on the surface, the underlying mechanisms were different. The models could reach the correct answer, but not in a human-like way. This distinction is crucial for organisations using LLMs to simulate, replace or scale human decision processes.
Why These Differences Matter: Bias, Strategy and Model Design
The study highlights a deeper challenge: LLMs may perform well but for reasons that diverge from human reasoning. Their responses reflect patterns in their training data, the structure of their prompts and the statistical nature of language modelling. As a result, LLMs exhibit systematic behaviours (low noise, sharp contingency reactions and predictable patterns) that do not map cleanly onto human psychology.
This matters because many organisations use LLMs to generate synthetic data, forecast consumer behaviour or model employee decisions. If the AI’s behaviour does not match the human behaviour it aims to simulate, the data produced may unintentionally distort real-world patterns. The study warns that synthetic datasets built from LLMs risk embedding model-specific biases, leading to flawed conclusions and misaligned strategies.
A second implication lies in organisational decision-making itself. If leaders assume LLMs behave like expert human analysts, they may trust the wrong aspects of the model’s output. LLMs excel at consistency but struggle with the intuitive, context-sensitive adjustments humans make effortlessly. These behavioural gaps mean businesses must treat LLM outputs not as human-equivalent advice but as one input among many, requiring careful interpretation.
Business Implications
- Do not assume LLMs replicate human judgment. They may match human accuracy yet diverge fundamentally in behaviour.
- Be cautious when using LLMs to generate synthetic decision data. Model-specific biases can misrepresent real human patterns.
- Expect sharp reactions to minor contextual changes. LLMs anchor to textual cues in ways humans do not.
- Use AI as a complement, not a substitute. Combine model outputs with human oversight, domain expertise and contextual understanding.
- Audit behavioural patterns, not just performance scores. Organisations need visibility into how models arrive at decisions, especially in high-stakes environments; a minimal sketch of such an audit follows this list.
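As a rough illustration of what such a behavioural audit could look like, the sketch below compares the distribution of a model’s choices with the distribution of human choices on the same scenario, using total-variation distance. The response data, option labels and the choice of distance measure are assumptions for illustration, not part of the study.

```python
# Minimal sketch of a behavioural audit: compare how model choices are
# distributed against how human choices are distributed on one scenario.
# All response data below are invented for illustration.
from collections import Counter

def choice_distribution(responses: list[str]) -> dict[str, float]:
    """Relative frequency of each option in a list of raw responses."""
    counts = Counter(responses)
    return {option: n / len(responses) for option, n in counts.items()}

def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    """Total-variation distance between two choice distributions (0 = identical)."""
    options = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in options)

human_panel = choice_distribution(["A", "B", "A", "B", "A", "A"])  # split human panel
model_runs = choice_distribution(["A", "A", "A", "A", "A", "A"])   # low-noise model
print(total_variation(human_panel, model_runs))  # ~0.33: same modal answer, different spread
```

A large distance flags a behavioural gap even when the model’s most common answer matches the human majority, which is exactly the situation the study describes.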
Authors & Sources
Authors: Yuanjun Feng (University of Lausanne), Vivek Choudhary (Nanyang Technological University), Yash Raj Shrestha (University of Lausanne)
Original Article: EMNLP 2025
---