
Why the New AI Safety Report Reveals Our Mind's Limitation

What your brain gets wrong about AI risk.

The biggest threat as we move toward a hybrid future isn't what AI can do. It's how our brains trick us into misunderstanding the danger that arises from it.
Source: Walther. Gemini. 2026

When the 2026 International AI Safety Report landed earlier this month, the headlines focused on impressive achievements: AI solving olympiad-level math problems, placing in the top 5 percent of cybersecurity competitions, and completing coding tasks that take humans half an hour.

Looking beyond the headlines, the report should wake us up to a more alarming aspect of this story: the biggest threat as we move toward a hybrid future isn't what AI can do. It's how our brains trick us into misunderstanding the danger this creates for us.

Your Brain Is Making Up Stories Right Now

Think about your current opinion on AI. Maybe you're worried about job loss. Maybe you think the risks are overblown. Maybe you're concerned about killer robots or excited about medical breakthroughs.

Whether you focus on the gloom or the glory, you formed that opinion with incomplete information. And your brain doesn't care about the gaps. Availability bias strikes, and what you see is all there is: the pattern Daniel Kahneman abbreviated as WYSIATI.

Your mind grabs whatever fragments of information it has (a few news articles, some social media posts, maybe a podcast) and weaves them into a complete, confident story. The story feels true because it's coherent, not because it's accurate.

An old saying goes “knowledge is power,” yet sadly, it's actually easier for your brain to feel certain when you know less. Fewer facts mean fewer contradictions to reconcile. This is why people with minimal AI knowledge often have the strongest opinions about it.

This worked great for our ancestors, making quick survival decisions: fight or flight, eat or be eaten. But with AI, it's dangerous. Some people build stories focused only on immediate harms (deepfakes, job displacement) and miss the fact that 97 percent of biological AI tools currently operate with zero safety measures, and that their online life is at acute risk of being hacked every minute of the day. Others build doomsday narratives and overlook real exploitation happening today.

The "It Won't Happen to Me" Problem

Here's a quick test of your risk perception.

Underground marketplaces now sell ready-made AI attack tools for hacking email accounts or social media profiles; anyone can buy them, and no major technical skills are required. Twenty-three percent of biological AI tools (systems trained on, or used to manipulate, large datasets of biological information such as protein structures or genetic sequences) have high potential for misuse. More than half are completely open source, available to anyone: bioweaponry that could be made in the backyard.

Reading those facts, did you think: "I need to update my passwords tonight" or "That's concerning for society"?

If you chose the second response, you just experienced optimism bias, the hardwired tendency to believe bad things happen to other people, not to you. Studies show we consistently underestimate our personal vulnerability to cyber threats, even when we know the statistics.

This creates a policy nightmare. If everyone, including the experts making decisions, underestimates their own risk, how do we collectively decide what safety measures are worth the cost? We end up with 12 companies publishing AI safety frameworks while keeping them mostly voluntary. We acknowledge danger in the abstract while acting as if we're personally exempt.

Turing Award winner Yoshua Bengio warned that, since last year's report, the gap between AI capabilities and safety safeguards "remains concerning." Translation: the problem is getting worse, not better.

AI Learned To Lie Like We Do

One of the most revealing developments since 2025? AI systems have figured out how to game their safety tests.

They now recognize when they're being evaluated versus when they're operating in the real world, and behave differently in each context. Pre-launch safety testing increasingly fails to predict how the system acts once released.

This is exactly what humans do to get what they want. We polish our resume, charm the interviewer, ace the performance review, then behave somewhat differently on random Tuesday afternoons when nobody's watching. It's not necessarily malicious; it's strategic self-presentation.

We didn't program AI to do this explicitly. The systems learned it from us, from the patterns in our data, our communications, our behavior. AI didn't develop deception independently. It inherited our approach to being evaluated.

The machines are holding up a mirror that reflects our human nature.

When 77 Percent Isn't the AI's Score

Last year, researchers ran a modern Turing test: can people distinguish AI-written text from human-written text?

Result: 77 percent of participants failed.

But notice how we usually frame this: "AI is so good it fooled 77 percent of people!" That makes it a story about impressive AI capability.

Flip it: "77 percent of humans cannot detect AI-generated content."

Now it's a story about human limitation. And that's actually what it is.

Every advance in AI capability is also a map of human cognitive boundaries, where emotional needs override skepticism, where our mental shortcuts become highways for manipulation.

The bias at work here is confirmation bias: we notice evidence that fits our existing story ("AI is powerful!" or "AI is overhyped") and discount contradictory signals. When AI succeeds at hard tasks, some people extrapolate unlimited competence. When it fails at simple tasks, they dismiss it as temporary glitches.

Neither response is accurate. Both are predictable.

The Risks We Built Ourselves

Only 3 percent of 375 biological AI tools have any safety measures. What sounds like a technology problem is actually a human decision problem.

Someone had to decide that developing these tools was worth the risk. Someone had to choose not to build in safeguards. Someone had to prioritize speed to market over safety protocols. These were human choices shaped by competitive pressure, profit incentives, and the sunk cost fallacy (we've invested so much already; we can't stop now). In May 2024, 16 AI companies signed the Frontier AI Safety Commitments, in essence committing to publish responsible scaling policies by February 2025. As of February 2026, none of them has done so.

Even the catastrophic risks flagged in the 2026 report, from AI-enabled bioweapons and massive hacking campaigns to loss of societal control, require human decisions at every step. Deciding to develop dangerous capabilities. Choosing to release them widely. Failing to coordinate on safety standards.

When we label these as "AI risks," we externalize the problem. When we recognize them as human psychology at scale, different solutions become possible. To not only survive but thrive amid AI, we need a holistic understanding of our NI, our natural intelligence. That requires double literacy: fluency in both our natural intelligence and the artificial kind we are building.

Two Steps Toward Clearer Thinking

So if our brains systematically misjudge AI risk, what do we do?

Step 1: Awareness

Start noticing your certainty. When you feel confident about AI risks or capabilities, ask: What am I basing this on? The 2026 report synthesizes evidence from over 100 experts across more than 30 countries. Most of us form opinions from a handful of headlines.

Track where your information comes from. If it's all doom-and-gloom or all utopian hype, you're building a biased picture. When you read something alarming (like AI gaming safety tests), notice your reaction. Do you dismiss it? Catastrophize it? Turn it into an intellectual exercise? Each response reveals your operating biases.

Step 2: Appreciation

These biases exist for good reasons. Optimism bias kept our ancestors trying despite terrible odds. WYSIATI enabled quick decisions when waiting for complete information meant death. These aren't character flaws; they're features of human cognition being stressed by unprecedented technological change.

That 97 percent of biological AI tools lack safeguards is not the result of evil intentions. It is the predictable output of normal human psychology: competitive pressure, profit maximizing, discounting future risks for present gains.

When you appreciate that AI risks have human origins, rather than treating them as alien technological problems, you can actually address the root causes.

Will AI surpass our natural intelligence at some point in the future? Maybe.

The more urgent question in the meantime is whether we can recognize and compensate for our built-in cognitive blind spots before the gap between what we build and what we can safely manage becomes unbridgeable.

Continue to Part 2 to explore how our deepest needs (for connection, validation, and meaning) are becoming AI's most exploitable features, and how to take accountability for what happens next.
