When Intelligence Goes Wrong: The Maccabi Tel Aviv Case and the Urgent Need for Better OSINT Tradecraft

The Coalition of Cyber Investigators review the Maccabi Tel Aviv case to demonstrate the importance of adhering to core intelligence tradecraft principles.

Paul Wright & Neal Ysart

1/16/2026 - 9 min read

How AI hallucinations, confirmation bias, and abandoned best practices led to a preventable intelligence failure

There's a moment in every failed investigation where you realise something's gone terribly wrong. You trace back through the decision chain and find the exact point where someone failed to ask, "Wait, how do we know this?"

That moment happened in November 2025 when West Midlands Police recommended banning Maccabi Tel Aviv supporters from attending a Europa League fixture against Aston Villa. The intelligence that supported this decision was generated by Artificial Intelligence (AI). The problem was that it was false.

The root cause will make anyone working in intelligence, investigations, or public safety nervous. This wasn't a sophisticated attack, a clever adversary, or an unavoidable black swan event. This was a straightforward failure to follow basic Open-Source Intelligence (OSINT) tradecraft - practices that experienced practitioners have long understood and applied instinctively.

The Anatomy of a Preventable Failure

A report by His Majesty's Inspectorate of Constabulary and Fire & Rescue Services (HMICFRS) helps explain what went wrong.

They reported that West Midlands Police relied on intelligence derived from artificial intelligence tools to assess the threat posed by Maccabi Tel Aviv supporters. In this case, it appears they used Microsoft Copilot.

The “intelligence” painted a picture of serious misbehaviour and security risks that justified excluding fans from the match.

However, Amsterdam police, who had dealt with Maccabi supporters in earlier incidents, explicitly stated that the West Midlands account was "highly exaggerated." A letter obtained by the BBC from the Dutch Inspector General also confirmed the claims were inaccurate.

This failure to apply investigative rigour meant that a key decision rested on a risk assessment built from AI-generated intelligence - intelligence that lacked human corroboration and was directly contradicted by foreign law enforcement.

Not only that, the HMICFRS report noted West Midlands Police demonstrated "poor record keeping and retention of important information." In practical terms, this means there was no way to prove exactly how this decision was made, who reviewed what, or whether anyone questioned the AI-generated intelligence.

The Missing Tradecraft: Intelligence Grading and the 3x5x2 Framework

Investigators and analysts are drilled on intelligence grading frameworks early in their careers. It can often feel tedious and bureaucratic, but it’s vital.

Imagine a case falling apart in court because the defence team was able to systematically dismantle the intelligence assessment by pointing out that the prosecution couldn't demonstrate source reliability or confidence levels. In such circumstances, inexperienced practitioners would quickly learn that intelligence grading isn't red tape; it's there for a reason.

The Coalition of Cyber Investigators recently published essential guidance on this ("Embracing Grading, Handling, and Dissemination Practices in OSINT"). They outlined the 3x5x2 model and similar frameworks that provide structure for evaluating:

  • Source reliability – Can we trust where this came from?

  • Information credibility – Does the content itself make sense?

  • Handling codes – Who should see this and under what conditions?

These aren't optional extras. They're the difference between intelligence and gossip (or, in this case, between intelligence and algorithmic hallucination).
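To make this concrete, below is a minimal sketch of how a 3x5x2-style grade might be attached to a piece of intelligence. It assumes the commonly described structure of three source-reliability grades, five information-credibility grades, and two handling codes; the enum names, the GradedIntelligence class, and the fit_for_operational_use gate are illustrative assumptions, not any force's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

# Illustrative 3x5x2-style grading. The three/five/two split mirrors the
# commonly described framework; the names are a sketch, not an authoritative
# mapping of any particular force's scheme.

class SourceReliability(Enum):
    RELIABLE = auto()      # source has a proven track record
    UNTESTED = auto()      # no track record either way (an LLM sits here at best)
    NOT_RELIABLE = auto()  # known doubts about the source

class InformationCredibility(Enum):
    KNOWN_DIRECTLY = auto()                 # witnessed first-hand by the source
    KNOWN_INDIRECTLY_CORROBORATED = auto()  # second-hand but independently confirmed
    KNOWN_INDIRECTLY = auto()               # second-hand, unconfirmed
    NOT_KNOWN = auto()                      # provenance cannot be established
    SUSPECTED_FALSE = auto()                # believed to be untrue

class HandlingCode(Enum):
    SHARE = auto()                  # lawful sharing permitted
    SHARE_WITH_CONDITIONS = auto()  # sharing only under stated conditions

@dataclass
class GradedIntelligence:
    summary: str
    source: SourceReliability
    credibility: InformationCredibility
    handling: HandlingCode
    corroborated_by: list[str] = field(default_factory=list)  # independent sources

    def fit_for_operational_use(self) -> bool:
        """Crude gate: unreliable, unverifiable, or uncorroborated material never clears it."""
        return (
            self.source is not SourceReliability.NOT_RELIABLE
            and self.credibility in (
                InformationCredibility.KNOWN_DIRECTLY,
                InformationCredibility.KNOWN_INDIRECTLY_CORROBORATED,
            )
            and len(self.corroborated_by) > 0
        )
```

The specific thresholds matter less than the principle: every item carries its grade with it, so a downstream decision-maker can see at a glance what they are actually relying on.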

When you receive information - any information, but especially information generated by AI systems that are known to hallucinate - you should be asking:

  • What's the provenance of this data?

  • Has it been corroborated by independent sources?

  • What's our confidence level in this assessment?

  • What alternative explanations exist?

  • Who has reviewed this before it informed consequential decisions?

In the Maccabi case, these questions apparently weren't asked - or, if they were, the answers weren't satisfactory and the decision was made anyway. That's not an intelligence failure; it's a tradecraft and procedural failure.

The AI Integration Problem: Speed vs. Scrutiny

AI has enormous potential in OSINT workflows: pattern recognition, data processing at scale, multilingual analysis, and anomaly detection are genuine advantages that investigators desperately need.

But integrating AI into OSINT workflows without proper procedures and safeguards is dangerous.

The Coalition of Cyber Investigators published another critical piece on this topic: "Enhanced Challenges and Mitigation Strategies for OSINT AI Integration", which explored risks ranging from hallucinations to evidentiary limitations and outlined mitigation strategies to help preserve analytical integrity.

Police forces are under intense pressure, including budget cuts, increasingly complex crime, staffing shortages, and growing public demand for rapid response. AI promises to help address these challenges by enabling faster analysis, lower costs, and greater efficiency.

While AI can deliver fantastic results, it fails when you abandon the fundamental practices that ensure the quality of intelligence - for example, by treating automated output as a verified fact rather than an input that requires evaluation. Skipping human review, whether because teams are too stretched or because they choose not to, also undermines the entire intelligence process.

Public-Private Partnerships and the Technology Trap

There’s a further factor that needs to be addressed: the relationship between police forces and technology vendors. Public-private partnerships (PPPs) in policing have become increasingly common and bring tangible benefits, including access to cutting-edge tools, technical expertise, and innovation capacity that public-sector budgets can't match.

But PPPs also create subtle pressures. When you've invested in an AI system and senior leadership has vehemently championed it, there's a gravitational pull to use the technology, trust its outputs, and justify the investment.

This is where organisational culture matters enormously. You need teams empowered to say "this AI output doesn't pass scrutiny" without fear of being seen as obstructionist or tech resistant. You need procurement processes that prioritise tradecraft compatibility over feature lists, and procedures that include robust evaluation protocols and, crucially, the ability to override or ignore AI recommendations when human judgment dictates.

We have seen organisations (not just police forces) fall into what we call the "technology trap", where the tool becomes the methodology rather than a supporting framework. When this occurs, "the AI said so" often serves as sufficient justification rather than the beginning of an analytical process.

Confirmation Bias Amplified

The HMICFRS report noted a further crucial point in the Maccabi case: confirmation bias influenced West Midlands Police's recommendation. This can become especially dangerous when combined with AI tools.

Confirmation bias is the tendency to seek out, interpret, and remember information that confirms our existing beliefs. It's a universal human cognitive quirk to which no one is immune. Good intelligence tradecraft includes checks specifically designed to counteract confirmation bias, such as intelligence team analysis, alternative hypotheses, or devil's advocate reviews.

But AI can supercharge confirmation bias in dangerous ways. If you're concerned about Maccabi supporters (perhaps based on legitimate incidents elsewhere, media coverage, or general anxiety about football violence) and you ask an AI system to analyse the threat, it may well generate outputs that confirm your concerns, especially if it has been trained on data that emphasises security incidents or if it picks up on the framing of your queries.

This matters because AI-generated content can feel more authoritative than human judgment. It comes with an aura of objectivity, data-driven analysis, and computational rigour. It's harder to question than an analyst saying, "I've got a hunch about this."

The result is a situation in which confirmation bias, amplified by technological authority and the short-circuiting of tradecraft safeguards, can lead to decisions that affect real people and communities.

The Ripple Effects: Trust, Courts, and Public Opinion

The consequences can be significant and extend far beyond a single football match. They include:

  • Legal proceedings – Defence teams pay attention to these cases. When police intelligence failures become public, they undermine the credibility of intelligence-led prosecutions in other cases.

    "Your Honour, remember the Maccabi case where they relied on AI hallucinations? How do we know this intelligence is any better?"

  • Community trust – Maccabi Tel Aviv has supporters in the UK, many of whom are British citizens or residents. Imagine being told you can't attend a football match because an algorithm generated false information about your community. Policing operates best when it has the community's trust, but in this case, it's not hard to see how that bond could be broken and replaced by a perceived injustice.

  • Institutional credibility – Public confidence in policing depends on perceptions of competence and fairness. High-profile failures damage that confidence in ways that take years to rebuild.

  • International relations – This case involved Dutch police contradicting UK police assessments. That's diplomatically awkward and undermines crucial information-sharing relationships for transnational security cooperation.

  • The chilling effect on AI adoption – Cases like this risk making organisations more hesitant to adopt AI tools that could genuinely help them - bad implementation creates resistance to good implementation.

What Should Have Happened (and Must Happen Going Forward)

1. Mandatory Intelligence Grading for AI-Assisted Analysis

Every piece of AI-generated or AI-processed intelligence should be graded to assess source reliability, information credibility, and confidence levels. That means establishing how the AI system's output can be verified, seeking independent corroboration of the underlying data, and determining exactly how specific and well-founded the findings are.

If you can't grade it, you shouldn’t use it for operational decisions.
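Continuing the hypothetical GradedIntelligence sketch from earlier, here is how an AI-derived, uncorroborated claim would be held back at this step; the claim text and gate behaviour are invented purely for illustration.

```python
# Hypothetical usage of the GradedIntelligence sketch above: an AI-derived,
# uncorroborated claim fails the grading gate and is held for corroboration.
ai_claim = GradedIntelligence(
    summary="Large-scale disorder expected from visiting supporters",
    source=SourceReliability.UNTESTED,             # an LLM has no track record
    credibility=InformationCredibility.NOT_KNOWN,  # provenance cannot be traced
    handling=HandlingCode.SHARE_WITH_CONDITIONS,
)

if not ai_claim.fit_for_operational_use():
    print("Hold: seek independent corroboration before this informs any decision.")
```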

2. Decision Logging and Documentation

Every consequential decision should have a documented chain of accountability recording what intelligence was considered, how it was evaluated, which alternatives were explored, who made the final call, and on what basis.

This isn't cover-your-ass bureaucracy; it's accountability infrastructure.

If West Midlands Police had proper decision logs, it could be established exactly how the Maccabi recommendation was made and where the process broke down.
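As one sketch of what such a decision log might look like in code, the snippet below records each consequential decision as an append-only JSON Lines entry. The field names are assumptions chosen for illustration, not a mandated standard.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class DecisionLogEntry:
    decision: str                       # what was decided
    intelligence_considered: list[str]  # references to the graded items relied upon
    alternatives_explored: list[str]    # options considered and rejected
    decided_by: str                     # who made the final call
    rationale: str                      # on what basis
    reviewed_by: list[str] = field(default_factory=list)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def append_entry(path: str, entry: DecisionLogEntry) -> None:
    """Append-only JSON Lines log, so the audit trail is never silently rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(asdict(entry)) + "\n")
```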

3. Training on AI Limitations and Hallucination Risks

Everyone using AI tools in intelligence work needs to understand:

  • What hallucinations are and why they occur

  • How to spot potential AI-generated falsehoods

  • Why corroboration and integrity are non-negotiable

  • When to trust AI assistance and when to be sceptical

This isn't technical training; it's tradecraft training adapted for technological tools.

4. Human-in-the-Loop Decision-Making for Consequential Outcomes

AI can support analysis, process data, and identify patterns. But decisions that significantly affect individuals or communities must involve human judgment at the final stage - that human needs to be empowered and required to question AI outputs.

"The AI said so" should never be a sufficient justification. "The AI suggested this, we evaluated it against these criteria, we corroborated it with these sources, and we assessed it as reliable because..." That's the standard which should be the baseline.

5. Public-Private Partnership Governance

When police forces engage with technology vendors, contracts need to include:

  • Clear protocols for evaluating AI-generated intelligence

  • Requirements for clear explanations and transparency

  • Liability frameworks for AI failures

  • Regular independent audits of AI system performance

  • Mechanisms for feedback and improvement

And critically, organisational culture needs to support saying "this isn't working" even after you've invested in the technology.

6. Red Team Reviews and Alternative Analysis

Integrate devil's-advocate processes into AI-assisted intelligence workflows. Have someone whose job is to poke holes in the assessment, challenge the AI outputs, and explore alternative explanations.

This is especially important for combating confirmation bias amplified by technological authority.
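One way to make that challenge step non-optional is to build it into the workflow itself. The sketch below, with invented field names, blocks sign-off until at least one alternative explanation has been documented and evidence against the lead hypothesis has actually been sought.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    statement: str
    supporting_evidence: list[str] = field(default_factory=list)
    contradicting_evidence: list[str] = field(default_factory=list)

@dataclass
class Assessment:
    lead_hypothesis: Hypothesis
    alternatives: list[Hypothesis] = field(default_factory=list)

    def ready_to_finalise(self) -> bool:
        """Block sign-off until a red team has recorded at least one alternative
        explanation and looked for evidence that contradicts the lead hypothesis."""
        return bool(self.alternatives) and bool(self.lead_hypothesis.contradicting_evidence)
```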

The Broader Context: OSINT in an AI Age

It could be argued that this case represents a collision between traditional intelligence tradecraft (developed over decades, proven in countless operations, built on hard-won lessons) and new technological capabilities (powerful, promising, but not yet fully understood or properly integrated).

However, law enforcement can't afford to keep learning these lessons over and over. The Maccabi case should act as a catalyst for change, not just another incident filed away.

OSINT tradecraft evolved for good reasons. Intelligence grading frameworks exist because people made mistakes and communities suffered the consequences. Documentation requirements emerged from cases where accountability was impossible without records. Human review processes were developed because automated systems (even pre-AI ones) proved unreliable for complex judgments.

AI doesn't make these lessons obsolete. If anything, it makes them more critical since the scale and speed of AI-assisted decisions mean mistakes can spread faster and affect more people.

Conclusion - A Call to Action

If you're working in OSINT, investigations, intelligence analysis, or security, whether in law enforcement, the private sector, or elsewhere, the Maccabi case offers clear takeaways:

  • Don't abandon tradecraft fundamentals in pursuit of technological efficiency.

  • Don't treat AI outputs as gospel; treat them as starting points requiring verification.

  • Don't skip documentation because you're busy; documentation is what makes accountability possible.

  • Don't let confirmation bias blind you to alternative explanations, especially when AI might be amplifying your preconceptions.

  • Do insist on intelligence grading for AI-assisted analysis.

  • Do maintain human-in-the-loop decision-making for consequential outcomes.

  • Do build organisational cultures where questioning AI outputs is encouraged, not resisted.

The promise of AI in OSINT work is real and substantial. But that promise requires discipline, scepticism, and adherence to practices that might feel tedious but exist for excellent reasons.

West Midlands Police learned this lesson the hard way. Hopefully, other organisations can avoid repeating that experience, and AI and OSINT workflow integration can start to deliver the benefits it promises.

Authored by: The Coalition of Cyber Investigators

Paul Wright (United Kingdom) & Neal Ysart (Philippines)

©2026 The Coalition of Cyber Investigators. All rights reserved.

The Coalition of Cyber Investigators is a collaboration between

Paul Wright (United Kingdom) - Experienced Cybercrime, Intelligence (OSINT & HUMINT) and Digital Forensics Investigator;

Neal Ysart (Philippines) - Elite Investigator & Strategic Risk Advisor, Ex-Big 4 Forensic Leader; and

Lajos Antal (Hungary) - Highly experienced expert in cyberforensics, investigations, and cybercrime.

The Coalition unites leading experts to deliver cutting-edge research, OSINT, Investigations, & Cybercrime Advisory Services worldwide.

Our co-founders, Paul Wright and Neal Ysart, offer over 80 years of combined professional experience. Their careers span law enforcement, cyber investigations, open source intelligence, risk management, and strategic risk advisory roles across multiple continents.

They have been instrumental in setting formative legal precedents and stated cases in cybercrime investigations, and in contributing to the development of globally accepted guidance and standards for handling digital evidence.

Their leadership and expertise form the foundation of the Coalition’s commitment to excellence and ethical practice.

Alongside them, Lajos Antal, a founding member of our Boiler Room Investment Fraud Practice, brings deep expertise in cybercrime investigations, digital forensics, and cyber response, further strengthening our team’s capabilities and reach.

The Coalition of Cyber Investigators, with decades of hands-on experience in cyber investigations and OSINT, is uniquely positioned to support organisations facing complex or high-risk investigations. Our team’s expertise is not just theoretical - it’s built on years of real-world investigations, a deep understanding of the dynamic nature of digital intelligence, and a commitment to the highest evidential standards.