Eurostar AI Chatbot Vulnerabilities Reveal Risks of Generative Customer Service

CyberSecureFox 🦊

Security weaknesses uncovered in Eurostar’s AI-powered customer service chatbot illustrate how deploying generative AI at speed, without mature cybersecurity controls and incident response processes, can create serious technical and reputational risk.

Eurostar, High-Speed Rail and Digital Customer Experience

Eurostar Group operates a network of high-speed trains connecting the United Kingdom with continental Europe via the Channel Tunnel, transporting around 19.5 million passengers annually between London, Paris, Brussels, Amsterdam and other major hubs. For rail operators at this scale, digital channels — websites, mobile apps and AI chatbots — are now critical components of the passenger journey and increasingly attractive targets for attackers.

How Researchers Discovered Security Flaws in the Eurostar AI Chatbot

According to security consultancy Pen Test Partners, the vulnerabilities were found almost by accident. A researcher attempting to buy a ticket via the Eurostar website noticed that the AI-based support chatbot appeared to have weak content restrictions, often referred to as guardrails, which are supposed to prevent the model from stepping outside predefined support scenarios or exposing internal data.

The core architectural problem was that the system only applied safety checks to the most recent message in a conversation. Earlier user messages in the same chat session were not revalidated. By editing the conversation history on the client side and taking advantage of how the final prompt was assembled for the model, the researchers were able to bypass the guardrails and cause the AI to ignore built-in safety instructions.
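
The pattern can be illustrated with a minimal, hypothetical Python sketch. All names here are invented for illustration, the blocklist is a naive stand-in for a real guardrail layer, and nothing below reflects Eurostar’s actual code:

```python
# Naive stand-in for a real guardrail/content filter.
BLOCKLIST = ("ignore previous instructions", "reveal your system prompt")

def violates_guardrails(text: str) -> bool:
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKLIST)

def build_prompt_flawed(system_prompt: str, history: list[dict]) -> str:
    # Flaw: only the newest message is checked...
    if violates_guardrails(history[-1]["content"]):
        raise ValueError("message rejected by guardrails")
    # ...while earlier turns, which the client can edit freely, are
    # concatenated into the final prompt without being revalidated.
    turns = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    return f"{system_prompt}\n{turns}"

def build_prompt_safer(system_prompt: str, history: list[dict]) -> str:
    # Safer: treat the entire history as untrusted input and
    # revalidate every turn on every request.
    for message in history:
        if violates_guardrails(message["content"]):
            raise ValueError("message rejected by guardrails")
    turns = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    return f"{system_prompt}\n{turns}"
```

Because every element of the history arrives from the client, checking only the final message leaves the rest of the assembled prompt under attacker control.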

Prompt Injection: Exposing Internal Instructions and Model Details

Once the filters had been bypassed, the team used a prompt injection attack — inserting crafted instructions into the conversation context that override the system’s own rules. The chatbot then revealed its internal guidance and confirmed which underlying model it was using, indicating that the service was powered by GPT‑4.

Prompt injection is now recognised by organisations such as ENISA and NIST as a mainstream threat category for large language model (LLM) systems. It requires not only model-level safeguards but also robust input validation, context isolation and careful prompt engineering to reduce the impact of malicious or unexpected instructions.
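
One common mitigation is to keep the canonical conversation history on the server and pass user content to the model as structured data with an explicit role, rather than trusting a client-supplied transcript. A minimal sketch, with invented names and heavy simplification (no persistence, authentication or real content filtering):

```python
import uuid

# Server-side store of conversations: the client may append one new
# message per request but can never rewrite earlier turns.
_CONVERSATIONS: dict[str, list[dict]] = {}

SYSTEM_PROMPT = {
    "role": "system",
    "content": "You are a travel support assistant. "
               "Answer only questions about bookings and journeys.",
}

def start_conversation() -> str:
    conversation_id = str(uuid.uuid4())
    _CONVERSATIONS[conversation_id] = []
    return conversation_id

def add_user_message(conversation_id: str, text: str) -> list[dict]:
    # Basic input validation before the message enters the history.
    if len(text) > 2000:
        raise ValueError("message too long")
    history = _CONVERSATIONS[conversation_id]
    # User content is added as data with an explicit role, never
    # spliced into the system instructions themselves.
    history.append({"role": "user", "content": text})
    # The model always sees the server-held history, so edited or
    # fabricated earlier turns from the client never reach the prompt.
    return [SYSTEM_PROMPT] + history
```

Role separation does not eliminate prompt injection on its own, but combined with server-held history it removes the specific bypass described above: the attacker can no longer rewrite the context the model receives.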

Technical Breakdown of the Eurostar Chatbot Vulnerabilities

HTML Injection: Phishing and Malicious Content in the Chat Interface

Further testing showed that the Eurostar chatbot was vulnerable to HTML injection. In practice, this meant an attacker could craft a message that, when rendered in the chat interface, displayed arbitrary HTML elements — for example, fake buttons, forms or phishing links that closely mimicked legitimate Eurostar functions.
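
The standard defence is output encoding: any user- or model-originated text must be escaped, or sanitised against a strict allow-list, before it is rendered in the chat interface. A minimal illustration in Python, assuming server-side rendering of chat messages:

```python
import html

def render_chat_message(raw_text: str) -> str:
    # Escape markup so attacker-supplied strings are displayed as
    # plain text instead of being interpreted as live HTML.
    return html.escape(raw_text)

# A payload such as the hypothetical link below is rendered as inert,
# visible text rather than a clickable phishing element.
print(render_chat_message('<a href="https://evil.example">Claim your refund</a>'))
```

Where rich formatting is genuinely needed, an allow-list sanitiser is preferable to escaping everything; interpolating raw chat content into the page is never safe.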

For passengers, this creates a realistic risk of stolen login credentials, compromised payment card details or redirection to malicious websites. For the operator, it introduces the risk of brand abuse, customer churn and potential regulatory penalties in the event of data compromise or large-scale fraud.

Weak Session and Message ID Validation: IDOR-Like Exposure

Researchers also identified weaknesses in how the system handled chat and message identifiers. The backend reportedly did not reliably verify that a session ID actually belonged to the authenticated user. In theory, this could allow an attacker to inject malicious content into another customer’s conversation or access information not intended for them.

This type of logic flaw is closely related to Insecure Direct Object References (IDOR), a category consistently listed by OWASP as one of the most dangerous business logic and access control issues. In a transport context — where conversations may include journey details, personal information and payment data — such vulnerabilities are particularly sensitive.
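
The corresponding server-side check is simple to express; what matters is that it runs on every read and write, not only at session creation. A hypothetical sketch, with names invented for illustration:

```python
class NotAuthorizedError(Exception):
    """Raised when a caller references a conversation they do not own."""

# Illustrative mapping of conversation IDs to owning user accounts;
# in a real system this lives in the backend datastore.
_CONVERSATION_OWNERS: dict[str, str] = {"conv-123": "user-42"}

def get_conversation(conversation_id: str, authenticated_user_id: str) -> str:
    owner = _CONVERSATION_OWNERS.get(conversation_id)
    # IDOR prevention: knowing an identifier is not proof of ownership.
    # Every access must be bound to the caller's authenticated identity
    # on the server side.
    if owner != authenticated_user_id:
        raise NotAuthorizedError("conversation does not belong to this user")
    return conversation_id  # stand-in for loading the actual messages
```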

Disclosure Dispute and Allegations of “Blackmail”

Pen Test Partners state that they first notified Eurostar of the issues on 11 June 2025, but did not receive a response. After a month of unanswered messages sent to official channels, the researchers located Eurostar’s security lead via LinkedIn and reached out directly.

By that point, according to Eurostar, vulnerability handling had been outsourced and the original report was allegedly not available. The researchers were asked to resubmit via a new web-based vulnerability disclosure form, which they say they had already used. This raised a legitimate concern about whether other reports may have been lost during the transition to the new process.

At one stage, Eurostar representatives allegedly characterised the researchers’ persistent follow-ups about the missing initial report as a form of “blackmail”. Pen Test Partners rejected this, emphasising that they were acting under the principles of coordinated vulnerability disclosure as set out in ISO/IEC 29147 and ISO/IEC 30111, which encourage structured, good-faith reporting and remediation.

Eurostar subsequently located the original email, acknowledged the findings and confirmed that it had remediated “some” of the vulnerabilities, after which Pen Test Partners were permitted to publish their analysis.

Key Cybersecurity Lessons for Organisations Deploying Generative AI

The Eurostar incident underscores that generative AI risk does not reside solely in the model itself. Significant exposure often stems from the surrounding application stack: input validation, dialogue orchestration, secure HTML rendering, session and object access control, and reliable logging and monitoring. A single weak layer can effectively neutralise expensive model-level safeguards.

For organisations in high-risk sectors — including transport, financial services and government — several practices are particularly important when deploying AI chatbots for customer service:

  • Conduct comprehensive security testing of AI chatbots, including penetration testing and red-teaming, before production rollout.
  • Implement multi-layer content filters and defences against prompt injection and HTML/script injection on both the client and the server side.
  • Enforce strict binding of chat, session and message identifiers to authenticated users, following secure access control and IDOR prevention patterns.
  • Operate a transparent, resilient vulnerability disclosure programme with clear SLAs for acknowledgement and remediation, even when processes are outsourced.
  • Engage constructively with the security research community rather than treating unsolicited reports as a threat.

As customer-facing AI systems become standard in travel and other industries, regular independent security assessments, iterative redesign of defensive architecture and mature handling of vulnerability reports are essential. Organisations that invest early in robust generative AI security will be better positioned to prevent incidents like the Eurostar chatbot case from turning into full-scale attacks on their customers and brand.
