Google has declined to issue a fix for ASCII smuggling in Gemini, a technique that hides machine-readable instructions using Unicode “tag” and formatting characters that are invisible to users but parsed by large language models (LLMs). Researchers warn that this gap between human-visible content and model interpretation enables prompt injection, undermines guardrails, and can facilitate covert data poisoning—risks that are amplified in Google Workspace integrations.
ASCII smuggling explained: Unicode Tags and invisible controls
ASCII smuggling exploits discrepancies in rendering vs. parsing. Attackers embed payloads using characters from the Unicode Tags block and other non-printing controls (for example, zero-width and bidirectional, or Bidi, characters in the Unicode “Cf” category). While these symbols do not display in UI, LLMs may still ingest and act on the hidden instructions. The technique is a form of prompt injection, aligning with OWASP Top 10 for LLM Applications (LLM01: Prompt Injection), and is related to earlier instruction-hiding methods that bypass UI or filtering layers.
FireTail testing: which LLMs are affected
According to testing by Victor Markopoulos (FireTail), ASCII smuggling impacted several popular systems: Gemini through Gmail and Calendar content, DeepSeek via prompts, and Grok via posts in X. By contrast, Claude, ChatGPT, and Microsoft Copilot demonstrated stronger resilience, attributed to more robust input sanitization that filters or normalizes risky Unicode sequences before model consumption.
Google Workspace as an attack surface for prompt injection
Hidden instructions in email and calendar invites
In Gmail and Calendar, attackers can embed invisible directives in subjects, descriptions, or email bodies. These instructions can nudge the assistant to trust spoofed organizer details, prioritize crafted meeting notes, or follow attacker-controlled links when generating replies—without any visible cues to the end user.
From social engineering to semi-automated data collection
When an LLM has mailbox or contacts access, a single email containing hidden commands can steer the model to search for sensitive content or draft forwards of contact lists. This blends traditional phishing with partial automation, raising the chance of escalation without explicit user action.
Demonstrated impact: manipulated recommendations
FireTail’s proof-of-concept showed an invisible instruction prompting Gemini to recommend a “best deal” smartphone site, effectively steering the user toward a potentially harmful destination. Such scenarios erode trust in assistant recommendations and create downstream risk in everyday workflows.
Vendor responses and security guidance
Markopoulos reported the issue to Google on September 18, 2025. Google classified it as social engineering rather than a product vulnerability, reflecting an industry debate over where LLM provider responsibility ends and customer-side threat modeling begins. Other vendors take a more conservative stance: Amazon has published practical guidance on Unicode risks, emphasizing input normalization and stripping of dangerous control characters; meanwhile, OWASP and Unicode TR36 (Security Considerations) have long documented the security pitfalls of non-printing and Bidi characters. The broader trend is toward secure-by-default LLM integrations.
Mitigations for developers and organizations
Input sanitization: Remove characters from the Unicode Tags block, zero-width and Bidi controls (Cf). Enforce allow-lists for permitted characters in email and calendar fields and drop disallowed code points by default.
Normalization and canonicalization: Normalize text (for example, NFC) and explicitly filter non-printing and formatting controls. Log and alert on detection of anomalous Unicode sequences to support incident response.
Context isolation: Segregate trusted vs. untrusted content for the LLM. Require explicit user consent for sensitive actions (reading/forwarding email, exporting contacts) and apply principle of least privilege to assistant scopes.
Expose the invisible: Add UI validators that highlight hidden characters in high-risk fields (email subject, calendar description). Combine with URL/domain reputation checks and safe browsing controls.
Monitoring and testing: Incorporate prompt-injection and Unicode-smuggling tests into CI/CD. Train SOC analysts to recognize LLM-related incidents and instrument telemetry to trace model actions back to inputs.
ASCII smuggling underscores a systemic gap between human interfaces and machine interpretation. While Google currently treats it as a social-engineering vector, enterprises should act now: normalize and filter Unicode by default, limit assistant permissions, isolate untrusted inputs, and verify outbound links. Making these practices standard in LLM and Workspace deployments will materially reduce the likelihood and impact of prompt injection attacks.