Hackers Frame “Bug Bounty” Prompts to Access Mexican Government Data

  • Hackers used fake ‘bug bounty’ prompts to bypass Claude AI safety guardrails.
  • Around 150GB of data, including 195M taxpayer records, was reportedly extracted.
  • Anthropic banned linked accounts and strengthened safeguards after the breach.

A month-long cyber campaign has exposed how generative AI tools can be turned into operational hacking assistants. Investigators say an attacker manipulated Claude AI, the chatbot developed by Anthropic, into generating exploit scripts and reconnaissance plans that were later used against Mexican government systems.

Israeli cybersecurity firm Gambit Security disclosed that the intruder extracted roughly 150 gigabytes of sensitive information. The haul allegedly included 195 million taxpayer records, voter registration files, civil registry documents, and government employee credentials. The breach has raised new scrutiny of how persistent, carefully framed prompts can bypass guardrails in advanced AI systems.

A Prompt Strategy Disguised as Security Research

According to Gambit Security, the campaign began in December 2025 and lasted about four weeks. The attacker repeatedly presented malicious requests to Claude AI as if they were part of a legitimate bug bounty program. Each prompt was written in Spanish and structured to resemble standard vulnerability testing procedures.

Investigators said Claude AI initially rejected several instructions. However, the hacker continued refining the prompts, gradually persuading the system to respond. Over time, the chatbot generated vulnerability scan scripts, exploit code targeting configuration weaknesses, and automation workflows for extracting data from compromised systems.

Reporting by Bloomberg noted that the chatbot flagged some requests as policy violations before eventually producing technical outputs. The attacker reportedly framed the activity as ethical hacking exercises, circumventing the model’s safeguards through repetition and context manipulation.

When Claude AI could not provide detailed guidance, evidence suggests the intruder consulted OpenAI’s ChatGPT for supplementary explanations about lateral network movement and internal architecture mapping.

Systems Allegedly Affected

The data reportedly originated from multiple public-sector entities. Among those named were Mexico’s Federal Tax Authority, known as Servicio de Administración Tributaria, and the National Electoral Institute, or Instituto Nacional Electoral. Several state government systems and public utilities were also cited in the findings.

The volume of exposed material has intensified concerns. Taxpayer files alone accounted for nearly 195 million records. Investigators also documented voter registration databases and internal access credentials tied to public employees.

However, some Mexican authorities have disputed claims of unauthorized entry. Jalisco state officials and representatives from the electoral institute said internal access logs showed no confirmed breaches. Other agencies declined to comment but emphasized ongoing reviews of cybersecurity protocols.


Company Response and Containment Measures

Anthropic confirmed that it examined the misuse of Claude AI after receiving the research findings. The company said it identified and suspended the accounts linked to the activity. It also stated that lessons from the incident informed updates to newer Claude AI models, including improved safety training and stronger abuse detection systems.

The episode comes amid broader warnings about AI-assisted cybercrime. Researchers at Amazon recently reported that a small group of hackers breached more than 600 firewall devices across dozens of countries using widely available AI tools.

Security analysts describe the Mexican incident as a case study in prompt engineering abuse. Instead of exploiting a software flaw, the attacker exploited context and persistence. By reframing commands as sanctioned security testing, the hacker transformed Claude AI into a tool for reconnaissance and code generation.

Investigations remain ongoing as authorities assess the full scope of compromised data. The case underscores how advanced chatbots, when manipulated, can accelerate intrusion planning and scale the technical output available to a single actor.

