The threat posed by malicious GPTs is accelerating.
Artificial Intelligence spent decades as a fascinating experiment, something out of the world of sci-fi.
Once it went mainstream, it didn’t take long for AI to become a workhorse, quietly powering everything from Netflix recommendations to Amazon deliveries, productivity applications to global logistics.
Now, it's a weapon in a high-speed arms race. As companies rush to harness generative AI to automate tasks and improve day-to-day efficiencies, cyber-criminals are doing the same to craft convincing scams and outpace traditional defences.
What we’re now witnessing is the dark inversion of that progress. Malicious GPTs like WormGPT, FraudGPT and GhostGPT have been developed explicitly to freely generate malware, phishing emails, and other social engineering attacks. But it isn’t just these purpose-built tools that are causing a new security conundrum for organisations. Just as concerning is what’s happening with the GPTs we trust.
The Manipulation of Legitimate Models
Despite efforts to block malicious prompts, mainstream generative AI models like ChatGPT, Claude and Gemini can readily be manipulated into producing harmful outputs.
In one case, our researchers successfully prompted ChatGPT to write a phishing email and a Python script designed to steal GitHub credentials. The model initially refused, but relented when the request was hidden within a fictional survival story involving plane crash victims.
Gemini, meanwhile, was exploited using a dual-persona approach. The attacker instructed it to respond both as its usual self and as “DarkGemini”, an unrestricted version that willingly generated malicious content, including phishing templates. Even Claude, known for its cautious design, was convinced to respond when the prompts were framed as part of an educational exercise.
These examples expose a consistent, replicable flaw in how today’s language models interpret intent, especially when malicious requests are disguised in creative or instructional contexts. Mainstream chatbots let attackers plan and launch attacks with the same tools many organisations use daily, no dark web required.
Attacks that once took hours of manual effort and deep technical expertise can now be launched at scale by anyone with internet access and the right prompt. With malicious GPTs being marketed openly and affordably, technical skill is no longer a barrier to entry.
New Threats Demand New Defences
The wider use of chatbots for malicious purposes means that attack content is now free of the usual errors and red flags, and tailored to mimic internal communication styles. As a result, scams often pass through traditional security controls and land directly in employees’ inboxes. That’s where the real danger begins.
Most conventional defences are too rigid to counter this. Static filters, keyword-based detection and perimeter-focused systems rely on identifying known threats, but malicious GPTs don’t produce known threats; they generate dynamic, context-aware content that adapts and evolves just like human attackers, only faster.
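To make that gap concrete, here is a deliberately simple sketch – a toy keyword filter in Python, with an invented phrase list and message, not any real product’s logic – showing why a fluent, AI-written lure sails straight past signature-style checks:

```python
# Toy static filter: flags a message only if it contains a known "scammy" phrase.
SUSPICIOUS_PHRASES = [
    "verify your account immediately",
    "you have won",
    "urgent wire transfer",
]

def static_filter(message: str) -> bool:
    """Return True if the message trips a known-bad phrase."""
    text = message.lower()
    return any(phrase in text for phrase in SUSPICIOUS_PHRASES)

# A polished, AI-generated lure avoids every known phrase and reads like
# routine internal mail, so the filter waves it through.
ai_generated_lure = (
    "Hi Priya, ahead of tomorrow's board review, could you re-share the "
    "supplier payment summary? Finance flagged a mismatch and I'd like to "
    "reconcile it before we meet. Thanks, Dan"
)

print(static_filter(ai_generated_lure))  # False: nothing known-bad to match
```

The filter isn’t broken; it is answering the wrong question. It asks “have I seen this before?” while the attacker can generate an endless supply of messages no one has seen before.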
As cyber-criminals re-imagine their attack playbooks with AI, organisations have to respond in kind – using good AI to fight the rising tide of malicious AI.
Behavioural AI has emerged as one of the most promising forms of AI-powered defence, precisely because it doesn’t rely on signatures or static rules. Instead, it builds a baseline of normal behaviour – a vital attribute at a time when attacks are becoming harder to detect.
With an understanding of normal behaviour – including employees’ typical communication patterns, and device and login characteristics – advanced AI can flag anomalous behaviour indicative of malicious activity, before damage is done. This kind of dynamic defence is essential when attacks are becoming increasingly realistic and nearly impossible to distinguish from authentic communication.
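As a rough illustration of this baseline-then-flag idea – a minimal sketch, not any vendor’s actual model – the following Python assumes a single numeric behavioural feature, such as a user’s typical login hour, and flags values that deviate sharply from that user’s own history:

```python
# Minimal behavioural-baseline sketch: learn per-user statistics from past
# activity, then flag events that deviate sharply from that baseline.
from statistics import mean, stdev

def build_baseline(history: list[float]) -> tuple[float, float]:
    """Summarise one behavioural feature (e.g. login hour) as mean and spread."""
    return mean(history), stdev(history)

def is_anomalous(value: float, baseline: tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Flag values more than `threshold` standard deviations from the norm."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical user who normally logs in around 9am
login_hours = [8.5, 9.0, 9.2, 8.8, 9.1, 9.0, 8.9, 9.3]
baseline = build_baseline(login_hours)

print(is_anomalous(9.1, baseline))  # False: a routine morning login
print(is_anomalous(3.0, baseline))  # True: a 3am login breaks the pattern
```

Real deployments model many signals at once – communication patterns, devices, recipients, tone – but the principle holds: the alarm is anchored to the user’s own history rather than a list of known-bad indicators, so it still fires even when the content itself looks flawless.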
In addition to leveraging AI-powered security technology, cybersecurity awareness must also evolve to ensure comprehensive defence. As threat actors use AI to craft perfectly written email attacks, it is no longer enough to train staff to spot spelling mistakes, grammatical errors or suspicious links. Employees must learn to question suspicious content even when it appears entirely legitimate.
They need to understand that malicious AI can generate messages that closely mirror their colleagues’ tone and writing style – possibly even sent from an authentic, compromised account – and that the next attack might look like an internal request, not an external threat. In the modern era of security awareness, employees should question and verify any email that requests sensitive information or creates a sense of urgency.
A Call to Leadership
The threat posed by malicious GPTs isn’t going anywhere. If anything, it’s accelerating. The answer to this isn’t panic, but a fundamental shift in how we approach security – including tapping into the power of AI to keep pace with attackers.
But no technology can compensate for a lack of strategy. Behavioural AI is only effective when paired with organisational readiness. That means setting clear policies and fostering a security culture that empowers people to ask questions and challenge suspicious activity.
Businesses must recognise that AI misuse isn’t just a technical problem for IT to solve. It’s a strategic challenge that affects every part of the organisation. The future of cybercrime is automated, scalable and highly convincing. Our defences must be the same.
Written by
Mike Britton
CIO
Abnormal AI
Mike Britton is the CIO of Abnormal AI, where he leads the information security team, privacy program, and corporate AI strategy. Mike previously spent three years as the CISO of Abnormal AI. Prior to Abnormal, Mike spent six years as the CSO and Chief Privacy Officer for Alliance Data and previously worked for IBM and VF Corporation. He brings 25 years of information security, privacy, compliance, and IT experience from multiple Fortune 500 global companies.