Cybercriminals are increasingly jailbreaking popular AI models to create phishing tools, malicious code, and hacking tutorials.
According to researchers at Cato Networks, as reported by The Record, jailbroken versions are circulating on dark web forums such as BreachForums under names like WormGPT and FraudGPT, and are marketed for up to $5,000.
Instead of exploiting software vulnerabilities, threat actors manipulate LLM behaviour using specially crafted system prompts that steer responses past safety guardrails.
Cato researchers emphasised that these misuses don't stem from flaws in the models themselves, but from how attackers reconfigure their context. The proliferation of open-source models makes it easy to host and distribute modified versions, which makes detection and takedown a challenge.
Written by
Dan Raywood is a B2B journalist with 25 years of experience, including covering cybersecurity for the past 17 years. He has extensively covered topics from Advanced Persistent Threats and nation-state hackers to major data breaches and regulatory changes.
He has spoken at events including 44CON, Infosecurity Europe, RANT Forum, BSides Scotland, Steelcon and the National Cyber Security Show, and served as editor of SC Media UK, Infosecurity Magazine and IT Security Guru. He was also an analyst with 451 Research and a product marketing lead at Tenable.