Dark AI: our research highlights capabilities sought by threat actors

Attackers seek to bypass restrictions built into public models. Fortunately, full automation of cyberattacks is not yet attainable.
June 23, 2026

Last year saw a surge in adversaries using artificial intelligence for cyberattacks. AI‑assisted targeted attacks rose by 93% in 2025, followed by a threefold increase in early 2026. The topic is becoming increasingly popular on underground platforms. Before 2025, the subject appeared only sporadically in isolated posts. Today, at least seven adversary forums feature threads dedicated to preparing and executing AI‑powered attacks.

Our threat intelligence and digital risk protection specialists analyzed over 7,400 AI‑related posts on underground resources. The findings of the Threat Zone 2026: Dark AI research and key outlooks were presented at this year's St. Petersburg International Economic Forum.

Jailbreaking public AI models dominates threat actor discussions

More than 77% of AI‑related underground posts focus on bypassing the ethical restrictions built into public AI models.

The subject drove a spike in discussions from December 2025 to January 2026. According to our estimates, it was triggered by the release of major public models: Grok 4.1, Gemini 3, and Claude Opus 4.5 last November, followed by DeepSeek‑V3.2 and GPT‑5.2 a month later. Adversaries actively discussed how to force these models to execute illicit requests, such as generating malicious code.

To evade ethical restrictions, attackers share step‑by‑step instructions and ready‑to‑use prompts. Minimal required expertise and quick results make this approach attractive to attackers with low technical skills. However, it has proved ineffective. Most often, code generated by “tricked” AI models contains errors and fails to run. Yet, in skilled hands, the resulting fragments of malicious code are sufficient to be combined into a working solution.

Uncensored models and their training are the second major focus

Uncensored large language models (LLMs) account for 22% of posts, with attackers interested in LLMs built for their specific objectives. About 30% of such solutions are modified public models with pre‑installed prompts for bypassing restrictions. The remaining 70% are open‑source models fine‑tuned to freely generate malicious content.

Such models are available either for free or via subscription (from $6 to $990 per month). The cost depends on the feature set and is comparable to that of public models.

Testing the most popular uncensored models has shown that none can currently produce a ready‑to‑use tool. The best of the generated scripts have a solid architecture, but their critical modules are broken or only partially functional, while the worst fail to run at all. These models can help experienced threat actors speed up routine development, but they cannot compensate for the lack of technical expertise among novice attackers.
Oleg Skulkin
Head of BI.ZONE Threat Intelligence

As a result, there is growing demand for uncensored models tailored to specific needs. Underground forums feature posts from developers looking for specialists to fine‑tune models on specific datasets that include malware source code, detailed exploit write‑ups, etc. The main goal is enhancing uncensored models’ ability to generate high‑quality responses for cyberattack preparation queries.

Full automation of cyberattacks remains out of reach

About 1% of the analyzed posts deal with using AI throughout the cyberattack lifecycle—from reconnaissance to social engineering methods. AI can accelerate initial target selection prior to manual reconnaissance, translate exploit descriptions into working code, and generate personalized phishing messages and deepfakes.

Underground forums feature ads for malware created with AI. While seemingly technically complex, such solutions often lack functional depth under the hood. Against this backdrop, human‑written code is emerging as a key differentiator, with more malware developers opting out of vibe coding.

The forums also offer advanced AI models for automating the entire attack lifecycle. Most of these are based on open‑source pentesting platforms. Together with specialized frameworks, such platforms can indeed automate the attack cycle, from reconnaissance to post‑exploitation, and generate a step‑by‑step action plan, although that plan is not always executable. If the operator lacks the knowledge to interpret the output and manually advance each subsequent phase, the attack chain is likely to break. This demonstrates that even the most advanced AI models cannot yet execute an attack autonomously, nor can they compensate for a lack of technical skills.

AI weaponization persists, with attackers relentlessly pursuing greater automation and a lower barrier to entry. To counter this, defenders have to adopt similar technologies to speed up detection and response. By automating routine tasks, AI solutions like BI.ZONE Cubi empower specialists to process vast data volumes and promptly identify suspicious activity.