With the rapid development of artificial intelligence (AI), a new underground market is emerging that exploits AI to support cybercrime. Researchers conducted a detailed investigation into how AI services are sold on the black market and how large language models (LLMs) are used to generate malware, phishing emails, and phishing websites. Here are the key findings on how AI is being abused in cyberspace.
Malicious AI services
Service providers primarily rely on two types of LLMs: uncensored LLMs (models that have not been fine-tuned to human ethical standards or that lack input/output filters), and publicly available models that have been jailbroken to bypass their built-in guardrails. These services are sold on hacker markets and forums at much lower prices than traditional malware, although services that use fine-tuned models to generate malicious output command higher prices. Notably, one service was found to have generated over $28,000 in revenue in two months.
Expanding market
Researchers identified 212 malicious services, 125 of which were hosted on the Poe AI platform, 73 on FlowGPT, and 14 on individual servers. They also discovered 11 LLMs being used, including Claude-2-100k, GPT-4, and Pygmalion-13B.
Evaluating output quality
Over 200 services were tested with more than 30 prompts to generate malware, phishing emails, or phishing websites. The results were evaluated based on several criteria (a sketch of how such checks might be implemented follows the list):
- Format: How often the output followed the expected format (defined by regular expressions).
- Compilability: How often the generated Python, C, or C++ code could be compiled.
- Validity: How often generated HTML and CSS ran successfully on Chrome and Firefox.
- Readability: How fluent and coherent phishing emails were, as measured by the Gunning Fog Index of reading difficulty.
- Evasiveness: How often the generated text both passed all previous checks and evaded detection by VirusTotal (for malware and phishing sites) or OOPSpam (for phishing emails).
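To make the criteria concrete, here is a minimal illustrative sketch, not the study's actual evaluation harness, of how two of the checks might be implemented: a regex-based format check and a Gunning Fog readability score, which is 0.4 × (average sentence length + percentage of words with three or more syllables). The regex pattern, the syllable-counting heuristic, and the example text are assumptions made for illustration.

```python
# Illustrative sketch of output-quality checks; not the researchers' code.
import re

def check_format(output: str, pattern: str = r"```(?:\w+)?\n.*?```") -> bool:
    """Format check: does the output contain a fenced code block?
    The expected-format regex is an assumption for this example."""
    return re.search(pattern, output, flags=re.DOTALL) is not None

def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def gunning_fog(text: str) -> float:
    """Gunning Fog Index: 0.4 * (words per sentence + % of complex words),
    where complex words are approximated as those with 3+ syllables."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))

if __name__ == "__main__":
    # Hypothetical phishing-style text used only to demonstrate the metrics.
    email = ("Your account has been suspended. "
             "Please verify your credentials immediately to restore access.")
    print(f"Contains code block: {check_format(email)}")
    print(f"Gunning Fog Index: {gunning_fog(email):.1f}")
```

A higher Gunning Fog score indicates harder-to-read text; real syllable counting and spam or malware detection (e.g., via VirusTotal or OOPSpam, as in the study) would require more robust tooling than this heuristic.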
In all three tasks, at least one service achieved an evasion rate above 67%, though the majority of services scored below 30%.
Real-world effectiveness testing
Additionally, researchers conducted real-world tests to assess the effectiveness of the generated code. They specifically targeted three vulnerabilities related to buffer overflow and SQL injection, but success rates were low.
In the case of VICIdial (a call-center system with known vulnerabilities), 22 of the generated programs compiled, but none succeeded in altering databases or leaking system data. Similarly, against OWASP WebGoat 7.1 (a deliberately insecure web application used for security training), seven of 39 generated programs executed successful attacks, but none exploited the specific vulnerabilities requested.
Significance
Previous studies showed that LLM-based services could generate misinformation and harmful output, but few had investigated their actual use in cybercrime. This study is groundbreaking in evaluating the quality and effectiveness of such services. Moreover, the researchers released the prompts used to bypass guardrails and generate malicious output, providing a resource for further research aimed at addressing these issues in future models.
Our view
It is reassuring that the malicious services did not perform well in real-world tests, and these findings should temper alarmist concerns about AI-enabled cybercrime. That does not mean, however, that we can afford to be any less vigilant about harmful applications of AI technology. The AI community has a responsibility to design safe, beneficial products and to evaluate them thoroughly for security.