Many-shot jailbreaking

We investigate a family of simple long-context attacks on large language models: prompting with hundreds of demonstrations of undesirable behavior. This is newly feasible with the larger context windows recently deployed by Anthropic, OpenAI and Google DeepMind. We find that in diverse, realistic circumstances, the effectiveness of this attack follows a power law, up to hundreds of shots. We demonstrate the success of this attack on the most widely used state-of-the-art closed-weight models, and across various tasks. Our results suggest very long contexts present a rich new attack surface for LLMs. — Read More

#adversarial

Jailbreaking Attack against Multimodal Large Language Model

This paper focuses on jailbreaking attacks against multi-modal large language models (MLLMs), seeking to elicit MLLMs to generate objectionable responses to harmful user queries. A maximum likelihood-based algorithm is proposed to find an image Jailbreaking Prompt (imgJP), enabling jailbreaks against MLLMs across multiple unseen prompts and images (i.e., data-universal property). Our approach exhibits strong model-transferability, as the generated imgJP can be transferred to jailbreak various models, including MiniGPT-v2, LLaVA, InstructBLIP, and mPLUG-Owl2, in a black-box manner. Moreover, we reveal a connection between MLLM-jailbreaks and LLM-jailbreaks. As a result, we introduce a construction-based method to harness our approach for LLM-jailbreaks, demonstrating greater efficiency than current state-of-the-art methods. The code is available here. Warning: some content generated by language models may be offensive to some readers. — Read More

#adversarial

Jon Stewart On The False Promises of AI

Read More

#humor, #videos

Washington state judge blocks use of AI-enhanced video as evidence in possible first-of-its-kind ruling

A Washington state judge overseeing a triple murder case barred the use of video enhanced by artificial intelligence as evidence in a ruling that experts said may be the first of its kind in a United States criminal court.

The ruling, signed Friday by King County Superior Court Judge Leroy McCullough and first reported by NBC News, described the technology as novel and said it relies on “opaque methods to represent what the AI model ‘thinks’ should be shown.” — Read More

#legal

200+ Artists Urge Tech Platforms: Stop Devaluing Music

STOP DEVALUING MUSIC. An open letter signed by over 200 musicians calls on AI developers, tech companies, platforms and digital music services to stop using AI to “infringe upon and devalue the rights of human artists.” — Read More

#audio, #vfx

We’re Focusing on the Wrong Kind of AI Apocalypse

Conversations about the future of AI are too apocalyptic. Or rather, they focus on the wrong kind of apocalypse.

There is considerable concern about the future of AI, especially as a number of prominent computer scientists have raised the risks of Artificial General Intelligence (AGI), an AI smarter than a human being. They worry that an AGI will lead to mass unemployment or that AI will grow beyond human control, or worse (the movies Terminator and 2001 come to mind).

Discussing these concerns seems important, as does thinking about the much more mundane and immediate threats of misinformation, deep fakes, and proliferation enabled by AI. But this focus on apocalyptic events also robs most of us of our agency. AI becomes a thing we either build or don’t build, and no one outside of a few dozen Silicon Valley executives and top government officials really has any say over. — Read More

#strategy

Introducing… Magic Security Dust™!

Read More

#humor, #videos

What’s next for AI agentic workflows featuring Andrew Ng

Read More

#videos

NYC’s AI Chatbot Tells Businesses to Break the Law

In October, New York City announced a plan to harness the power of artificial intelligence to improve the business of government. The announcement included a surprising centerpiece: an AI-powered chatbot that would provide New Yorkers with information on starting and operating a business in the city. 

The problem, however, is that the city’s chatbot is telling businesses to break the law.

Five months after launch, it’s clear that while the bot appears authoritative, the information it provides on housing policy, worker rights, and rules for entrepreneurs is often incomplete and in worst-case scenarios “dangerously inaccurate,” as one local housing policy expert told The Markup. — Read More

#accuracy

OpenAI built a voice cloning tool, but you can’t use it… yet

As deepfakes proliferate, OpenAI is refining the tech used to clone voices — but the company insists it’s doing so responsibly.

Today marks the preview debut of OpenAI’s Voice Engine, an expansion of the company’s existing text-to-speech API. Under development for about two years, Voice Engine allows users to upload any 15-second voice sample to generate a synthetic copy of that voice. But there’s no date for public availability yet, giving the company time to respond to how the model is used and abused.

“We want to make sure that everyone feels good about how it’s being deployed — that we understand the landscape of where this tech is dangerous and we have mitigations in place for that,” Jeff Harris, a member of the product staff at OpenAI, told TechCrunch in an interview. — Read More

#audio