AI Industry is Trying to Subvert the Definition of “Open Source AI”

The Open Source Initiative has published (news article here) its definition of “open source AI,” and it’s terrible. It allows for secret training data and mechanisms. It allows for development to be done in secret. Since for a neural network, the training data is the source code—it’s how the model gets programmed—the definition makes no sense.

And it’s confusing; most “open source” AI models—like LLAMA—are open source in name only. But the OSI seems to have been co-opted by industry players that want both corporate secrecy and the “open source” label. (Here’s one rebuttal to the definition.)

This is worth fighting for. We need a public AI option, and open source—real open source—is a necessary component of that. — Read More

#strategy

Illustrated LLM OS: An Implementational Perspective

This blog post explores the implementation of large language models (LLMs) as operating systems, inspired by Andrej Karpathy’s vision of AI resembling an OS, akin to Jarvis from Iron Man. The focus is on practical considerations, proposing an application-level integration for LLMs within a terminal session. A novel approach involves injecting state machines into the decoding process, enabling real-time code execution and interaction. Additionally, the post proposes “Reinforcement Learning by System Feedback” (RLSF), a reinforcement learning technique applied to code generation tasks. This method leverages a reward model that evaluates code correctness by executing the generated code in a Python subprocess, enhancing LLM performance. The findings contribute insights into the dynamic control of LLMs and their potential applications beyond coding tasks. — Read More
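As a rough illustration of the RLSF idea, here is a minimal sketch of a reward signal that scores generated code by actually running it in a Python subprocess and checking the result. The function name, signature, and +1/0/−1 scoring scheme are illustrative assumptions, not details taken from the post.

```python
import os
import subprocess
import sys
import tempfile


def system_feedback_reward(generated_code: str, test_input: str,
                           expected_output: str, timeout_s: float = 5.0) -> float:
    """Score generated code by executing it in an isolated Python subprocess.

    Returns +1.0 if the program exits cleanly and prints the expected output,
    0.0 if it runs but produces the wrong output, and -1.0 if it crashes or
    times out. The scalar can then drive a policy-gradient update on the LLM.
    """
    # Write the candidate program to a temporary file so it runs as a script.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(generated_code)
        path = f.name
    try:
        try:
            result = subprocess.run(
                [sys.executable, path],
                input=test_input,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return -1.0  # hang or infinite loop
        if result.returncode != 0:
            return -1.0  # runtime error; the traceback is the "system feedback"
        return 1.0 if result.stdout.strip() == expected_output.strip() else 0.0
    finally:
        os.unlink(path)


# Example: reward a candidate solution to a trivial "double the input" task.
candidate = "print(int(input()) * 2)"
print(system_feedback_reward(candidate, test_input="21", expected_output="42"))
```

In a full RLSF loop, this execution-based score would stand in for (or supplement) a learned reward model when fine-tuning the code-generating LLM.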

AIOS: LLM Agent Operating System
MemGPT: Towards LLMs as Operating Systems

#devops

Microsoft and a16z set aside differences, join hands in plea against AI regulation

Two of the biggest forces in two deeply intertwined tech ecosystems — large incumbents and startups — have taken a break from counting their money to jointly plead that the government desist from even pondering regulations that might affect their financial interests, or, as they prefer to call them, innovation.

“Our two companies might not agree on everything, but this is not about our differences,” writes this group of vastly disparate perspectives and interests: Founding a16z partners Marc Andreessen and Ben Horowitz, and Microsoft CEO Satya Nadella and President/Chief Legal Officer Brad Smith. A truly intersectional assemblage, representing both big business and big money.

But it’s the little guys they’re supposedly looking out for.  — Read More

#strategy

CONFIRMED: LLMs have indeed reached a point of diminishing returns

For years I have been warning that “scaling” — eking out improvements in AI by adding more data and more compute, without making fundamental architectural changes — would not continue forever. In my most notorious article, in March of 2022, I argued that “deep learning is hitting a wall”. Central to the argument was that pure scaling would not solve hallucinations or abstraction; I concluded that “there are serious holes in the scaling argument.”

And I got endless grief for it. Sam Altman implied (without saying my name, but riffing on the images in my then-recent article) I was a “mediocre deep learning skeptic”; Greg Brockman openly mocked the title. Yann LeCun wrote that deep learning wasn’t hitting a wall, and so on. Elon Musk himself made fun of me and the title earlier this year.

The thing is, in the long term, science isn’t majority rule. In the end, the truth generally outs. Alchemy had a good run, but it got replaced by chemistry. The truth is that scaling is running out, and that truth is, at last, coming out. — Read More

#strategy

From Naptime to Big Sleep: Using Large Language Models To Catch Vulnerabilities In Real-World Code

In our previous post, Project Naptime: Evaluating Offensive Security Capabilities of Large Language Models, we introduced our framework for large-language-model-assisted vulnerability research and demonstrated its potential by improving the state-of-the-art performance on Meta’s CyberSecEval2 benchmarks. Since then, Naptime has evolved into Big Sleep, a collaboration between Google Project Zero and Google DeepMind.

Today, we’re excited to share the first real-world vulnerability discovered by the Big Sleep agent: an exploitable stack buffer underflow in SQLite, a widely used open source database engine. We discovered the vulnerability and reported it to the developers in early October, and they fixed it the same day. Fortunately, we found this issue before it appeared in an official release, so SQLite users were not impacted.

We believe this is the first public example of an AI agent finding a previously unknown exploitable memory-safety issue in widely used real-world software. — Read More

#cyber