China has set up a new government body responsible for implementing a national standard for large language models (LLMs), the technology underpinning artificial intelligence (AI) chatbots like ChatGPT, as Beijing seeks to minimise potential disruption from the field while harnessing its power to help transform traditional industries.
The China Electronic Standardisation Institute, which operates under the Ministry of Industry and Information Technology (MIIT), is drawing up a domestic standard for LLMs to support the growing number of AI development initiatives under way across the mainland, the agency announced on Friday at the World Artificial Intelligence Conference (WAIC) in Shanghai. — Read More
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models’ problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: this https URL. — Read More
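To make the control flow concrete, here is a minimal sketch of a breadth-first ToT loop in Python. The `propose_thoughts` and `score_thought` functions are hypothetical stand-ins for the paper's LM prompts, not its actual implementation.

```python
# Minimal sketch of the ToT control loop (breadth-first variant).
# propose_thoughts and score_thought are placeholders for LM calls.

def propose_thoughts(state: str, k: int = 3) -> list[str]:
    # ToT would prompt the LM here for k candidate next "thoughts".
    return [f"{state} -> step{i}" for i in range(k)]

def score_thought(state: str) -> float:
    # ToT would ask the LM to self-evaluate the partial solution here.
    return -len(state)  # toy heuristic: prefer shorter partial solutions

def tree_of_thoughts(root: str, depth: int = 3, beam: int = 2) -> str:
    frontier = [root]
    for _ in range(depth):
        # Expand every surviving state, then prune to the best `beam` thoughts;
        # dropping a branch here is the "backtracking" the abstract describes.
        candidates = [t for s in frontier for t in propose_thoughts(s)]
        frontier = sorted(candidates, key=score_thought, reverse=True)[:beam]
    return frontier[0]

print(tree_of_thoughts("make 24 from [4, 9, 10, 13]"))
```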
AI-text detection tools are really easy to fool
Within weeks of ChatGPT’s launch, there were fears that students would use the chatbot to spin up passable essays in seconds. In response to those fears, startups rushed to release products that promise to spot whether text was written by a human or a machine.
The problem is that it’s relatively simple to trick these tools and avoid detection, according to new research that has not yet been peer reviewed. — Read More
Alibaba launches A.I. tool to generate images from text
Chinese technology giant Alibaba on Friday launched an artificial intelligence tool that can generate images from prompts.
Tongyi Wanxiang accepts prompts in Chinese and English and generates images in a variety of styles, such as sketches or 3D cartoons.
Alibaba’s cloud division, which launched the product, said it is available for enterprise customers in China for beta testing. — Read More
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Scaling sequence length has become a critical demand in the era of large language models. However, existing methods struggle with either computational complexity or model expressivity, restricting the maximum sequence length. In this work, we introduce LongNet, a Transformer variant that can scale sequence length to more than 1 billion tokens without sacrificing performance on shorter sequences. Specifically, we propose dilated attention, which expands the attentive field exponentially as the distance grows. LongNet has significant advantages: 1) it has linear computational complexity and a logarithmic dependency between tokens; 2) it can serve as a distributed trainer for extremely long sequences; 3) its dilated attention is a drop-in replacement for standard attention and can be seamlessly integrated with existing Transformer-based optimizations. Experimental results demonstrate that LongNet yields strong performance on both long-sequence modeling and general language tasks. Our work opens up new possibilities for modeling very long sequences, e.g., treating a whole corpus or even the entire Internet as a sequence. — Read More
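The dilated-attention pattern can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: block sizes and rate weighting are simplified, causal masking is omitted, and the real model offsets the sparse patterns across attention heads.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dilated_attention(q, k, v, segment=4, rates=(1, 2, 4)):
    # Each rate r carves the sequence into blocks of segment * r tokens and
    # attends only to every r-th token inside each block: the attentive field
    # widens exponentially with r while the per-rate cost stays linear in n.
    n, d = q.shape
    out = np.zeros_like(v)
    counts = np.zeros((n, 1))
    for r in rates:
        block = segment * r
        for start in range(0, n, block):
            idx = np.arange(start, min(start + block, n), r)
            scores = q[idx] @ k[idx].T / np.sqrt(d)
            out[idx] += softmax(scores) @ v[idx]
            counts[idx] += 1
    # Naive per-position averaging; LongNet instead weights each rate by its
    # softmax denominator and shifts the patterns across heads.
    return out / np.maximum(counts, 1)

rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(16, 8))
print(dilated_attention(q, k, v).shape)  # (16, 8)
```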
OpenAI is forming a new team to bring ‘superintelligent’ AI under control
OpenAI is forming a new team led by Ilya Sutskever, its chief scientist and one of the company’s co-founders, to develop ways to steer and control “superintelligent” AI systems.
In a blog post published today, Sutskever and Jan Leike, a lead on the alignment team at OpenAI, predict that AI with intelligence exceeding that of humans could arrive within the decade. This AI — assuming it does, indeed, arrive eventually — won’t necessarily be benevolent, necessitating research into ways to control and restrict it, Sutskever and Leike say. — Read More
OpenAI launches its GPT-4 API into general availability
OpenAI LP today made GPT-4, its newest and most capable language model, generally available through a cloud-based application programming interface.
… Alongside GPT-4, OpenAI is making three other AI models’ APIs generally available: GPT-3.5 Turbo, a predecessor to GPT-4 that offers more limited capabilities for a significantly lower cost, DALL-E for image generation, and Whisper for speech transcription. — Read More
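For reference, a GA call looked like the following with the openai Python package as it stood in mid-2023 (the pre-1.0 interface); it assumes GPT-4 access and an OPENAI_API_KEY in the environment.

```python
import openai  # 2023-era openai package (pre-1.0 interface)

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain dilated attention in one sentence."},
    ],
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```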
gpt-author
This project uses a chain of GPT-4 and Stable Diffusion API calls to generate an original fantasy novel. Users provide an initial prompt and specify how many chapters they’d like; the AI then generates an entire novel, outputting an EPUB file compatible with e-book readers.
A 15-chapter novel can cost as little as $4 to produce, and is written in just a few minutes. — Read More
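The chaining idea can be sketched as follows; this is a hedged approximation, not gpt-author's actual prompts or code. Each GPT-4 call carries the outline and a rolling summary forward, the cover-image step is omitted, and the calls use the 2023-era pre-1.0 openai interface.

```python
import openai

def ask(prompt: str) -> str:
    r = openai.ChatCompletion.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return r["choices"][0]["message"]["content"]

def write_novel(premise: str, n_chapters: int) -> list[str]:
    outline = ask(f"Outline a {n_chapters}-chapter fantasy novel about: {premise}")
    chapters, story_so_far = [], ""
    for i in range(1, n_chapters + 1):
        # Each chapter call sees the outline plus a rolling summary, keeping
        # the token count bounded as the novel grows.
        chapters.append(ask(
            f"Outline:\n{outline}\n\nStory so far:\n{story_so_far}\n\n"
            f"Write chapter {i} in full."
        ))
        story_so_far = ask("Summarize this chapter briefly:\n" + chapters[-1])
    return chapters  # gpt-author then packages the chapters into an EPUB
```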
Project S.A.T.U.R.D.A.Y — A Vocal Computing Toolbox
A toolbox for vocal computing built with Pion, whisper.cpp, and Coqui TTS. Build your own personal, self-hosted J.A.R.V.I.S powered by WebRTC
Project S.A.T.U.R.D.A.Y is a toolbox for vocal computing. It provides tools to build elegant vocal interfaces to modern LLMs. The goal of this project is to foster a community of like-minded individuals who want to bring forth the technology we have been promised in sci-fi movies for decades. It aims to be highly modular and flexible while staying decoupled from specific AI models, allowing for seamless upgrades when new AI technology is released. — Read More
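The project itself is built on Pion (Go) and WebRTC, but the decoupled speech-to-text, LLM, and text-to-speech loop it describes can be illustrated in Python with openai-whisper and Coqui TTS; the model names and the stubbed LLM reply below are assumptions, not the project's code.

```python
import whisper            # openai-whisper for speech-to-text
from TTS.api import TTS   # Coqui TTS for text-to-speech

stt = whisper.load_model("base")
tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC")

def handle_utterance(wav_path: str, out_path: str = "reply.wav") -> None:
    text = stt.transcribe(wav_path)["text"]  # 1) transcribe the user's audio
    reply = f"You said: {text}"              # 2) stand-in for an LLM call
    tts.tts_to_file(text=reply, file_path=out_path)  # 3) synthesize the answer

handle_utterance("input.wav")
```

Because the stages only exchange plain text, any one of them can be swapped for a newer model without touching the others, which is the modularity the project aims for.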
What is Langchain and why should I care as a developer?
Langchain is one of the fastest-growing open source projects in history, in large part due to the explosion of interest in LLMs.
This post explores some of the cool things that Langchain helps developers do, from a 30,000-foot overview. It was written for my own benefit as I explored the framework, and I hope it helps you if you are also curious where Langchain might be useful.
Some of the features that make Langchain so powerful include allowing you to connect data to language models (like OpenAI’s GPT models via the API) and create agent workflows (more on agents later). — Read More
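As a taste of the first feature, here is a minimal chain using Langchain's 2023-era interface (import paths moved in later releases); it assumes the langchain and openai packages plus an OPENAI_API_KEY.

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# A prompt template turns user input into a full prompt for the model.
prompt = PromptTemplate(
    input_variables=["product"],
    template="Suggest one name for a company that makes {product}.",
)

# LLMChain wires the template to the OpenAI completion API.
chain = LLMChain(llm=OpenAI(temperature=0.9), prompt=prompt)
print(chain.run("colorful socks"))
```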