Rick's Cafe AI 12:07 pm on August 26, 2023
Tags: China AI ( 141 ), DevOps ( 360 )

Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

We [Alibaba] introduce the Qwen-VL series, a set of large-scale vision-language models designed to perceive and understand both text and images. Comprising Qwen-VL and Qwen-VL-Chat, these models exhibit remarkable performance in tasks like image captioning, question answering, visual localization, and flexible interaction. The evaluation covers a wide range of tasks including zero-shot captioning, visual or document visual question answering, and grounding. We demonstrate that Qwen-VL outperforms existing Large Vision Language Models (LVLMs). We present their architecture, training, capabilities, and performance, highlighting their contributions to advancing multimodal artificial intelligence. Code, demo and models are available at https://github.com/QwenLM/Qwen-VL. — Read More

#china-ai, #devops

Rick's Cafe AI 1:20 pm on August 25, 2023
Tags: DevOps ( 360 )

AIColor: Colorize your old Photos with the power of AI

If you’re looking to colorize old black and white photos, our AI photo colorizer can help you bring your memories to life. — Read More

#devops

Rick's Cafe AI 1:16 pm on August 25, 2023
Tags: DevOps ( 360 )

Introducing Code Llama, a state-of-the-art large language model for coding

Today, we are releasing Code Llama, a large language model (LLM) that can use text prompts to generate code. Code Llama is state-of-the-art for publicly available LLMs on code tasks, and has the potential to make workflows faster and more efficient for current developers and lower the barrier to entry for people who are learning to code. Code Llama has the potential to be used as a productivity and educational tool to help programmers write more robust, well-documented software. — Read More

Read the paper

Access the code

#devops

Rick's Cafe AI 1:18 pm on August 24, 2023
Tags: DevOps ( 360 )

LLMStack

LLMStack is a no-code platform for building generative AI applications, chatbots, agents and connecting them to your data and business processes.

Build tailor-made generative AI applications, chatbots and agents that cater to your unique needs by chaining multiple LLMs. Seamlessly integrate your own data and GPT-powered models without any coding experience using LLMStack’s no-code builder. Trigger your AI chains from Slack or Discord. Deploy to the cloud or on-premise. — Read More

#devops

Rick's Cafe AI 9:38 am on August 24, 2023
Tags: Human ( 323 ), Nvidia ( 113 )

An analog-AI chip for energy-efficient speech recognition and transcription

Models of artificial intelligence (AI) that have billions of parameters can achieve high accuracy across a range of tasks^1,2, but they exacerbate the poor energy efficiency of conventional general-purpose processors, such as graphics processing units or central processing units. Analog in-memory computing (analog-AI)^3,4,5,6,7 can provide better energy efficiency by performing matrix–vector multiplications in parallel on ‘memory tiles’. However, analog-AI has yet to demonstrate software-equivalent (SW_eq) accuracy on models that require many such tiles and efficient communication of neural-network activations between the tiles. Here we present an analog-AI chip that combines 35 million phase-change memory devices across 34 tiles, massively parallel inter-tile communication and analog, low-power peripheral circuitry that can achieve up to 12.4 tera-operations per second per watt (TOPS/W) chip-sustained performance. We demonstrate fully end-to-end SW_eq accuracy for a small keyword-spotting network and near-SW_eq accuracy on the much larger MLPerf^8 recurrent neural-network transducer (RNNT), with more than 45 million weights mapped onto more than 140 million phase-change memory devices across five chips. — Read More
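
As a rough digital illustration of the tiling idea in the abstract, the sketch below splits one matrix-vector multiplication into independent blocks and sums the partial results. The function names and the tile shape are hypothetical, and ordinary Python arithmetic stands in for the analog memory tiles, so this shows only the decomposition, not how the chip computes.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product: one output value per matrix row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def tiled_matvec(matrix, vector, tile_rows, tile_cols):
    """Compute the same product block by block.

    Each (tile_rows x tile_cols) block plays the role of one 'tile':
    it sees only its slice of the input vector and produces a partial
    result, which is then accumulated into the full output.
    The tile shape is a hypothetical parameter chosen for illustration.
    """
    n_rows = len(matrix)
    n_cols = len(vector)
    result = [0] * n_rows
    for r0 in range(0, n_rows, tile_rows):
        for c0 in range(0, n_cols, tile_cols):
            # Cut out one block of weights and the matching input slice.
            tile = [row[c0:c0 + tile_cols] for row in matrix[r0:r0 + tile_rows]]
            partial = matvec(tile, vector[c0:c0 + tile_cols])
            # Accumulate this tile's partial sums into the output rows it covers.
            for offset, value in enumerate(partial):
                result[r0 + offset] += value
    return result


weights = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
inputs = [1, 0, -1, 2]
print(tiled_matvec(weights, inputs, tile_rows=2, tile_cols=2))  # [6, 14, 22]
```

Because every block's partial product is independent of the others, the blocks could in principle be evaluated at the same time, which is the property the abstract attributes to memory tiles.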

#nvidia, #human

Rick's Cafe AI 9:27 am on August 24, 2023
Tags: Human ( 323 )

A high-performance speech neuroprosthesis

Speech brain–computer interfaces (BCIs) have the potential to restore rapid communication to people with paralysis by decoding neural activity evoked by attempted speech into text^1,2 or sound^3,4. Early demonstrations, although promising, have not yet achieved accuracies sufficiently high for communication of unconstrained sentences from a large vocabulary^1,2,3,4,5,6,7. Here we demonstrate a speech-to-text BCI that records spiking activity from intracortical microelectrode arrays. Enabled by these high-resolution recordings, our study participant—who can no longer speak intelligibly owing to amyotrophic lateral sclerosis—achieved a 9.1% word error rate on a 50-word vocabulary (2.7 times fewer errors than the previous state-of-the-art speech BCI^2) and a 23.8% word error rate on a 125,000-word vocabulary (the first successful demonstration, to our knowledge, of large-vocabulary decoding). Our participant’s attempted speech was decoded at 62 words per minute, which is 3.4 times as fast as the previous record^8 and begins to approach the speed of natural conversation (160 words per minute^9). Finally, we highlight two aspects of the neural code for speech that are encouraging for speech BCIs: spatially intermixed tuning to speech articulators that makes accurate decoding possible from only a small region of cortex, and a detailed articulatory representation of phonemes that persists years after paralysis. These results show a feasible path forward for restoring rapid communication to people with paralysis who can no longer speak. — Read More
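
For readers unfamiliar with the metric, word error rate is commonly computed as the word-level edit distance between the decoded sentence and the reference, divided by the number of reference words. A minimal sketch under that standard definition follows; the function name and the example sentences are illustrative and not taken from the study.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word and one dropped word out of five reference words.
print(word_error_rate("i want some water please", "i want sun water"))  # 0.4
```

Under this definition a 9.1% word error rate corresponds to roughly one word-level error per eleven reference words.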

#human

Rick's Cafe AI 9:00 am on August 24, 2023
Tags: China AI ( 141 ), Legal ( 51 )

Analyzing an Expert Proposal for China’s Artificial Intelligence Law

A few months after the introduction of OpenAI’s ChatGPT captured imaginations around the world, China’s State Council quietly announced that it would work toward drafting an Artificial Intelligence Law. The government had already acted relatively quickly: it drafted, significantly revised, and, on August 15, implemented rules on generative AI that build on existing laws. Still, broader questions about AI’s role in society remain, and the May announcement signaled that more holistic legislative thinking was on the horizon.

… In the case of this scholars’ draft of an AI Law, the accompanying explanation notes that it is to serve as a reference for legislative work and is expected to be revised in a 2.0 version. Although the connection between this text and any eventual Chinese AI Law is uncertain, its publication from a team led by Zhou Hui, deputy director of the CASS Cyber and Information Law Research Office and chair of a research project on AI ethics and regulation, makes it an early indication of how some influential policy thinkers are approaching the State Council-announced AI Law effort.

We invited DigiChina community members to share their analysis of the scholars’ draft, a translation of which was led by Concordia AI and is published here. Their responses are below. — Read More

#china-ai, #legal

Rick's Cafe AI 8:55 am on August 24, 2023
Tags: Image Recognition ( 313 ), Training ( 74 )

FlexiViT: One Model for All Patch Sizes

Vision Transformers convert images to sequences by slicing them into patches. The size of these patches controls a speed/accuracy tradeoff, with smaller patches leading to higher accuracy at greater computational cost, but changing the patch size typically requires retraining the model. In this paper, we demonstrate that simply randomizing the patch size at training time leads to a single set of weights that performs well across a wide range of patch sizes, making it possible to tailor the model to different compute budgets at deployment time. We extensively evaluate the resulting model, which we call FlexiViT, on a wide range of tasks, including classification, image-text retrieval, open-world detection, panoptic segmentation, and semantic segmentation, concluding that it usually matches, and sometimes outperforms, standard ViT models trained at a single patch size in an otherwise identical setup. Hence, FlexiViT training is a simple drop-in improvement for ViT that makes it easy to add compute-adaptive capabilities to most models relying on a ViT backbone architecture. Code and pre-trained models are available at this https URL — Read More
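
To make the speed/accuracy tradeoff concrete, here is a small sketch of the two ideas in the abstract: slicing an image into patches, and drawing a random patch size at each training step. The image side, the list of patch sizes, and all function names are assumptions chosen for illustration, and the sketch leaves out how a model's weights would adapt to each patch size, which the abstract does not describe.

```python
import random

IMAGE_SIDE = 240  # assumed image side length in pixels
# Assumed candidate patch sizes; each one divides IMAGE_SIDE evenly.
PATCH_SIZES = [8, 10, 12, 15, 16, 20, 24, 30, 40, 48]


def num_tokens(image_side, patch_size):
    """Sequence length produced by slicing a square image into square patches."""
    if image_side % patch_size != 0:
        raise ValueError("patch size must divide the image side evenly")
    per_side = image_side // patch_size
    return per_side * per_side


def patchify(image, patch_size):
    """Split a square 2D image (list of rows) into flattened patches, row-major."""
    side = len(image)
    if side % patch_size != 0:
        raise ValueError("patch size must divide the image side evenly")
    patches = []
    for top in range(0, side, patch_size):
        for left in range(0, side, patch_size):
            patches.append([
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ])
    return patches


def sample_patch_size(rng):
    """Pick a patch size uniformly at random, as one would per training step."""
    return rng.choice(PATCH_SIZES)


rng = random.Random(0)
for step in range(3):
    p = sample_patch_size(rng)
    print(f"step {step}: patch size {p}, sequence length {num_tokens(IMAGE_SIDE, p)}")
```

With these assumed numbers, the smallest patch size (8) yields 900 tokens per image and the largest (48) yields 25, which is the range of compute budgets a single set of weights would have to cover.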

#image-recognition, #training

Rick's Cafe AI 4:13 pm on August 23, 2023
Tags: NLP ( 486 )

This paper convinced me LLMs are not just “applied statistics”, but learn world models and structure

The paper: https://thegradient.pub/othello/

You can look at an LLM trained on Othello moves, and extract from its internal state the current state of the board after each move you tell it. In other words, an LLM trained on only moves, like “E3, D3, …”, contains within it a model of an 8×8 board grid and the current state of each square. — Read More

#nlp

Rick's Cafe AI 3:58 pm on August 23, 2023
Tags: Videos ( 379 )

Exploring Artificial Intelligence’s Potential & Threats | Andrew Ng | Eye on AI #131

Read More

#videos

Rick's Cafe AI

The latest in Artificial Intelligence carefully curated into its own special blend
