Those attending outdoor parties or barbecues in New York City this weekend may notice an uninvited guest looming over their festivities: a police surveillance drone.
The New York City police department plans to pilot the unmanned aircraft in response to complaints about large gatherings, including private events, over Labor Day weekend, officials announced Thursday. — Read More
How to Fine-Tune Llama2 for Python Coding on Consumer Hardware
Our previous article covered Llama 2 in detail, presenting the family of large language models (LLMs) that Meta recently introduced and made available to the community for research and commercial use. Some variants are already designed for specific tasks; for example, Llama2-Chat for chat applications. Still, we might want an LLM tailored even more closely to our application.
The technique that makes this possible is transfer learning. This approach involves leveraging the vast knowledge already in models like Llama2 and transferring that understanding to a new domain. Fine-tuning is a specific form of transfer learning in which the weights of the entire model, including the pre-trained layers, are typically allowed to adjust to the new data. As a result, the knowledge gained during pre-training is refined based on the specifics of the new task.
In this article, we outline a systematic approach to enhance Llama2’s proficiency in Python coding tasks by fine-tuning it on a custom dataset. — Read More
Autonomous visual information seeking with large language models
There has been great progress towards adapting large language models (LLMs) to accommodate multimodal inputs for tasks including image captioning, visual question answering (VQA), and open vocabulary recognition. Despite such achievements, current state-of-the-art visual language models (VLMs) perform inadequately on visual information seeking datasets, such as Infoseek and OK-VQA, where external knowledge is required to answer the questions. — Read More
AI-powered drone beats human champion pilots
Having trounced humans at everything from chess and Go, to StarCraft and Gran Turismo, artificial intelligence (AI) has raised its game and defeated world champions at a real-world sport.
The latest mortals to feel the sting of AI-induced defeat are three expert drone racers who were beaten by an algorithm that learned to fly a drone around a 3D race course at breakneck speeds without crashing. Or at least not crashing too often. — Read More
Baidu and SenseTime launch ChatGPT-style AI bots to the public
Chinese tech firms Baidu and SenseTime launched their ChatGPT-style AI bots to the public on Thursday, marking a new milestone in the global AI race.
Baidu has opened public access to its ERNIE Bot, allowing users to conduct AI-powered searches or carry out an array of tasks, from creating videos to providing summaries of complex documents. — Read More
Redub Me — Speak to the world!
Dub your content into 70+ languages at a click of a button, and reach millions of new fans. — Read More
Machine Learning Libraries For Any Project
There are many libraries out there that can be used in machine learning projects. Of course, some of them gained considerable reputations through the years. Such libraries are the straight-away picks for anyone starting a new project which utilizes machine learning algorithms. However, choosing the correct set (or stack) may be quite challenging.
In this post, I would like to give you a general overview of the machine learning libraries landscape and share some of my thoughts about working with them. — Read More
VALL-E-X: Multilingual Text-to-Speech Synthesis and Voice Cloning
An open source implementation of Microsoft’s VALL-E X zero-shot TTS model.
VALL-E X is an amazing multilingual text-to-speech (TTS) model proposed by Microsoft. While Microsoft initially published the model in a research paper, they did not release any code or pretrained models. Recognizing the potential and value of this technology, our team took on the challenge to reproduce the results and train our own model. We are glad to share our trained VALL-E X model with the community, allowing everyone to experience the power of next-generation TTS! — Read More
Grubhub is bringing Amazon’s cashierless tech to colleges this fall
Grubhub’s bringing Amazon’s cashierless Just Walk Out technology to some colleges, the company announced today. The food delivery service will first focus on rolling out the tech to colleges, starting with Loyola University Maryland next week before expanding nationwide.
The tech can identify items taken from and returned to shelves, so students and staff can buy food from on-campus stores without waiting in line. After customers scan a QR code in the Grubhub app, the company automatically charges their Grubhub-linked meal plans or other stored payment methods once they leave the store. — Read More
Watch out, Midjourney! Ideogram launches AI image generator with impressive typography
Earlier this week, a new generative AI image startup called Ideogram, founded by former Google Brain researchers, launched with $16.5 million in seed funding led by a16z and Index Ventures.
Another image generator? Don’t we have enough to choose from between Midjourney, OpenAI’s DALL-E 2, and Stability AI’s Stable Diffusion? Well, Ideogram has a major selling point, as it may have finally solved a problem plaguing most other popular AI image generators to date: reliable text generation within the image, such as lettering on signs and in company logos. — Read More