Apple today announced new Apple Intelligence features that elevate the user experience across iPhone, iPad, Mac, Apple Watch, and Apple Vision Pro. Apple Intelligence unlocks new ways for users to communicate with features like Live Translation; do more with what’s on their screen with updates to visual intelligence; and express themselves with enhancements to Image Playground and Genmoji. Additionally, Shortcuts can now tap into Apple Intelligence directly, and developers will be able to access the on-device large language model at the core of Apple Intelligence, giving them direct access to intelligence that is powerful, fast, built with privacy, and available even when users are offline. These Apple Intelligence features are available for testing starting today, and will be available to users with supported devices set to a supported language this fall. — Read More
Recent Updates
IBM now describing its first error-resistant quantum compute system
On Tuesday, IBM released its plans for building a system that should push quantum computing into entirely new territory: a system that can perform useful calculations while catching and fixing errors, and that would be utterly impossible to model using classical computing methods. The hardware, which will be called Starling, is expected to be able to perform 100 million operations without error on a collection of 200 logical qubits. And the company expects to have it available for use in 2029.
Perhaps just as significant, IBM is also committing to a detailed description of the intermediate steps to Starling. These include a number of processors that will be configured to host a collection of error-corrected qubits, essentially forming a functional compute unit. This marks a major transition for the company, as it involves moving away from talking about collections of individual hardware qubits and focusing instead on units of functional computational hardware. If all goes well, it should be possible to build Starling by chaining a sufficient number of these compute units together.
“We’re updating [our roadmap] now with a series of deliverables that are very precise,” IBM VP Jay Gambetta told Ars, “because we feel that we’ve now answered basically all the science questions associated with error correction and it’s becoming more of a path towards an engineering problem.” — Read More
Duolingo’s CEO outlined his plan to become an ‘AI-first’ company. He didn’t expect the human backlash that followed
On April 28, Duolingo cofounder and CEO Luis von Ahn posted an email on LinkedIn that he had just sent to all employees at his company. In it, he outlined his vision for the language-learning app to become an “AI-first” organization, including phasing out contractors if AI could do their work, and giving a team the ability to hire a new person only if they were not able to automate their work through AI.
The response was swift and scathing. “This is a disaster. I will cancel my subscription,” wrote one commenter. “AI first means people last,” wrote another. And a third summed up the general feeling of critics when they wrote: “I can’t support a company that replaces humans with AI.” — Read More
How You Can Use Few-Shot Learning In LLM Prompting To Improve Its Performance
You must’ve noticed that large language models can sometimes generate information that seems plausible but isn’t factually accurate. Providing more explicit instructions and context is one of the key ways to reduce such LLM hallucinations.
That said, have you ever struggled to get an AI model to understand precisely what you want to achieve? Perhaps you’ve provided detailed instructions only to receive outputs that fall short of the mark?
This is where few-shot prompting comes in: a technique that guides LLMs toward accurate, relevant, and properly formatted responses by teaching the model through examples rather than through complex explanations. Excited?! Let’s begin! — Read More
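The idea can be sketched in a few lines of Python. The snippet below builds a chat-style few-shot prompt for a simple sentiment-labeling task; the task, the example pairs, and the message layout are illustrative assumptions (not tied to any particular provider), but the structure — instructions first, then worked input/output pairs, then the real query — is the core of the technique.

```python
# A minimal sketch of few-shot prompting: we prepend worked examples
# to the prompt so the model can infer the desired format and style
# instead of relying on lengthy instructions alone.

FEW_SHOT_EXAMPLES = [
    ("The battery died after two hours.", "negative"),
    ("Setup took thirty seconds and it just works.", "positive"),
    ("It arrived on Tuesday.", "neutral"),
]

def build_few_shot_messages(query: str) -> list[dict]:
    """Assemble a chat-style message list: task instructions, then
    example input/output pairs, then the real query at the end."""
    messages = [{
        "role": "system",
        "content": (
            "Label the sentiment of each review as positive, "
            "negative, or neutral. Reply with exactly one word."
        ),
    }]
    # Each example becomes a synthetic user turn plus the ideal
    # assistant reply, showing the model the expected output format.
    for review, label in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": review})
        messages.append({"role": "assistant", "content": label})
    # The real query goes last, so the model continues the pattern.
    messages.append({"role": "user", "content": query})
    return messages

msgs = build_few_shot_messages("The screen cracked on day one.")
# 1 system turn + 3 example pairs (6 turns) + 1 real query = 8 messages
print(len(msgs))  # prints 8
```

The resulting list can be passed directly to most chat-completion APIs. The key design choice is that the examples demonstrate the answer format (a single word) more reliably than any amount of prose instruction.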
How to Use Banned US Models in China
In China, U.S.-based large language models like ChatGPT, Claude, or Gemini are technically banned, blocked, or buried under layers of censorship. The Chinese government has only explicitly banned ChatGPT, citing concerns over political content; other U.S. models like Claude and Gemini are not formally banned but remain inaccessible due to the Great Firewall. U.S. LLM providers also restrict access from China but leave some loopholes: OpenAI blocks API use, but Azure continues to serve enterprise clients via offshore data centers; Anthropic blocks access to Claude within China but permits use by Chinese subsidiaries based in supported regions abroad; and Google does not offer the Gemini API in China, though access still seems possible via third parties like Cloudflare (we reached out to Google for comment but didn’t hear back).
But on Taobao, the country’s largest e-commerce platform, consumers and companies can buy access to these models with just a few clicks. This piece explains how Western models are priced, advertised, bought, and sold in China, and what their popularity reveals about state censorship, platform enforcement, and consumer demand. — Read More
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes before providing answers. While these models demonstrate improved performance on reasoning benchmarks, their fundamental capabilities, scaling properties, and limitations remain insufficiently understood. Current evaluations primarily focus on established mathematical and coding benchmarks, emphasizing final answer accuracy. However, this evaluation paradigm often suffers from data contamination and does not provide insights into the reasoning traces’ structure and quality. In this work, we systematically investigate these gaps with the help of controllable puzzle environments that allow precise manipulation of compositional complexity while maintaining consistent logical structures. This setup enables the analysis of not only final answers but also the internal reasoning traces, offering insights into how LRMs “think”. Through extensive experimentation across diverse puzzles, we show that frontier LRMs face a complete accuracy collapse beyond certain complexities. Moreover, they exhibit a counterintuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget. By comparing LRMs with their standard LLM counterparts under equivalent inference compute, we identify three performance regimes: (1) low complexity tasks where standard models surprisingly outperform LRMs, (2) medium-complexity tasks where additional thinking in LRMs demonstrates advantage, and (3) high-complexity tasks where both models experience complete collapse. We found that LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles. 
We also investigate the reasoning traces in more depth, studying the patterns of explored solutions and analyzing the models’ computational behavior, shedding light on their strengths, limitations, and ultimately raising crucial questions about their true reasoning capabilities. — Read More
Meta and Yandex are de-anonymizing Android users’ web browsing identifiers
Tracking code that Meta and Russia-based Yandex embed into millions of websites is de-anonymizing visitors by abusing legitimate Internet protocols, causing Chrome and other browsers to surreptitiously send unique identifiers to native apps installed on a device, researchers have discovered. Google says it’s investigating the abuse, which allows Meta and Yandex to convert ephemeral web identifiers into persistent mobile app user identities.
The covert tracking—implemented in the Meta Pixel and Yandex Metrica trackers—allows Meta and Yandex to bypass core security and privacy protections provided by both the Android operating system and browsers that run on it. Android sandboxing, for instance, isolates processes to prevent them from interacting with the OS and any other app installed on the device, cutting off access to sensitive data or privileged system resources. Defenses such as state partitioning and storage partitioning, which are built into all major browsers, store site cookies and other data associated with a website in containers that are unique to every top-level website domain to ensure they’re off-limits for every other site. — Read More
The Shape of Things to Come
Amazon ‘testing humanoid robots to deliver packages’: Amazon is reportedly developing software for humanoid robots that could perform the role of delivery workers and “spring out” of its vans.
… The Information reported that the robots could eventually take the jobs of delivery workers. Amazon is developing the artificial intelligence software that would power the robots but will use hardware developed by other companies. — Read More
Walmart and Wing expand drone delivery to five more US cities: Wing, the on-demand drone delivery company owned by Alphabet, is spreading its commercial wings with help from Walmart.
The two companies announced Thursday plans to roll out drone delivery to more than 100 Walmart stores in five new cities: Atlanta, Charlotte, Houston, Orlando, and Tampa. Walmart is also adding Wing drone deliveries to its existing market in the Dallas-Fort Worth area. — Read More
AGI Is Not Multimodal
The recent successes of generative AI models have convinced some that AGI is imminent. While these models appear to capture the essence of human intelligence, they defy even our most basic intuitions about it. They have emerged not because they are thoughtful solutions to the problem of intelligence, but because they scaled effectively on hardware we already had. Seduced by the fruits of scale, some have come to believe that it provides a clear pathway to AGI. The most emblematic case of this is the multimodal approach, in which massive modular networks are optimized for an array of modalities that, taken together, appear general. However, I argue that this strategy is sure to fail in the near term; it will not lead to human-level AGI that can, e.g., perform sensorimotor reasoning, motion planning, and social coordination. Instead of trying to glue modalities together into a patchwork AGI, we should pursue approaches to intelligence that treat embodiment and interaction with the environment as primary, and see modality-centered processing as emergent phenomena. — Read More
Duolingo said it just doubled its language courses thanks to AI
Duolingo is “more than doubling” the number of courses it has available, a feat it says was only possible because it used generative AI to help create them in “less than a year.”
The company said today that it’s launching 148 new language courses. …Duolingo says that building one new course historically has taken “years,” but the company was able to build this new suite of courses more quickly “through advances in generative AI, shared content systems, and internal tooling.” The new approach is internally called “shared content,” and the company says it allows employees to make a base course and quickly customize it for “dozens” of different languages. — Read More