How a top Chinese AI model overcame US sanctions

The AI community is abuzz over DeepSeek R1, a new open-source reasoning model. 

The model was developed by the Chinese AI startup DeepSeek, which claims that R1 matches or even surpasses OpenAI’s ChatGPT o1 on multiple key benchmarks but operates at a fraction of the cost. 

… DeepSeek’s success is even more remarkable given the constraints facing Chinese AI companies in the form of increasing US export controls on cutting-edge chips. But early evidence shows that these measures are not working as intended. Rather than weakening China’s AI capabilities, the sanctions appear to be driving startups like DeepSeek to innovate in ways that prioritize efficiency, resource-pooling, and collaboration. — Read More

#china-vs-us

DeepSeek R1’s bold bet on reinforcement learning: How it outpaced OpenAI at 3% of the cost

DeepSeek R1’s Monday release has sent shockwaves through the AI community, disrupting assumptions about what’s required to achieve cutting-edge AI performance. Matching OpenAI’s o1 at just 3%-5% of the cost, this open-source model has not only captivated developers but also challenged enterprises to rethink their AI strategies.

The model has rocketed to become the top-trending model on HuggingFace (109,000 downloads as of this writing) as developers rush to try it out and understand what it means for their AI development. Users are commenting that DeepSeek’s accompanying search feature (available on DeepSeek’s site) is now superior to competitors like OpenAI and Perplexity, and is rivaled only by Google’s Gemini Deep Research.

The implications for enterprise AI strategies are profound: With reduced costs and open access, enterprises now have an alternative to costly proprietary models like OpenAI’s. DeepSeek’s release could democratize access to cutting-edge AI capabilities, enabling smaller organizations to compete effectively in the AI arms race. — Read More

#reinforcement-learning

How China’s New AI Model DeepSeek Is Threatening U.S. Dominance

Read More

#videos

OpenAI launches Operator, an AI agent that performs tasks autonomously

OpenAI CEO Sam Altman kicked off this year by saying in a blog post that 2025 would be big for AI agents, tools that can automate tasks and take actions on your behalf.

… OpenAI announced on Thursday that it is launching a research preview of Operator, a general-purpose AI agent that can take control of a web browser and independently perform certain actions. Operator is coming to U.S. users on ChatGPT’s $200 Pro subscription plan first. OpenAI says it plans to roll this feature out to more users in its Plus, Team, and Enterprise tiers eventually. — Read More

Google DeepMind CEO Demis Hassabis: The Path To AGI, Deceptive AIs, Building a Virtual Cell

Read More

#videos

OpenAI announces ‘The Stargate Project’: $500bn over four years on AI infrastructure

OpenAI has announced ‘The Stargate Project,’ a new company set to invest $500 billion in AI infrastructure over the next four years.

The data centers will be exclusively used by OpenAI as it expands its generative AI compute portfolio. Of the total investment, $100bn will be deployed ‘immediately.’

SoftBank, OpenAI, Oracle, and Abu Dhabi’s MGX are the equity investors in Stargate, with SoftBank having financial responsibility and OpenAI having operational responsibility. SoftBank’s Masayoshi Son will serve as chairman.

The buildout is already underway in Texas, likely at Oracle’s project in Abilene, which is itself leased from Crusoe. — Read More

#investing

BrowserAI

Run LLMs in the Browser – Simple, Fast, and Open Source!

No server costs or complex infrastructure needed. All processing happens locally – your data never leaves the browser. Simple API, multiple engine support, ready-to-use models. — Read More
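As a quick illustration of the “simple API” the project advertises, here is a minimal sketch of in-browser text generation. The package path, class name, model identifier, and method names are assumptions based on typical usage, not confirmed details; consult the BrowserAI repository for the actual interface.

```typescript
// Minimal sketch of client-side inference with BrowserAI.
// Assumption: the import path, class name, model id, and method names below
// are illustrative and may differ from the library's real API.
import { BrowserAI } from '@browserai/browserai';

async function main(): Promise<void> {
  const ai = new BrowserAI();

  // Download and cache a small model; inference then runs entirely in the
  // browser, so prompts and outputs never leave the machine.
  await ai.loadModel('llama-3.2-1b-instruct');

  const reply = await ai.generateText('Why do in-browser LLMs avoid server costs?');
  console.log(reply);
}

main().catch(console.error);
```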

#devops

Deepseek: The Quiet Giant Leading China’s AI Race

Deepseek is a Chinese AI startup whose latest R1 model beat OpenAI’s o1 on multiple reasoning benchmarks. Despite its low profile, Deepseek is the Chinese AI lab to watch.

… Deepseek’s strategy is grounded in their ambition to build AGI. Unlike previous spins on the theme, Deepseek’s mission statement does not mention safety, competition, or stakes for humanity, but only “unraveling the mystery of AGI with curiosity”. Accordingly, the lab has been laser-focused on research into potentially game-changing architectural and algorithmic innovations.

Deepseek has delivered a series of impressive technical breakthroughs. Even before R1-Lite-Preview, the lab had a longer track record of wins: architectural improvements like multi-head latent attention (MLA) and sparse mixture-of-experts (DeepseekMoE) reduced inference costs enough to trigger a price war among Chinese developers. Meanwhile, Deepseek’s coding model, trained on these architectures, outperformed rivals such as GPT-4 Turbo.
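Why does a sparse mixture-of-experts layer cut inference cost? Each token is routed to only a handful of experts, so per-token compute scales with the number of active experts rather than the total parameter count. The sketch below is a conceptual illustration of top-k gating, not DeepSeek’s actual implementation; real MoE layers route hidden states inside transformer blocks and typically normalize gate weights with a softmax.

```typescript
// Conceptual top-k expert routing, the core idea behind sparse
// mixture-of-experts layers such as DeepseekMoE. Illustrative only.

type ExpertChoice = { expert: number; weight: number };

// Keep the k highest-scoring experts and renormalize their gate weights.
function topKRouting(gateScores: number[], k: number): ExpertChoice[] {
  const ranked = gateScores
    .map((score, expert) => ({ expert, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);

  const total = ranked.reduce((sum, e) => sum + e.score, 0);
  return ranked.map((e) => ({ expert: e.expert, weight: e.score / total }));
}

// Eight experts exist, but only two run for this token, so per-token compute
// is roughly 2/8 of a dense layer with the same total parameter count.
const choices = topKRouting([0.1, 0.7, 0.05, 0.9, 0.2, 0.3, 0.15, 0.4], 2);
console.log(choices); // [{ expert: 3, weight: 0.5625 }, { expert: 1, weight: 0.4375 }]
```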

As a first step to understanding what’s in the water at Deepseek, we’ve translated a rare, in-depth interview with CEO Liang Wenfeng, originally published this past July on a 36Kr sub-brand. — Read More

#china-ai

AI Will Write Complex Laws

Artificial intelligence (AI) is writing law today. This has required no changes in legislative procedure or the rules of legislative bodies—all it takes is one legislator, or legislative assistant, to use generative AI in the process of drafting a bill.

In fact, the use of AI by legislators is only likely to become more prevalent. There are currently projects in the US House, US Senate, and legislatures around the world to trial the use of AI in various ways: searching databases, drafting text, summarizing meetings, performing policy research and analysis, and more. A Brazilian municipality passed the first known AI-written law in 2023.

That’s not surprising; AI is being used more everywhere. What is coming into focus is how policymakers will use AI and, critically, how this use will change the balance of power between the legislative and executive branches of government. Soon, US legislators may turn to AI to help them keep pace with the increasing complexity of their lawmaking—and this will suppress the power and discretion of the executive branch to make policy. — Read More

#legal

Deep-learning enabled generalized inverse design of multi-port radio-frequency and sub-terahertz passives and integrated circuits

Millimeter-wave and terahertz integrated circuits and chips are expected to serve as the backbone for future wireless networks and high resolution sensing. However, design of these integrated circuits and chips can be quite complex, requiring years of human expertise, careful tailoring of hand-crafted circuit topologies and co-design with parameterized and pre-selected templates of electromagnetic structures. These structures (radiative and non-radiative, single-port and multi-port) are subsequently optimized through ad-hoc methods and parameter sweeps. Such bottom-up approaches with pre-selected regular topologies also fundamentally limit the design space. Here, we demonstrate a universal inverse design approach for arbitrary-shaped complex multi-port electromagnetic structures with designer radiative and scattering properties, co-designed with active circuits. To allow such universalization, we employ deep-learning-based models, and demonstrate synthesis with several examples of complex mm-Wave passive structures and end-to-end integrated mm-Wave broadband circuits. The presented inverse design methodology, which produces designs in minutes, can be transformative in opening up a new, previously inaccessible design space. — Read More

#strategy