Moving Data Quality and Business Logic Upstream for More Efficient Data Systems
Shifting left is an interesting concept that’s gaining momentum in modern data engineering. SDF has been among those sharing this approach, even making “shifting left” one of their main slogans. As Elias DeFaria, SDF’s co-founder, describes it, shifting left means “improving data quality by moving closer toward the data source”.
However, the benefits extend beyond just data quality improvements. With dbt Labs’ recent acquisition of SDF, many are wondering: what does this mean for the shifting left movement, and more importantly, what exactly is shifting left in the data context?
In this article, we’ll explore the core principles behind shifting left, examine how code-first approaches have made moving logic upstream more efficient, and answer the questions: Why should data teams shift left? What elements need to be shifted? And how can your organization implement this approach to build more maintainable, efficient data systems? — Read More
Monthly Archives: April 2025
Model Context Protocol (MCP)
MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools. — Read More
How to Build an Agent
It’s not that hard to build a fully functioning, code-editing agent.
It seems like it would be. When you look at an agent editing files, running commands, wriggling itself out of errors, retrying different strategies – it seems like there has to be a secret behind it.
There isn’t. It’s an LLM, a loop, and enough tokens. It’s what we’ve been saying on the podcast from the start. The rest, the stuff that makes Amp so addictive and impressive? Elbow grease.
But building a small and yet highly impressive agent doesn’t even require that. You can do it in less than 400 lines of code, most of which is boilerplate.
I’m going to show you how, right now. We’re going to write some code together and go from zero lines of code to “oh wow, this is… a game changer.” — Read More
NVIDIA to Manufacture American-Made AI Supercomputers in US for First Time
NVIDIA is working with its manufacturing partners to design and build factories that, for the first time, will produce NVIDIA AI supercomputers entirely in the U.S.
Together with leading manufacturing partners, the company has commissioned more than a million square feet of manufacturing space to build and test NVIDIA Blackwell chips in Arizona and AI supercomputers in Texas. — Read More
DolphinGemma: How Google AI is helping decode dolphin communication
For decades, understanding the clicks, whistles and burst pulses of dolphins has been a scientific frontier. What if we could not only listen to dolphins, but also understand the patterns of their complex communication well enough to generate realistic responses?
Today, on National Dolphin Day, Google, in collaboration with researchers at Georgia Tech and the field research of the Wild Dolphin Project (WDP), is announcing progress on DolphinGemma: a foundational AI model trained to learn the structure of dolphin vocalizations and generate novel dolphin-like sound sequences. This approach in the quest for interspecies communication pushes the boundaries of AI and our potential connection with the marine world. — Read More
OpenAI debuts its GPT-4.1 flagship AI model
OpenAI has introduced GPT-4.1, a successor to the GPT-4o multimodal AI model launched by the company last year. During a livestream on Monday, OpenAI said GPT-4.1 has an even larger context window and is better than GPT-4o in “just about every dimension,” with big improvements to coding and instruction following.
GPT-4.1 is now available to developers, along with two smaller model versions. That includes GPT-4.1 Mini, which, like its predecessor, is more affordable for developers to tinker with, and GPT-4.1 Nano, an even more lightweight model that OpenAI says is its “smallest, fastest, and cheapest” one yet. — Read More
Meta defends Llama 4 release against ‘reports of mixed quality,’ blames bugs
Meta’s new flagship AI language model Llama 4 came suddenly over the weekend, with the parent company of Facebook, Instagram, WhatsApp and Quest VR (among other services and products) revealing not one, not two, but three versions — all upgraded to be more powerful and performant using the popular “Mixture-of-Experts” architecture and a new training method involving fixed hyperparameters, known as MetaP.
But following the surprise announcement and public release of two of those models for download and usage — the lower-parameter Llama 4 Scout and mid-tier Llama 4 Maverick — on Saturday, the response from the AI community on social media has been less than adoring. — Read More
Going beyond open data – increasing transparency and trust in language models with OLMoTrace
Today we introduce OLMoTrace, a one-of-a-kind feature in the Ai2 Playground that lets you trace the outputs of language models back to their full, multi-trillion-token training data in real time. OLMoTrace is a manifestation of Ai2’s commitment to an open ecosystem – open models, open data, and beyond. OLMoTrace is available today with our flagship models, including OLMo 2 32B Instruct. — Read More
OmniSVG: A Unified Scalable Vector Graphics Generation Model
Scalable Vector Graphics (SVG) is an important image format widely adopted in graphic design because of their resolution independence and editability. The study of generating high-quality SVG has continuously drawn attention from both designers and researchers in the AIGC community. However, existing methods either produces unstructured outputs with huge computational cost or is limited to generating monochrome icons of over-simplified structures. To produce high-quality and complex SVG, we propose OmniSVG, a unified framework that leverages pre-trained Vision-Language Models (VLMs) for end-to-end multimodal SVG generation. By parameterizing SVG commands and coordinates into discrete tokens, OmniSVG decouples structural logic from low-level geometry for efficient training while maintaining the expressiveness of complex SVG structure. To further advance the development of SVG synthesis, we introduce MMSVG-2M, a multimodal dataset with two million richly annotated SVG assets, along with a standardized evaluation protocol for conditional SVG generation tasks. Extensive experiments show that OmniSVG outperforms existing methods and demonstrates its potential for integration into professional SVG design workflows. — Read More
Samsung’s cute Ballie robot arrives this summer with Google Gemini in tow
Samsung’s Ballie will go on sale in the US and South Korea this summer, the company announced today. What’s more, through a partnership with Google Cloud, the diminutive robot will ship with a Gemini AI model.
Samsung didn’t state the specific system that powers Ballie, but in combination with the company’s own proprietary language models, it says the robot has multimodal capabilities, meaning Ballie can process voice, audio and visual data from its sensors. According to Samsung, Ballie can also manage your smart home devices and even offer health and styling recommendations, if you’re inclined to seek that type of advice from a robot. — Read More