The Network Is the Product: The Data Network Flywheel Compounds Through Connection

The value of a data product is never contained within its boundaries. It emerges from the number, quality, and friction of its connections, and from the signals it produces. Connectivity is the architecture that turns isolated signals into coordinated intelligence. The mistake most teams make is assuming insight comes from accumulation, when in reality it comes from interaction. — Read More

#data-science

Data Science Is Not Dying… It Is Splitting Into Five New Jobs You Should Know

When I first fell in love with data, it felt simple and dangerous at the same time.

You could load a CSV… run a few queries… make a chart… and suddenly people listened.

That was twelve years ago. The tools have changed… the expectations have widened… and the term “data scientist” now means something different depending on the team you join.

…Data science is not going away. It is multiplying into new craft roles that require different muscles.

If you treat this moment like doom… you will be outpaced.

If you treat it like a chance to pick what you are really good at… you will be in demand. — Read More

#data-science

Why You’ll Never Have a FAANG Data Infrastructure and That’s the Point | Part 1

This is Part 1 of a series on FAANG data infrastructures. In this series, we’ll break down the state-of-the-art designs, processes, and cultures that FAANGs and similar technology-first organisations have developed over decades. Along the way, we’ll examine why enterprises want such infrastructures, whether that desire is realistic, and what routes can deliver state-of-the-art outcomes without the decades invested or the millions spent in experimentation. This introductory piece touches on the fundamental questions; in the upcoming pieces, we’ll take one FAANG at a time, break down its infrastructure to surface common patterns and design principles, and illustrate replicable paths to those outcomes. — Read More

#data-science

Thinking Like a Data Engineer

I thought becoming a data engineer meant mastering tools. Instead, it meant learning how to see. I assumed the hardest part would be the technology — Hadoop, Spark, SQL optimization, and distributed processing. Over time, I realized the real challenge wasn’t technical. It was learning how to think.

Learning to think like a data engineer — to see patterns in chaos, to connect systems to human behavior, to balance simplicity and scale — is a slow process of unlearning, observing, and reimagining. I didn’t get there through courses or certifications. I got there through people.

Four mentors, in four different moments of my life, unknowingly gave me lessons that shaped how I approach engineering, leadership, and even life. Each taught me something not about data, but about systems thinking.

What follows isn’t a tutorial. It’s a map of how four people — and their lessons — rewired how I think. — Read More

#data-science

Data Modeling for the Agentic Era: Semantics, Speed, and Stewardship

In data analytics, we’re facing a paradox. AI agents can theoretically analyze anything, but without the right foundations, they’re as likely to hallucinate a metric as to calculate it correctly. They can write SQL in seconds, but will it answer the right business question? They promise autonomous insights, but at what cost to trust and accuracy?

These days, everyone is embedding AI chat in their product. But to what end? Does it actually help, or would users rather turn to tools like Claude Code when they need real work done? The real questions are: how can we model our data for agents to reliably consume, and how can we use agents to develop better data models?

After spending the last year exploring where LLMs have genuine leverage in analytics (see my writing on GenBI and Self-Serve BI), I’ve identified three essential pillars that make agentic data modeling actually work: semantics as the shared language both humans and AI need to understand metrics, speed through sub-second analytics that lets you verify numbers before they become decisions, and stewardship with guardrails that guide without constraining. The TL;DR? AI needs structure to understand, humans need speed to verify, and both need boundaries to stay productive. — Read More

#data-science

The Complete AI Engineering Roadmap for Beginners

Hey there, future AI engineer!

Feeling overwhelmed by all the AI buzz and wondering where to start? Don’t worry. This roadmap will take you from “What’s AI?” to building real AI systems, one step at a time. Think of this as your GPS for the AI journey ahead!

Here’s your friendly guide to breaking into the world of AI Engineering. — Read More

#data-science

AI-Ready Data: A Technical Assessment. The Fuel and the Friction.

Most organizations operate data ecosystems built over decades of system acquisitions, custom development, and integration projects. These systems were designed for transactional processing and business reporting, not for the real-time, high-quality, semantically rich data requirements of modern AI applications.

Research shows that 50% of organizations are classified as “Beginners” in data maturity, 18% are “Dauntless” with high AI aspirations but poor data foundations, 18% are “Conservatives” with strong foundations but limited AI adoption, and only 14% are “Front Runners” achieving both data maturity and AI scale. — Read More

#data-science

Meta’s Data Scientist’s Framework for Navigating Product Strategy as Data Leaders

One question I often get is what makes the Product Data Scientist role special at Meta. My answer has always been: “You are by default a product leader, navigating product directions with data.” This is true across all levels, from new grads to directors. Data scientists at Meta don’t just analyze data — they transform business questions into data-driven product visions that help build better human connections.

The challenge? Product strategy development exists across a spectrum of conditions. Here I’ll explore how data scientists at Meta can drive product strategies across four distinct scenarios defined by data availability (low to high) and problem clarity (broad to concrete). — Read More

#data-science

10 Years of Experience in 10 Minutes — A Data Analyst’s Problem-Solving Guide

Data analytics isn’t just about crunching numbers — it’s about solving real business problems with clarity and efficiency. Over the past decade, I’ve faced countless challenges, from messy datasets to indecisive stakeholders. This guide is my way of condensing 10 years of hard-earned experience into 10 minutes of actionable insights. Whether you’re just starting or refining your approach, these lessons will help you think and work like an experienced data analyst. — Read More

#data-science

Work smarter, not harder: Using the 80/20 principle in data analysis.

Have you heard of the 80/20 rule, or the Pareto Principle? It says that roughly 80% of the effects come from 20% of the causes.

In most cases, a small percentage of efforts drive most of the results. Let’s apply this rule to data analysis, and work smarter, not harder!

Why is the 80/20 rule useful? It lets you focus on the few tasks that generate the most value for you and your organization. This saves time, increases efficiency, and makes you more useful at work. — Read More
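The idea above can be made concrete with a small sketch: given per-task value scores, find the "vital few" tasks that together account for roughly 80% of the total. The task names and scores here are purely hypothetical, invented for illustration.

```python
# Hypothetical value scores for an analyst's weekly tasks (made up for
# illustration; in practice you'd estimate these from impact or revenue).
task_value = {
    "fix_revenue_dashboard": 40,
    "automate_weekly_report": 25,
    "ad_hoc_requests": 10,
    "refactor_old_queries": 8,
    "tune_chart_colors": 5,
    "meeting_notes_cleanup": 4,
    "rename_files": 3,
    "misc": 5,
}

total = sum(task_value.values())
cumulative = 0
vital_few = []

# Walk tasks from most to least valuable, stopping once we cover ~80%.
for task, value in sorted(task_value.items(), key=lambda kv: kv[1], reverse=True):
    vital_few.append(task)
    cumulative += value
    if cumulative / total >= 0.8:
        break

print(vital_few)           # the small set of high-value tasks
print(cumulative / total)  # the share of total value they cover
```

With these numbers, four of the eight tasks already cover 83% of the value, so the remaining four are candidates to automate, batch, or drop.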

#data-science