07.16.2025 | By: Steve Vassallo

The story of software is one of layered recursions. First, we wrote explicit code: line-by-line instructions for computers (software 1.0). Next, with the rise of deep learning, we trained neural networks to learn how to achieve objectives directly from data (software 2.0). Today, we use natural-language prompts to instruct AI systems to write code for us, which becomes the basis for further rounds of AI-generated prompts and code (software 3.0). Each step forward represents a new level of abstraction that brings human intent closer to machine execution.
Until now, data engineering has seen little benefit from previous software revolutions, but software 3.0 is changing that. For years, data teams have been burdened by repetitive, tedious tasks: connecting to APIs, handling authentication, mapping JSON to tables, and maintaining fragile ETL pipelines, to name just a few. Today, these tasks can be described in natural language and delegated to AI systems.
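To make one of those chores concrete, here is a minimal sketch of what "mapping JSON to tables" tends to involve when done by hand. It uses only Python's standard library; the payload, the `customers` table name, and the double-underscore naming for nested keys are assumptions made for illustration, not a description of how any particular tool does it.

```python
import json
import sqlite3
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into one level, joining nested keys with '__'."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}__"))
        else:
            flat[name] = value
    return flat


# Hypothetical API response; the field names are made up for this example.
payload = json.loads(
    '[{"id": 1, "customer": {"name": "Ada", "country": "UK"}},'
    ' {"id": 2, "customer": {"name": "Lin", "country": "SG"}}]'
)

rows = [flatten(record) for record in payload]
# Union of all keys seen across rows becomes the column list.
columns = sorted({column for row in rows for column in row})

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE customers ({', '.join(columns)})")
placeholders = ", ".join("?" for _ in columns)
conn.executemany(
    f"INSERT INTO customers ({', '.join(columns)}) VALUES ({placeholders})",
    [tuple(row.get(column) for column in columns) for row in rows],
)

result = conn.execute(
    "SELECT id, customer__name FROM customers ORDER BY id"
).fetchall()
```

Even this toy version has to decide how to name nested fields and how to derive columns from the data, which is the sort of boilerplate that piles up across many sources.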
These shifts are creating major tailwinds for our portfolio company dltHub. dltHub is an open-source product that’s purpose-built for the software 3.0 era. Designed from the start for both AI and human users, dltHub transforms the messy, complex world of data engineering into an accessible, streamlined, and collaborative human-AI workflow.
As a lead investor in dltHub’s seed round, I’ve watched dlt scale dramatically: from 87 companies in production in December 2023 to over 4,000 today, spanning startup unicorns and Fortune 500 firms. Just last month, dlt was downloaded more than 2 million times on PyPI, making it the most popular Python library for moving data and one of the fastest-growing data tooling projects I’ve ever seen.

Even more compelling is the momentum from LLMs and the rise of “vibe coding”: a new way of programming in natural language (concurrent with software 3.0) that’s allowing the long tail of niche data sources to become a shared catalog of pipelines. This June alone, more than 40,000 dlt sources were created by users, demonstrating the power of AI-assisted development at scale.
I first met Matthaus Krzykowski, dltHub’s CEO, through my longtime friend and fellow founder Lars Kamp. When we reconnected years later around dlt, Matthaus and Marcin, his co-founder and CTO, had lived the frustrations of building machine-learning data pipelines firsthand for customers of AI agent startup Rasa. They understood deeply the friction points in modern data engineering, especially for fast-scaling, Python-centric teams.
Their hard-earned insights, combined with my own experience as an early investor in several foundational data infrastructure startups including Tabular, MotherDuck, and Mode, set the stage for a data-nerdy reunion at dlt’s Berlin headquarters in late 2023. The more I dug in with Matthaus and Marcin, the more impressed I was with their technical depth, clarity of vision, and genuine passion for democratizing powerful data tools. By the end of our day together, we signed a term sheet to lead their seed round.

dltHub began with the mission of enabling data movement for Python-first data teams. As AI’s capabilities have advanced, their Python-native approach has positioned them perfectly for our software 3.0 world, allowing them to evolve from solving data movement challenges to reimagining how data infrastructure is created, shared, and managed.
With dltHub, you might start by telling an AI agent, “I need to load data from the Stripe API into BigQuery,” and within moments receive a working pipeline script that you can run or refine. This human-AI partnership reflects both the present and the future of programming, and dltHub is among the earliest developer tools to fully embrace this paradigm.
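For a sense of what such a script might look like, here is a rough sketch using the open-source dlt library. It is an illustration under stated assumptions, not the output of any actual agent: the sample pages stand in for real Stripe API responses, the field names are not Stripe's actual schema, and the pipeline, dataset, and table names are hypothetical. Running `load_charges` requires dlt to be installed and BigQuery credentials to be configured.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical stand-in for paginated API responses; a real script would
# fetch these over HTTP with an API key instead.
SAMPLE_PAGES = [
    {"data": [{"id": "ch_1", "amount": 1200, "currency": "usd"},
              {"id": "ch_2", "amount": 550, "currency": "eur"}]},
    {"data": [{"id": "ch_3", "amount": 9900, "currency": "usd"}]},
]


def iter_charges(pages: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one record per charge across all pages."""
    for page in pages:
        yield from page.get("data", [])


def load_charges(pages: Iterable[Dict[str, Any]], destination: str = "bigquery"):
    """Load the charges into the destination with dlt and return the load info."""
    # Imported here so iter_charges can be used without dlt installed.
    import dlt

    pipeline = dlt.pipeline(
        pipeline_name="stripe_charges",  # hypothetical name
        destination=destination,
        dataset_name="stripe_data",      # hypothetical name
    )
    # dlt infers the table schema from the records it is handed.
    return pipeline.run(list(iter_charges(pages)), table_name="charges")
```

Calling `load_charges(SAMPLE_PAGES)` would run the load. Passing `destination="duckdb"` instead is a convenient way to try the same script locally without cloud credentials.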
This human-AI collaboration becomes even more powerful when paired with community. Join dltHub Slack or browse their GitHub, and you’ll find a buzz of activity around building new data connectors. Every day, developers use LLMs (via tools like Cursor, GitHub Copilot, and Claude Code) to spin up pipelines for new APIs and data sources.

This community-driven approach makes dltHub more than a tool: it’s a true platform and ecosystem. Just as GitHub became the home for code and open-source collaboration, dltHub aims to be the home for data connectors and pipelines: a centralized knowledge database where you can search for any data source and find a pre-built pipeline created by a fellow community member, often with AI assistance.
At Foundation, we believe the next generation of developer tools will be AI-native and community-driven. dltHub embodies this vision. Our investment centers on several key theses:
Programming is becoming a conversation with our computers. Where we once debugged syntax errors and wrestled with API documentation, we now describe our intentions and watch AI systems translate them into working code in ways that feel truly magical. Software 3.0 means both a new way of building software and a radical expansion of who gets to build it – all happening at a breakneck pace.
dltHub sits at the center of this accelerating trend. Having watched the team develop and scale their vision over the past year, I couldn’t be more excited to see dltHub go live in beta for the broader developer community.
For anyone who works with data, getting started is simple: install the dlt library, explore the docs, and open your favorite IDE or AI coding assistant to see how quickly you can vibe code a pipeline. Be sure to also join the community Slack – you’ll find dltHub team members and users there to help.
We’re proud to back Matthaus, Marcin, and the entire dltHub team in their mission to make data engineering more effortless, intelligent, and community-driven. The age of human-AI collaboration in software is here, and dltHub is helping make it real for every Python engineer.
Published on July 15, 2025
Written by Steve Vassallo