02.03.2024 | By: Ashu Garg
This month, on the cusp of B2BaCEO’s 50th podcast episode, I revisit conversations with five of my most recent and prominent guests. Each occupies a unique position at the forefront of AI. I’ve distilled the ten most important learnings for founders who are navigating the world of generative models.
I start at Databricks with two of its co-founders, Ali Ghodsi, now CEO, and Matei Zaharia, now CTO, along with its head of generative AI, Naveen Rao. The three share their thoughts on how AI is transforming software, offer key considerations for founders building with LLMs, and opine on why we shouldn’t fret over AGI. They also touch on the challenges of working with emerging technologies for which predefined categories and markets do not yet exist.
I then turn to Robert Nishihara, co-founder and CEO of Anyscale, a managed service for running distributed computing workloads. Robert describes some of the trends that he’s seeing in generative AI adoption from his experience with Anyscale’s users. I close with Bobby Yerramilli-Rao, chief strategy officer at Microsoft, who gives an incumbent’s perspective on the most promising opportunities for AI startups.
Without further ado…
Building on Marc Andreessen's now-famous truism that "software is eating the world," Ali goes one step further with his claim that "AI will eat software." The way Ali sees it, AI won't just be an add-on in the future: it will be native to all data platforms. "Intelligence will creep in wherever you have data," he emphasizes. "That's going to happen in every profession, in every industry, in every organization." The result will accelerate software's transition from a static, hard-coded tool into a self-driving, decision-making engine.
Ali compares our present stage of AI development to the late 1970s, when Oracle launched its relational database management system. Oracle’s innovation dramatically simplified and improved the ways that companies organize and store their data, paving the way for a new era of data-centric software. Ali envisions “AI databases” as the natural next step. “We call it a ‘Lakehouse,’ but you can just think of it as a database with ML and AI abilities built in. Wherever there’s data, there’s going to be machine learning sitting right next to it.” Increasingly, intelligence and automation will be bundled into the data architecture that underlies enterprise software.
He draws further parallels to tech giants like Google and Twitter, which he argues are, at their core, data analytics and AI companies. Search, targeted advertising, and social media are outgrowths of these more fundamental capabilities. As he puts it: "These companies disrupted their markets with data analytics and AI. How do we enable every company on the planet to do that? I think that's what the future will be like."
Ali’s advice for founders? Target the innovators and early adopters (read: other startups) first. He draws on Geoffrey Moore’s Crossing the Chasm, which identifies five key groups in the technology adoption cycle: innovators, early adopters, early majority, late majority, and laggards. Ali’s advice is especially relevant for AI-focused founders, given the myriad of risks (security, privacy, explainability, and so on) that come with putting generative AI models into production.
Ali stresses the difficulties of targeting established, traditional companies—the laggards—as an initial market. These large enterprises come with a labyrinth of security requirements, bureaucratic red tape, and drawn-out sales cycles. Beginning with other startups increases the likelihood that a product will be adopted and provides a powerful way to collect early user feedback. Establishing this data flywheel can help an AI startup differentiate its model and iterate its way to product-market fit.
Once their product shows success on a smaller scale, founders can then start to shift upmarket to the enterprise. Ali cites Salesforce and Dropbox as examples of iconic enterprise startups that pursued this “startup-first” GTM strategy. “That’s much easier than saying we’re going to crack the code directly to sell to JPMC from day one,” he relays.
To date, many founders and users alike have approached LLMs as if they were vast databases, marveling at their ability to learn and recall detailed information about seemingly every subject known to humans. However, Matei cautions against this perspective, especially when building real-world applications. One major issue with treating LLMs as databases is their propensity to hallucinate: a problem that tends to become more pronounced in larger models.
Instead of treating LLMs as internet-scale encyclopedias, Matei proposes that we should focus on what they excel at: reasoning, interpreting language, and understanding context. “For founders building applications, really think of the LLMs, especially the large LLMs, as reasoning engines. And build the knowledge engine outside the model.” This means leveraging LLMs for their language analysis and generation skills while sourcing factual, up-to-date information from trusted external sources through retrieval mechanisms and tool calling. By adopting this strategy, founders can capitalize on LLMs’ strengths while minimizing the risks associated with their unreliable recall of specific facts.
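To make Matei's pattern concrete, here is a minimal sketch of a retrieval-augmented setup in which the LLM reasons over facts supplied at query time rather than recalled from its weights. The toy in-memory document store, keyword retriever, and model choice are illustrative assumptions, not a prescribed stack; the sketch assumes the `openai` Python client with an `OPENAI_API_KEY` in the environment.

```python
# Minimal "reasoning engine + external knowledge" sketch: all facts
# reach the model through the prompt, not from its weights. The
# in-memory document store and keyword retriever are toy stand-ins
# for a real retrieval system; assumes the `openai` Python client.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

DOCUMENTS = [  # the "knowledge engine" lives outside the model
    "Acme's refund window is 30 days from the date of delivery.",
    "Acme ships to the US, Canada, and the EU.",
    "Acme support hours are 9am-5pm PT, Monday through Friday.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    return sorted(DOCUMENTS,
                  key=lambda d: -len(q_words & set(d.lower().split())))[:k]

def answer(query: str) -> str:
    """Use the LLM for reasoning and language; supply facts via retrieval."""
    context = "\n".join(retrieve(query))
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model works
        messages=[
            {"role": "system",
             "content": "Answer using ONLY the provided context. "
                        "If the context is insufficient, say so."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return response.choices[0].message.content

print(answer("How long do I have to return an order?"))
```

Swap the toy retriever for a real vector store and the hard-coded documents for your own knowledge base, and the core idea survives: the model does the reasoning, while the facts live outside it.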
In a must-listen for any founder working at AI’s frontier, Naveen offers insights into the challenges of launching companies in emerging technology sectors. He describes his approach to entrepreneurship as “taking the hard path.” “I’ve gone after technologies that haven’t yet happened, but I know they’re going to happen. There isn’t yet a market and employees don’t even know about it yet. So how the hell do you build a company?”
Consider Naveen’s journey with Nervana, an AI-focused chip company he founded in 2014. At the time, the chip industry was deeply out of favor, and machine learning was largely unknown outside of academia. “We were pitching employees that [machine learning] is a game changer, that it’s something completely different. They’d never heard of it, and they had to go do some learning on their own to even understand, ‘Is this viable or not?’”
The same pattern applied at MosaicML. Observing the power of LLMs like BERT, which Google put into production in 2019, Naveen recognized these models’ broad applicability across industries. Taking the next logical step, he foresaw the need for enterprise-grade services to manage their training, deployment, and customization. A core part of his journey as a founder was educating and convincing others—be they employees, investors, or prospective customers—about the potential of LLMs, which were then far from mainstream.
Carving out a new category presents a unique opportunity for founders. Naveen describes it this way: “As the problem space itself is evolving, you’re defining the solution, but there’s a lot of value to doing that because you can actually change the way people think about that problem and think about solutions. […] You’re playing at a harder level, but you’re also doing something potentially much more impactful.” His words underscore a key point: true innovation isn’t just about creating new products; it’s about reshaping how problems are understood.
Naveen offers an antidote to the doomsday fears about AGI, expressing deep skepticism about claims of its imminent arrival. He pegs the likelihood of achieving AGI in the next decade at between 30% and 50%. In his view, a more realistic timeframe is within thirty years, with a 90% probability.
To support his claim, Naveen reflects on the history of AI development, highlighting its cyclical nature and the tendency for promising advances to hit scalability limits. “The whole AI world has been a series of demonstrations that didn’t scale. We’ve seen this happen over and over again.” He recalls the mid-90s, when the support vector machine was introduced and momentarily overshadowed neural network research. “People were like, ‘Oh my God, we’ve solved everything. This is the way that the brain works.’ Then it got tapped out. It didn’t scale.”
Looking at current AI trends, Naveen believes that we may be at a similar juncture with Q-learning and reinforcement learning. "Q-learning has traditionally not scaled to high-dimensional spaces very well. Reinforcement learning breaks down when you get to those places. Will we solve it? Probably. But we'll have to figure out how to make it work stably at those scales."
Robert has a wider purview than most when it comes to generative AI. When he started his Ph.D. in computer science at UC Berkeley in 2013, the field was advancing so quickly within academia that he felt he had already missed the AI wave. Outside of academic research labs, however, real-world adoption of generative AI has taken much longer.
In 2023, the goal for most businesses experimenting with generative AI was simply to put something out there: build an MVP, validate that it has a clear use case, and iterate quickly. “At this stage, you’re not looking to spend a lot of time training a model. You’re looking to ship something quickly,” Robert says. API products make a lot of sense here, as they allow for fast launches with minimal engineering effort.
We’re now maturing to phase two, where businesses have proven the value of generative AI. They’ve shipped something. It has users. Now the emphasis turns to enhancing quality, reducing latency, extending the application, and lowering costs. These improvements often require a shift from APIs to more customized models. At this stage, “there are a lot of harder infrastructure challenges. Performance and cost become big considerations,” Robert explains. His bet is that, as generative models mature, we’ll see more enterprises lean into the flexibility and control that open-source solutions afford.
To date, most businesses have relied on one of three options for their generative AI needs: proprietary models, open-source models, or their own custom-built models. Robert is now seeing hybrid strategies emerge, where businesses are evaluating different types of models for different use cases. With the rapid advancement of AI technology, it won’t be long before open-source models are “good enough” for the majority of business tasks. The natural progression from here is to develop smaller, task-focused models using proprietary company data. As Robert puts it: “You’re going to need small, fast models. And task-specific models have a huge advantage when it comes to cost and speed because you can achieve the same quality with a smaller model if it’s specialized.”
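One way to act on this, sketched below under loose assumptions, is to distill a large "teacher" model into training data for a small, task-specific model. The keyword-based `label_with_large_model` helper is a hypothetical stand-in for a real large-model call, and the JSONL chat format mirrors what OpenAI-style fine-tuning endpoints accept.

```python
# Sketch: distill a large "teacher" model into fine-tuning data for a
# small, task-specific model. `label_with_large_model` is a hypothetical
# stand-in; in practice it would be a call to a large LLM with a rubric.
import json

SYSTEM = "Classify the support ticket as one of: billing, shipping, other."

def label_with_large_model(ticket: str) -> str:
    """Toy stand-in for the teacher model's classification."""
    text = ticket.lower()
    if "charge" in text or "refund" in text:
        return "billing"
    if "deliver" in text or "tracking" in text:
        return "shipping"
    return "other"

def build_training_file(tickets: list[str], path: str = "train.jsonl") -> None:
    """Write chat-format examples for an OpenAI-style fine-tuning job."""
    with open(path, "w") as f:
        for ticket in tickets:
            example = {"messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": ticket},
                {"role": "assistant", "content": label_with_large_model(ticket)},
            ]}
            f.write(json.dumps(example) + "\n")

build_training_file([
    "I was charged twice for my order.",
    "Where is my tracking number?",
    "Do you have a student discount?",
])
```

The same distillation idea carries over to open-source base models if you would rather own the resulting weights.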
Another interesting development Robert highlights is the growing use of routing architectures. Imagine a classifier as a kind of traffic controller for your AI queries. It assesses the complexity of each query and directs it to the most suitable model: simpler requests are handled by smaller models, while the more complex ones are passed on to larger, more sophisticated models. “If you do it right,” Robert describes, “you can achieve something like the cost of the smaller model with the quality of the more advanced model.”
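Here is a minimal sketch of that router. A production system would use a trained classifier; the keyword-and-length heuristic, the model names, and the `call_model` helper below are placeholder assumptions.

```python
# Sketch of a routing architecture: a cheap "traffic controller"
# estimates query complexity and picks a model accordingly. The
# heuristic stands in for a trained classifier; the model names and
# `call_model` are illustrative placeholders.
SMALL_MODEL = "small-fast-model"      # placeholder
LARGE_MODEL = "large-capable-model"   # placeholder

HARD_HINTS = ("why", "compare", "derive", "step by step", "prove")

def estimate_complexity(query: str) -> float:
    """Crude proxy: longer queries and reasoning keywords score higher."""
    score = min(len(query.split()) / 50.0, 1.0)
    if any(hint in query.lower() for hint in HARD_HINTS):
        score += 0.5
    return score

def route(query: str, threshold: float = 0.5) -> str:
    """Send hard queries to the large model, easy ones to the small one."""
    return LARGE_MODEL if estimate_complexity(query) >= threshold else SMALL_MODEL

def call_model(model: str, query: str) -> str:
    """Placeholder for an actual inference call against `model`."""
    return f"[{model}] answering: {query!r}"

for q in ("What's our refund window?",
          "Compare these two architectures and explain why one scales better."):
    print(call_model(route(q), q))
```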
As Anyscale builds out its own LLM-powered features, Robert has noticed what he thinks will be a growing problem: how to evaluate generative models. The problem only compounds as the number of customized and fine-tuned models grows. Traditionally, with machine learning, evaluation was an easy question to answer: you'd have your test data set, run the model against it, and get an accuracy score. If the new model scored higher, you'd simply swap it in.
That’s much harder with natural language, where outputs are open-ended and benchmarks are not absolute. How do you know if one textual or visual response is better than another? What does “better” mean in this context? This subjectivity of generative outputs makes them difficult to evaluate in clear-cut ways. Moreover, many existing frameworks for ML evaluation originated in academia and fail to map neatly onto business use cases. As a result, Robert sees model evaluation as a major opportunity for startups in 2024.
To grow an AI application, fitting it into users' existing workflows may matter as much as the tool itself. "The technology and the speed of writing applications is going to far outpace the ability for people to change their behavior—if that's what they need to do to adopt," Bobby explains. Take video conferencing. It was on an interesting penetration trajectory; then COVID happened, and it went through an enormous step change. It's now part of many people's daily lives. This makes it much easier to introduce AI features that will be adopted, because you don't have to rely on someone doing something they weren't doing before.
Many professionals aren't going to upend their workflows to try something new. Driving AI adoption means meeting users within their established behavioral patterns.
Speaking to founders, Bobby stressed the importance of solving a durable problem: one that will persist as generative models’ capabilities progress. This is always a challenge when building with emerging technologies, but it’s particularly acute when it comes to AI, given the pace at which the field is advancing. If OpenAI can make your product obsolete by releasing a new state-of-the-art model, you’re probably focusing on the wrong problem.
Bobby highlighted that some of the most exciting opportunities for AI founders lie at the application layer. Here, products can be developed, released on GitHub without any marketing (and sometimes not even a formed company!), and take off as vertical applications. “I often joke these days that vertical is the new up-and-to-the-right,” Bobby says. Focusing on a particular vertical allows founders to tap into high-quality, domain-specific data, which they can use to customize models and establish footholds in smaller yet stickier markets. Other areas for founders to explore include model observability, along with anything related to ensuring the compliance, predictability, and privacy of models.
As we move through 2024, I’m excited to continue my focus on AI in what promises to be another breakthrough year for AI-first founders. If there’s anyone whom you’d like to hear from on the podcast, or specific topics that you’re eager to explore, feel free to leave a comment or reach out!