Leading NEX’s $3.5M seed round: Next-generation visual foundation models

08.27.2024 | By: Andrew Han

I met Darius Lam for the first time over ramen in Mountain View. Within minutes of meeting him, it was clear that he was a razor-sharp technologist with an endlessly enthusiastic, infectious energy about him.

Midway through lunch, he took out his laptop to show me several AI-generated images that looked like they could have come from one of the state-of-the-art image models on the market, trained for hundreds of thousands or millions of dollars on millions of images. I was stunned to learn that they were, in fact, images generated by his own model, trained from scratch, complete with architecture- and infrastructure-level optimizations to reduce cost, and that he had already acquired tens of thousands of monthly active users on the accompanying image-gen app.

Over the course of that interaction and subsequent meetings, I learned more about Darius’s life, which in today’s parlance could aptly be characterized as “monk mode”: a truly monastic daily routine, alternating between sets at the gym and sessions studying loss curves on his elaborate model-training dashboard.

I also admired his Renaissance mind—he studied classics and computer science at Harvard and could fluidly interweave insights from both disciplines into our conversations. These ranged from Greek and Persian literature to his work on computer vision at Cerebras, another of Foundation Capital’s trailblazing AI companies that was recently named by Time as one of 2024’s 100 most influential companies. His experience at Cerebras, which builds wafer-scale chips for AI workloads, gave him an acute understanding of not only the software and training aspects of foundation model development, but also of the hardware and firmware layers.

We sparred over topics like model architecture and data curation. Darius routinely schooled me on the finer details of diffusion techniques and their downstream impact on model training, output quality, and runtime efficiency. 

But Darius isn’t just an expert in model development. He also shows an extraordinary aptitude for building quickly and efficiently, iterating on market feedback, and shipping product. 

At first glance, his company NEX might seem like another image generation app among many. But what makes NEX’s models special are their native capabilities for control. Today’s frontier image synthesis models make it easy to generate beautiful, compelling images from text; many now also offer the ability to change and edit images with additional text direction. However, for image models to really be useful for design, illustration, and other professional use cases, they need to offer more granular controls that preserve style, color, pose, or an artist’s initial design.

While today’s users might control image outputs with add-ons like ControlNets, these tools tend to produce lower-quality outputs than raw generations from the base models, are often difficult to use, and are primarily compatible with older model versions. I spent several weekends tinkering with these models and their accompanying web UIs and controllability features, fine-tuning them, and relaying my findings back to Darius. While they represent meaningful strides in making powerful foundation models accessible to consumers, they remain limited in both quality and accessibility.
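For a sense of what that add-on workflow looks like in practice, here is a rough sketch using the open-source diffusers library, with public checkpoints as illustrations (the specific model IDs are examples, not the exact setup I tinkered with). Note how the pose ControlNet is tied to an older base model and requires a separately preprocessed skeleton image:

```python
# Illustrative sketch of the ControlNet add-on workflow (example checkpoints only).
# Requires: pip install diffusers transformers accelerate
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# A pose-conditioning ControlNet trained against Stable Diffusion 1.5 --
# it is tied to that older base model, which is part of the limitation.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The pose input must be a preprocessed OpenPose skeleton produced by a
# separate annotator step -- one more moving part for the user to manage.
pose_image = Image.open("pose_skeleton.png")

image = pipe("a dancer mid-leap, studio lighting", image=pose_image).images[0]
image.save("pose_conditioned.png")
```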

NEX solves these problems with a novel architectural and training approach that harmonizes multiple input conditions into one. This enables NEX to generate images that are higher quality than ControlNets and offer more varied conditions and inputs. It’s also more accessible: instead of using complex node-based workflows, NEX allows generations via a simple block system. This innovative approach also allows for smaller model sizes, making it possible to generate a 1024×1024 pose-conditioned image in under 3 seconds on NVIDIA 4090 GPUs, without quantization.
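To make the contrast concrete, here is a generic sketch of what fusing several conditions into a single signal before denoising can look like. This is my own illustrative code for the general idea, not NEX’s actual architecture; the module names, dimensions, and fusion step are hypothetical.

```python
# Hypothetical sketch of multi-condition fusion for a diffusion model.
# Illustrates combining several conditions into one conditioning sequence;
# it is NOT NEX's architecture. All names and sizes are invented for illustration.
import torch
import torch.nn as nn

class ConditionFusion(nn.Module):
    """Project heterogeneous conditions (text, pose, style) into a shared
    space and merge them into a single conditioning sequence."""
    def __init__(self, text_dim=768, pose_dim=256, style_dim=512, d_model=768):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, d_model)
        self.pose_proj = nn.Linear(pose_dim, d_model)
        self.style_proj = nn.Linear(style_dim, d_model)

    def forward(self, text_emb, pose_emb, style_emb):
        # Each input: (batch, tokens, dim). Concatenate along the token axis
        # so the denoiser cross-attends to one unified conditioning sequence.
        return torch.cat(
            [self.text_proj(text_emb),
             self.pose_proj(pose_emb),
             self.style_proj(style_emb)],
            dim=1,
        )

# Example: the fused sequence would be passed as cross-attention context
# to the denoising network at each diffusion step.
fusion = ConditionFusion()
ctx = fusion(torch.randn(1, 77, 768), torch.randn(1, 16, 256), torch.randn(1, 4, 512))
print(ctx.shape)  # torch.Size([1, 97, 768])
```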

At Foundation Capital, we partner with incredible founders who are thinking about and building the future in novel and insightful ways, and we’ve been studying changes in the AI market landscape deeply. My partner Steve Vassallo was one of the earliest investors in Cerebras, which offers cluster-scale computing performance on a single chip and was incubated out of our offices. He has also written about the need to reimagine the design of generative AI products. In parallel, my partner Ashu Garg has closely tracked the trend toward multimodal architectures and compound AI systems that embed a variety of capabilities. 

In keeping with building toward these visions of the future, we are thrilled to partner with Darius at NEX on a $3.5 million seed round, and to see what moving creations his users produce with its models.

We are always looking for founders like Darius, who are able to masterfully blend technical expertise with market insight. If you’re an ambitious founder thinking about what’s next, I’d love to connect: ahan@foundationcap.com. 
