Bootstrapping AI Training with Composer Autoinstall

Intellectually Curious

Intellectually Curious is a podcast by Mike Breault featuring over 1,800 AI-powered explorations across science, mathematics, philosophy, and personal growth. Each short-form episode is generated, refined, and published with the help of large language models—turning curiosity into an ongoing audio encyclopedia. Designed for anyone who loves learning, it offers quick dives into everything from combinatorics and cryptography to systems thinking and psychology.

Inspiration for this podcast:

"Muad'Dib learned rapidly because his first training was in how to learn. And the first lesson of all was the basic trust that he could learn. It's shocking to find how many people do not believe they can learn, and how many more believe learning to be difficult. Muad'Dib knew that every experience carries its lesson."

― Frank Herbert, Dune

Note: These podcasts were made with NotebookLM. AI can make mistakes. Please double-check any critical information.

June 12, 2026 • Mike Breault

We dive into Cursor’s May 2026 work on Composer Auto Install, a two-stage bootstrapping system that auto-generates runnable training environments for AI coders. An initial agent drafts setup commands; a second agent tests them, fabricating missing pieces and even patching dependencies live to get code running. The result is a dramatic jump in TerminalBench scores (61.7% vs 47.9%) and a scalable path to teaching AI to code—without getting bogged down by messy environment setup.

Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.

Sponsored by Embersilk LLC

SPEAKER_00 0:00

Last week I decided uh I was finally gonna learn how to cook a proper beef wellington. But like imagine if before I could even chop an onion or, you know, sear the meat, the instructor just hands me a toolbox and tells me to go plumb the kitchen sink.

SPEAKER_01 0:13

Right, and wire the oven while you're at it?

SPEAKER_00 0:15

Exactly. I mean, I'd never learn to cook, I'd just be a terrible plumber. And that is exactly the uh the frustrating hurdle AI developers hit when trying to train coding models.

SPEAKER_01 0:25

Yeah, because if an AI model is spending all its computational power trying to figure out a broken package manager or like a missing dependency, it's not actually learning how to code.

SPEAKER_00 0:35

Right. So instead of training on Python or whatever, the model is burning millions of parameters just trying to get the environment to run.

SPEAKER_01 0:42

I don't know.

SPEAKER_00 0:42

Which brings us to today's deep dive into a really fascinating May 2026 research post from Cursor on uh Composer Auto Install.

SPEAKER_01 0:51

It's such an ingenious system.

SPEAKER_00 0:52

It really is. And you know, before we get into it, if you are inspired by this kind of AI automation and uh you need help uncovering where AI agents can make the most impact for your business, you should really check out Embersilk.com.

SPEAKER_01 1:04

Oh, definitely.

SPEAKER_00 1:05

Yeah. They are today's sponsor and they help with everything from AI training to integration and software development. So check out Embersilk.com for your AI needs. But uh getting back to Cursor, how exactly do they stop their models from acting like frustrated plumbers?

SPEAKER_01 1:21

So it really comes down to automating the setup. Like when you're training a coding model using reinforcement learning, the AI absolutely needs a runnable environment. It requires that um that feedback loop of execution, you know?

SPEAKER_00 1:34

Like trying the code, seeing if it compiles, adjusting, like paying an expensive tutor, but you spend the whole hour just looking for your textbook.

SPEAKER_01 1:41

Right, exactly. But doing that manually across thousands of unique, messy code bases is physically impossible. Yeah. You would spend lifetimes just configuring files.

SPEAKER_00 1:50

But wait, shouldn't human developers just, I don't know, ensure these training environments work perfectly from the start?

SPEAKER_01 1:56

Um in in a perfect world, sure. But at scale, across thousands of diverse repositories. No way. So Cursor used their older model, Composer 1.5, to automatically build the training environments for the new model, Composer 2.

SPEAKER_00 2:10

Wait, the older AI builds the classroom for the newer one. How does it actually do that without a human stepping in to fix the inevitable bugs?

SPEAKER_01 2:18

It uses this uh two-stage bootstrapping process. First is goal setting. An agent scans the target code base, it checks the READMEs, the Makefiles, and it proposes 10 setup commands.

SPEAKER_00 2:30

Okay, and I assume it predicts what the successful output should look like.

SPEAKER_01 2:33

Yep, exactly. Then the second agent takes three of those commands and actually tries to execute them.

SPEAKER_00 2:39

And when things inevitably break, because you know it's software.

SPEAKER_01 2:42

That is where the second agent gets really creative. It actively problem solves. It'll like mock missing files or create placeholder database tables or even spin up dummy Docker containers just to force the code to run.

SPEAKER_00 2:54

Wait, really? It just fakes them.

SPEAKER_01 2:56

Yeah, and it loops up to five times if it hits an error, just trying new workarounds.
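The two-stage process described above can be sketched roughly in Python. This is only an illustration under stated assumptions, not Cursor's implementation: the names `Goal`, `Result`, `verify_goal`, `bootstrap`, `fake_run`, and `fake_suggest` are hypothetical, the choice of the first three goals is a guess (the episode does not say how the three are picked), and "up to five times" is read here as five attempts in total. The command runner and the workaround suggester are passed in as plain functions so the sketch runs without any real agent or shell.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

PROPOSED_GOALS = 10  # stage 1 proposes 10 setup commands (from the episode)
VERIFIED_GOALS = 3   # stage 2 tries three of them (from the episode)
MAX_ATTEMPTS = 5     # assumption: "up to five times" means five attempts total


@dataclass
class Goal:
    command: str   # setup command proposed by the first agent
    expected: str  # text the first agent predicts a successful run will print


@dataclass
class Result:
    goal: Goal
    succeeded: bool
    attempts: int
    workarounds: List[str]


def verify_goal(
    goal: Goal,
    run: Callable[[str, List[str]], Tuple[bool, str]],
    suggest_workaround: Callable[[str], Optional[str]],
) -> Result:
    """Stage 2: run one command, adding a workaround after each failure."""
    workarounds: List[str] = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        ok, output = run(goal.command, workarounds)
        if ok and goal.expected in output:
            return Result(goal, True, attempt, workarounds)
        fix = suggest_workaround(output)  # e.g. mock a file, add a dummy table
        if fix is not None:
            workarounds.append(fix)
    return Result(goal, False, MAX_ATTEMPTS, workarounds)


def bootstrap(goals, run, suggest_workaround) -> List[Result]:
    """Verify a subset of the proposed goals (here simply the first three)."""
    return [verify_goal(g, run, suggest_workaround) for g in goals[:VERIFIED_GOALS]]


# Toy stand-ins for the two agents, so the sketch is runnable end to end.
def fake_run(command: str, workarounds: List[str]) -> Tuple[bool, str]:
    if command == "make test" and "create placeholder table" not in workarounds:
        return False, "error: missing table users"
    return True, "ok: " + command


def fake_suggest(output: str) -> Optional[str]:
    return "create placeholder table" if "missing table" in output else None


goals = [Goal(f"step-{i}", "ok") for i in range(PROPOSED_GOALS)]
goals[1] = Goal("make test", "ok")
results = bootstrap(goals, fake_run, fake_suggest)
for r in results:
    print(r.goal.command, r.succeeded, r.attempts, r.workarounds)
```

In this toy run the second command fails once, picks up a placeholder-table workaround, and passes on the second attempt, which mirrors the retry-with-workarounds behavior the hosts describe.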

SPEAKER_00 3:00

Hold on, if we're generating fake tables and dummy containers just to pass a setup test, aren't we training the new model in a completely like hallucinated environment? It sounds like building a movie set where the doors don't actually lead anywhere.

SPEAKER_01 3:13

It does sound counterintuitive, I'll give you that. But think about what the reinforcement learning model actually needs here. It doesn't care about the real data in the database.

SPEAKER_00 3:22

Oh, so it just needs the underlying logic to compile.

SPEAKER_01 3:25

Right. It just needs to execute, so it gets that reward signal for writing correct code syntax. The uh the movie set is all it needs to practice the actual mechanics of coding.
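As a rough illustration of an execution-based reward signal, the sketch below runs a snippet in a fresh Python process and returns 1.0 if it exits cleanly and 0.0 otherwise. The function name `execution_reward`, the binary reward, and the timeout are all assumptions made for this example; the episode does not describe how Cursor actually computes rewards.

```python
import subprocess
import sys


def execution_reward(source: str, timeout_s: float = 5.0) -> float:
    """Return 1.0 if the snippet runs to completion with exit code 0, else 0.0."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],  # run the candidate code in a child process
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return 0.0  # treat a hang as a failed run
    return 1.0 if proc.returncode == 0 else 0.0


print(execution_reward("print(sum(range(5)))"))  # snippet that runs cleanly
print(execution_reward("def broken(:"))          # snippet with a syntax error
```

The point of the sketch is that the reward depends only on whether the code runs in the environment, not on whether the data behind it is real.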

SPEAKER_00 3:35

Okay, so the structural feedback is there, even if the data isn't. But I mean, does this actually work on a genuinely messy project out in the wild?

SPEAKER_01 3:43

Oh, it does. They tested this on the Celo monorepo, which is this really large blockchain project with um pretty sparse documentation.

SPEAKER_00 3:51

A total nightmare to set up, basically.

SPEAKER_01 3:53

Exactly. And the AI realized it was missing a dependency called Foundry, but it didn't just fake it, it actively searched the live web, read the Foundry docs, saw it needed authentication, and then created a localized mock user just to successfully run the app.

SPEAKER_00 4:09

Wait, so it's dynamically patching the environment on the fly using the internet?

SPEAKER_01 4:13

Yes. And the benchmark results show exactly why this matters. Because Composer 1.5 could set up these environments so effectively, Composer 2 jumped to a 61.7% score on Terminal Bench.

SPEAKER_00 4:26

Wow. And that's the benchmark for configuring developer environments, right?

SPEAKER_01 4:30

Yep, and that score is a huge leap from the older model's 47.9%.
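For a sense of scale, the gap between the two quoted scores can be worked out directly. The snippet below uses only the two figures from the episode (61.7% and 47.9%); the variable names are illustrative.

```python
composer_2_score = 61.7   # Terminal Bench score quoted for Composer 2
older_model_score = 47.9  # score quoted for the older model

absolute_gain = composer_2_score - older_model_score  # in percentage points
relative_gain = absolute_gain / older_model_score * 100

print(f"{absolute_gain:.1f} points, {relative_gain:.1f}% relative")
# prints "13.8 points, 28.8% relative"
```

So the jump is about 13.8 percentage points, or roughly a 29% relative improvement over the older score.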

SPEAKER_00 4:35

That is just such a brilliant positive feedback loop. Like the older generation clears the brush, paving a wider, faster road for the next generation to learn on. It actually reminds me a lot of um Demis Hassabis recently winning the Nobel Prize for using AI to predict protein structures.

SPEAKER_01 4:51

Oh, absolutely. He sees AI as this incredible tool that will help scientists make even more discoveries in the years to come.

SPEAKER_00 4:57

Right. By having AI pave the way for its successors, we're removing all these tedious roadblocks. It's so inspiring.

SPEAKER_01 5:03

It really is. Just imagine the unprecedented scientific and creative breakthroughs humanity will achieve when AI models handle their own run management and data pre-processing. We won't be bogged down by the setup.

SPEAKER_00 5:18

I love that. Well, if you enjoyed this deep dive, please subscribe to the show and hey, leave us a five-star review if you can. It really does help get the word out. Thanks for tuning in.