Intellectually Curious
Intellectually Curious is a podcast by Mike Breault featuring over 1,800 AI-powered explorations across science, mathematics, philosophy, and personal growth. Each short-form episode is generated, refined, and published with the help of large language models—turning curiosity into an ongoing audio encyclopedia. Designed for anyone who loves learning, it offers quick dives into everything from combinatorics and cryptography to systems thinking and psychology.
Inspiration for this podcast:
"Muad'Dib learned rapidly because his first training was in how to learn. And the first lesson of all was the basic trust that he could learn. It's shocking to find how many people do not believe they can learn, and how many more believe learning to be difficult. Muad'Dib knew that every experience carries its lesson."
― Frank Herbert, Dune
Note: These podcasts were made with NotebookLM. AI can make mistakes. Please double-check any critical information.
SSD Unleashed: How Simple Self-Distillation Turns AI Guesses into Mastery
A deep dive into Simple Self-Distillation (SSD): how large language models can improve by training on their own unverified outputs with zero external supervision. We unpack the precision-exploration conflict, the roles of locks (moments demanding strict precision) and forks (points of creative exploration), and how SSD reshapes token distributions to sharpen precision while preserving exploration. We review the Qwen3 30B Instruct results on LiveCodeBench (a notable ~30% relative gain, with the strongest improvements on hard problems) and discuss the surprising finding that even training data that is mostly gibberish can help models learn the geometry of problem solving. Finally, we consider what latent capabilities might be unlocked when models learn from their own guesses, and what this could mean for AI-assisted problem solving.
Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.
Sponsored by Embersilk LLC
SPEAKER_02: I recently tried to assemble, um, one of those incredibly complex flat-pack entertainment centers.
SPEAKER_00: Oh no. Those are a nightmare.
SPEAKER_02: Right. And of course I lost the manual immediately, so I just started, you know, blindly guessing the steps.
SPEAKER_00: Let me guess, it didn't end well.
SPEAKER_02: It ended in total disaster, just a backwards, wobbly mess. But, uh, researchers at Apple actually just showed that when an AI blindly guesses its own steps without a manual, it doesn't build a wobbly bookshelf.
SPEAKER_00: Yeah. It actually figures out how to become a master builder.
SPEAKER_02: Which is wild.
SPEAKER_00: Exactly.
SPEAKER_02: So today our mission for this deep dive into the source material is unpacking a fascinating paper on something called Simple Self-Distillation, or SSD.
SPEAKER_00: It's an incredibly optimistic breakthrough.
SPEAKER_02: It really is. It basically shows how large language models can autonomously unlock their latent coding potential. The future of human and AI problem solving is looking so bright here. So let's jump right in. How does this work?
SPEAKER_00: Well, the method is, I mean, shockingly straightforward. Usually, to train an AI you need external teachers, or human labels, or, you know, feedback on whether the code actually executes properly.
SPEAKER_02: Right. Someone has to tell it what it did wrong.
SPEAKER_00: Exactly. But SSD strips all of that away entirely. The AI generates code using specific temperature and truncation settings, and then, get this, it fine-tunes itself on its own unverified raw outputs.
SPEAKER_02: With zero external teachers, just learning from itself.
SPEAKER_00: Zero. No execution feedback, nothing.
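As a rough sketch of the decoding step described here, temperature scaling plus nucleus (top-p) truncation over a single token distribution might look like the following. The settings and the tiny logit vector are illustrative assumptions, not values from the paper:

```python
import math
import random

def sample_with_temp_and_top_p(logits, temperature=0.8, top_p=0.9, rng=None):
    """Sample a token index using temperature scaling plus top-p truncation.

    The raw scores are sharpened or softened by `temperature`, then the
    long tail of unlikely tokens is cut off so only the smallest set of
    tokens whose probabilities reach `top_p` can be sampled.
    """
    rng = rng or random.Random()
    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus truncation: keep the most probable tokens until mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the kept set and draw one index.
    kept_mass = sum(probs[i] for i in kept)
    r = rng.random() * kept_mass
    acc = 0.0
    for i in kept:
        acc += probs[i]
        if r <= acc:
            return i
    return kept[-1]
```

In the SSD setup the hosts describe, outputs produced this way become the fine-tuning data, with no filtering for correctness.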
SPEAKER_02: Wow. It's doing all of this without any human hand-holding.
SPEAKER_00: Yeah.
SPEAKER_02: But you know, for those of you who do need a little hand-holding to integrate AI into your businesses, that is exactly what Embersilk specializes in. They're the sponsor of today's deep dive.
SPEAKER_00: A very helpful resource, for sure.
SPEAKER_02: Definitely. If you need help with AI training, automation, integration, or software development, basically uncovering where agents can make the most impact for your business or personal life, check out Embersilk.com for your AI needs.
SPEAKER_00: Highly recommend them.
SPEAKER_02: So, getting back to SSD, the data here is just phenomenal. They tested this straightforward method on the Qwen3 30B Instruct model, and the pass rate on LiveCodeBench went from about 42.4% to 55.3%.
SPEAKER_00: That is a 30% relative gain.
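To spell out the arithmetic behind that figure, using the pass rates quoted above:

```python
# Relative gain from the LiveCodeBench pass rates mentioned in the episode.
base, improved = 42.4, 55.3
relative_gain = (improved - base) / base
print(f"{relative_gain:.1%}")  # 30.4%
```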
SPEAKER_02: And the biggest improvements were on the absolute hardest coding problems. But okay, I have to push back here for a second. Let's unpack this. If I practice bad habits playing the piano, I just get worse, right? I memorize the mistakes. So how does an AI training on its own potentially flawed, unverified code actually improve?
SPEAKER_00: That's the million-dollar question. To answer it, we have to look at what researchers call the precision-exploration conflict.
SPEAKER_02: Okay. What is that?
SPEAKER_00: Well, generating code features two distinct things: locks and forks. Locks are those moments of strict syntax that demand absolute precision.
SPEAKER_02: Like putting a bracket in the exact right place.
SPEAKER_00: Exactly. But forks are different. Forks are the creative algorithmic choices where you actually need exploration. There are multiple valid paths.
SPEAKER_02: So a lock is like making sure the word is spelled right, and a fork is like deciding which word tells the best story.
SPEAKER_00: That's a perfect way to put it. And standard decoding forces a really clumsy compromise between the two.
SPEAKER_02: Because it can't be precise and creative at the same time.
SPEAKER_00: Right. But SSD fundamentally reshapes the token distributions. It suppresses distractions at those locks, which creates sharp spikes of precision.
SPEAKER_02: Oh, I see.
SPEAKER_00: And at the same time, it preserves diverse valid choices at the forks, so it creates broad plateaus of exploration.
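The spike-versus-plateau intuition can be made concrete with a toy model. The sketch below is a simplification I am assuming, not the paper's training procedure: in the infinite-sample limit, fine-tuning on your own temperature/top-p samples just replaces each token distribution with the sampling distribution itself. Under that idealization, a peaked "lock" distribution gets sharper while a flat "fork" distribution keeps all of its valid options:

```python
import math

def ssd_update(probs, temperature=0.8, top_p=0.9):
    """One idealized self-distillation step over a single token position:
    sharpen by 1/temperature, cut the tail at top_p, renormalize."""
    # Temperature sharpening: p_i ** (1/T), renormalized.
    sharpened = [p ** (1.0 / temperature) for p in probs]
    total = sum(sharpened)
    sharpened = [s / total for s in sharpened]
    # Nucleus truncation: keep the most probable tokens until mass >= top_p.
    order = sorted(range(len(sharpened)), key=lambda i: sharpened[i], reverse=True)
    kept, mass = set(), 0.0
    for i in order:
        kept.add(i)
        mass += sharpened[i]
        if mass >= top_p:
            break
    kept_mass = sum(sharpened[i] for i in kept)
    return [sharpened[i] / kept_mass if i in kept else 0.0 for i in range(len(probs))]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

# A "lock": one clearly right token plus noise -> the update sharpens the spike.
lock = [0.7, 0.1, 0.1, 0.1]
# A "fork": three roughly equally valid choices -> the update keeps them all.
fork = [0.35, 0.35, 0.30]
```

Running `ssd_update(lock)` raises the top token's probability and lowers the entropy (a sharper spike), while `ssd_update(fork)` leaves all three branches alive with similar mass (the plateau survives).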
SPEAKER_01: And researchers did a stress test to prove this, right? Which kind of blew my mind.
SPEAKER_00: Yes. They trained the model on data where 62% was literal gibberish.
SPEAKER_01: 62% gibberish?
SPEAKER_00: Complete nonsense. And incredibly, the model still improved.
SPEAKER_01: Wait, really? Even with gibberish?
SPEAKER_00: Yeah.
SPEAKER_01: Yeah.
SPEAKER_00: Because it proves it isn't just memorizing correct code, it's learning the underlying geometry of token probabilities. It's figuring out the mathematical shape of problem solving itself.
SPEAKER_02: That is so inspiring. It's finding its own brilliance just by analyzing the shape of its own unverified guesses.
SPEAKER_00: It really shows how much untapped potential these models already possess.
SPEAKER_02: It totally does. Which leaves me with a thought for you to ponder: if AI models have massive untapped potential just waiting to be unlocked by looking at their own guesses, what other latent capabilities are hiding in the tools you use every day?
SPEAKER_00: There is so much more to discover.
SPEAKER_02: So much more. It's a wonderful time to be intellectually curious. If you enjoyed this podcast, please subscribe to the show. Hey, leave us a five-star review if you can. It really does help get the word out. Thanks for tuning in.