Intellectually Curious
Intellectually Curious is a podcast by Mike Breault featuring over 1,800 AI-powered explorations across science, mathematics, philosophy, and personal growth. Each short-form episode is generated, refined, and published with the help of large language models—turning curiosity into an ongoing audio encyclopedia. Designed for anyone who loves learning, it offers quick dives into everything from combinatorics and cryptography to systems thinking and psychology.
Inspiration for this podcast:
"Muad'Dib learned rapidly because his first training was in how to learn. And the first lesson of all was the basic trust that he could learn. It's shocking to find how many people do not believe they can learn, and how many more believe learning to be difficult. Muad'Dib knew that every experience carries its lesson."
― Frank Herbert, Dune
Note: These podcasts were made with NotebookLM. AI can make mistakes. Please double-check any critical information.
Autonomous AI Agents in Research: Codex, Claude Code, and the Future of the Workflow
In this Intellectually Curious deep dive, we unpack a VoxDev webinar featuring Aniket Panjwani on how autonomous AI agents are transforming research workflows. From iterative loops and skill-based wrappers to Git-backed safety and disciplined planning, Codex and Claude Code can run regressions, critique hypotheses, and accelerate learning with minimal human busywork. We cover practical setups, how to structure context windows, and the director-vs-micromanager mindset.
Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.
Sponsored by Embersilk LLC
SPEAKER_01: I mean, I once spent literally three agonizing weeks in grad school trying to format a single data table, just tweaking margins, fighting with our libraries, and basically questioning every life choice that led me to that exact moment.
SPEAKER_00: Oh, I remember those days, the endless syntax battles. But those days are officially over.
SPEAKER_01: Right, it is such a huge relief. For those of you listening to Intellectually Curious today, our deep dive is into a VoxDev webinar by Aniket Panjwani. He is an economist and AI director, and we are looking at how autonomous AI agents, specifically Codex and Claude Code, are completely transforming the research workflow.
SPEAKER_00: Yeah, and the mission today is to really show you how to apply these exact tools to supercharge your own learning, you know, to just skip the busywork entirely.
SPEAKER_01: Okay, so let's unpack this. We hear so much about AI speed, like Dartmouth economist Paul Novosad writing a functional paper in just three hours. But I want to know how that is actually happening under the hood.
SPEAKER_00: Well, the critical shift is that these agents are now executing iterative loops. They don't just generate a block of text and then stop; they actually chain reasoning steps together. If an agent writes code to run a regression and hits an error, it reads that error, rewrites the code, and tests it again.
SPEAKER_01: All completely autonomously.
SPEAKER_00: Exactly, completely on its own. And what's fascinating here is that this drastically lowers the cost of human experimentation. When testing a hypothesis takes hours instead of months, you can explore wildly different theories.
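[A minimal sketch of the write-run-read-error-revise loop described above. The `llm_fix` function is a hypothetical stand-in for a real model call; here it just patches a known typo so the loop runs end to end.]

```python
import os
import subprocess
import sys
import tempfile

def run_script(code: str) -> tuple[bool, str]:
    """Execute a candidate script; return (success, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True, timeout=30)
        return result.returncode == 0, result.stderr
    finally:
        os.unlink(path)

def llm_fix(code: str, error: str) -> str:
    """Stand-in for the model call that rewrites code given an error message.
    A real agent would send `code` and `error` to an LLM API; this stub
    just fixes a known typo so the loop is demonstrable."""
    return code.replace("pritn", "print")

def agent_loop(code: str, max_attempts: int = 5) -> str:
    """Write -> run -> read the error -> revise -> test again, no human in the loop."""
    for _ in range(max_attempts):
        ok, err = run_script(code)
        if ok:
            return code
        code = llm_fix(code, err)
    raise RuntimeError("agent failed to converge")

fixed = agent_loop("pritn('regression done')")
```

[The loop terminates either when the script runs cleanly or when the attempt budget is exhausted, which is what bounds the compute an agent can burn on a single task.]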
SPEAKER_01: And it isn't just basic data cleaning either, is it? I mean, Terence Tao, one of the world's preeminent mathematicians, watched an AI solve an open Erdős problem.
SPEAKER_00: Yeah, by stringing together these incredibly complex logical steps. It is wild.
SPEAKER_01: Which brings us to how researchers are actually harnessing this power without it going completely off the rails. Panjwani highlights a mechanism called skills.
SPEAKER_00: Right, skills are huge. Econometrician Pedro Sant'Anna, for instance, built specific skills named "review paper" and "data analysis".
SPEAKER_01: So if I am understanding this, a skill isn't just a clever prompt. It is essentially putting the AI on a specialized set of tracks, right?
SPEAKER_00: Yes, that is a great way to put it. You feed it pre-written scripts and specific R libraries, so it isn't just guessing how to format a table or evaluate an econometric specification.
SPEAKER_01: It is executing your exact methodological recipe.
SPEAKER_00: Precisely. That structural constraint is what makes it so reliable. The review paper skill, for example, is coded to systematically scan a PDF against a known database of common referee objections.
SPEAKER_01: Oh wow.
SPEAKER_00: Yeah, it basically forces the AI to evaluate the robustness of the methodology rather than just spitting out a generic summary of the text.
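[A toy sketch of what a "review paper" skill might look like structurally: a fixed checklist of referee objections the agent must scan the paper against, rather than free-form summarizing. The objection list, keywords, and function name are illustrative assumptions, not Sant'Anna's actual skill.]

```python
# Illustrative "skill": a hard-coded checklist constrains the agent to a
# specific methodological review, instead of letting it improvise a summary.
COMMON_OBJECTIONS = {
    "parallel trends": "Difference-in-differences without a parallel-trends check.",
    "clustered": "Standard errors may need clustering at the treatment level.",
    "robustness": "No robustness or specification checks reported.",
}

def review_paper(paper_text: str) -> list[str]:
    """Flag every objection whose keyword never appears in the paper."""
    text = paper_text.lower()
    return [msg for kw, msg in COMMON_OBJECTIONS.items() if kw not in text]

draft = "We estimate a difference-in-differences model with clustered standard errors."
flags = review_paper(draft)
```

[A production skill would pair checks like these with the agent's actual reading of the PDF, but the point is the same: the checklist, not the model's mood, decides what gets evaluated.]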
SPEAKER_01: Now, building those custom wrappers and integrating them into a daily workflow is usually where people hit a wall. But if you need help uncovering where autonomous agents could make the most impact for your business or personal projects, our sponsor, Embersilk, actually specializes in this.
SPEAKER_00: They really do make it accessible.
SPEAKER_01: Absolutely. You can visit Embersilk.com for AI training, automation, integration, or software development to get these systems up and running. Because once they are running, you definitely need a strategy to manage them.
SPEAKER_00: Oh, for sure. Letting an agent rewrite its own code autonomously introduces real operational risks if you aren't careful.
SPEAKER_01: Right. Let's say I unleash Claude Code on my project directory. How do I stop it from just overwriting my entire life's work with a flawed mathematical assumption?
SPEAKER_00: Well, you instruct the AI to use Git. It acts as the agent's memory and sandbox.
SPEAKER_01: Okay, so it creates a separate branch every time it attempts a new analytical approach.
SPEAKER_00: Exactly. Git provides a rollback. But Git doesn't stop an AI from making a bad assumption early on and then spiraling out of control for 20 automated steps.
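[A minimal sketch of the branch-per-attempt safety pattern, assuming Git is installed. In practice you would simply instruct the agent to run these Git commands itself; the Python wrapper here just makes the flow reproducible.]

```python
import os
import subprocess
import tempfile

def git(repo: str, *args: str) -> str:
    """Run a git command inside `repo` and return its stdout."""
    out = subprocess.run(["git", "-C", repo, *args],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

repo = tempfile.mkdtemp()
git(repo, "init")
git(repo, "config", "user.email", "agent@example.com")
git(repo, "config", "user.name", "agent")

# Baseline commit: the state you can always roll back to.
path = os.path.join(repo, "analysis.py")
with open(path, "w") as f:
    f.write("# original regression spec\n")
git(repo, "add", ".")
git(repo, "commit", "-m", "baseline")
base = git(repo, "rev-parse", "--abbrev-ref", "HEAD")

# Each new analytical approach lives on its own branch...
git(repo, "checkout", "-b", "attempt-1")
with open(path, "w") as f:
    f.write("# agent's risky rewrite\n")
git(repo, "add", ".")
git(repo, "commit", "-m", "attempt 1")

# ...so a flawed attempt never touches the baseline: rollback is one checkout.
git(repo, "checkout", base)
restored = open(path).read()
```

[Every attempt is still recorded on its branch, so nothing the agent tried is lost; the baseline is simply never the thing it experiments on.]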
SPEAKER_01: Yeah. How do you catch that hallucination before it wastes all that compute?
SPEAKER_00: You have to manage the AI's context window. The most common mistake people make is letting the agent brainstorm, debug, and execute all in one giant continuous chat thread.
SPEAKER_01: So the context window just fills up.
SPEAKER_00: Yeah, and the AI gets overwhelmed by its own conversational noise.
SPEAKER_01: The actual instructions get diluted.
SPEAKER_00: Precisely. So the pro tip here is to separate the planning from the execution. Use one session purely to brainstorm and force the AI to output a finalized plan document.
SPEAKER_01: Oh, I see. Then you open a fresh session, hand it that static plan, and just say, execute these steps.
SPEAKER_00: Exactly. It keeps the agent constrained and sharply focused on your approved methodology.
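[A sketch of the plan-then-execute split. The `chat` function is a hypothetical stub for any LLM API call; the essential point it illustrates is that the planning and execution sessions never share a context window, only the static plan file.]

```python
import os
import tempfile

def chat(system: str, user: str) -> str:
    """Hypothetical model call, stubbed so the two-session flow is runnable.
    In practice this would be a real LLM API request."""
    if "planner" in system:
        return "1. Load data\n2. Run regression\n3. Export table"
    return "executed: " + user.splitlines()[0]

plan_path = os.path.join(tempfile.mkdtemp(), "plan.md")

# Session 1: brainstorm freely, but force a finalized plan artifact to disk.
plan = chat(system="You are a planner. Output a numbered plan only.",
            user="How should we test the hypothesis?")
with open(plan_path, "w") as f:
    f.write(plan)

# Session 2: a fresh context that sees only the static plan, not the chatter.
with open(plan_path) as f:
    approved_plan = f.read()
result = chat(system="You are an executor. Follow the approved steps exactly.",
              user=approved_plan)
```

[Because the executor session starts empty, none of the brainstorming noise can dilute the approved methodology it is handed.]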
SPEAKER_01: That completely changes what it means to be a learner. You step into the role of a director rather than a micromanager.
SPEAKER_00: It really is such an optimistic time to dive into these tools.
SPEAKER_01: It truly is. Before I share my final thought on what that means for you, if you enjoyed the show, please subscribe. And hey, leave us a five-star review if you can; it really does help get the word out. Thanks for tuning in.
SPEAKER_00: We really appreciate you listening.
SPEAKER_01: We do. Because here is the real question for you to ponder today: if autonomous agents can reliably iterate on the code, run the regressions, and anticipate the referee objections, what entirely undiscovered fields of human imagination will you unlock when the friction of execution is completely removed?