Intellectually Curious

Model Collapse and the AI Data Dilemma

Mike Breault

We unpack the looming threat of model collapse: what happens when AI systems train on their own outputs and gradually forget how the real world works. From the early loss of rare, edge-case data to late-stage homogenization, we explore the math, the evidence in today’s LLMs, the debates over data provenance, and practical safeguards such as watermarking and provenance tracking. Tune in for the stakes, the arguments, and what needs to change to keep AI learning from humans as well as machines.


Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.

Sponsored by Embersilk LLC