SDPO: LLM Self-Distillation with Rich Feedback - Detailed Analysis & Overview

SDPO: LLM Self-Distillation with Rich Feedback

In this AI Research Roundup episode, Alex discusses the paper: 'Reinforcement Learning via

Reinforcement Learning via Self-Distillation: SDPO for Rich Feedback in LLM Reinforcement Learning

In this video, we break down the key ideas from the paper Reinforcement Learning via

Why Self-Distillation Is Taking Over LLM Post-Training (w/ the Researchers Behind It)

In this video, we sit down with Jonas Hübotter (ETH Zurich) and Idan Shenfeld (MIT) to break down

Self-Distillation as a New Framework for Continual Learning | Idan Shenfeld | Random Samples

Self

Predict LLM Self-Distillation Before Training

In this AI Research Roundup episode, Alex discusses the paper: 'A Predictive Law for On-Policy

SDPO: Reinforcement Learning via Self-Distillation (Hübotter et al.)

This video provides an overview of

Reinforcement Learning via Self-Distillation: Solving the Credit Assignment Problem

Can AI learn more from a "Why" than a "No"? Explore how

[Paper of the Day] SDPO: Reinforcement Learning via Self-Distillation

What if AI could learn from its mistakes the same way humans do?

RL via Self-Distillation (SDPO) Paper Club 12 Feb 2026

Latent Space Paper Club with Johan Duramy and swyx - 12 Feb 2026 Ted Kyi presents a deep dive into "Reinforcement Learning ...

SDAR: Gated Self-Distillation for LLM Agents

In this AI Research Roundup episode, Alex discusses the paper: '

Reinforcement Learning via Self-Distillation

This week we

SPD: Boosting LLMs via Self-Distillation

In this AI Research Roundup episode, Alex discusses the paper: '

How AI Learns to Critique Its Own Failures

Can AI learn more from a "Why" than a "No"? Explore how