Math-Shepherd

This is the talk I gave during the seminar “Process Reward Modeling in LLMs” at the University of Heidelberg. It consisted of a presentation followed by a short academic discussion with classmates and professors, covering the paper, its experiments, and reproduction results. The paper is “Math-Shepherd: Verify and Reinforce LLMs Step-by-Step without Human Annotations” (Wang et al., 2024). You can find the paper and slides here: Annotated Paper (PDF) | Preview Slides (PDF) ...

May 1, 2026 | 2808 words | Author: Tan Ke