Math-Shepherd

Fri, 01 May 2026 10:11:25 +0000

This is the talk I gave at the seminar “Process Reward Modeling in LLMs” at the University of Heidelberg.

It consisted of a presentation followed by a short academic discussion with classmates and professors, covering the paper, its experiments, and reproduction results.

The paper is “Math-Shepherd: Verify and Reinforce LLMs Step-by-Step without Human Annotations” (Wang et al., 2024).

You can find the paper and slides here:

Annotated Paper (PDF) | Preview Slides (PDF)

Verification on Tan Ke
