Projects

Here are some projects I have worked on.

Unary Feedback (UFO)

June 2025

A reinforcement learning framework that improves multi-turn reasoning in LLMs using only minimal feedback such as "Let's try again." It increases answer diversity and strengthens error correction without dense supervision.


RAGEN

May 2025

A research framework for training LLM agents in stochastic, multi-turn question-answering environments. It gives agents trajectory-level feedback, so they learn to adapt their behavior over time rather than relying on isolated responses.
