Publications
2026
Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions
arXiv
·
02 Jun 2026
·
arXiv:2606.03318
STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics
arXiv
·
18 May 2026
·
arXiv:2605.18548
Verifier-Backed Hard Problem Generation for Mathematical Reasoning
arXiv
·
07 May 2026
·
arXiv:2605.06660
Step-Level Sparse Autoencoder for Reasoning Process Interpretation
arXiv
·
03 Mar 2026
·
arXiv:2603.03031
Linear Dynamics in the RLVR Training of Large Language Models
arXiv
·
25 Jan 2026
·
arXiv:2601.04537
2025
Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
arXiv
·
03 Oct 2025
·
arXiv:2510.01925
Deep Thinking by Markov Chain of Continuous Thoughts
arXiv
·
29 Sep 2025
·
arXiv:2509.25020