Publications

2026

TD-Grokking: Learning from Zero-Reward Problems by Training-Time Decomposition
Ningyuan Xi, Hao Xu, Hongsheng Xin, Ning Miao
03 Jun 2026  ·  arXiv:2606.09883
Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions
Xuan Yang, Hao Xu, Tingfeng Hui, Hongsheng Xin, Kaike Zhang, Chunxiao Liu, Ning Miao
02 Jun 2026  ·  arXiv:2606.03318
STT-Arena: A More Realistic Environment for Tool-Using with Spatio-Temporal Dynamics
Tingfeng Hui, Hao Xu, Pengyu Zhu, Hongsheng Xin, Kun Zhan, Sen Su, Chunxiao Liu, Ning Miao
18 May 2026  ·  arXiv:2605.18548
Verifier-Backed Hard Problem Generation for Mathematical Reasoning
Yuhang Lai, Jiazhan Feng, Yee Whye Teh, Ning Miao
07 May 2026  ·  arXiv:2605.06660
Step-Level Sparse Autoencoder for Reasoning Process Interpretation
Xuan Yang, Jiayu Liu, Yuhang Lai, Hao Xu, Zhenya Huang, Ning Miao
ICML 2026  ·  03 Mar 2026  ·  arXiv:2603.03031
Linear Dynamics in the RLVR Training of Large Language Models
Tianle Wang, Jiayu Liu, Zhongyuan Wu, Shenghao Jin, Wei Chen, Hao Xu, Ning Miao
25 Jan 2026  ·  arXiv:2601.04537

2025

Enhancing Large Language Model Reasoning with Reward Models: An Analytical Survey
Qiyuan Liu, Hao Xu, Xuhong Chen, Wei Chen, Yee Whye Teh, Ning Miao
03 Oct 2025  ·  arXiv:2510.01925
Deep Thinking by Markov Chain of Continuous Thoughts
Jiayu Liu, Zhenya Huang, Xuan Yang, Tianyun Ji, Anya Sims, Hao Xu, Enhong Chen, Yee Whye Teh, Ning Miao
29 Sep 2025  ·  arXiv:2509.25020