Zhouliang Yu (郁昼亮)

PhD Student, Chinese University of Hong Kong

Large Language Models & Agentic Reasoning
Reinforcement Learning & Formal Mathematics
Embodied AI & World Models

Email: zhouliangyu at link dot cuhk dot edu dot cn



I'm Zhouliang Yu (郁昼亮), a PhD student in the Scalable Principles for Learning and Reasoning Lab (SphereLab) at the Chinese University of Hong Kong, Department of Computer Science & Engineering, advised by Prof. Weiyang Liu. My research focuses on large language models, deep learning, reinforcement learning, and formal reasoning.

My primary research (2024–2027) centers on exploration-based reinforcement learning for formal mathematical reasoning with agentic large language models. I am also actively learning RL infrastructure to support large-model training.

Beyond my core focus, I am interested in applying reinforcement learning to model-based embodied AI and to scientific discovery through formal verification, in the spirit of projects such as Scientist AI and PhysLean. I have not yet published in these areas.

Previously, I pursued doctoral studies at HKUST under Academician Yike Guo. I have also conducted research at Alibaba's Tongyi Lab. Earlier, I received my bachelor's degree in computer science from CUHK-SZ.

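Zhouliang Yu (郁昼亮) is a PhD student in the Department of Computer Science and Engineering at the Chinese University of Hong Kong, advised by Prof. Weiyang Liu (刘威杨) at SphereLab. During my PhD, I mainly study training algorithms for large models driven by scalable principles of learning and reasoning, aimed at formal reasoning and broader scientific discovery, with the goal of improving exploratory problem solving and discovery. At present, my work focuses on training algorithms for large models in formal mathematical reasoning. Previously, I pursued a PhD at the Hong Kong University of Science and Technology under Academician Yike Guo (郭毅可). I also conducted research at Alibaba's Tongyi Lab, and before that I received my bachelor's degree in computer science from the Chinese University of Hong Kong, Shenzhen.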

Research summary

Most of my research is about reinforcement learning, large language models, AI4Math, and embodied AI. Some papers are highlighted.

Publications

Mathematical Reasoning & AI4Math

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
Zhouliang Yu, Ruotian Peng, Keyi Ding, Yizhe Li, Zhongyuan Peng, Minghao Liu, Yifan Zhang, Zheng Yuan, Huajian Xin, Wenhao Huang, Yandong Wen, Ge Zhang, Weiyang Liu
arXiv preprint arXiv:2505.02735, 2025
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning
Haiming Wang, Mert Unsal, Xiaohan Lin, Mantas Baksys, Junqi Liu, Zhouliang Yu, et al.
arXiv preprint arXiv:2504.11354, 2025
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
Zhongyuan Peng, Yifan Yao, Kaijing Ma, Shuyue Guo, Yizhe Li, Yichi Zhang, Chenchen Zhang, Yifan Zhang, Zhouliang Yu, et al.
arXiv preprint arXiv:2507.06181, 2025

Large Language Models

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chow Leuang Yu (Core Contributor, listed under my Cantonese name), et al.
Technical Report, 2024
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Xinwei Du, Zhouliang Yu (Co-First Author), Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng, Xinchen Luo, Guorui Zhou, Wenhu Chen, Ge Zhang
Conference on Language Modeling, 2024
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models
Jiawei Guo, Ziming Li, Xueling Liu, Kaijing Ma, Tianyu Zheng, Zhouliang Yu, Dawei Pan, Yizhi Li, Ruibo Liu, Yue Wang, Shuyue Guo, et al.
arXiv preprint arXiv:2404.03543, 2024

Reinforcement Learning

ASP: Learn a Universal Neural Solver!
Chenguang Wang, Zhouliang Yu, Stephen McAleer, Tianshu Yu, Yaodong Yang
IEEE Transactions on Pattern Analysis and Machine Intelligence, 46 (6), 4102-4114, 2024
Generating Symbolic World Models via Test-time Scaling of Large Language Models
Zhouliang Yu, Yuhuan Yuan, Tim Z. Xiao, Fuxiang Frank Xia, Jie Fu, Ge Zhang, Ge Lin, Weiyang Liu
Transactions on Machine Learning Research, 2025

Embodied AI & Robotics

ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots
Zhixuan Xu, Chongkai Gao, Zixuan Liu, Gang Yang, Chenrui Tie, Haozhuo Zheng, Haoyu Zhou, Weikun Peng, Debang Wang, Tianrun Hu, Tianyi Chen, Zhouliang Yu, Lin Shao
International Conference on Intelligent Robots and Systems (Oral), 2024
MultiReAct: Multimodal Tools Augmented Reasoning-Acting Traces for Embodied Agent Planning
Zhouliang Yu, Jie Fu, Yue Mu, Chenguang Wang, Lin Shao, Yaodong Yang
Robot Learning Workshop at NeurIPS 2023 (Oral), 2023