About me
I am currently a senior researcher at Tencent. I obtained my Ph.D. (2021) and B.S. (2016) in Computer Science at Tsinghua University, advised by Guangwen Yang.
Previously, I was a visiting researcher at Stanford, where I was fortunate to be advised by Tengyu Ma. Before that, I spent three years as an intern at Microsoft Research, working closely with Lintao Zhang, Tao Qin and Li Zhao.
I am currently interested in research topics related to multimodal reasoning, agents, reinforcement learning, causal inference, and their commercial applications. Email: lzcthu12[at]gmail.com
Publications & Preprints
($^*$ indicates equal contribution, $^{\dagger}$ indicates project lead)
Multimodal Reasoning & Agents
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience
Zichuan Lin*$^{\dagger}$, Feiyu Liu*, Yijun Yang*, Jiafei Lyu*, Yiming Gao*, Yicheng Liu*, Zhicong Lu, Yangbin Yu, Mingyu Yang, Junyou Li, Deheng Ye, Jie Jiang
Technical Report, 2026 [Code] [Huggingface] (#2 Daily Paper)
AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition
Zichuan Lin*$^{\dagger}$, Yicheng Liu*, Yang Yang, Lvfang Tao, Deheng Ye
CVPR 2026 [Code] [Project Page]
ProAct: Agentic Lookahead in Interactive Environments
Yangbin Yu, Mingyu Yang, Junyou Li, Yiming Gao, Feiyu Liu, Yijun Yang, Zichuan Lin, Jiafei Lyu, Yicheng Liu, Zhicong Lu, Deheng Ye, Jie Jiang
Technical Report, 2026 [Code]
CausalMACE: Causality Empowered Multi-Agents in Minecraft Cooperative Tasks
Qi Chai, Zhang Zheng, Junlong Ren, Deheng Ye, Zichuan Lin, Hao Wang
EMNLP 2025
SeeNav-Agent: Enhancing Vision-Language Navigation with Visual Prompt and Step-Level Policy Optimization
Zhengcheng Wang*, Zichuan Lin*, Yijun Yang, Haobo Fu, Deheng Ye
Preprint, 2025 [Code]
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
Kai Yang, Xin Xu, Yangkun Chen, Weijie Liu, Jiafei Lyu, Zichuan Lin, Deheng Ye, Saiyong Yang
Preprint, 2025 [Code]
Reinforcement Learning
Multi-agent In-context Coordination via Decentralized Memory Retrieval
Tao Jiang, Zichuan Lin$^{\dagger}$, Lihe Li, Yi-Chen Li, Cong Guan, Lei Yuan, Zongzhang Zhang, Yang Yu, Deheng Ye
AAAI 2026 (Oral) [Code]
Revisiting Discrete Soft Actor-Critic
Haibin Zhou, Tong Wei, Zichuan Lin, Junyou Li, Deheng Ye, Qiang Fu, Wei Yang
TMLR 2024
CurrMask: Learning Versatile Skills with Automatic Masking Curricula
Zhihui Xie, Yao Tang, Zichuan Lin, Deheng Ye, Shuai Li
NeurIPS 2024
Replay-enhanced Continual Reinforcement Learning
Tiantian Zhang, Kevin Z. Shen, Zichuan Lin, Bo Yuan, Xueqian Wang, Xiu Li, Deheng Ye
TMLR 2023
A Survey on Transformers in Reinforcement Learning
Wenzhe Li*, Hao Luo*, Zichuan Lin*, Chongjie Zhang, Zongqing Lu, Deheng Ye
TMLR 2023
Future-conditioned Unsupervised Pretraining for Decision Transformer
Zhihui Xie, Zichuan Lin, Deheng Ye, Qiang Fu, Wei Yang, Shuai Li
ICML 2023
Dynamics-Adaptive Continual Reinforcement Learning via Progressive Contextualization
Tiantian Zhang, Zichuan Lin, Yuxing Wang, Deheng Ye, Qiang Fu, Wei Yang, Xueqian Wang, Bin Liang, Bo Yuan, Xiu Li
TNNLS 2023
JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning
Zichuan Lin*, Junyou Li*, Jianing Shi*, Deheng Ye, Qiang Fu, Wei Yang
IJCAI 2022 (Long Oral, top 3%) (The champion solution of NeurIPS 2021 MineRL research competition)
Model-based Adversarial Meta-Reinforcement Learning [code]
Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma
NeurIPS 2020
RD$^2$: Reward Decomposition with Representation Decomposition
Zichuan Lin*, Derek Yang*, Li Zhao, Tao Qin, Guangwen Yang, Tie-yan Liu
NeurIPS 2020
Episodic Reinforcement Learning with Associative Memory
Guangxiang Zhu*, Zichuan Lin*, Guangwen Yang, and Chongjie Zhang
ICLR 2020
Object-Oriented Dynamics Learning through Multi-Level Abstraction
Guangxiang Zhu*, Jianhao Wang*, Zhizhou Ren*, Zichuan Lin and Chongjie Zhang
AAAI 2020
Distributional Reward Decomposition for Reinforcement Learning
Zichuan Lin, Li Zhao, Derek Yang, Tao Qin, Guangwen Yang, and Tie-yan Liu
NeurIPS 2019
Fully Parameterized Quantile Function for Distributional Reinforcement Learning [code]
Derek Yang, Li Zhao, Zichuan Lin, Jiang Bian, Tao Qin, and Tie-yan Liu
NeurIPS 2019
Episodic Memory Deep Q-Networks [code]
Zichuan Lin, Tianqi Zhao, Guangwen Yang, and Lintao Zhang
IJCAI 2018
Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization
Zichuan Lin, Xiapeng Wu, Mingfei Sun, Deheng Ye, Qiang Fu, Wei Yang, Wei Liu
arXiv:2302.02299, 2023
Pretraining in Deep Reinforcement Learning: A Survey
Zhihui Xie, Zichuan Lin, Junyou Li, Shuai Li, Deheng Ye
arXiv:2211.03959, 2022
Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System
Zichuan Lin, Jing Huang, Bowen Zhou, Xiaodong He, Tengyu Ma
arXiv:2106.04835, 2021
Causal Inference
- PIPCFR: Pseudo-outcome Imputation with Post-treatment Variables for Individual Treatment Effect Estimation
Zichuan Lin*, Xiaokai Huang*, Jiate Liu, Yuxuan Han, Jia Chen, Xiapeng Wu, Deheng Ye
arXiv:2512.18737, 2025 [Code]
Awards & Honors
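- Beijing Outstanding Ph.D. Student, 2021.
- National Scholarship, Tsinghua University (Top 1%), 2018 & 2020.
- Tsinghua University Person of the Year nominee (20 per year), 2019.
- Interviewed by Tsinghua's WeChat public account, 2018.
- Tsinghua University Top Ten Outstanding Athletes (10 per year), 2017.
- Outstanding Undergraduate Thesis, Department of Computer Science, Tsinghua University, 2016.
- Four consecutive men's singles titles, Capital Universities Table Tennis Competition, 2013-2016.
- Third place in men's doubles, National Secondary School Students Table Tennis Championships, 2012.
- National First-Class Athlete in table tennis, 2006 & 2012.
- Team champion, World Secondary School Students Table Tennis Championships, 2011.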
