Zikang Shan

Email: shanzikang [at] stu.pku.edu [dot] cn

I am a first-year Ph.D. student at Peking University, advised by Prof. Liwei Wang. Currently, I am also a research intern at Microsoft Research Asia. Before that, I received my Bachelor's degree from Peking University. During my undergraduate years, I was honored to be advised by Prof. Liwei Wang and Prof. He Wang.

Please feel free to contact me if you want to discuss or collaborate!

Google Scholar  |  GitHub

Research

I am interested in reinforcement learning, whose potential I believe remains largely underexplored, both as a general learning paradigm and as a behavior optimizer. I focus in particular on applying reinforcement learning to large language model post-training, and I view reinforcement learning as a critical way to scale up models with compute as high-quality data becomes increasingly scarce.

DPO Meets PPO: Reinforced Token Optimization for RLHF
Han Zhong*, Zikang Shan*, Guhao Feng*, Wei Xiong, Xinle Cheng, Li Zhao, Di He, Jiang Bian, Liwei Wang
Under review
Paper  |  Code
Based on theoretical insights, we propose an alignment algorithm that is both sample-efficient and effective.
UniDexGrasp++: Improving Universal Dexterous Grasping via Geometry-aware Curriculum Learning and Iterative Generalist-Specialist Learning
Weikang Wan*, Haoran Geng*, Yun Liu, Zikang Shan, Li Yi, Yaodong Yang, and He Wang
ICCV, 2023, Oral presentation with all top rankings, best paper finalist
Paper  |  Website  |  Code
We improve our previous method, making it object-agnostic and much more effective.
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy
Yinzhen Xu*, Weikang Wan*, Jialiang Zhang*, Haoran Liu*, Zikang Shan, Hao Shen, Ruicheng Wang, Haoran Geng, Yijia Weng, Jiayi Chen, Tengyu Liu, Li Yi, and He Wang
CVPR, 2023
Paper  |  Website  |  Code
We propose a method for learning dexterous grasping policies that can handle diverse objects based on realistic observations.

This website is adapted from Jon Barron's website and deployed on GitHub Pages. Last updated: Apr. 14, 2025