👤 About Me

I am Weixin Wang, a third-year Ph.D. student in Prof. Pan Xu’s lab at Duke University.

My research centers on Machine Learning, with a focus on developing computationally efficient and data-efficient algorithms that combine strong empirical performance with rigorous theoretical guarantees. My core interests lie in Reinforcement Learning (RL) Theory, particularly Thompson Sampling, Ensemble Sampling, and other Randomized Exploration methods in RL. I also maintain broad interests in Diffusion Models, Large Language Models (LLMs), Robust RL, Artificial Intelligence, and High-Dimensional Statistics.

I am fortunate to collaborate with Yu Yang, Zhishuai Liu (lab members), Ruoxi Cheng, Haoyang Zheng, Wei Deng, Hao-Lun Hsu, and many other excellent researchers.

🔥 News

2025.12: I attended NeurIPS 2025 in San Diego!
2025.8: I attended the Princeton 2025 Machine Learning Theory Summer School!
2025.5: 🎉🎉 Sample Complexity of Distributionally Robust Off-Dynamics Reinforcement Learning with Online Interaction was accepted as a poster at ICML 2025!
2024.12: I attended NeurIPS 2024 in Vancouver!
2024.9: 🎉🎉 Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning was accepted as a poster at NeurIPS 2024!

📝 Publications

Rethinking Langevin Thompson Sampling from A Stochastic Approximation Perspective [Paper]

Weixin Wang^*, Haoyang Zheng^*, Guang Lin, Wei Deng, Pan Xu

NeurIPS 2025 Workshop: Dynamics at the Frontiers of Optimization, Sampling, and Games (DynaFront).

Diffusion Posterior Sampling for Nonlinear Contextual Bandits

Weixin Wang^*, Yu Yang^*, Pan Xu

Under review at ICLR 2026.

Inverse Reinforcement Learning with Dynamic Reward Scaling for LLM Alignment [Paper]

Ruoxi Cheng^*, Haoxuan Ma^*, Weixin Wang^*, Ranjie Duan, Jiexi Liu, Xiaoshuang Jia, Simeng Qin, Xiaochun Cao, Yang Liu, Xiaojun Jia

Under review at ICLR 2026.

Provable Anytime Ensemble Sampling Algorithms in Nonlinear Contextual Bandits [Paper]

Jiazheng Sun^*, Weixin Wang^*, Pan Xu

Under review at ICLR 2026.

Breaking the Total Variance Barrier: Sharp Sample Complexity for Linear Heteroscedastic Bandits with Fixed Action Set

Heyang Zhao^*, Tianyuan Jin^*, Weixin Wang, Vincent Y. F. Tan, Pan Xu, Quanquan Gu

Under review at ICLR 2026.

Upper and Lower Bounds for Distributionally Robust Off-Dynamics Reinforcement Learning [Paper]

Zhishuai Liu^*, Weixin Wang^*, Pan Xu

NeurIPS 2025 Workshop: Reliable ML from Unreliable Data.

Sample Complexity of Distributionally Robust Off-Dynamics Reinforcement Learning with Online Interaction [Paper]

Yiting He^*, Zhishuai Liu^*, Weixin Wang, Pan Xu

In Proc. of the 42nd International Conference on Machine Learning (ICML), Vancouver, Canada, 2025.

Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning [Paper] [Code]

Hao-Lun Hsu^*, Weixin Wang^*, Miroslav Pajic, Pan Xu

In Proc. of the 38th Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, 2024.

📖 Education

2023.08 - present, Ph.D., Department of Electrical and Computer Engineering, Duke University.
2019.09 - 2023.06, B.S., School of the Gifted Young, University of Science and Technology of China.