Zijun Gao

Hello! I am Zijun Gao. I received my B.S. in Mathematics and Computer Science from the University of Illinois Urbana-Champaign in December 2025. My work sits at the intersection of LLM reasoning, reinforcement learning (RL), and AI agents.

I have conducted research at Northwestern University's MLL Lab, where I worked on agentic RL with Prof. Manling Li, and at Arizona State University's ARC Lab, where I worked with Prof. Ben Zhou on improving mathematical reasoning in large language models.

News

[Jan 2026] CORE paper accepted at ICLR 2026!
[Mar 2025] SimWorld accepted at CVPR 2025 Demo Track!

Research Interests

My research studies how reinforcement learning can improve the reasoning capabilities of large language models, with an emphasis on long-horizon decision making, interaction with real-world environments, scalable environment generation, and generalization to unseen tasks.

My current interests include:

Agentic RL: Reinforcement learning for language and multimodal agents, with a focus on tool use, long-context management, credit assignment, and multi-turn interaction.
Reasoning RL: Reinforcement learning for reasoning-intensive tasks and structured decision-making processes.
Environments & Sandboxes: Building scalable environments, benchmarks, and sandboxes for agent training and evaluation.

Academic Research

Northwestern University – MLL Lab May 2025 – Jan 2026

Multi-Agent Collaborative Training: Developed the MAGEN framework, a multi-turn multi-agent reinforcement learning pipeline for collaborative training of LLMs and VLMs.
Outcome: Co-First Author, Project in Progress. Supervised by Prof. Manling Li.

Arizona State University – ARC Lab Feb 2025 – Dec 2025

Concept-Oriented Reinforcement Learning (CORE): Proposed the CORE framework to bridge concept definitions and mathematical reasoning through reinforcement learning, achieving consistent improvements on both in-domain and out-of-domain mathematical benchmarks.
Outcome: First Author, ICLR 2026. Supervised by Prof. Ben Zhou. [Paper] [Code]

University of California, San Diego Jul 2024 – Jan 2025

Photorealistic World Simulator (SimWorld): Built photorealistic 3D environments for multi-agent interaction. Developed automated asset pipelines using UnrealCV and Blender, and contributed to large-scale dataset generation.
Outcome: CVPR 2025 Demo Track. Supervised by Prof. Zhiting Hu and Prof. Lianhui Qin.

Industry Experience

Meituan – LongCat Team May 2026 – Present

Baidu – ERNIE Team Apr 2026 – May 2026

Education

University of Illinois Urbana-Champaign (UIUC) Jan 2024 – Dec 2025
B.S. in Mathematics and Computer Science, Highest Distinction

Beijing Jiaotong University (BJTU) Aug 2021 – Dec 2023
B.E. in Computer Science and Technology, Top 5% (Transferred)