Personal Homepage

Jiaye Li

I am a master's student at Fudan University, advised by Prof. Siyu Zhu. My interests are in computer vision and generative AI.

My research focuses on visual generative models, particularly diffusion- and flow-based image and video generation, human-centric video generation, and model post-training techniques, including distillation and reinforcement learning.


Research

Interests

Visual Generative Models

Diffusion- and flow-based models for high-quality image and video generation.

Human-Centric Video Generation

Human-focused video generation and data pipelines for realistic, temporally consistent results.

Model Post-Training

Distillation and reinforcement learning for improving generative model efficiency and alignment.

Selected Work

Projects & Publications

First Author

Image Generation / CVPR 2026

Head-wise Adaptive Rotary Positional Encoding for Fine-Grained Image Generation

Jiaye Li*, Baoyou Chen*, Hui Li, Zilong Dong, Jingdong Wang, Siyu Zhu

HARoPE introduces a head-wise adaptive extension of RoPE for transformer-based image generation, improving fine-grained spatial relations, color fidelity, and object counting in class-conditional and text-to-image settings.

Paper Code

Co-First Author

Avatar Generation / ACMMM 2026

Hallo-Live: Real-Time Streaming Joint Audio-Video Avatar Generation

Chunyu Li*, Jiaye Li*, Ruiqiao Mei, Haoyuan Xia, Hao Zhu, Jingdong Wang, Siyu Zhu

Hallo-Live is a real-time text-driven audio-video avatar generation framework that combines asynchronous dual-stream diffusion with human-centric preference-guided distillation for low-latency, synchronized portrait video and speech synthesis.

Paper Code

Co-Author

Human Video / CVPR 2025

OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation

Hui Li*, Mingwang Xu*, Yun Zhan, Shan Mu, Jiaye Li, Kaihui Cheng, Yuxuan Chen, Tan Chen, Mao Ye, Jingdong Wang, Siyu Zhu

OpenHumanVid provides a large-scale, high-quality human-centric video dataset with detailed captions, skeleton sequences, and speech audio to improve human video generation and motion alignment.

Paper Project

Co-Author

Visual Generation / ICLR 2026

Pyramidal Patchification Flow for Visual Generation

Hui Li, Baoyou Chen, Liwei Zhang, Jiaye Li, Jingdong Wang, Siyu Zhu

PPFlow accelerates diffusion and flow-based visual generation by reducing token counts at high-noise timesteps through pyramidal patchification, while preserving generation quality.

Paper Code

Co-Author

Vision-Language / CVPR 2025

Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation

Yuheng Feng*, Changsong Wen*, Zelin Peng, Jiaye Li, Siyu Zhu

LongD-CLIP uses dual-teacher distillation to improve CLIP's long-text representation ability while retaining foundational short-text and zero-shot classification knowledge.

Paper

Background

Education & Experience

2025-2026
Research Intern at the Shanghai Academy of AI for Science.
2024-Present
Master's student at Fudan University.
2020-2024
Bachelor's degree in Artificial Intelligence, Nanjing University of Aeronautics and Astronautics.

Contact

Get in touch

For research collaboration, project questions, or academic discussions, feel free to reach out by email or GitHub.

lijiaye031@gmail.com github.com/Studentxll