Dong Chen
陈栋
Principal Research Manager
Visual Computing Group, Microsoft Research Asia
About Me
I joined the Visual Computing Group at Microsoft Research Asia in July 2015. Before that, I received my B.S. and Ph.D. degrees from the University of Science and Technology of China in 2010 and 2015, respectively. I interned at Microsoft Research Asia from 2010 to 2015 and received the Microsoft Research Asia Fellowship Award in 2013.
My research focuses on computer vision and machine learning for visual understanding and generation. My research interests include:
Denoising Diffusion Probabilistic Models (DDPM)
Generative Adversarial Networks (GAN)
Self-/Semi-/Unsupervised Learning
Face Avatar
Computer Vision
Deep Learning
Recent News
- 2025: 4 papers accepted by CVPR'25 (Structured 3D Latents, DesignDiffusion, SmartEraser, ART)
- 2025: 2 papers accepted by ICCV'25 (Gaussian Variation Field Diffusion, Improved Noise Schedule)
- 2025: SinDiffusion published in IEEE TPAMI; Phi-4-Mini Technical Report released
- 2024: 8 papers accepted by top-tier conferences (CVPR'24, ECCV'24, NeurIPS'24, ICML'24)
- 2024: Co-authored Phi-3 Technical Report on highly capable language models
- 2024: Released InstructDiffusion, a generalist modeling interface for vision tasks
- 2023: 5 papers accepted by CVPR'23
- 2023: Served as Area Chair for CVPR'23
- 2022: 2 papers accepted by ECCV'22
Selected Publications
2025
Structured 3D Latents for Scalable and Versatile 3D Generation
Computer Vision and Pattern Recognition (CVPR), 2025
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2025
SmartEraser: Remove Anything from Images using Masked-Region Guidance
Computer Vision and Pattern Recognition (CVPR), 2025
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Gaussian Variation Field Diffusion for High-Fidelity Video-to-4D Synthesis
International Conference on Computer Vision (ICCV), 2025
Improved Noise Schedule for Diffusion Training
International Conference on Computer Vision (ICCV), 2025
SinDiffusion: Learning a Diffusion Model from a Single Natural Image
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
HairShifter: Consistent and High-Fidelity Video Hair Transfer via Anchor-Guided Animation
ACM International Conference on Multimedia (ACM MM), 2025
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
arXiv preprint, 2025
VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder
Graphical Models, 2025
High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior
IEEE International Conference on Robotics and Automation (ICRA), 2025
2024
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Computer Vision and Pattern Recognition (CVPR), 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
arXiv preprint, 2024
GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling
Advances in Neural Information Processing Systems (NeurIPS), 2024
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%
Advances in Neural Information Processing Systems (NeurIPS), 2024
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
European Conference on Computer Vision (ECCV), 2024
IRGen: Generative Modeling for Image Retrieval
European Conference on Computer Vision (ECCV), 2024
2023
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Computer Vision and Pattern Recognition (CVPR), 2023
RODIN: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion
Computer Vision and Pattern Recognition (CVPR), 2023
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Computer Vision and Pattern Recognition (CVPR), 2023
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2023
CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning
Computer Vision and Pattern Recognition (CVPR), 2023
2022
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Computer Vision and Pattern Recognition (CVPR), 2022
StyleSwin: Transformer-based GAN for High-resolution Image Generation
Computer Vision and Pattern Recognition (CVPR), 2022
Vector Quantized Diffusion Model for Text-to-Image Synthesis
Computer Vision and Pattern Recognition (CVPR), 2022
Selected Earlier Work
Cross-domain Correspondence Learning for Exemplar-based Image Translation
Computer Vision and Pattern Recognition (CVPR Oral), 2020
An Efficient Joint Formulation for Bayesian Face Verification
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2016
Bayesian Face Revisited: A Joint Formulation
European Conference on Computer Vision (ECCV), 2012
Awards & Honors
Microsoft Research Asia Fellowship, 2013