
Research Center for Computer Vision and Robotics

The Research Center for Computer Vision and Robotics works on fundamental problems in computer vision and robotics, including, but not limited to, large-scale visual representation learning, zero-shot and few-shot object recognition, reinforcement learning, and intelligent control. It strives to empower next-generation manufacturing with research innovations in vision and robotics.

Dr. Lei Zhang is currently the Chair Scientist of the Research Center for Computer Vision and Robotics. Prior to this, he was a Principal Researcher and Research Manager at Microsoft, where he had worked since 2001 in Microsoft Research Asia (MSRA), Microsoft Research (MSR, Redmond), and other computer vision-related product teams. He has led research teams for years, conducting research on computer vision with applications in large-scale image analysis, object detection, and vision-language understanding. His research has had practical impact on Bing Multimedia Search and Microsoft Cognitive Services. Dr. Zhang has published more than 150 papers in top conferences and journals and holds more than 60 granted US patents. He was named an IEEE Fellow for his contributions to large-scale visual recognition and multimedia information retrieval.

The Research Center for Computer Vision and Robotics aims to advance the state of the art in vision and robotics. Our primary research interest is visual representation learning that leverages very large-scale multimodal data, together with the study of how to add structure and knowledge to improve the robustness, interpretability, and generalization ability of the learned representations. We are also interested in the active vision and reinforcement learning required for robotics, striving to develop more robust and explainable technologies that empower next-generation manufacturing.

Research Directions

Large-scale machine learning platform: We will focus on optimization problems for deep learning at both the system and algorithm levels, including data parallelism and model parallelism, numerical analysis, and more advanced optimization algorithms in a distributed environment, aiming to improve the efficiency of large-scale training.
Visual representation learning: We will pursue visual representation learning that leverages very large-scale multimodal data, while studying how to add structure and knowledge to improve the robustness, interpretability, and generalization ability of the learned representations.
Core vision problems: We will study core vision problems, including large-scale image classification, object detection, segmentation and tracking, 3D scene understanding, vision-language understanding, and zero-shot and few-shot object recognition, as well as model optimization and transfer learning in real applications.
New generation of robotic technologies: We will study the active vision, reinforcement learning, and intelligent control required for robotics, striving to develop more robust and explainable technologies that empower next-generation manufacturing.
