Digital Environment Research Institute (DERI)

QM Centre for Multimodal AI: Research seminar by Dr Liang Zheng: Pairs, Data Pairs!

When: Tuesday, September 3, 2024, 3:00 PM - 4:00 PM
Where: Bancroft building, David Sizer Lecture Theatre, Mile End

Speaker: Dr Liang Zheng, Australian National University

Abstract: Training AI models with image pairs has been studied for a long time and has proven very useful. In this talk, Dr Zheng will first revisit popular practices of using data pairs in various computer vision tasks, from face recognition and person re-identification to contrastive learning in foundation models. He will then discuss human preference data: given a pair of images, people generally prefer one over the other. This type of data pair can be used to align diffusion models with human preference, so that the models are more likely to generate images that people like. He will describe how he and his collaborators address this problem by aligning human preference at different denoising steps. The method effectively improves Stable Diffusion (SD) and SDXL models while accelerating the fine-tuning process by a factor of 10 compared with existing methods.

About the Speaker: Dr Liang Zheng is an Associate Professor (tenured) at the Australian National University, specialising in computer vision and machine learning. He obtained his Bachelor's (2010) and PhD (2015) degrees from Tsinghua University. He is best known for his contributions to the field of object re-identification through useful datasets and algorithms, including Market-1501 (ICCV 2015) and the part-based convolutional baseline (ECCV 2018). He also developed several widely used methods in image classification and multi-object tracking, such as random erasing (AAAI 2020) and joint detection and embedding (ECCV 2020). He regularly serves as an area chair for conferences such as CVPR and NeurIPS, and co-organises the AI CITY Challenges and the Vision Datasets Understanding workshops. He is a program chair for ACM Multimedia 2024.
