CVPR 2025 Tutorial:
Evaluating Large Multi-modal Models: Challenges and Methods
Time: 13:00-17:00
Location: 109
[cvpr25-lmmeval-tutorial-slides]
Overview
The proliferation of large multi-modal models (LMMs) has raised growing concerns about their security and risks, largely stemming from a limited understanding of their capabilities and limitations. This tutorial aims to fill that gap by presenting a holistic overview of LMM evaluation. First, we review recent advances in LMM evaluation from the perspectives of what, where, and how to evaluate. We then present several key challenges in LMM evaluation, such as data contamination and the fixed complexity of static benchmarks, and discuss approaches to overcome them. Our discussion further covers key evaluation dimensions, including trustworthiness, robustness, and fairness, as well as performance across diverse downstream tasks in the natural and social sciences. We conclude with an overview of widely used code libraries and benchmarks that support these evaluation efforts. We hope that academic and industrial researchers will continue working to make LMMs more secure, responsible, and accurate.
Speakers
- UCSB
- UCSB
- UCSB
- UCSB
- AMD
- UCSB
- William & Mary
Schedule

| Title | Speaker | Time |
|---|---|---|
| Background and Challenges: An introduction to large multi-modal models (LMMs) and key evaluation challenges | Kaijie Zhu | 20 min |
| Dynamic Evaluation: Methods and approaches for evaluating LMMs in dynamic contexts | Kaijie Zhu | 40 min |
| Measurement Challenges: Key metrics and methodological issues in LMM evaluation | Sophia Pu | 40 min |
| Safety Issues: Evaluating and addressing security risks in LMMs | Yuzhou Nie | 40 min |
| Evaluation in Social Science: Applications and evaluation methods for LMMs in social sciences | Hao Chen | 40 min |
| Conclusion and Discussion: Summary of key takeaways and Q&A | Hao Chen | 20 min |