CVPR 2025 Tutorial:
Evaluating Large Multi-modal Models: Challenges and Methods
Time: 13:00-17:00
Location: 109
[cvpr25-lmmeval-tutorial-slides]
Overview
The proliferation of large multi-modal models (LMMs) has raised growing concerns about their security and risks, largely stemming from a limited understanding of their capabilities and limitations. This tutorial aims to fill that gap by presenting a holistic overview of LMM evaluation. First, we review recent advances in LMM evaluation from the perspectives of what, where, and how to evaluate. We then present several key challenges in LMM evaluation, such as data contamination and the fixed complexity of static benchmarks, and discuss approaches to overcome them. Our discussion further covers key evaluation dimensions, including trustworthiness, robustness, and fairness, as well as performance across diverse downstream tasks in the natural and social sciences. We conclude with an overview of widely used code libraries and benchmarks that support these evaluation efforts. We hope that academic and industrial researchers will continue working to make LMMs more secure, responsible, and accurate.
Speakers
- UCSB
- UCSB
- UCSB
- UCSB
- AMD
- UCSB
- William & Mary
Schedule

| Title | Speaker | Time |
|---|---|---|
| Background and Challenges: An introduction to large multi-modal models (LMMs) and key evaluation challenges | Kaijie Zhu | 20 min |
| Dynamic Evaluation: Methods and approaches for evaluating LMMs in dynamic contexts | Kaijie Zhu | 40 min |
| Measurement Challenges: Key metrics and methodological issues in LMM evaluation | Sophia Pu | 40 min |
| Safety Issues: Evaluating and addressing security risks in LMMs | Yuzhou Nie | 40 min |
| Evaluation in Social Science: Applications and evaluation methods for LMMs in social sciences | Hao Chen | 40 min |
| Conclusion and Discussion: Summary of key takeaways and Q&A | Hao Chen | 20 min |