[Lecture] Academic Lecture 2024/05/16
Posted by: Sha Xiaoyan (沙晓燕)  Posted on: 2024-03-05
Time: May 16, 2024, 16:00-17:00
Venue: Conference Room 316, 财科馆
Speaker: Professor Xu Jie (徐洁)
Jennifer J. Xu is a Professor of Computer Information Systems at Bentley University. She currently serves as a Senior Editor of the Journal of the Association for Information Systems (JAIS) and as an Associate Editor of Decision Support Systems (DSS) and Communications of the AIS (CAIS).
Topic: Do You Get It? Exploring LLMs’ Potential for Multimodal Discourse Understanding
Abstract: This research-in-progress explores large language models' ability to understand Internet memes as an indicator of their potential for multimodal data analysis and discourse interpretation. At the current stage, we have evaluated four multimodal LLMs (Claude 3, Gemini, GPT-4, and LLaVA-1.5) and compared their performance in recognizing objects and text in memes, generating coherent descriptions of meme content, and revealing the punchlines. We found that OpenAI’s GPT-4 outperformed all other models, demonstrating an excellent ability to understand memes by drawing on several types of information, including the visual and textual cues in the memes, metaphors, and contextual knowledge. Claude 3 and Gemini performed somewhat worse than GPT-4, and LLaVA lagged far behind. The findings of this work open several avenues for future research on discourse understanding and analysis.