Seminar: Multimodal Commonsense Reasoning

Abstract: In my previous work, I have focused on enabling AI models to achieve human-level commonsense reasoning through two complementary avenues. The first avenue enhances reasoning capabilities by extracting and integrating fine-grained, multimodal knowledge, emphasizing the acquisition of contextual information and its incorporation into complex reasoning processes. The second avenue addresses model reliability from three perspectives: prediction consistency, transparent (or explainable) reasoning steps, and faithful performance in biased or ambiguous scenarios. By leveraging such detailed, multimodal knowledge, AI models can improve their reasoning, robustness, and interpretability, thereby strengthening human trust and understanding in human–AI interactions. Building on these foundations, my future research will continue to advance human-centered AI, exploring areas such as real-world learning, interactive learning with humans, agent-based learning, embodied learning, AI for science, AI for social good, and beyond.
Bio: Zhecan (James) Wang is a Postdoctoral Research Fellow in Computer Science at UCLA, where he works under the guidance of Prof. Kai-Wei Chang (張凱崴, Amazon Scholar, Sloan Fellow) and Prof. Nanyun Peng. He earned his Ph.D. in Computer Science from Columbia University (2019–2024), mentored by Prof. Shih-Fu Chang (張世富, Dean of the Engineering School, Member of the National Academy of Engineering). His research focuses on Multimodal Learning, Vision-Language Understanding, Commonsense Reasoning, and Human-Centered AI, with applications extending to applied science, healthcare, and beyond.