Slice of AI 3 - Bridging Worlds - Recent advances in Multi-modal AI
A paper presentation and discussion session on multi-modal AI, organized by Stockholm AI and Silo AI. Pier Luigi Dovesi moderated the session, and I was the main speaker. I opened with an introduction to multi-modal models built on the transformer architecture, then took a deep dive into two recent papers:
CogVLM: Visual Expert for Pretrained Language Models (https://lnkd.in/dPaB946G)
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration (https://lnkd.in/dGiGPrmP)
A Q&A session followed the presentation.