
Welcome to The 1st Workshop on Efficiency, Security, and Generalization of Multimedia Foundation Models at ACM Multimedia 2024!


Schedule

Location: Meeting Room 216

Time Topic Speakers/Details
14:00 Start and Welcome  
14:00-14:30 Invited Talk Dr Piotr Koniusz (Data61, CSIRO), “Adversarial Robustness: From Distillation Across Teacher-Student to Few-shot Foundation Models”
14:30-15:00 Invited Talk Prof Jiebo Luo (University of Rochester), “Improving Alignment in T2I and T2V Generation”
15:00-15:30 Invited Talk Prof Phoebe Chen (La Trobe University), “Multimedia Discovery for Biomedical Applications”
15:30-16:00 Afternoon Tea  
16:00-16:30 Industry Talk Dr Luoqi Liu (MT Lab, Meitu), “Driving R&D with User Needs: Putting MiracleVision to Work”
16:30-16:45 Paper Presentation Object-Driven Human Motion Generation from Images
16:45-17:00 Paper Presentation Rethinking the Role-Play Prompting in Mathematical Reasoning Tasks
17:00-17:15 Paper Presentation ITCD: Image to Text Translation for Classification by Diffusion Models
17:15 End  

Important Dates

Submission Open May 11, 2024 (AoE)
Submission Deadline July 19, 2024, extended to July 29, 2024 (AoE)
Decision Notification August 5, 2024 (AoE)
Camera-Ready Deadline August 19, 2024 (AoE)
Workshop Date October 28, 2024 (PM)

Invited Speakers

Piotr Koniusz
Data61, CSIRO
Jiebo Luo
University of Rochester
Phoebe Chen
La Trobe University
Luoqi Liu
MT Lab, Meitu

Organizers

Daochang Liu
The University of Sydney
Minjing Dong
City University of Hong Kong
Yasmeen George
Monash University
Chang Xu
The University of Sydney

Sponsors

Meitu

Contact

Contact the organizers at mm2024-esgmfm@googlegroups.com

Call for Papers

The rapid progress in foundation models has enhanced the capabilities of multimedia models across a broad spectrum of tasks. Despite their exceptional performance, deploying these models in practical settings raises several concerns, particularly regarding efficiency, security, and generalization. As the utility of foundation models in multimedia topics becomes increasingly evident, addressing these issues is crucial. This workshop focuses on these critical aspects of foundation models. Its scope covers foundation models across a wide range of domains, such as vision, language, and speech, with an emphasis on multimedia tasks and multi-modality methods.

We therefore solicit original research papers on topics including, but not limited to, the following:

Efficiency

Security

Generalization

Submission Deadline: July 19, 2024, extended to July 29, 2024

Submission Platform: OpenReview

We welcome submissions of research papers, demos, datasets, and position papers within the workshop’s scope. The submission guidelines follow those of the main ACM Multimedia 2024 conference, including the formatting guidelines and submission policies. The review process for this workshop will be double-blind. Submissions may be up to 4 pages long in ACM-MM format, plus up to 1 additional page for references.

All papers will be peer-reviewed by at least three experts in the field, who will assess relevance to the workshop, scientific novelty, and technical quality. Accepted submissions will be presented in oral or poster sessions. All accepted papers will be published in the ACM Multimedia proceedings in the ACM Digital Library.

Submit your manuscripts through OpenReview. All authors need to create a profile on OpenReview. New profiles created without an institutional email address go through a moderation process that can take up to two weeks. New profiles created with an institutional email address are activated automatically.