ICML 2024 Workshop

Trustworthy Multi-modal Foundation Models and AI Agents (TiFA)

ICML 2024 @ Vienna, Austria, Jul 27 Sat
The final decisions on paper acceptance have been released. Please check your OpenReview account.


Description/Call for Papers

Advanced Multi-modal Foundation Models (MFMs) and AI Agents, equipped with diverse modalities [1, 2, 3, 4, 15] and an increasing number of available affordances [5, 6] (e.g., tool use, code interpreters, API access), have the potential to accelerate and amplify their predecessors’ impact on society [7].
MFMs include multi-modal large language models (MLLMs) and multi-modal generative models (MMGMs). MLLMs refer to LLM-based models that can receive, reason over, and output information in multiple modalities, including but not limited to text, images, audio, and video; examples include LLaVA [1], Reka [8], Qwen-VL [9], and LAMM [36]. MMGMs refer to a class of MFMs that can generate new content across multiple modalities, such as generating images from text descriptions or creating videos from audio and text inputs; examples include Stable Diffusion [2], Sora [10], and Latte [11]. AI agents, or systems with a higher degree of agenticness, refer to systems that can achieve complex goals in complex environments with limited direct supervision [12]. Understanding and preempting the vulnerabilities of these systems [13, 35] and the harms they may induce [14] has become unprecedentedly crucial.
Building trustworthy MFMs and AI Agents goes beyond the adversarial robustness of such models: it also requires proactive risk assessment, mitigation, safeguards, and comprehensive safety mechanisms throughout the lifecycle of system development and deployment [16, 17]. This approach demands a blend of technical and socio-technical strategies, incorporating AI governance and regulatory insights.
Topics include but are not limited to:
  • Adversarial attack and defense, poisoning, hijacking and security [18, 13, 19, 20, 21]
  • Robustness to spurious correlations and uncertainty estimation
  • Technical approaches to privacy, fairness, accountability and regulation [12, 22, 28]
  • Truthfulness, factuality, honesty and sycophancy [23, 24]
  • Transparency, interpretability and monitoring [25, 26]
  • Identifiers of AI-generated material, such as watermarking [27]
  • Technical alignment / control, such as scalable oversight [29], representation control [26] and machine unlearning [30]
  • Model auditing, red-teaming and safety evaluation benchmarks [31, 32, 33, 16]
  • Measures against malicious model fine-tuning [34]
  • Novel safety challenges with the introduction of new modalities

Submission Guide

Submission Instructions

  • Submission site: Submissions should be made on OpenReview.
  • Submissions are non-archival: We welcome submissions that are also under peer review elsewhere at the time of submission, but we will not accept submissions that have already been published or accepted for publication at peer-reviewed conferences or journals. Papers presented or to be presented at other non-archival venues (e.g., other workshops) may be submitted. No formal workshop proceedings will be published.
  • Social Impact Statement: Authors are required to include a "Social Impact Statement" that highlights the "potential broader impact of their work, including its ethical aspects and future societal consequences".
  • Submission Length and Format: Submissions should be anonymised papers of up to 5 pages, excluding references and the Social Impact Statement (appendices can be added to the main PDF). You must format your submission using the ICML_2024_LaTeX_style_file.
  • Paper Review: Reviews are double-blind, with at least two reviewers assigned to each paper.
  • Camera-Ready Instructions: The camera-ready version consists of a main body of up to 6 pages, followed by unlimited pages for the Social Impact Statement, references, and an appendix, all in a single file. Authors should upload the camera-ready versions of accepted submissions to the OpenReview page for the corresponding submissions. Camera-ready versions will be publicly visible to everyone on the Camera-Ready Deadline listed below.

Key Dates

Submissions Open: May 11, 2024
Submission Deadline: May 30, 2024
Acceptance Notification: June 19, 2024 (updated from June 17, 2024)
Camera-Ready Deadline: July 7, 2024
Workshop Date: July 27, 2024
All deadlines are 23:59 AoE (Anywhere on Earth).

Frequently Asked Questions

Can we submit a paper that will also be submitted to NeurIPS 2024?

Yes.

Can we submit a paper that was accepted at ICLR 2024?

No. ICML prohibits papers published at the main conference from also appearing at its workshops.

Will the reviews be made available to authors?

Yes.

I have a question not addressed here, whom should I contact?

Email the organizers at icmltifaworkshop@gmail.com.

References

[1] Liu, H., Li, C., Wu, Q., & Lee, Y. J. (2024). Visual instruction tuning. Advances in neural information processing systems, 36.

[2] Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.

[3] OpenAI. (2023). GPT-4 with vision (GPT-4v) system card.

[4] Z. Yang, L. Li, K. Lin, J. Wang, C.-C. Lin, Z. Liu, and L. Wang. The dawn of lmms: Preliminary explorations with gpt-4v(ision), 2023.

[5] Qin, Y., Liang, S., Ye, Y., Zhu, K., Yan, L., Lu, Y., ... & Sun, M. (2023). ToolLLM: Facilitating large language models to master 16000+ real-world APIs.

[6] C. Zhang, Z. Yang, J. Liu, Y. Han, X. Chen, Z. Huang, B. Fu, and G. Yu. Appagent: Multimodal agents as smartphone users, 2023.

[7] T. Eloundou, S. Manning, P. Mishkin, and D. Rock. Gpts are gpts: An early look at the labor market impact potential of large language models, 2023.

[8] Ormazabal, Aitor, et al. "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models." arXiv preprint arXiv:2404.12387 (2024).

[9] Bai, J., Bai, S., Yang, S., Wang, S., Tan, S., Wang, P., ... & Zhou, J. (2023). Qwen-vl: A frontier large vision-language model with versatile abilities. arXiv preprint arXiv:2308.12966.

[10] Sora: Creating video from text. (n.d.). https://openai.com/sora

[11] Ma, X., Wang, Y., Jia, G., Chen, X., Liu, Z., Li, Y. F., ... & Qiao, Y. (2024). Latte: Latent diffusion transformer for video generation. arXiv preprint arXiv:2401.03048.

[12] Shavit, Y., Agarwal, S., Brundage, M., Adler, S., O’Keefe, C., Campbell, R., ... & Robinson, D. G. (2023). Practices for Governing Agentic AI Systems. Research Paper, OpenAI, December.

[13] N. Carlini, M. Nasr, C. A. Choquette-Choo, M. Jagielski, I. Gao, A. Awadalla, P. W. Koh, D. Ippolito, K. Lee, F. Tramer, and L. Schmidt. Are aligned neural networks adversarially aligned?, 2023.

[14] A. Chan, R. Salganik, A. Markelius, C. Pang, N. Rajkumar, D. Krasheninnikov, L. Langosco, Z. He, Y. Duan, M. Carroll, M. Lin, A. Mayhew, K. Collins, M. Molamohammadi, J. Burden, W. Zhao, S. Rismani, K. Voudouris, U. Bhatt, A. Weller, D. Krueger, and T. Maharaj. Harms from increasingly agentic algorithmic systems. In 2023 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’23. ACM, June 2023. doi: 10.1145/3593013.3594033. URL http://dx.doi.org/10.1145/3593013.3594033.

[15] Gemini Team. Gemini: A family of highly capable multimodal models, 2023.

[16] T. Shevlane, S. Farquhar, B. Garfinkel, M. Phuong, J. Whittlestone, J. Leung, D. Kokotajlo, N. Marchal, M. Anderljung, N. Kolt, L. Ho, D. Siddarth, S. Avin, W. Hawkins, B. Kim, I. Gabriel, V. Bolina, J. Clark, Y. Bengio, P. Christiano, and A. Dafoe. Model evaluation for extreme risks, 2023.

[17] L. Weidinger, M. Rauh, N. Marchal, A. Manzini, L. A. Hendricks, J. Mateos-Garcia, S. Bergman, J. Kay, C. Griffin, B. Bariach, I. Gabriel, V. Rieser, and W. Isaac. Sociotechnical safety evaluation of generative ai systems, 2023.

[18] L. Bailey, E. Ong, S. Russell, and S. Emmons. Image hijacks: Adversarial images can control generative models at runtime, 2023.

[19] Jain, N., Schwarzschild, A., Wen, Y., Somepalli, G., Kirchenbauer, J., yeh Chiang, P., ... & Goldstein, T. (2023). Baseline defenses for adversarial attacks against aligned language models.

[20] Robey, A., Wong, E., Hassani, H., & Pappas, G. J. (2023). SmoothLLM: Defending large language models against jailbreaking attacks.

[21] B. Wang, W. Chen, H. Pei, C. Xie, M. Kang, C. Zhang, C. Xu, Z. Xiong, R. Dutta, R. Schaeffer, et al. Decodingtrust: A comprehensive assessment of trustworthiness in gpt models. 2023.

[22] Chan, A., Ezell, C., Kaufmann, M., Wei, K., Hammond, L., Bradley, H., ... & Anderljung, M. (2024). Visibility into AI Agents. arXiv preprint arXiv:2401.13138.

[23] Huang, Q., Dong, X., Zhang, P., Wang, B., He, C., Wang, J., Lin, D., Zhang, W., & Yu, N. (2023). Opera: Alleviating hallucination in multi-modal large language models via over-trust penalty and retrospection-allocation.

[24] Sharma, M., Tong, M., Korbak, T., Duvenaud, D., Askell, A., Bowman, S. R., ... & Perez, E. (2023). Towards understanding sycophancy in language models.

[25] Meng, K., Bau, D., Andonian, A., & Belinkov, Y. (2022). Locating and editing factual associations in GPT.

[26] A. Zou, L. Phan, S. Chen, J. Campbell, P. Guo, R. Ren, A. Pan, X. Yin, M. Mazeika, A.-K. Dombrowski, S. Goel, N. Li, M. J. Byun, Z. Wang, A. Mallen, S. Basart, S. Koyejo, D. Song, M. Fredrikson, J. Z. Kolter, and D. Hendrycks. Representation engineering: A top-down approach to ai transparency, 2023.

[27] Kirchenbauer, J., Geiping, J., Wen, Y., Katz, J., Miers, I., & Goldstein, T. (2023). A watermark for large language models. In Proceedings of the 40th International Conference on Machine Learning.

[28] Nasr, M., Carlini, N., Hayase, J., Jagielski, M., Cooper, A. F., Ippolito, D., ... & Lee, K. (2023). Scalable extraction of training data from (production) language models.

[29] S. R. Bowman, J. Hyun, E. Perez, E. Chen, C. Pettit, S. Heiner, K. Lukošiūtė, A. Askell, A. Jones, A. Chen, A. Goldie, A. Mirhoseini, C. McKinnon, C. Olah, D. Amodei, D. Amodei, D. Drain, D. Li, E. Tran-Johnson, J. Kernion, J. Kerr, J. Mueller, J. Ladish, J. Landau, K. Ndousse, L. Lovitt, N. Elhage, N. Schiefer, N. Joseph, N. Mercado, N. DasSarma, R. Larson, S. McCandlish, S. Kundu, S. Johnston, S. Kravec, S. E. Showk, S. Fort, T. Telleen-Lawton, T. Brown, T. Henighan, T. Hume, Y. Bai, Z. Hatfield-Dodds, B. Mann, and J. Kaplan. Measuring progress on scalable oversight for large language models, 2022.

[30] Yao, Y., Xu, X., & Liu, Y. (2023). Large language model unlearning. arXiv preprint arXiv:2310.10683.

[31] S. Casper, C. Ezell, C. Siegmann, N. Kolt, T. L. Curtis, B. Bucknall, A. Haupt, K. Wei, J. Scheurer, M. Hobbhahn, L. Sharkey, S. Krishna, M. V. Hagen, S. Alberti, A. Chan, Q. Sun, M. Gerovitch, D. Bau, M. Tegmark, D. Krueger, and D. Hadfield-Menell. Black-box access is insufficient for rigorous ai audits, 2024.

[32] M. Bhatt, S. Chennabasappa, C. Nikolaidis, S. Wan, I. Evtimov, D. Gabi, D. Song, F. Ahmad, C. Aschermann, L. Fontana, S. Frolov, R. P. Giri, D. Kapil, Y. Kozyrakis, D. LeBlanc, J. Milazzo, A. Straumann, G. Synnaeve, V. Vontimitta, S. Whitman, and J. Saxe. Purple llama cyberseceval: A secure coding benchmark for language models, 2023.

[33] D. Ganguli, L. Lovitt, J. Kernion, A. Askell, Y. Bai, S. Kadavath, B. Mann, E. Perez, N. Schiefer, K. Ndousse, A. Jones, S. Bowman, A. Chen, T. Conerly, N. DasSarma, D. Drain, N. Elhage, S. El-Showk, S. Fort, Z. Hatfield-Dodds, T. Henighan, D. Hernandez, T. Hume, J. Jacobson, S. Johnston, S. Kravec, C. Olsson, S. Ringer, E. Tran-Johnson, D. Amodei, T. Brown, N. Joseph, S. McCandlish, C. Olah, J. Kaplan, and J. Clark. Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned, 2022.

[34] Henderson, P., Mitchell, E., Manning, C., Jurafsky, D., & Finn, C. (2023). Self-destructing models: Increasing the costs of harmful dual uses of foundation models. In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, AIES ’23.

[35] Y. Dong, H. Chen, J. Chen, Z. Fang, X. Yang, Y. Zhang, Y. Tian, H. Su, and J. Zhu. How robust is google’s bard to adversarial image attacks?, 2023.

[36] Yin, Z., Wang, J., Cao, J., Shi, Z., Liu, D., Li, M., Sheng, L., Bai, L., Huang, X., Wang, Z., & others (2023). LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark. arXiv preprint arXiv:2306.06687.