Visual Memory Injection Attacks for Multi-Turn Conversations
Abstract
The Visual Memory Injection attack enables covert manipulation of generative vision-language models through manipulated images that trigger targeted responses only under specific prompts during multi-turn conversations.
Generative large vision-language models (LVLMs) have recently achieved impressive performance gains, and their user base is growing rapidly. However, the security of LVLMs, particularly in long-context, multi-turn settings, remains largely underexplored. In this paper, we consider a realistic scenario in which an attacker uploads a manipulated image to the web or social media. A benign user downloads this image and uses it as input to the LVLM. Our novel, stealthy Visual Memory Injection (VMI) attack is designed such that the LVLM exhibits nominal behavior on normal prompts, but once the user gives a triggering prompt, the LVLM outputs a specific prescribed target message to manipulate the user, e.g., for adversarial marketing or political persuasion. Compared to previous work focused on single-turn attacks, VMI remains effective even after a long multi-turn conversation with the user. We demonstrate our attack on several recent open-weight LVLMs. This work thereby shows that large-scale manipulation of users is feasible with perturbed images in multi-turn conversation settings, calling for better robustness of LVLMs against these attacks. We release the source code at https://github.com/chs20/visual-memory-injection
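The abstract does not detail the optimization, but the dual objective it describes can be made concrete with a rough sketch: an L-infinity-bounded image perturbation is optimized so that the model keeps answering benign prompts nominally while producing the prescribed target message once the trigger prompt appears. The interface `lvlm_nll`, the prompt sets, and all hyperparameters below are hypothetical assumptions for illustration only, not the authors' implementation; see the released source code for the actual method.

```python
# Hypothetical sketch of a Visual Memory Injection objective (not the authors' code).
# Assumes a wrapper `lvlm_nll(model, image, prompt, response)` that returns the
# negative log-likelihood of `response` given `image` and `prompt` -- an assumed
# interface, not a real library call.
import torch

def vmi_perturbation(model, lvlm_nll, image, benign_pairs, trigger_prompt,
                     target_message, eps=8 / 255, alpha=1 / 255, steps=500,
                     lambda_benign=1.0):
    """PGD-style optimization of an L-inf bounded image perturbation.

    benign_pairs: list of (prompt, nominal_response) pairs the model should
                  still answer normally (stealth objective).
    trigger_prompt / target_message: the prompt that activates the attack and
                  the prescribed output it should elicit (trigger objective).
    """
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        adv_image = (image + delta).clamp(0, 1)
        # (1) Trigger objective: make the target message likely under the trigger prompt.
        loss = lvlm_nll(model, adv_image, trigger_prompt, target_message)
        # (2) Stealth objective: preserve nominal behavior on benign prompts.
        for prompt, nominal in benign_pairs:
            loss = loss + lambda_benign * lvlm_nll(model, adv_image, prompt, nominal)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # signed gradient step on the NLL
            delta.clamp_(-eps, eps)             # project back into the L-inf budget
            delta.grad.zero_()
    return (image + delta).detach().clamp(0, 1)
```

In this sketch the same perturbed image serves both objectives, which is what lets it pass as benign in early turns and still steer the model once the trigger prompt arrives later in the conversation.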
Community
We propose “visual memory injection” attacks: small image perturbations that make generative large vision-language models behave normally at first but later trigger targeted harmful responses, even after several conversation turns.
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Multi-Turn Adaptive Prompting Attack on Large Vision-Language Models (2026)
- Crafting Adversarial Inputs for Large Vision-Language Models Using Black-Box Optimization (2026)
- Toward Universal and Transferable Jailbreak Attacks on Vision-Language Models (2026)
- Multimodal Generative Engine Optimization: Rank Manipulation for Vision-Language Model Rankers (2026)
- Breaking Audio Large Language Models by Attacking Only the Encoder: A Universal Targeted Latent-Space Audio Attack (2025)
- SoundBreak: A Systematic Study of Audio-Only Adversarial Attacks on Trimodal Models (2026)
- Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs (2026)