Vision language models (VLMs) can be manipulated through adversarial image perturbations, much like traditional image classifiers. Using techniques such as Projected Gradient Descent (PGD), attackers can craft pixel-level modifications or adversarial patches that cause VLMs to generate unexpected outputs. The article demonstrates how the same techniques used to evade image classifiers can be adapted to build adversarial images for VLMs, and how the attack can be extended further.
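
As a rough illustration of the PGD approach mentioned above, the sketch below shows an L-infinity PGD loop against a VLM. It assumes a Hugging Face-style model whose forward pass accepts `pixel_values`, `input_ids`, and `labels` and returns a cross-entropy loss over an attacker-chosen target string; the signature, hyperparameters, and pixel range are illustrative assumptions, not the article's exact implementation.

```python
# Minimal PGD sketch for crafting an adversarial image against a VLM.
# Assumption: the model's forward pass takes `pixel_values`, `input_ids`,
# and `labels`, and returns an object with a `.loss` attribute that is the
# cross-entropy of the attacker-chosen target text.
import torch

def pgd_attack(model, pixel_values, input_ids, labels,
               epsilon=8 / 255, alpha=2 / 255, steps=50):
    """L-infinity PGD: perturb pixels so the VLM becomes more likely
    to emit the target text encoded in `labels`."""
    original = pixel_values.clone().detach()
    adv = original.clone()

    for _ in range(steps):
        adv.requires_grad_(True)
        outputs = model(pixel_values=adv, input_ids=input_ids, labels=labels)
        loss = outputs.loss  # cross-entropy on the attacker-chosen target

        # Gradient of the loss with respect to the image pixels only.
        grad = torch.autograd.grad(loss, adv)[0]

        with torch.no_grad():
            # Descend the loss so the target output becomes more likely.
            adv = adv - alpha * grad.sign()
            # Project back into the epsilon ball around the original image.
            adv = original + torch.clamp(adv - original, -epsilon, epsilon)
            # Keep pixels in the assumed valid input range [0, 1].
            adv = torch.clamp(adv, 0.0, 1.0)

    return adv.detach()
```

The signed-gradient step and projection are what distinguish PGD from a single-shot attack: repeating small, bounded updates usually finds a much stronger perturbation while keeping it visually imperceptible.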

From developer.nvidia.com · 9 min read
Table of contents
- Vision language models
- Evading image classifiers
- Building adversarial images for VLMs
- The difference with VLMs
- Extending the attack
- Learn more
