Advances in Large Language Models (LLMs) also bring powerful attacks such as prompt injection. This is identified as a top threat by OWASP, where an LLM input contains manipulated instructions to control the output. New defenses, Structured Queries (StruQ) and Special Preference Optimization (SecAlign), propose separating prompts and data and training models to ignore injected instructions. Experiments show that these methods significantly reduce attack success rates without adding computational or labor costs.
Table of contents
Prompt Injection Attack: CausesPrompt Injection Defense: StruQ and SecAlignExperimentsSummarySort: