Cisco Talos Incident Response's AI Tiger Team shares lessons from experimenting with LLMs to generate cybersecurity reports. They identified four core inconsistency problems with AI-generated content: variable research sourcing, inconsistent conclusions, unpredictable output formatting, and context drift/pollution. To address these, they developed four prompt engineering controls: prompt specialization, specified source constraints, output format specification, and template-guided prompting. A case study using a Tabletop Exercise report showed a predicted 50% reduction in drafting time and consistent quality that passed blind peer review. Key cautions include data privacy risks with public AI tools, the importance of model selection (Claude Sonnet 4.5 performed best), input quality requirements, and the need for human oversight to catch duplicative or irrelevant AI-generated recommendations.
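The four controls named above can be combined in a single prompt builder. The following is a minimal sketch, not the Talos team's actual tooling: the `ReportPrompt` class, its field names, and the example values are all hypothetical, and reading "prompt specialization" as "one narrow task per prompt" is an assumption. It only assembles prompt text and does not call any model.

```python
from dataclasses import dataclass


@dataclass
class ReportPrompt:
    """Hypothetical container for the four controls named in the summary."""

    task: str                    # prompt specialization (assumed: one narrow task per prompt)
    allowed_sources: list[str]   # specified source constraints
    output_format: str           # output format specification
    template: str                # template-guided prompting

    def render(self) -> str:
        """Assemble the four controls into one prompt string, in a fixed order."""
        sources = "\n".join(f"- {s}" for s in self.allowed_sources)
        return (
            f"Task: {self.task}\n\n"
            f"Use only these sources; if they do not cover a point, say so:\n"
            f"{sources}\n\n"
            f"Output format: {self.output_format}\n\n"
            f"Fill in this template without adding or removing sections:\n"
            f"{self.template}\n"
        )


# Illustrative values only; none of these strings come from the original report.
prompt = ReportPrompt(
    task="Draft the recommendations section of a tabletop exercise (TTX) report.",
    allowed_sources=["Facilitator notes (pasted below)", "Participant feedback forms"],
    output_format="Plain text, one numbered recommendation per line.",
    template="Recommendations\n1. <finding>: <recommended action>",
)
print(prompt.render())
```

Keeping each control in its own field makes it easy to review or change one constraint without touching the others. A human reviewer would still need to check the model's output for duplicative or irrelevant recommendations, as the summary cautions.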
Table of contents
Defining the inconsistency problem in AI reporting
Methods to control inconsistencies
Case study: TTX report
The benefits
Cautions
Technology limitations
What’s next