Heretic is an automated tool that removes safety alignment from transformer-based language models using directional ablation, without expensive retraining. It pairs abliteration with a TPE-based (Tree-structured Parzen Estimator) parameter search to find settings that minimize both refusals and divergence from