Researchers from MIT, WPI, and Google have developed WRING (Weighted Rotational DebiasING), a new debiasing technique for vision-language models (VLMs). The commonly used projection debiasing approach suffers from the 'Whac-A-Mole dilemma', where fixing one bias inadvertently amplifies others. WRING instead rotates the bias-responsible coordinates in the model's high-dimensional space rather than projecting them out, which preserves other learned relationships while eliminating the targeted biases. Because it is a post-processing method, it can be applied to pre-trained models such as OpenCLIP without retraining. Reported results show significant bias reduction for target concepts without increasing bias elsewhere, though the approach is currently limited to CLIP-style models.
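
The geometric difference between projecting a direction out and rotating within a plane can be illustrated with a small NumPy sketch. This is not the WRING algorithm, whose details are not given in this summary; it is a generic illustration under stated assumptions. The vectors `x` and `y` are random stand-ins for embeddings, `b` is a hypothetical bias direction, `t` is a hypothetical second direction defining the rotation plane, and the helper names `project_out` and `plane_rotation` are made up for this example.

```python
import numpy as np

def project_out(x, b):
    """Remove the component of x along direction b (projection-style debiasing)."""
    b = b / np.linalg.norm(b)
    return x - (x @ b) * b

def plane_rotation(u, v, theta):
    """Orthogonal matrix that rotates by theta in the plane spanned by u and v."""
    u = u / np.linalg.norm(u)
    v = v - (v @ u) * u              # Gram-Schmidt step: make v orthogonal to u
    v = v / np.linalg.norm(v)
    d = u.shape[0]
    return (np.eye(d)
            + (np.cos(theta) - 1.0) * (np.outer(u, u) + np.outer(v, v))
            + np.sin(theta) * (np.outer(v, u) - np.outer(u, v)))

# Hypothetical data: random vectors standing in for embeddings and directions.
rng = np.random.default_rng(0)
d = 8
x, y = rng.normal(size=d), rng.normal(size=d)
b, t = rng.normal(size=d), rng.normal(size=d)
b_unit = b / np.linalg.norm(b)

# Projection zeroes the component along b, which shrinks the vector.
x_proj = project_out(x, b)
print("component along b after projection:", x_proj @ b_unit)
print("norm before / after projection:", np.linalg.norm(x), np.linalg.norm(x_proj))

# A rotation is orthogonal: it moves vectors but keeps norms and inner products.
R = plane_rotation(b, t, np.pi / 2)
print("norm before / after rotation:", np.linalg.norm(x), np.linalg.norm(R @ x))
print("inner product before / after rotation:", x @ y, (R @ x) @ (R @ y))
```

Projection discards information along the removed direction, so norms and similarities involving that direction change. A rotation is invertible and norm-preserving, which is one plausible reading of why a rotational approach could leave other learned relationships intact; how WRING selects and weights the rotated coordinates is not covered here.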

4m read time · From news.mit.edu