Explore LLM word representations using similarity analysis (part 2)

A deep dive into Representational Similarity Analysis (RSA) applied to the internal attention matrices of GPT-2-XL. PyTorch hook functions extract the Query, Key, and Value activations from all 48 transformer layers. The analysis shows that although the Q, K, and V vectors are nearly orthogonal (direct correlations near zero), their RSA scores stay consistently high (~0.85–0.9). A category separability analysis based on Cohen's d indicates that semantic world knowledge is encoded within each attention matrix: within-category RSA scores exceed across-category scores in every layer. Includes runnable Google Colab code.
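To make the contrast between direct correlation and RSA concrete, here is a minimal sketch in plain Python. It runs on toy random vectors, not on GPT-2-XL activations, and every name in it (`rsa_score`, `cohens_d`, `reps_a`, `reps_b`) is illustrative and not taken from the article's Colab notebook. The second set of vectors is an orthogonal re-mixing of the first, so the pairwise cosine similarities are preserved even though the raw values no longer line up element by element.

```python
import math
import random


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv)
         for v, nv in zip(vectors, norms)]
        for u, nu in zip(vectors, norms)
    ]


def upper_triangle(matrix):
    """Entries above the diagonal, so each pair of items is counted once."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rsa_score(reps_a, reps_b):
    """Correlate the similarity structure of two sets of representations."""
    sim_a = upper_triangle(cosine_similarity_matrix(reps_a))
    sim_b = upper_triangle(cosine_similarity_matrix(reps_b))
    return pearson(sim_a, sim_b)


def cohens_d(x, y):
    """Standardized mean difference using the pooled sample variance."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    var_x = sum((a - mx) ** 2 for a in x) / (nx - 1)
    var_y = sum((b - my) ** 2 for b in y) / (ny - 1)
    pooled = math.sqrt(((nx - 1) * var_x + (ny - 1) * var_y) / (nx + ny - 2))
    return (mx - my) / pooled


# Toy stand-in for "one vector per word": 8 items, 16 dimensions.
rng = random.Random(0)
n_words, n_dims = 8, 16
reps_a = [[rng.gauss(0, 1) for _ in range(n_dims)] for _ in range(n_words)]

# Orthogonal map: cyclically shift the dimensions and flip every other sign.
# Dot products between rows are unchanged, so cosine similarities are too.
reps_b = [
    [(-1) ** j * row[(j + 1) % n_dims] for j in range(n_dims)]
    for row in reps_a
]

# Direct correlation compares raw values dimension by dimension.
flat_a = [x for row in reps_a for x in row]
flat_b = [x for row in reps_b for x in row]
direct_r = pearson(flat_a, flat_b)

# RSA compares the pattern of pairwise similarities instead.
rsa = rsa_score(reps_a, reps_b)

print(f"direct correlation: {direct_r:.3f}")
print(f"RSA score:          {rsa:.3f}")
```

With real activations, each row would be the Q, K, or V vector for one word, and `cohens_d` could be applied to the within-category versus across-category similarity values to quantify separability. In practice the same computation is usually written with NumPy or PyTorch; the standard library is used here only to keep the sketch dependency-free.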

#llm #nlp #pytorch

May 20 • 16 min read • From thepalindrome.org

Table of contents

What you will learn in this 2-part post series
What are the Q, K, and V vectors in the attention algorithm?
Import and inspect GPT-2-XL
Access the internal calculations using hooks
Correlating Q, K, and V activations
Cosine similarities and RSA (one layer)
Laminar profile of RSA scores
Category separability in one layer
Laminar profile of category separability and RSA
So you wanna learn more?
