A retrospective on EmoNet, an MS thesis project that achieved competitive results on the EmoryNLP emotion recognition benchmark using speaker-aware transformers. The system introduced global speaker identity tracking across dialogues, a GRU-based speaker behaviour module, and weighted cross-entropy loss for class imbalance. Two years later, the author reflects on how the field shifted to LLM-based approaches (InstructERC, BiosERC, LaERC-S) and notes that the core architectural intuitions — speaker biography and historical behaviour — survived the paradigm shift, just reimplemented via instruction tuning and retrieval-augmented prompting. The post concludes with how the author would rebuild EmoNet today using LoRA fine-tuning on a small open-source LLM.
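To make the class-imbalance idea concrete, here is a minimal sketch of weighted cross-entropy in plain Python. The summary does not say how EmoNet derived its class weights, so the inverse-frequency scheme below is an assumption. The function names and the toy labels are also illustrative, not taken from the thesis.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: weight_c = N / (K * count_c).

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-example negative log-likelihoods.

    probs  : list of dicts mapping class -> predicted probability
    labels : list of gold classes, same length as probs
    The sum is normalised by the total weight of the gold labels,
    so the result stays on the same scale as unweighted cross-entropy.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Toy, made-up label distribution: three "neutral" utterances, one "sad".
labels = ["neutral", "neutral", "neutral", "sad"]
weights = inverse_frequency_weights(labels)

# A model that is confident on the majority class and poor on the rare one.
probs = [{"neutral": 0.9, "sad": 0.1}] * 3 + [{"neutral": 0.8, "sad": 0.2}]
loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights the single "sad" utterance contributes as much to the loss as the three "neutral" ones combined, which is the effect the weighting is meant to have on a skewed emotion distribution.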

10m read time
From towardsdatascience.com
Table of contents
What ERC is, and why text-only is hard
The 2024 landscape
Three contributions, with intuition
Results: what worked, and what surprised me
Reflection (2026): the field moved, and so should we
Where this leaves me
