Please note that this newsitem has been archived, and may contain outdated information or links.
16 October 2024, Computational Linguistics Seminar, Verna Dankers
Memorisation is a natural part of learning from real-world data: neural models pick up on atypical input-output combinations and store those training examples in their parameter space. That this happens is well known, but which examples require memorisation and where in the millions (or billions) of parameters memorisation occurs are questions that remain largely unanswered. In this talk, I first address the localisation question by examining memorisation in the context of classification with fine-tuned PLMs across 12 tasks. Our findings add nuance to the generalisation-first, memorisation-second hypothesis dominant in the literature and show memorisation to be a gradual process rather than a localised one. Second, I discuss memorisation from the viewpoint of the data using neural machine translation (NMT) models, by placing individual data points on a memorisation-generalisation map. I illustrate how the data points' characteristics are predictive of memorisation in NMT and describe the influence that subsets of that map have on NMT systems' performance.
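For readers unfamiliar with how individual examples can be scored for memorisation, one common measure in the literature is counterfactual memorisation: the gap between a model's accuracy on an example when that example was in its training data and its accuracy when the example was held out, estimated over many models trained on random data subsets. The sketch below is illustrative only; the function name and the toy data are assumptions for exposition and do not reflect the speaker's actual experimental setup.

    import numpy as np

    def counterfactual_memorisation(in_train_correct, held_out_correct):
        """Estimate per-example counterfactual memorisation.

        in_train_correct:  (n_models, n_examples) boolean array; entry [m, i] is True
                           if model m was trained on example i and predicted it correctly.
        held_out_correct:  (n_models, n_examples) boolean array from models that did NOT
                           see example i during training.
        Returns an (n_examples,) array: accuracy when trained on the example minus
        accuracy when it was held out.
        """
        train_acc = in_train_correct.mean(axis=0)    # reflects generalisation + memorisation
        heldout_acc = held_out_correct.mean(axis=0)  # reflects generalisation only
        return train_acc - heldout_acc               # memorisation score per example

    # Toy usage: 5 models, 3 examples, with random placeholder outcomes.
    rng = np.random.default_rng(0)
    in_train = rng.random((5, 3)) < 0.9
    held_out = rng.random((5, 3)) < 0.6
    print(counterfactual_memorisation(in_train, held_out))

Scores near zero indicate examples the model gets right (or wrong) regardless of whether it saw them, while scores near one indicate examples that are only predicted correctly when memorised; such a score, paired with a generalisation measure, is one way to lay examples out on a memorisation-generalisation map.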