APPM Department Colloquium - Colin Raffel

Colin Raffel; Assistant Professor of Computer Science; University of North Carolina, Chapel Hill; and Staff Research Scientist, Google Brain

T5 and large language models: The good, the bad, and the ugly

T5 and other large pre-trained language models have proven to be a crucial component of the modern natural language processing pipeline. In this talk, I will discuss the good and bad characteristics of these models through the lens of five recent papers. In the first, we empirically survey the field of transfer learning for NLP and scale up our findings to attain state-of-the-art results on many popular benchmarks. Then, I show how we can straightforwardly extend our model to process text in over 100 languages. The strong performance of these models gives rise to a natural question: What kind of knowledge and skills do they pick up during pre-training? I will provide some answers by first showing that they are surprisingly good at answering trivia questions that test basic "world knowledge", and then demonstrating that they memorize non-trivial amounts of (possibly private) pre-training data, even when no overfitting is evident. Finally, I will wrap up with a sober take on recent progress in improving the architectures of language models.

Bio: Colin Raffel is an Assistant Professor in the Department of Computer Science at the University of North Carolina, Chapel Hill. He also spends one day a week as a Staff Research Scientist at Google Brain. He obtained his PhD in Electrical Engineering from Columbia University in 2016 under the supervision of Daniel P. W. Ellis.