Universal Transformer: Toward the Big Dream of Universal AI
Halls department, Hall 5
Thursday, 27 December 2018
10:15 - 11:15
A key signature of human intelligence is the ability to make "infinite use of finite means"; hence, placing no fixed limit on the computational budget appears to be an essential ingredient for systems aiming to reach human-like intelligence. In this talk, we present a new family of models, Universal Transformers: powerful parallel-in-time recurrent sequence models that, under certain assumptions, can be shown to be Turing-complete. Universal Transformers combine the parallelizability and global receptive field of feed-forward sequence models, such as the Transformer, with the recurrent inductive bias of RNNs. We start with background on the Transformer architecture and discuss it as an attractive alternative to RNNs for many language understanding and generation tasks. We then introduce Universal Transformers and their variants and show that these models are not only theoretically appealing but also perform well on many practical real-world tasks.
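To make the core idea concrete: unlike a standard Transformer, which stacks distinct layers, a Universal Transformer applies the *same* block repeatedly over the sequence, combining parallel attention across all positions with depth-wise recurrence. The following is a minimal, simplified sketch of that recurrence (single attention head, no layer normalization, positional or timestep embeddings, or dynamic halting; all names and shapes are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(h, Wq, Wk, Wv):
    # Every position attends to every other position in parallel:
    # the "global receptive field" of feed-forward sequence models.
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

def universal_transformer_step(h, params):
    # One refinement step: attention + position-wise feed-forward,
    # each with a residual connection (a simplified block).
    Wq, Wk, Wv, W1, W2 = params
    h = h + self_attention(h, Wq, Wk, Wv)
    h = h + np.maximum(h @ W1, 0.0) @ W2
    return h

def universal_transformer(x, params, num_steps):
    # The recurrent inductive bias: the SAME parameters are reused at
    # every step, so `num_steps` can in principle vary with the input.
    h = x
    for _ in range(num_steps):
        h = universal_transformer_step(h, params)
    return h

rng = np.random.default_rng(0)
d = 8                                   # model dimension (illustrative)
params = tuple(rng.normal(scale=0.1, size=(d, d)) for _ in range(5))
x = rng.normal(size=(5, d))             # 5 token representations
out = universal_transformer(x, params, num_steps=3)
print(out.shape)                        # same shape as the input: (5, 8)
```

Because the per-step weights are shared, increasing `num_steps` adds computation without adding parameters, which is what lets the model's effective depth scale with the problem rather than being fixed at training time.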
Mostafa Dehghani is a Research Scientist at Google Brain, working on generative models, reinforcement learning, and artificial general intelligence. He completed his PhD in Machine Learning at the University of Amsterdam, where he focused on data-efficient deep learning, in particular training neural networks with weak supervision and using inductive biases to help models find more generalizable solutions when training data is limited. Before that, Mostafa was a student at the University of Tehran.