
Understanding the Mixture-of-Experts (MoE) layer in deep learning

Monday, November 14, 2022

The Mixture-of-Experts (MoE) layer is a recently introduced sparsely activated layer in deep learning. Using a router network to send each token to one of several expert sub-networks, an MoE layer can achieve performance comparable to a standard dense layer while significantly reducing inference time, since each token activates only a small fraction of the layer's parameters.
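To make the routing idea concrete, here is a minimal sketch of a sparsely activated MoE layer with top-1 routing, written in PyTorch. The module and parameter names (MoELayer, num_experts, d_model, d_hidden) are illustrative assumptions and not taken from the talk itself.

```python
# Minimal sketch of an MoE layer with top-1 routing (illustrative, not the speaker's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, num_experts=8):
        super().__init__()
        # Each expert is an ordinary feed-forward block, as in a Transformer.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, num_experts)

    def forward(self, x):
        # x: (num_tokens, d_model) -- a flattened batch of token representations.
        gate_logits = self.router(x)                    # (num_tokens, num_experts)
        gate_probs = F.softmax(gate_logits, dim=-1)
        top_prob, top_expert = gate_probs.max(dim=-1)   # top-1 routing decision per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_expert == e
            if mask.any():
                # Only the tokens routed to expert e pass through it, so each
                # token activates one expert rather than the whole layer.
                out[mask] = top_prob[mask].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(16, 512)
print(MoELayer()(tokens).shape)  # torch.Size([16, 512])
```

The computation per token is roughly that of a single expert, while the total parameter count grows with the number of experts; this is the trade-off that lets MoE layers match dense layers in quality at a lower inference cost.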

Speaker/s

Yuanzhi Li is an assistant professor in the Machine Learning Department at Carnegie Mellon University (CMU) and an affiliated faculty member at MBZUAI. His primary research areas are deep learning theory and natural language processing. He received his Ph.D. from Princeton University, advised by Sanjeev Arora.
