Nathan Labenz sits down with Lili Yu, a researcher at Meta AI, to discuss the paper she authored, MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers. In this conversation, they discuss the architecture and breakthroughs of the research, and the opportunity to eliminate the need for tokenization.
LINK:
MEGABYTE Paper: https://arxiv.org/pdf/2305.07185.pdf
TIMESTAMPS:
(00:00) Episode preview
(07:41) Takeaways from Lili Yu's paper: MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
(17:00) Architecture
(24:59) Embeddings
(27:43) Different local models
(34:23) Encoder model
(36:35) Transformer Architecture
(48:10) Choosing patch size
(01:08) What happens when you scale up?
(01:19:20) Big picture for Meta AI
(01:22:57) Responsible AI
(01:27:02) China and AI
TWITTER:
@labenz (Nathan)
@liliyu_lili (Lili)
@eriktorenberg (Erik)
@cogrev_podcast
SPONSOR:
Thank you to Omneky (www.omneky.com) for sponsoring The Cognitive Revolution. Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with the click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.
MUSIC CREDIT:
MusicLM