Dr. Lucas Beyer’s slides:
Full Transformer tutorial slides: http://lucasb.eyer.be/transformer
Main papers covered (at a high level) in the computer vision talk:
Part 1:
– Big Transfer: https://arxiv.org/abs/1912.11370
– Vision Transformer: https://arxiv.org/abs/2010.11929
– Are we done with ImageNet? https://arxiv.org/abs/2006.07159
– Scaling laws for model shape (compute-optimal ViT design): https://arxiv.org/abs/2305.13035
– Distillation of large models (patient and consistent teacher): https://arxiv.org/abs/2106.05237
Part 2:
– CLIP: https://openai.com/research/clip
– LiT (Locked-image Tuning): https://arxiv.org/abs/2111.07991
– PaLI image-text model: https://sites.research.google/pali/
– UViM: Unified Vision Model: https://arxiv.org/abs/2205.10337
– Unified loss (RL tuning of vision models with task rewards): https://arxiv.org/abs/2302.08242