Dr. Thomas Kipf’s slides – part II

Structured Scene Understanding talk material:
* UCL Talk: https://www.youtube.com/watch?v=oLKwRBeBRRA

* Greff et al., On the Binding Problem in Artificial Neural Networks (2020): https://arxiv.org/abs/2012.05208
* Sajjadi et al., OSRT (2022): https://osrt-paper.github.io/

* Wu et al., SlotFormer (2023): https://slotformer.github.io/
* Seitzer et al., Bridging the Gap to Real-World Object-Centric Learning (2023): https://arxiv.org/abs/2209.14860

Dr. Lucas Beyer’s slides

Full Transformer tutorial slides: http://lucasb.eyer.be/transformer
Main papers covered (at high level) in the computer vision talk:
Part 1:
– Vision Transformer: https://arxiv.org/abs/2010.11929
– Are we done with ImageNet? https://arxiv.org/abs/2006.07159
– Scaling laws for shape optimization: https://arxiv.org/abs/2305.13035
– Distillation of large models: https://arxiv.org/abs/2106.05237
Part 2:
– LiT (Locked image-text Tuning): https://arxiv.org/abs/2111.07991
– PaLI image-text model: https://sites.research.google/pali/
– UViM: Unified Vision Model: https://arxiv.org/abs/2205.10337
– Unified loss (RL tuning): https://arxiv.org/abs/2302.08242