Watch the talk by David Fouhey (University of Michigan) entitled “Understanding how to get to places and do things” that took place at CIIRC.
The lecture was organized by the IMPACT project.
The whole video is available here
Abstract: What does it mean to understand an image or video? One common answer in computer vision has been that understanding means naming things: this part of the image corresponds to a refrigerator and that to a person, for instance. While important, this ability is not enough: humans can effortlessly reason about the rich world that images depict and what they can do in it. For example, if a friend shows you the way to their kitchen for you to get something, they won’t worry that you’ll get lost walking back (navigation) or that you’d have trouble figuring out how to open their refrigerator or cabinets. While both are an ordinary feat for humans (or even a dog or cat), they are currently far beyond the abilities of computers.
In my talk, I’ll discuss my efforts towards bridging this gap. In the first part, I’ll discuss the task of navigation, getting from one place to another. In particular, our goal is to take a single demonstration of a path and retrace it, either forwards or backwards, under noisy actuation and a changing environment. Rather than build an explicit model of the world, we learn a network that attends to a sequence of memories in order to make decisions. In the second part, I will discuss how to scalably gather data of humans interacting with the world, resulting in a new dataset of human interactions, VLOG, as well as what we can learn from this data.
Bio: David Fouhey is starting as an assistant professor at the University of Michigan in January 2019 and is currently a visitor at INRIA Paris.
His research interests include computer vision and machine learning, with a particular focus on scene understanding. He received a Ph.D. in robotics in 2016 from Carnegie Mellon University where he was supported by NSF and NDSEG fellowships, and was then a postdoctoral fellow at UC Berkeley. He has spent time at the University of Oxford’s Visual Geometry Group and at Microsoft Research. More information is