[SAIF2020] Day2: Vision – Devi Parikh | Samsung

Session 2. Vision

“Multimodal and Creative AI Systems”

This talk will cover computer vision systems that humans can interact with via language, and AI systems that can enhance humans in their creative endeavors. In the first part of the talk, I will describe our work on learning general visiolinguistic representations that can be used for a variety of vision-and-language tasks, such as answering natural language questions about images, grounding natural language phrases in image regions, and describing an image via a caption. In the second part, I will describe our work on automatically generating “dance” sequences from input music, generating abstract visualizations that represent salient aspects of a user’s day and associated emotional states, neuro-symbolic generative art, and evaluating collaboration mechanisms in the context of sketching.

