
Call for Papers




Invited Speakers








Opening Remarks



Invited Talk: Unsupervised Cross-Domain Mapping

Lior Wolf (Facebook AI Research)



Invited Talk: Optimization as a Model for Few-Shot Learning

Hugo Larochelle (Université de Sherbrooke, Canada)



Morning Break



Invited Talk: How studying brains can help us get a deeper vision

Gabriel Kreiman (Harvard University, USA)



Invited Talk: TBD

Sanja Fidler (University of Toronto, Canada)



Invited Talk: Deep RL for Navigation in Complex Environments

Raia Hadsell (Google DeepMind, UK)






Session Remarks



Invited Talk: Deep models for activity detection and description

Kate Saenko (Boston University, USA)



Poster Spotlight



Afternoon Break / Poster Session



Invited Talk: Tubelet-based Video Object Detection

Xiaogang Wang (Chinese University of Hong Kong)



Invited Talk: TBD

Maja Pantic (Imperial College, UK)



Closing Remarks


Poster Session



Concurrence-Aware Long Short-Term Sub-Memories for Person-Person Action Recognition

Xiangbo Shu, Jinhui Tang, Guojun Qi, Yan Song, Zechao Li, Liyan Zhang

Crowd-11: A Dataset for Fine Grained Crowd Behaviour Analysis

Camille Dupont, Luis Tobias, Bertrand Luvison 

Temporal Domain Neural Encoder for Video Representation Learning

Hao Hu, Zhaowen Wang, Joon-Young Lee, Zhe Lin, Guo-Jun Qi 

Recurrent Memory Addressing for describing videos

Arnav Jain, Abhinav Agarwalla, Kumar Agrawal, Pabitra Mitra 

Temporally Steered Gaussian Attention for Video Understanding

Shagan Sah, Thang Nguyen, Miguel Dominguez, Felipe Petroski Such, Raymond Ptucha 

SANet: Structure-Aware Network for Visual Tracking

Heng Fan, Haibin Ling 

Fixation Prediction in Videos using Unsupervised Hierarchical Features

Tzujui Wang, Hamed Rezazadegan Tavakoli, Jorma Laaksonen 

Learning Latent Temporal Connectionism of Deep Residual Visual Abstractions for Identifying Surgical Tools in Laparoscopy Procedures

Kaustuv Mishra, Rachana Sathish, Debdoot Sheet  

Kernalised Multi-resolution Convnet for Visual Tracking

Di Wu, Wenbin Zou, Xia Li, Yong Zhao 

Description of the workshop

The goal of the DeepVision Workshop is to accelerate the study of deep learning algorithms for computer vision problems. With the rapid spread of digital photography and advances in storage devices over the last decade, we have seen explosive growth in the amount of available visual data, matched by equally explosive growth in the computational capacity for image understanding. Rather than hand-crafting features, recent advances in deep learning offer an approach that learns useful representations directly from data for many computer vision tasks.

Paper Submission

Important Dates

  1. Paper Submission: March 31st, 2017
  2. Supplemental Material Submission: March 31st, 2017
  3. Author Notification: April 21st, 2017
  4. Camera Ready: May 15th, 2017

Organizing Committee

Temporal Deep Learning

Videos contain valuable temporal information that can be exploited to improve performance. Exploiting temporal information is of great importance in computer vision applications such as object tracking and recognition, and scene analysis and understanding. Employing temporal information in deep learning based techniques remains challenging. Although some advances have been made in this direction, mainly involving 3D convolutions, motion-based input features, and deep temporal models such as RNNs and LSTMs, significant further progress is expected in this field.
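To make the RNN-LSTM approach mentioned above concrete, the sketch below runs a minimal LSTM cell over a sequence of per-frame feature vectors and uses the final hidden state as a clip-level representation. This is an illustrative NumPy example under assumed conventions (gate ordering, the `lstm_step` and `encode_video` helper names, and random toy features), not the implementation of any particular paper:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step over a frame feature x.

    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) bias.
    Assumed gate order: input, forget, cell candidate, output.
    """
    z = W @ x + U @ h + b
    H = h.shape[0]
    i = 1.0 / (1.0 + np.exp(-z[:H]))         # input gate
    f = 1.0 / (1.0 + np.exp(-z[H:2 * H]))    # forget gate
    g = np.tanh(z[2 * H:3 * H])              # candidate cell state
    o = 1.0 / (1.0 + np.exp(-z[3 * H:]))     # output gate
    c = f * c + i * g                        # update memory cell
    h = o * np.tanh(c)                       # new hidden state
    return h, c

def encode_video(frames, W, U, b, H):
    """Run the LSTM over per-frame features; the final hidden state
    serves as a simple clip-level descriptor."""
    h = np.zeros(H)
    c = np.zeros(H)
    for x in frames:
        h, c = lstm_step(x, h, c, W, U, b)
    return h

# Toy usage: 8 frames of 16-dim features -> 4-dim clip descriptor.
rng = np.random.default_rng(0)
D, H_dim, T = 16, 4, 8
frames = rng.standard_normal((T, D))
W = 0.1 * rng.standard_normal((4 * H_dim, D))
U = 0.1 * rng.standard_normal((4 * H_dim, H_dim))
b = np.zeros(4 * H_dim)
clip_vec = encode_video(frames, W, U, b, H_dim)
print(clip_vec.shape)  # (4,)
```

In practice the per-frame features would come from a CNN, and the recurrent temporal model is what distinguishes this approach from simple average pooling over frames, since it can capture the order of events in the video.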
