Scene Understanding

Scene understanding in video is an emerging problem in visual surveillance and in video understanding more broadly. Kitware is working to create solutions in this area, including functional object recognition. Functional object recognition is the ability to identify objects with specific purposes, such as a postman or a delivery truck, that are defined more by their actions and behaviors than by their appearance. We are developing an approach for content-based learning and recognition of the function of moving objects, given video-derived tracks. In particular, we have determined that the semantic behaviors of movers can be captured in a location-independent manner by attributing them with features that encode their relations and actions with respect to scene contexts. Scene contexts are local scene regions with distinct functions, such as doorways and parking spots, with which moving objects often interact. Based on these representations, functional models are learned from examples and then used to identify novel instances in unseen data.
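
To make the idea of location-independent features concrete, here is a minimal Python sketch. It is an illustration only and not the method from our papers: the `SceneContext` class, the circular region model, the `context_features` function, and the three features per context type (starts inside, ends inside, fraction of time inside) are all assumptions chosen for simplicity. The point it shows is that a track is described by how it relates to labeled scene contexts rather than by its absolute image coordinates.

```python
import math
from dataclasses import dataclass


@dataclass
class SceneContext:
    """Hypothetical scene context: a labeled circular region of the scene."""
    label: str      # e.g. "doorway" or "parking_spot"
    x: float
    y: float
    radius: float


def context_features(track, contexts, labels):
    """Describe a track by its relation to each context type.

    track    : list of (x, y) points in time order (assumed non-empty)
    contexts : list of SceneContext regions for this scene
    labels   : context types to encode, in a fixed order

    Returns three numbers per label: whether the track starts inside a
    region of that type, whether it ends inside one, and the fraction of
    track points that fall inside one.
    """
    features = []
    for label in labels:
        regions = [c for c in contexts if c.label == label]

        def inside(pt, regions=regions):
            return any(
                math.hypot(pt[0] - c.x, pt[1] - c.y) <= c.radius
                for c in regions
            )

        starts = 1.0 if inside(track[0]) else 0.0
        ends = 1.0 if inside(track[-1]) else 0.0
        fraction = sum(1 for p in track if inside(p)) / len(track)
        features.extend([starts, ends, fraction])
    return features


# Two scenes with the same layout at different absolute positions.
LABELS = ["doorway", "parking_spot"]

scene_a = [SceneContext("doorway", 0, 0, 1), SceneContext("parking_spot", 10, 0, 1)]
track_a = [(10, 0), (5, 0), (0, 0)]        # parking spot -> doorway

scene_b = [SceneContext("doorway", 100, 50, 1), SceneContext("parking_spot", 110, 50, 1)]
track_b = [(110, 50), (105, 50), (100, 50)]  # same behavior, shifted scene

features_a = context_features(track_a, scene_a, LABELS)
features_b = context_features(track_b, scene_b, LABELS)
```

In this sketch the two tracks receive identical feature vectors even though their coordinates differ, because only the relation to doorways and parking spots is encoded. Feature vectors of this kind could then be passed to any standard learning method to build functional models from examples.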

We have written and presented several papers in this area, including “Functional Scene Element Recognition for Video Scene Analysis,” presented at the Workshop on Motion and Video Computing in December 2009, and “Unsupervised Learning of Functional Categories in Video Scenes,” presented at the European Conference on Computer Vision (ECCV) in September 2010.