THE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV)
Hosted by the IEEE Computer Society and the Computer Vision Foundation
January 3-7, 2023 at the Waikoloa Beach Marriott Resort in Waikoloa, Hawaii
WACV is a premier international computer vision conference that attracts vision researchers and practitioners from around the world. Unlike many academic conferences, WACV emphasizes papers on systems and applications with significant, interesting vision components, and it is highly selective, with fewer than 30% of submissions accepted.
Kitware has supported WACV over the past few years as a sponsor, exhibitor, and presenter. This year, we are a Gold-level sponsor and will have an in-person exhibit space where we will highlight our ongoing research. Visit us to learn how we apply computer vision to solve challenging problems across sea, air, space, terrestrial, and internet domains. For those who can’t attend, we have all of our resources available on our WACV 2023 landing page.
Kitware’s Activities and Involvement
We are proud to have three papers accepted at WACV 2023. “Reconstructing Humpty Dumpty: Multi-feature Graph Autoencoder for Open Set Action Recognition,” written by Kitware’s Dawei Du, Ameya Shringi, Anthony Hoogs, and Christopher Funk, focuses on the open set problem for action recognition, where test samples may be drawn from either known or unknown classes. Existing open set action recognition methods typically extend closed set methods with post hoc analysis of classification scores or feature distances, and they do not capture the relations among all the video clip elements. Our approach uses reconstruction error to determine the novelty of a video: clips from unknown classes are harder to put back together and therefore have higher reconstruction error than clips from known classes. Our solution is a novel graph-based autoencoder that accounts for contextual and semantic relations among the clip pieces for improved reconstruction.
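The core decision rule described above can be sketched in a few lines. This is an illustrative toy example, not Kitware’s implementation: the autoencoder is abstracted as a callable, and the threshold value is hypothetical.

```python
# Toy sketch of open set recognition via reconstruction error.
# The autoencoder and threshold here are illustrative placeholders.
import numpy as np

def reconstruction_error(features: np.ndarray, reconstructed: np.ndarray) -> float:
    """Mean squared error between original and reconstructed clip features."""
    return float(np.mean((features - reconstructed) ** 2))

def classify_open_set(features, autoencoder, threshold=0.5):
    """Label a clip 'unknown' if the autoencoder reconstructs it poorly."""
    reconstructed = autoencoder(features)
    err = reconstruction_error(features, reconstructed)
    return ("unknown" if err > threshold else "known"), err

# Demo: an identity "autoencoder" reconstructs familiar data perfectly,
# so the clip is judged to belong to a known class.
feats = np.ones((4, 8))
label, err = classify_open_set(feats, lambda x: x, threshold=0.5)
print(label, err)  # -> known 0.0
```

In the paper, the reconstruction operates on a graph of clip elements rather than a flat feature vector, but the same principle applies: higher reconstruction error signals a sample drawn from outside the known classes.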
Our second paper, “MEVID: Multi-view Extended Videos with Identities for Video Person Re-Identification,” was written by Daniel Davila, Dawei Du, Bryon Lewis, Christopher Funk, Joseph Van Pelt, Roderic Collins, Kellie Corona, Matt Brown, Scott McCloskey, Anthony Hoogs, and Brian Clipp from Kitware. In this paper, we present the Multi-view Extended Videos with Identities (MEVID) dataset for large-scale, video person re-identification (ReID) in the wild. To our knowledge, MEVID represents the most-varied video person ReID dataset, spanning extensive indoor and outdoor environments across nine unique dates in a 73-day window, various camera viewpoints, and entity clothing changes. While other datasets have more unique identities, MEVID emphasizes a richer set of information about each individual. Because MEVID is based on the MEVA video dataset, it also inherits data that is intentionally demographically balanced to the continental United States. To accelerate the annotation process, we developed a semi-automatic annotation framework and GUI that combines state-of-the-art, real-time models for object detection, pose estimation, person ReID, and multi-object tracking.
In the third paper, “Handling Image and Label Resolution Mismatch in Remote Sensing,” we explore one of the unique challenges to semantic segmentation in the remote sensing domain: differing ground sample distances. Authors Scott Workman, Armin Hadzic, and M. Usman Rafique (Kitware) explain how these differences result in a resolution mismatch between overhead imagery and ground-truth label sources. We present a supervised method using low-resolution labels (without upsampling) that takes advantage of an exemplar set of high-resolution labels to guide the learning process. Our method incorporates region aggregation, adversarial learning, and self-supervised pre-training to generate fine-grained predictions without requiring high-resolution annotations. Extensive experiments demonstrate the real-world applicability of our approach.
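The region aggregation idea can be illustrated with a simple average-pooling sketch: fine-grained predictions are averaged over blocks so they can be compared against labels at a coarser ground sample distance. This is a hedged illustration under simplified assumptions (single-channel predictions, an integer resolution factor), not the paper’s exact formulation.

```python
# Illustrative sketch of region aggregation for resolution mismatch:
# pool fine predictions down to the resolution of the coarse labels.
import numpy as np

def aggregate_predictions(pred: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool an (H, W) prediction map by `factor` in each dimension,
    matching a label map whose ground sample distance is `factor`x coarser."""
    h, w = pred.shape
    assert h % factor == 0 and w % factor == 0, "map must tile evenly"
    return pred.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# A 4x4 high-resolution prediction aggregated to a 2x2 label grid:
# each coarse cell is the mean of its 2x2 block of fine predictions.
pred = np.arange(16, dtype=float).reshape(4, 4)
coarse = aggregate_predictions(pred, 2)
print(coarse)  # -> [[ 2.5  4.5]
               #     [10.5 12.5]]
```

A training loss computed between `coarse` and the low-resolution labels lets the network learn fine-grained outputs without ever seeing high-resolution annotations for those images.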
In addition to these papers, Kitware is co-chairing the following workshops:
Co-organizer: Matthew Dawkins
January 3, Full Day
Over the past few years, many computer vision applications have emerged in the maritime and freshwater domains. Autonomous vehicles have made maritime environments more accessible, offering the potential for automation on busy waterways and shipping routes as well as in airborne applications. Computer vision plays an essential role in accurate navigation when operating these vehicles in busy traffic or close to shore. This workshop aims to bring together maritime computer vision researchers and promote deploying modern computer vision approaches in airborne and surface water domains.
January 3, Full Day
Computer vision algorithms are often developed inside a closed-world paradigm (e.g. recognizing objects from a fixed set of categories). However, the real world is open and constantly changing. Most computer vision algorithms do not look for change and continue to perform their tasks with incorrect and sometimes misleading predictions as the world evolves. Many real-world applications considered at WACV must deal with changing worlds where a variety of novelty is introduced, such as new classes of objects. In this workshop, we aim to facilitate research directions that operate well in the open world while maintaining performance in the closed world. We will explore mechanisms to measure competence at recognizing and adapting to novelty.
Co-chair: Anthony Hoogs, Ph.D.
January 3, Full Day
This workshop will cover the application of computer vision in real-world video surveillance, the challenges associated with this surveillance, and mitigation strategies in areas such as object detection, scene understanding, and super-resolution. The workshop will also address legal and ethical issues of computer vision applications in these real-world scenarios, for example, detecting bias toward gender or race.
Co-chair: Scott McCloskey, Ph.D.
January 7, Full Day
Vision-based recognition in uncontrolled environments has been a topic of interest for researchers for decades. The addition of large standoff (lateral and/or vertical) distances between sensing platforms and the objects being sensed adds new challenges to this area. This workshop will cover some of the focused research programs addressing this issue, along with the supporting challenges of data collection, data curation, etc., and highlight architectures for multimodal recognition that appear to offer the potential for strong performance. The workshop also aims to build an implicit consensus on current best practices for data-related matters and to identify topics that need renewed attention.