Computer Vision

Our computer vision team is a leader in creating cutting-edge algorithms and software for automated image and video analysis. Our solutions harness the power of artificial intelligence and deep learning to address challenging problems in converting pixels and other data streams into actionable information. We operate within many domains – land, sea, air, and space – providing significant value to government agencies, commercial organizations, and academic institutions worldwide. Our team understands the difficulties in extracting, interpreting, and utilizing information across images, video, other sensors (MSI, HSI, SAR, LiDAR), metadata, and text, and we recognize the need for robust, affordable solutions. We also seek to advance the fields of AI, computer vision, and deep learning through research and development and collaborative projects that build on our open source software tools, such as the Video and Image Analytics for Marine Environments (VIAME) toolkit and TeleSculptor.

Our customers include the Defense Advanced Research Projects Agency (DARPA), the Intelligence Advanced Research Projects Activity (IARPA), the Air Force Research Laboratory (AFRL), the National Oceanic and Atmospheric Administration (NOAA), and commercial companies. We have worked with these and other agencies to deploy operational systems in various domains, including a Wide Area Motion Imagery (WAMI) real-time tracking system for Intelligence, Surveillance, and Reconnaissance (ISR) in theater. Our team develops and transitions advanced sensor exploitation capabilities that focus on the fusion of sensors, platforms, and people. We also work on customized software R&D projects for small and large private companies, such as Lockheed Martin and Raytheon.

Conferences and Events


Find out more about the USGIF’s Geospatial Intelligence Symposium

ICCV 2021

Find out more about the International Conference on Computer Vision

Event Archive

Learn more about previous conferences and events Kitware has taken part in

Areas of Focus

Deep Learning

Through our extensive experience in AI and our early adoption of deep learning, we have made dramatic improvements in object detection, recognition, tracking, activity detection, semantic segmentation, and content-based retrieval. In doing so, we have addressed different customer domains with unique data, as well as operational challenges and needs. Our expertise focuses on hard visual problems, such as low resolution, very small training sets, rare objects, long-tailed class distributions, large data volumes, real-time processing, and onboard processing. Each of these requires us to creatively utilize and expand upon deep learning, which we apply to our other computer vision areas of focus to deliver more innovative solutions.

Interactive Do-It-Yourself Artificial Intelligence

DIY AI enables end users – analysts, operators, engineers – to rapidly build, test, and deploy novel AI solutions without expertise in machine learning or even computer programming. Using Kitware’s interactive DIY AI toolkits, you can easily and efficiently train object classifiers using interactive query refinement, without drawing any bounding boxes. You can interactively improve existing capabilities to work optimally on your data for tasks such as object tracking, object detection, and event detection. Our toolkits also allow you to perform customized, highly specific searches of large image and video archives powered by cutting-edge AI methods. Currently, our DIY AI toolkits, such as VIAME, are used by scientists to analyze environmental images and video. Our defense-related versions are being used to address multiple domains and are provided with unlimited rights to the government. These toolkits enable long-term operational capabilities even as methods and technology evolve over time.

Explainable and Ethical Artificial Intelligence (AI)

Integrating AI via human-machine teaming can greatly improve capabilities and competence as long as the team has a solid foundation of trust. To trust your AI partner, you must understand how the technology makes decisions and feel confident in those decisions. Kitware has developed powerful tools to explore, quantify, and monitor the behavior of deep learning systems. Our team is making deep neural networks explainable and robust when faced with previously unknown conditions. In addition, our team is stepping outside of classic AI systems to address domain-independent novelty identification, characterization, and adaptation so that systems can acknowledge the introduction of unknowns. We also value the need to understand the ethical concerns, impacts, and risks of using AI. That’s why Kitware is developing methods to understand, formulate, and test ethical reasoning algorithms for semi-autonomous applications.

Object Detection, Recognition and Tracking

Our video object detection and tracking tools are the culmination of years of continuous government investment. Deployed operationally in various domains, our mature suite of trackers can identify and track moving objects in many types of intelligence, surveillance, and reconnaissance (ISR) data, including video from ground cameras, aerial platforms, underwater vehicles, robots, and satellites. These tools perform in challenging settings and address difficult factors, such as low contrast, low resolution, moving cameras, occlusions, shadows, and high traffic density, through multi-frame track initialization, track linking, reverse-time tracking, recurrent neural networks, and other techniques. Our trackers can perform difficult tasks, including ground camera tracking in congested scenes, real-time multi-target tracking in full-field WAMI and overhead persistent infrared (OPIR) data, and tracking people in far-field, non-cooperative scenarios.
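As a rough illustration of the data association step that sits underneath any multi-target tracker, the sketch below greedily matches existing tracks to new detections by bounding-box overlap (intersection over union). It is a generic textbook technique, not Kitware's tracker; the function names, the box format, and the 0.3 overlap threshold are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels; assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks: Dict[int, Box], detections: List[Box],
              min_iou: float = 0.3) -> Dict[int, int]:
    """Greedily match each track to at most one detection, best overlap first.

    Returns a mapping of track id -> index into `detections`.
    """
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if tid in matches or j in used:
            continue
        matches[tid] = j
        used.add(j)
    return matches


# Hypothetical data: two existing tracks and three detections in the next frame.
tracks = {1: (10, 10, 50, 50), 2: (100, 100, 140, 140)}
detections = [(102, 101, 142, 141), (12, 11, 52, 51), (300, 300, 320, 320)]
matches = associate(tracks, detections)
```

Here the third detection overlaps no track and stays unmatched; in a full tracker it would be a candidate for starting a new track, which is where techniques such as multi-frame track initialization come in.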

Complex Activity, Event, and Threat Detection

Kitware’s tools recognize high-value events, salient behaviors and anomalies, complex activities, and threats through the interaction and fusion of low-level actions and events in dense, cluttered environments. Operating on tracks from WAMI, full-motion video (FMV), moving target indicator (MTI), or other sources, these algorithms characterize, model, and detect actions, such as people picking up objects and vehicles starting or stopping, along with complex threat indicators, such as people transferring between vehicles and multiple vehicles meeting. Many of our tools feature alerts for behaviors, activities, and events of interest, as well as highly efficient search through huge data volumes, such as full-frame WAMI missions, using approximate matching. This allows you to identify actions in massive video streams and archives and detect threats despite missing data, detection errors, and deception.

Cyber-Physical Systems (CPS)

The physical environment presents a unique, ever-changing set of challenges to any sensing and analytics system. Kitware has designed and constructed state-of-the-art cyber-physical systems that perform onboard, autonomous processing to gather data and extract critical information. Computer vision and deep learning technology allow our sensing and analytics systems to overcome the challenges of a complex, dynamic environment. They are customized to solve real-world problems in aerial, ground, and underwater scenarios. These capabilities have been field-tested and proven successful in programs funded by R&D organizations such as DARPA, AFRL, and NOAA. 

Dataset Collection and Annotation

The growth in deep learning has increased the demand for quality, labeled datasets needed to train models and algorithms. The power of these models and algorithms greatly depends on the quality of the training data available. Kitware has developed and cultivated dataset collection, annotation, and curation processes to build powerful capabilities that are unbiased and accurate, and not riddled with errors or false positives. Kitware can collect and source datasets and design custom annotation pipelines. We can annotate image, video, text and other data types using our in-house, professional annotators, some of whom have security clearances, or leverage third-party annotation resources when appropriate. Kitware also performs quality assurance that is driven by rigorous metrics to highlight when returns are diminishing. All of this data is managed by Kitware for continued use to the benefit of our customers, projects, and teams. 

Image and Video Forensics

In this new age of disinformation, it has become critical to validate the integrity and veracity of images, video, and other sources. As photo manipulation and photo generation techniques evolve rapidly, we are continuously developing algorithms that automatically detect image and video manipulation and can operate at scale on large data archives. These advanced deep learning algorithms give us the ability to detect inserted, removed, or altered objects, distinguish deepfakes from real images, and identify deleted or inserted frames in videos in a way that exceeds human performance. We continue to extend this work through multiple government programs to detect manipulations in falsified media exploiting text, audio, images, and video.

3D Reconstruction, Point Clouds, and Odometry

Kitware’s algorithms can extract 3D point clouds and surface meshes from video or images without metadata or calibration information, and can exploit that information when it is available. Operating on these 3D datasets or others from LiDAR and other depth sensors, our methods estimate scene semantics and 3D reconstruction jointly to maximize the accuracy of object classification, visual odometry, and 3D shape. Our open source 3D reconstruction toolkit, TeleSculptor, is continuously evolving to incorporate advancements to automatically analyze, visualize, and make measurements from images and video. LiDARView, another open source toolkit developed specifically for LiDAR data, performs 3D point cloud visualization and analysis, fusing data, techniques, and algorithms to produce simultaneous localization and mapping (SLAM) and other capabilities.

Super Resolution and Enhancement

Kitware’s super-resolution techniques enhance single or multiple images to produce higher-resolution, improved images. We use novel methods to compensate for widely spaced views and illumination changes in overhead imagery, particulates and light attenuation in underwater imagery, and other challenges in a variety of domains. The resulting higher-quality images enhance detail, enable advanced exploitation, and improve downstream automated analytics, such as object detection and classification.

Scene Understanding

Kitware’s knowledge-driven scene understanding capabilities use deep learning techniques to accurately segment scenes into object types. In video, our unique approach defines objects by behavior, rather than appearance, so we can identify areas with similar behaviors. By observing mover activity, our capabilities can segment a scene into functional object categories that may not be distinguishable by appearance alone. These capabilities are unsupervised, so they automatically learn new functional categories without any manual annotations. Semantic scene understanding improves downstream capabilities such as threat detection, anomaly detection, change detection, 3D reconstruction, and more.

Featured Programs

For more than ten years, Kitware has been involved in major national R&D programs funded by the Defense Advanced Research Projects Agency (DARPA) and the Intelligence Advanced Research Projects Activity (IARPA). On these highly competitive, high-profile programs, Kitware has developed state-of-the-art algorithms, integrated them into prototype systems, supported third-party evaluations, and coordinated large teams with multiple universities and industry subcontractors. A selection of our prominent R&D programs is described here.


Media Forensics

DARPA’s Media Forensics program aims to develop a platform to automatically detect video and image manipulations, provide details about how the manipulations were performed, and investigate the overall integrity of visual media.


Video and Image Analytics for Marine Environments

VIAME is an open source software platform designed for do-it-yourself AI for analyzing imagery and video. Originally developed for analytics in the maritime domain, it now contains many generic algorithms and capabilities that apply to virtually any video or image domain.


Deep Intermodal Video Analytics

Funded by IARPA, the DIVA program is developing automatic activity detection for multi-camera streaming video scenes, including person actions and person-vehicle interactions, in both indoor and outdoor locations.


Explainable AI

Funded by DARPA, the Explainable AI (XAI) program aims to create a suite of machine learning techniques that will produce more explainable models while maintaining a high level of accuracy and enabling human users to understand, appropriately trust, and manage their AI partners. In addition to being a performer, Kitware has created the Explainable AI Toolkit (XAITK) to integrate XAI contributions into an open source toolkit.


Computation of Operationally-Realistic 3D Datasets

Funded by IARPA, the CORE3D program developed technology that automatically generates accurate 3D object models using satellite imagery. Kitware focused on converting 3D point clouds to low-complexity 3D surface meshes, and recognizing surface material types.


Video and Image Retrieval and Analysis Toolkit

Funded by DARPA, the VIRAT program developed capabilities for automatic detection and real-time alerts of events and human actions in aerial surveillance video, both visible and infrared (IR). It also indexed descriptors of the detected events into a database to enable subsequent search for similar and related events.

What We Offer for Your Project

We provide custom research and software development, collaboration, support, training, and books to help in the above areas of focus.


We have over 300 publications in journals, conference proceedings, etc., on topics such as convolutional neural networks, deep learning, image and video analytics, and situational awareness.

Computer Vision Platform


Through funding and guidance from the National Oceanic and Atmospheric Administration’s (NOAA) Automated Image Analysis Strategic Initiative (AIASI), Kitware has developed the Video and Image Analytics for Marine Environments (VIAME) toolkit. VIAME is an open source software platform for do-it-yourself artificial intelligence that enables end users to customize cutting-edge deep learning methods for their specific problems, without any programming or knowledge of how AI works. Used by dozens of marine science labs around the world, it is an evolving toolkit that contains many workflows for generating specialized object detectors, full-frame classifiers, and image mosaics; for performing image and video search combined with rapid detector generation; and for stereo measurement, 3D extraction, and camera calibration.


TeleSculptor is a cross-platform desktop application for photogrammetry. It was designed with a focus on aerial video, such as video collected from UAVs. It can handle geospatial coordinates and can make use of metadata from GPS and IMU sensors. The software can also work with non-geospatial data and with collections of images. TeleSculptor uses Structure-from-Motion techniques to estimate camera parameters as well as a sparse set of 3D landmarks. It uses Multiview Stereo techniques to estimate dense depth maps on key frames, and then fuses those depth maps into a consistent surface mesh which can be colored from the source imagery.
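To make the idea of estimating sparse 3D landmarks concrete, here is a minimal sketch of linear (DLT) triangulation, one standard building block of Structure-from-Motion pipelines. It recovers a single 3D point from its pixel locations in two views with known camera matrices. This is a generic NumPy illustration and does not use or reproduce TeleSculptor's code; the camera intrinsics, the camera poses, and the helper names `triangulate` and `project` are assumptions made for the example.

```python
import numpy as np


def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2 are 3x4 projection matrices; x1, x2 are (u, v) pixel observations.
    """
    # Each observation contributes two linear constraints on the homogeneous point.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def project(P, X):
    """Project a 3D point into pixel coordinates with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Hypothetical setup: shared intrinsics, second camera shifted 1 unit along +x.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

landmark = np.array([0.5, -0.2, 5.0])
estimate = triangulate(P1, P2, project(P1, landmark), project(P2, landmark))
```

With noise-free observations the estimate matches the original landmark to numerical precision. Real pipelines triangulate from many noisy views and then refine cameras and landmarks together, typically with bundle adjustment.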


The Social Media Query Toolkit (SMQTK) is a scalable framework with bundled algorithms for indexing, searching, and query refinement on images and video clips. It contains a user-driven search workflow, known as interactive query refinement (IQR), to easily and quickly refine search results through positive and negative adjudication to locate information of interest. SMQTK includes a web-based IQR graphical user interface plus RESTful services.
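The refinement loop can be pictured with a small relevance-feedback sketch: items the user marks as positive pull the query toward them, items marked negative push it away, and the archive is re-ranked after each round. The code below uses a simple Rocchio-style update on unit-normalized descriptors. It is an illustration of the general idea only and does not use or mirror SMQTK's API or its actual ranking method; the `rank` function and the synthetic descriptors are assumptions.

```python
import numpy as np


def rank(descriptors, positives, negatives):
    """Rank archive items by cosine similarity to a Rocchio-style query vector.

    `positives` and `negatives` are lists of row indices the user has adjudicated.
    """
    unit = descriptors / np.linalg.norm(descriptors, axis=1, keepdims=True)
    query = unit[positives].mean(axis=0)
    if negatives:
        query = query - unit[negatives].mean(axis=0)
    scores = unit @ query
    return np.argsort(-scores)  # best match first


rng = np.random.default_rng(0)
# Hypothetical archive: items 0-4 cluster around one direction, items 5-9 around another.
archive = np.vstack([
    rng.normal([1.0, 0.0, 0.0], 0.05, size=(5, 3)),
    rng.normal([0.0, 1.0, 0.0], 0.05, size=(5, 3)),
])

first_pass = rank(archive, positives=[0], negatives=[])      # one example image
second_pass = rank(archive, positives=[0, 1], negatives=[5])  # after one round of feedback
```

In an interactive tool the user would keep adjudicating the top results, so each round narrows the ranking toward the items of interest without anyone drawing bounding boxes or writing code.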


The Kitware Image and Video Exploitation and Retrieval (KWIVER) toolkit is an open source, production-quality image and video exploitation platform. It comes with unrestricted licensing to engage the full-spectrum video research and development community, including academic, industry, and government organizations. KWIVER provides advanced capabilities in video object detection and tracking, in addition to supporting tools for quantitative evaluation, pipeline processing, software builds, and more.