Overview
Out-of-core image processing is necessary for datasets larger than a computer’s main memory. ITK’s pipeline architecture typically buffers the entire image for each filter. This behavior produces memory requirements that are multiples of the dataset size, which inhibits the processing of large datasets. Fortunately, ITK was designed to accommodate the sequential processing of sub-regions of the data object, a process called streaming. Previously, to easily stream data objects one had to use the StreamingImageFilter, which sequentially requests sub-regions, causing the input to stream, and then reassembles the regions into a buffered output image. However, if the dataset exceeds the size of system memory this approach will not work, because the reassembled output is itself fully buffered. For large datasets the entire image must never be in memory at once; therefore pipelines must stream from the reader to the writer.
Section 13.3 of the ITK Software Guide provides a detailed explanation of the internals of how streaming works in ITK’s pipeline execution [1]. For streaming to work correctly, each filter in a pipeline must be able to stream. The PropagateRequestedRegion() phase of the pipeline update is crucial for streaming, because this is where each filter negotiates which region it should process on its input(s) and output(s). Filters incapable of streaming will set their input requested region to the largest possible region. This action makes all upstream filters (those closer to the first filter) process the largest possible region, so they will not actually stream. Consequently, streaming must begin with the writer.
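The effect of a single non-streaming filter can be illustrated with a toy model. The following is only a sketch, not ITK code: ToyRegion, ToyFilter and the free function PropagateRequestedRegion are hypothetical names, regions are one-dimensional, and a streaming filter is assumed to pass the request through unchanged.

```cpp
#include <cassert>
#include <vector>

// A 1-D region is enough to illustrate the idea (hypothetical type).
struct ToyRegion
{
  long index;
  long size;
};

// Hypothetical stand-in for a filter: it either streams or it does not.
struct ToyFilter
{
  bool canStream;
};

// Walk from the filter nearest the writer back to the filter nearest the
// reader and return the region that ends up being requested from the reader.
inline ToyRegion PropagateRequestedRegion(const std::vector<ToyFilter>& pipeline,
                                          ToyRegion requested,
                                          const ToyRegion& largestPossible)
{
  assert(requested.size <= largestPossible.size);
  for (auto it = pipeline.rbegin(); it != pipeline.rend(); ++it)
  {
    if (!it->canStream)
    {
      // A non-streaming filter asks its input for everything, so every
      // filter further upstream also sees the largest possible region.
      requested = largestPossible;
    }
  }
  return requested;
}
```

With a pipeline of only streaming filters, a request for 10 rows reaches the reader as 10 rows; inserting one non-streaming filter anywhere turns the same request into the full extent.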
Implementation
One of the challenges in implementing streaming IO was the need for the readers and writers to negotiate the requested regions with ImageIO classes so that the regions match the ImageIO object’s capabilities. The readers and writers use an object factory to create an ImageIO object for the specified file type at runtime. New methods were added to ImageIOBase to facilitate the negotiation of requested regions. The default implementation does not provide support for streaming. Currently few file types support streaming: Meta Image has robust support for reading and writing, while Nifti currently supports only reading. These changes are essentially complete as of ITK version 3.12.
Support for streaming was added to itkImageFileReader as of ITK version 3.4 [2]. The changes required for the reader were minimal. The main functionality needed was the ability to respond to requested regions and to query the capabilities of an ImageIO object with the request. Filters capable of streaming do so upon request, making the process and their usage relatively transparent. As of ITK 3.12, itkImageFileReader has the same behavior.
Streaming in ImageSeriesReader is also enabled by default. This class uses a series of readers to read each file containing a slice into a higher-dimensional image. The interesting implication for streaming is that even if the ImageIO class does not support streaming, the ImageSeriesReader will have reasonable streaming performance. Entire individual slices may be loaded into memory when only a small fraction of the slice is requested, but for many datasets the slices are of a manageable size. If each slice can individually reside in memory before the smaller requested region is copied to the output image, then we are capable of processing the large dataset.
Unlike the readers, the ImageFileWriter class can drive the pipeline to stream, similar to StreamingImageFilter. To enable streaming, SetNumberOfStreamDivisions(unsigned int) must be called with a number greater than one. As this method is only a request to stream, the ImageIO type must also support streaming. The method is a request because the writer does not buffer each sub-region; it only passes a streamable region to the ImageIO object for output. One of the important parts of streaming is how the region is broken up into sub-regions. It is likely that the ImageIO object will have certain restrictions on what it can write, perhaps only whole slices. To accommodate the restrictions, an interface between the writer and the ImageIO object was implemented. This interface allows the ImageIO object to determine the actual number of divisions and how the region is divided for streaming and writing.
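The request-versus-actual distinction can be sketched as follows. This is a simplified model, not the ImageIOBase interface: ToyImageIO, its fields and ActualStreamDivisions are hypothetical, and the only restriction modeled is an ImageIO that writes whole slices.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical description of what an ImageIO object is able to write.
struct ToyImageIO
{
  bool         canStreamWrite; // false models the default, non-streaming behavior
  unsigned int sliceCount;     // extent along the slowest-varying axis
};

// The writer passes its requested number of divisions; the ImageIO answers
// with the number it can actually honor.
inline unsigned int ActualStreamDivisions(const ToyImageIO& io, unsigned int requested)
{
  assert(requested >= 1);
  if (!io.canStreamWrite)
  {
    return 1; // everything is written in a single piece
  }
  // An IO limited to whole slices cannot use more pieces than slices.
  return std::min(requested, io.sliceCount);
}
```

In this model a request for 200 divisions is honored for an image with 1878 slices, reduced to 50 for an image with only 50 slices, and reduced to 1 when the ImageIO cannot stream at all.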
In addition to streaming an entire image piece by piece, another type of streaming has been defined for writing, called pasting. Pasting is the writing of a sub-region to a file that contains the entire image. If the file does not exist a new one will be created; otherwise the meta-data and image type must match what is in the file. If neither of these conditions is met, or if the ImageIO is not capable of pasting, an exception will be thrown. The pasting operation is designed to be scalable to large files, where rewriting the entire file would be too costly.
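The rules for pasting described above can be written out as a small decision function. This is a sketch of the stated rules only, not the writer’s implementation: ToyHeader, PasteAction and DecidePasteAction are hypothetical, and "meta-data and image type must match" is reduced here to comparing image size and pixel type.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical, minimal description of an image file header.
struct ToyHeader
{
  unsigned long width;
  unsigned long height;
  std::string   pixelType;
};

enum class PasteAction { CreateNewFile, UpdateExistingFile };

// Decide what a paste request should do, throwing when it cannot proceed.
inline PasteAction DecidePasteAction(bool ioCanPaste, bool fileExists,
                                     const ToyHeader& inFile, const ToyHeader& toWrite)
{
  assert(toWrite.width > 0 && toWrite.height > 0);
  if (!ioCanPaste)
  {
    throw std::runtime_error("ImageIO is not capable of pasting");
  }
  if (!fileExists)
  {
    return PasteAction::CreateNewFile; // inFile is ignored in this case
  }
  const bool matches = inFile.width == toWrite.width &&
                       inFile.height == toWrite.height &&
                       inFile.pixelType == toWrite.pixelType;
  if (!matches)
  {
    throw std::runtime_error("existing file does not match the image being pasted");
  }
  return PasteAction::UpdateExistingFile;
}
```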
Visible human examples
The Insight Toolkit was originally motivated by a need for software tools to segment and register the National Library of Medicine’s Visible Human Project datasets. The data is freely available through NLM’s website [3]. The original Visible Male cryosectional images are non-interlaced 24-bit RGB pixels with a resolution of 2048x1216 pixels by 1871 slices, and a physical spacing of approximately 0.33 mm within each slice and 1.0 mm between slices. These dimensions result in about 13 GB of data, which is an appropriate size to demonstrate streaming. The following two examples of streaming show the three IO classes capable of streaming along with the two types of streaming supported by the writer.
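The size quoted above follows directly from the image dimensions. As a quick check, assuming 3 bytes per 24-bit RGB pixel and no header or padding overhead (the helper names are illustrative only):

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold a whole volume in memory at once.
inline std::uint64_t VolumeBytes(std::uint64_t width, std::uint64_t height,
                                 std::uint64_t slices, std::uint64_t bytesPerPixel)
{
  assert(bytesPerPixel > 0);
  return width * height * slices * bytesPerPixel;
}

// Convert a byte count to binary gigabytes (GiB).
inline double ToGiB(std::uint64_t bytes)
{
  return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

VolumeBytes(2048, 1216, 1871, 3) evaluates to 13,978,435,584 bytes, which is roughly 13.0 GiB, before any filter in a buffered pipeline makes its own copy.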
Streaming
A coronal slice is a classic view of the Visible Male. The following is an example that reads the entire raw dataset and generates that image:
This example creates a RawImageIO and ImageSeriesReader for each color channel in the data. Notice that there are no special methods needed to enable streaming; streaming is in response to requests from the pipeline. In the ComposeRGBImageFilter, the channels are composited into a single color image. Then the information is updated to initialize the coronal slice region to be extracted. The final filter, ImageFileWriter, writes out the file as a Meta Image type, which fully supports IO streaming.
The performance of this code is excellent, in both speed and memory usage. On a MacBook Pro laptop execution was completed within 3 minutes, while on an Intel dual quad-core Mac Pro workstation it finished in less than 30 seconds. For memory, only about 50 megabytes were used.
The most interesting aspect of this example is not the filters used, but how ITK’s pipeline manages its execution. The final output image is 2048 by 1878 pixels. The ImageFileWriter breaks this 2D image into 200 separate regions, each about 2048 by 10 pixels. Each region is then streamed and processed through the pipeline. The writer makes 200 calls to its ImageIO object to write the individual regions. The extractor converts this 2D region into a 3D region of 2048 by 1 by 10 pixels, which is propagated to the ImageSeriesReader. Then the reader reads the entire slice, but only copies the requested sub-region to its output. This pipeline is so efficient because very little data is actually processed at any one stage of the pipeline due to streaming IO.
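The region arithmetic in this walk-through can be sketched in plain C++. This is an illustration under assumptions, not ITK’s region splitter: the types and functions are hypothetical, the image is split along rows into strips whose heights differ by at most one row, and the coronal plane is assumed to sit at a fixed index along the second axis of the volume.

```cpp
#include <cassert>
#include <vector>

struct Region2D { long index[2]; long size[2]; };
struct Region3D { long index[3]; long size[3]; };

// Split along rows (axis 1) into exactly `pieces` strips.
inline std::vector<Region2D> SplitRows(const Region2D& whole, long pieces)
{
  assert(pieces > 0 && pieces <= whole.size[1]);
  std::vector<Region2D> strips;
  const long base  = whole.size[1] / pieces; // rows every strip gets
  const long extra = whole.size[1] % pieces; // first `extra` strips get one more
  long row = whole.index[1];
  for (long i = 0; i < pieces; ++i)
  {
    const long height = base + (i < extra ? 1 : 0);
    strips.push_back({{whole.index[0], row}, {whole.size[0], height}});
    row += height;
  }
  return strips;
}

// Map a strip of the 2D coronal image to the 3D region it needs from the
// volume: output (x, row) corresponds to volume (x, fixedY, z = row).
inline Region3D ToVolumeRegion(const Region2D& strip, long fixedY)
{
  return {{strip.index[0], fixedY, strip.index[1]},
          {strip.size[0], 1, strip.size[1]}};
}
```

For a 2048 by 1878 image and 200 pieces, this scheme yields strips of 10 or 9 rows, and the first strip maps to a 2048 by 1 by 10 region of the volume, matching the shape described above.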
Pasting
Pasting enables the writing of a sub-region to a file. This example updates a small portion of the 2D coronal slice. The file streamed_paste_vm.mha may either not exist or be a copy of the output of the previous example [4].
Below we begin by creating a reader for the file just written that is capable of streaming.
The pipeline is continued through a gradient magnitude filter which works on vector images to produce a scalar output. Then a color image is recreated by compositing the output as red, green and blue channels.
Next we begin to specify the paste region, by creating an ImageIORegion that is half the size and centered on the entire image. The ImageIORegion class is similar to the ImageRegion class except that it is not templated over the image dimension because of the runtime nature of IO.
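The geometry of such a paste region can be sketched without ITK. ToyIORegion and CenteredHalf below are hypothetical; the region stores its dimension at runtime in the same spirit as ImageIORegion, and integer division is assumed to round down when a size is odd.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical runtime-dimensioned region: one index and size per axis.
struct ToyIORegion
{
  std::vector<long> index;
  std::vector<long> size;
};

// Build a region half the size of `largest` along every axis and centered
// within it.
inline ToyIORegion CenteredHalf(const ToyIORegion& largest)
{
  assert(largest.index.size() == largest.size.size());
  ToyIORegion half;
  for (std::size_t d = 0; d < largest.size.size(); ++d)
  {
    const long halfSize = largest.size[d] / 2;
    half.size.push_back(halfSize);
    // Leave equal margins (up to rounding) on both sides.
    half.index.push_back(largest.index[d] + (largest.size[d] - halfSize) / 2);
  }
  return half;
}
```

For a 2048 by 1878 image starting at index (0, 0), this gives a 1024 by 939 region starting at index (512, 469).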
After using an adaptor to convert the color image into a vector image (so that the pixel type will match the type in the file) we create a writer. Here both streaming and pasting are used. To enable pasting, a call to SetIORegion is made with a valid region. Finally, the pipeline is updated, causing the streaming of regions.
This pasting example only writes the small halfIO region to the file; the remainder is not touched. The manner in which the pipeline executes is very similar to the previous streaming example. The main difference is that the writer only breaks up the IORegion for streaming, not the entire image. The other difference is that the reader fully supports streaming and only reads the required region from the file.

The resulting image after pasting the gradient magnitude onto a coronal cross section of the Visible Male. This image was also run through a ShrinkImageFilter to make the pixels square.
Future work and conclusions
Currently only the Meta Image format fully supports streaming; however, with the current framework, other formats can also be upgraded to support streaming. The ImageSeriesWriter could also implement streaming of sequences of files when a single output file is not desired. Many filters do not currently support streaming, and as such, filter modification may be required to support streaming.
The lack of streaming IO was a bottleneck for processing very large data. Our examples have shown that out-of-core processing can now be performed on datasets such as the Visible Human Project and large microscopy scans.
Acknowledgements
Much of this work has been accomplished through the collaborative weeks provided by NAMIC. The author would also like to thank Luis Ibáñez and Stephen Aylward for their help with ITK and the MetaIO library, along with David T. Chen for his valuable feedback on this article.
Footnotes:
Bradley Lowekamp is a Lockheed Martin Software Engineer Contractor for the Office of High Performance Computing and Communications at the National Library of Medicine. He is part of the intramural research program where he utilizes ITK to analyze solid dose pharmaceuticals, microscopy data and the Visible Human Project collections.