ParaView's ability to process and visualize massive datasets is based on VTK's streaming functionality. In ParaView, identical copies of the visualization pipeline are run on many machines and each pipeline is asked to process a different small portion, or piece, of the input data. Together the machines process the entire dataset simultaneously, and no machine ever exceeds the capacity of its local RAM.

 

Refinement at work in the analysis of a 3600x2400x42 chlorofluorocarbon (CFC) concentration simulation study, performed on a 32-bit laptop PC. Yellow outlines identify “pieces”. At this point, ParaView has progressed seven levels down into a nine-level-deep piece refinement tree. Blue outlines show individual cells. At the lowest level, cells are already at sub-pixel resolution.

VTK's ability to break up data is called streaming, because when the data can be divided it is also possible to iterate over the pieces. In data parallel processing you stretch the problem over a larger amount of RAM whereas in streaming you stretch the problem over a longer time. Iterative streamed rendering proceeds, for example, by rendering each piece in turn without clearing the color or depth buffer, using the persistent Z-buffer to resolve occlusion for individual triangles both within a piece and across pieces. In practice, streaming and data parallel processing are orthogonal and can be combined by dividing the problem into P sets of I pieces.
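As a minimal sketch of how the two approaches combine, the following shows one possible way to assign P x I pieces so that each of P processes streams through I pieces of its own. The helper names and the contiguous assignment are assumptions made for illustration; they are not taken from VTK or ParaView.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the global piece index handled by process `rank`
// (0..P-1) on streaming pass `pass` (0..I-1), out of P*I pieces in total.
// Assumption: each process owns a contiguous run of I pieces.
int GlobalPiece(int rank, int pass, int numPasses)
{
  assert(rank >= 0 && pass >= 0 && pass < numPasses);
  return rank * numPasses + pass;
}

// Lists, in order, the pieces that one process streams through.
std::vector<int> PiecesForRank(int rank, int numPasses)
{
  std::vector<int> pieces;
  for (int pass = 0; pass < numPasses; ++pass)
  {
    pieces.push_back(GlobalPiece(rank, pass, numPasses));
  }
  return pieces;
}
```

With P = 4 processes and I = 3 passes, for example, process 1 would request pieces 3, 4 and 5 out of 12, one per pass.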

In many computational processes, and especially in rendering, only a small fraction of the entire data contributes to the end result. Prioritized streaming processes only those pieces that contribute to the final result (ignoring pieces that are off-screen, for example) and processes them in order from most important (for example, nearest to the camera) to least important. This variation of streaming has great benefits, including eliminating unnecessary processing and IO [1] and providing more immediate feedback to the user to speed the data discovery process [2]. Prioritized streaming is the basis for the experimental branded application called StreamingParaView, which was first introduced in ParaView 3.6.
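The ordering described above can be sketched in a few lines. This is an illustration only: the `PieceInfo` structure, the precomputed `onScreen` flag and the distance-based priority formula are all assumptions, not the prioritization that StreamingParaView actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical per-piece record; a real application would derive these
// values from pipeline meta-information.
struct PieceInfo
{
  int id;
  double center[3];
  bool onScreen; // result of some visibility test, assumed to be given
};

// Priority in [0, 1]: 0 means "skip", and nearer pieces score higher.
double Priority(const PieceInfo& piece, const double camera[3])
{
  if (!piece.onScreen)
  {
    return 0.0;
  }
  double dist2 = 0.0;
  for (int i = 0; i < 3; ++i)
  {
    const double d = piece.center[i] - camera[i];
    dist2 += d * d;
  }
  return 1.0 / (1.0 + std::sqrt(dist2));
}

// Returns the ids of the contributing pieces, most important first.
std::vector<int> StreamingOrder(const std::vector<PieceInfo>& pieces,
                                const double camera[3])
{
  std::vector<std::pair<double, int>> ranked;
  for (const PieceInfo& piece : pieces)
  {
    const double priority = Priority(piece, camera);
    assert(priority >= 0.0 && priority <= 1.0);
    if (priority > 0.0)
    {
      ranked.push_back(std::make_pair(priority, piece.id));
    }
  }
  std::stable_sort(ranked.begin(), ranked.end(),
    [](const std::pair<double, int>& a, const std::pair<double, int>& b)
    { return a.first > b.first; });

  std::vector<int> order;
  for (const std::pair<double, int>& entry : ranked)
  {
    order.push_back(entry.second);
  }
  return order;
}
```

Pieces with zero priority never reach the sort, so they cost no pipeline updates or IO.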

VTK's streaming has an interesting feature in that it’s possible to ask for data in arbitrarily small chunks. Streaming is driven by asking for different pieces at the downstream end of the pipeline (e.g., vtkPolyDataMapper::SetPiece()). One asks for smaller pieces by asking for pieces out of a larger set (e.g., vtkPolyDataMapper::SetNumberOfPieces()). What is interesting about this is that as the chunk size decreases, and assuming prioritization is in effect, the work done to produce a given visualization approaches the minimal amount necessary.

We recently added the ability to ask for data at differing resolution levels to VTK's streaming support. It is now possible to not only ask for arbitrary pieces, but also to ask for them at arbitrary resolutions. The mechanics are similar to VTK's temporal support [3]. One asks the pipeline to provide data at a given requested resolution. This request is a dimensionless number that ranges from 0.0 meaning lowest resolution to 1.0 meaning full resolution. If unspecified, full resolution is assumed. The request travels up the pipeline to the reader, which decides how to interpret this number. Structured data sources use the resolution to choose how coarsely to subsample in the i, j and k dimensions and adjust their x, y and z spacing to compensate accordingly. As in the temporal pipeline, the result is free to vary from what was requested. To support this, the reader inserts a resolution answer key into the data it produces, which then travels back down the pipeline to the requester.
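To make the structured-data case concrete, here is a minimal sketch of how a reader might turn a requested resolution into a subsampling stride, adjusted spacing and a resolution answer. The linear mapping, the single stride shared by all three axes, and the names `Subsampling`, `ChooseSubsampling` and `maxStride` are assumptions for illustration; an actual reader is free to interpret the request differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical result of interpreting a resolution request.
struct Subsampling
{
  int stride;                // keep every stride-th sample along each axis
  int dims[3];               // number of samples kept along i, j and k
  double spacing[3];         // x, y and z spacing after subsampling
  double achievedResolution; // what the reader would report back downstream
};

// Assumption: resolution 1.0 maps to stride 1, resolution 0.0 maps to
// maxStride, and values in between are interpolated linearly and rounded.
Subsampling ChooseSubsampling(double resolution, int maxStride,
                              const int fullDims[3],
                              const double fullSpacing[3])
{
  assert(maxStride >= 1);
  resolution = std::min(1.0, std::max(0.0, resolution));

  Subsampling out;
  out.stride = 1 + static_cast<int>(
    std::lround((1.0 - resolution) * (maxStride - 1)));
  for (int axis = 0; axis < 3; ++axis)
  {
    // Samples 0, stride, 2*stride, ... that fit inside the full extent.
    out.dims[axis] = (fullDims[axis] - 1) / out.stride + 1;
    // Coarser sampling needs wider spacing to cover the same region.
    out.spacing[axis] = fullSpacing[axis] * out.stride;
  }
  // The result may differ from the request, so report what was delivered.
  out.achievedResolution = (maxStride == 1)
    ? 1.0
    : 1.0 - static_cast<double>(out.stride - 1) / (maxStride - 1);
  return out;
}
```

Under these assumptions, a request of 0.0 on the 3600x2400x42 CFC grid with a maximum stride of 8 would yield a 450x300x6 grid with eight times the original spacing.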

The following is a code example from VTK which asks the pipeline to: compute the priority for a particular piece at a particular resolution, conditionally update the pipeline to produce the requested piece and then examine the returned resolution result.

vtkStreamingDemandDrivenPipeline * filtersExec =
  vtkStreamingDemandDrivenPipeline::SafeDownCast
    (aFilter->GetExecutive());

filtersExec->SetUpdateResolution
  (port, requestedResolution);
filtersExec->SetUpdateExtent
  (port, pieceNum, numPieces, ghostLevel);

//the next call is very fast
double priority = filtersExec->ComputePriority();

if (priority > 0.0)
  {
  //the next call is potentially very slow
  aFilter->Update();
  vtkDataObject *data =
    aFilter->GetOutputDataObject(port);
  vtkInformation* dataInfo = data->GetInformation();
  double resultResolution = dataInfo->Get
    (vtkDataObject::DATA_RESOLUTION());
  }

ParaView 3.8 also includes AdaptiveParaView, another experimental application which exercises this new multi-resolution capability. Multi-resolution streaming begins by rendering a single piece that covers the entire domain at minimum resolution with the very first update giving the user valuable feedback about the entire domain. This is a great advantage over standard streaming in which global information sometimes becomes apparent only when the entire domain is covered at full resolution and after a much longer delay.

Refinement of isosurfaces in the CFC data off the coast of Newfoundland. A laptop computer is able to show the full resolution data since areas off-screen are ignored and only a few pieces stay resident in memory at any given instant.

The adaptive streaming algorithm then recursively splits pieces, increasing the resolution with each split, to show the data in greater detail. As it processes pieces, it gathers meta-information (world extent and data ranges) and uses this information to improve its importance estimates. Throughout the algorithm, prioritization guides the choice of which pieces need to be refined immediately, which can be deferred, and which can be removed from further consideration. The algorithm eventually converges to processing just the important features of the data at the fullest resolution.
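The refinement loop can be sketched with a priority queue. This is a simplified illustration under stated assumptions rather than AdaptiveParaView's implementation: pieces are split in two, the `estimate` callback stands in for the pipeline's priority computation, and rendering of intermediate levels is omitted.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical node of the piece refinement tree.
struct Piece
{
  int level;       // depth in the tree; 0 is the whole domain
  int index;       // position among the 2^level pieces at this level
  double priority; // importance estimate; 0 means "can be ignored"
};

// Orders the queue so that the most important piece is processed first.
struct LowerPriority
{
  bool operator()(const Piece& a, const Piece& b) const
  {
    return a.priority < b.priority;
  }
};

// Refines from the root down to maxLevel and returns the pieces that
// reached full resolution. `estimate(level, index)` is an assumed stand-in
// for a conservative importance estimate.
std::vector<Piece> Refine(int maxLevel,
                          const std::function<double(int, int)>& estimate)
{
  assert(maxLevel >= 0);
  std::priority_queue<Piece, std::vector<Piece>, LowerPriority> pending;
  pending.push(Piece{0, 0, estimate(0, 0)});

  std::vector<Piece> finished;
  while (!pending.empty())
  {
    const Piece piece = pending.top();
    pending.pop();
    if (piece.priority <= 0.0)
    {
      continue; // removed from further consideration
    }
    if (piece.level == maxLevel)
    {
      finished.push_back(piece); // already at the fullest resolution
      continue;
    }
    // Split into two children, each one level finer than the parent.
    for (int child = 0; child < 2; ++child)
    {
      const int index = 2 * piece.index + child;
      pending.push(Piece{piece.level + 1, index,
                         estimate(piece.level + 1, index)});
    }
  }
  return finished;
}
```

Because rejected pieces are never split, the number of full-resolution pieces that get processed depends on how much of the domain is judged important, not on the size of the whole dataset.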

THE NEED FOR CACHING
Unfortunately, streaming adds overhead. Every pipeline update can take a significant amount of time. In standard VTK, the pipeline internally caches intermediate results so that data processing time can be amortized over many later updates, provided those updates do not invalidate the results. However, changing the requested piece or resolution invalidates the internal caches. We minimize the problem by aggressively caching results at the end of the pipeline. In particular, whenever the visible geometry is small enough, our cache allows all of the data to be re-rendered in a single pass. In this situation, camera movement proceeds as fast as it does in non-streaming ParaView. Despite caching, the convergence process itself can take a significant amount of time. AdaptiveParaView therefore has controls that allow the user to pause and restart, limit, or manually control the refinement process.

CONSERVATIVE PRIORITIZATION
A key point is that it’s generally impossible to know a priori which pieces contribute the most or least without executing the pipeline. This is unfortunate because the goal of prioritization is to avoid executing the pipeline on unimportant data. Consider culling pieces that fall outside of the view frustum. Many VTK readers can determine the initial world-space bounding box of any requested piece. However, filters exist to change data, and any filter along the pipeline might transform the data arbitrarily, changing the bounding box before the data is rendered. In order to find which pieces are not visible, one must first know what the world-space bounding box of each piece is after that piece is processed by the pipeline.

To solve this chicken-and-egg problem, a facility for per-piece meta-information propagation was added to the VTK pipeline. Readers and sources can provide piece-level meta-information, and filters can describe what types of data modifications they make. With information about what is provided and what is potentially changed, the pipeline is better able to provide a conservative estimate of the importance of any piece without doing any time-consuming computations. When either type of information is missing, the pipeline falls back to the non-prioritized behavior of iteratively processing every piece. See VTK/Rendering/Testing/Cxx/TestPriorityStreaming.cxx for a demonstration.
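The conservative fallback can be illustrated with a small sketch. The `Box` and `FilterTraits` types, the `ConservativePriority` function and the use of an axis-aligned box in place of a true view frustum are assumptions for illustration; they do not reproduce VTK's meta-information keys.

```cpp
#include <cassert>
#include <vector>

// Axis-aligned bounding box in world space.
struct Box
{
  double min[3];
  double max[3];
};

// Hypothetical declaration a filter makes about what it may modify.
struct FilterTraits
{
  bool mayChangeGeometry;
};

bool Overlaps(const Box& a, const Box& b)
{
  for (int i = 0; i < 3; ++i)
  {
    assert(a.min[i] <= a.max[i] && b.min[i] <= b.max[i]);
    if (a.max[i] < b.min[i] || b.max[i] < a.min[i])
    {
      return false;
    }
  }
  return true;
}

// Returns 0.0 only when the piece can be rejected with certainty.
// `pieceBounds` is null when the reader supplies no bounds for the piece.
double ConservativePriority(const Box* pieceBounds,
                            const std::vector<FilterTraits>& filters,
                            const Box& viewRegion)
{
  if (pieceBounds == nullptr)
  {
    return 1.0; // no meta-information: the piece must be processed
  }
  for (const FilterTraits& filter : filters)
  {
    if (filter.mayChangeGeometry)
    {
      return 1.0; // the reader's bounds may no longer hold downstream
    }
  }
  return Overlaps(*pieceBounds, viewRegion) ? 1.0 : 0.0;
}
```

The estimate errs toward processing too much rather than too little: a piece is culled only when its bounds are known and no filter along the pipeline could have moved it.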

FUTURE WORK
We continue to "refine" our implementation of adaptive streaming. Our most immediate challenge is to improve the client/parallel-server implementations of both streaming applications. Currently, pure streaming requires communication with the server too frequently to be efficient, and adaptive streaming has only been implemented to work with serial-mode ParaView runs.

Next, our preliminary experimental prototypes have fairly major robustness and usability limitations. Our current work is available in source format only, and there are outstanding unsolved issues regarding how to deliver global, up-to-date meta-information to the end user as computation progresses.

Lastly, we are working to extend the work to be compatible with more data formats. Our first streaming capable reader reads simple raw structured data in either preprocessed or raw format. For this data type, changing resolution is easily achieved by sub-sampling [4]. We have since extended the framework to handle cloud data, of which the LANL cosmology format was our first target. For this we devised an importance sampling mechanism that chooses representative samples and limits the resolution so as not to overfill the displayed image resolution.

We anticipate extending the framework to work on AMR data, which has multi-resolution information written in by the simulation codes that generate it; and to wavelet compressed image data, as exemplified by NCAR's Vapor Data Format files. Extending the framework to handle non-preprocessed unstructured data types is a long-term goal.

REFERENCES
[1] Childs, H., Brugger, E., Bonnell, K., Meredith, J., Miller, M., Whitlock, B., Max, N. "A Contract Based System For Large Data Visualization." Proceedings of the IEEE Visualization Conference 2005.

[2] Ahrens, J., Desai, N., McCormick, P., Martin, K., Woodring, J. "A modular extensible visualization system architecture for culled prioritized data streaming." Proceedings of the SPIE, 2007.

[3] Biddiscombe, J., Geveci, B., Martin, K., Moreland, K., and Thompson, D. "Time Dependent Processing in a Parallel Pipeline Architecture." IEEE Transactions on Visualization and Computer Graphics, 2007.

[4] Ahrens, J., Woodring, J., DeMarle, D., Patchett, J., Maltrud, M. "Interactive Remote Large-Scale Data Visualization via Prioritized Multi-resolution Streaming." Proceedings of the UltraScale Visualization Workshop, 2009.

David DeMarle is a member of the R&D team at Kitware where he contributes to both ParaView and VTK. He frequently teaches Kitware's professional development and training courses for these product applications and enjoys putting puns in Kitware Source articles. Dave's research interests are in systems level aspects of visualization, in particular memory optimizations for parallel visualization of large datasets.

 

Jonathan Woodring is a new staff scientist in CCS-7 Applied Computer Science at Los Alamos National Laboratory. He received his PhD in Computer Science from The Ohio State University in 2009. His research interests include scientific visualization and analysis, high performance supercomputing, and data intensive supercomputing.


James Ahrens is a team leader in the Applied Computer Science Group at Los Alamos National Laboratory. His research focuses on large-data visualization and scientific data management. Ahrens received his Ph.D. in Computer Science from the University of Washington and is a member of the IEEE Computer Society.
