Catalyst-ADIOS2: a new Catalyst2 implementation for in-transit analysis

In this blog post, we introduce a new Catalyst2 implementation for in-transit analysis that makes it easy to process your data on the fly on dedicated visualization nodes. It is available in a public repository named AdiosCatalyst.

Context

In-situ approaches are now widely used in numerical simulations to reduce expensive input and output (I/O) operations and to improve feedback. Unfortunately, in-situ processing halts the simulation during the analysis stage, because the analysis operates on the simulation data held in memory (see Fig 1). As a result, compute time that could advance the simulation is spent on analysis instead. This is the problem that in-transit visualization tackles. By streaming the data to dedicated visualization nodes, the analysis no longer blocks the simulation, and the gain in “time to result” can be significant. Streaming adds some overhead, but it can definitely be worth it when the analysis is heavy.

Fig 1: Execution pipeline in In Situ and In Transit
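
The trade-off described above can be made concrete with a rough timing model. The sketch below is only an illustration: the function names, the per-step costs, and the assumption of a simple two-stage pipeline with constant step times are ours, not measurements from Catalyst-ADIOS2.

```python
def in_situ_time(n_steps, t_sim, t_analysis):
    """Simulation and analysis alternate on the same nodes, so their costs add up."""
    return n_steps * (t_sim + t_analysis)


def in_transit_time(n_steps, t_sim, t_analysis, t_stream):
    """Two-stage pipeline: the simulation computes and streams a step while
    dedicated nodes analyze the previous one (constant step costs assumed)."""
    producer = t_sim + t_stream  # per-step cost on the simulation side
    # The first step must be produced and then analyzed; after that, each
    # additional step is paced by the slower of the two stages.
    return producer + t_analysis + (n_steps - 1) * max(producer, t_analysis)


if __name__ == "__main__":
    # Heavy analysis (arbitrary time units): in transit wins.
    print(in_situ_time(100, 10, 8))        # 1800
    print(in_transit_time(100, 10, 8, 1))  # 1108
    # Light analysis, costly streaming: the overhead is not paid back.
    print(in_situ_time(100, 10, 1))        # 1100
    print(in_transit_time(100, 10, 1, 2))  # 1201
```

In this toy model, in-transit pays off once the analysis cost per step exceeds the streaming cost by a clear margin, which matches the "heavy analysis" case mentioned above.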

In-situ processing can already be performed with Catalyst2, an API specification that simulations can use to build data processing pipelines. Catalyst2 also lets you develop your own implementation of this API. For in-transit scenarios, a new implementation of Catalyst2 has been developed (illustrated in Fig 2). It has two parts. The first is the Catalyst-ADIOS2 implementation itself, which transfers data from the simulation to a streaming interface, here ADIOS2. The second, named AdiosReplay, performs the opposite operation: it receives the streamed data and passes it to another implementation of Catalyst2 (the ParaView one, for example).

Fig 2: In Situ (1) vs In Transit (2) Analysis with ParaView

Catalyst-ADIOS2: for simulation nodes

To achieve in-transit capabilities with Catalyst2, we needed our own implementation of the API to stream the data. ADIOS2, an open source library for scalable parallel I/O, offers many I/O features and is highly configurable, so it fits a wide range of use cases. We therefore decided to implement the Catalyst API with the ADIOS2 library as the backend. For in-transit use cases, we use the SST engine. Designed for HPC environments, it fully supports MxN data distribution, where the number of simulation nodes and the number of visualization nodes can differ.
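
To give an intuition of what MxN data distribution means, here is a small conceptual sketch in plain Python. It does not use ADIOS2 and does not describe how the SST engine works internally. The helper names (block_range, redistribution_plan) and the assumption of a 1D array split into contiguous, near-equal blocks are ours, chosen only for illustration.

```python
def block_range(total, parts, rank):
    """Contiguous [start, stop) slice owned by `rank` when `total` items are
    split as evenly as possible across `parts` ranks."""
    base, extra = divmod(total, parts)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def redistribution_plan(total, n_writers, n_readers):
    """For each reader, list the (writer, start, stop) pieces it has to fetch."""
    plan = {}
    for reader in range(n_readers):
        r_start, r_stop = block_range(total, n_readers, reader)
        pieces = []
        for writer in range(n_writers):
            w_start, w_stop = block_range(total, n_writers, writer)
            # Overlap between what the reader wants and what the writer owns.
            lo, hi = max(r_start, w_start), min(r_stop, w_stop)
            if lo < hi:
                pieces.append((writer, lo, hi))
        plan[reader] = pieces
    return plan


if __name__ == "__main__":
    # 4 simulation ranks feeding 2 visualization ranks.
    for reader, pieces in redistribution_plan(100, 4, 2).items():
        print(reader, pieces)
```

With 4 writers and 2 readers over 100 values, each reader ends up collecting the blocks of two writers. With counts that do not divide evenly (for example 4 writers and 3 readers), a writer's block is split between neighbouring readers.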


To keep it simple, a simulation can roughly be summarized in three parts: an initialization stage, a loop that executes the simulation, and a finalization stage where global results are stored and resources are freed. The Catalyst2 API offers analogous functions for these parts (catalyst_initialize(), catalyst_execute(), …), which we implemented to convert simulation data into ADIOS2 data structures. Fig 3 describes in more detail what happens during a simulation that uses this new implementation.

Fig 3: Pipeline execution on simulation nodes with Catalyst-ADIOS2
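
The three-stage structure described above can be sketched as follows. This is a toy model in plain Python, not the Catalyst2 or ADIOS2 API: ToyStream, initialize, execute, finalize and run_simulation are hypothetical stand-ins, and the "stream" is just an in-memory queue.

```python
from collections import deque


class ToyStream:
    """In-memory stand-in for a streaming endpoint (hypothetical, not ADIOS2)."""

    def __init__(self):
        self.steps = deque()
        self.closed = False


def initialize(stream, params):
    # Publish run-wide parameters once, before the first time step.
    stream.steps.append(("init", dict(params)))


def execute(stream, cycle, time, fields):
    # Copy the arrays so that later solver updates do not alter a published step.
    snapshot = {name: list(values) for name, values in fields.items()}
    stream.steps.append(("step", {"cycle": cycle, "time": time, "fields": snapshot}))


def finalize(stream):
    # Signal that no more steps will arrive.
    stream.closed = True


def run_simulation(stream, n_steps, dt=0.5):
    initialize(stream, {"n_steps": n_steps})
    temperature = [0.0, 0.0, 0.0]
    for cycle in range(n_steps):
        temperature = [t + 1.0 for t in temperature]  # stand-in for the solver
        execute(stream, cycle, cycle * dt, {"temperature": temperature})
    finalize(stream)


if __name__ == "__main__":
    stream = ToyStream()
    run_simulation(stream, 3)
    print(len(stream.steps), stream.closed)
```

The point of the sketch is the shape of the calls: one initialization, one publication per time step, one finalization. In the real implementation, the publication step is where simulation data is converted into ADIOS2 data structures.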

AdiosReplay: for analysis nodes

Now that an ADIOS2 streaming point is set up on the simulation side, we need to recover the data from it on the analysis side. This is the purpose of the Replay executable. After connecting to the streaming point, it performs the opposite operation and sends the data to another implementation of Catalyst2, such as Catalyst-ParaView (Fig 4).

Fig 4: Pipeline execution on analysis nodes with the Replay executable
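
The receiving side can be pictured as a loop that drains the stream and forwards each step to a downstream pipeline. The sketch below is a toy model using Python's standard queue and threading modules. The names replay, demo and END_OF_STREAM, as well as the callback-based interface, are hypothetical and do not reflect the actual Replay executable.

```python
import queue
import threading

END_OF_STREAM = object()  # sentinel published by the producer when it is done


def replay(stream, on_initialize, on_execute, on_finalize):
    """Drain `stream` and forward every step to a downstream pipeline."""
    on_initialize()
    n_steps = 0
    while True:
        step = stream.get()  # blocks until the producer publishes something
        if step is END_OF_STREAM:
            break
        on_execute(step)
        n_steps += 1
    on_finalize()
    return n_steps


def demo(n_steps=5):
    # Bounded queue: the producer only waits when the consumer lags behind.
    stream = queue.Queue(maxsize=2)

    def simulation():
        for cycle in range(n_steps):
            stream.put({"cycle": cycle, "values": [cycle] * 3})
        stream.put(END_OF_STREAM)

    log = []
    producer = threading.Thread(target=simulation)
    producer.start()
    count = replay(
        stream,
        on_initialize=lambda: log.append("initialize"),
        on_execute=lambda step: log.append(("execute", step["cycle"])),
        on_finalize=lambda: log.append("finalize"),
    )
    producer.join()
    return count, log


if __name__ == "__main__":
    print(demo(3))
```

Here the three callbacks play the role of the downstream Catalyst2 implementation, which sees the same initialize, execute, finalize sequence as if it were attached to the simulation directly.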

Demonstration

This video shows how to run the unstructured grid example provided here. For this toy example, we run the simulation on 4 nodes and the visualization on a different number of nodes, here only 2. If you’re interested in running this example, all the steps needed to compile and run it are described in the repository.

Conclusion

In-transit capabilities with Catalyst2 have many benefits over other workflows, especially when running a heavy visualization pipeline. In this post, we gave an overview of how this in-transit capability has been integrated into the ecosystem through this new Catalyst2 implementation. It is publicly available, and you can try it now by following the steps described in the readme.

This work led to a short paper presented by Kitware at WOIV’23, the 7th International Workshop on In Situ Visualization, held in conjunction with ISC 2023:

Mazen, F., Givord, L., Gueunet, C. (2023). Catalyst-ADIOS2: In Transit Analysis for Numerical Simulations Using Catalyst 2 API. In: Bienz, A., Weiland, M., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023. Lecture Notes in Computer Science, vol 13999. Springer, Cham. https://doi.org/10.1007/978-3-031-40843-4_20
