UVis: Web-based Analysis and Visualization for Large Climate and Geospatial Datasets
At present, the majority of the climate science community still relies heavily on primitive analysis and visualization tools that are based on the thick (or fat) client application concept. This means that the user must download software to the appropriate machines or hardware where the data resides (e.g., laptops, desktops, or HPC machines). In such cases, the user encounters multiple levels of installation challenges such as finding the right prerequisite software packages, software versions, and currently supported hardware and operating systems.
Analysis and visualization tools have thus begun moving toward the thin client application concept, where the user installs very little software. In most cases, only a web browser is needed. In such cases, an analysis and visualization software is deployed on a central server rather than on each individual system, which eliminates the user’s installation and operating system requirement challenges. This approach also provides the flexibility to install the software on the user’s system using VM (virtual operating system) in case scientists need an installation.
Thin clients are well suited for environments in which the same information is going to be accessed by a general group of users. The tradeoff, however, is that thick clients provide users with more features, analysis and visualization, and interaction choices that make the software more customizable. Nevertheless, this trend is changing due to new web standards (HTML5) and more capable browsers that have been developed by leading web technology companies.
Figure 1: UVis overall architecture
We consider our web-based analysis and visualization toolkit clients that use UVis to be neither thick nor thin clients. Instead, we consider them to be smart clients. That is, they are based more on the traditional client-server architecture concept within the web-based model, as is shown in Figure 1. They are more similar to thick clients in that UVis smart clients are Internet-connected devices that allow a user’s local applications to interact with server-based applications through the use of web services. As is the case with thick clients, this provides for increased analysis and visualization interaction, as well as software customization. On the other hand, like thin clients, software downloads are minimal. In most cases, the only software application needed is the appropriate web browser for a particular operating system. With UVis, a user will be able to work offline via the locally installed UVis backend (using VM) and access the appropriate local data or connect to the Internet and access distributed data from Earth System Grid Federation (ESGF). With this architecture in place, UVis has the ability to be deployed and updated in real-time over the network from a centralized server, support multiple platforms and operating systems, and run on almost any device including mobile phones, notebooks, tablet PCs, laptops, desktops, and HPC machines.
Figure 2: UVis client interface showing comparison between model output and observation
Although the system is standalone, it is part of the large end-to-end climate science system that will enable users to run models and collect workflow and provenance for sharing scientific results and reproducibility. The system is directly connected to the ESGF distributed archive, requiring all users to log in at ESGF. ESGF login accounts are required for authentication and authorization before any data is accessed. However, a login is not needed for local data or for ESGF data that is open to the public.
Server Side Analysis and Visualization
One of the important components of Uvis will be server-side analysis and visualization. Server-side computation is necessary, as the increase in data size and complexity of algorithms has led to data- and compute-intensive challenges for climate data analysis and visualization. In this architecture, a web-server receives a request for data processing, analysis, or remote rendering from a thin client such as a web browser. A prerequisite to this is user authentication, which involves the user providing a username/password to the server. If the prerequisite is satisfied, upon receiving the request, the server may create a process specific to the user’s session (per session ID). A separate process per session strengthens security and offers solutions for scalability. Any subsequent requests for data processing are then forwarded to this process until the session ends. At this time, the process is deleted on the server side. Certain requests that do not require intensive computing are served directly by the web-server. For instance, accessing pre-computed climatology for diagnostics is served directly by the server. The server delivers most of the static content to thin clients, as needed.
In the design of server-side analysis, we have adopted standard protocols and communication channels between the client and the server. Most of the static content and some dynamic content for exploratory analysis are served in a RESTful manner. The client can access this content by invoking an AJAX call over the web. For interactive visualization and analysis, data is accessed via WebSockets, as it provides the required flexibility and interactivity.
We chose Python as the binding language for the server-side analysis. Python was the natural choice for the framework due to its support in the scientific computing community and widespread use in almost every field of computer science. In addition, a Python-based platform enabled us to integrate existing UV-CDAT source code on the backend with minimal effort.
Conclusion & Remarks
In this article, we presented some of our initial work in creating an open-source, web-based toolkit for the analysis and visualization of large climate datasets as part of the UV-CDAT project. Although we did not describe the integration of UVis with the diagnostics framework in this article, it has been successfully demonstrated and deployed on a working system. The preliminary results are encouraging, as they have proven the usability of the toolkit in terms of its ease of use and effectiveness in providing a sophisticated computing and visualization environment for climate scientists. We are very thankful to the Department of Energy (DOE) and NASA for providing the support required for this effort.
[UVCDAT] D. Williams, A. Chaudhary, et. al. The ultra-scale visualization climate data analysis tools (uv-cdat): Data analysis and visualization for geoscience data. Computer, vol. 46, no. 9, pp. 68-76, Sept. 2013
[ParaView] J. Ahrens, B. Geveci, C. Law, C. D. Johnson, C. R. Hansen, Eds. Energy 836, 717-732 (2005).
Elo Leung is a computer scientist at Lawrence Livermore National Laboratory. She received her Ph.D in bioinformatics from George Mason University in 2008. Her research interests include bioinformatics, machine learning, scientific visualization, and high-performance computing.
Aashish Chaudhary is an R&D Engineer on the Scientific Computing team at Kitware. Prior to joining Kitware, he developed a graphics engine and open-source tools for information and geo-visualization. Some of his interests are software engineering, rendering, and visualization
Chris Harris is an R&D Engineer at Kitware. His background includes middleware
development at IBM and working on highly-specialized, high performance, mission critical systems.
Charles Doutriaux is a senior Lawrence Livermore National Laboratory research computer scientist, where he is known for his work in climate analytics, informatics, and management systems supporting model intercomparison projects. He shares in the recognition of the Intergovernmental Panel on Climate Change 2007 Nobel Peace Prize.
Thomas Maxwell is a lead scientist in analysis and visualization at the NASA Center for Climate Simulation, Goddard Space Flight Center. He is the developer of DV3D and a collaborator in the UV-CDAT development program.
Dean N. Williams is the Analytical and Informatics Management Systems project lead at LLNL. He is the principal investigator for several large-scale international software projects.
Gerald Potter works for NASA GSFC and the University of Michigan. His current interests are testing software for use by the climate modeling community and exploring new ways to aid in analyzing and displaying climate related data.