VTK-PROMPT: Simplifying Scientific Visualization with AI

In this first post of the VTK AI blog series, we explore the VTK development team's efforts to make VTK more accessible and to help new users understand the complex VTK API. This series is part of a larger effort to describe recent innovation in the VTK project. See the introduction to the series, along with the first, second, and third posts, which cover using WebAssembly (WASM) for web visualization with VTK.

The Challenge: VTK’s Complexity Barrier

VTK is one of the world’s most widely used 3D scientific visualization libraries. A significant factor in this success is its extensive library, which can also be daunting to novice users. Kitware has consistently addressed this challenge through a two-pronged approach to streamline the VTK learning process. First, by developing educational materials such as tutorials, examples, workshops, and books, which make VTK easier to learn. Second, by creating bindings and wrappers for more accessible programming languages, such as Python, which make VTK easier to apply in practice. While these initiatives have substantially reduced the challenge of learning VTK, considerable entry barriers persist. This is due to the sheer size of the VTK API and to the required knowledge of the fundamentals of scientific visualization and computer graphics, domains of which many potential users have only a high-level understanding.

VTK-Prompt

In the age of AI, we have recognized the potential to significantly improve the way users and developers work with VTK. By “significant improvement,” we mean a fundamental shift in our methodology: offering a natural language interface for VTK (simplifying its use) and providing guidance to users as they interact with VTK (facilitating the learning process). This ambitious undertaking has been realized in our latest open-source project, codenamed VTK-Prompt. VTK-Prompt provides capabilities including running explanations, code generation, and interactive visualizations (see images in the following).

Core Architecture and Features

VTK-Prompt offers both command-line and web-based interfaces, catering to different user preferences and workflows:

CLI

# Generate basic VTK code
vtk-prompt "Create a red sphere with lighting"

# Enhanced with RAG for better accuracy
vtk-prompt "Create a textured cone with custom lighting" --rag

# Using different LLM providers
vtk-prompt "Generate a volume rendering pipeline" --provider openai --model gpt-4o

Web-Based UI

The trame-powered web interface provides automatic VTK rendering, allowing users to see their generated visualizations immediately. The UI features:

  • A VTK rendering viewport
  • Interactive model and parameter controls
  • Token usage tracking and cost monitoring
  • Conversation export and history management

Advanced RAG (Retrieval-Augmented Generation) System

One of VTK-Prompt’s most innovative features is its RAG implementation, which significantly improves code generation accuracy by leveraging VTK’s extensive examples database. The RAG system transforms generic requests into curated Python VTK code by retrieving relevant examples and injecting them into the prompt context.
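The retrieve-and-inject idea described above can be sketched in a few lines of Python. This is only an illustration of the general RAG pattern, not VTK-Prompt's actual implementation: the tiny in-memory EXAMPLES list, the bag-of-words cosine scoring, the function names (tokenize, cosine, retrieve, build_prompt), and the prompt template are all assumptions made for this sketch. A real system would more likely use learned embeddings and a vector store over the full examples database.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for a VTK examples database.
EXAMPLES = [
    {"title": "Sphere",
     "description": "render a sphere with an actor and mapper",
     "code": "source = vtkSphereSource()"},
    {"title": "TexturedCone",
     "description": "textured cone source with a texture image",
     "code": "cone = vtkConeSource()\ntexture = vtkTexture()"},
    {"title": "VolumeRendering",
     "description": "volume rendering of a scalar field with a ray cast mapper",
     "code": "mapper = vtkSmartVolumeMapper()"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, examples, k=2):
    """Return up to k examples whose descriptions best match the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(ex["description"]))), ex)
              for ex in examples]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for score, ex in scored[:k] if score > 0]


def build_prompt(query, examples, k=2):
    """Inject the retrieved examples into the prompt ahead of the request."""
    hits = retrieve(query, examples, k)
    context = "\n\n".join(
        f"# Example: {ex['title']}\n{ex['code']}" for ex in hits
    )
    return (
        "Relevant VTK examples:\n"
        f"{context}\n\n"
        f"User request: {query}\n"
        "Write Python VTK code for the request."
    )


if __name__ == "__main__":
    print(build_prompt("Create a textured cone with custom lighting", EXAMPLES))
```

The string returned by build_prompt is what would be sent to the LLM in place of the bare request, so the model sees concrete, relevant VTK snippets before it writes any code.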

Future Directions and Impact

VTK-Prompt demonstrates how AI can bridge the gap between complex technical libraries and user accessibility, making sophisticated 3D visualization capabilities available to researchers, educators, and developers through natural language interaction. This represents not just a tool, but a fundamental shift toward more intuitive scientific computing interfaces.

We are rapidly developing and improving this application with new features. Perhaps the most notable future plan is extracting the VTK knowledge components into a standalone public Model Context Protocol (MCP) server that could be accessed by major LLM providers' clients, powering tools such as ChatGPT and Gemini with VTK intelligence. This will be described in a future blog post.

Acknowledgments

VTK is a creative work produced by an extended community. Refer to VTK’s GitLab repository for a detailed record of contributions and enhancements. Research reported in this publication was supported by the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health under Award Number R01EB014955. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
