Alvin Lang
June 4, 2025 15:44
NVIDIA’s Multi-Process Service optimizes GPU usage in molecular dynamics simulations, improving throughput by running multiple processes concurrently on a single GPU.
Molecular dynamics (MD) simulations, which are essential for modeling atomic interactions, require significant computational resources. Nevertheless, many simulations involve small systems that often underutilize modern GPUs. According to NVIDIA, its Multi-Process Service (MPS) addresses this by letting multiple simulations run at the same time on the same GPU, maximizing GPU utilization and improving overall throughput.
Understanding MPS
MPS is a binary-compatible implementation of the CUDA API that facilitates efficient GPU sharing among multiple processes. It reduces context-switching overhead and allows processes to share scheduling resources, improving overall GPU utilization. Since the NVIDIA Volta GPU generation, MPS has supported concurrent kernel execution from different processes, which improves performance when individual processes cannot saturate the GPU on their own. Notably, MPS can be started with regular user privileges, which simplifies deployment.
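As a rough illustration of starting and stopping the service, the sketch below wraps the `nvidia-cuda-mps-control` command-line tool in a Python context manager. It assumes the tool is on the `PATH` and that the daemon is started with `-d` and stopped by sending `quit` on standard input. The `run` parameter is a hypothetical injection point added here so the helper can be exercised without a GPU.

```python
import subprocess
from contextlib import contextmanager

# Assumed commands for the MPS control tool.
START_CMD = ["nvidia-cuda-mps-control", "-d"]  # start the control daemon in the background
STOP_CMD = ["nvidia-cuda-mps-control"]         # reads the "quit" command from stdin


@contextmanager
def mps_daemon(run=subprocess.run):
    """Start the MPS control daemon, yield, then ask it to quit."""
    run(START_CMD, check=True)
    try:
        yield
    finally:
        run(STOP_CMD, input="quit\n", text=True, check=True)
```

Work launched inside a `with mps_daemon():` block would then share the GPU through MPS, and the daemon is asked to shut down even if that work raises an exception.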
Implementing MPS with OpenMM
To take advantage of MPS with OpenMM, a popular MD engine, users can run multiple simulations simultaneously by launching multiple instances of the simulation script, each as a separate process. Each individual simulation may run more slowly, but overall throughput increases because the simulations execute in parallel. A simple command structure lets users control which GPU is targeted and how processes are managed, making resource allocation more efficient.
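The launch pattern can be sketched in Python as follows. This is a minimal sketch, not NVIDIA's own command sequence: it assumes the MPS control daemon is already running, that the target GPU is selected with the standard `CUDA_VISIBLE_DEVICES` environment variable, and that the simulation lives in a script whose name you supply. The helper names are illustrative.

```python
import os
import subprocess
import sys


def build_env(gpu_index, base_env=None):
    """Return a copy of the environment that pins a process to one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def launch_simulations(script, n_processes, gpu_index=0):
    """Start n_processes copies of `script` on the same GPU and wait for all.

    Returns the list of process exit codes.
    """
    env = build_env(gpu_index)
    procs = [
        subprocess.Popen([sys.executable, script], env=env)
        for _ in range(n_processes)
    ]
    return [p.wait() for p in procs]
```

For example, `launch_simulations("run_simulation.py", n_processes=4)` would start four copies of a hypothetical `run_simulation.py` on GPU 0. All processes are started before any is waited on, which is what lets them overlap on the GPU.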
Benchmarking performance
Benchmark tests show significant throughput improvements when MPS is applied to systems of various sizes. For example, the DHFR system, with 23,000 atoms, sees a substantial performance gain on high-end GPUs such as the NVIDIA H100 Tensor Core. Larger systems, such as the cellulose benchmark with 409,000 atoms, also see an increase of about 20%.
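When comparing runs, the relevant figure is aggregate throughput: the sum of the per-process simulation rates, measured against a single process running alone. The sketch below shows that arithmetic. The function names are illustrative, and the numbers in the example are invented for demonstration, not benchmark results.

```python
def aggregate_throughput(per_process_ns_per_day):
    """Total simulated time per day across all concurrent processes."""
    return sum(per_process_ns_per_day)


def relative_gain(single_ns_per_day, per_process_ns_per_day):
    """Fractional throughput gain of concurrent runs over one process alone."""
    return aggregate_throughput(per_process_ns_per_day) / single_ns_per_day - 1.0


# Made-up example: one process alone reaches 1000 ns/day, while four
# concurrent processes each reach 300 ns/day (slower individually).
gain = relative_gain(1000.0, [300.0, 300.0, 300.0, 300.0])
print(f"{gain:.0%}")  # prints "20%"
```

Each process in the example is slower than the standalone run, yet the combined rate is higher, which is the trade-off the benchmarks describe.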
CUDA_MPS_ACTIVE_THREAD_PERCENTAGE Optimization
By default, MPS gives every process access to all of the GPU's resources. However, setting the CUDA_MPS_ACTIVE_THREAD_PERCENTAGE environment variable to limit thread availability per process can yield additional throughput. This adjustment has been shown to significantly improve aggregate throughput in runs with multiple concurrent processes.
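One way to apply the setting is to compute a per-process value and place it in the environment of each simulation process before it starts. The sketch below assumes an equal split of a chosen budget across processes. Both the split rule and the `total_budget` parameter are illustrative assumptions rather than a recommendation from the source, and the best value is likely to depend on the system and GPU.

```python
import os


def thread_percentage(n_processes, total_budget=100):
    """Equal share of `total_budget` per process, clamped to the range 1..100.

    A budget above 100 deliberately oversubscribes the GPU; whether that
    helps is something to measure, not assume.
    """
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    return max(1, min(100, total_budget // n_processes))


def mps_env(n_processes, total_budget=100, base_env=None):
    """Return an environment copy with CUDA_MPS_ACTIVE_THREAD_PERCENTAGE set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(
        thread_percentage(n_processes, total_budget)
    )
    return env
```

The resulting dictionary can be passed as the `env` argument of `subprocess.Popen` when launching each simulation, so the limit is in place before the process creates its CUDA context.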
Application to free energy calculations
MPS has also proven advantageous in free energy perturbation (FEP) simulations that rely on replica-exchange molecular dynamics. By running multiple simulations simultaneously at different λ windows, MPS reduces GPU underutilization, resulting in a 36% increase in throughput when using three MPS processes on NVIDIA L40 or H100 GPUs.
Conclusion
NVIDIA’s MPS is a useful tool for improving MD simulation throughput with minimal coding effort. By making better use of GPU resources, MPS significantly improves performance across a range of simulation scenarios. For those interested in exploring these features, NVIDIA offers additional resources and tutorials to support implementation and experimentation.
Image source: Shutterstock