Build Scalable GPU-Accelerated Applications. Faster.

Researchers, scientists, and developers are advancing science by accelerating their high performance computing (HPC) applications on NVIDIA GPUs using specialized libraries, directives, and language-based programming models. From computational science to AI, CUDA-X HPC, OpenACC, and CUDA® are GPU-accelerating applications to deliver groundbreaking scientific discoveries. And popular languages like C, C++, Fortran, and Python are being used to develop, optimize, and deploy these applications.


A collection of optimized, GPU-accelerated libraries for commonly used computing operations.

Learn More >


A directive-based programming model for easy on-ramping to parallel computing on GPUs, CPUs, and other devices.

Learn More >


A parallel computing platform and programming model designed to deliver the most flexibility and performance for GPU-accelerated applications

Learn More >

Scientists and Researchers

With scientific discoveries taking priority, domain scientists are utilizing GPUs to achieve faster results while minimizing programming efforts. Scientists in fields such as computational fluid dynamics, climate, weather and ocean, molecular dynamics, quantum chemistry, and physics, among others, see 3–10X code speedups on GPUs.

"Our initial OpenACC implementation required only minor efforts, and more importantly, no modifications of our existing CPU implementation"

Janus Juul Eriksen, Ph.D. fellow at Aarhus University in Denmark

Getting Started

  1. Check if your application is already accelerated on GPUs.
  2. If your code is not yet GPU-accelerated follow these steps to achieve initial acceleration results:
  3. To continue acceleration of you code, use CUDA for selected kernels or parts of your code.

Application Developers

Beyond accelerating science, application developers are looking to achieve mission critical productivity. They strive to be efficient, simplify support to provide code longevity, and get maximum performance for their users. Across a variety of domains like computational fluid dynamics, computational chemistry, bioinformatics, and physics, GPUs bring acceleration and programming tools help to maintain productivity.

"On a supercomputer, time is the biggest cost,” he said. “With the acceleration, users spend half the hours to get the same results."

David Gutzwiller, Numeca

Getting Started

  1. Step 1 - Use CUDA-X HPC libraries for supported algorithms.
  2. Step 2 - Add OpenACC Directives, offers portability and simplified code support in the future.
  3. Step 3 - For critical kernels and parts of the code, use CUDA for maximum performance.

Educators & Facilitators

Educators and facilitators are tasked with helping students, computer and domain scientists to start accelerating their codes or continue optimizing them on GPUs. Students and scientists are working on problems representing a variety of scientific domains including astrophysics, computational fluid dynamics, quantum chemistry, molecular dynamics, material science and others.

"I used Accelerating Computing and OpenACC Teaching Kits content for teaching and research - very productive and students loved it. I turned most of the OpenACC Online Course office hours information into quizzes! (Very helpful)"

Sunita Chandrasekaran, Assistant Professor, University of Delaware.

Getting Started Teaching GPU Programming:

  1. NVIDIA Teaching Kits are complete course solutions across a variety of academic disciplines that benefit from GPU-accelerated computing. Co-developed with leading university faculty, Teaching Kits provide full curriculum design coupled with ease-of-use. Explore NVIDIA Teaching Kits.
  2. The NVIDIA Deep Learning Institute (DLI) offers hands-on training in AI and accelerated computing to solve real-world problems. We offer self-paced, online training for individuals, instructor-led workshops for teams, and downloadable course materials for university educators. Deep Learning Institute (DLI)Courses Workshops, Training.
  3. OpenACC resources
  4. NGC is the hub for GPU-optimized software for deep learning, machine learning, and HPC that takes allowing developers to focus on building solutions and gathering insights. Explore Containers, Models, Model Scripts, and More in NGC.

IT Support

HPC scientists working on accelerating their codes on GPUs require a robust hardware and software infrastructure to support their work.

Getting Started

  1. Use CUDA-X HPC libraries for supported algorithms
  2. OpenACC training materials, courses, talks, tutorials and more
  3. PGI OpenACC Toolkit Download
  4. Explore CUDA Toolkit
  5. Download CUDA Now
  6. Explore Containers, Models, Model Scripts, and More in NGC

Developer Resources

CUDA 10 Features Revealed: Turing, CUDA Graphs, and More

For the last eleven years, NVIDIA’s CUDA development platform has unleashed the power of GPUs for general purpose processing in a wide variety of applications. These include: high performance computing (HPC), data center applications, and content creation workflows. Most recently, artificial intelligence systems and applications ranging from embedded systems to the cloud have benefited from high-performance GPUs.

Read Developer Blog

Building HPC Containers Demystified

Whether you are a HPC research scientist, application developer, or IT staff, NVIDIA has solutions to help you use containers to be more productive. NVIDIA is enabling easy access and deployment of HPC applications by providing tuned and tested HPC containers on the NGC registry. Many commonly used HPC applications such as NAMD, GROMACS, and MILC are available and ready to run just by downloading the container image.

Read Developer Blog

Using Nsight Compute to Inspect your Kernels

Both Nsight Compute and Nsight Systems, have been added to the repertoire of CUDA tools available for developers. The tools become more and more important when using newer GPU architectures. For the example project in this blog, using the new tools will be necessary to get the results we are after for Turing architecture GPUs and beyond.

Read Developer Blog

View All Developer Blogs>

GPU Programming with Standard C++17

The C++17 language standards include parallel programming constructs well-suited for GPU computing. Discover how you can use these constructs to program GPUs and how they fit alongside libraries, OpenACC directions and CUDA for HPC application development.

View Talk

Porting VASP to GPUs using OpenACC

VASP is one of the most widely used codes for electronic-structure calculations and first-principles molecular dynamics. The improved performance and the vastly decreased maintenance effort has led the VASP group to adopt OpenACC as the programming model for all future GPU porting of VASP.

View Talk

CUDA C++ Standard Library

An in-depth review of the newest and upcoming C++ capabilities, and explanation of how they can be used to build complex concurrent data structures and enable new classes of applications on modern NVIDIA GPUs.

View Talk

View All HPC GTC Talks>

CUDA Programming and Performance

Find information from how to report a bug to

Learn More

Using PGI Compilers & Tools

No cost technical support from the PGI user community and PGI technical team.

Learn More

NGC Users

Get answers about containers, account usage, or general discussion points.

Learn More

Training and Education

The NVIDIA Deep Learning Institute (DLI) offers hands-on training in accelerated computing and AI to allow you to solve real-world problems. Training is available as self-paced, online courses or in-person, instructor-led workshops.

Fundamentals of Accelerated Computing with CUDA C/C++

The CUDA computing platform enables the acceleration of CPU-only applications to run on the world’s fastest massively parallel GPUs. Upon completion, you’ll be able to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques.

Get Started

Fundamentals of Accelerated Computing with CUDA Python

This course explores how to use Numba—the just-in-time, type-specializing Python function compiler—to accelerate Python programs to run on massively parallel NVIDIA GPUs. Upon completion, you’ll be able to use Numba to compile and launch CUDA kernels to accelerate your Python applications on NVIDIA GPUs.

Get Started

Fundamentals of Accelerated Computing with OpenACC

Learn the basics of OpenACC, a high-level programming language for programming on GPUs. This course is for anyone with some C/C++ experience who is interested in accelerating the performance of their applications beyond the limits of CPU-only programming. Upon completion, you’ll be able to build and optimize accelerated heterogeneous applications on multiple GPU clusters using a combination of OpenACC, CUDA-aware MPI, and NVIDIA profiling tools.

Get Started

High-Performance Computing with Containers

Learn how to reduce complexity and improve portability and efficiency of your code by using a containerized environment for high-performance computing (HPC) application development. Upon completion, you'll be able to quickly build and utilize Docker, Singularity, and HPCCM for portable, bare-metal performance in your HPC applications.

Get Started

Accelerating Applications with CUDA C/C++

Learn how to accelerate your C/C++ application using CUDA to harness the massively parallel power of NVIDIA GPUs. Upon completion, you'll be able to use the CUDA platform to accelerate C/C++ applications.

Get Started

OpenACC – 2X in 4 Steps

Learn how to accelerate your C/C++ or Fortran application using OpenACC to harness the massively parallel power of NVIDIA GPUs. OpenACC is a directive-based approach to computing where you provide compiler hints to accelerate your code, instead of writing the accelerator code yourself. Upon completion, you will be ready to use a profile-driven approach to rapidly accelerate your C/C++ applications using OpenACC directives.

Get Started

Dive Deeper Into HPC

The HPC Summit at GTC 2020 brings together HPC researchers and developers to advance the state of the art of HPC. Explore content from different HPC communities, engage with experts, and learn about new trends and innovations.

Learn More

NVIDIA Supercomputing News

Australia’s Newest and Fastest Supercomputer to be Powered by NVIDIA V100 GPUs

Fujitsu announced today they have been awarded the contract to build Australia’s newest and fastest supercomputer, powered by NVIDIA V100 Tensor Core GPUs.

Read More

Developer Spotlight: Enabling the SKA Radio to Explore the Universe

The Square Kilometre Array (SKA) project is an effort to build the world’s largest radio telescope, with a collecting area of over one square kilometre.

Read More

View More News>

Join the NVIDIA Developer Program

Join Now