NVIDIA® Nsight™ Systems is a system-wide performance analysis tool designed to visualize an application’s algorithms, help you identify the largest opportunities to optimize, and tune to scale efficiently across any quantity or size of CPUs and GPUs, from large servers to our smallest SoC.


Overview

NVIDIA Nsight Systems is a low-overhead performance analysis tool designed to provide the insights developers need to optimize their software. Unbiased activity data is visualized within the tool to help users investigate bottlenecks, avoid inferring false positives, and pursue optimizations with a higher probability of performance gains. Users can identify issues such as GPU starvation, unnecessary GPU synchronization, insufficient CPU parallelization, and even unexpectedly expensive algorithms across the CPUs and GPUs of their target platform. It is designed to scale across a wide range of NVIDIA platforms, such as large Tesla multi-GPU x86 servers, Quadro workstations, Optimus-enabled laptops, DRIVE devices with Tegra+dGPU multi-OS, and Jetson. NVIDIA Nsight Systems can even provide valuable insight into the behavior and load of deep learning frameworks such as PyTorch and TensorFlow, allowing users to tune their models and parameters to increase overall single- or multi-GPU utilization.

Features

Learn about feature support per target platform group

| Feature | Linux Workstations and Servers | Windows Workstations and Gaming PCs | Jetson Autonomous Machines | DRIVE Autonomous Vehicles |
|---|---|---|---|---|
| View system-wide application behavior across CPUs and GPUs | | | | |
| CPU cores utilization, process, & thread activities | yes | yes | yes | yes |
| CPU thread periodic sampling backtraces | yes* | no | yes | yes |
| CPU thread blocked state backtraces | yes | yes | yes | yes |
| CPU performance counter sampling | no | no | yes | yes |
| GPU workload trace | yes | yes | yes | yes |
| GPU context switch trace | no | no | yes | yes |
| SOC hypervisor trace | - | - | - | yes |
| SOC memory bandwidth sampling | - | - | yes | yes |
| SOC Accelerators trace | - | - | Xavier | Xavier |
| Investigate CPU-GPU interactions and bubbles | | | | |
| User annotations API trace: NVIDIA Tools Extension API (NVTX) | yes | yes | yes | yes |
| CUDA API | yes | yes | yes | yes |
| CUDA libraries trace (cuBLAS & cuDNN) | yes | no | yes | yes |
| OpenGL API trace | yes | no | yes | yes |
| Direct3D12, DXR, & PIX APIs | - | yes | - | - |
| Bidirectional correlation of API and GPU workload | yes | yes | yes | yes |
| Identify GPU idle and sparse usage | yes | yes | yes | yes |
| Ready for big data | | | | |
| Fast GUI capable of visualizing in excess of 10 million events on laptops | yes | yes | yes | yes |
| Additional command line collection tool | yes | no | no | no |
| NV-Docker container support | yes | - | - | - |
| NVIDIA GPU Cloud support | yes | - | - | - |
| Minimum user privilege level | user | administrator | root | root |
| Platform details | Linux Workstations and Servers | Windows Workstations and Gaming PCs | Jetson Autonomous Machines | DRIVE Autonomous Vehicles |

* On Intel Haswell and newer CPU architectures
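
To illustrate the user annotations (NVTX) feature listed above, here is a minimal sketch of how an application might mark its phases so that they show up as named ranges on the timeline. It assumes the separately installed `nvtx` Python package, which this page does not mention; when that package is missing, the sketch falls back to a no-op so it still runs. The functions `load_batch`, `train_step`, and `run` are hypothetical stand-ins for real application code.

```python
import contextlib

try:
    import nvtx  # assumption: the optional `nvtx` Python package is installed

    def annotate(name):
        # Opens a named NVTX range for the duration of a `with` block.
        return nvtx.annotate(message=name)
except ImportError:
    def annotate(name):
        # No-op stand-in so the sketch runs without NVTX installed.
        return contextlib.nullcontext()


def load_batch(n):
    """Hypothetical CPU-side data preparation step."""
    with annotate("load_batch"):
        return list(range(n))


def train_step(batch):
    """Hypothetical compute step; a real one would launch GPU work."""
    with annotate("train_step"):
        return sum(x * x for x in batch)


def run(steps=3, batch_size=4):
    """Alternate the two annotated phases, as a training loop would."""
    results = []
    for _ in range(steps):
        results.append(train_step(load_batch(batch_size)))
    return results


if __name__ == "__main__":
    print(run())
```

With ranges like these in place, gaps between a `load_batch` range and the following `train_step` range are one way to spot the GPU starvation described in the Overview.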

Platforms

Learn about Nsight Systems on your platform:

Linux Workstations and Servers

Windows Workstations and Gaming PCs

Jetson Autonomous Machines

DRIVE Autonomous Vehicles


What Users Are Saying

Tracxpoint

We noticed that our new Quadro P6000 server was ‘starved’ during training and we needed experts for supporting us. NVIDIA Nsight Systems helped us to achieve over 90 percent GPU utilization. A deep learning model that previously took 600 minutes to train, now takes only 90.

Felix Goldberg, Chief AI Scientist

NIH Center for Macromolecular Modeling and Bioinformatics at University of Illinois at Urbana-Champaign

Watch John Stone present how he achieved over a 3x performance increase in
VMD, a popular tool for analyzing large biomolecular systems.

Related Media

Optimizing HPC simulation and visualization code

Watch John Stone, of the NIH Center for Macromolecular Modeling and Bioinformatics at University of Illinois at Urbana-Champaign, discuss how he achieved over a 3x performance increase of VMD, a popular tool for analyzing large biomolecular systems.

Watch Video

NVIDIA Jetson Partner Stories: Stereolabs

In the drone industry, the weight and size of the main board is critical. With the ZED stereo camera by Stereolabs, developers can capture the world in 3D and map 3D models of indoor and outdoor scenes up to 20 meters. The small form factor of the Jetson TX1 enables Stereolabs to bring advanced computer vision capabilities to smaller and smaller systems. See what is possible when these two technologies come together in drones to power the latest virtual reality applications.

Watch Video

NVIDIA System Profiler - Introduction

An introduction to the latest NVIDIA System Profiler. Includes a UI walkthrough and setup details for NVIDIA System Profiler on the NVIDIA Jetson Embedded Platform. Download and learn more here.

Watch Video


Linux Workstations and Servers

Release Highlights

  • Introducing Nsight Systems 2018.3
  • Improved CPU-GPU correlation experience
  • Improvements to quality, usability, and scalability

Downloads

Available for profiling directly on Linux workstations and servers, including the NVIDIA DGX line, or remotely from a variety of hosts: Windows, Linux, or Mac OS X.

Download Now


Not profiling Linux workstations or servers?
Learn about other target platforms.

Documentation

Support

To provide feedback, request additional features, or report support issues, please use the Developer Forums.

System Requirements

Supported target operating systems for data collection:

  • Ubuntu 14.04, 16.04, and 18.04
  • CentOS 7+*
  • Red Hat Enterprise Linux 7+*
* In distribution versions below 7.4, some features will be disabled unless the OS kernel has been upgraded to version 4.3 or greater.
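
The kernel requirement in the footnote above can be checked on a target before profiling. The following is a minimal sketch, not part of Nsight Systems: the helper name `kernel_at_least` is hypothetical, and the only value taken from this page is the 4.3 threshold. It does not tell you which features would be disabled.

```python
import platform
import re


def kernel_at_least(release, minimum=(4, 3)):
    """Return True if a kernel release string such as '4.15.0-20-generic'
    is at or above the given (major, minor) version."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    # On Linux, platform.release() reports the running kernel release.
    release = platform.release()
    verdict = "meets 4.3+" if kernel_at_least(release) else "is older than 4.3"
    print(release, verdict)
```

Comparing `(major, minor)` tuples rather than raw strings avoids the usual pitfall where "4.10" sorts before "4.3" lexically.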

Supported target hardware

  • GPU: Pascal or newer
  • CPU: x86-64 processors*
* Intel Haswell architecture or newer is required for LBR sampling backtraces

Supported target software

  • 64-bit applications only
  • CUDA 9.0+ for CUDA tracing

Supported host operating systems for data visualization:

  • Windows 7+
  • Mac OS X 10.9+
  • Ubuntu 14.04, 16.04, and 18.04

Jetson Autonomous Machines and DRIVE Autonomous Vehicles

Release Highlights

  • Introducing Nsight Systems 2018.1
  • Improvements to quality, usability, and scalability

Downloads

Nsight Systems is bundled as part of the following product development suites:

DRIVE via DriveInstall
Jetson via JetPack (under its former name of Tegra System Profiler)

Documentation

Support

To provide feedback, request additional features, or report support issues, please use the Developer Forums.

System Requirements

Supported Target Hardware

  • ShieldTV
  • Jetson AGX Xavier, Jetson TX2, Jetson TX1
  • DRIVE AGX Pegasus, DRIVE AGX Xavier, DRIVE PX Parker AutoChauffeur, DRIVE PX Parker AutoCruise

Supported target operating systems for data collection:

  • QNX
  • Linux
  • Android

Supported host operating systems for data visualization:

  • Ubuntu 14.04, 16.04, and 18.04

Windows Workstations and Gaming PCs

Release Highlights

  • Introducing Nsight Systems 2018.3
  • Preliminary support for Windows 7+ including
    • CPU core, process, and thread activity
    • CPU thread state trace
    • CPU thread blocked state backtraces
    • Direct3D 12 (single GPU only)
      • API trace
      • GPU workload trace
      • Including DXR
    • CUDA
    • NVTX

Downloads

Available for profiling directly on Linux workstations and servers, including the NVIDIA DGX line, or remotely from a variety of hosts: Windows, Linux, or Mac OS X.

Download Now


Not profiling Windows targets?
Learn about other target platforms.

Documentation

Support

To provide feedback, request additional features, or report support issues, please use the Developer Forums.

System Requirements

Supported operating systems

  • Windows 10*
* Windows remote profiling is not supported at this time

Supported target hardware

  • GPU: Pascal or newer
  • CPU: x86-64 processors

Supported target software

  • 64-bit applications only
  • CUDA 10.0+ for CUDA tracing
  • Requires driver r411.63 or newer