NVIDIA® Nsight™ Graphics 2024.1.0 is released with the following changes:

New Features:

  • GPU Trace
    • GPU Trace now has an API Event List view, similar to the Frame Debugger. This view provides an API-oriented list of commands that are executed in the frame, ordered by Queues → CommandLists → Perf Markers and Actions.
    • The Event List view indicates whether the GPU Context was interrupted, in a dedicated column. Use this in conjunction with the GPU Contexts timeline row to identify when background processes interfered with your application’s graphics performance.
    • A new “Top-Level Triage” configuration has been introduced, as an additional option in the Launch Settings | Timeline Metrics dropdown. This configuration provides a back-of-pipe to front-of-pipe view of the GPU, physical quantities in metric tables and tooltips, a Screen Pipe Data Flow row containing pixels at each stage of the raster pipeline, and many more improvements.
    • In Vulkan applications, NGX Shaders and workloads are shown in GPU Trace. This includes work items created by DLSS, DLSS Frame Generation, and similar NVIDIA SDKs.
  • Shader Profiler
    • The Ray Tracing Live State table now supports exporting to CSV. Either use the right-click menu command, or expand the rows of interest and copy-to-clipboard.
  • Frame Debugger
    • Added support for capture, replay, and frame debugging of OpenXR applications.
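
Once the Ray Tracing Live State table has been exported to CSV, it can be post-processed with a script. The following is a minimal Python sketch, assuming only that the export starts with a header row. The function name `load_exported_table` and the sample columns `Name` and `Value` are hypothetical placeholders for illustration, not the tool's actual column names.

```python
import csv
import io


def load_exported_table(csv_text):
    """Parse exported CSV text into (column_names, rows).

    Each row is a dict keyed by the header names found in the file,
    so no particular column layout is assumed.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    # fieldnames is None when the input is empty
    return list(reader.fieldnames or []), rows


# Hypothetical sample data; a real export will have different columns.
sample = "Name,Value\nClosestHitMain,48\nMissMain,16\n"
columns, rows = load_exported_table(sample)
print(columns)
print(len(rows), "rows")
```

Reading the header from the file, rather than hard-coding column names, keeps the script usable if the exported columns differ between versions.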

Improvements:

  • GPU Trace
    • Trace Analysis now provides the top 3 issues per Perf Marker in its Markers table. This provides an overview of every performance issue in the frame within a single compact view. Each issue hyperlinks to its detailed description.
    • Multi-selection is now supported on the timeline, and via the Event List view. To access this feature, select the Trace Analysis button at the top of a trace report.
    • The timeline shows a red striped overlay in regions with missing data. This is useful for identifying truncated traces, and for identifying regions where external processes’ GPU contexts were running.
    • In the Real-Time Shader Profiler, the time correlation of PC Samples has been improved. Selecting a region on the timeline will result in a more accurate portrayal of the shaders running in that time span.
    • The Instruction Mix view now updates upon new selections in all cases. The Instruction Mix is located in the lower-left region of the trace report.
    • D3D12 Work Graph support includes the official interfaces published by Microsoft in the Agility SDK 1.613.0. Programs that rely on the experimental D3D12 interfaces will continue to work in GPU Trace, by virtue of the official interfaces sharing the same UUIDs and layout as the experimental interfaces.
  • Shader Profiler
    • Line mappings between (Source, IL, SASS) have undergone major improvements, resulting in stable and reliable performance stats at the shader source-code level.
    • In the Hot Spots view, an issue where duplicate entries were being shown has been fixed.
    • When Shader Pipelines view’s Group By is set to Shader Source, the Instruction Mix | Scoreboard Producer table is now populated correctly.
    • The SASS view has been renamed to Disassembly, signifying that it shows mixed-mode disassembly. Disassembly is available in the Shader Source tab, under the Languages dropdown.
  • Frame Debugger
    • Consolidated the “Instance AABB Overlap Heatmap” capabilities of the Ray Tracing Inspector into a singular “world space” option.
    • Support has been added for the OpenGL extension ARB_sparse_buffer.
  • Aftermath
    • Aftermath now tracks additional history for resources, enabling you to better identify when a deleted resource leads to an MMU fault.

Known Issues:

  • GPU Trace
    • The GPU Contexts row may contain incomplete data in scenarios with high-frequency context switching. This is a limitation of the NVIDIA driver.
    • On Windows, CommandList timeline events may appear to be active for longer than their true duration, when in reality the underlying hardware queue was in a wait state for the initial portion of that time. This only occurs when Windows Hardware Accelerated GPU Scheduling is enabled.
    • The Average Warp Latency column in the shader profiler will provide accurate values within regions of execution where a single workload was executing; however, selecting broader regions will wash out the results (caused by averaging). The recommended approach is to use the Avg. Warp Latency timeline row to identify a region of execution, select a perf marker or range within that, and then use the shader profiler’s values.
    • When the Real-Time Shader Profiler is enabled, the new SM Warp Occupancy timeline row (per-shader-stage) is only able to show complete stats for D3D12, Vulkan, and OpenGL contexts in the traced process. Other contexts (such as CUDA, D3D11, or EGL) will have incomplete values in this row.
    • In D3D12 applications, NGX shaders and workloads are not shown in GPU Trace.
  • Shader Profiler
    • The Vulkan Shader Profiler’s support for KHR_non_semantic_info is contingent on shader compiler support. dxc -fspv-debug=vulkan-with-source works well, aside from this compiler bug. In the Vulkan SDK, glslangValidator -gVS does not produce sufficient info at the time of this release.
    • In the Frame Debugger, instruction execution counters are no longer collected by default. To re-enable these counters, we recommend first increasing the Windows TDR Timeout to 30 seconds; then in the Frame Debugger launch settings | Troubleshooting tab, set Collect SASS Execution Counters to “Yes”.
    • In the Frame Debugger, instruction execution counters for D3D12 pixel shaders will only be collected on R545.95 and newer drivers, due to limitations in the DX driver.
    • In the Frame Debugger, instruction execution counters for VkShaderEXT objects will only be collected on R546.24 and newer drivers, where the capability was newly introduced.
    • Unattributed samples are shown when the following types of shader code execute:
      • Ray Tracing Traversal, and similar ray tracing internals.
      • D3D12 NGX Shaders, including DLSS and similar technologies.
      • Driver-internal shaders.
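
The Average Warp Latency issue noted under GPU Trace comes down to simple arithmetic. The following Python sketch illustrates the effect, assuming the reported value behaves like a sample-weighted mean over the selected region. That assumption, the function name `average_latency`, and all numbers below are hypothetical and chosen only for illustration; they do not describe the tool's actual computation.

```python
def average_latency(regions):
    """Sample-weighted mean latency over (sample_count, avg_latency) pairs."""
    total_samples = sum(count for count, _ in regions)
    if total_samples == 0:
        return 0.0
    weighted_sum = sum(count * latency for count, latency in regions)
    return weighted_sum / total_samples


# Hypothetical workloads: (number of samples, average warp latency in cycles)
workload_a = (900, 100.0)   # many samples, low latency
workload_b = (100, 1000.0)  # few samples, high latency

# Selecting only workload B reports its own latency.
narrow = average_latency([workload_b])

# Selecting a region that spans both workloads blends the two values,
# so the high latency of workload B is largely hidden.
broad = average_latency([workload_a, workload_b])

print(narrow)  # 1000.0
print(broad)   # 190.0
```

In this sketch the broad selection reports 190 cycles even though one workload averages 1000, which is why narrowing the selection to a single perf marker or range gives more meaningful values.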

For more details and known issues, please see the full release notes!

For an overview of Nsight™ Graphics and access to resources, please visit the main Nsight™ Graphics page.

NVIDIA® Nsight™ Graphics 2024.1.0 is available for download under the NVIDIA Registered Developer Program.
