Recently we introduced the VK_NVX_device_generated_commands (DGC) Vulkan extension, which allows rendering commands to be generated entirely on the GPU. Earlier this week, we added support for VK_NVX_device_generated_commands to our Windows and Linux release drivers. Today we are releasing the ‘BasicDeviceGeneratedCommandsVk’ GameWorks SDK sample. We highly recommend reading the introductory Vulkan Device-Generated Commands article in addition to this blog post.


BasicDeviceGeneratedCommandsVk GameWorks SDK Sample


The sample renders a model split into two parts, where each part is a subset of the geometry selected via indexCount and firstIndex. Each part is then rendered using pipeline state objects (PSOs) with different polygon modes. Several methods are implemented to render those parts, each generating a different amount of work on the GPU, as the following table summarizes:

Draw mode | Commands generated via API calls | Commands generated on device
Core Vulkan
DrawIndexed | VBO/IBO bindings, PSO bindings, descriptor set bindings, draw calls | -
DrawIndirect | VBO/IBO bindings, PSO bindings, descriptor set bindings | Draw calls
VK_NVX_device_generated_commands
DeviceGeneratedDrawIndirect | VBO/IBO bindings, PSO bindings, descriptor set bindings | Draw calls
DeviceGeneratedPsoDrawIndirect | VBO/IBO bindings, descriptor set bindings | PSO bindings, draw calls
DeviceGeneratedVboIboPsoDrawIndirect | Descriptor set bindings | VBO/IBO bindings, PSO bindings, draw calls
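
As a rough illustration of how a model can be split into two parts via indexCount and firstIndex, here is a minimal sketch. It is not taken from the sample: the struct is a local mirror of the field layout of Vulkan's VkDrawIndexedIndirectCommand so that the code builds without Vulkan headers, and splitModel, SplitDraws, and the "split roughly in half" policy are hypothetical choices made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Local mirror of the field layout of Vulkan's VkDrawIndexedIndirectCommand,
// declared here only so the sketch compiles without the Vulkan headers.
struct DrawIndexedIndirectCommand {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
};

// Hypothetical container for the two draws, one per model part.
struct SplitDraws {
    DrawIndexedIndirectCommand part[2];
};

// Hypothetical helper: split an index range into two contiguous parts.
// Assumption: the model is a triangle list, so the split point is rounded
// down to a multiple of 3 to avoid cutting a triangle in two.
inline SplitDraws splitModel(uint32_t totalIndexCount) {
    assert(totalIndexCount % 3 == 0);
    const uint32_t firstCount = (totalIndexCount / 2) / 3 * 3;

    SplitDraws s{};
    // Part 0 covers indices [0, firstCount).
    s.part[0] = {firstCount, 1, 0, 0, 0};
    // Part 1 starts where part 0 ends and covers the remaining indices.
    s.part[1] = {totalIndexCount - firstCount, 1, firstCount, 0, 0};
    return s;
}
```

In the indirect draw modes, records like these are what end up in the buffer the GPU reads, while each part is paired with a PSO that uses a different polygon mode.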

Notes

  • The DeviceGeneratedDrawIndirect and DrawIndirect modes are functionally equivalent and are intended to show the device-generated commands API calls that correspond to core Vulkan indirect draw API calls. The other DeviceGenerated* modes build on this to illustrate more interesting use cases of device-generated commands.
  • DeviceGeneratedPsoDrawIndirect changes the PSO from within the token buffer and thus allows both parts of the model to be rendered with a single draw call, which is impossible with (multi) draw indirect.
  • There are new pipeline barrier stage and access bits that should be used to synchronize access to the buffers used for command generation:
    • To sync from writing the indirect commands to vkCmdProcessCommandsNVX:
      • srcStageMask/srcAccessMask = whatever wrote the buffer
      • dstStageMask = VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX
      • dstAccessMask = VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX
    • To sync from vkCmdProcessCommandsNVX to vkCmdExecuteCommands:
      • srcStageMask = VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX
      • srcAccessMask = VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX
      • dstStageMask = VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
      • dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT
  • Compute support will be added in a future driver release.
  • vkGetPhysicalDeviceGeneratedCommandsPropertiesNVX is known to crash with an unextended loader, due to the way physical device arguments to functions are handled. For reference, the currently implemented features and limits are:
    Feature/Limit | Value
    computeBindingSupport | false
    maxIndirectCommandsLayoutTokenCount | 32
    maxObjectEntryCounts | 2^31
    minSequenceCountBufferOffsetAlignment | 256
    minSequenceIndexBufferOffsetAlignment | 32
    minCommandsTokenBufferOffsetAlignment | 32
  • A few things are not implemented in this sample for simplicity; they are however straightforward to add:
    • Generating the token buffers from a shader instead of uploading them to the device via vkCmdUpdateBuffer
    • Binding descriptor sets from the token buffer instead of binding them via API calls
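
The offset alignments in the limits table above matter when several inputs to command generation are packed into one buffer. The following is a minimal sketch of that arithmetic, not code from the sample: the constants are copied from the table, while alignUp, GeneratedCommandsLayout, planLayout, and the idea of packing everything into a single buffer are hypothetical. It also assumes the sequence count is stored as one 32-bit value; an application that can query the limits from the driver should use the queried values instead of hard-coded ones.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the limits table in this post (hard-coded here only
// because the sketch does not query the driver).
constexpr uint64_t kMinSequenceCountBufferOffsetAlignment = 256;
constexpr uint64_t kMinSequenceIndexBufferOffsetAlignment = 32;
constexpr uint64_t kMinCommandsTokenBufferOffsetAlignment = 32;

// Round `offset` up to the next multiple of `alignment`.
// The bit trick requires `alignment` to be a non-zero power of two.
inline uint64_t alignUp(uint64_t offset, uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical layout: token data, sequence indices, and the sequence
// count placed back to back in one buffer, each at an aligned offset.
struct GeneratedCommandsLayout {
    uint64_t tokenOffset;
    uint64_t sequenceIndexOffset;
    uint64_t sequenceCountOffset;
    uint64_t totalSize;
};

inline GeneratedCommandsLayout planLayout(uint64_t tokenBytes,
                                          uint64_t sequenceIndexBytes) {
    GeneratedCommandsLayout l{};
    l.tokenOffset = alignUp(0, kMinCommandsTokenBufferOffsetAlignment);
    l.sequenceIndexOffset = alignUp(l.tokenOffset + tokenBytes,
                                    kMinSequenceIndexBufferOffsetAlignment);
    l.sequenceCountOffset = alignUp(l.sequenceIndexOffset + sequenceIndexBytes,
                                    kMinSequenceCountBufferOffsetAlignment);
    // Assumption: the sequence count is a single 32-bit value.
    l.totalSize = l.sequenceCountOffset + sizeof(uint32_t);
    return l;
}
```

For example, with 100 bytes of token data and 40 bytes of sequence indices, the sequence indices would start at offset 128 and the sequence count at offset 256.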

References

  • Drivers
    • Windows, version 376.09 or newer
    • Linux, version 375.20 or newer
  • Headers
    • A future LunarG SDK release is expected to include headers for the extension. In the meantime, definitions and declarations are provided as part of the sample in vk_nvx_device_generated_commands.h/.cpp

Sample Code

Specifications and Documentation