On using VRS as an antialiasing optimization

An idea I came up with during a few spare moments from work.

How about using VRS as a means to optimize SSAA?
SSAA is known to be very slow because the render targets are 4x larger, so there is 4x more to render.
More precisely, it needs 4x more costly fragment shader invocations, which VRS could in principle mitigate.

When the VRS (Variable Rate Shading) extension is enabled, it permits a single fragment shader invocation to cover, for example, a tile of 4 pixels (2×2).
This means you could save some render time by reducing the number of very costly fragment shader invocations.
I would like to refer to this optimization method as VRSAA.
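
To put rough numbers on the idea, here is a back-of-the-envelope sketch that counts fragment shader invocations for a single full-screen pass. It is only an idealized model: it assumes every pixel is covered exactly once (no overdraw), that the 4x render target doubles each axis of the 3440×1440 swapchain, and that every tile is shaded at the coarse rate. The `Extent` type and `fragmentInvocations` helper are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this estimate, not part of any graphics API.
struct Extent {
    std::uint32_t width;
    std::uint32_t height;
};

// Number of fragment shader invocations needed to cover `target` exactly once
// when a single invocation shades a tile of `tile` pixels.
// Assumes no overdraw; partial tiles at the edges are rounded up.
constexpr std::uint64_t fragmentInvocations(Extent target, Extent tile) {
    assert(tile.width > 0 && tile.height > 0);
    const std::uint64_t tilesX = (std::uint64_t{target.width} + tile.width - 1) / tile.width;
    const std::uint64_t tilesY = (std::uint64_t{target.height} + tile.height - 1) / tile.height;
    return tilesX * tilesY;
}

constexpr Extent swapchain{3440, 1440};
constexpr Extent ssaaTarget{swapchain.width * 2, swapchain.height * 2};  // 4x the pixels

constexpr std::uint64_t noAa  = fragmentInvocations(swapchain, {1, 1});
constexpr std::uint64_t ssaa  = fragmentInvocations(ssaaTarget, {1, 1});
constexpr std::uint64_t vrsaa = fragmentInvocations(ssaaTarget, {2, 2});

static_assert(ssaa == 4 * noAa, "SSAA shades four times as many fragments");
static_assert(vrsaa == noAa, "2x2 VRS brings the count back to the native one");
```

In this model VRSAA shades as many fragments as rendering without AA, while the render target, and everything else that scales with its size, stays 4x larger.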

Now the details, with images and statistics for SSAA and VRS:

rt4x

rt4x vrs

rt4x -> resize to swapchain

rt4x vrs -> resize to swapchain

There is hardly any visible difference after resizing and copying the render target image onto the swapchain image.
Here are the same settings, but with intermediate FXAA:

rt4x fxaa

rt4x vrs fxaa

rt4x fxaa -> resize to swapchain

rt4x vrs fxaa -> resize to swapchain

Full image with VRSAA + FXAA

The following statistics were measured on an RTX 3070 Laptop, Nvidia driver version 565.77 on Linux, at a swapchain resolution of 3440×1440.

Render method	Average frame time
No AA	1.002ms
FXAA	1.466ms
SSAA	2.392ms
SSAA + FXAA	3.846ms
VRSAA	2.347ms
VRSAA + FXAA	3.704ms

A gain of about 0.05ms per frame does not mean much, especially when the scene has only 30 draw calls with very simple fragment shaders.
Extrapolating the data and assuming my frame time was 33ms (30 FPS) with SSAA, the gain would be 0.62ms per frame, which translates to an improvement of roughly 0.6 FPS (from about 30.3 to about 30.9 FPS).
However, in a more complex scene with more advanced shaders such as PBR, there is a chance the results could be quite different.
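
The extrapolation can be reproduced with a few lines of arithmetic. This is a sketch under one big assumption: that the relative saving measured between SSAA and VRSAA (2.392ms vs. 2.347ms) stays the same at a 33ms frame time. The `Extrapolation` struct and `extrapolate` function are hypothetical names used only here.

```cpp
#include <cassert>

// Hypothetical helper for illustration; all times are in milliseconds.
struct Extrapolation {
    double savedMs;    // frame time saved per frame at the target frame time
    double fpsBefore;  // frame rate at the target frame time
    double fpsAfter;   // frame rate once the saving is applied
};

// Scales the measured relative saving (baseline -> optimized) to another
// frame time, assuming the saving stays proportional to the frame time.
constexpr Extrapolation extrapolate(double baselineMs, double optimizedMs, double targetMs) {
    const double relativeSaving = (baselineMs - optimizedMs) / baselineMs;
    const double savedMs = targetMs * relativeSaving;
    return {savedMs, 1000.0 / targetMs, 1000.0 / (targetMs - savedMs)};
}

// Measured values from the table: SSAA 2.392ms, VRSAA 2.347ms.
constexpr Extrapolation atThirtyFps = extrapolate(2.392, 2.347, 33.0);
```

Here `atThirtyFps.savedMs` comes out at about 0.62ms, and the frame rate moves from about 30.3 to about 30.9 FPS.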

As a side observation worth mentioning:
If you have FXAA enabled, or possibly another antialiasing method, you could also enable VRS 2×2 tiling, since with the two features combined the difference can be very small, to the point of being unobservable in scenes with motion (i.e. regular gameplay).

Extra details on how to enable VRS in Vulkan:

To enable VRS, you have to query for and enable the VK_KHR_fragment_shading_rate extension (VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME) on your device.

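#include <algorithm>
#include <cstring>
#include <vector>
#include <vulkan/vulkan.h>

// the first call queries the extension count, the second one fills the array
uint32_t count = 0;
vkEnumerateDeviceExtensionProperties( thePhysicalDevice, nullptr, &count, nullptr );
std::vector<VkExtensionProperties> existingExtensions( count );
vkEnumerateDeviceExtensionProperties( thePhysicalDevice, nullptr, &count, existingExtensions.data() );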

auto vrscmp = []( const auto& prop )
{
    return std::strcmp( prop.extensionName, VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME ) == 0;
};
bool vrsAvailable = std::ranges::find_if( existingExtensions, vrscmp ) != existingExtensions.end();

std::vector<const char*> enableExtensions{ /* ... other extensions to use ... */ };
if ( vrsAvailable ) enableExtensions.emplace_back( VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME );

If the VRS extension is present, you have to pass a VkPhysicalDeviceFragmentShadingRateFeaturesKHR structure through the VkDeviceCreateInfo::pNext field (or further down the chain of .pNext fields of subsequent structs).

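// Request only the features the device reports through vkGetPhysicalDeviceFeatures2;
// vkCreateDevice fails if an unsupported feature is requested.
VkPhysicalDeviceFragmentShadingRateFeaturesKHR vrsInfo{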
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_SHADING_RATE_FEATURES_KHR,
    .pipelineFragmentShadingRate = VK_TRUE,
    .primitiveFragmentShadingRate = VK_TRUE,
    .attachmentFragmentShadingRate = VK_TRUE,
};

VkDeviceCreateInfo deviceCreateInfo{
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    .pNext = vrsAvailable ? &vrsInfo : nullptr, /* ... nullptr or other create infos you wish ... */
    /* ... rest of your device create info fields ... */
};
vkCreateDevice( thePhysicalDevice, &deviceCreateInfo, nullptr, &theLogicalDevice );

Now the path diverges. You can do either of the following, or both if you desire:

Enable VRS as a pipeline property
Enable VRS as a pipeline dynamic state and call vkCmdSetFragmentShadingRateKHR at least once before the draw call

VRS as pipeline property:
When creating the pipeline, pass a VkPipelineFragmentShadingRateStateCreateInfoKHR structure through the VkGraphicsPipelineCreateInfo::pNext field (or further down the chain of .pNext fields of subsequent structs).
This makes the pipeline use the specified VRS tile size for every draw call, with no further input required.

VkPipelineFragmentShadingRateStateCreateInfoKHR vrs{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_FRAGMENT_SHADING_RATE_STATE_CREATE_INFO_KHR,
    .fragmentSize{ 2, 2 },
    .combinerOps{ VK_FRAGMENT_SHADING_RATE_COMBINER_OP_KEEP_KHR, VK_FRAGMENT_SHADING_RATE_COMBINER_OP_KEEP_KHR },
};
VkGraphicsPipelineCreateInfo pipelineInfo{
    .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    .pNext = vrsAvailable ? &vrs : nullptr, /* ... nullptr or other create infos you wish ... */
    /* ... rest of your pipeline create info fields ... */
};
vkCreateGraphicsPipelines( theLogicalDevice, VK_NULL_HANDLE, 1, &pipelineInfo, nullptr, &thePipeline );

VRS as dynamic state:
When creating the pipeline, add VK_DYNAMIC_STATE_FRAGMENT_SHADING_RATE_KHR to the collection of enabled dynamic states.
This makes the pipeline use the VRS tile size set during command buffer recording, which means you have to call vkCmdSetFragmentShadingRateKHR at least once during recording, before any draw call.

std::vector<VkDynamicState> enabledDynamicStates{ /* ... other enabled dynamic states ... */ };
// refer to the device code above to see the vrsAvailable variable
if ( vrsAvailable ) enabledDynamicStates.emplace_back( VK_DYNAMIC_STATE_FRAGMENT_SHADING_RATE_KHR );

VkPipelineDynamicStateCreateInfo dynamicState{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
    .dynamicStateCount = static_cast<uint32_t>( enabledDynamicStates.size() ),
    .pDynamicStates = enabledDynamicStates.data(),
};
VkGraphicsPipelineCreateInfo pipelineInfo{
    .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    .pDynamicState = &dynamicState,
    /* ... rest of your pipeline create info fields ... */
};

vkCreateGraphicsPipelines( theLogicalDevice, VK_NULL_HANDLE, 1, &pipelineInfo, nullptr, &thePipeline );

// later, while recording the command buffer and before any draw call
const VkExtent2D vrs{ 2, 2 };
const VkFragmentShadingRateCombinerOpKHR combiner[]{
    VK_FRAGMENT_SHADING_RATE_COMBINER_OP_KEEP_KHR,
    VK_FRAGMENT_SHADING_RATE_COMBINER_OP_KEEP_KHR,
};
vkCmdSetFragmentShadingRateKHR( cmd, &vrs, combiner );

However, please note that vkCmdSetFragmentShadingRateKHR is an extension function and may not be exported by your Vulkan library. In such a case your program may crash long before it reaches main().
To mitigate this, you need to resolve it manually from your Vulkan instance with vkGetInstanceProcAddr.
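
A minimal sketch of that lookup pattern follows. To keep it compilable without the Vulkan headers, the loader is passed in as a parameter, and `GenericFn`, `Loader` and `resolveExtensionFn` are hypothetical stand-ins rather than Vulkan API. In a real program the loader would be vkGetInstanceProcAddr, the handle your VkInstance, and the result type PFN_vkCmdSetFragmentShadingRateKHR.

```cpp
#include <cassert>

// Stand-ins (assumptions) that mirror the shape of PFN_vkVoidFunction and
// vkGetInstanceProcAddr, so the sketch builds without <vulkan/vulkan.h>.
using GenericFn = void (*)();
using Loader = GenericFn (*)(void* instance, const char* name);

// Looks the entry point up at runtime instead of linking against it.
// Returns nullptr when the loader does not know the function, so the
// caller can fall back to rendering without VRS.
template <typename Fn>
Fn resolveExtensionFn(Loader loader, void* instance, const char* name) {
    assert(loader != nullptr);
    return reinterpret_cast<Fn>(loader(instance, name));
}

// With the real headers the same idea reads roughly:
//   auto setRate = reinterpret_cast<PFN_vkCmdSetFragmentShadingRateKHR>(
//       vkGetInstanceProcAddr( theInstance, "vkCmdSetFragmentShadingRateKHR" ) );
//   if ( setRate ) setRate( cmd, &vrs, combiner );
// where theInstance is assumed to be the VkInstance your device came from.
```

Checking the returned pointer for null before the first call keeps the non-VRS path working on drivers that do not expose the extension.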

A reference implementation of VRS is available at:
https://github.com/xmaciek/starace/commit/4b01ab86127f702dcf8fec0da84a0053f1ef1b8f

DreamTalon
