Here is an idea I came up with during a few spare moments at work.
How about using VRS (Variable Rate Shading) as a means to optimize SSAA (supersampling anti-aliasing)?
SSAA is known to be very slow because the render targets are 4x larger, so 4x more pixels have to be rendered.
Or rather, it needs 4x more costly fragment shader invocations, which VRS could in principle mitigate.
When the VRS extension is enabled, it permits a single fragment shader invocation to cover, for example, a tile of 4 pixels (2×2).
This means you could save some render time by reducing the number of very costly fragment shader invocations.
I would like to refer to this optimization method as VRSAA.
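To put rough numbers on the idea, here is a minimal sketch that estimates how many fragment shader invocations are needed to cover a render target once. It assumes that the 4x render target doubles both dimensions of the 3440×1440 swapchain used for the measurements in this post, and it ignores overdraw and any tiles the hardware cannot merge along primitive edges. The helper name `fragmentInvocations` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: estimated number of fragment shader invocations needed
// to cover a width x height target once, using tileW x tileH VRS tiles.
// Partial tiles at the right and bottom borders still cost one invocation each.
constexpr std::uint64_t fragmentInvocations( std::uint32_t width, std::uint32_t height,
                                             std::uint32_t tileW = 1, std::uint32_t tileH = 1 )
{
    assert( tileW > 0 && tileH > 0 );
    const std::uint64_t tilesX = ( std::uint64_t{ width } + tileW - 1 ) / tileW;
    const std::uint64_t tilesY = ( std::uint64_t{ height } + tileH - 1 ) / tileH;
    return tilesX * tilesY;
}

// Swapchain-sized target, no VRS: 3440 * 1440 invocations.
static_assert( fragmentInvocations( 3440, 1440 ) == 4'953'600 );
// 4x render target (6880 x 2880), no VRS: four times as many.
static_assert( fragmentInvocations( 6880, 2880 ) == 19'814'400 );
// 4x render target with 2x2 VRS tiles: back to the swapchain-sized count.
static_assert( fragmentInvocations( 6880, 2880, 2, 2 ) == fragmentInvocations( 3440, 1440 ) );
```

This only counts shader invocations. The larger render target still has to be written and resized, and VRS does not remove that cost.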
Now the details with images and statistics of SSAA and VRS:
rt4x

rt4x vrs

rt4x -> resize to swapchain

rt4x vrs -> resize to swapchain

There is hardly any visible difference after resizing and copying the render target image onto the swapchain image.
Here are the same settings, but with an intermediate FXAA pass:
rt4x fxaa

rt4x vrs fxaa

rt4x fxaa -> resize to swapchain

rt4x vrs fxaa -> resize to swapchain

Full image with VRSAA + FXAA

The following statistics were measured on an RTX 3070 Laptop GPU with Nvidia driver version 565.77 on Linux, at a swapchain resolution of 3440×1440.
| Render method | Average frame time |
|---------------|--------------------|
| No AA         | 1.002ms            |
| FXAA          | 1.466ms            |
| SSAA          | 2.392ms            |
| SSAA + FXAA   | 3.846ms            |
| VRSAA         | 2.347ms            |
| VRSAA + FXAA  | 3.704ms            |
An overall gain of about 0.05ms per frame (0.14ms with FXAA) does not mean much, especially when there are literally 30 draw calls in a scene with very simple fragment shaders.
Extrapolating the data and assuming my frame time was 33ms (about 30 FPS) with SSAA, the gain would be about 0.62ms per frame, which in turn would translate to an improvement of roughly 0.6 FPS (from about 30.3 to 30.9 FPS).
However, in a more complex scene with more advanced shaders such as PBR, there is a chance the results could be quite different.
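The extrapolation can be reproduced with a few lines of arithmetic. The sketch below assumes the VRSAA/SSAA frame time ratio from the table carries over unchanged to a 33ms frame, which is a simplification, and it measures the FPS gain against the exact rate that 33ms corresponds to (about 30.3 FPS) rather than a rounded 30 FPS. The function names are made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper: scales a frame time by the measured VRSAA/SSAA ratio.
constexpr double extrapolatedFrameMs( double frameMs, double ssaaMs, double vrsaaMs )
{
    assert( ssaaMs > 0.0 );
    return frameMs * ( vrsaaMs / ssaaMs );
}

// Frames per second for a given frame time in milliseconds.
constexpr double fps( double frameMs )
{
    assert( frameMs > 0.0 );
    return 1000.0 / frameMs;
}

constexpr double ssaaFrameMs  = 33.0;
constexpr double vrsaaFrameMs = extrapolatedFrameMs( ssaaFrameMs, 2.392, 2.347 );
constexpr double savedMs      = ssaaFrameMs - vrsaaFrameMs;               // about 0.62 ms
constexpr double gainedFps    = fps( vrsaaFrameMs ) - fps( ssaaFrameMs ); // about 0.58 FPS
```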
As a side observation worth mentioning:
If you have FXAA enabled, or possibly another anti-aliasing method, you could also enable VRS with 2×2 tiles, as the difference in the final result with these two features combined could be close to unobservable in scenes with motion (i.e. regular gameplay).
Extra details on how to enable VRS in Vulkan:
To enable the VRS extension, you have to query for and enable the VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME extension on your device.
#include <algorithm>
#include <cstring>
#include <vector>
#include <vulkan/vulkan.h>

// The first call fetches the extension count, the second one fills the list.
uint32_t count = 0;
vkEnumerateDeviceExtensionProperties( thePhysicalDevice, nullptr, &count, nullptr );
std::vector<VkExtensionProperties> existingExtensions( count );
vkEnumerateDeviceExtensionProperties( thePhysicalDevice, nullptr, &count, existingExtensions.data() );
auto vrscmp = []( const auto& prop )
{
    return std::strcmp( prop.extensionName, VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME ) == 0;
};
bool vrsAvailable = std::ranges::find_if( existingExtensions, vrscmp ) != existingExtensions.end();
std::vector<const char*> enableExtensions{ /* ... other extensions to use ... */ };
if ( vrsAvailable ) enableExtensions.emplace_back( VK_KHR_FRAGMENT_SHADING_RATE_EXTENSION_NAME );
If the VRS extension is present, you have to pass a VkPhysicalDeviceFragmentShadingRateFeaturesKHR structure through the VkDeviceCreateInfo::pNext field (or through the chain of .pNext fields in subsequent structs).
// Request only the features the device reports as supported (see vkGetPhysicalDeviceFeatures2),
// otherwise vkCreateDevice fails with VK_ERROR_FEATURE_NOT_PRESENT.
VkPhysicalDeviceFragmentShadingRateFeaturesKHR vrsInfo{
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_SHADING_RATE_FEATURES_KHR,
    .pipelineFragmentShadingRate = VK_TRUE,
    .primitiveFragmentShadingRate = VK_TRUE,
    .attachmentFragmentShadingRate = VK_TRUE,
};
VkDeviceCreateInfo deviceCreateInfo{
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    .pNext = vrsAvailable ? &vrsInfo : nullptr, /* ... nullptr or other create infos you wish ... */
    /* ... rest of your device create info fields ... */
};
vkCreateDevice( thePhysicalDevice, &deviceCreateInfo, nullptr, &theLogicalDevice );
From here there are two paths. You can take either one, or both if you desire:
- Enable VRS as a pipeline property
- Enable VRS as a pipeline dynamic state and call vkCmdSetFragmentShadingRateKHR at least once before the draw call
VRS as pipeline property:
When creating the pipeline, pass a VkPipelineFragmentShadingRateStateCreateInfoKHR structure through the VkGraphicsPipelineCreateInfo::pNext field (or through the chain of .pNext fields in subsequent structs).
This makes the pipeline use the specified VRS tile size for every draw call, with no further input required.
VkPipelineFragmentShadingRateStateCreateInfoKHR vrs{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_FRAGMENT_SHADING_RATE_STATE_CREATE_INFO_KHR,
    .fragmentSize{ 2, 2 },
    .combinerOps{ VK_FRAGMENT_SHADING_RATE_COMBINER_OP_KEEP_KHR, VK_FRAGMENT_SHADING_RATE_COMBINER_OP_KEEP_KHR },
};
VkGraphicsPipelineCreateInfo pipelineInfo{
    .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    .pNext = vrsAvailable ? &vrs : nullptr, /* ... nullptr or other create infos you wish ... */
    /* ... rest of your pipeline create info fields ... */
};
// VK_NULL_HANDLE: no pipeline cache is used here
vkCreateGraphicsPipelines( theLogicalDevice, VK_NULL_HANDLE, 1, &pipelineInfo, nullptr, &thePipeline );
VRS as dynamic state:
When creating the pipeline, add VK_DYNAMIC_STATE_FRAGMENT_SHADING_RATE_KHR to the collection of enabled dynamic states.
This makes the pipeline use the VRS tile size set during command buffer recording, which means you have to call vkCmdSetFragmentShadingRateKHR at least once during recording, before any draw call.
std::vector<VkDynamicState> enabledDynamicStates{ /* ... other enabled dynamic states ... */ };
// refer to the device code above to see the vrsAvailable variable
if ( vrsAvailable ) enabledDynamicStates.emplace_back( VK_DYNAMIC_STATE_FRAGMENT_SHADING_RATE_KHR );
VkPipelineDynamicStateCreateInfo dynamicState{
    .sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
    .dynamicStateCount = static_cast<uint32_t>( enabledDynamicStates.size() ),
    .pDynamicStates = enabledDynamicStates.data(),
};
VkGraphicsPipelineCreateInfo pipelineInfo{
    .sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
    .pDynamicState = &dynamicState,
    /* ... rest of your pipeline create info fields ... */
};
vkCreateGraphicsPipelines( theLogicalDevice, VK_NULL_HANDLE, 1, &pipelineInfo, nullptr, &thePipeline );

// Later, while recording the command buffer and before any draw call:
const VkExtent2D vrs{ 2, 2 };
const VkFragmentShadingRateCombinerOpKHR combiner[]{
    VK_FRAGMENT_SHADING_RATE_COMBINER_OP_KEEP_KHR,
    VK_FRAGMENT_SHADING_RATE_COMBINER_OP_KEEP_KHR,
};
// the dynamic state only exists on the pipeline when the extension was enabled
if ( vrsAvailable ) vkCmdSetFragmentShadingRateKHR( cmd, &vrs, combiner );
However, please note that vkCmdSetFragmentShadingRateKHR is an extension function and may not be exported by your Vulkan library. If that is the case, your program may crash long before it reaches main().
To mitigate this, you need to resolve it manually from your Vulkan instance with vkGetInstanceProcAddr.
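The resolve-and-check pattern could look roughly like the sketch below. It is written against a generic loader callable so the pattern is visible on its own; with Vulkan you would pass vkGetInstanceProcAddr and your VkInstance. The helper `resolveFunction`, the `VoidFunction` alias and the `theInstance` variable in the comment are names made up for illustration and are not part of the Vulkan API.

```cpp
#include <cassert>

// Stand-in for Vulkan's PFN_vkVoidFunction (assumption: a plain void(*)()).
using VoidFunction = void ( * )();

// Hypothetical helper: asks `getProcAddr` for `name` and casts the result to
// the requested function pointer type. Returns nullptr when the loader does
// not provide the function, so the caller can fall back instead of crashing.
template <typename Fn, typename Handle, typename GetProcAddr>
Fn resolveFunction( GetProcAddr getProcAddr, Handle handle, const char* name )
{
    assert( name != nullptr );
    return reinterpret_cast<Fn>( getProcAddr( handle, name ) );
}

// Intended use with the Vulkan headers included (not compiled here):
//   auto setShadingRate = resolveFunction<PFN_vkCmdSetFragmentShadingRateKHR>(
//       vkGetInstanceProcAddr, theInstance, "vkCmdSetFragmentShadingRateKHR" );
//   if ( setShadingRate ) setShadingRate( cmd, &vrs, combiner );
```

Resolving the pointer once after instance creation and keeping it next to the device handle avoids a lookup on every command buffer recording.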
A reference implementation of VRS is available at:
https://github.com/xmaciek/starace/commit/4b01ab86127f702dcf8fec0da84a0053f1ef1b8f