Begins transform feedback when rendering with an xfb-enabled
pipeline bound, and ends transform feedback as needed, while
writing back the counters supplied by the app. This does not yet
support transform feedback queries or the transform feedback draw
command.
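
The begin/end logic described above can be sketched roughly as follows.
This is an illustration only: `XfbContext`, `bindPipeline` and `draw`
are hypothetical names, and commands are recorded as strings instead of
being sent to a real command buffer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical context that records commands as strings for illustration.
class XfbContext {
public:
  // Binds a pipeline; xfbEnabled says whether it captures vertex output.
  void bindPipeline(bool xfbEnabled) {
    // End transform feedback before a non-xfb pipeline is used.
    if (m_xfbActive && !xfbEnabled)
      endXfb();
    m_pipelineHasXfb = xfbEnabled;
  }

  // Begins transform feedback lazily on the first draw that needs it.
  void draw() {
    if (m_pipelineHasXfb && !m_xfbActive)
      beginXfb();
    m_log.push_back("draw");
  }

  const std::vector<std::string>& log() const { return m_log; }

private:
  void beginXfb() { m_xfbActive = true;  m_log.push_back("begin xfb"); }
  // Ending is the point at which the app-supplied counters are written back.
  void endXfb()   { m_xfbActive = false; m_log.push_back("end xfb, write counters"); }

  bool m_pipelineHasXfb = false;
  bool m_xfbActive = false;
  std::vector<std::string> m_log;
};
```

Consecutive draws with the same xfb-enabled pipeline share a single
begin/end pair in this sketch.
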
Introduces an OpenGL-style bind point for the argument buffer, which
lets us avoid unnecessary reference tracking in games that issue a
large number of indirect draw calls.
Reduces CPU overhead in Assassin's Creed Odyssey.
This is optimized to allow a large number of indirect draws to be
submitted if they all access the same argument buffer, as is the
case in Assassin's Creed Syndicate and Odyssey.
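
A minimal sketch of the bind-point idea, assuming hypothetical names
(`DrawContext`, `bindDrawBuffer`, `drawIndirect`, `ArgBuffer`) that are
not taken from the actual code: the argument buffer is tracked once per
bind, so further indirect draws from the same buffer add no tracking
work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU buffer holding indirect draw arguments.
struct ArgBuffer { };

class DrawContext {
public:
  // Binds the argument buffer; a new buffer has to be tracked again.
  void bindDrawBuffer(std::shared_ptr<ArgBuffer> buffer) {
    if (m_argBuffer != buffer) {
      m_argBuffer = std::move(buffer);
      m_argBufferTracked = false;
    }
  }

  // Records an indirect draw that reads its arguments at the given offset.
  void drawIndirect(uint32_t offset) {
    assert(m_argBuffer != nullptr);
    if (!m_argBufferTracked) {
      // Keep the buffer alive for this command list, once per bind.
      m_tracked.push_back(m_argBuffer);
      m_argBufferTracked = true;
    }
    m_drawOffsets.push_back(offset);
  }

  std::size_t trackedCount() const { return m_tracked.size(); }
  std::size_t drawCount() const { return m_drawOffsets.size(); }

private:
  std::shared_ptr<ArgBuffer> m_argBuffer;
  bool m_argBufferTracked = false;
  std::vector<std::shared_ptr<ArgBuffer>> m_tracked;
  std::vector<uint32_t> m_drawOffsets;
};
```

With this layout, a thousand draws from one argument buffer produce a
single tracking entry instead of a thousand.
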
The stat counter struct no longer has to be passed to
the pipeline compiler function.
The new implementation uses atomic counters of the pipeline manager
rather than per-command list counters, which removes the need to
pass the counter structure to the compiler function.
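
As a rough illustration of the change, assuming a hypothetical
`PipelineManager` class and method names: the manager owns an atomic
counter, so the compile function needs no counter argument.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical pipeline manager that owns the statistics itself.
class PipelineManager {
public:
  // Stand-in for a pipeline compile; note the absence of a counter
  // struct parameter.
  void compileGraphicsPipeline() {
    // ... the actual compilation would happen here ...
    m_numGraphicsPipelines.fetch_add(1u, std::memory_order_relaxed);
  }

  uint32_t graphicsPipelineCount() const {
    return m_numGraphicsPipelines.load(std::memory_order_relaxed);
  }

private:
  // Atomic, so compiles from several threads need no extra locking.
  std::atomic<uint32_t> m_numGraphicsPipelines { 0u };
};
```
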
These new methods can support overlapped subresource copies by
creating a temporary resource and effectively using two copy
operations. This is required for D3D11 overlapped copies.
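
The two-copy approach can be sketched on a plain byte array. This is a
simplified model, not the real image copy path; `copyRegion` and
`rangesOverlap` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the source and destination ranges intersect.
bool rangesOverlap(std::size_t srcOffset, std::size_t dstOffset, std::size_t size) {
  return srcOffset < dstOffset + size && dstOffset < srcOffset + size;
}

// Copies size bytes within one resource, tolerating overlapping ranges.
void copyRegion(std::vector<unsigned char>& resource,
                std::size_t dstOffset, std::size_t srcOffset, std::size_t size) {
  assert(srcOffset + size <= resource.size());
  assert(dstOffset + size <= resource.size());

  if (!rangesOverlap(srcOffset, dstOffset, size)) {
    // Disjoint ranges: a single direct copy is enough.
    std::copy_n(resource.begin() + srcOffset, size, resource.begin() + dstOffset);
    return;
  }

  // Copy 1: source range into a temporary resource.
  std::vector<unsigned char> tmp(resource.begin() + srcOffset,
                                 resource.begin() + srcOffset + size);
  // Copy 2: temporary resource into the destination range.
  std::copy(tmp.begin(), tmp.end(), resource.begin() + dstOffset);
}
```
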
We don't support rasterization with a sample count different from
the framebuffer sample count, but if there are no attachments, any
sample count is allowed.
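
The rule stated above, written as a small predicate. The function name
and parameters are hypothetical and only mirror the sentence; this is
not the actual validation code.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if rasterizing with rasterSamples is allowed for a
// framebuffer with the given attachment count and sample count.
bool isSampleCountAllowed(uint32_t rasterSamples,
                          uint32_t attachmentCount,
                          uint32_t framebufferSamples) {
  // Without attachments there is nothing the sample count must match.
  if (attachmentCount == 0u)
    return true;
  return rasterSamples == framebufferSamples;
}
```
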
Moved all query-related state tracking and management into a
separate class. This allows for new query types to be added
in the future, and makes fewer dodgy assumptions about the
current state when beginning or ending a query.
If a game clears the depth and stencil aspects of a depth-stencil
buffer separately, we must not override the load op and clear value
of the previously set aspect. Fixes a rendering issue in Hitman
Absolution.
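
A minimal sketch of per-aspect clear handling, assuming hypothetical
types (`DepthStencilOps`, `LoadOp`) and a hypothetical `deferClear`
helper: a clear of one aspect updates only that aspect's load op and
clear value.

```cpp
#include <cassert>
#include <cstdint>

enum class LoadOp { Load, Clear };

// Hypothetical aspect bits, loosely modelled on Vulkan's aspect flags.
enum AspectFlags : uint32_t { AspectDepth = 1u, AspectStencil = 2u };

struct DepthStencilOps {
  LoadOp   depthLoadOp   = LoadOp::Load;
  LoadOp   stencilLoadOp = LoadOp::Load;
  float    clearDepth    = 0.0f;
  uint32_t clearStencil  = 0u;
};

// Records a clear for the given aspects only; state belonging to the
// other aspect is left untouched.
void deferClear(DepthStencilOps& ops, uint32_t aspects, float depth, uint32_t stencil) {
  assert(aspects != 0u);

  if (aspects & AspectDepth) {
    ops.depthLoadOp = LoadOp::Clear;
    ops.clearDepth  = depth;
  }

  if (aspects & AspectStencil) {
    ops.stencilLoadOp = LoadOp::Clear;
    ops.clearStencil  = stencil;
  }
}
```

Clearing depth first and stencil second then leaves both aspects with
their own clear values, which is the case the fix is about.
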
Some operations affect only one of the two aspects of a
depth-stencil image. This fixes two possible issues:
- Image memory barriers are now applied to all image aspects
- VK_IMAGE_LAYOUT_UNDEFINED is no longer used as a source layout
  if the operation requiring the transition only uses one aspect
We don't need to force layout transitions and emit double pipeline
barriers in case the GENERAL layout is being used for both images.
This is somewhat common for images used by compute shaders, and
this optimization ensures that only required barriers are emitted.
Instead of inserting a barrier after every single buffer copy, update
or clear operation, we batch them up and execute the barrier when the
first dirty buffer is used by a command. This significantly reduces
the number of pipeline barriers in some games, e.g. Final Fantasy XV.
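
The batching scheme can be sketched as follows. `BarrierTracker` and
its methods are hypothetical, buffers are identified by plain integers,
and hazards between the transfer operations themselves are ignored for
brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

class BarrierTracker {
public:
  // Called after a copy, update or clear has written to the buffer.
  void markDirty(uint32_t bufferId) {
    m_dirty.insert(bufferId);
  }

  // Called before a command uses the buffer. Emits one barrier covering
  // all pending writes if the buffer is dirty; returns true if it did.
  bool access(uint32_t bufferId) {
    if (m_dirty.find(bufferId) == m_dirty.end())
      return false;

    // A real implementation would record a pipeline barrier here.
    m_barrierCount += 1u;
    m_dirty.clear();
    return true;
  }

  uint32_t barrierCount() const { return m_barrierCount; }

private:
  std::unordered_set<uint32_t> m_dirty;
  uint32_t m_barrierCount = 0u;
};
```

Three transfer writes followed by uses of those buffers cost one
barrier here, where inserting a barrier after every operation would
cost three.
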
Spilling the render pass should make shader storage buffer/image writes
visible due to how external subpass dependencies are defined. For UAV
rendering, we need to do this when changing the UAVs, even if the render
targets themselves do not change.
Works around an issue with some games not setting the D3D11 depth
bias state correctly, which can result in an excessive number of
pipelines being compiled.