Some operations can operate on only one of the two aspects
of a depth-stencil image. This fixes two possible issues:
- Image memory barriers must be applied to all image aspects
- VK_IMAGE_LAYOUT_UNDEFINED is no longer used as a source layout
if the operation requiring the transition only uses one aspect
We don't need to force layout transitions and emit double pipeline
barriers in case the GENERAL layout is being used for both images.
This is somewhat common for images used by compute shaders, and
this optimization ensures that only required barriers are emitted.
Instead of inserting a barrier after every single buffer copy, update
or clear operation, we batch them up and execute the barrier when the
first dirty buffer is used by a command. This significantly reduces
the number of pipeline barriers in some games, e.g. Final Fantasy XV.
Spilling the render pass should make shader storage buffer/image writes
visible due to how external subpass dependencies are defined. For UAV
rendering, we need to do this when changing the UAVs, even if the render
targets themselves do not change.
Works around an issue with some games not setting the D3D11 depth
bias state correctly, which can result in an excessive number of
pipelines being compiled.
Rather than creating just one image view per DxvkImageView, we create
views for all compatible types in an attempt to work around game bugs
in Diablo 3, Far Cry 5, Nier Automata, Dishonored 2, Trackmania etc.,
which bind incompatible resource views to some resource slots.
Reduces the number of Vulkan calls for vertex buffer bindings and
works around incorrect validation errors emitted when applications
do not use a consecutive range of vertex bindings. No performance
impact is expected in most games.
We have to use VK_IMAGE_LAYOUT_GENERAL for those. On top of that,
we should avoid image transitions when the image is in GENERAL
layout anyway in order to save some time on the GPU.
This should help reduce the number of redundant render pass spills,
especially in games which use deferred contexts for rendering. This
optimization mostly helps in GPU-bound scenarios.
An alternative to manually creating a framebuffer object and binding
it via bindFramebuffer. Future optmizations can use this to bring
down the number of redundant render pass changes.
Since we create only one DxvkContext per D3D11Device, rather than
per D3D11DeviceContext as originally planned, there is no need to
keep the pipeline manager as a global thread-safe object. This may
slightly reduce CPU overhead.
This is required for resource mapping on deferred contexts.
May also fix a potential synchronization issue where a buffer
could be mapped multiple times before the CS thread would mark
the physical buffer as used, which would result in invalid data.
Reduces the CPU overhead of descriptor set updates, which usually
happen once per draw call. Gains seem to be minor in most games,
some outliers show significantly better performance (i.e. Tomb Raider).
Closer to the D3D11 API. We cannot use the normal clearColorImage and
clearDepthStencilImage methods in case the game uses a 2D array view
for a 3D image. Fixes some validation issues in Hellblade.