Fixed an issue with adjusting the game speed slider in StarCraft's Battle.net interface.
Adjusting the slider resulted in a WM_HSCROLL message that erased the parent window
background without also redrawing the child windows (labels) on it.
Because both the DirectDraw and GDI thread locks are held by the thread that
initially calls beginGdiRendering, a deadlock can occur if a complex rendering
operation uses synchronized worker threads for some subtasks and they also need
access to the resources locked by the initial thread.
To resolve this, the GDI thread lock is no longer held after beginGdiRendering
returns, and the DirectDraw thread lock is only taken during the initial entry.
However, this introduces another problem, because now the final endGdiRendering
might not be called by the same thread that initially called beginGdiRendering.
If this happens, a deadlock will occur because the initial thread is still
holding the DirectDraw thread lock while the other thread is trying to acquire
it to unlock the primary surface.
To resolve this, the initial thread will always be the one to release the lock
on the primary surface, waiting for other threads to finish using GDI DCs if
necessary. This also means that other threads won't be able to create new
cached DCs (as they would need the DD thread lock), so to prevent yet another
deadlock, the initial thread always preallocates a number of DCs in the cache,
and only the initial thread is allowed to extend the cache.
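The hand-off described above can be sketched in portable C++ roughly as follows. This is only an illustration under assumptions: RenderGate, begin, end and ownerReleasesLock are hypothetical names, std::mutex stands in for the critical sections, and a plain flag stands in for the primary surface lock. It is not the actual implementation.

```cpp
#include <chrono>
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the begin/endGdiRendering bookkeeping.
class RenderGate {
public:
    void begin() {
        std::lock_guard<std::mutex> lk(m_);
        const auto self = std::this_thread::get_id();
        if (!locked_) {            // initial entry: this thread becomes the owner
            owner_ = self;
            locked_ = true;        // stands in for locking the primary surface
        }
        if (self == owner_) ++ownerDepth_; else ++others_;
    }

    void end() {
        std::unique_lock<std::mutex> lk(m_);
        if (std::this_thread::get_id() != owner_) {
            if (--others_ == 0) cv_.notify_all();   // wake a waiting owner
            return;
        }
        if (--ownerDepth_ > 0) return;              // nested call by the owner
        // Only the owner releases the lock, after all other threads are done.
        cv_.wait(lk, [this] { return others_ == 0; });
        locked_ = false;
        lastUnlocker_ = std::this_thread::get_id();
    }

    bool locked() const { std::lock_guard<std::mutex> lk(m_); return locked_; }
    std::thread::id lastUnlocker() const {
        std::lock_guard<std::mutex> lk(m_);
        return lastUnlocker_;
    }

private:
    mutable std::mutex m_;
    std::condition_variable cv_;
    std::thread::id owner_;
    std::thread::id lastUnlocker_;
    int ownerDepth_ = 0;
    int others_ = 0;
    bool locked_ = false;
};

// True if the thread that started rendering is also the one that released
// the lock, although a worker was rendering when the owner called end().
bool ownerReleasesLock() {
    RenderGate gate;
    gate.begin();
    std::promise<void> workerStarted;
    std::thread worker([&] {
        gate.begin();
        workerStarted.set_value();
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        gate.end();
    });
    workerStarted.get_future().wait();
    gate.end();                    // waits for the worker if it is still busy
    worker.join();
    return !gate.locked() && gate.lastUnlocker() == std::this_thread::get_id();
}
```

In this sketch a worker never touches the lock itself. It only decrements a counter and wakes the owner, which is what keeps the release on the initial thread.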
Disabled visual styles and added WM_NCPAINT handling (as a simple BitBlt copy
from the original DC) to reduce glitches in rendering of common controls and
windows, such as the GetSaveFileName dialog window in StarCraft.
Fixed a deadlock in the caret emulation code caused by locking the GDI
critical section prior to calling beginGdiRendering, which locks both the
DirectDraw and GDI critical sections. Another thread that only calls
beginGdiRendering without entering the GDI critical section first could thus
run into a deadlock as both threads were waiting on each other's critical
sections.
Now the global caret data has its own critical section instead of sharing
the GDI critical section.
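A minimal sketch of the idea, assuming hypothetical names (Caret, CaretState, drawCaret, g_renderLock) and std::mutex in place of critical sections: the caret data is copied under its own lock, and that lock is released before the rendering locks are taken, so the two are never held in a nested order.

```cpp
#include <mutex>

struct CaretState {
    bool visible;
    int x;
    int y;
};

// Hypothetical caret data guarded by its own lock.
class Caret {
public:
    void set(int x, int y, bool visible) {
        std::lock_guard<std::mutex> lk(m_);
        state_ = CaretState{visible, x, y};
    }

    // Copy the state under the caret lock; the lock is released on return.
    CaretState snapshot() const {
        std::lock_guard<std::mutex> lk(m_);
        return state_;
    }

private:
    mutable std::mutex m_;
    CaretState state_{false, 0, 0};
};

// Stands in for the locks taken by beginGdiRendering (an assumption).
std::mutex g_renderLock;

template <typename Draw>
void drawCaret(const Caret& caret, Draw draw) {
    const CaretState s = caret.snapshot();   // caret lock no longer held here
    std::lock_guard<std::mutex> render(g_renderLock);
    if (s.visible) draw(s);
}
```

Because drawCaret never holds the caret lock while acquiring the rendering lock, a thread that takes only the rendering lock cannot end up waiting on the caret lock.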
Some new files were accidentally left out of a previous commit:
"Revamping GDI interworking - Part 2" 1999d1c56ef4af2b173764a0dffff0c270357628
The missing files are now added.
The vsynced palette update was causing performance problems in games that
frequently update the palette outside of a palette animation loop, e.g. StarCraft.
Vsync has been removed from the palette update, and instead a primary surface
synchronization is forced on the main thread, which still seems to preserve
the palette fade-in/fade-out effects in Fallout. Perhaps only the small delay
introduced by the sync was needed? (Seems it would work with Sleep(1) too).
Extended GDI redirection to most of the GDI rendering methods.
Readded handling of window messages and extended it to support scrolling.
Readded manual drawing of the caret.
Simplified the DC cache.
Previous method of GDI interworking does not seem feasible anymore as it is
leaking GDI resources in more complex scenarios (StarCraft). There does not
seem to be a way to prevent all leaks as the ReleaseDC hook does not capture
all released DCs.
New method redirects the individual GDI drawing methods instead by replacing
DCs only temporarily for each operation. Currently only BitBlt is supported
(which seems sufficient for Deadlock 2).
Also, writing to unlocked video surface memory no longer works on Windows 10.
To work around this restriction, the primary surface is temporarily locked
for the duration of each GDI rendering operation.
A certain number of DCs are now created upfront and cached to reduce the likelihood
of deadlocks. This avoids having to enter the DD critical section to create new DCs
in most cases.
This change seems to resolve deadlock issues when Alt-Tabbing in Deadlock 2.
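A preallocated cache of this kind could look roughly like the sketch below. FakeDc and DcCache are hypothetical names, and returning false on an empty cache is just one possible policy for the sketch. The restriction that only one designated thread may grow the cache is taken from the newer locking change described in this log, not from this commit.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

struct FakeDc { int id; };   // placeholder for a cached DC

class DcCache {
public:
    // 'owner' is the only thread allowed to create DCs beyond the preallocation.
    explicit DcCache(std::size_t prealloc,
                     std::thread::id owner = std::this_thread::get_id())
        : owner_(owner) {
        for (std::size_t i = 0; i < prealloc; ++i) {
            free_.push_back(FakeDc{nextId_++});
        }
    }

    // Returns false if the cache is empty and the caller may not extend it.
    bool acquire(FakeDc& out) {
        std::lock_guard<std::mutex> lk(m_);
        if (free_.empty()) {
            if (std::this_thread::get_id() != owner_) return false;
            free_.push_back(FakeDc{nextId_++});   // owner extends the cache
        }
        out = free_.back();
        free_.pop_back();
        return true;
    }

    void release(FakeDc dc) {
        std::lock_guard<std::mutex> lk(m_);
        free_.push_back(dc);
    }

    std::size_t available() const {
        std::lock_guard<std::mutex> lk(m_);
        return free_.size();
    }

private:
    mutable std::mutex m_;
    std::vector<FakeDc> free_;
    std::thread::id owner_;
    int nextId_ = 0;
};
```

As long as the preallocated count covers typical usage, acquire only touches the cache's own lock and never needs whatever heavier lock DC creation would require.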
On some systems, a newly created DirectDraw video memory surface will initially
have a different surface memory address (obtainable with Lock) than the address
returned after the first Blt operation.
Previously a color fill Blt was used on newly created surfaces to make the
returned address consistent (which GDI interworking currently relies on, as well
as some games, e.g. Nox). But this may not work correctly on some low-end systems
that don't support hardware accelerated color fills.
Now a simple 1x1 Blt is used instead with a temporary (cached) surface as source.
A separate source surface is used because I have concerns (based on some MSDN
articles about WDDM) that same-surface Blts may not be handled the same way on
all drivers.
The previous GDI synchronization overhaul introduced mouse trails in Deadlock 2.
To resolve it, the scope of the GDI critical section was extended to the entire
getCompatDc/releaseCompatDc functions.
Not sure why this helps as I didn't care to debug the issue deeply enough,
but it happens to work so I'll leave it at that for now.
GDI emulation was using the DirectDraw critical section for thread safety,
but it caused a deadlock when Alt-Tabbing in Commandos BCD.
Now it uses its own critical section and some atomic shared variables.
Locking the emulated primary surface can run into issues with the fake surface caps
added by GetSurfaceDesc, which Lock seems to call internally.
Modification of the primary surface caps is temporarily disabled in GetSurfaceDesc
for the duration of Lock calls, and instead is done only after Lock returns.
This fixes a crash at startup in Commandos.
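One way to picture the change is a scoped flag that suppresses the caps fix-up while Lock is running. Everything in this sketch is hypothetical (EmulatedPrimary, FixupDisabler, the cap values), and the real mechanism may differ.

```cpp
struct SurfaceDesc { unsigned caps; };

constexpr unsigned kRealCaps = 0x10;   // illustrative values only
constexpr unsigned kFakeCaps = 0x01;

class EmulatedPrimary {
public:
    void getSurfaceDesc(SurfaceDesc& desc) const {
        desc.caps = kRealCaps;
        if (!capsFixupDisabled_) desc.caps |= kFakeCaps;   // normal path
    }

    void lock(SurfaceDesc& desc) {
        {
            FixupDisabler guard(capsFixupDisabled_);
            internalLock(desc);      // sees the unmodified caps
        }
        desc.caps |= kFakeCaps;      // fix-up applied only after Lock returns
    }

    unsigned capsSeenInsideLock() const { return capsSeenInsideLock_; }

private:
    // Sets the flag for the lifetime of the guard, then restores it.
    struct FixupDisabler {
        explicit FixupDisabler(bool& flag) : flag_(flag), old_(flag) { flag_ = true; }
        ~FixupDisabler() { flag_ = old_; }
        bool& flag_;
        bool old_;
    };

    // Stands in for the real Lock, which seems to call GetSurfaceDesc internally.
    void internalLock(SurfaceDesc& desc) {
        getSurfaceDesc(desc);
        capsSeenInsideLock_ = desc.caps;
    }

    bool capsFixupDisabled_ = false;
    unsigned capsSeenInsideLock_ = 0;
};
```

The caller still receives the fake caps in the returned description, while the internal call made during Lock only ever sees the real ones.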
Apparently normal surfaces can be created without any caps, or with caps
that don't specify a surface type (e.g. only memory flags or completely zero).
Some games even seem to use such surfaces (e.g. Planescape: Torment) and
were broken by the earlier fix.
Now the surface type detection is based on a lack of certain caps instead of
the presence of them.
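The difference between the two checks can be illustrated with made-up capability bits. The flag names and values below are hypothetical and are not the DDSCAPS constants from ddraw.h. Which caps the real check excludes is also an assumption here.

```cpp
#include <cstdint>

// Hypothetical capability bits for illustration only.
constexpr std::uint32_t CAP_SYSTEM_MEMORY   = 1u << 0;
constexpr std::uint32_t CAP_VIDEO_MEMORY    = 1u << 1;
constexpr std::uint32_t CAP_OFFSCREEN_PLAIN = 1u << 2;
constexpr std::uint32_t CAP_Z_BUFFER        = 1u << 3;
constexpr std::uint32_t CAP_VERTEX_BUFFER   = 1u << 4;

// Earlier style: a surface is "normal" only if it says so explicitly.
constexpr bool isNormalByPresence(std::uint32_t caps) {
    return (caps & CAP_OFFSCREEN_PLAIN) != 0;
}

// Newer style: a surface is "normal" unless it declares a special type.
constexpr bool isNormalByAbsence(std::uint32_t caps) {
    return (caps & (CAP_Z_BUFFER | CAP_VERTEX_BUFFER)) == 0;
}
```

With these definitions, a surface created with zero caps or with only memory flags is rejected by the presence check but accepted by the absence check, which matches the case described above.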
Some D3D buffers (e.g. vertex buffers) use DirectDraw surfaces for storage.
Since a previous fix for forcing the emulated pixel format on surfaces broke
Alt-Tabbing in Messiah, this fixes it by automatically restoring D3D buffers
during Lock.
If this causes other issues, the only alternative seems to be forcing these
surfaces to system memory (which was the previous broken behavior).
Earlier the update thread was made persistent across multiple primary surfaces,
but some parts were not correctly updated. The update event handle was still being
closed on release of the primary surface, and the update thread still exited if
there was no primary surface immediately after it received an update event.
When forcing the emulated pixel format on surfaces in CreateSurface,
the original surface description parameter was being modified, leading to
issues with the mouse cursor in Populous 3. Now a copy is modified instead.
Also avoided an unnecessary palette update when the palette is being set to null.
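The CreateSurface part of this fix boils down to working on a private copy of the caller's description. A minimal sketch, with hypothetical types and a hypothetical helper name (descForCreation):

```cpp
struct PixelFormat { unsigned bitsPerPixel; };

struct SurfaceDesc {
    bool hasPixelFormat;
    PixelFormat pixelFormat;
};

// Returns the description to create the surface with; the caller's
// description is taken by const reference and is never modified.
SurfaceDesc descForCreation(const SurfaceDesc& requested,
                            const PixelFormat& emulated) {
    SurfaceDesc copy = requested;
    if (!copy.hasPixelFormat) {
        copy.hasPixelFormat = true;
        copy.pixelFormat = emulated;   // force the emulated format on the copy
    }
    return copy;
}
```

An application that inspects or reuses its own description after the call then sees exactly what it passed in.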
The internal surface pointers to the compatible primary surface were not being reset
when the surface was released, possibly leading to other surfaces being misidentified
as the compatible primary surface.
All internal primary surface pointers are now reset to null on release.
Because the actual display mode is always forced to 32 bits color depth,
surfaces that would be created without a pixel format need to be explicitly
set to the emulated display mode's pixel format that the application expects,
rather than letting them use the actual display mode's pixel format.
However, the emulated pixel format was being forced on surface types that aren't
normal surfaces and shouldn't inherit the display mode's pixel format
(e.g. vertex buffers and depth buffers), which prevented some of these surfaces
from being created correctly.
Now the surface type checks are more restrictive.
The palette converter surface was being released even when it was null,
leading to potential problems with releasing the primary surface.
The update thread is also no longer terminated on each release.
Termination may have been necessary more often than expected because
the thread may be waiting on the DirectDraw critical section on release.
There don't seem to be any issues with letting the OS clean up
the running thread on exit instead.