Fixes a few bottlenecks encountered in the Cascaded Shadow Maps demo from the Microsoft SDK. Performance in that demo is now slightly better than wined3d with CSMT, MESA_NO_ERROR, and mesa_glthread enabled.