From b3cc8723c14a48bd75f124bed67ac5a2b44e013a Mon Sep 17 00:00:00 2001
From: CamilleLaVey
Date: Fri, 24 Apr 2026 16:37:18 +0200
Subject: [PATCH] [vulkan] 2nd Vulkan Global Maintenance (#3853)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This PR is a sequel to the one merged a few days ago (#3839). It aims to improve stability, graphical accuracy, and the coherency of the Vulkan implementation across all platforms, and contains the following changes:

-> Removal of VK_EXT_unified_image_layouts:
This extension was removed as cleanup: only its activator was ever implemented, and no proper structure for using it was put in place. A proper implementation is not currently viable given the complexity of the feature and the state of the buffer cache and texture cache, which we will need to rework in the near future. When that happens, there will be a better opportunity to set layouts properly alongside a proper implementation of VK_EXT_descriptor_indexing. In practice, this feature was dead code.

-> Adjustment of VK_EXT_custom_border_color:
The implementation of this feature was handled poorly, and it got worse during the first attempts at making ExtendedDynamicState stable, when its use was gated on the EDS slider (dyna_state) being set to at least 1. This ran into a bug with RemoveUnsuitableExtension, and the gating is not a requirement for enabling the extension according to Vulkan's documentation; that was my mistake. Some time later, in the ExtendedDynamicState refactor (#3074), I tried to make the implementation more robust than Yuzu's, which banned certain vendor drivers. The new handling queried whether the extension was available and which features it supported, enabled whatever was available, and wired an adequate path for each available feature. That leads to today's change. After carefully reading how certain paths were never triggered, or mostly caused issues with how the extension should work, I made the following changes:

- Removed the forced disabling tied to the ExtendedDynamicState setting
- Resolved the bug with RemoveUnsuitableExtension + dyna_state
- Removed explanatory comments + the log_debug warning
- Set the extension to be disabled if customBorderColorWithoutFormat is not available
- Helps to solidify the removal of bans on vendor drivers

These changes fix VUID 04015 for the handling of an undefined format and bring the usage of the extension closer to what the Vulkan specification expects. There are still cases where we cannot emulate samplers properly and some translucent black boxes will still appear, but this is now alleviated by letting the extension choose the proper custom border color available in Vulkan or degrade to a fallback of solid colors.
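
The gating rule for customBorderColorWithoutFormat described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `BorderColorSupport`, `ShouldEnableCustomBorderColor` and `BorderColorUsageExample` are hypothetical names, and the two feature booleans stand in for the `customBorderColors` and `customBorderColorWithoutFormat` members of `VkPhysicalDeviceCustomBorderColorFeaturesEXT`.

```cpp
#include <cassert>

// Hypothetical mirror of what the driver reports for VK_EXT_custom_border_color.
struct BorderColorSupport {
    bool extension_present;                  // extension is listed by the device
    bool custom_border_colors;               // customBorderColors feature bit
    bool custom_border_color_without_format; // customBorderColorWithoutFormat feature bit
};

// The decision depends only on what the driver reports. It deliberately takes
// no ExtendedDynamicState (dyna_state) input, since that gating was removed.
constexpr bool ShouldEnableCustomBorderColor(const BorderColorSupport& support) {
    return support.extension_present && support.custom_border_colors &&
           support.custom_border_color_without_format;
}

// Usage: a driver that lacks the format-less variant keeps the extension
// disabled, so samplers fall back to the fixed solid border colors.
inline void BorderColorUsageExample() {
    const BorderColorSupport partial{true, true, false};
    assert(!ShouldEnableCustomBorderColor(partial));
}
```

Keeping the check free of any settings input is the point of the sketch: whether the extension is usable is a property of the driver, not of the EDS slider.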

-> Adjustment of VK_EXT/KHR_robustness2:
This feature was introduced in the ExtendedDynamicState refactor (#3074) as a safety measure for descriptors during buffer writes, providing robustness through upgraded access to images and buffers and proper discarding of null data in descriptors. However, regardless of the configuration, the logs during debug sessions never stopped reporting VUID-VkWriteDescriptorSet-descriptorType-00324 and VUID-VkWriteDescriptorSet-descriptorType-00325, the first being the most constant issue plaguing the logs. The approach was to ensure the device can select between the versions of this feature, EXT or KHR (drivers can report either or both, but if we request a version the driver does not support, it will not load the feature; the KHR version takes priority), combined with a simplified configuration that uses only nullDescriptor to properly deflect buffers and other trash data outside the bound descriptors. The path is wired both for when nullDescriptor is available and when it is not, and likewise for BindVertexBuffers2EXT, which fixes both VUIDs. These changes help save some CPU resources and memory on binding routes.

- Fixes VUID-VkWriteDescriptorSet-descriptorType-00324
- Fixes VUID-VkWriteDescriptorSet-descriptorType-00325
- Fixes VUID-vkCmdBindVertexBuffers-pBuffers-00621 (alongside a later adjustment for pStrides)

-> Adjustment VK_EXT_image_robustness:
Like other features, this was implemented during the ExtendedDynamicState refactor (#3074). The current change just ensures more drivers can access this feature by switching it from an extension to an explicit feature; some redundant code was also cleaned up within this change.
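
The EXT/KHR selection described for robustness2 above could look roughly like the sketch below. It is a simplified illustration, not the code in vulkan_device.cpp: `PickRobustness2` and `Robustness2UsageExample` are hypothetical names, and the list of supported extension names is assumed to come from the driver's extension enumeration.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string_view>
#include <vector>

constexpr std::string_view kRobustness2Khr = "VK_KHR_robustness2";
constexpr std::string_view kRobustness2Ext = "VK_EXT_robustness2";

// Returns the single extension name to request, preferring the KHR version.
// Returns std::nullopt when the driver reports neither, in which case the
// caller is expected to take the path without nullDescriptor.
inline std::optional<std::string_view> PickRobustness2(
    const std::vector<std::string_view>& supported) {
    const auto has = [&supported](std::string_view name) {
        return std::find(supported.begin(), supported.end(), name) != supported.end();
    };
    if (has(kRobustness2Khr)) {
        return kRobustness2Khr;
    }
    if (has(kRobustness2Ext)) {
        return kRobustness2Ext;
    }
    return std::nullopt;
}

// Usage: a driver that reports both versions gets the KHR one requested.
inline void Robustness2UsageExample() {
    const std::vector<std::string_view> supported{kRobustness2Ext, kRobustness2Khr};
    assert(PickRobustness2(supported) == kRobustness2Khr);
}
```

Requesting exactly one name avoids asking the driver for a version it does not expose, which is the failure mode the paragraph above describes.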

-> Restored gating flush operation on removed gpu accuracy:
An issue report from a user called CaptFaraday (https://github.com/eden-emulator/Issue-Reports/issues/425) described a behavior that appeared after the rework of GPU accuracy levels (#3129). The rework broke rendering in Paper Mario: The Thousand-Year Door, causing graphical issues such as black flashes and missing rendering in many animations throughout the game, because the flush inside FlushAndInvalidateRegion, gated by IsGPULevelExtreme, had been removed. The report suggested restoring the missing gating and flush as a possible fix, which I did, restoring the complete behavior of this functionality and wiring it to the new IsGPULevelHigh for better semantic correctness. The change was tested and did not bring back the Yoshi's Crafted World graphical problems, which were the main reason this function was deleted and which were fixed in fcfcee7247.

- Solves https://github.com/eden-emulator/Issue-Reports/issues/266
- Solves https://github.com/eden-emulator/Issue-Reports/issues/425
- Fixes Paper Mario: The Thousand-Year Door
- Keeps Yoshi's Crafted World issue still fixed

-> Adjustment VK_EXT_conditional_rendering:
We inherited multiple flaws from Yuzu's Vulkan backend, which got worse over time as games and drivers changed. Beyond that, over the time I have spent studying this source code, especially the Vulkan side of Eden, I learned to recognize some extensions that required wide and robust modifications to ensure their logic works as intended. So far, ConditionalRendering had received a lot of minimal modifications:

- Reordering the functions from "_NotifySegment_" to avoid a massive RAM leak coming from query_cache (@weakboson) (#131)
- Removing a "_NotifySegment_" function from the rasterizer so that Metroid Prime 4 stopped crashing due to serious RAM leaks in query_cache (@Maufeat) (#3142)

There were other attempts to make ConditionalRendering fully work, such as in the ExtendedDynamicState refactor (#3074), but they only patched over a horrible situation with how the extension was really working. After spending more than 3 months studying how this and the other subsequent, essential extensions touched in this PR work in Vulkan, I dove in once again to make it work properly. One of the first changes was to fix an invalid reference lookup of queries, which resolved the issue behind the removal of "NotifySegment" inside the rasterizer. I then started adjusting other parts of the ConditionalRendering implementation minimally, with heavy testing at each switch to ensure no game got broken along the way. The initial benefit of fixing the indirection in the query cache lookups was a reduction in the time the GPU spent in the constant state of queries, which proved to reduce flickering in Pokémon ZA among others, with a small increase in performance and a more noticeable gain in stability, reducing stuttering bit by bit.

These advances allowed me to fix one of the functions gated by IsGPULevelHigh, where a bypass existed to accelerate conditional rendering without properly checking whether the extension was actually supported. This freed QCOM from the flag of a fixed presync workaround, which also improved the use of the QCOM driver on 8 Elite devices and in Unreal Engine 4 - 5 games such as Dragon Ball Z - Sparking Zero. That is not the only benefit of the current attempts to make the ConditionalRendering implementation more robust and accurate to the specification: they also started to reveal key points where VK_EXT_transform_feedback was failing to work properly.

-> Adjustment VK_EXT_transform_feedback:
Like many other features in this PR, this one was also adjusted minimally in the ExtendedDynamicState refactor (#3074). With the ConditionalRendering refactor under way, the fixes for this extension started by ensuring each key function is properly gated by a getter that is only enabled if the extension has already been loaded on the virtual device. Where that was not the case, I made sure to wire the fallback correctly; it had not been in place and never had robust handling. As a result, games started to improve in graphical accuracy, for example Zelda - Echoes Of Wisdom, where the lighting and dark border now move and react dynamically. Beyond that, this change made it possible to finally get rid of the indirection in buffer synchronization, which often took a non-synchronized path to ensure a faster reply but with a higher chance of causing graphical issues in several games. It also helped ensure that the "_query_cache.CounterEnable(VideoCommon::QueryType::StreamingByteCount, false)_;" call is properly allocated and reset in a different place than where it was previously.

- Maintains the Metroid Prime 4 fixes even after restoring the lines that made the game unplayable, whether through an instant crash or a crash after some minutes of gameplay; those fixes were introduced in other work (#3142).

During the first step of the refactor of this extension, the work was reviewed by @wildcard, who pointed out a handling issue inside buffers. There was a mismatch in the tracking of feedback buffers: since we treat them as buffer_slot, counterstream was still tracking and consuming stream_buffers rather than where the data was really passing through, deriving the counter selection for counterBufferCount from stream indexes and not from slots, which could produce cases where _Stream != slot_. The solution proposed to me was to add a stream mapper function recording where the stream slots are located, and to update UpdateBuffers() to calculate buffer counts per slot rather than per stream, allowing the stream mapper to be fully mapped, along with other changes to WriteBuffers + ProduceBufferCounter to avoid any misalignment.

-> Other minimal adjustments:
Alongside these important adjustments, others were made to keep these recent changes logically coherent, such as enforcing the FullSynchronization path for buffers. The GPU mostly has syncing issues with certain types of textures and vertex calculations, and the original behavior of jumping into a non-synchronized buffer path let the GPU run without proper awareness of the textures being loaded. That gave more performance if all the textures reached the GPU properly, but provided no safety for buffers. Even though we added cases where dummies and buffer trash data get discarded by nullDescriptors, this does not guarantee that graphical artifacts or a bad calculation in the range of lighting/gfx cannot appear. I dare to think the original behavior was implemented because of the heavy costs back in Yuzu's time. This goes along with the removal of QCOM's drivers from the query presync functions.

A small adjustment was made to the mutable functions inside the CreateImageView structure to add the extended usage:
"Because Switch's GPU creates incompatible views (sRGB and UNORM) on the same image. A sRGB image can't be used as storage but it is in a UNORM view. Which is exactly the use case of these flags."
- @weakboson

During all of these changes inside queries, we started to get a "Device Loss" warning from Vulkan on some devices, along with 2 specific warnings:

- [ 102.384895] Render.Vulkan video_core/vulkan_common/vulkan_debug_callback.cpp:69:DebugUtilCallback: vkDeviceWaitIdle(): THREADING ERROR : object of type VkQueue is simultaneously used in current thread 517024609280 and thread 517864567808
- [ 114.003530] Render.Vulkan video_core/vulkan_common/vulkan_debug_callback.cpp:69:DebugUtilCallback: vkCmdBeginQuery(): VkQueryPool 0x15e400000015e4 and query 172: query not reset. After query pool creation, each query must be reset (with vkCmdResetQueryPool or vkResetQueryPool) before it is used. Queries must also be reset between uses.

Before all of these adjustments, the original GPU thread often took its time to stop for a moment and synchronize with the CPU (non-TimelineSemaphore). With the whole data flow improved, we were producing more stale data than we could really handle, because there was no reset to keep pools from filling with old data. To get rid of this, another attempt at implementing ResetQueryPool was made; it had been intended some months ago and was removed in #3270. This time, aligning vkDeviceWaitIdle + ResetQueryPool proved to be more effective than the first implementation and did not cause major issues. Now Vulkan can reset stale data, which can be catalogued as old once it has been used and displayed in a frame, keeping a more fluid exchange and discard of data.

- Fixes VUID-vkCmdBeginQuery-None-00807
- Fixes multithreading error with vkDeviceWaitIdle and data allocation

There are some other changes to the coherency of ExtendedDynamicState2 and the primitive restart feature, which now patches topologies once they are processed if they pass through with ExtendedDynamicState2 enabled, and is reset before every draw to prevent another topology VUID. I also made sure to refresh, reset, clamp, and overall improve the math inside the Viewport/Scissor operations in DynamicState and its later upgrades.

---------------------------

UPDATE (23/04/2026):
After a heavy testing phase, an issue was found with AMD drivers on Windows. Based on commit c07dfa6fb4, Super Mario Odyssey started to show vertex glitches on the waterfall, with water fog rendered incorrectly; disabling VertexInputDynamicState caused a black screen with ExtendedDynamicState (1 - 3) and a hard crash if ExtendedDynamicState was disabled. The cause was the vertex input dynamic state being tied to ExtendedDynamicState1: the AMD driver did not allow fast access to BindVertexBuffers2EXT without binding strides first, which caused a syncing problem between the bound vertex and the missing buffer in the same chain. This was fixed by removing the conditions for vertex input dynamic state.

Another pair of issues was addressed while refining this PR. One of them was fixing the failing BGR565 formats by swizzling into RGBA5, which allows swapping red and blue; this solves the inverted, blue-tinted icons in Mario Kart 8 Deluxe on older QCOM drivers and SoCs, such as Snapdragon 855 - 870, and will also help some Exynos processors render properly. This solution was turned into a toggle/hack because its use is very conditional on older hardware; newer SoCs such as 8 Elite no longer require this handling to convert BGR565 textures properly, even if support for the format is not available.
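
To make the red/blue swap mentioned for BGR565 concrete, here is a minimal CPU-side sketch of what swapping the two 5-bit channels of a 16-bit 5:6:5 pixel means. It is an assumption-laden illustration only: the PR describes a swizzle applied on the Vulkan side, while `SwapRedBlue565` and `SwapUsageExample` are hypothetical helpers that are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Swaps the two 5-bit channels stored in the top and bottom of a 16-bit
// 5:6:5 pixel and leaves the 6-bit green channel in the middle untouched.
constexpr std::uint16_t SwapRedBlue565(std::uint16_t pixel) {
    const std::uint16_t high = (pixel >> 11) & 0x1F; // bits 15..11
    const std::uint16_t green = (pixel >> 5) & 0x3F; // bits 10..5
    const std::uint16_t low = pixel & 0x1F;          // bits 4..0
    return static_cast<std::uint16_t>((low << 11) | (green << 5) | high);
}

// Usage: a pixel with only the top channel set ends up with only the bottom
// channel set, which is the difference between a red icon and a blue one.
inline void SwapUsageExample() {
    assert(SwapRedBlue565(0xF800) == 0x001F);
}
```

Applying the swap twice returns the original pixel, which is why a format mismatch shows up as a clean red/blue inversion rather than as noise.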

---------------------------

Here is a small preview of what this PR has fixed so far, though our testing range may be more limited than what it can actually do:

- Allows new effects to be displayed in games

1. Jump Force: new particles on stages and main menu.
2. The Legend Of Zelda - Echoes of Wisdom: darkness post-processing effect on screen filters, such as the game intro and smoke on houses (8 Elite).
3. Reduced texture flickering in games such as EoW and Monster Hunter Rise.
4. Improved performance stability in various games on Android.
5. Improved rendering of Xenoblade games with QCOM stock drivers by improving viewport handling (8 Elite).
6. Fixed vertex explosion in Xenoblade 3 on AMD GPUs with extended dynamic state enabled.
7. Fixed Mario Kart 8 Deluxe rendering with VK_EXT_vertex_input_dynamic_state enabled.
8. Fixed certain camera angles in Pokémon Legends Z-A looking monochrome with vertex input dynamic state.
9. Fixed graphical issue with VK_EXT_vertex_input_dynamic_state on mobile drivers.
10. Fixed vertex explosion with Turnip (8 Elite) in The Legend Of Zelda - Breath Of The Wild during the loading screen.
11. Fixed vertex issue in Pokémon Legends Z-A with VK_EXT_vertex_input_dynamic_state enabled.
12. Improved rendering and stability of Immortals Fenyx Rising, including QCOM drivers being able to reach gameplay.
13. Fixed Paper Mario: The Thousand-Year Door missing rendering in animations throughout the whole game.
14. Fixed Mario Kart 8 Deluxe blue-tinted icons on Snapdragon 855 - 870 (by enabling the Emulated BGR565 hack toggle).
15. Fixed Naruto Ultimate Ninja Storm rendering issue where characters like Naruto appear blue on older QCOM SoCs and Exynos (by enabling the Emulated BGR565 hack toggle).
16. Fixed Danganronpa V3: Killing Harmony rendering issue where characters have a blue tint on older QCOM SoCs.

---------------------------

_**Special Thanks - Credits**_

-> @Gidoly for keeping track of the intensive testing phase this PR required and for the will to keep helping in development; you're a good friend and very helpful.
-> @CaptFaraday for suggesting the fix for Paper Mario.
-> @wildcard for the review during the refactor of VK_EXT_transform_feedback; without this comment I would probably have run into many untrackable issues.
-> @weakboson for suggesting the solution for sRGB and UNORM in the incompatible views.

Co-authored-by: lizzie
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3853
Reviewed-by: crueter
---
 .../features/settings/model/BooleanSetting.kt | 1 +
 .../settings/model/view/SettingsItem.kt | 7 +
 .../settings/ui/SettingsFragmentPresenter.kt | 1 +
 .../app/src/main/res/values/strings.xml | 6 +-
 src/common/settings.h | 5 +-
 .../backend/spirv/spirv_emit_context.cpp | 3 +
 src/shader_recompiler/runtime_info.h | 1 +
 src/video_core/buffer_cache/buffer_cache.h | 33 ++--
 src/video_core/gpu_thread.cpp | 13 +-
 .../resolve_conditional_render.comp | 21 ++-
 src/video_core/macro.cpp | 4 +-
 src/video_core/query_cache/query_cache.h | 1 +
 src/video_core/query_cache/query_stream.h | 5 +-
 .../renderer_opengl/gl_rasterizer.cpp | 3 +
 .../renderer_vulkan/fixed_pipeline_state.cpp | 30 ++--
 .../renderer_vulkan/fixed_pipeline_state.h | 8 +-
 .../renderer_vulkan/renderer_vulkan.cpp | 7 +-
 .../renderer_vulkan/vk_blit_screen.cpp | 6 +-
 .../renderer_vulkan/vk_buffer_cache.cpp | 8 +-
 .../renderer_vulkan/vk_compute_pass.cpp | 18 +-
 .../renderer_vulkan/vk_graphics_pipeline.cpp | 4 +
 .../renderer_vulkan/vk_pipeline_cache.cpp | 6 +
 .../renderer_vulkan/vk_present_manager.cpp | 2 +-
 .../renderer_vulkan/vk_query_cache.cpp | 155 +++++++++++++++---
 .../renderer_vulkan/vk_query_cache.h | 3 +-
 .../renderer_vulkan/vk_rasterizer.cpp | 137 ++++++++++++----
 .../renderer_vulkan/vk_rasterizer.h | 3 +-
 .../renderer_vulkan/vk_scheduler.cpp | 2 +
src/video_core/renderer_vulkan/vk_scheduler.h | 5 + .../renderer_vulkan/vk_state_tracker.cpp | 3 +- .../renderer_vulkan/vk_state_tracker.h | 2 +- .../renderer_vulkan/vk_texture_cache.cpp | 50 +++--- src/video_core/transform_feedback.cpp | 4 +- .../vulkan_common/vulkan_device.cpp | 57 ++++--- src/video_core/vulkan_common/vulkan_device.h | 53 ++---- .../vulkan_common/vulkan_wrapper.cpp | 1 + src/video_core/vulkan_common/vulkan_wrapper.h | 5 + 37 files changed, 467 insertions(+), 206 deletions(-) diff --git a/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/BooleanSetting.kt b/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/BooleanSetting.kt index e47263bfb2..e482725196 100644 --- a/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/BooleanSetting.kt +++ b/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/BooleanSetting.kt @@ -17,6 +17,7 @@ enum class BooleanSetting(override val key: String) : AbstractBooleanSetting { USE_CUSTOM_CPU_TICKS("use_custom_cpu_ticks"), SKIP_CPU_INNER_INVALIDATION("skip_cpu_inner_invalidation"), FIX_BLOOM_EFFECTS("fix_bloom_effects"), + EMULATE_BGR565("emulate_bgr565"), CPUOPT_UNSAFE_HOST_MMU("cpuopt_unsafe_host_mmu"), USE_DOCKED_MODE("use_docked_mode"), USE_AUTO_STUB("use_auto_stub"), diff --git a/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/view/SettingsItem.kt b/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/view/SettingsItem.kt index 8ca9533f83..daeee398d4 100644 --- a/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/view/SettingsItem.kt +++ b/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/model/view/SettingsItem.kt @@ -756,6 +756,13 @@ abstract class SettingsItem( descriptionId = R.string.fix_bloom_effects_description ) ) + put( + SwitchSetting( + BooleanSetting.EMULATE_BGR565, + titleId = R.string.emulate_bgr565, + descriptionId = 
R.string.emulate_bgr565_description + ) + ) put( SwitchSetting( BooleanSetting.CPUOPT_UNSAFE_HOST_MMU, diff --git a/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/ui/SettingsFragmentPresenter.kt b/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/ui/SettingsFragmentPresenter.kt index 542215fa97..0487339f68 100644 --- a/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/ui/SettingsFragmentPresenter.kt +++ b/src/android/app/src/main/java/org/yuzu/yuzu_emu/features/settings/ui/SettingsFragmentPresenter.kt @@ -293,6 +293,7 @@ class SettingsFragmentPresenter( add(IntSetting.FAST_GPU_TIME.key) add(BooleanSetting.SKIP_CPU_INNER_INVALIDATION.key) add(BooleanSetting.FIX_BLOOM_EFFECTS.key) + add(BooleanSetting.EMULATE_BGR565.key) add(BooleanSetting.RENDERER_ASYNCHRONOUS_SHADERS.key) add(BooleanSetting.RENDERER_ASYNCHRONOUS_GPU_EMULATION.key) add(BooleanSetting.RENDERER_ASYNC_PRESENTATION.key) diff --git a/src/android/app/src/main/res/values/strings.xml b/src/android/app/src/main/res/values/strings.xml index d42fb37d58..02860364a9 100644 --- a/src/android/app/src/main/res/values/strings.xml +++ b/src/android/app/src/main/res/values/strings.xml @@ -501,7 +501,7 @@ Enable buffer history Enables access to previous buffer states. This option may improve rendering quality and performance consistency in some games. Optimized Vertex Buffers - Enables optimized vertex buffer binding for improved performance. Requires Mesa 26.0+ Turnip drivers. Will crash on older drivers. + Enables optimized vertex buffer binding for improved performance. Requires Mesa 26.0+ Turnip drivers/ QCOM drivers. Will crash on older Turnip drivers. Hacks @@ -510,7 +510,9 @@ Skip CPU Inner Invalidation Skips certain CPU-side cache invalidations during memory updates, reducing CPU usage and improving it\'s performance. This may cause glitches or crashes on some games. Fix Bloom Effects - Reduces bloom blur in LA/EOW (Adreno 700), removes bloom in Burnout. 
Warning: may cause graphical artifacts in other games. + Reduces bloom blur in LA/EOW (Adreno A6XX - A7XX/ Turnip), removes bloom in Burnout. Warning: may cause graphical artifacts in other games. + Emulate BGR565 + Fixes problems with inverted colors in games or strange artifacts or strange shadows. Use asynchronous shaders Compiles shaders asynchronously. This may reduce stutters but may also introduce glitches. GPU Unswizzle Settings diff --git a/src/common/settings.h b/src/common/settings.h index ca61ac906e..13ccf5a1d5 100644 --- a/src/common/settings.h +++ b/src/common/settings.h @@ -555,6 +555,9 @@ struct Values { SwitchableSetting fix_bloom_effects{linkage, false, "fix_bloom_effects", Category::RendererHacks}; + SwitchableSetting emulate_bgr565{linkage, false, "emulate_bgr565", + Category::RendererHacks}; + SwitchableSetting rescale_hack{linkage, false, "rescale_hack", Category::RendererHacks}; @@ -584,7 +587,7 @@ struct Values { SwitchableSetting dyna_state{linkage, #if defined(ANDROID) - ExtendedDynamicState::EDS1, + ExtendedDynamicState::Disabled, #elif defined(__APPLE__) ExtendedDynamicState::Disabled, #else diff --git a/src/shader_recompiler/backend/spirv/spirv_emit_context.cpp b/src/shader_recompiler/backend/spirv/spirv_emit_context.cpp index b9a24496c9..169e83d9fd 100644 --- a/src/shader_recompiler/backend/spirv/spirv_emit_context.cpp +++ b/src/shader_recompiler/backend/spirv/spirv_emit_context.cpp @@ -178,6 +178,9 @@ void DefineGenericOutput(EmitContext& ctx, size_t index, std::optional invo ctx.Decorate(id, spv::Decoration::XfbBuffer, xfb_varying->buffer); ctx.Decorate(id, spv::Decoration::XfbStride, xfb_varying->stride); ctx.Decorate(id, spv::Decoration::Offset, xfb_varying->offset); + if (ctx.stage == Stage::Geometry && xfb_varying->stream != 0) { + ctx.Decorate(id, spv::Decoration::Stream, xfb_varying->stream); + } } if (num_components < 4 || element > 0) { const std::string_view subswizzle{swizzle.substr(element, num_components)}; diff --git 
a/src/shader_recompiler/runtime_info.h b/src/shader_recompiler/runtime_info.h index be10a9bb08..e6e1284762 100644 --- a/src/shader_recompiler/runtime_info.h +++ b/src/shader_recompiler/runtime_info.h @@ -76,6 +76,7 @@ enum class TessSpacing { struct TransformFeedbackVarying { u32 buffer{}; + u32 stream{}; u32 stride{}; u32 offset{}; u32 components{}; diff --git a/src/video_core/buffer_cache/buffer_cache.h b/src/video_core/buffer_cache/buffer_cache.h index 014b4a318e..c857e90e02 100644 --- a/src/video_core/buffer_cache/buffer_cache.h +++ b/src/video_core/buffer_cache/buffer_cache.h @@ -1067,26 +1067,29 @@ void BufferCache

::BindHostTransformFeedbackBuffers() { HostBindings host_bindings; for (u32 index = 0; index < NUM_TRANSFORM_FEEDBACK_BUFFERS; ++index) { const Binding& binding = channel_state->transform_feedback_buffers[index]; - if (maxwell3d->regs.transform_feedback.controls[index].varying_count == 0 && - maxwell3d->regs.transform_feedback.controls[index].stride == 0) { - break; + const auto& control = maxwell3d->regs.transform_feedback.controls[index]; + const bool has_layout = control.varying_count != 0 || control.stride != 0; + + Buffer* host_buffer = &slot_buffers[NULL_BUFFER_ID]; + u32 offset = 0; + u32 size = 0; + + if (has_layout && binding.buffer_id != NULL_BUFFER_ID && binding.size != 0) { + Buffer& buffer = slot_buffers[binding.buffer_id]; + TouchBuffer(buffer, binding.buffer_id); + size = binding.size; + SynchronizeBuffer(buffer, binding.device_addr, size); + MarkWrittenBuffer(binding.buffer_id, binding.device_addr, size); + offset = buffer.Offset(binding.device_addr); + buffer.MarkUsage(offset, size); + host_buffer = &buffer; } - Buffer& buffer = slot_buffers[binding.buffer_id]; - TouchBuffer(buffer, binding.buffer_id); - const u32 size = binding.size; - SynchronizeBuffer(buffer, binding.device_addr, size); - MarkWrittenBuffer(binding.buffer_id, binding.device_addr, size); - - const u32 offset = buffer.Offset(binding.device_addr); - buffer.MarkUsage(offset, size); - host_bindings.buffers.push_back(&buffer); + host_bindings.buffers.push_back(host_buffer); host_bindings.offsets.push_back(offset); host_bindings.sizes.push_back(size); } - if (host_bindings.buffers.size() > 0) { - runtime.BindTransformFeedbackBuffers(host_bindings); - } + runtime.BindTransformFeedbackBuffers(host_bindings); } template diff --git a/src/video_core/gpu_thread.cpp b/src/video_core/gpu_thread.cpp index 8d8d857a02..63a2516399 100644 --- a/src/video_core/gpu_thread.cpp +++ b/src/video_core/gpu_thread.cpp @@ -1,4 +1,4 @@ -// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project +// 
SPDX-FileCopyrightText: Copyright 2026 Eden Emulator Project // SPDX-License-Identifier: GPL-3.0-or-later // SPDX-FileCopyrightText: Copyright 2019 yuzu Emulator Project @@ -92,7 +92,16 @@ void ThreadManager::InvalidateRegion(DAddr addr, u64 size) { } void ThreadManager::FlushAndInvalidateRegion(DAddr addr, u64 size) { - // Skip flush on asynch mode, as FlushAndInvalidateRegion is not used for anything too important + if (Settings::IsGPULevelHigh()) { + if (!is_async) { + PushCommand(FlushRegionCommand(addr, size)); + } else { + auto& gpu = system.GPU(); + const u64 fence = gpu.RequestFlush(addr, size); + TickGPU(); + gpu.WaitForSyncOperation(fence); + } + } rasterizer->OnCacheInvalidation(addr, size); } diff --git a/src/video_core/host_shaders/resolve_conditional_render.comp b/src/video_core/host_shaders/resolve_conditional_render.comp index 307e77d1ad..3bc92f94fa 100644 --- a/src/video_core/host_shaders/resolve_conditional_render.comp +++ b/src/video_core/host_shaders/resolve_conditional_render.comp @@ -1,3 +1,6 @@ +// SPDX-FileCopyrightText: Copyright 2026 Eden Emulator Project +// SPDX-License-Identifier: GPL-3.0-or-later + // SPDX-FileCopyrightText: Copyright 2023 yuzu Emulator Project // SPDX-License-Identifier: GPL-3.0-or-later @@ -5,16 +8,22 @@ layout(local_size_x = 1) in; -layout(std430, binding = 0) buffer Query { - uvec2 initial; - uvec2 unknown; - uvec2 current; +layout(std430, binding = 0) readonly buffer Query { + uint data[]; }; -layout(std430, binding = 1) buffer Result { +layout(std430, binding = 1) writeonly buffer Result { uint result; }; +layout(push_constant) uniform PushConstants { + uint compare_to_zero; +}; + void main() { - result = all(equal(initial, current)) ? 1 : 0; + if (compare_to_zero != 0u) { + result = (data[0] != 0u && data[1] != 0u) ? 1u : 0u; + } else { + result = (data[0] == data[4] && data[1] == data[5]) ? 
1u : 0u; + } } diff --git a/src/video_core/macro.cpp b/src/video_core/macro.cpp index 66cea5afbd..2cda78c459 100644 --- a/src/video_core/macro.cpp +++ b/src/video_core/macro.cpp @@ -285,11 +285,11 @@ void HLE_MultiDrawIndexedIndirectCount::Fallback(Engines::Maxwell3D& maxwell3d, } void HLE_DrawIndirectByteCount::Execute(Engines::Maxwell3D& maxwell3d, std::span parameters, [[maybe_unused]] u32 method) { const bool force = maxwell3d.Rasterizer().HasDrawTransformFeedback(); - auto topology = Maxwell3D::Regs::PrimitiveTopology(parameters[0] & 0xFFFFU); - if (!force && (!maxwell3d.AnyParametersDirty() || !IsTopologySafe(topology))) { + if (!force) { Fallback(maxwell3d, parameters); return; } + auto topology = Maxwell3D::Regs::PrimitiveTopology(parameters[0] & 0xFFFFU); auto& params = maxwell3d.draw_manager->GetIndirectParams(); params.is_byte_count = true; params.is_indexed = false; diff --git a/src/video_core/query_cache/query_cache.h b/src/video_core/query_cache/query_cache.h index 4ed42487aa..6bed91a53e 100644 --- a/src/video_core/query_cache/query_cache.h +++ b/src/video_core/query_cache/query_cache.h @@ -412,6 +412,7 @@ bool QueryCacheBase::AccelerateHostConditionalRendering() { .found_query = nullptr, }; } + it_current = it_current_2; } auto* query = impl->ObtainQuery(it_current->second); qc_dirty |= True(query->flags & QueryFlagBits::IsHostManaged) && diff --git a/src/video_core/query_cache/query_stream.h b/src/video_core/query_cache/query_stream.h index 1d11b12752..b4dc9d1815 100644 --- a/src/video_core/query_cache/query_stream.h +++ b/src/video_core/query_cache/query_stream.h @@ -1,3 +1,6 @@ +// SPDX-FileCopyrightText: Copyright 2026 Eden Emulator Project +// SPDX-License-Identifier: GPL-3.0-or-later + // SPDX-FileCopyrightText: Copyright 2023 yuzu Emulator Project // SPDX-License-Identifier: GPL-3.0-or-later @@ -75,7 +78,7 @@ public: } u64 GetDependentMask() const { - return dependence_mask; + return dependent_mask; } u64 GetAmendValue() const { diff --git 
a/src/video_core/renderer_opengl/gl_rasterizer.cpp b/src/video_core/renderer_opengl/gl_rasterizer.cpp index e268c4d2c6..26826a8f78 100644 --- a/src/video_core/renderer_opengl/gl_rasterizer.cpp +++ b/src/video_core/renderer_opengl/gl_rasterizer.cpp @@ -629,6 +629,9 @@ void RasterizerOpenGL::ReleaseFences(bool force) { void RasterizerOpenGL::FlushAndInvalidateRegion(DAddr addr, u64 size, VideoCommon::CacheType which) { + if (Settings::IsGPULevelHigh()) { + FlushRegion(addr, size, which); + } InvalidateRegion(addr, size, which); } diff --git a/src/video_core/renderer_vulkan/fixed_pipeline_state.cpp b/src/video_core/renderer_vulkan/fixed_pipeline_state.cpp index 866b721a84..d53cf70b37 100644 --- a/src/video_core/renderer_vulkan/fixed_pipeline_state.cpp +++ b/src/video_core/renderer_vulkan/fixed_pipeline_state.cpp @@ -190,9 +190,7 @@ void FixedPipelineState::Refresh(Tegra::Engines::Maxwell3D& maxwell3d, DynamicFe } } } - if (!extended_dynamic_state_3_enables) { - dynamic_state.Refresh3(regs); - } + dynamic_state.Refresh3(regs, features); if (xfb_enabled) { RefreshXfbState(xfb_state, regs); } @@ -295,16 +293,22 @@ void FixedPipelineState::DynamicState::Refresh2(const Maxwell& regs, depth_bias_enable.Assign(enabled_lut[POLYGON_OFFSET_ENABLE_LUT[topology_index]] != 0 ? 1 : 0); } -void FixedPipelineState::DynamicState::Refresh3(const Maxwell& regs) { - logic_op_enable.Assign(regs.logic_op.enable != 0 ? 
1 : 0); - depth_clamp_disabled.Assign(regs.viewport_clip_control.geometry_clip == - Maxwell::ViewportClipControl::GeometryClip::Passthrough || - regs.viewport_clip_control.geometry_clip == - Maxwell::ViewportClipControl::GeometryClip::FrustumXYZ || - regs.viewport_clip_control.geometry_clip == - Maxwell::ViewportClipControl::GeometryClip::FrustumZ); - - line_stipple_enable.Assign(regs.line_stipple_enable); +void FixedPipelineState::DynamicState::Refresh3(const Maxwell& regs, + const DynamicFeatures& features) { + if (!features.has_dynamic_state3_logic_op_enable) { + logic_op_enable.Assign(regs.logic_op.enable != 0 ? 1 : 0); + } + if (!features.has_dynamic_state3_depth_clamp_enable) { + depth_clamp_disabled.Assign(regs.viewport_clip_control.geometry_clip == + Maxwell::ViewportClipControl::GeometryClip::Passthrough || + regs.viewport_clip_control.geometry_clip == + Maxwell::ViewportClipControl::GeometryClip::FrustumXYZ || + regs.viewport_clip_control.geometry_clip == + Maxwell::ViewportClipControl::GeometryClip::FrustumZ); + } + if (!features.has_dynamic_state3_line_stipple_enable) { + line_stipple_enable.Assign(regs.line_stipple_enable); + } } size_t FixedPipelineState::Hash() const noexcept { diff --git a/src/video_core/renderer_vulkan/fixed_pipeline_state.h b/src/video_core/renderer_vulkan/fixed_pipeline_state.h index ffc91e9a55..44157d686d 100644 --- a/src/video_core/renderer_vulkan/fixed_pipeline_state.h +++ b/src/video_core/renderer_vulkan/fixed_pipeline_state.h @@ -27,6 +27,9 @@ struct DynamicFeatures { bool has_extended_dynamic_state_2_patch_control_points; bool has_extended_dynamic_state_3_blend; bool has_extended_dynamic_state_3_enables; + bool has_dynamic_state3_depth_clamp_enable; + bool has_dynamic_state3_logic_op_enable; + bool has_dynamic_state3_line_stipple_enable; bool has_dynamic_vertex_input; bool has_provoking_vertex; bool has_provoking_vertex_first_mode; @@ -175,7 +178,7 @@ struct FixedPipelineState { void Refresh(const Maxwell& regs); void 
Refresh2(const Maxwell& regs, Maxwell::PrimitiveTopology topology, bool base_features_supported); - void Refresh3(const Maxwell& regs); + void Refresh3(const Maxwell& regs, const DynamicFeatures& features); Maxwell::ComparisonOp DepthTestFunc() const noexcept { return UnpackComparisonOp(depth_test_func); @@ -265,8 +268,7 @@ struct FixedPipelineState { return sizeof(*this); } if (dynamic_vertex_input && extended_dynamic_state_3_blend) { - // Exclude dynamic state and attributes - return offsetof(FixedPipelineState, dynamic_state); + return offsetof(FixedPipelineState, attachments); } if (dynamic_vertex_input) { // Exclude dynamic state diff --git a/src/video_core/renderer_vulkan/renderer_vulkan.cpp b/src/video_core/renderer_vulkan/renderer_vulkan.cpp index 010cfd225d..2deec13ace 100644 --- a/src/video_core/renderer_vulkan/renderer_vulkan.cpp +++ b/src/video_core/renderer_vulkan/renderer_vulkan.cpp @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -171,7 +172,11 @@ try RendererVulkan::~RendererVulkan() { scheduler.RegisterOnSubmit([] {}); - void(device.GetLogical().WaitIdle()); + scheduler.Finish(); + { + std::scoped_lock lock{scheduler.submit_mutex}; + void(device.GetLogical().WaitIdle()); + } } void RendererVulkan::Composite(std::span framebuffers) { diff --git a/src/video_core/renderer_vulkan/vk_blit_screen.cpp b/src/video_core/renderer_vulkan/vk_blit_screen.cpp index bb7eb9bdaa..75a8c3bf91 100644 --- a/src/video_core/renderer_vulkan/vk_blit_screen.cpp +++ b/src/video_core/renderer_vulkan/vk_blit_screen.cpp @@ -8,6 +8,7 @@ // SPDX-License-Identifier: GPL-2.0-or-later #include +#include #include "video_core/framebuffer_config.h" #include "video_core/present.h" #include "video_core/renderer_vulkan/present/filters.h" @@ -31,7 +32,10 @@ BlitScreen::~BlitScreen() = default; void BlitScreen::WaitIdle() { present_manager.WaitPresent(); scheduler.Finish(); - device.GetLogical().WaitIdle(); + { + std::scoped_lock 
lock{scheduler.submit_mutex}; + device.GetLogical().WaitIdle(); + } } void BlitScreen::SetWindowAdaptPass() { diff --git a/src/video_core/renderer_vulkan/vk_buffer_cache.cpp b/src/video_core/renderer_vulkan/vk_buffer_cache.cpp index 74f06427dd..a359502046 100644 --- a/src/video_core/renderer_vulkan/vk_buffer_cache.cpp +++ b/src/video_core/renderer_vulkan/vk_buffer_cache.cpp @@ -637,12 +637,10 @@ void BufferCacheRuntime::BindTransformFeedbackBuffers(VideoCommon::HostBindings< for (u32 i = 0; i < bindings.buffers.size(); ++i) { auto handle = bindings.buffers[i]->Handle(); if (handle == VK_NULL_HANDLE) { + ReserveNullBuffer(); + handle = *null_buffer; bindings.offsets[i] = 0; - bindings.sizes[i] = VK_WHOLE_SIZE; - if (!device.HasNullDescriptor()) { - ReserveNullBuffer(); - handle = *null_buffer; - } + bindings.sizes[i] = 0; } buffer_handles[i] = handle; } diff --git a/src/video_core/renderer_vulkan/vk_compute_pass.cpp b/src/video_core/renderer_vulkan/vk_compute_pass.cpp index d45a57f7bb..f198b65d69 100644 --- a/src/video_core/renderer_vulkan/vk_compute_pass.cpp +++ b/src/video_core/renderer_vulkan/vk_compute_pass.cpp @@ -228,6 +228,10 @@ struct QueriesPrefixScanPushConstants { u32 accumulation_limit; u32 buffer_offset; }; + +struct ConditionalRenderingResolvePushConstants { + u32 compare_to_zero; +}; } // Anonymous namespace ComputePass::ComputePass(const Device& device_, DescriptorPool& descriptor_pool, @@ -413,7 +417,8 @@ ConditionalRenderingResolvePass::ConditionalRenderingResolvePass( const Device& device_, Scheduler& scheduler_, DescriptorPool& descriptor_pool_, ComputePassDescriptorQueue& compute_pass_descriptor_queue_) : ComputePass(device_, descriptor_pool_, INPUT_OUTPUT_DESCRIPTOR_SET_BINDINGS, - INPUT_OUTPUT_DESCRIPTOR_UPDATE_TEMPLATE, INPUT_OUTPUT_BANK_INFO, nullptr, + INPUT_OUTPUT_DESCRIPTOR_UPDATE_TEMPLATE, INPUT_OUTPUT_BANK_INFO, + COMPUTE_PUSH_CONSTANT_RANGE, RESOLVE_CONDITIONAL_RENDER_COMP_SPV), scheduler{scheduler_}, 
compute_pass_descriptor_queue{compute_pass_descriptor_queue_} {} @@ -430,7 +435,7 @@ void ConditionalRenderingResolvePass::Resolve(VkBuffer dst_buffer, VkBuffer src_ const void* const descriptor_data{compute_pass_descriptor_queue.UpdateData()}; scheduler.RequestOutsideRenderPassOperationContext(); - scheduler.Record([this, descriptor_data](vk::CommandBuffer cmdbuf) { + scheduler.Record([this, descriptor_data, compare_to_zero](vk::CommandBuffer cmdbuf) { static constexpr VkMemoryBarrier read_barrier{ .sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER, .pNext = nullptr, @@ -443,6 +448,9 @@ void ConditionalRenderingResolvePass::Resolve(VkBuffer dst_buffer, VkBuffer src_ .srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT, .dstAccessMask = VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT_EXT, }; + const ConditionalRenderingResolvePushConstants uniforms{ + .compare_to_zero = compare_to_zero ? 1U : 0U, + }; const VkDescriptorSet set = descriptor_allocator.Commit(); device.GetLogical().UpdateDescriptorSet(set, *descriptor_template, descriptor_data); @@ -450,9 +458,11 @@ void ConditionalRenderingResolvePass::Resolve(VkBuffer dst_buffer, VkBuffer src_ VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, 0, read_barrier); cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_COMPUTE, *pipeline); cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_COMPUTE, *layout, 0, set, {}); + cmdbuf.PushConstants(*layout, VK_SHADER_STAGE_COMPUTE_BIT, uniforms); cmdbuf.Dispatch(1, 1, 1); cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, - VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, 0, write_barrier); + VK_PIPELINE_STAGE_CONDITIONAL_RENDERING_BIT_EXT, 0, + write_barrier); }); } @@ -520,7 +530,7 @@ void QueriesPrefixScanPass::Run(VkBuffer accumulation_buffer, VkBuffer dst_buffe const VkDescriptorSet set = descriptor_allocator.Commit(); device.GetLogical().UpdateDescriptorSet(set, *descriptor_template, descriptor_data); - cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, + cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_TRANSFER_BIT, 
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, 0, read_barrier); cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_COMPUTE, *pipeline); cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_COMPUTE, *layout, 0, set, {}); diff --git a/src/video_core/renderer_vulkan/vk_graphics_pipeline.cpp b/src/video_core/renderer_vulkan/vk_graphics_pipeline.cpp index 8f706a02c8..43fbefe425 100644 --- a/src/video_core/renderer_vulkan/vk_graphics_pipeline.cpp +++ b/src/video_core/renderer_vulkan/vk_graphics_pipeline.cpp @@ -467,6 +467,10 @@ bool GraphicsPipeline::ConfigureImpl(bool is_indexed) { bind_stage_info(4); } + if (regs.transform_feedback_enabled != 0) { + scheduler.RequestOutsideRenderPassOperationContext(); + } + buffer_cache.UpdateGraphicsBuffers(is_indexed); buffer_cache.BindHostGeometryBuffers(is_indexed); diff --git a/src/video_core/renderer_vulkan/vk_pipeline_cache.cpp b/src/video_core/renderer_vulkan/vk_pipeline_cache.cpp index 5f86506961..7b5ac7fac0 100644 --- a/src/video_core/renderer_vulkan/vk_pipeline_cache.cpp +++ b/src/video_core/renderer_vulkan/vk_pipeline_cache.cpp @@ -485,6 +485,12 @@ PipelineCache::PipelineCache(Tegra::MaxwellDeviceMemoryManager& device_memory_, device.IsExtExtendedDynamicState3BlendingSupported(); dynamic_features.has_extended_dynamic_state_3_enables = device.IsExtExtendedDynamicState3EnablesSupported(); + dynamic_features.has_dynamic_state3_depth_clamp_enable = + device.SupportsDynamicState3DepthClampEnable(); + dynamic_features.has_dynamic_state3_logic_op_enable = + device.SupportsDynamicState3LogicOpEnable(); + dynamic_features.has_dynamic_state3_line_stipple_enable = + device.SupportsDynamicState3LineStippleEnable(); // VIDS: Independent toggle (not affected by dyna_state levels) dynamic_features.has_dynamic_vertex_input = diff --git a/src/video_core/renderer_vulkan/vk_present_manager.cpp b/src/video_core/renderer_vulkan/vk_present_manager.cpp index 80853362ad..de854554c7 100644 --- a/src/video_core/renderer_vulkan/vk_present_manager.cpp +++ 
b/src/video_core/renderer_vulkan/vk_present_manager.cpp @@ -189,7 +189,7 @@ void PresentManager::RecreateFrame(Frame* frame, u32 width, u32 height, VkFormat frame->image = memory_allocator.CreateImage({ .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, .pNext = nullptr, - .flags = VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT, + .flags = VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT | VK_IMAGE_CREATE_EXTENDED_USAGE_BIT, .imageType = VK_IMAGE_TYPE_2D, .format = swapchain.GetImageFormat(), .extent = diff --git a/src/video_core/renderer_vulkan/vk_query_cache.cpp b/src/video_core/renderer_vulkan/vk_query_cache.cpp index 8518d89eee..6aea5d18a8 100644 --- a/src/video_core/renderer_vulkan/vk_query_cache.cpp +++ b/src/video_core/renderer_vulkan/vk_query_cache.cpp @@ -157,8 +157,9 @@ public: ReserveHostQuery(); scheduler.Record([query_pool = current_query_pool, - query_index = current_bank_slot](vk::CommandBuffer cmdbuf) { + query_index = current_bank_slot](vk::CommandBuffer cmdbuf) { const bool use_precise = Settings::IsGPULevelHigh(); + cmdbuf.ResetQueryPool(query_pool, static_cast(query_index), 1); cmdbuf.BeginQuery(query_pool, static_cast(query_index), use_precise ? 
VK_QUERY_CONTROL_PRECISE_BIT : 0); }); @@ -220,8 +221,7 @@ public: } PauseCounter(); const auto driver_id = device.GetDriverID(); - if (driver_id == VK_DRIVER_ID_QUALCOMM_PROPRIETARY || - driver_id == VK_DRIVER_ID_ARM_PROPRIETARY || driver_id == VK_DRIVER_ID_MESA_TURNIP) { + if (driver_id == VK_DRIVER_ID_ARM_PROPRIETARY || driver_id == VK_DRIVER_ID_MESA_TURNIP) { pending_sync.clear(); sync_values_stash.clear(); return; @@ -666,13 +666,18 @@ public: offsets.fill(0); last_queries.fill(0); last_queries_stride.fill(1); + stream_to_slot.fill(INVALID_SLOT); + VkBufferUsageFlags counter_buffer_usage = + VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT; + if (device.IsExtTransformFeedbackSupported()) { + counter_buffer_usage |= VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_COUNTER_BUFFER_BIT_EXT; + } const VkBufferCreateInfo buffer_ci = { .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO, .pNext = nullptr, .flags = 0, .size = TFBQueryBank::QUERY_SIZE * NUM_STREAMS, - .usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT | - VK_BUFFER_USAGE_TRANSFORM_FEEDBACK_COUNTER_BUFFER_BIT_EXT, + .usage = counter_buffer_usage, .sharingMode = VK_SHARING_MODE_EXCLUSIVE, .queueFamilyIndexCount = 0, .pQueueFamilyIndices = nullptr, @@ -692,6 +697,9 @@ public: ~TFBCounterStreamer() = default; void StartCounter() override { + if (!device.IsExtTransformFeedbackSupported()) { + return; + } FlushBeginTFB(); has_started = true; } @@ -706,7 +714,9 @@ public: void CloseCounter() override { if (has_flushed_end_pending) { - FlushEndTFB(); + if (scheduler.IsRenderPassActive()) { + FlushEndTFB(); + } } runtime.View3DRegs([this](Maxwell3D& maxwell3d) { if (maxwell3d.regs.transform_feedback_enabled == 0) { @@ -756,18 +766,33 @@ public: if (has_timestamp) { new_query->flags |= VideoCommon::QueryFlagBits::HasTimestamp; } + if (!device.IsExtTransformFeedbackSupported()) { + new_query->flags |= VideoCommon::QueryFlagBits::IsFinalValueSynced; + return index; + } if (!subreport_) { 
new_query->flags |= VideoCommon::QueryFlagBits::IsFinalValueSynced; return index; } const size_t subreport = static_cast(*subreport_); + if (subreport >= NUM_STREAMS) { + new_query->flags |= VideoCommon::QueryFlagBits::IsFinalValueSynced; + return index; + } last_queries[subreport] = address; if ((streams_mask & (1ULL << subreport)) == 0) { new_query->flags |= VideoCommon::QueryFlagBits::IsFinalValueSynced; return index; } + const size_t slot = stream_to_slot[subreport]; + if (slot >= NUM_STREAMS) { + new_query->flags |= VideoCommon::QueryFlagBits::IsFinalValueSynced; + return index; + } + + scheduler.RequestOutsideRenderPassOperationContext(); CloseCounter(); - auto [bank_slot, data_slot] = ProduceCounterBuffer(subreport); + auto [bank_slot, data_slot] = ProduceCounterBuffer(slot); new_query->start_bank_id = static_cast(bank_slot); new_query->size_banks = 1; new_query->start_slot = static_cast(data_slot); @@ -778,6 +803,9 @@ public: } std::optional> GetLastQueryStream(size_t stream) { + if (stream >= NUM_STREAMS) { + return std::nullopt; + } if (last_queries[stream] != 0) { std::pair result(last_queries[stream], last_queries_stride[stream]); return result; @@ -789,6 +817,10 @@ public: return out_topology; } + u32 GetPatchVertices() const { + return patch_vertices; + } + bool HasUnsyncedQueries() const override { return !pending_flush_queries.empty(); } @@ -855,6 +887,9 @@ public: private: void FlushBeginTFB() { + if (!device.IsExtTransformFeedbackSupported()) [[unlikely]] { + return; + } if (has_flushed_end_pending) [[unlikely]] { return; } @@ -868,12 +903,24 @@ private: }); return; } + static constexpr VkMemoryBarrier COUNTER_RESUME_BARRIER{ + .sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER, + .pNext = nullptr, + .srcAccessMask = VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_WRITE_BIT_EXT, + .dstAccessMask = VK_ACCESS_TRANSFORM_FEEDBACK_COUNTER_READ_BIT_EXT, + }; scheduler.Record([this, total = static_cast(buffers_count)](vk::CommandBuffer cmdbuf) { + 
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT, + VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT, 0, + COUNTER_RESUME_BARRIER); cmdbuf.BeginTransformFeedbackEXT(0, total, counter_buffers.data(), offsets.data()); }); } void FlushEndTFB() { + if (!device.IsExtTransformFeedbackSupported()) [[unlikely]] { + return; + } if (!has_flushed_end_pending) [[unlikely]] { UNREACHABLE(); return; @@ -899,28 +946,48 @@ private: void UpdateBuffers() { last_queries.fill(0); last_queries_stride.fill(1); + stream_to_slot.fill(INVALID_SLOT); streams_mask = 0; // reset previously recorded streams runtime.View3DRegs([this](Maxwell3D& maxwell3d) { buffers_count = 0; out_topology = maxwell3d.draw_manager->GetDrawState().topology; + patch_vertices = std::max(maxwell3d.regs.patch_vertices, 1U); + if (out_topology == Maxwell3D::Regs::PrimitiveTopology::Patches) { + switch (maxwell3d.regs.tessellation.params.output_primitives.Value()) { + case Maxwell3D::Regs::Tessellation::OutputPrimitives::Points: + out_topology = Maxwell3D::Regs::PrimitiveTopology::Points; + break; + case Maxwell3D::Regs::Tessellation::OutputPrimitives::Lines: + out_topology = Maxwell3D::Regs::PrimitiveTopology::LineStrip; + break; + case Maxwell3D::Regs::Tessellation::OutputPrimitives::Triangles_CW: + case Maxwell3D::Regs::Tessellation::OutputPrimitives::Triangles_CCW: + out_topology = Maxwell3D::Regs::PrimitiveTopology::TriangleStrip; + break; + } + } for (size_t i = 0; i < Maxwell3D::Regs::NumTransformFeedbackBuffers; i++) { const auto& tf = maxwell3d.regs.transform_feedback; if (tf.buffers[i].enable == 0) { continue; } + buffers_count = std::max(buffers_count, i + 1); const size_t stream = tf.controls[i].stream; if (stream >= last_queries_stride.size()) { LOG_WARNING(Render_Vulkan, "TransformFeedback stream {} out of range", stream); continue; } + if ((streams_mask & (1ULL << stream)) != 0) { + continue; + } last_queries_stride[stream] = tf.controls[i].stride; + stream_to_slot[stream] = i; 
streams_mask |= 1ULL << stream; - buffers_count = std::max(buffers_count, stream + 1); } }); } - std::pair ProduceCounterBuffer(size_t stream) { + std::pair ProduceCounterBuffer(size_t slot_index) { if (current_bank == nullptr || current_bank->IsClosed()) { current_bank_id = bank_pool.ReserveBank([this](std::deque& queue, size_t index) { @@ -946,7 +1013,8 @@ private: }; scheduler.RequestOutsideRenderPassOperationContext(); scheduler.Record([dst_buffer = current_bank->GetBuffer(), - src_buffer = counter_buffers[stream], src_offset = offsets[stream], + src_buffer = counter_buffers[slot_index], + src_offset = offsets[slot_index], slot](vk::CommandBuffer cmdbuf) { cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_TRANSFORM_FEEDBACK_BIT_EXT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0, READ_BARRIER); @@ -965,6 +1033,7 @@ private: friend class PrimitivesSucceededStreamer; static constexpr size_t NUM_STREAMS = 4; + static constexpr size_t INVALID_SLOT = NUM_STREAMS; QueryCacheRuntime& runtime; const Device& device; @@ -994,7 +1063,9 @@ private: std::array offsets{}; std::array last_queries; std::array last_queries_stride; + std::array stream_to_slot; Maxwell3D::Regs::PrimitiveTopology out_topology; + u32 patch_vertices{1}; u64 streams_mask; }; @@ -1015,6 +1086,7 @@ public: u64 stride{}; DAddr dependant_address{}; Maxwell3D::Regs::PrimitiveTopology topology{Maxwell3D::Regs::PrimitiveTopology::Points}; + u32 patch_vertices{1}; size_t dependant_index{}; bool dependant_manage{}; }; @@ -1031,6 +1103,10 @@ public: ~PrimitivesSucceededStreamer() = default; + void ResetCounter() override { + tfb_streamer.ResetCounter(); + } + size_t WriteCounter(DAddr address, bool has_timestamp, u32 value, std::optional subreport_) override { auto index = BuildQuery(); @@ -1048,6 +1124,7 @@ public: auto dependant_address_opt = tfb_streamer.GetLastQueryStream(subreport); bool must_manage_dependance = false; new_query->topology = tfb_streamer.GetOutputTopology(); + new_query->patch_vertices = 
tfb_streamer.GetPatchVertices(); if (dependant_address_opt) { auto [dep_address, stride] = *dependant_address_opt; new_query->dependant_address = dep_address; @@ -1068,6 +1145,7 @@ public: } new_query->stride = 1; runtime.View3DRegs([new_query, subreport](Maxwell3D& maxwell3d) { + new_query->patch_vertices = std::max(maxwell3d.regs.patch_vertices, 1U); for (size_t i = 0; i < Maxwell3D::Regs::NumTransformFeedbackBuffers; i++) { const auto& tf = maxwell3d.regs.transform_feedback; if (tf.buffers[i].enable == 0) { @@ -1131,27 +1209,39 @@ public: } } query->value = [&]() -> u64 { + const auto saturating_subtract = [](u64 value, u64 amount) { + return value > amount ? value - amount : 0; + }; switch (query->topology) { case Maxwell3D::Regs::PrimitiveTopology::Points: return num_vertices; case Maxwell3D::Regs::PrimitiveTopology::Lines: return num_vertices / 2; case Maxwell3D::Regs::PrimitiveTopology::LineLoop: - return (num_vertices / 2) + 1; + return num_vertices > 1 ? num_vertices : 0; case Maxwell3D::Regs::PrimitiveTopology::LineStrip: - return num_vertices - 1; - case Maxwell3D::Regs::PrimitiveTopology::Patches: + return saturating_subtract(num_vertices, 1); + case Maxwell3D::Regs::PrimitiveTopology::LinesAdjacency: + return num_vertices / 4; + case Maxwell3D::Regs::PrimitiveTopology::LineStripAdjacency: + return saturating_subtract(num_vertices, 3); case Maxwell3D::Regs::PrimitiveTopology::Triangles: - case Maxwell3D::Regs::PrimitiveTopology::TrianglesAdjacency: return num_vertices / 3; + case Maxwell3D::Regs::PrimitiveTopology::TrianglesAdjacency: + return num_vertices / 6; case Maxwell3D::Regs::PrimitiveTopology::TriangleFan: case Maxwell3D::Regs::PrimitiveTopology::TriangleStrip: + return saturating_subtract(num_vertices, 2); case Maxwell3D::Regs::PrimitiveTopology::TriangleStripAdjacency: - return num_vertices - 2; + return num_vertices > 4 ? 
(num_vertices - 4) / 2 : 0; case Maxwell3D::Regs::PrimitiveTopology::Quads: return num_vertices / 4; + case Maxwell3D::Regs::PrimitiveTopology::QuadStrip: + return num_vertices > 2 ? (num_vertices - 2) / 2 : 0; case Maxwell3D::Regs::PrimitiveTopology::Polygon: - return 1U; + return num_vertices >= 3 ? 1U : 0U; + case Maxwell3D::Regs::PrimitiveTopology::Patches: + return num_vertices / std::max(query->patch_vertices, 1U); default: return num_vertices; } @@ -1202,16 +1292,24 @@ struct QueryCacheRuntimeImpl { hcr_setup.pNext = nullptr; hcr_setup.flags = 0; - conditional_resolve_pass = std::make_unique( - device, scheduler, descriptor_pool, compute_pass_descriptor_queue); + const bool has_conditional_rendering = device.IsExtConditionalRendering(); + if (has_conditional_rendering) { + conditional_resolve_pass = std::make_unique( + device, scheduler, descriptor_pool, compute_pass_descriptor_queue); + } + + VkBufferUsageFlags hcr_buffer_usage = + VK_BUFFER_USAGE_TRANSFER_DST_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT; + if (has_conditional_rendering) { + hcr_buffer_usage |= VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT; + } const VkBufferCreateInfo buffer_ci = { .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO, .pNext = nullptr, .flags = 0, .size = sizeof(u32), - .usage = VK_BUFFER_USAGE_TRANSFER_DST_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | - VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT, + .usage = hcr_buffer_usage, .sharingMode = VK_SHARING_MODE_EXCLUSIVE, .queueFamilyIndexCount = 0, .pQueueFamilyIndices = nullptr, @@ -1338,15 +1436,17 @@ void QueryCacheRuntime::HostConditionalRenderingCompareValueImpl(VideoCommon::Lo } } -void QueryCacheRuntime::HostConditionalRenderingCompareBCImpl(DAddr address, bool is_equal) { +void QueryCacheRuntime::HostConditionalRenderingCompareBCImpl(DAddr address, bool is_equal, + bool compare_to_zero) { VkBuffer to_resolve; u32 to_resolve_offset; + const u32 resolve_size = compare_to_zero ? 
8 : 24; { std::scoped_lock lk(impl->buffer_cache.mutex); - static constexpr auto sync_info = VideoCommon::ObtainBufferSynchronize::NoSynchronize; + const auto sync_info = VideoCommon::ObtainBufferSynchronize::FullSynchronize; const auto post_op = VideoCommon::ObtainBufferOperation::DoNothing; const auto [buffer, offset] = - impl->buffer_cache.ObtainCPUBuffer(address, 24, sync_info, post_op); + impl->buffer_cache.ObtainCPUBuffer(address, resolve_size, sync_info, post_op); to_resolve = buffer->Handle(); to_resolve_offset = static_cast(offset); } @@ -1355,7 +1455,7 @@ void QueryCacheRuntime::HostConditionalRenderingCompareBCImpl(DAddr address, boo PauseHostConditionalRendering(); } impl->conditional_resolve_pass->Resolve(*impl->hcr_resolve_buffer, to_resolve, - to_resolve_offset, false); + to_resolve_offset, compare_to_zero); impl->hcr_setup.buffer = *impl->hcr_resolve_buffer; impl->hcr_setup.offset = 0; impl->hcr_setup.flags = is_equal ? 0 : VK_CONDITIONAL_RENDERING_INVERTED_BIT_EXT; @@ -1371,7 +1471,7 @@ bool QueryCacheRuntime::HostConditionalRenderingCompareValue(VideoCommon::Lookup if (!impl->device.IsExtConditionalRendering()) { return false; } - HostConditionalRenderingCompareValueImpl(object_1, false); + HostConditionalRenderingCompareBCImpl(object_1.address, true, true); return true; } @@ -1420,7 +1520,8 @@ bool QueryCacheRuntime::HostConditionalRenderingCompareValues(VideoCommon::Looku auto driver_id = impl->device.GetDriverID(); const bool is_gpu_high = Settings::IsGPULevelHigh(); - if ((!is_gpu_high && driver_id == VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS) || driver_id == VK_DRIVER_ID_QUALCOMM_PROPRIETARY || driver_id == VK_DRIVER_ID_ARM_PROPRIETARY || driver_id == VK_DRIVER_ID_MESA_TURNIP) { + if ((!is_gpu_high && driver_id == VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS) || driver_id == VK_DRIVER_ID_ARM_PROPRIETARY || driver_id == VK_DRIVER_ID_MESA_TURNIP) { + EndHostConditionalRendering(); return true; } @@ -1437,10 +1538,12 @@ bool 
QueryCacheRuntime::HostConditionalRenderingCompareValues(VideoCommon::Looku } if (!is_gpu_high) { + EndHostConditionalRendering(); return true; } if (!is_in_bc[0] && !is_in_bc[1]) { + EndHostConditionalRendering(); return true; } HostConditionalRenderingCompareBCImpl(object_1.address, equal_check); diff --git a/src/video_core/renderer_vulkan/vk_query_cache.h b/src/video_core/renderer_vulkan/vk_query_cache.h index e2aa4d991e..bbb5234e11 100644 --- a/src/video_core/renderer_vulkan/vk_query_cache.h +++ b/src/video_core/renderer_vulkan/vk_query_cache.h @@ -63,7 +63,8 @@ public: private: void HostConditionalRenderingCompareValueImpl(VideoCommon::LookupData object, bool is_equal); - void HostConditionalRenderingCompareBCImpl(DAddr address, bool is_equal); + void HostConditionalRenderingCompareBCImpl(DAddr address, bool is_equal, + bool compare_to_zero = false); friend struct QueryCacheRuntimeImpl; std::unique_ptr impl; }; diff --git a/src/video_core/renderer_vulkan/vk_rasterizer.cpp b/src/video_core/renderer_vulkan/vk_rasterizer.cpp index ba7b5d8c1b..78337b3ebe 100644 --- a/src/video_core/renderer_vulkan/vk_rasterizer.cpp +++ b/src/video_core/renderer_vulkan/vk_rasterizer.cpp @@ -173,6 +173,28 @@ DrawParams MakeDrawParams(const MaxwellDrawState& draw_state, u32 num_instances, } return params; } + +bool SupportsPrimitiveRestart(VkPrimitiveTopology topology) { + switch (topology) { + case VK_PRIMITIVE_TOPOLOGY_POINT_LIST: + case VK_PRIMITIVE_TOPOLOGY_LINE_LIST: + case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST: + case VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY: + case VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY: + case VK_PRIMITIVE_TOPOLOGY_PATCH_LIST: + return false; + default: + return true; + } +} + +bool IsPrimitiveRestartSupported(const Device& device, VkPrimitiveTopology topology) { + return ((topology != VK_PRIMITIVE_TOPOLOGY_PATCH_LIST && + device.IsTopologyListPrimitiveRestartSupported()) || + SupportsPrimitiveRestart(topology) || + (topology == 
VK_PRIMITIVE_TOPOLOGY_PATCH_LIST && + device.IsPatchListPrimitiveRestartSupported())); +} } // Anonymous namespace RasterizerVulkan::RasterizerVulkan(Core::Frontend::EmuWindow& emu_window_, Tegra::GPU& gpu_, @@ -225,6 +247,7 @@ void RasterizerVulkan::PrepareDraw(bool is_indexed, Func&& draw_func) { UpdateDynamicStates(); + query_cache.NotifySegment(true); HandleTransformFeedback(); query_cache.CounterEnable(VideoCommon::QueryType::ZPassPixelCount64, maxwell3d->regs.zpass_pixel_count_enable); @@ -336,6 +359,7 @@ void RasterizerVulkan::DrawTexture() { UpdateDynamicStates(); + query_cache.NotifySegment(true); query_cache.CounterEnable(VideoCommon::QueryType::ZPassPixelCount64, maxwell3d->regs.zpass_pixel_count_enable); const auto& draw_texture_state = maxwell3d->draw_manager->GetDrawTextureState(); @@ -575,11 +599,17 @@ void RasterizerVulkan::DispatchCompute() { } void RasterizerVulkan::ResetCounter(VideoCommon::QueryType type) { - if (type != VideoCommon::QueryType::ZPassPixelCount64) { + switch (type) { + case VideoCommon::QueryType::ZPassPixelCount64: + case VideoCommon::QueryType::StreamingByteCount: + case VideoCommon::QueryType::StreamingPrimitivesSucceeded: + case VideoCommon::QueryType::VtgPrimitivesOut: + query_cache.CounterReset(type); + return; + default: LOG_DEBUG(Render_Vulkan, "Unimplemented counter reset={}", type); return; } - query_cache.CounterReset(type); } void RasterizerVulkan::Query(GPUVAddr gpu_addr, VideoCommon::QueryType type, @@ -766,6 +796,9 @@ void RasterizerVulkan::ReleaseFences(bool force) { void RasterizerVulkan::FlushAndInvalidateRegion(DAddr addr, u64 size, VideoCommon::CacheType which) { + if (Settings::IsGPULevelHigh()) { + FlushRegion(addr, size, which); + } InvalidateRegion(addr, size, which); } @@ -830,6 +863,10 @@ bool RasterizerVulkan::AccelerateConditionalRendering() { return query_cache.AccelerateHostConditionalRendering(); } +bool RasterizerVulkan::HasDrawTransformFeedback() { + return 
device.IsTransformFeedbackDrawSupported(); +} + bool RasterizerVulkan::AccelerateSurfaceCopy(const Tegra::Engines::Fermi2D::Surface& src, const Tegra::Engines::Fermi2D::Surface& dst, const Tegra::Engines::Fermi2D::Config& copy_config) { @@ -976,6 +1013,12 @@ bool AccelerateDMA::BufferToImage(const Tegra::DMA::ImageCopy& copy_info, void RasterizerVulkan::UpdateDynamicStates() { auto& regs = maxwell3d->regs; + auto& flags = maxwell3d->dirty.flags; + const auto topology = maxwell3d->draw_manager->GetDrawState().topology; + if (state_tracker.ChangePrimitiveTopology(topology)) { + flags[Dirty::DepthBiasEnable] = true; + flags[Dirty::PrimitiveRestartEnable] = true; + } // Core Dynamic States (Vulkan 1.0) - Always active regardless of dyna_state setting UpdateViewportsState(regs); @@ -1084,6 +1127,9 @@ void RasterizerVulkan::UpdateViewportsState(Tegra::Engines::Maxwell3D::Regs& reg if (!state_tracker.TouchViewports()) { return; } + + maxwell3d->dirty.flags[Dirty::Scissors] = true; + if (!regs.viewport_scale_offset_enabled) { float x = static_cast(regs.surface_clip.x); float y = static_cast(regs.surface_clip.y); @@ -1101,8 +1147,12 @@ void RasterizerVulkan::UpdateViewportsState(Tegra::Engines::Maxwell3D::Regs& reg .minDepth = 0.0f, .maxDepth = 1.0f, }; - scheduler.Record([viewport](vk::CommandBuffer cmdbuf) { - cmdbuf.SetViewport(0, viewport); + scheduler.Record([this, viewport](vk::CommandBuffer cmdbuf) { + const u32 num_viewports = std::min(device.GetMaxViewports(), Maxwell::NumViewports); + std::array viewport_list{}; + viewport_list.fill(viewport); + const vk::Span viewports(viewport_list.data(), num_viewports); + cmdbuf.SetViewport(0, viewports); }); return; } @@ -1142,8 +1192,12 @@ void RasterizerVulkan::UpdateScissorsState(Tegra::Engines::Maxwell3D::Regs& regs scissor.offset.y = static_cast(y); scissor.extent.width = width; scissor.extent.height = height; - scheduler.Record([scissor](vk::CommandBuffer cmdbuf) { - cmdbuf.SetScissor(0, scissor); + 
scheduler.Record([this, scissor](vk::CommandBuffer cmdbuf) { + const u32 num_scissors = std::min(device.GetMaxViewports(), Maxwell::NumViewports); + std::array scissor_list{}; + scissor_list.fill(scissor); + const vk::Span scissors(scissor_list.data(), num_scissors); + cmdbuf.SetScissor(0, scissors); }); return; } @@ -1388,7 +1442,17 @@ void RasterizerVulkan::UpdatePrimitiveRestartEnable(Tegra::Engines::Maxwell3D::R if (!state_tracker.TouchPrimitiveRestartEnable()) { return; } - scheduler.Record([enable = regs.primitive_restart.enabled](vk::CommandBuffer cmdbuf) { + + bool enable = regs.primitive_restart.enabled != 0; + if (device.IsMoltenVK()) { + enable = true; + } else if (enable) { + const auto topology = + MaxwellToVK::PrimitiveTopology(device, maxwell3d->draw_manager->GetDrawState().topology); + enable = IsPrimitiveRestartSupported(device, topology); + } + + scheduler.Record([enable](vk::CommandBuffer cmdbuf) { cmdbuf.SetPrimitiveRestartEnableEXT(enable); }); } @@ -1727,7 +1791,9 @@ void RasterizerVulkan::UpdateStencilTestEnable(Tegra::Engines::Maxwell3D::Regs& void RasterizerVulkan::UpdateVertexInput(Tegra::Engines::Maxwell3D::Regs& regs) { auto& dirty{maxwell3d->dirty.flags}; - if (!dirty[Dirty::VertexInput]) { + const bool vertex_input_dirty = dirty[Dirty::VertexInput]; + const bool vertex_buffers_dirty = dirty[VideoCommon::Dirty::VertexBuffers]; + if (!vertex_input_dirty && !vertex_buffers_dirty) { return; } dirty[Dirty::VertexInput] = false; @@ -1735,38 +1801,31 @@ void RasterizerVulkan::UpdateVertexInput(Tegra::Engines::Maxwell3D::Regs& regs) boost::container::static_vector bindings; boost::container::static_vector attributes; - // There seems to be a bug on Nvidia's driver where updating only higher attributes ends up - // generating dirty state. Track the highest dirty attribute and update all attributes until - // that one. 
- size_t highest_dirty_attr{}; - for (size_t index = 0; index < Maxwell::NumVertexAttributes; ++index) { - if (dirty[Dirty::VertexAttribute0 + index]) { - highest_dirty_attr = index; - } - } - for (size_t index = 0; index < highest_dirty_attr; ++index) { + const u32 max_attributes = + static_cast(std::min(Maxwell::NumVertexAttributes, + device.GetMaxVertexInputAttributes())); + const u32 max_bindings = + static_cast(std::min(Maxwell::NumVertexArrays, + device.GetMaxVertexInputBindings())); + + + for (u32 index = 0; index < max_attributes; ++index) { const Maxwell::VertexAttribute attribute{regs.vertex_attrib_format[index]}; const u32 binding{attribute.buffer}; - dirty[Dirty::VertexAttribute0 + index] = false; - dirty[Dirty::VertexBinding0 + static_cast(binding)] = true; - if (!attribute.constant) { - attributes.push_back({ - .sType = VK_STRUCTURE_TYPE_VERTEX_INPUT_ATTRIBUTE_DESCRIPTION_2_EXT, - .pNext = nullptr, - .location = static_cast(index), - .binding = binding, - .format = MaxwellToVK::VertexFormat(device, attribute.type, attribute.size), - .offset = attribute.offset, - }); - } - } - for (size_t index = 0; index < Maxwell::NumVertexAttributes; ++index) { - if (!dirty[Dirty::VertexBinding0 + index]) { + if (attribute.constant || binding >= max_bindings) { continue; } - dirty[Dirty::VertexBinding0 + index] = false; + attributes.push_back({ + .sType = VK_STRUCTURE_TYPE_VERTEX_INPUT_ATTRIBUTE_DESCRIPTION_2_EXT, + .pNext = nullptr, + .location = index, + .binding = binding, + .format = MaxwellToVK::VertexFormat(device, attribute.type, attribute.size), + .offset = attribute.offset, + }); + } - const u32 binding{static_cast(index)}; + for (u32 binding = 0; binding < max_bindings; ++binding) { const auto& input_binding{regs.vertex_streams[binding]}; const bool is_instanced{regs.vertex_stream_instances.IsInstancingEnabled(binding)}; bindings.push_back({ @@ -1778,6 +1837,14 @@ void RasterizerVulkan::UpdateVertexInput(Tegra::Engines::Maxwell3D::Regs& regs) .divisor = 
is_instanced ? input_binding.frequency : 1, }); } + + for (size_t index = 0; index < Maxwell::NumVertexAttributes; ++index) { + dirty[Dirty::VertexAttribute0 + index] = false; + } + for (size_t index = 0; index < Maxwell::NumVertexArrays; ++index) { + dirty[Dirty::VertexBinding0 + index] = false; + } + scheduler.Record([bindings, attributes](vk::CommandBuffer cmdbuf) { cmdbuf.SetVertexInputEXT(bindings, attributes); }); diff --git a/src/video_core/renderer_vulkan/vk_rasterizer.h b/src/video_core/renderer_vulkan/vk_rasterizer.h index b689c6b660..841933d31d 100644 --- a/src/video_core/renderer_vulkan/vk_rasterizer.h +++ b/src/video_core/renderer_vulkan/vk_rasterizer.h @@ -1,4 +1,4 @@ -// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project +// SPDX-FileCopyrightText: Copyright 2026 Eden Emulator Project // SPDX-License-Identifier: GPL-3.0-or-later // SPDX-FileCopyrightText: Copyright 2019 yuzu Emulator Project @@ -122,6 +122,7 @@ public: void FlushCommands() override; void TickFrame() override; bool AccelerateConditionalRendering() override; + bool HasDrawTransformFeedback() override; bool AccelerateSurfaceCopy(const Tegra::Engines::Fermi2D::Surface& src, const Tegra::Engines::Fermi2D::Surface& dst, const Tegra::Engines::Fermi2D::Config& copy_config) override; diff --git a/src/video_core/renderer_vulkan/vk_scheduler.cpp b/src/video_core/renderer_vulkan/vk_scheduler.cpp index 226619d8d6..fdaf9baacc 100644 --- a/src/video_core/renderer_vulkan/vk_scheduler.cpp +++ b/src/video_core/renderer_vulkan/vk_scheduler.cpp @@ -324,6 +324,8 @@ void Scheduler::EndRenderPass() return; } + query_cache->CounterClose(VideoCommon::QueryType::StreamingByteCount); + // Log render pass end if (Settings::values.gpu_logging_enabled.GetValue() && Settings::values.gpu_log_vulkan_calls.GetValue()) { diff --git a/src/video_core/renderer_vulkan/vk_scheduler.h b/src/video_core/renderer_vulkan/vk_scheduler.h index 00a912f2cd..0709c3a370 100644 --- 
a/src/video_core/renderer_vulkan/vk_scheduler.h +++ b/src/video_core/renderer_vulkan/vk_scheduler.h @@ -63,6 +63,11 @@ public: /// of a renderpass. void RequestOutsideRenderPassOperationContext(); + /// Returns true when a render pass is currently active in the scheduler state. + bool IsRenderPassActive() const { + return state.renderpass != VK_NULL_HANDLE; + } + /// Update the pipeline to the current execution context. bool UpdateGraphicsPipeline(GraphicsPipeline* pipeline); diff --git a/src/video_core/renderer_vulkan/vk_state_tracker.cpp b/src/video_core/renderer_vulkan/vk_state_tracker.cpp index 79967d540a..3f4dd89c7e 100644 --- a/src/video_core/renderer_vulkan/vk_state_tracker.cpp +++ b/src/video_core/renderer_vulkan/vk_state_tracker.cpp @@ -1,4 +1,4 @@ -// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project +// SPDX-FileCopyrightText: Copyright 2026 Eden Emulator Project // SPDX-License-Identifier: GPL-3.0-or-later // SPDX-FileCopyrightText: Copyright 2020 yuzu Emulator Project @@ -87,6 +87,7 @@ Flags MakeInvalidationFlags() { void SetupDirtyViewports(Tables& tables) { FillBlock(tables[0], OFF(viewport_transform), NUM(viewport_transform), Viewports); FillBlock(tables[0], OFF(viewports), NUM(viewports), Viewports); + FillBlock(tables[1], OFF(surface_clip), NUM(surface_clip), Viewports); tables[0][OFF(viewport_scale_offset_enabled)] = Viewports; tables[1][OFF(window_origin)] = Viewports; } diff --git a/src/video_core/renderer_vulkan/vk_state_tracker.h b/src/video_core/renderer_vulkan/vk_state_tracker.h index 74bae9e181..6b47ba4176 100644 --- a/src/video_core/renderer_vulkan/vk_state_tracker.h +++ b/src/video_core/renderer_vulkan/vk_state_tracker.h @@ -1,4 +1,4 @@ -// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project +// SPDX-FileCopyrightText: Copyright 2026 Eden Emulator Project // SPDX-License-Identifier: GPL-3.0-or-later // SPDX-FileCopyrightText: Copyright 2020 yuzu Emulator Project diff --git 
a/src/video_core/renderer_vulkan/vk_texture_cache.cpp b/src/video_core/renderer_vulkan/vk_texture_cache.cpp index 48aa5ec476..f099db74cb 100644 --- a/src/video_core/renderer_vulkan/vk_texture_cache.cpp +++ b/src/video_core/renderer_vulkan/vk_texture_cache.cpp @@ -176,7 +176,18 @@ constexpr VkBorderColor ConvertBorderColor(const std::array& color) { .pViewFormats = view_formats.data(), }; if (view_formats.size() > 1) { - image_ci.flags |= VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT; + image_ci.flags |= + VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT | VK_IMAGE_CREATE_EXTENDED_USAGE_BIT; + + const bool has_storage_compatible_view = + std::any_of(view_formats.begin(), view_formats.end(), [&device](VkFormat view_format) { + return device.IsFormatSupported(view_format, VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT, + FormatType::Optimal); + }); + if (has_storage_compatible_view) { + image_ci.usage |= VK_IMAGE_USAGE_STORAGE_BIT; + } + if (device.IsKhrImageFormatListSupported()) { image_ci.pNext = &image_format_list; } @@ -668,11 +679,16 @@ void CopyBufferToImage(vk::CommandBuffer cmdbuf, VkBuffer src_buffer, VkImage im } void TryTransformSwizzleIfNeeded(PixelFormat format, std::array& swizzle, - bool emulate_a4b4g4r4) { + bool emulate_bgr565, bool emulate_a4b4g4r4) { switch (format) { case PixelFormat::A1B5G5R5_UNORM: std::ranges::transform(swizzle, swizzle.begin(), SwapBlueRed); break; + case PixelFormat::B5G6R5_UNORM: + if (emulate_bgr565) { + std::ranges::transform(swizzle, swizzle.begin(), SwapBlueRed); + } + break; case PixelFormat::A5B5G5R1_UNORM: std::ranges::transform(swizzle, swizzle.begin(), SwapSpecial); break; @@ -2119,22 +2135,21 @@ ImageView::ImageView(TextureCacheRuntime& runtime, const VideoCommon::ImageViewI if (!info.IsRenderTarget()) { swizzle = info.Swizzle(); TryTransformSwizzleIfNeeded(format, swizzle, - !device->IsExt4444FormatsSupported()); + device->MustEmulateBGR565(), + !device->IsExt4444FormatsSupported()); if ((aspect_mask & (VK_IMAGE_ASPECT_DEPTH_BIT | 
VK_IMAGE_ASPECT_STENCIL_BIT)) != 0) { std::ranges::transform(swizzle, swizzle.begin(), ConvertGreenRed); SanitizeDepthStencilSwizzle(swizzle, device->SupportsDepthStencilSwizzleOne()); } } const auto format_info = MaxwellToVK::SurfaceFormat(*device, FormatType::Optimal, true, format); - if (ImageUsageFlags(format_info, format) != image.UsageFlags()) { - LOG_WARNING(Render_Vulkan, - "Image view format {} has different usage flags than image format {}", format, - image.info.format); - } + const VkImageUsageFlags requested_view_usage = ImageUsageFlags(format_info, format); + const VkImageUsageFlags image_usage = image.UsageFlags(); + const VkImageUsageFlags clamped_view_usage = requested_view_usage & image_usage; const VkImageViewUsageCreateInfo image_view_usage{ .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_USAGE_CREATE_INFO, .pNext = nullptr, - .usage = ImageUsageFlags(format_info, format), + .usage = clamped_view_usage, }; const VkImageViewCreateInfo create_info{ .sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO, @@ -2300,23 +2315,18 @@ vk::ImageView ImageView::MakeView(VkFormat vk_format, VkImageAspectFlags aspect_ Sampler::Sampler(TextureCacheRuntime& runtime, const Tegra::Texture::TSCEntry& tsc) { const auto& device = runtime.device; - // Check if custom border colors are supported - const bool has_custom_border_colors = runtime.device.IsCustomBorderColorsSupported(); - const bool has_format_undefined = runtime.device.IsCustomBorderColorWithoutFormatSupported(); + const bool has_custom_border_extension = runtime.device.IsExtCustomBorderColorSupported(); + const bool has_format_undefined = + has_custom_border_extension && runtime.device.IsCustomBorderColorWithoutFormatSupported(); + const bool has_custom_border_colors = + has_format_undefined && runtime.device.IsCustomBorderColorsSupported(); const auto color = tsc.BorderColor(); - // Determine border format based on available features: - // - If customBorderColorWithoutFormat is available: use VK_FORMAT_UNDEFINED (most 
flexible) - // - If only customBorderColors is available: use concrete format (R8G8B8A8_UNORM) - // - If neither is available: use standard border colors (handled by ConvertBorderColor) - const VkFormat border_format = has_format_undefined ? VK_FORMAT_UNDEFINED - : VK_FORMAT_R8G8B8A8_UNORM; - const VkSamplerCustomBorderColorCreateInfoEXT border_ci{ .sType = VK_STRUCTURE_TYPE_SAMPLER_CUSTOM_BORDER_COLOR_CREATE_INFO_EXT, .pNext = nullptr, .customBorderColor = std::bit_cast(color), - .format = border_format, + .format = VK_FORMAT_UNDEFINED, }; const void* pnext = nullptr; if (has_custom_border_colors) { diff --git a/src/video_core/transform_feedback.cpp b/src/video_core/transform_feedback.cpp index a8f9da9853..53d29c08e2 100644 --- a/src/video_core/transform_feedback.cpp +++ b/src/video_core/transform_feedback.cpp @@ -1,4 +1,4 @@ -// SPDX-FileCopyrightText: Copyright 2025 Eden Emulator Project +// SPDX-FileCopyrightText: Copyright 2026 Eden Emulator Project // SPDX-License-Identifier: GPL-3.0-or-later // SPDX-FileCopyrightText: Copyright 2021 yuzu Emulator Project @@ -88,13 +88,13 @@ std::pair, u32> MakeTransformF return 0; }; - UNIMPLEMENTED_IF_MSG(layout.stream != 0, "Stream is not zero: {}", layout.stream); Shader::TransformFeedbackVarying varying{ .buffer = static_cast(buffer), .stride = layout.stride, .offset = offset * 4, .components = 1, }; + varying.stream = layout.stream; const u32 base_offset = offset; const auto attribute{get_attribute(offset)}; if (std::ranges::find(VECTORS, Common::AlignDown(attribute, 4)) != VECTORS.end()) { diff --git a/src/video_core/vulkan_common/vulkan_device.cpp b/src/video_core/vulkan_common/vulkan_device.cpp index 5075a79bcd..6e55306079 100644 --- a/src/video_core/vulkan_common/vulkan_device.cpp +++ b/src/video_core/vulkan_common/vulkan_device.cpp @@ -869,6 +869,10 @@ bool Device::HasTimelineSemaphore() const { return features.timeline_semaphore.timelineSemaphore; } +bool Device::MustEmulateBGR565() const { + return 
Settings::values.emulate_bgr565.GetValue(); +} + bool Device::GetSuitability(bool requires_swapchain) { // Assume we will be suitable. bool suitable = true; @@ -919,6 +923,17 @@ bool Device::GetSuitability(bool requires_swapchain) { FOR_EACH_VK_FEATURE_EXT(FEATURE_EXTENSION); FOR_EACH_VK_EXTENSION(EXTENSION); + if (supported_extensions.contains(VK_KHR_ROBUSTNESS_2_EXTENSION_NAME)) { + loaded_extensions.erase(VK_EXT_ROBUSTNESS_2_EXTENSION_NAME); + loaded_extensions.insert(VK_KHR_ROBUSTNESS_2_EXTENSION_NAME); + extensions.robustness_2 = true; + } else if (supported_extensions.contains(VK_EXT_ROBUSTNESS_2_EXTENSION_NAME)) { + loaded_extensions.insert(VK_EXT_ROBUSTNESS_2_EXTENSION_NAME); + extensions.robustness_2 = true; + } else { + extensions.robustness_2 = false; + } + #undef FEATURE_EXTENSION #undef EXTENSION @@ -1131,8 +1146,6 @@ bool Device::GetSuitability(bool requires_swapchain) { if (u32(Settings::values.dyna_state.GetValue()) == 0) { LOG_INFO(Render_Vulkan, "Extended Dynamic State disabled by user setting, clearing all EDS features"); - features.custom_border_color.customBorderColors = false; - features.custom_border_color.customBorderColorWithoutFormat = false; features.extended_dynamic_state.extendedDynamicState = false; features.extended_dynamic_state2.extendedDynamicState2 = false; features.extended_dynamic_state3.extendedDynamicState3ColorBlendEnable = false; @@ -1148,24 +1161,13 @@ bool Device::GetSuitability(bool requires_swapchain) { void Device::RemoveUnsuitableExtensions() { // VK_EXT_custom_border_color - // Enable extension if driver supports it, then check individual features - // - customBorderColors: Required to use VK_BORDER_COLOR_FLOAT_CUSTOM_EXT - // - customBorderColorWithoutFormat: Optional, allows VK_FORMAT_UNDEFINED - // If only customBorderColors is available, we must provide a specific format if (extensions.custom_border_color) { - // Verify that at least customBorderColors is available - if 
(!features.custom_border_color.customBorderColors) { - LOG_WARNING(Render_Vulkan, - "VK_EXT_custom_border_color reported but customBorderColors feature not available, disabling"); - extensions.custom_border_color = false; - } + extensions.custom_border_color = + features.custom_border_color.customBorderColors && + features.custom_border_color.customBorderColorWithoutFormat; } RemoveExtensionFeatureIfUnsuitable(extensions.custom_border_color, features.custom_border_color, VK_EXT_CUSTOM_BORDER_COLOR_EXTENSION_NAME); - // VK_KHR_unified_image_layouts - extensions.unified_image_layouts = features.unified_image_layouts.unifiedImageLayouts; - RemoveExtensionFeatureIfUnsuitable(extensions.unified_image_layouts, features.unified_image_layouts, - VK_KHR_UNIFIED_IMAGE_LAYOUTS_EXTENSION_NAME); // VK_EXT_depth_bias_control extensions.depth_bias_control = @@ -1251,16 +1253,22 @@ void Device::RemoveUnsuitableExtensions() { VK_EXT_EXTENDED_DYNAMIC_STATE_3_EXTENSION_NAME); // VK_EXT_robustness2 - extensions.robustness_2 = features.robustness2.robustBufferAccess2 || - features.robustness2.robustImageAccess2 || - features.robustness2.nullDescriptor; + features.robustness2.robustBufferAccess2 = VK_FALSE; + features.robustness2.robustImageAccess2 = VK_FALSE; + extensions.robustness_2 = features.robustness2.nullDescriptor; + + const char* robustness2_extension_name = + loaded_extensions.contains(VK_KHR_ROBUSTNESS_2_EXTENSION_NAME) + ? 
VK_KHR_ROBUSTNESS_2_EXTENSION_NAME + : VK_EXT_ROBUSTNESS_2_EXTENSION_NAME; RemoveExtensionFeatureIfUnsuitable(extensions.robustness_2, features.robustness2, - VK_EXT_ROBUSTNESS_2_EXTENSION_NAME); + robustness2_extension_name); - // VK_EXT_image_robustness - extensions.image_robustness = features.image_robustness.robustImageAccess; - RemoveExtensionFeatureIfUnsuitable(extensions.image_robustness, features.image_robustness, + // Image robustness + extensions.robust_image_access = features.robust_image_access.robustImageAccess; + RemoveExtensionFeatureIfUnsuitable(extensions.robust_image_access, + features.robust_image_access, VK_EXT_IMAGE_ROBUSTNESS_EXTENSION_NAME); // VK_KHR_shader_atomic_int64 @@ -1288,8 +1296,7 @@ void Device::RemoveUnsuitableExtensions() { // VK_EXT_transform_feedback extensions.transform_feedback = features.transform_feedback.transformFeedback && - properties.transform_feedback.maxTransformFeedbackBuffers > 0 && - properties.transform_feedback.transformFeedbackQueries; + properties.transform_feedback.maxTransformFeedbackBuffers > 0; RemoveExtensionFeatureIfUnsuitable(extensions.transform_feedback, features.transform_feedback, VK_EXT_TRANSFORM_FEEDBACK_EXTENSION_NAME); diff --git a/src/video_core/vulkan_common/vulkan_device.h b/src/video_core/vulkan_common/vulkan_device.h index ad9d53ce16..a8a89aee89 100644 --- a/src/video_core/vulkan_common/vulkan_device.h +++ b/src/video_core/vulkan_common/vulkan_device.h @@ -38,7 +38,7 @@ VK_DEFINE_HANDLE(VmaAllocator) FEATURE(KHR, TimelineSemaphore, TIMELINE_SEMAPHORE, timeline_semaphore) #define FOR_EACH_VK_FEATURE_1_3(FEATURE) \ - FEATURE(EXT, ImageRobustness, IMAGE_ROBUSTNESS, image_robustness) \ + FEATURE(EXT, ImageRobustness, IMAGE_ROBUSTNESS, robust_image_access) \ FEATURE(EXT, ShaderDemoteToHelperInvocation, SHADER_DEMOTE_TO_HELPER_INVOCATION, \ shader_demote_to_helper_invocation) \ FEATURE(EXT, SubgroupSizeControl, SUBGROUP_SIZE_CONTROL, subgroup_size_control) \ @@ -68,8 +68,7 @@ 
VK_DEFINE_HANDLE(VmaAllocator) FEATURE(KHR, PipelineExecutableProperties, PIPELINE_EXECUTABLE_PROPERTIES, \ pipeline_executable_properties) \ FEATURE(KHR, WorkgroupMemoryExplicitLayout, WORKGROUP_MEMORY_EXPLICIT_LAYOUT, \ - workgroup_memory_explicit_layout) \ - FEATURE(KHR, UnifiedImageLayouts, UNIFIED_IMAGE_LAYOUTS, unified_image_layouts) + workgroup_memory_explicit_layout) // Define miscellaneous extensions which may be used by the implementation here. @@ -124,7 +123,6 @@ VK_DEFINE_HANDLE(VmaAllocator) EXTENSION_NAME(VK_EXT_EXTENDED_DYNAMIC_STATE_3_EXTENSION_NAME) \ EXTENSION_NAME(VK_EXT_EXTERNAL_MEMORY_HOST_EXTENSION_NAME) \ EXTENSION_NAME(VK_EXT_4444_FORMATS_EXTENSION_NAME) \ - EXTENSION_NAME(VK_EXT_IMAGE_ROBUSTNESS_EXTENSION_NAME) \ EXTENSION_NAME(VK_EXT_LINE_RASTERIZATION_EXTENSION_NAME) \ EXTENSION_NAME(VK_EXT_ROBUSTNESS_2_EXTENSION_NAME) \ EXTENSION_NAME(VK_EXT_VERTEX_INPUT_DYNAMIC_STATE_EXTENSION_NAME) \ @@ -174,13 +172,11 @@ VK_DEFINE_HANDLE(VmaAllocator) FEATURE_NAME(depth_bias_control, depthBiasExact) \ FEATURE_NAME(extended_dynamic_state, extendedDynamicState) \ FEATURE_NAME(format_a4b4g4r4, formatA4B4G4R4) \ - FEATURE_NAME(image_robustness, robustImageAccess) \ + FEATURE_NAME(robust_image_access, robustImageAccess) \ FEATURE_NAME(index_type_uint8, indexTypeUint8) \ FEATURE_NAME(primitive_topology_list_restart, primitiveTopologyListRestart) \ FEATURE_NAME(provoking_vertex, provokingVertexLast) \ FEATURE_NAME(robustness2, nullDescriptor) \ - FEATURE_NAME(robustness2, robustBufferAccess2) \ - FEATURE_NAME(robustness2, robustImageAccess2) \ FEATURE_NAME(shader_float16_int8, shaderFloat16) \ FEATURE_NAME(shader_float16_int8, shaderInt8) \ FEATURE_NAME(timeline_semaphore, timelineSemaphore) \ @@ -542,6 +538,17 @@ public: return extensions.transform_feedback; } + /// Returns true if transform feedback draw commands are supported. 
+ bool IsTransformFeedbackDrawSupported() const { + return extensions.transform_feedback && properties.transform_feedback.transformFeedbackDraw; + } + + /// Returns true if transform feedback query types are supported. + bool IsTransformFeedbackQueriesSupported() const { + return extensions.transform_feedback && + properties.transform_feedback.transformFeedbackQueries; + } + /// Returns true if the device supports VK_EXT_transform_feedback properly. bool AreTransformFeedbackGeometryStreamsSupported() const { return features.transform_feedback.geometryStreams; @@ -552,36 +559,6 @@ public: return extensions.custom_border_color; } - /// Returns true if the device supports VK_EXT_image_robustness. - bool IsExtImageRobustnessSupported() const { - return extensions.image_robustness; - } - - /// Returns true if robustImageAccess is supported. - bool IsRobustImageAccessSupported() const { - return features.image_robustness.robustImageAccess; - } - - /// Returns true if the device supports VK_EXT_robustness2. - bool IsExtRobustness2Supported() const { - return extensions.robustness_2; - } - - /// Returns true if robustBufferAccess2 is supported. - bool IsRobustBufferAccess2Supported() const { - return features.robustness2.robustBufferAccess2; - } - - /// Returns true if robustImageAccess2 is supported. - bool IsRobustImageAccess2Supported() const { - return features.robustness2.robustImageAccess2; - } - - /// Returns true if nullDescriptor is supported. - bool IsNullDescriptorSupported() const { - return features.robustness2.nullDescriptor; - } - /// Returns true if customBorderColors feature is available. 
bool IsCustomBorderColorsSupported() const { return features.custom_border_color.customBorderColors; @@ -805,6 +782,8 @@ public: return features.robustness2.nullDescriptor; } + bool MustEmulateBGR565() const; + bool HasExactDepthBiasControl() const { return features.depth_bias_control.depthBiasExact; } diff --git a/src/video_core/vulkan_common/vulkan_wrapper.cpp b/src/video_core/vulkan_common/vulkan_wrapper.cpp index f59ac7d6bc..871ce52678 100644 --- a/src/video_core/vulkan_common/vulkan_wrapper.cpp +++ b/src/video_core/vulkan_common/vulkan_wrapper.cpp @@ -123,6 +123,7 @@ void Load(VkDevice device, DeviceDispatch& dld) noexcept { X(vkCmdEndDebugUtilsLabelEXT); X(vkCmdFillBuffer); X(vkCmdPipelineBarrier); + X(vkCmdResetQueryPool); X(vkCmdPushConstants); X(vkCmdPushDescriptorSetWithTemplateKHR); X(vkCmdSetBlendConstants); diff --git a/src/video_core/vulkan_common/vulkan_wrapper.h b/src/video_core/vulkan_common/vulkan_wrapper.h index aaff66359e..4a3baad2c4 100644 --- a/src/video_core/vulkan_common/vulkan_wrapper.h +++ b/src/video_core/vulkan_common/vulkan_wrapper.h @@ -225,6 +225,7 @@ struct DeviceDispatch : InstanceDispatch { PFN_vkCmdEndTransformFeedbackEXT vkCmdEndTransformFeedbackEXT{}; PFN_vkCmdFillBuffer vkCmdFillBuffer{}; PFN_vkCmdPipelineBarrier vkCmdPipelineBarrier{}; + PFN_vkCmdResetQueryPool vkCmdResetQueryPool{}; PFN_vkCmdPushConstants vkCmdPushConstants{}; PFN_vkCmdPushDescriptorSetWithTemplateKHR vkCmdPushDescriptorSetWithTemplateKHR{}; PFN_vkCmdResolveImage vkCmdResolveImage{}; @@ -1168,6 +1169,10 @@ public: dld->vkCmdEndQuery(handle, query_pool, query); } + void ResetQueryPool(VkQueryPool query_pool, u32 first_query, u32 query_count) const noexcept { + dld->vkCmdResetQueryPool(handle, query_pool, first_query, query_count); + } + void BindDescriptorSets(VkPipelineBindPoint bind_point, VkPipelineLayout layout, u32 first, Span sets, Span dynamic_offsets) const noexcept { dld->vkCmdBindDescriptorSets(handle, bind_point, layout, first, sets.size(), 
sets.data(),