Rasterizer cache refactor (#6375)

* rasterizer_cache: Remove custom texture code

* It's a hacky buggy mess, will be reimplemented later when the cache is in a better state

* rasterizer_cache: Refactor surface upload/download

* Switch to the texture_codec header which was written as part of the vulkan backend by steveice and me

* Move most of the upload logic to the rasterizer cache and out of the surface object

* Scaled uploads/downloads have been disabled for now since they require more runtime infrastructure

* rasterizer_cache: Refactor runtime interface

* Remove aspect enum which is the same as SurfaceType

* Replace Subresource with specific structures for each operation (blit/copy/clear). This mimics moderns APIs vulkan much better

* Pass the surface to the runtime instead of the texture

* Implement CopyTextures with glCopyImageSubData which is available on 4.3 and gles.
  This function also has an overload for cubes which will be removed later.

* rasterizer_cache: Move texture allocation to the runtime

* renderer_opengl: Remove TextureDownloaderES

* It's overly compilcated and unused at the moment. Will be replaced with a simple compute shader in a later commit

* rasterizer_cache: Split CachedSurface

* This commit splits CachedSurface into two classes, SurfaceBase which contains the backend agnostic functions and Surface which is the opengl specific part

* For now the cache uses the opengl surface directly and there are a few ugly casts with watchers, those will be taken care of when the template convertion and watcher removal are added respectively

* rasterizer_cache: Move reinterpreters to the runtime

* rasterizer_cache: Move some pixel format function to the cpp file

* rasterizer_cache: Common texture acceleration functions

* They don't contain any backend specific code so they shouldn't be duplicated

* rasterizer_cache: Remove BlitSurfaces

* It's better to prefer copy/blit in the caller anyway

* rasterizer_cache: Only allocate needed levels

* rasterizer_cache: Move texture runtime out of common dir

* Also shorten the util header filename

* surface_params: Cleanup code

* Add more comments, organize it a bit etc

* rasterizer_cache: Move texture filtering to the runtime

* rasterizer_cache: Move to VideoCore

* renderer_opengl: Reimplement scaled uploads/downloads

* Instead of looking up for temporary textures, each allocation now contains both a scaled and unscaled handle
  This allows the scale operations to be done inside the surface object itself and improves performance in general

* In particular the scaled download code has been expanded to use ARB_get_texture_sub_image when possible
  which is faster and more convenient than glReadPixels. The latter is still relevant for OpenGLES though.

* Finally allocations are now given a handy debug name that can be viewed from renderdoc.

* rasterizer_cache: Remove global state

* gl_rasterizer: Abstract common draw operations to Framebuffer

* This also allows to cache framebuffer objects instead of always swapping the textures, something that particularly benefits mali gpus

* rasterizer_cache: Implement multi-level surfaces

* With this commit the cache can now directly upload and use mipmaps
  without needing to sync them with watchers. By using native mimaps
  directly this also adds support for mipmap for cube

* Texture cubes have also been updated to drop the watcher requirement

* host_shaders: Add CMake integration for string shaders

* Improves build time shader generation making it much less prone to errors.
  Also moves the presentation shaders here to avoid embedding them to the cpp file.

* Texture filter shaders now make explicit use of uniform bindings for better vulkan compatibility

* renderer_opengl: Emulate lod bias in the shader

* This way opengles can emulate it correctly

* gl_rasterizer: Respect GL_MAX_TEXTURE_BUFFER_SIZE

* Older Bifrost Mali GPUs only support up to 64kb texture buffers. Citra would try to allocate a much larger buffer the first 64kb of which would work fine but after that the driver starts misbehaving and showing various graphical glitches

* rasterizer_cache: Cleanup CopySurface

* renderer_opengl: Keep frames synchronized when using a GPU debugger

* rasterizer_cache: Rename Surface to SurfaceRef

* Makes it clear that surface is a shared_ptr and not an object

* rasterizer_cache: Cleanup

* Move constructor to the top of the file

* Move FindMatch to the top as well and remove the Invalid flag which was redudant;
  all FindMatch calls used it expect from MatchFlags::Copy which ignores it anyway

* gl_texture_runtime: Make driver const

* gl_texture_runtime: Fix RGB8 format handling

* The texture_codec header, being written with vulkan in mind converts RGB8 to RGBA8. The backend wasn't adjusted to account for this though and treated the data as RGB8.

* Also remove D16 convertions, both opengl and vulkan are required to support this format so these are not needed

* gl_texture_runtime: Reduce state switches during FBO blits

* glBlitFramebuffer is only affected by the scissor rectangle so just disable scissor testing instead of resetting our entire state

* surface_params: Prevent texcopy that spans multiple levels

* It would have failed before as well, with multi-level surfaces it triggers the assert though

* renderer_opengl: Centralize texture filters

* A lot of code is shared between the filters thus is makes it sense to centralize them

* Also fix an issue with partial texture uploads

* Address review comments

* rasterizer_cache: Use leading return types

* rasterizer_cache: Cleanup null checks

* renderer_opengl: Add additional logging

* externals: Actually downgrade glad

* For some reason I missed adding the files to git

* surface_params: Do not check for levels in exact match

* Some games will try to use the base level of a multi level surface. Checking for levels forces another surface to be created and a copy to be made which is both unncessary and breaks custom textures

---------

Co-authored-by: bunnei <bunneidev@gmail.com>
This commit is contained in:
GPUCode 2023-04-21 10:14:55 +03:00 committed by GitHub
parent 9414db4f65
commit 26d6f9d1c6
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
105 changed files with 4606 additions and 4984 deletions

View file

@ -1,10 +0,0 @@
//? #version 320 es
out highp uint color;
uniform highp sampler2D depth;
uniform int lod;
void main() {
color = uint(texelFetch(depth, ivec2(gl_FragCoord.xy), lod).x * (exp2(32.0) - 1.0));
}

View file

@ -1,8 +0,0 @@
//? #version 320 es
const vec2 vertices[4] =
vec2[4](vec2(-1.0, -1.0), vec2(1.0, -1.0), vec2(-1.0, 1.0), vec2(1.0, 1.0));
void main() {
gl_Position = vec4(vertices[gl_VertexID], 0.0, 1.0);
}

View file

@ -1,9 +0,0 @@
//? #version 320 es
#extension GL_ARM_shader_framebuffer_fetch_depth_stencil : enable
out highp uint color;
void main() {
color = uint(gl_LastFragDepthARM * (exp2(24.0) - 1.0)) << 8;
color |= uint(gl_LastFragStencilARM);
}

View file

@ -0,0 +1,218 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "common/settings.h"
#include "video_core/rasterizer_cache/pixel_format.h"
#include "video_core/renderer_opengl/gl_blit_helper.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_texture_runtime.h"
#include "video_core/host_shaders/full_screen_triangle_vert.h"
#include "video_core/host_shaders/texture_filtering/bicubic_frag.h"
#include "video_core/host_shaders/texture_filtering/nearest_neighbor_frag.h"
#include "video_core/host_shaders/texture_filtering/refine_frag.h"
#include "video_core/host_shaders/texture_filtering/scale_force_frag.h"
#include "video_core/host_shaders/texture_filtering/tex_coord_vert.h"
#include "video_core/host_shaders/texture_filtering/x_gradient_frag.h"
#include "video_core/host_shaders/texture_filtering/xbrz_freescale_frag.h"
#include "video_core/host_shaders/texture_filtering/y_gradient_frag.h"
namespace OpenGL {
using Settings::TextureFilter;
using VideoCore::SurfaceType;
namespace {
struct TempTexture {
OGLTexture tex;
OGLFramebuffer fbo;
};
OGLSampler CreateSampler(GLenum filter) {
OGLSampler sampler;
sampler.Create();
glSamplerParameteri(sampler.handle, GL_TEXTURE_MIN_FILTER, filter);
glSamplerParameteri(sampler.handle, GL_TEXTURE_MAG_FILTER, filter);
glSamplerParameteri(sampler.handle, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glSamplerParameteri(sampler.handle, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
return sampler;
}
OGLProgram CreateProgram(std::string_view frag) {
OGLProgram program;
program.Create(HostShaders::FULL_SCREEN_TRIANGLE_VERT, frag);
glProgramUniform2f(program.handle, 0, 1.f, 1.f);
glProgramUniform2f(program.handle, 1, 0.f, 0.f);
return program;
}
} // Anonymous namespace
BlitHelper::BlitHelper(TextureRuntime& runtime_)
: runtime{runtime_}, linear_sampler{CreateSampler(GL_LINEAR)},
nearest_sampler{CreateSampler(GL_NEAREST)}, bicubic_program{CreateProgram(
HostShaders::BICUBIC_FRAG)},
nearest_program{CreateProgram(HostShaders::NEAREST_NEIGHBOR_FRAG)},
scale_force_program{CreateProgram(HostShaders::SCALE_FORCE_FRAG)},
xbrz_program{CreateProgram(HostShaders::XBRZ_FREESCALE_FRAG)},
gradient_x_program{CreateProgram(HostShaders::X_GRADIENT_FRAG)},
gradient_y_program{CreateProgram(HostShaders::Y_GRADIENT_FRAG)},
refine_program{CreateProgram(HostShaders::REFINE_FRAG)} {
vao.Create();
filter_fbo.Create();
state.draw.vertex_array = vao.handle;
for (u32 i = 0; i < 3; i++) {
state.texture_units[i].sampler = i == 2 ? nearest_sampler.handle : linear_sampler.handle;
}
}
BlitHelper::~BlitHelper() = default;
bool BlitHelper::Filter(Surface& surface, const VideoCore::TextureBlit& blit) {
// Filtering to depth stencil surfaces isn't supported.
if (surface.type == SurfaceType::Depth || surface.type == SurfaceType::DepthStencil) {
return false;
}
// Avoid filtering for mipmaps as the result often looks terrible.
if (blit.src_level != 0) {
return true;
}
const OpenGLState prev_state = OpenGLState::GetCurState();
state.texture_units[0].texture_2d = surface.Handle(false);
const auto filter{Settings::values.texture_filter.GetValue()};
switch (filter) {
case TextureFilter::None:
break;
case TextureFilter::Anime4K:
FilterAnime4K(surface, blit);
break;
case TextureFilter::Bicubic:
FilterBicubic(surface, blit);
break;
case TextureFilter::NearestNeighbor:
FilterNearest(surface, blit);
break;
case TextureFilter::ScaleForce:
FilterScaleForce(surface, blit);
break;
case TextureFilter::xBRZ:
FilterXbrz(surface, blit);
break;
}
prev_state.Apply();
return true;
}
void BlitHelper::FilterAnime4K(Surface& surface, const VideoCore::TextureBlit& blit) {
static constexpr u8 internal_scale_factor = 2;
const auto& tuple = surface.Tuple();
const u32 src_width = blit.src_rect.GetWidth();
const u32 src_height = blit.src_rect.GetHeight();
const auto temp_rect{blit.src_rect * internal_scale_factor};
const auto setup_temp_tex = [&](GLint internal_format, GLint format, u32 width, u32 height) {
TempTexture texture;
texture.fbo.Create();
texture.tex.Create();
state.texture_units[1].texture_2d = texture.tex.handle;
state.draw.draw_framebuffer = texture.fbo.handle;
state.Apply();
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, texture.tex.handle);
glTexStorage2D(GL_TEXTURE_2D, 1, internal_format, width, height);
return texture;
};
// Create intermediate textures
auto SRC = setup_temp_tex(tuple.internal_format, tuple.format, src_width, src_height);
auto XY = setup_temp_tex(GL_RG16F, GL_RG, temp_rect.GetWidth(), temp_rect.GetHeight());
auto LUMAD = setup_temp_tex(GL_R16F, GL_RED, temp_rect.GetWidth(), temp_rect.GetHeight());
// Copy to SRC
glCopyImageSubData(surface.Handle(false), GL_TEXTURE_2D, 0, blit.src_rect.left,
blit.src_rect.bottom, 0, SRC.tex.handle, GL_TEXTURE_2D, 0, 0, 0, 0,
src_width, src_height, 1);
state.texture_units[0].texture_2d = SRC.tex.handle;
state.texture_units[1].texture_2d = LUMAD.tex.handle;
state.texture_units[2].texture_2d = XY.tex.handle;
// gradient x pass
Draw(gradient_x_program, XY.tex.handle, XY.fbo.handle, 0, temp_rect);
// gradient y pass
Draw(gradient_y_program, LUMAD.tex.handle, LUMAD.fbo.handle, 0, temp_rect);
// refine pass
Draw(refine_program, surface.Handle(), filter_fbo.handle, blit.dst_level, blit.dst_rect);
// These will have handles from the previous texture that was filtered, reset them to avoid
// binding invalid textures.
state.texture_units[0].texture_2d = 0;
state.texture_units[1].texture_2d = 0;
state.texture_units[2].texture_2d = 0;
state.Apply();
}
void BlitHelper::FilterBicubic(Surface& surface, const VideoCore::TextureBlit& blit) {
SetParams(bicubic_program, surface.width, surface.height, blit.src_rect);
Draw(bicubic_program, surface.Handle(), filter_fbo.handle, blit.dst_level, blit.dst_rect);
}
void BlitHelper::FilterNearest(Surface& surface, const VideoCore::TextureBlit& blit) {
SetParams(nearest_program, surface.width, surface.height, blit.src_rect);
Draw(nearest_program, surface.Handle(), filter_fbo.handle, blit.dst_level, blit.dst_rect);
}
void BlitHelper::FilterScaleForce(Surface& surface, const VideoCore::TextureBlit& blit) {
SetParams(scale_force_program, surface.width, surface.height, blit.src_rect);
Draw(scale_force_program, surface.Handle(), filter_fbo.handle, blit.dst_level, blit.dst_rect);
}
void BlitHelper::FilterXbrz(Surface& surface, const VideoCore::TextureBlit& blit) {
glProgramUniform1f(xbrz_program.handle, 2, static_cast<GLfloat>(surface.res_scale));
SetParams(xbrz_program, surface.width, surface.height, blit.src_rect);
Draw(xbrz_program, surface.Handle(), filter_fbo.handle, blit.dst_level, blit.dst_rect);
}
void BlitHelper::SetParams(OGLProgram& program, u32 src_width, u32 src_height,
Common::Rectangle<u32> src_rect) {
glProgramUniform2f(
program.handle, 0,
static_cast<float>(src_rect.right - src_rect.left) / static_cast<float>(src_width),
static_cast<float>(src_rect.top - src_rect.bottom) / static_cast<float>(src_height));
glProgramUniform2f(program.handle, 1,
static_cast<float>(src_rect.left) / static_cast<float>(src_width),
static_cast<float>(src_rect.bottom) / static_cast<float>(src_height));
}
void BlitHelper::Draw(OGLProgram& program, GLuint dst_tex, GLuint dst_fbo, u32 dst_level,
Common::Rectangle<u32> dst_rect) {
state.draw.draw_framebuffer = dst_fbo;
state.draw.shader_program = program.handle;
state.scissor.enabled = true;
state.scissor.x = dst_rect.left;
state.scissor.y = dst_rect.bottom;
state.scissor.width = dst_rect.GetWidth();
state.scissor.height = dst_rect.GetHeight();
state.viewport.x = dst_rect.left;
state.viewport.y = dst_rect.bottom;
state.viewport.width = dst_rect.GetWidth();
state.viewport.height = dst_rect.GetHeight();
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dst_tex,
dst_level);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
glClear(GL_COLOR_BUFFER_BIT);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
} // namespace OpenGL

View file

@ -0,0 +1,61 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "common/math_util.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
namespace VideoCore {
struct TextureBlit;
}
namespace OpenGL {
class TextureRuntime;
class Surface;
class BlitHelper {
public:
BlitHelper(TextureRuntime& runtime);
~BlitHelper();
bool Filter(Surface& surface, const VideoCore::TextureBlit& blit);
private:
void FilterAnime4K(Surface& surface, const VideoCore::TextureBlit& blit);
void FilterBicubic(Surface& surface, const VideoCore::TextureBlit& blit);
void FilterNearest(Surface& surface, const VideoCore::TextureBlit& blit);
void FilterScaleForce(Surface& surface, const VideoCore::TextureBlit& blit);
void FilterXbrz(Surface& surface, const VideoCore::TextureBlit& blit);
void SetParams(OGLProgram& program, u32 src_width, u32 src_height,
Common::Rectangle<u32> src_rect);
void Draw(OGLProgram& program, GLuint dst_tex, GLuint dst_fbo, u32 dst_level,
Common::Rectangle<u32> dst_rect);
private:
TextureRuntime& runtime;
OGLVertexArray vao;
OpenGLState state;
OGLFramebuffer filter_fbo;
OGLSampler linear_sampler;
OGLSampler nearest_sampler;
OGLProgram bicubic_program;
OGLProgram nearest_program;
OGLProgram scale_force_program;
OGLProgram xbrz_program;
OGLProgram gradient_x_program;
OGLProgram gradient_y_program;
OGLProgram refine_program;
};
} // namespace OpenGL

View file

@ -71,10 +71,11 @@ static void APIENTRY DebugHandler(GLenum source, GLenum type, GLuint id, GLenum
id, message);
}
Driver::Driver(Core::TelemetrySession& telemetry_session_) : telemetry_session{telemetry_session_} {
Driver::Driver(Core::TelemetrySession& telemetry_session_)
: telemetry_session{telemetry_session_}, is_gles{Settings::values.use_gles.GetValue()} {
const bool enable_debug = Settings::values.renderer_debug.GetValue();
if (enable_debug) {
glEnable(GL_DEBUG_OUTPUT);
glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);
glDebugMessageCallback(DebugHandler, nullptr);
}
@ -90,6 +91,18 @@ bool Driver::HasBug(DriverBug bug) const {
return True(bugs & bug);
}
bool Driver::HasDebugTool() {
GLint num_extensions;
glGetIntegerv(GL_NUM_EXTENSIONS, &num_extensions);
for (GLuint index = 0; index < static_cast<GLuint>(num_extensions); ++index) {
const auto name = reinterpret_cast<const char*>(glGetStringi(GL_EXTENSIONS, index));
if (!std::strcmp(name, "GL_EXT_debug_tool")) {
return true;
}
}
return false;
}
void Driver::ReportDriverInfo() {
// Report the context version and the vendor string
gl_version = std::string_view{reinterpret_cast<const char*>(glGetString(GL_VERSION))};

View file

@ -48,6 +48,9 @@ public:
/// Returns true of the driver has a particular bug stated in the DriverBug enum
bool HasBug(DriverBug bug) const;
/// Returns true if any debug tool is attached
bool HasDebugTool();
/// Returns the vendor of the currently selected physical device
Vendor GetVendor() const {
return vendor;
@ -58,6 +61,11 @@ public:
return gpu_vendor;
}
/// Returns true if the an OpenGLES context is used
bool IsOpenGLES() const noexcept {
return is_gles;
}
/// Returns true if the implementation is suitable for emulation
bool IsSuitable() const {
return is_suitable;
@ -99,6 +107,7 @@ private:
Vendor vendor = Vendor::Unknown;
DriverBug bugs{};
bool is_suitable{};
bool is_gles{};
bool ext_buffer_storage{};
bool arb_buffer_storage{};

View file

@ -5,261 +5,128 @@
#include "common/scope_exit.h"
#include "video_core/renderer_opengl/gl_format_reinterpreter.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_texture_runtime.h"
#include "video_core/host_shaders/format_reinterpreter/d24s8_to_rgba8_frag.h"
#include "video_core/host_shaders/format_reinterpreter/fullscreen_quad_vert.h"
#include "video_core/host_shaders/format_reinterpreter/rgba4_to_rgb5a1_frag.h"
namespace OpenGL {
class RGBA4toRGB5A1 final : public FormatReinterpreterBase {
public:
RGBA4toRGB5A1() {
constexpr std::string_view vs_source = R"(
out vec2 dst_coord;
uniform mediump ivec2 dst_size;
const vec2 vertices[4] =
vec2[4](vec2(-1.0, -1.0), vec2(1.0, -1.0), vec2(-1.0, 1.0), vec2(1.0, 1.0));
void main() {
gl_Position = vec4(vertices[gl_VertexID], 0.0, 1.0);
dst_coord = (vertices[gl_VertexID] / 2.0 + 0.5) * vec2(dst_size);
RGBA4toRGB5A1::RGBA4toRGB5A1() {
program.Create(HostShaders::FULLSCREEN_QUAD_VERT, HostShaders::RGBA4_TO_RGB5A1_FRAG);
dst_size_loc = glGetUniformLocation(program.handle, "dst_size");
src_size_loc = glGetUniformLocation(program.handle, "src_size");
src_offset_loc = glGetUniformLocation(program.handle, "src_offset");
vao.Create();
}
)";
constexpr std::string_view fs_source = R"(
in mediump vec2 dst_coord;
void RGBA4toRGB5A1::Reinterpret(Surface& source, Common::Rectangle<u32> src_rect, Surface& dest,
Common::Rectangle<u32> dst_rect) {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
out lowp vec4 frag_color;
OpenGLState state;
state.texture_units[0].texture_2d = source.Handle();
state.draw.draw_framebuffer = draw_fbo.handle;
state.draw.shader_program = program.handle;
state.draw.vertex_array = vao.handle;
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.Apply();
uniform lowp sampler2D source;
uniform mediump ivec2 dst_size;
uniform mediump ivec2 src_size;
uniform mediump ivec2 src_offset;
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dest.Handle(),
0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
void main() {
mediump ivec2 tex_coord;
if (src_size == dst_size) {
tex_coord = ivec2(dst_coord);
glUniform2i(dst_size_loc, dst_rect.GetWidth(), dst_rect.GetHeight());
glUniform2i(src_size_loc, src_rect.GetWidth(), src_rect.GetHeight());
glUniform2i(src_offset_loc, src_rect.left, src_rect.bottom);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
ShaderD24S8toRGBA8::ShaderD24S8toRGBA8() {
program.Create(HostShaders::FULLSCREEN_QUAD_VERT, HostShaders::D24S8_TO_RGBA8_FRAG);
dst_size_loc = glGetUniformLocation(program.handle, "dst_size");
src_size_loc = glGetUniformLocation(program.handle, "src_size");
src_offset_loc = glGetUniformLocation(program.handle, "src_offset");
vao.Create();
auto state = OpenGLState::GetCurState();
auto cur_program = state.draw.shader_program;
state.draw.shader_program = program.handle;
state.Apply();
glUniform1i(glGetUniformLocation(program.handle, "stencil"), 1);
state.draw.shader_program = cur_program;
state.Apply();
// Nvidia seem to be the only one to support D24S8 views, at least on windows
// so for everyone else it will do an intermediate copy before running through the shader
std::string_view vendor{reinterpret_cast<const char*>(glGetString(GL_VENDOR))};
if (vendor.find("NVIDIA") != vendor.npos) {
use_texture_view = true;
} else {
highp int tex_index = int(dst_coord.y) * dst_size.x + int(dst_coord.x);
mediump int y = tex_index / src_size.x;
tex_coord = ivec2(tex_index - y * src_size.x, y);
LOG_INFO(Render_OpenGL,
"Texture views are unsupported, reinterpretation will do intermediate copy");
temp_tex.Create();
}
tex_coord -= src_offset;
lowp ivec4 rgba4 = ivec4(texelFetch(source, tex_coord, 0) * (exp2(4.0) - 1.0));
lowp ivec3 rgb5 =
((rgba4.rgb << ivec3(1, 2, 3)) | (rgba4.gba >> ivec3(3, 2, 1))) & 0x1F;
frag_color = vec4(vec3(rgb5) / (exp2(5.0) - 1.0), rgba4.a & 0x01);
}
)";
program.Create(vs_source.data(), fs_source.data());
dst_size_loc = glGetUniformLocation(program.handle, "dst_size");
src_size_loc = glGetUniformLocation(program.handle, "src_size");
src_offset_loc = glGetUniformLocation(program.handle, "src_offset");
vao.Create();
}
void ShaderD24S8toRGBA8::Reinterpret(Surface& source, Common::Rectangle<u32> src_rect,
Surface& dest, Common::Rectangle<u32> dst_rect) {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
PixelFormat GetSourceFormat() const override {
return PixelFormat::RGBA4;
}
void Reinterpret(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) override {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
OpenGLState state;
state.texture_units[0].texture_2d = src_tex.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.draw.shader_program = program.handle;
state.draw.vertex_array = vao.handle;
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
dst_tex.handle, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
glUniform2i(dst_size_loc, dst_rect.GetWidth(), dst_rect.GetHeight());
glUniform2i(src_size_loc, src_rect.GetWidth(), src_rect.GetHeight());
glUniform2i(src_offset_loc, src_rect.left, src_rect.bottom);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
private:
OGLProgram program;
GLint dst_size_loc{-1}, src_size_loc{-1}, src_offset_loc{-1};
OGLVertexArray vao;
};
class ShaderD24S8toRGBA8 final : public FormatReinterpreterBase {
public:
ShaderD24S8toRGBA8() {
constexpr std::string_view vs_source = R"(
out vec2 dst_coord;
uniform mediump ivec2 dst_size;
const vec2 vertices[4] =
vec2[4](vec2(-1.0, -1.0), vec2(1.0, -1.0), vec2(-1.0, 1.0), vec2(1.0, 1.0));
void main() {
gl_Position = vec4(vertices[gl_VertexID], 0.0, 1.0);
dst_coord = (vertices[gl_VertexID] / 2.0 + 0.5) * vec2(dst_size);
}
)";
constexpr std::string_view fs_source = R"(
in mediump vec2 dst_coord;
out lowp vec4 frag_color;
uniform highp sampler2D depth;
uniform lowp usampler2D stencil;
uniform mediump ivec2 dst_size;
uniform mediump ivec2 src_size;
uniform mediump ivec2 src_offset;
void main() {
mediump ivec2 tex_coord;
if (src_size == dst_size) {
tex_coord = ivec2(dst_coord);
} else {
highp int tex_index = int(dst_coord.y) * dst_size.x + int(dst_coord.x);
mediump int y = tex_index / src_size.x;
tex_coord = ivec2(tex_index - y * src_size.x, y);
}
tex_coord -= src_offset;
highp uint depth_val =
uint(texelFetch(depth, tex_coord, 0).x * (exp2(32.0) - 1.0));
lowp uint stencil_val = texelFetch(stencil, tex_coord, 0).x;
highp uvec4 components =
uvec4(stencil_val, (uvec3(depth_val) >> uvec3(24u, 16u, 8u)) & 0x000000FFu);
frag_color = vec4(components) / (exp2(8.0) - 1.0);
}
)";
program.Create(vs_source.data(), fs_source.data());
dst_size_loc = glGetUniformLocation(program.handle, "dst_size");
src_size_loc = glGetUniformLocation(program.handle, "src_size");
src_offset_loc = glGetUniformLocation(program.handle, "src_offset");
vao.Create();
auto state = OpenGLState::GetCurState();
auto cur_program = state.draw.shader_program;
state.draw.shader_program = program.handle;
state.Apply();
glUniform1i(glGetUniformLocation(program.handle, "stencil"), 1);
state.draw.shader_program = cur_program;
state.Apply();
// Nvidia seem to be the only one to support D24S8 views, at least on windows
// so for everyone else it will do an intermediate copy before running through the shader
std::string_view vendor{reinterpret_cast<const char*>(glGetString(GL_VENDOR))};
if (vendor.find("NVIDIA") != vendor.npos) {
use_texture_view = true;
} else {
LOG_INFO(Render_OpenGL,
"Texture views are unsupported, reinterpretation will do intermediate copy");
temp_tex.Create();
}
}
PixelFormat GetSourceFormat() const override {
return PixelFormat::D24S8;
}
void Reinterpret(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) override {
OpenGLState prev_state = OpenGLState::GetCurState();
SCOPE_EXIT({ prev_state.Apply(); });
OpenGLState state;
state.texture_units[0].texture_2d = src_tex.handle;
if (use_texture_view) {
temp_tex.Create();
glActiveTexture(GL_TEXTURE1);
glTextureView(temp_tex.handle, GL_TEXTURE_2D, src_tex.handle, GL_DEPTH24_STENCIL8, 0, 1,
0, 1);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
} else if (src_rect.top > temp_rect.top || src_rect.right > temp_rect.right) {
temp_tex.Release();
temp_tex.Create();
state.texture_units[1].texture_2d = temp_tex.handle;
state.Apply();
glActiveTexture(GL_TEXTURE1);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_DEPTH24_STENCIL8, src_rect.right, src_rect.top);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
temp_rect = src_rect;
}
state.texture_units[1].texture_2d = temp_tex.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.draw.shader_program = program.handle;
state.draw.vertex_array = vao.handle;
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.Apply();
OpenGLState state;
state.texture_units[0].texture_2d = source.Handle();
if (use_texture_view) {
temp_tex.Create();
glActiveTexture(GL_TEXTURE1);
if (!use_texture_view) {
glCopyImageSubData(src_tex.handle, GL_TEXTURE_2D, 0, src_rect.left, src_rect.bottom, 0,
temp_tex.handle, GL_TEXTURE_2D, 0, src_rect.left, src_rect.bottom, 0,
src_rect.GetWidth(), src_rect.GetHeight(), 1);
}
glTexParameteri(GL_TEXTURE_2D, GL_DEPTH_STENCIL_TEXTURE_MODE, GL_STENCIL_INDEX);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
dst_tex.handle, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
glUniform2i(dst_size_loc, dst_rect.GetWidth(), dst_rect.GetHeight());
glUniform2i(src_size_loc, src_rect.GetWidth(), src_rect.GetHeight());
glUniform2i(src_offset_loc, src_rect.left, src_rect.bottom);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
if (use_texture_view) {
temp_tex.Release();
}
glTextureView(temp_tex.handle, GL_TEXTURE_2D, source.Handle(), GL_DEPTH24_STENCIL8, 0, 1, 0,
1);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
} else if (src_rect.top > temp_rect.top || src_rect.right > temp_rect.right) {
temp_tex.Release();
temp_tex.Create();
state.texture_units[1].texture_2d = temp_tex.handle;
state.Apply();
glActiveTexture(GL_TEXTURE1);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_DEPTH24_STENCIL8, src_rect.right, src_rect.top);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
temp_rect = src_rect;
}
private:
bool use_texture_view{};
OGLProgram program{};
GLint dst_size_loc{-1}, src_size_loc{-1}, src_offset_loc{-1};
OGLVertexArray vao{};
OGLTexture temp_tex{};
Common::Rectangle<u32> temp_rect{0, 0, 0, 0};
};
state.texture_units[1].texture_2d = temp_tex.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.draw.shader_program = program.handle;
state.draw.vertex_array = vao.handle;
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.Apply();
FormatReinterpreterOpenGL::FormatReinterpreterOpenGL() {
const std::string_view vendor{reinterpret_cast<const char*>(glGetString(GL_VENDOR))};
const std::string_view version{reinterpret_cast<const char*>(glGetString(GL_VERSION))};
glActiveTexture(GL_TEXTURE1);
if (!use_texture_view) {
glCopyImageSubData(source.Handle(), GL_TEXTURE_2D, 0, src_rect.left, src_rect.bottom, 0,
temp_tex.handle, GL_TEXTURE_2D, 0, src_rect.left, src_rect.bottom, 0,
src_rect.GetWidth(), src_rect.GetHeight(), 1);
}
glTexParameteri(GL_TEXTURE_2D, GL_DEPTH_STENCIL_TEXTURE_MODE, GL_STENCIL_INDEX);
auto Register = [this](PixelFormat dest, std::unique_ptr<FormatReinterpreterBase>&& obj) {
const u32 dst_index = static_cast<u32>(dest);
return reinterpreters[dst_index].push_back(std::move(obj));
};
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dest.Handle(),
0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
Register(PixelFormat::RGBA8, std::make_unique<ShaderD24S8toRGBA8>());
LOG_INFO(Render_OpenGL, "Using shader for D24S8 to RGBA8 reinterpretation");
glUniform2i(dst_size_loc, dst_rect.GetWidth(), dst_rect.GetHeight());
glUniform2i(src_size_loc, src_rect.GetWidth(), src_rect.GetHeight());
glUniform2i(src_offset_loc, src_rect.left, src_rect.bottom);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
Register(PixelFormat::RGB5A1, std::make_unique<RGBA4toRGB5A1>());
}
auto FormatReinterpreterOpenGL::GetPossibleReinterpretations(PixelFormat dst_format)
-> const ReinterpreterList& {
return reinterpreters[static_cast<u32>(dst_format)];
temp_tex.Release();
}
} // namespace OpenGL

View file

@ -11,7 +11,7 @@
namespace OpenGL {
class RasterizerCacheOpenGL;
class Surface;
class FormatReinterpreterBase {
public:
@ -22,9 +22,9 @@ public:
virtual ~FormatReinterpreterBase() = default;
virtual PixelFormat GetSourceFormat() const = 0;
virtual void Reinterpret(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) = 0;
virtual VideoCore::PixelFormat GetSourceFormat() const = 0;
virtual void Reinterpret(Surface& source, Common::Rectangle<u32> src_rect, Surface& dest,
Common::Rectangle<u32> dst_rect) = 0;
protected:
OGLFramebuffer read_fbo;
@ -33,15 +33,45 @@ protected:
using ReinterpreterList = std::vector<std::unique_ptr<FormatReinterpreterBase>>;
class FormatReinterpreterOpenGL : NonCopyable {
class RGBA4toRGB5A1 final : public FormatReinterpreterBase {
public:
FormatReinterpreterOpenGL();
~FormatReinterpreterOpenGL() = default;
RGBA4toRGB5A1();
const ReinterpreterList& GetPossibleReinterpretations(PixelFormat dst_format);
VideoCore::PixelFormat GetSourceFormat() const override {
return VideoCore::PixelFormat::RGBA4;
}
void Reinterpret(Surface& source, Common::Rectangle<u32> src_rect, Surface& dest,
Common::Rectangle<u32> dst_rect) override;
private:
std::array<ReinterpreterList, PIXEL_FORMAT_COUNT> reinterpreters;
OGLProgram program;
GLint dst_size_loc{-1};
GLint src_size_loc{-1};
GLint src_offset_loc{-1};
OGLVertexArray vao;
};
class ShaderD24S8toRGBA8 final : public FormatReinterpreterBase {
public:
ShaderD24S8toRGBA8();
VideoCore::PixelFormat GetSourceFormat() const override {
return VideoCore::PixelFormat::D24S8;
}
void Reinterpret(Surface& source, Common::Rectangle<u32> src_rect, Surface& dest,
Common::Rectangle<u32> dst_rect) override;
private:
bool use_texture_view{};
OGLProgram program{};
GLint dst_size_loc{-1};
GLint src_size_loc{-1};
GLint src_offset_loc{-1};
OGLVertexArray vao{};
OGLTexture temp_tex{};
Common::Rectangle<u32> temp_rect{0, 0, 0, 0};
};
} // namespace OpenGL

View file

@ -25,9 +25,10 @@ MICROPROFILE_DEFINE(OpenGL_VAO, "OpenGL", "Vertex Array Setup", MP_RGB(255, 128,
MICROPROFILE_DEFINE(OpenGL_VS, "OpenGL", "Vertex Shader Setup", MP_RGB(192, 128, 128));
MICROPROFILE_DEFINE(OpenGL_GS, "OpenGL", "Geometry Shader Setup", MP_RGB(128, 192, 128));
MICROPROFILE_DEFINE(OpenGL_Drawing, "OpenGL", "Drawing", MP_RGB(128, 128, 192));
MICROPROFILE_DEFINE(OpenGL_Blits, "OpenGL", "Blits", MP_RGB(100, 100, 255));
MICROPROFILE_DEFINE(OpenGL_CacheManagement, "OpenGL", "Cache Mgmt", MP_RGB(100, 255, 100));
using VideoCore::SurfaceType;
constexpr std::size_t VERTEX_BUFFER_SIZE = 16 * 1024 * 1024;
constexpr std::size_t INDEX_BUFFER_SIZE = 2 * 1024 * 1024;
constexpr std::size_t UNIFORM_BUFFER_SIZE = 2 * 1024 * 1024;
@ -62,17 +63,27 @@ GLenum MakeAttributeType(Pica::PipelineRegs::VertexAttributeFormat format) {
return GL_UNSIGNED_BYTE;
}
[[nodiscard]] GLsizeiptr TextureBufferSize() {
// Use the smallest texel size from the texel views
// which corresponds to GL_RG32F
GLint max_texel_buffer_size;
glGetIntegerv(GL_MAX_TEXTURE_BUFFER_SIZE, &max_texel_buffer_size);
LOG_INFO(Render_OpenGL, "Max texture buffer size: {}", max_texel_buffer_size);
return std::min<GLsizeiptr>(max_texel_buffer_size * 8, TEXTURE_BUFFER_SIZE);
}
} // Anonymous namespace
RasterizerOpenGL::RasterizerOpenGL(Memory::MemorySystem& memory, VideoCore::RendererBase& renderer,
Driver& driver_)
: VideoCore::RasterizerAccelerated{memory}, driver{driver_}, res_cache{renderer},
: VideoCore::RasterizerAccelerated{memory}, driver{driver_}, runtime{driver, renderer},
res_cache{memory, runtime, regs, renderer}, texture_buffer_size{TextureBufferSize()},
vertex_buffer{driver, GL_ARRAY_BUFFER, VERTEX_BUFFER_SIZE},
uniform_buffer{driver, GL_UNIFORM_BUFFER, UNIFORM_BUFFER_SIZE},
index_buffer{driver, GL_ELEMENT_ARRAY_BUFFER, INDEX_BUFFER_SIZE},
texture_buffer{driver, GL_TEXTURE_BUFFER, TEXTURE_BUFFER_SIZE}, texture_lf_buffer{
texture_buffer{driver, GL_TEXTURE_BUFFER, texture_buffer_size}, texture_lf_buffer{
driver, GL_TEXTURE_BUFFER,
TEXTURE_BUFFER_SIZE} {
texture_buffer_size} {
// Clipping plane 0 is always enabled for PICA fixed clip plane z <= 0
state.clip_distance[0] = true;
@ -372,11 +383,8 @@ void RasterizerOpenGL::DrawTriangles() {
bool RasterizerOpenGL::Draw(bool accelerate, bool is_indexed) {
MICROPROFILE_SCOPE(OpenGL_Drawing);
bool shadow_rendering = regs.framebuffer.output_merger.fragment_operation_mode ==
Pica::FramebufferRegs::FragmentOperationMode::Shadow;
const bool has_stencil =
regs.framebuffer.framebuffer.depth_format == Pica::FramebufferRegs::DepthFormat::D24S8;
const bool shadow_rendering = regs.framebuffer.IsShadowRendering();
const bool has_stencil = regs.framebuffer.HasStencil();
const bool write_color_fb = shadow_rendering || state.color_mask.red_enabled == GL_TRUE ||
state.color_mask.green_enabled == GL_TRUE ||
@ -394,108 +402,45 @@ bool RasterizerOpenGL::Draw(bool accelerate, bool is_indexed) {
(write_depth_fb || regs.framebuffer.output_merger.depth_test_enable != 0 ||
(has_stencil && state.stencil.test_enabled));
Common::Rectangle<s32> viewport_rect_unscaled{
// These registers hold half-width and half-height, so must be multiplied by 2
regs.rasterizer.viewport_corner.x, // left
regs.rasterizer.viewport_corner.y + // top
static_cast<s32>(Pica::float24::FromRaw(regs.rasterizer.viewport_size_y).ToFloat32() *
2),
regs.rasterizer.viewport_corner.x + // right
static_cast<s32>(Pica::float24::FromRaw(regs.rasterizer.viewport_size_x).ToFloat32() *
2),
regs.rasterizer.viewport_corner.y // bottom
};
Surface color_surface;
Surface depth_surface;
Common::Rectangle<u32> surfaces_rect;
std::tie(color_surface, depth_surface, surfaces_rect) =
res_cache.GetFramebufferSurfaces(using_color_fb, using_depth_fb, viewport_rect_unscaled);
const u16 res_scale = color_surface != nullptr
? color_surface->res_scale
: (depth_surface == nullptr ? 1u : depth_surface->res_scale);
Common::Rectangle<u32> draw_rect{
static_cast<u32>(std::clamp<s32>(static_cast<s32>(surfaces_rect.left) +
viewport_rect_unscaled.left * res_scale,
surfaces_rect.left, surfaces_rect.right)), // Left
static_cast<u32>(std::clamp<s32>(static_cast<s32>(surfaces_rect.bottom) +
viewport_rect_unscaled.top * res_scale,
surfaces_rect.bottom, surfaces_rect.top)), // Top
static_cast<u32>(std::clamp<s32>(static_cast<s32>(surfaces_rect.left) +
viewport_rect_unscaled.right * res_scale,
surfaces_rect.left, surfaces_rect.right)), // Right
static_cast<u32>(std::clamp<s32>(static_cast<s32>(surfaces_rect.bottom) +
viewport_rect_unscaled.bottom * res_scale,
surfaces_rect.bottom, surfaces_rect.top))}; // Bottom
// Bind the framebuffer surfaces
state.draw.draw_framebuffer = framebuffer.handle;
state.Apply();
if (shadow_rendering) {
if (color_surface == nullptr) {
return true;
}
glFramebufferParameteri(GL_DRAW_FRAMEBUFFER, GL_FRAMEBUFFER_DEFAULT_WIDTH,
color_surface->width * color_surface->res_scale);
glFramebufferParameteri(GL_DRAW_FRAMEBUFFER, GL_FRAMEBUFFER_DEFAULT_HEIGHT,
color_surface->height * color_surface->res_scale);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
state.image_shadow_buffer = color_surface->texture.handle;
} else {
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
color_surface != nullptr ? color_surface->texture.handle : 0, 0);
if (depth_surface != nullptr) {
if (has_stencil) {
// attach both depth and stencil
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT,
GL_TEXTURE_2D, depth_surface->texture.handle, 0);
} else {
// attach depth
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D,
depth_surface->texture.handle, 0);
// clear stencil attachment
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
}
} else {
// clear both depth and stencil attachment
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D,
0, 0);
}
const Framebuffer framebuffer =
res_cache.GetFramebufferSurfaces(using_color_fb, using_depth_fb);
const bool has_color = framebuffer.HasAttachment(SurfaceType::Color);
const bool has_depth_stencil = framebuffer.HasAttachment(SurfaceType::DepthStencil);
if (!has_color && (shadow_rendering || !has_depth_stencil)) {
return true;
}
// Sync the viewport
state.viewport.x =
static_cast<GLint>(surfaces_rect.left) + viewport_rect_unscaled.left * res_scale;
state.viewport.y =
static_cast<GLint>(surfaces_rect.bottom) + viewport_rect_unscaled.bottom * res_scale;
state.viewport.width = static_cast<GLsizei>(viewport_rect_unscaled.GetWidth() * res_scale);
state.viewport.height = static_cast<GLsizei>(viewport_rect_unscaled.GetHeight() * res_scale);
// Bind the framebuffer surfaces
if (shadow_rendering) {
state.image_shadow_buffer = framebuffer.Attachment(SurfaceType::Color);
}
state.draw.draw_framebuffer = framebuffer.Handle();
// Sync the viewport
const auto viewport = framebuffer.Viewport();
state.viewport.x = viewport.x;
state.viewport.y = viewport.y;
state.viewport.width = viewport.width;
state.viewport.height = viewport.height;
// Viewport can have negative offsets or larger dimensions than our framebuffer sub-rect.
// Enable scissor test to prevent drawing outside of the framebuffer region
const auto draw_rect = framebuffer.DrawRect();
state.scissor.enabled = true;
state.scissor.x = draw_rect.left;
state.scissor.y = draw_rect.bottom;
state.scissor.width = draw_rect.GetWidth();
state.scissor.height = draw_rect.GetHeight();
state.Apply();
const u32 res_scale = framebuffer.ResolutionScale();
if (uniform_block_data.data.framebuffer_scale != res_scale) {
uniform_block_data.data.framebuffer_scale = res_scale;
uniform_block_data.dirty = true;
}
// Scissor checks are window-, not viewport-relative, which means that if the cached texture
// sub-rect changes, the scissor bounds also need to be updated.
GLint scissor_x1 =
static_cast<GLint>(surfaces_rect.left + regs.rasterizer.scissor_test.x1 * res_scale);
GLint scissor_y1 =
static_cast<GLint>(surfaces_rect.bottom + regs.rasterizer.scissor_test.y1 * res_scale);
// x2, y2 have +1 added to cover the entire pixel area, otherwise you might get cracks when
// scaling or doing multisampling.
GLint scissor_x2 =
static_cast<GLint>(surfaces_rect.left + (regs.rasterizer.scissor_test.x2 + 1) * res_scale);
GLint scissor_y2 = static_cast<GLint>(surfaces_rect.bottom +
(regs.rasterizer.scissor_test.y2 + 1) * res_scale);
// Update scissor uniforms
const auto [scissor_x1, scissor_y2, scissor_x2, scissor_y1] = framebuffer.Scissor();
if (uniform_block_data.data.scissor_x1 != scissor_x1 ||
uniform_block_data.data.scissor_x2 != scissor_x2 ||
uniform_block_data.data.scissor_y1 != scissor_y1 ||
@ -508,127 +453,8 @@ bool RasterizerOpenGL::Draw(bool accelerate, bool is_indexed) {
uniform_block_data.dirty = true;
}
bool need_duplicate_texture = false;
auto CheckBarrier = [&need_duplicate_texture, &color_surface](GLuint handle) {
if (color_surface && color_surface->texture.handle == handle) {
need_duplicate_texture = true;
}
};
const auto BindCubeFace = [&](GLuint& target, Pica::TexturingRegs::CubeFace face,
Pica::Texture::TextureInfo& info) {
info.physical_address = regs.texturing.GetCubePhysicalAddress(face);
Surface surface = res_cache.GetTextureSurface(info);
if (surface != nullptr) {
CheckBarrier(target = surface->texture.handle);
} else {
target = 0;
}
};
// Sync and bind the texture surfaces
const auto pica_textures = regs.texturing.GetTextures();
for (unsigned texture_index = 0; texture_index < pica_textures.size(); ++texture_index) {
const auto& texture = pica_textures[texture_index];
if (texture.enabled) {
if (texture_index == 0) {
using TextureType = Pica::TexturingRegs::TextureConfig::TextureType;
switch (texture.config.type.Value()) {
case TextureType::Shadow2D: {
Surface surface = res_cache.GetTextureSurface(texture);
if (surface != nullptr) {
CheckBarrier(state.image_shadow_texture_px = surface->texture.handle);
} else {
state.image_shadow_texture_px = 0;
}
continue;
}
case TextureType::ShadowCube: {
using CubeFace = Pica::TexturingRegs::CubeFace;
auto info = Pica::Texture::TextureInfo::FromPicaRegister(texture.config,
texture.format);
BindCubeFace(state.image_shadow_texture_px, CubeFace::PositiveX, info);
BindCubeFace(state.image_shadow_texture_nx, CubeFace::NegativeX, info);
BindCubeFace(state.image_shadow_texture_py, CubeFace::PositiveY, info);
BindCubeFace(state.image_shadow_texture_ny, CubeFace::NegativeY, info);
BindCubeFace(state.image_shadow_texture_pz, CubeFace::PositiveZ, info);
BindCubeFace(state.image_shadow_texture_nz, CubeFace::NegativeZ, info);
continue;
}
case TextureType::TextureCube:
using CubeFace = Pica::TexturingRegs::CubeFace;
TextureCubeConfig config;
config.px = regs.texturing.GetCubePhysicalAddress(CubeFace::PositiveX);
config.nx = regs.texturing.GetCubePhysicalAddress(CubeFace::NegativeX);
config.py = regs.texturing.GetCubePhysicalAddress(CubeFace::PositiveY);
config.ny = regs.texturing.GetCubePhysicalAddress(CubeFace::NegativeY);
config.pz = regs.texturing.GetCubePhysicalAddress(CubeFace::PositiveZ);
config.nz = regs.texturing.GetCubePhysicalAddress(CubeFace::NegativeZ);
config.width = texture.config.width;
config.format = texture.format;
state.texture_cube_unit.texture_cube =
res_cache.GetTextureCube(config).texture.handle;
texture_cube_sampler.SyncWithConfig(texture.config);
state.texture_units[texture_index].texture_2d = 0;
continue; // Texture unit 0 setup finished. Continue to next unit
default:
break;
}
state.texture_cube_unit.texture_cube = 0;
}
texture_samplers[texture_index].SyncWithConfig(texture.config);
Surface surface = res_cache.GetTextureSurface(texture);
if (surface != nullptr) {
CheckBarrier(state.texture_units[texture_index].texture_2d =
surface->texture.handle);
} else {
// Can occur when texture addr is null or its memory is unmapped/invalid
// HACK: In this case, the correct behaviour for the PICA is to use the last
// rendered colour. But because this would be impractical to implement, the
// next best alternative is to use a clear texture, essentially skipping
// the geometry in question.
// For example: a bug in Pokemon X/Y causes NULL-texture squares to be drawn
// on the male character's face, which in the OpenGL default appear black.
state.texture_units[texture_index].texture_2d = default_texture;
}
} else {
state.texture_units[texture_index].texture_2d = 0;
}
}
OGLTexture temp_tex;
if (need_duplicate_texture) {
const auto& tuple = GetFormatTuple(color_surface->pixel_format);
const GLsizei levels = color_surface->max_level + 1;
// The game is trying to use a surface as a texture and framebuffer at the same time
// which causes unpredictable behavior on the host.
// Making a copy to sample from eliminates this issue and seems to be fairly cheap.
temp_tex.Create();
temp_tex.Allocate(GL_TEXTURE_2D, levels, tuple.internal_format,
color_surface->GetScaledWidth(), color_surface->GetScaledHeight());
temp_tex.CopyFrom(color_surface->texture, GL_TEXTURE_2D, levels,
color_surface->GetScaledWidth(), color_surface->GetScaledHeight());
for (auto& unit : state.texture_units) {
if (unit.texture_2d == color_surface->texture.handle) {
unit.texture_2d = temp_tex.handle;
}
}
for (auto shadow_unit : {&state.image_shadow_texture_nx, &state.image_shadow_texture_ny,
&state.image_shadow_texture_nz, &state.image_shadow_texture_px,
&state.image_shadow_texture_py, &state.image_shadow_texture_pz}) {
if (*shadow_unit == color_surface->texture.handle) {
*shadow_unit = temp_tex.handle;
}
}
}
SyncTextureUnits(framebuffer);
// Sync and bind the shader
if (shader_dirty) {
@ -643,17 +469,6 @@ bool RasterizerOpenGL::Draw(bool accelerate, bool is_indexed) {
// Sync the uniform data
UploadUniforms(accelerate);
// Viewport can have negative offsets or larger
// dimensions than our framebuffer sub-rect.
// Enable scissor test to prevent drawing
// outside of the framebuffer region
state.scissor.enabled = true;
state.scissor.x = draw_rect.left;
state.scissor.y = draw_rect.bottom;
state.scissor.width = draw_rect.GetWidth();
state.scissor.height = draw_rect.GetHeight();
state.Apply();
// Draw the vertex batch
bool succeeded = true;
if (accelerate) {
@ -671,12 +486,11 @@ bool RasterizerOpenGL::Draw(bool accelerate, bool is_indexed) {
base_vertex += max_vertices) {
const std::size_t vertices = std::min(max_vertices, vertex_batch.size() - base_vertex);
const std::size_t vertex_size = vertices * sizeof(HardwareVertex);
u8* vbo;
GLintptr offset;
std::tie(vbo, offset, std::ignore) =
vertex_buffer.Map(vertex_size, sizeof(HardwareVertex));
const auto [vbo, offset, _] = vertex_buffer.Map(vertex_size, sizeof(HardwareVertex));
std::memcpy(vbo, vertex_batch.data() + base_vertex, vertex_size);
vertex_buffer.Unmap(vertex_size);
glDrawArrays(GL_TRIANGLES, static_cast<GLint>(offset / sizeof(HardwareVertex)),
static_cast<GLsizei>(vertices));
}
@ -684,10 +498,133 @@ bool RasterizerOpenGL::Draw(bool accelerate, bool is_indexed) {
vertex_batch.clear();
// Reset textures in rasterizer state context because the rasterizer cache might delete them
for (unsigned texture_index = 0; texture_index < pica_textures.size(); ++texture_index) {
state.texture_units[texture_index].texture_2d = 0;
if (shadow_rendering) {
glMemoryBarrier(GL_TEXTURE_FETCH_BARRIER_BIT | GL_SHADER_IMAGE_ACCESS_BARRIER_BIT |
GL_TEXTURE_UPDATE_BARRIER_BIT | GL_FRAMEBUFFER_BARRIER_BIT);
}
res_cache.InvalidateFramebuffer(framebuffer);
return succeeded;
}
void RasterizerOpenGL::SyncTextureUnits(const Framebuffer& framebuffer) {
using TextureType = Pica::TexturingRegs::TextureConfig::TextureType;
const auto pica_textures = regs.texturing.GetTextures();
for (u32 texture_index = 0; texture_index < pica_textures.size(); ++texture_index) {
const auto& texture = pica_textures[texture_index];
// If the texture unit is disabled unbind the corresponding gl unit
if (!texture.enabled) {
state.texture_units[texture_index].texture_2d = 0;
continue;
}
// Handle special tex0 configurations
if (texture_index == 0) {
switch (texture.config.type.Value()) {
case TextureType::Shadow2D: {
auto surface = res_cache.GetTextureSurface(texture);
state.image_shadow_texture_px = surface->Handle();
continue;
}
case TextureType::ShadowCube: {
BindShadowCube(texture);
continue;
}
case TextureType::TextureCube: {
BindTextureCube(texture);
continue;
}
default:
UnbindSpecial();
}
}
// Sync texture unit sampler
texture_samplers[texture_index].SyncWithConfig(texture.config);
// Bind the texture provided by the rasterizer cache
auto surface = res_cache.GetTextureSurface(texture);
if (!surface) {
// Can occur when texture addr is null or its memory is unmapped/invalid
// HACK: In this case, the correct behaviour for the PICA is to use the last
// rendered colour. But because this would be impractical to implement, the
// next best alternative is to use a clear texture, essentially skipping
// the geometry in question.
// For example: a bug in Pokemon X/Y causes NULL-texture squares to be drawn
// on the male character's face, which in the OpenGL default appear black.
state.texture_units[texture_index].texture_2d = default_texture;
} else if (!IsFeedbackLoop(texture_index, framebuffer, *surface)) {
state.texture_units[texture_index].texture_2d = surface->Handle();
}
}
}
void RasterizerOpenGL::BindShadowCube(const Pica::TexturingRegs::FullTextureConfig& texture) {
using CubeFace = Pica::TexturingRegs::CubeFace;
auto info = Pica::Texture::TextureInfo::FromPicaRegister(texture.config, texture.format);
constexpr std::array faces = {
CubeFace::PositiveX, CubeFace::NegativeX, CubeFace::PositiveY,
CubeFace::NegativeY, CubeFace::PositiveZ, CubeFace::NegativeZ,
};
for (CubeFace face : faces) {
const u32 binding = static_cast<u32>(face);
info.physical_address = regs.texturing.GetCubePhysicalAddress(face);
auto surface = res_cache.GetTextureSurface(info);
state.image_shadow_texture[binding] = surface->Handle();
}
}
void RasterizerOpenGL::BindTextureCube(const Pica::TexturingRegs::FullTextureConfig& texture) {
using CubeFace = Pica::TexturingRegs::CubeFace;
const VideoCore::TextureCubeConfig config = {
.px = regs.texturing.GetCubePhysicalAddress(CubeFace::PositiveX),
.nx = regs.texturing.GetCubePhysicalAddress(CubeFace::NegativeX),
.py = regs.texturing.GetCubePhysicalAddress(CubeFace::PositiveY),
.ny = regs.texturing.GetCubePhysicalAddress(CubeFace::NegativeY),
.pz = regs.texturing.GetCubePhysicalAddress(CubeFace::PositiveZ),
.nz = regs.texturing.GetCubePhysicalAddress(CubeFace::NegativeZ),
.width = texture.config.width,
.levels = texture.config.lod.max_level + 1,
.format = texture.format,
};
auto surface = res_cache.GetTextureCube(config);
texture_cube_sampler.SyncWithConfig(texture.config);
state.texture_cube_unit.texture_cube = surface->Handle();
state.texture_units[0].texture_2d = 0;
}
bool RasterizerOpenGL::IsFeedbackLoop(u32 texture_index, const Framebuffer& framebuffer,
Surface& surface) {
const GLuint color_attachment = framebuffer.Attachment(SurfaceType::Color);
const bool is_feedback_loop = color_attachment == surface.Handle();
if (!is_feedback_loop) {
return false;
}
// Make a temporary copy of the framebuffer to sample from
Surface temp_surface{runtime, framebuffer.ColorParams()};
const VideoCore::TextureCopy copy = {
.src_level = 0,
.dst_level = 0,
.src_layer = 0,
.dst_layer = 0,
.src_offset = {0, 0},
.dst_offset = {0, 0},
.extent = {temp_surface.GetScaledWidth(), temp_surface.GetScaledHeight()},
};
runtime.CopyTextures(surface, temp_surface, copy);
state.texture_units[texture_index].texture_2d = temp_surface.Handle();
return true;
}
void RasterizerOpenGL::UnbindSpecial() {
state.texture_cube_unit.texture_cube = 0;
state.image_shadow_texture_px = 0;
state.image_shadow_texture_nx = 0;
@ -697,29 +634,6 @@ bool RasterizerOpenGL::Draw(bool accelerate, bool is_indexed) {
state.image_shadow_texture_nz = 0;
state.image_shadow_buffer = 0;
state.Apply();
if (shadow_rendering) {
glMemoryBarrier(GL_TEXTURE_FETCH_BARRIER_BIT | GL_SHADER_IMAGE_ACCESS_BARRIER_BIT |
GL_TEXTURE_UPDATE_BARRIER_BIT | GL_FRAMEBUFFER_BARRIER_BIT);
}
// Mark framebuffer surfaces as dirty
Common::Rectangle<u32> draw_rect_unscaled{draw_rect.left / res_scale, draw_rect.top / res_scale,
draw_rect.right / res_scale,
draw_rect.bottom / res_scale};
if (color_surface != nullptr && write_color_fb) {
auto interval = color_surface->GetSubRectInterval(draw_rect_unscaled);
res_cache.InvalidateRegion(boost::icl::first(interval), boost::icl::length(interval),
color_surface);
}
if (depth_surface != nullptr && write_depth_fb) {
auto interval = depth_surface->GetSubRectInterval(draw_rect_unscaled);
res_cache.InvalidateRegion(boost::icl::first(interval), boost::icl::length(interval),
depth_surface);
}
return succeeded;
}
void RasterizerOpenGL::NotifyFixedFunctionPicaRegisterChanged(u32 id) {
@ -815,148 +729,15 @@ void RasterizerOpenGL::ClearAll(bool flush) {
}
bool RasterizerOpenGL::AccelerateDisplayTransfer(const GPU::Regs::DisplayTransferConfig& config) {
MICROPROFILE_SCOPE(OpenGL_Blits);
SurfaceParams src_params;
src_params.addr = config.GetPhysicalInputAddress();
src_params.width = config.output_width;
src_params.stride = config.input_width;
src_params.height = config.output_height;
src_params.is_tiled = !config.input_linear;
src_params.pixel_format = PixelFormatFromGPUPixelFormat(config.input_format);
src_params.UpdateParams();
SurfaceParams dst_params;
dst_params.addr = config.GetPhysicalOutputAddress();
dst_params.width = config.scaling != config.NoScale ? config.output_width.Value() / 2
: config.output_width.Value();
dst_params.height = config.scaling == config.ScaleXY ? config.output_height.Value() / 2
: config.output_height.Value();
dst_params.is_tiled = config.input_linear != config.dont_swizzle;
dst_params.pixel_format = PixelFormatFromGPUPixelFormat(config.output_format);
dst_params.UpdateParams();
Common::Rectangle<u32> src_rect;
Surface src_surface;
std::tie(src_surface, src_rect) =
res_cache.GetSurfaceSubRect(src_params, ScaleMatch::Ignore, true);
if (src_surface == nullptr)
return false;
dst_params.res_scale = src_surface->res_scale;
Common::Rectangle<u32> dst_rect;
Surface dst_surface;
std::tie(dst_surface, dst_rect) =
res_cache.GetSurfaceSubRect(dst_params, ScaleMatch::Upscale, false);
if (dst_surface == nullptr)
return false;
if (src_surface->is_tiled != dst_surface->is_tiled)
std::swap(src_rect.top, src_rect.bottom);
if (config.flip_vertically)
std::swap(src_rect.top, src_rect.bottom);
if (!res_cache.BlitSurfaces(src_surface, src_rect, dst_surface, dst_rect))
return false;
res_cache.InvalidateRegion(dst_params.addr, dst_params.size, dst_surface);
return true;
return res_cache.AccelerateDisplayTransfer(config);
}
bool RasterizerOpenGL::AccelerateTextureCopy(const GPU::Regs::DisplayTransferConfig& config) {
u32 copy_size = Common::AlignDown(config.texture_copy.size, 16);
if (copy_size == 0) {
return false;
}
u32 input_gap = config.texture_copy.input_gap * 16;
u32 input_width = config.texture_copy.input_width * 16;
if (input_width == 0 && input_gap != 0) {
return false;
}
if (input_gap == 0 || input_width >= copy_size) {
input_width = copy_size;
input_gap = 0;
}
if (copy_size % input_width != 0) {
return false;
}
u32 output_gap = config.texture_copy.output_gap * 16;
u32 output_width = config.texture_copy.output_width * 16;
if (output_width == 0 && output_gap != 0) {
return false;
}
if (output_gap == 0 || output_width >= copy_size) {
output_width = copy_size;
output_gap = 0;
}
if (copy_size % output_width != 0) {
return false;
}
SurfaceParams src_params;
src_params.addr = config.GetPhysicalInputAddress();
src_params.stride = input_width + input_gap; // stride in bytes
src_params.width = input_width; // width in bytes
src_params.height = copy_size / input_width;
src_params.size = ((src_params.height - 1) * src_params.stride) + src_params.width;
src_params.end = src_params.addr + src_params.size;
Common::Rectangle<u32> src_rect;
Surface src_surface;
std::tie(src_surface, src_rect) = res_cache.GetTexCopySurface(src_params);
if (src_surface == nullptr) {
return false;
}
if (output_gap != 0 &&
(output_width != src_surface->BytesInPixels(src_rect.GetWidth() / src_surface->res_scale) *
(src_surface->is_tiled ? 8 : 1) ||
output_gap % src_surface->BytesInPixels(src_surface->is_tiled ? 64 : 1) != 0)) {
return false;
}
SurfaceParams dst_params = *src_surface;
dst_params.addr = config.GetPhysicalOutputAddress();
dst_params.width = src_rect.GetWidth() / src_surface->res_scale;
dst_params.stride = dst_params.width + src_surface->PixelsInBytes(
src_surface->is_tiled ? output_gap / 8 : output_gap);
dst_params.height = src_rect.GetHeight() / src_surface->res_scale;
dst_params.res_scale = src_surface->res_scale;
dst_params.UpdateParams();
// Since we are going to invalidate the gap if there is one, we will have to load it first
const bool load_gap = output_gap != 0;
Common::Rectangle<u32> dst_rect;
Surface dst_surface;
std::tie(dst_surface, dst_rect) =
res_cache.GetSurfaceSubRect(dst_params, ScaleMatch::Upscale, load_gap);
if (dst_surface == nullptr) {
return false;
}
if (dst_surface->type == SurfaceType::Texture) {
return false;
}
if (!res_cache.BlitSurfaces(src_surface, src_rect, dst_surface, dst_rect)) {
return false;
}
res_cache.InvalidateRegion(dst_params.addr, dst_params.size, dst_surface);
return true;
return res_cache.AccelerateTextureCopy(config);
}
bool RasterizerOpenGL::AccelerateFill(const GPU::Regs::MemoryFillConfig& config) {
Surface dst_surface = res_cache.GetFillSurface(config);
if (dst_surface == nullptr)
return false;
res_cache.InvalidateRegion(dst_surface->addr, dst_surface->size, dst_surface);
return true;
return res_cache.AccelerateFill(config);
}
bool RasterizerOpenGL::AccelerateDisplay(const GPU::Regs::FramebufferConfig& config,
@ -967,32 +748,30 @@ bool RasterizerOpenGL::AccelerateDisplay(const GPU::Regs::FramebufferConfig& con
}
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
SurfaceParams src_params;
VideoCore::SurfaceParams src_params;
src_params.addr = framebuffer_addr;
src_params.width = std::min(config.width.Value(), pixel_stride);
src_params.height = config.height;
src_params.stride = pixel_stride;
src_params.is_tiled = false;
src_params.pixel_format = PixelFormatFromGPUPixelFormat(config.color_format);
src_params.pixel_format = VideoCore::PixelFormatFromGPUPixelFormat(config.color_format);
src_params.UpdateParams();
Common::Rectangle<u32> src_rect;
Surface src_surface;
std::tie(src_surface, src_rect) =
res_cache.GetSurfaceSubRect(src_params, ScaleMatch::Ignore, true);
auto [src_surface, src_rect] =
res_cache.GetSurfaceSubRect(src_params, VideoCore::ScaleMatch::Ignore, true);
if (src_surface == nullptr) {
return false;
}
u32 scaled_width = src_surface->GetScaledWidth();
u32 scaled_height = src_surface->GetScaledHeight();
const u32 scaled_width = src_surface->GetScaledWidth();
const u32 scaled_height = src_surface->GetScaledHeight();
screen_info.display_texcoords = Common::Rectangle<float>(
(float)src_rect.bottom / (float)scaled_height, (float)src_rect.left / (float)scaled_width,
(float)src_rect.top / (float)scaled_height, (float)src_rect.right / (float)scaled_width);
screen_info.display_texture = src_surface->texture.handle;
screen_info.display_texture = src_surface->Handle();
return true;
}
@ -1003,7 +782,6 @@ void RasterizerOpenGL::SamplerInfo::Create() {
wrap_s = wrap_t = TextureConfig::Repeat;
border_color = 0;
lod_min = lod_max = 0;
lod_bias = 0;
// default is 1000 and -1000
// Other attributes have correct defaults
@ -1021,22 +799,11 @@ void RasterizerOpenGL::SamplerInfo::SyncWithConfig(
glSamplerParameteri(s, GL_TEXTURE_MAG_FILTER, PicaToGL::TextureMagFilterMode(mag_filter));
}
// TODO(wwylele): remove new_supress_mipmap_for_cube logic once mipmap for cube is implemented
bool new_supress_mipmap_for_cube =
config.type == Pica::TexturingRegs::TextureConfig::TextureCube;
if (min_filter != config.min_filter || mip_filter != config.mip_filter ||
supress_mipmap_for_cube != new_supress_mipmap_for_cube) {
if (min_filter != config.min_filter || mip_filter != config.mip_filter) {
min_filter = config.min_filter;
mip_filter = config.mip_filter;
supress_mipmap_for_cube = new_supress_mipmap_for_cube;
if (new_supress_mipmap_for_cube) {
// HACK: use mag filter converter for min filter because they are the same anyway
glSamplerParameteri(s, GL_TEXTURE_MIN_FILTER,
PicaToGL::TextureMagFilterMode(min_filter));
} else {
glSamplerParameteri(s, GL_TEXTURE_MIN_FILTER,
PicaToGL::TextureMinFilterMode(min_filter, mip_filter));
}
glSamplerParameteri(s, GL_TEXTURE_MIN_FILTER,
PicaToGL::TextureMinFilterMode(min_filter, mip_filter));
}
if (wrap_s != config.wrap_s) {
@ -1065,11 +832,6 @@ void RasterizerOpenGL::SamplerInfo::SyncWithConfig(
lod_max = config.lod.max_level;
glSamplerParameterf(s, GL_TEXTURE_MAX_LOD, static_cast<float>(lod_max));
}
if (!GLES && lod_bias != config.lod.bias) {
lod_bias = config.lod.bias;
glSamplerParameterf(s, GL_TEXTURE_LOD_BIAS, lod_bias / 256.0f);
}
}
void RasterizerOpenGL::SetShader() {

View file

@ -12,7 +12,7 @@
#include "video_core/renderer_opengl/gl_shader_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_stream_buffer.h"
#include "video_core/shader/shader.h"
#include "video_core/renderer_opengl/gl_texture_runtime.h"
namespace VideoCore {
class RendererBase;
@ -69,10 +69,6 @@ private:
u32 border_color;
u32 lod_min;
u32 lod_max;
s32 lod_bias;
// TODO(wwylele): remove this once mipmap for cube is implemented
bool supress_mipmap_for_cube = false;
};
/// Syncs the clip enabled status to match the PICA register
@ -115,6 +111,21 @@ private:
void SyncAndUploadLUTs();
void SyncAndUploadLUTsLF();
/// Syncs all enabled PICA texture units
void SyncTextureUnits(const Framebuffer& framebuffer);
/// Binds the PICA shadow cube required for shadow mapping
void BindShadowCube(const Pica::TexturingRegs::FullTextureConfig& texture);
/// Binds a texture cube to texture unit 0
void BindTextureCube(const Pica::TexturingRegs::FullTextureConfig& texture);
/// Makes a temporary copy of the framebuffer if a feedback loop is detected
bool IsFeedbackLoop(u32 texture_index, const Framebuffer& framebuffer, Surface& surface);
/// Unbinds all special texture unit 0 texture configurations
void UnbindSpecial();
/// Upload the uniform blocks to the uniform buffer object
void UploadUniforms(bool accelerate_draw);
@ -138,7 +149,8 @@ private:
Driver& driver;
OpenGLState state;
GLuint default_texture;
RasterizerCacheOpenGL res_cache;
TextureRuntime runtime;
VideoCore::RasterizerCache res_cache;
std::unique_ptr<ShaderProgramManager> shader_program_manager;
OGLVertexArray sw_vao; // VAO for software shader draw
@ -146,6 +158,7 @@ private:
std::array<bool, 16> hw_vao_enabled_attributes{};
std::array<SamplerInfo, 3> texture_samplers;
GLsizeiptr texture_buffer_size;
OGLStreamBuffer vertex_buffer;
OGLStreamBuffer uniform_buffer;
OGLStreamBuffer index_buffer;

View file

@ -83,20 +83,6 @@ void OGLTexture::Allocate(GLenum target, GLsizei levels, GLenum internalformat,
glBindTexture(GL_TEXTURE_2D, old_tex);
}
void OGLTexture::CopyFrom(const OGLTexture& other, GLenum target, GLsizei levels, GLsizei width,
GLsizei height) {
GLuint old_tex = OpenGLState::GetCurState().texture_units[0].texture_2d;
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, handle);
for (GLsizei level = 0; level < levels; level++) {
glCopyImageSubData(other.handle, target, level, 0, 0, 0, handle, target, level, 0, 0, 0,
width >> level, height >> level, 1);
}
glBindTexture(GL_TEXTURE_2D, old_tex);
}
void OGLSampler::Create() {
if (handle != 0) {
return;
@ -117,10 +103,10 @@ void OGLSampler::Release() {
handle = 0;
}
void OGLShader::Create(const char* source, GLenum type) {
void OGLShader::Create(std::string_view source, GLenum type) {
if (handle != 0)
return;
if (source == nullptr)
if (source.empty())
return;
MICROPROFILE_SCOPE(OpenGL_ResourceCreation);
@ -144,7 +130,7 @@ void OGLProgram::Create(bool separable_program, const std::vector<GLuint>& shade
handle = LoadProgram(separable_program, shaders);
}
void OGLProgram::Create(const char* vert_shader, const char* frag_shader) {
void OGLProgram::Create(std::string_view vert_shader, std::string_view frag_shader) {
OGLShader vert, frag;
vert.Create(vert_shader, GL_VERTEX_SHADER);
frag.Create(frag_shader, GL_FRAGMENT_SHADER);

View file

@ -61,9 +61,6 @@ public:
void Allocate(GLenum target, GLsizei levels, GLenum internalformat, GLsizei width,
GLsizei height = 1, GLsizei depth = 1);
void CopyFrom(const OGLTexture& other, GLenum target, GLsizei levels, GLsizei width,
GLsizei height);
GLuint handle = 0;
};
@ -108,7 +105,7 @@ public:
return *this;
}
void Create(const char* source, GLenum type);
void Create(std::string_view source, GLenum type);
void Release();
@ -135,7 +132,7 @@ public:
void Create(bool separable_program, const std::vector<GLuint>& shaders);
/// Creates a new program from given shader soruce code
void Create(const char* vert_shader, const char* frag_shader);
void Create(std::string_view vert_shader, std::string_view frag_shader);
/// Deletes the internal OpenGL resource
void Release();

View file

@ -262,7 +262,8 @@ static std::string SampleTexture(const PicaFSConfig& config, unsigned texture_un
// Only unit 0 respects the texturing type
switch (state.texture0_type) {
case TexturingRegs::TextureConfig::Texture2D:
return "textureLod(tex0, texcoord0, getLod(texcoord0 * vec2(textureSize(tex0, 0))))";
return "textureLod(tex0, texcoord0, getLod(texcoord0 * vec2(textureSize(tex0, 0))) + "
"tex_lod_bias[0])";
case TexturingRegs::TextureConfig::Projection2D:
// TODO (wwylele): find the exact LOD formula for projection texture
return "textureProj(tex0, vec3(texcoord0, texcoord0_w))";
@ -280,12 +281,15 @@ static std::string SampleTexture(const PicaFSConfig& config, unsigned texture_un
return "texture(tex0, texcoord0)";
}
case 1:
return "textureLod(tex1, texcoord1, getLod(texcoord1 * vec2(textureSize(tex1, 0))))";
return "textureLod(tex1, texcoord1, getLod(texcoord1 * vec2(textureSize(tex1, 0))) + "
"tex_lod_bias[1])";
case 2:
if (state.texture2_use_coord1)
return "textureLod(tex2, texcoord1, getLod(texcoord1 * vec2(textureSize(tex2, 0))))";
return "textureLod(tex2, texcoord1, getLod(texcoord1 * vec2(textureSize(tex2, 0))) + "
"tex_lod_bias[2])";
else
return "textureLod(tex2, texcoord2, getLod(texcoord2 * vec2(textureSize(tex2, 0))))";
return "textureLod(tex2, texcoord2, getLod(texcoord2 * vec2(textureSize(tex2, 0))) + "
"tex_lod_bias[2])";
case 3:
if (state.proctex.enable) {
return "ProcTex()";

View file

@ -13,7 +13,7 @@
namespace OpenGL {
GLuint LoadShader(const char* source, GLenum type) {
GLuint LoadShader(std::string_view source, GLenum type) {
const std::string version = GLES ? R"(#version 320 es
#define CITRA_GLES
@ -28,7 +28,7 @@ GLuint LoadShader(const char* source, GLenum type) {
)"
: "#version 430 core\n";
const char* debug_type;
std::string_view debug_type;
switch (type) {
case GL_VERTEX_SHADER:
debug_type = "vertex";
@ -43,9 +43,11 @@ GLuint LoadShader(const char* source, GLenum type) {
UNREACHABLE();
}
std::array<const char*, 2> src_arr{version.data(), source};
std::array<const GLchar*, 2> src_arr{version.data(), source.data()};
std::array<GLint, 2> lengths{static_cast<GLint>(version.size()),
static_cast<GLint>(source.size())};
GLuint shader_id = glCreateShader(type);
glShaderSource(shader_id, static_cast<GLsizei>(src_arr.size()), src_arr.data(), nullptr);
glShaderSource(shader_id, static_cast<GLsizei>(src_arr.size()), src_arr.data(), lengths.data());
LOG_DEBUG(Render_OpenGL, "Compiling {} shader...", debug_type);
glCompileShader(shader_id);

View file

@ -29,7 +29,7 @@ precision mediump uimage2D;
* @param source String of the GLSL shader program
* @param type Type of the shader (GL_VERTEX_SHADER, GL_GEOMETRY_SHADER or GL_FRAGMENT_SHADER)
*/
GLuint LoadShader(const char* source, GLenum type);
GLuint LoadShader(std::string_view source, GLenum type);
/**
* Utility function to create and link an OpenGL GLSL shader program

View file

@ -116,12 +116,17 @@ public:
// GL_IMAGE_BINDING_NAME
GLuint image_shadow_buffer;
GLuint image_shadow_texture_px;
GLuint image_shadow_texture_nx;
GLuint image_shadow_texture_py;
GLuint image_shadow_texture_ny;
GLuint image_shadow_texture_pz;
GLuint image_shadow_texture_nz;
union {
std::array<GLuint, 6> image_shadow_texture;
struct {
GLuint image_shadow_texture_px;
GLuint image_shadow_texture_nx;
GLuint image_shadow_texture_py;
GLuint image_shadow_texture_ny;
GLuint image_shadow_texture_pz;
GLuint image_shadow_texture_nz;
};
};
struct {
GLuint read_framebuffer; // GL_READ_FRAMEBUFFER_BINDING

View file

@ -54,7 +54,7 @@ GLsizeiptr OGLStreamBuffer::GetSize() const {
}
std::tuple<u8*, GLintptr, bool> OGLStreamBuffer::Map(GLsizeiptr size, GLintptr alignment) {
ASSERT(size <= buffer_size);
ASSERT_MSG(size <= buffer_size, "Requested size {} exceeds buffer size {}", size, buffer_size);
ASSERT(alignment <= buffer_size);
mapped_size = size;

View file

@ -0,0 +1,598 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "common/scope_exit.h"
#include "common/settings.h"
#include "video_core/regs.h"
#include "video_core/renderer_base.h"
#include "video_core/renderer_opengl/gl_driver.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_texture_runtime.h"
namespace OpenGL {
namespace {
using VideoCore::PixelFormat;
using VideoCore::SurfaceType;
constexpr FormatTuple DEFAULT_TUPLE = {GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE};
static constexpr std::array<FormatTuple, 4> DEPTH_TUPLES = {{
{GL_DEPTH_COMPONENT16, GL_DEPTH_COMPONENT, GL_UNSIGNED_SHORT}, // D16
{},
{GL_DEPTH_COMPONENT24, GL_DEPTH_COMPONENT, GL_UNSIGNED_INT}, // D24
{GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8}, // D24S8
}};
static constexpr std::array<FormatTuple, 5> COLOR_TUPLES = {{
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8}, // RGBA8
{GL_RGB8, GL_BGR, GL_UNSIGNED_BYTE}, // RGB8
{GL_RGB5_A1, GL_RGBA, GL_UNSIGNED_SHORT_5_5_5_1}, // RGB5A1
{GL_RGB565, GL_RGB, GL_UNSIGNED_SHORT_5_6_5}, // RGB565
{GL_RGBA4, GL_RGBA, GL_UNSIGNED_SHORT_4_4_4_4}, // RGBA4
}};
static constexpr std::array<FormatTuple, 5> COLOR_TUPLES_OES = {{
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE}, // RGBA8
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE}, // RGB8
{GL_RGB5_A1, GL_RGBA, GL_UNSIGNED_SHORT_5_5_5_1}, // RGB5A1
{GL_RGB565, GL_RGB, GL_UNSIGNED_SHORT_5_6_5}, // RGB565
{GL_RGBA4, GL_RGBA, GL_UNSIGNED_SHORT_4_4_4_4}, // RGBA4
}};
struct FramebufferInfo {
GLuint color;
GLuint depth;
u32 color_level;
u32 depth_level;
};
[[nodiscard]] GLbitfield MakeBufferMask(SurfaceType type) {
switch (type) {
case SurfaceType::Color:
case SurfaceType::Texture:
case SurfaceType::Fill:
return GL_COLOR_BUFFER_BIT;
case SurfaceType::Depth:
return GL_DEPTH_BUFFER_BIT;
case SurfaceType::DepthStencil:
return GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT;
default:
UNREACHABLE_MSG("Invalid surface type!");
}
return GL_COLOR_BUFFER_BIT;
}
[[nodiscard]] u32 FboIndex(VideoCore::SurfaceType type) {
switch (type) {
case VideoCore::SurfaceType::Color:
case VideoCore::SurfaceType::Texture:
case VideoCore::SurfaceType::Fill:
return 0;
case VideoCore::SurfaceType::Depth:
return 1;
case VideoCore::SurfaceType::DepthStencil:
return 2;
default:
UNREACHABLE_MSG("Invalid surface type!");
}
return 0;
}
[[nodiscard]] OGLTexture MakeHandle(GLenum target, u32 width, u32 height, u32 levels,
const FormatTuple& tuple, std::string_view debug_name = "") {
OGLTexture texture{};
texture.Create();
glBindTexture(target, texture.handle);
glTexStorage2D(target, levels, tuple.internal_format, width, height);
glTexParameteri(target, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(target, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(target, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
if (!debug_name.empty()) {
glObjectLabel(GL_TEXTURE, texture.handle, -1, debug_name.data());
}
return texture;
}
} // Anonymous namespace
TextureRuntime::TextureRuntime(const Driver& driver_, VideoCore::RendererBase& renderer)
: driver{driver_}, blit_helper{*this} {
for (std::size_t i = 0; i < draw_fbos.size(); ++i) {
draw_fbos[i].Create();
read_fbos[i].Create();
}
auto Register = [this](PixelFormat dest, std::unique_ptr<FormatReinterpreterBase>&& obj) {
const u32 dst_index = static_cast<u32>(dest);
return reinterpreters[dst_index].push_back(std::move(obj));
};
Register(PixelFormat::RGBA8, std::make_unique<ShaderD24S8toRGBA8>());
Register(PixelFormat::RGB5A1, std::make_unique<RGBA4toRGB5A1>());
}
TextureRuntime::~TextureRuntime() = default;
bool TextureRuntime::NeedsConversion(VideoCore::PixelFormat pixel_format) const {
const bool should_convert = pixel_format == PixelFormat::RGBA8 || // Needs byteswap
pixel_format == PixelFormat::RGB8; // Is converted to RGBA8
return driver.IsOpenGLES() && should_convert;
}
VideoCore::StagingData TextureRuntime::FindStaging(u32 size, bool upload) {
if (size > staging_buffer.size()) {
staging_buffer.resize(size);
}
return VideoCore::StagingData{
.size = size,
.mapped = std::span{staging_buffer.data(), size},
.buffer_offset = 0,
};
}
const FormatTuple& TextureRuntime::GetFormatTuple(PixelFormat pixel_format) const {
const auto type = GetFormatType(pixel_format);
const std::size_t format_index = static_cast<std::size_t>(pixel_format);
if (type == SurfaceType::Color) {
ASSERT(format_index < COLOR_TUPLES.size());
return (driver.IsOpenGLES() ? COLOR_TUPLES_OES : COLOR_TUPLES)[format_index];
} else if (type == SurfaceType::Depth || type == SurfaceType::DepthStencil) {
const std::size_t tuple_idx = format_index - 14;
ASSERT(tuple_idx < DEPTH_TUPLES.size());
return DEPTH_TUPLES[tuple_idx];
}
return DEFAULT_TUPLE;
}
void TextureRuntime::Recycle(const HostTextureTag tag, Allocation&& alloc) {
recycler.emplace(tag, std::move(alloc));
}
Allocation TextureRuntime::Allocate(const VideoCore::SurfaceParams& params) {
const u32 width = params.width;
const u32 height = params.height;
const u32 levels = params.levels;
const GLenum target = params.texture_type == VideoCore::TextureType::CubeMap
? GL_TEXTURE_CUBE_MAP
: GL_TEXTURE_2D;
const auto& tuple = GetFormatTuple(params.pixel_format);
const HostTextureTag key = {
.tuple = tuple,
.type = params.texture_type,
.width = width,
.height = height,
.levels = levels,
.res_scale = params.res_scale,
};
if (auto it = recycler.find(key); it != recycler.end()) {
Allocation alloc = std::move(it->second);
ASSERT(alloc.res_scale == params.res_scale);
recycler.erase(it);
return alloc;
}
const GLuint old_tex = OpenGLState::GetCurState().texture_units[0].texture_2d;
glActiveTexture(GL_TEXTURE0);
std::array<OGLTexture, 2> textures{};
std::array<GLuint, 2> handles{};
textures[0] = MakeHandle(target, width, height, levels, tuple, params.DebugName(false));
handles.fill(textures[0].handle);
if (params.res_scale != 1) {
const u32 scaled_width = params.GetScaledWidth();
const u32 scaled_height = params.GetScaledHeight();
textures[1] =
MakeHandle(target, scaled_width, scaled_height, levels, tuple, params.DebugName(true));
handles[1] = textures[1].handle;
}
glBindTexture(target, old_tex);
return Allocation{
.textures = std::move(textures),
.handles = std::move(handles),
.tuple = tuple,
.width = params.width,
.height = params.height,
.levels = params.levels,
.res_scale = params.res_scale,
};
}
bool TextureRuntime::ClearTexture(Surface& surface, const VideoCore::TextureClear& clear) {
const auto prev_state = OpenGLState::GetCurState();
// Setup scissor rectangle according to the clear rectangle
OpenGLState state;
state.scissor.enabled = true;
state.scissor.x = clear.texture_rect.left;
state.scissor.y = clear.texture_rect.bottom;
state.scissor.width = clear.texture_rect.GetWidth();
state.scissor.height = clear.texture_rect.GetHeight();
state.draw.draw_framebuffer = draw_fbos[FboIndex(surface.type)].handle;
state.Apply();
switch (surface.type) {
case SurfaceType::Color:
case SurfaceType::Texture:
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
surface.Handle(), clear.texture_level);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
state.color_mask.red_enabled = true;
state.color_mask.green_enabled = true;
state.color_mask.blue_enabled = true;
state.color_mask.alpha_enabled = true;
state.Apply();
glClearBufferfv(GL_COLOR, 0, clear.value.color.AsArray());
break;
case SurfaceType::Depth:
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D,
surface.Handle(), clear.texture_level);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
state.depth.write_mask = GL_TRUE;
state.Apply();
glClearBufferfv(GL_DEPTH, 0, &clear.value.depth);
break;
case SurfaceType::DepthStencil:
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D,
surface.Handle(), clear.texture_level);
state.depth.write_mask = GL_TRUE;
state.stencil.write_mask = -1;
state.Apply();
glClearBufferfi(GL_DEPTH_STENCIL, 0, clear.value.depth, clear.value.stencil);
break;
default:
UNREACHABLE_MSG("Unknown surface type {}", surface.type);
return false;
}
prev_state.Apply();
return true;
}
bool TextureRuntime::CopyTextures(Surface& source, Surface& dest,
const VideoCore::TextureCopy& copy) {
const GLenum src_textarget = source.texture_type == VideoCore::TextureType::CubeMap
? GL_TEXTURE_CUBE_MAP
: GL_TEXTURE_2D;
const GLenum dest_textarget =
dest.texture_type == VideoCore::TextureType::CubeMap ? GL_TEXTURE_CUBE_MAP : GL_TEXTURE_2D;
glCopyImageSubData(source.Handle(), src_textarget, copy.src_level, copy.src_offset.x,
copy.src_offset.y, copy.src_layer, dest.Handle(), dest_textarget,
copy.dst_level, copy.dst_offset.x, copy.dst_offset.y, copy.dst_layer,
copy.extent.width, copy.extent.height, 1);
return true;
}
bool TextureRuntime::BlitTextures(Surface& source, Surface& dest,
const VideoCore::TextureBlit& blit) {
OpenGLState state = OpenGLState::GetCurState();
state.scissor.enabled = false;
state.draw.read_framebuffer = read_fbos[FboIndex(source.type)].handle;
state.draw.draw_framebuffer = draw_fbos[FboIndex(dest.type)].handle;
state.Apply();
source.Attach(GL_READ_FRAMEBUFFER, blit.src_level, blit.src_layer);
dest.Attach(GL_DRAW_FRAMEBUFFER, blit.dst_level, blit.dst_layer);
// TODO (wwylele): use GL_NEAREST for shadow map texture
// Note: shadow map is treated as RGBA8 format in PICA, as well as in the rasterizer cache, but
// doing linear intepolation componentwise would cause incorrect value. However, for a
// well-programmed game this code path should be rarely executed for shadow map with
// inconsistent scale.
const GLbitfield buffer_mask = MakeBufferMask(source.type);
const GLenum filter = buffer_mask == GL_COLOR_BUFFER_BIT ? GL_LINEAR : GL_NEAREST;
glBlitFramebuffer(blit.src_rect.left, blit.src_rect.bottom, blit.src_rect.right,
blit.src_rect.top, blit.dst_rect.left, blit.dst_rect.bottom,
blit.dst_rect.right, blit.dst_rect.top, buffer_mask, filter);
return true;
}
void TextureRuntime::GenerateMipmaps(Surface& surface, u32 max_level) {
OpenGLState state = OpenGLState::GetCurState();
state.texture_units[0].texture_2d = surface.Handle();
state.Apply();
glActiveTexture(GL_TEXTURE0);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, max_level);
glGenerateMipmap(GL_TEXTURE_2D);
}
const ReinterpreterList& TextureRuntime::GetPossibleReinterpretations(
PixelFormat dest_format) const {
return reinterpreters[static_cast<u32>(dest_format)];
}
Surface::Surface(TextureRuntime& runtime_, const VideoCore::SurfaceParams& params)
: SurfaceBase{params}, driver{&runtime_.GetDriver()}, runtime{&runtime_} {
if (pixel_format == PixelFormat::Invalid) {
return;
}
alloc = runtime->Allocate(params);
}
Surface::~Surface() {
if (pixel_format == PixelFormat::Invalid || !alloc) {
return;
}
const HostTextureTag tag = {
.tuple = alloc.tuple,
.type = texture_type,
.width = alloc.width,
.height = alloc.height,
.levels = alloc.levels,
.res_scale = alloc.res_scale,
};
runtime->Recycle(tag, std::move(alloc));
}
void Surface::Upload(const VideoCore::BufferTextureCopy& upload,
const VideoCore::StagingData& staging) {
// Ensure no bad interactions with GL_UNPACK_ALIGNMENT
ASSERT(stride * GetFormatBytesPerPixel(pixel_format) % 4 == 0);
const GLuint old_tex = OpenGLState::GetCurState().texture_units[0].texture_2d;
const u32 unscaled_width = upload.texture_rect.GetWidth();
const u32 unscaled_height = upload.texture_rect.GetHeight();
glPixelStorei(GL_UNPACK_ROW_LENGTH, unscaled_width);
// Bind the unscaled texture
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, Handle(false));
// Upload the requested rectangle of pixels
const auto& tuple = runtime->GetFormatTuple(pixel_format);
glTexSubImage2D(GL_TEXTURE_2D, upload.texture_level, upload.texture_rect.left,
upload.texture_rect.bottom, unscaled_width, unscaled_height, tuple.format,
tuple.type, staging.mapped.data());
glPixelStorei(GL_UNPACK_ROW_LENGTH, 0);
glBindTexture(GL_TEXTURE_2D, old_tex);
const VideoCore::TextureBlit blit = {
.src_level = upload.texture_level,
.dst_level = upload.texture_level,
.src_rect = upload.texture_rect,
.dst_rect = upload.texture_rect * res_scale,
};
// If texture filtering is enabled attempt to upscale with that, otherwise fallback
// to normal blit.
if (res_scale != 1 && !runtime->blit_helper.Filter(*this, blit)) {
BlitScale(blit, true);
}
}
void Surface::Download(const VideoCore::BufferTextureCopy& download,
const VideoCore::StagingData& staging) {
// Ensure no bad interactions with GL_PACK_ALIGNMENT
ASSERT(stride * GetFormatBytesPerPixel(pixel_format) % 4 == 0);
const u32 unscaled_width = download.texture_rect.GetWidth();
const u32 unscaled_height = download.texture_rect.GetHeight();
glPixelStorei(GL_UNPACK_ROW_LENGTH, unscaled_width);
// Scale down upscaled data before downloading it
if (res_scale != 1) {
const VideoCore::TextureBlit blit = {
.src_level = download.texture_level,
.dst_level = download.texture_level,
.src_rect = download.texture_rect * res_scale,
.dst_rect = download.texture_rect,
};
BlitScale(blit, false);
}
// Try to download without using an fbo. This should succeed on recent desktop drivers
if (DownloadWithoutFbo(download, staging)) {
return;
}
const u32 fbo_index = FboIndex(type);
OpenGLState state = OpenGLState::GetCurState();
state.scissor.enabled = false;
state.draw.read_framebuffer = runtime->read_fbos[fbo_index].handle;
state.Apply();
Attach(GL_READ_FRAMEBUFFER, download.texture_level, 0, false);
// Read the pixel data to the staging buffer
const auto& tuple = runtime->GetFormatTuple(pixel_format);
glReadPixels(download.texture_rect.left, download.texture_rect.bottom, unscaled_width,
unscaled_height, tuple.format, tuple.type, staging.mapped.data());
// Restore previous state
glPixelStorei(GL_PACK_ROW_LENGTH, 0);
}
bool Surface::DownloadWithoutFbo(const VideoCore::BufferTextureCopy& download,
const VideoCore::StagingData& staging) {
// If a partial download is requested and ARB_get_texture_sub_image (core in 4.5)
// is not available we cannot proceed further.
const bool is_full_download = download.texture_rect == GetRect();
const bool has_sub_image = driver->HasArbGetTextureSubImage();
if (driver->IsOpenGLES() || (!is_full_download && !has_sub_image)) {
return false;
}
const GLuint old_tex = OpenGLState::GetCurState().texture_units[0].texture_2d;
const auto& tuple = runtime->GetFormatTuple(pixel_format);
glActiveTexture(GL_TEXTURE0);
glPixelStorei(GL_PACK_ROW_LENGTH, static_cast<GLint>(stride));
SCOPE_EXIT({ glPixelStorei(GL_PACK_ROW_LENGTH, 0); });
// Prefer glGetTextureSubImage in most cases since it's the fastest and most convenient option
if (has_sub_image) {
const GLsizei buf_size = static_cast<GLsizei>(staging.mapped.size());
glGetTextureSubImage(Handle(false), download.texture_level, download.texture_rect.left,
download.texture_rect.bottom, 0, download.texture_rect.GetWidth(),
download.texture_rect.GetHeight(), 1, tuple.format, tuple.type,
buf_size, staging.mapped.data());
return true;
}
// This should only trigger for full texture downloads in oldish intel drivers
// that only support up to 4.3
glBindTexture(GL_TEXTURE_2D, Handle(false));
glGetTexImage(GL_TEXTURE_2D, download.texture_level, tuple.format, tuple.type,
staging.mapped.data());
glBindTexture(GL_TEXTURE_2D, old_tex);
return true;
}
void Surface::Attach(GLenum target, u32 level, u32 layer, bool scaled) {
const GLuint handle = Handle(scaled);
const GLenum textarget = texture_type == VideoCore::TextureType::CubeMap
? GL_TEXTURE_CUBE_MAP_POSITIVE_X + layer
: GL_TEXTURE_2D;
switch (type) {
case VideoCore::SurfaceType::Color:
case VideoCore::SurfaceType::Texture:
glFramebufferTexture2D(target, GL_COLOR_ATTACHMENT0, textarget, handle, level);
break;
case VideoCore::SurfaceType::Depth:
glFramebufferTexture2D(target, GL_DEPTH_ATTACHMENT, textarget, handle, level);
break;
case VideoCore::SurfaceType::DepthStencil:
glFramebufferTexture2D(target, GL_DEPTH_STENCIL_ATTACHMENT, textarget, handle, level);
break;
default:
UNREACHABLE_MSG("Invalid surface type!");
}
}
u32 Surface::GetInternalBytesPerPixel() const {
// RGB8 is converted to RGBA8 on OpenGL ES since it doesn't support BGR8
if (driver->IsOpenGLES() && pixel_format == VideoCore::PixelFormat::RGB8) {
return 4;
}
return GetFormatBytesPerPixel(pixel_format);
}
void Surface::BlitScale(const VideoCore::TextureBlit& blit, bool up_scale) {
const u32 fbo_index = FboIndex(type);
OpenGLState state = OpenGLState::GetCurState();
state.scissor.enabled = false;
state.draw.read_framebuffer = runtime->read_fbos[fbo_index].handle;
state.draw.draw_framebuffer = runtime->draw_fbos[fbo_index].handle;
state.Apply();
Attach(GL_READ_FRAMEBUFFER, blit.src_level, blit.src_layer, !up_scale);
Attach(GL_DRAW_FRAMEBUFFER, blit.dst_level, blit.dst_layer, up_scale);
const GLenum buffer_mask = MakeBufferMask(type);
const GLenum filter = buffer_mask == GL_COLOR_BUFFER_BIT ? GL_LINEAR : GL_NEAREST;
glBlitFramebuffer(blit.src_rect.left, blit.src_rect.bottom, blit.src_rect.right,
blit.src_rect.top, blit.dst_rect.left, blit.dst_rect.bottom,
blit.dst_rect.right, blit.dst_rect.top, buffer_mask, filter);
}
Framebuffer::Framebuffer(TextureRuntime& runtime, Surface* const color, u32 color_level,
Surface* const depth_stencil, u32 depth_level, const Pica::Regs& regs,
Common::Rectangle<u32> surfaces_rect)
: VideoCore::FramebufferBase{regs, color, color_level,
depth_stencil, depth_level, surfaces_rect} {
const bool shadow_rendering = regs.framebuffer.IsShadowRendering();
const bool has_stencil = regs.framebuffer.HasStencil();
if (shadow_rendering && !color) {
return; // Framebuffer won't get used
}
if (color) {
attachments[0] = color->Handle();
}
if (depth_stencil) {
attachments[1] = depth_stencil->Handle();
}
const FramebufferInfo info = {
.color = attachments[0],
.depth = attachments[1],
.color_level = color_level,
.depth_level = depth_level,
};
const u64 hash = Common::ComputeHash64(&info, sizeof(FramebufferInfo));
auto [it, new_framebuffer] = runtime.framebuffer_cache.try_emplace(hash);
if (!new_framebuffer) {
handle = it->second.handle;
return;
}
const GLuint old_fbo = OpenGLState::GetCurState().draw.draw_framebuffer;
OGLFramebuffer& framebuffer = it->second;
framebuffer.Create();
handle = it->second.handle;
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, framebuffer.handle);
SCOPE_EXIT({ glBindFramebuffer(GL_DRAW_FRAMEBUFFER, old_fbo); });
if (shadow_rendering) {
glFramebufferParameteri(GL_DRAW_FRAMEBUFFER, GL_FRAMEBUFFER_DEFAULT_WIDTH,
color->width * res_scale);
glFramebufferParameteri(GL_DRAW_FRAMEBUFFER, GL_FRAMEBUFFER_DEFAULT_HEIGHT,
color->height * res_scale);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, 0, 0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
} else {
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
color ? color->Handle() : 0, color_level);
if (depth_stencil) {
if (has_stencil) {
// Attach both depth and stencil
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT,
GL_TEXTURE_2D, depth_stencil->Handle(), depth_level);
} else {
// Attach depth
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D,
depth_stencil->Handle(), depth_level);
// Clear stencil attachment
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,
0);
}
} else {
// Clear both depth and stencil attachment
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D,
0, 0);
}
}
}
Framebuffer::~Framebuffer() = default;
} // namespace OpenGL

View file

@ -0,0 +1,204 @@
// Copyright 2023 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "video_core/rasterizer_cache/framebuffer_base.h"
#include "video_core/rasterizer_cache/surface_base.h"
#include "video_core/renderer_opengl/gl_blit_helper.h"
#include "video_core/renderer_opengl/gl_format_reinterpreter.h"
namespace VideoCore {
class RendererBase;
}
namespace OpenGL {
struct FormatTuple {
GLint internal_format;
GLenum format;
GLenum type;
bool operator==(const FormatTuple& other) const noexcept {
return std::tie(internal_format, format, type) ==
std::tie(other.internal_format, other.format, other.type);
}
};
struct HostTextureTag {
FormatTuple tuple{};
VideoCore::TextureType type{};
u32 width = 0;
u32 height = 0;
u32 levels = 1;
u32 res_scale = 1;
bool operator==(const HostTextureTag& other) const noexcept {
return std::tie(tuple, type, width, height, levels, res_scale) ==
std::tie(other.tuple, other.type, other.width, other.height, other.levels,
other.res_scale);
}
struct Hash {
const u64 operator()(const HostTextureTag& tag) const {
return Common::ComputeHash64(&tag, sizeof(HostTextureTag));
}
};
};
struct Allocation {
std::array<OGLTexture, 2> textures;
std::array<GLuint, 2> handles;
FormatTuple tuple;
u32 width;
u32 height;
u32 levels;
u32 res_scale;
operator bool() const noexcept {
return textures[0].handle;
}
bool Matches(u32 width_, u32 height_, u32 levels_, const FormatTuple& tuple_) const {
return std::tie(width, height, levels, tuple) == std::tie(width_, height_, levels_, tuple_);
}
};
class Surface;
class Driver;
struct CachedTextureCube;
/**
* Provides texture manipulation functions to the rasterizer cache
* Separating this into a class makes it easier to abstract graphics API code
*/
class TextureRuntime {
friend class Surface;
friend class Framebuffer;
friend class BlitHelper;
public:
explicit TextureRuntime(const Driver& driver, VideoCore::RendererBase& renderer);
~TextureRuntime();
/// Returns true if the provided pixel format cannot be used natively by the runtime.
bool NeedsConversion(VideoCore::PixelFormat pixel_format) const;
/// Maps an internal staging buffer of the provided size of pixel uploads/downloads
VideoCore::StagingData FindStaging(u32 size, bool upload);
/// Returns the OpenGL format tuple associated with the provided pixel format
const FormatTuple& GetFormatTuple(VideoCore::PixelFormat pixel_format) const;
/// Takes back ownership of the allocation for recycling
void Recycle(const HostTextureTag tag, Allocation&& alloc);
/// Allocates an OpenGL texture with the specified dimentions and format
Allocation Allocate(const VideoCore::SurfaceParams& params);
/// Fills the rectangle of the texture with the clear value provided
bool ClearTexture(Surface& surface, const VideoCore::TextureClear& clear);
/// Copies a rectangle of source to another rectange of dest
bool CopyTextures(Surface& source, Surface& dest, const VideoCore::TextureCopy& copy);
/// Blits a rectangle of source to another rectange of dest
bool BlitTextures(Surface& source, Surface& dest, const VideoCore::TextureBlit& blit);
/// Generates mipmaps for all the available levels of the texture
void GenerateMipmaps(Surface& surface, u32 max_level);
/// Returns all source formats that support reinterpretation to the dest format
const ReinterpreterList& GetPossibleReinterpretations(VideoCore::PixelFormat dest_format) const;
private:
/// Returns the OpenGL driver class
const Driver& GetDriver() const {
return driver;
}
private:
const Driver& driver;
BlitHelper blit_helper;
std::vector<u8> staging_buffer;
std::array<ReinterpreterList, VideoCore::PIXEL_FORMAT_COUNT> reinterpreters;
std::unordered_multimap<HostTextureTag, Allocation, HostTextureTag::Hash> recycler;
std::unordered_map<u64, OGLFramebuffer, Common::IdentityHash<u64>> framebuffer_cache;
std::array<OGLFramebuffer, 3> draw_fbos;
std::array<OGLFramebuffer, 3> read_fbos;
};
class Surface : public VideoCore::SurfaceBase {
public:
explicit Surface(TextureRuntime& runtime, const VideoCore::SurfaceParams& params);
~Surface();
Surface(const Surface&) = delete;
Surface& operator=(const Surface&) = delete;
Surface(Surface&& o) noexcept = default;
Surface& operator=(Surface&& o) noexcept = default;
/// Returns the surface image handle
GLuint Handle(bool scaled = true) const noexcept {
return alloc.handles[static_cast<u32>(scaled)];
}
/// Returns the tuple of the surface allocation.
const FormatTuple& Tuple() const noexcept {
return alloc.tuple;
}
/// Uploads pixel data in staging to a rectangle region of the surface texture
void Upload(const VideoCore::BufferTextureCopy& upload, const VideoCore::StagingData& staging);
/// Downloads pixel data to staging from a rectangle region of the surface texture
void Download(const VideoCore::BufferTextureCopy& download,
const VideoCore::StagingData& staging);
/// Attaches a handle of surface to the specified framebuffer target
void Attach(GLenum target, u32 level, u32 layer, bool scaled = true);
/// Returns the bpp of the internal surface format
u32 GetInternalBytesPerPixel() const;
private:
/// Performs blit between the scaled/unscaled images
void BlitScale(const VideoCore::TextureBlit& blit, bool up_scale);
/// Attempts to download without using an fbo
bool DownloadWithoutFbo(const VideoCore::BufferTextureCopy& download,
const VideoCore::StagingData& staging);
private:
const Driver* driver;
TextureRuntime* runtime;
Allocation alloc{};
};
class Framebuffer : public VideoCore::FramebufferBase {
public:
explicit Framebuffer(TextureRuntime& runtime, Surface* const color, u32 color_level,
Surface* const depth_stencil, u32 depth_level, const Pica::Regs& regs,
Common::Rectangle<u32> surfaces_rect);
~Framebuffer();
[[nodiscard]] GLuint Handle() const noexcept {
return handle;
}
[[nodiscard]] GLuint Attachment(VideoCore::SurfaceType type) const noexcept {
return attachments[Index(type)];
}
[[nodiscard]] bool HasAttachment(VideoCore::SurfaceType type) const noexcept {
return static_cast<bool>(attachments[Index(type)]);
}
private:
std::array<GLuint, 2> attachments{};
GLuint handle{};
};
} // namespace OpenGL

View file

@ -21,8 +21,16 @@
#include "video_core/renderer_opengl/renderer_opengl.h"
#include "video_core/video_core.h"
#include "video_core/host_shaders/opengl_present_anaglyph_frag.h"
#include "video_core/host_shaders/opengl_present_frag.h"
#include "video_core/host_shaders/opengl_present_interlaced_frag.h"
#include "video_core/host_shaders/opengl_present_vert.h"
namespace OpenGL {
MICROPROFILE_DEFINE(OpenGL_RenderFrame, "OpenGL", "Render Frame", MP_RGB(128, 128, 64));
MICROPROFILE_DEFINE(OpenGL_WaitPresent, "OpenGL", "Wait For Present", MP_RGB(128, 128, 128));
// If the size of this is too small, it ends up creating a soft cap on FPS as the renderer will have
// to wait on available presentation frames. There doesn't seem to be much of a downside to a larger
// number but 9 swap textures at 60FPS presentation allows for 800% speed so thats probably fine
@ -48,7 +56,7 @@ public:
std::deque<Frontend::Frame*> present_queue{};
Frontend::Frame* previous_frame = nullptr;
OGLTextureMailbox() {
OGLTextureMailbox(bool has_debug_tool_ = false) : has_debug_tool{has_debug_tool_} {
for (auto& frame : swap_chain) {
free_queue.push(&frame);
}
@ -126,6 +134,8 @@ public:
std::unique_lock<std::mutex> lock(swap_chain_lock);
present_queue.push_front(frame);
present_cv.notify_one();
DebugNotifyNextFrame();
}
// This is virtual as it is to be overriden in OGLVideoDumpingMailbox below.
@ -148,6 +158,8 @@ public:
}
Frontend::Frame* TryGetPresentFrame(int timeout_ms) override {
DebugWaitForNextFrame();
std::unique_lock<std::mutex> lock(swap_chain_lock);
// wait for new entries in the present_queue
present_cv.wait_for(lock, std::chrono::milliseconds(timeout_ms),
@ -160,6 +172,33 @@ public:
LoadPresentFrame();
return previous_frame;
}
private:
std::mutex debug_synch_mutex;
std::condition_variable debug_synch_condition;
std::atomic_int frame_for_debug{};
const bool has_debug_tool; // When true, using a GPU debugger, so keep frames in lock-step
/// Signal that a new frame is available (called from GPU thread)
void DebugNotifyNextFrame() {
if (!has_debug_tool) {
return;
}
frame_for_debug++;
std::lock_guard lock{debug_synch_mutex};
debug_synch_condition.notify_one();
}
/// Wait for a new frame to be available (called from presentation thread)
void DebugWaitForNextFrame() {
if (!has_debug_tool) {
return;
}
const int last_frame = frame_for_debug;
std::unique_lock lock{debug_synch_mutex};
debug_synch_condition.wait(lock,
[this, last_frame] { return frame_for_debug > last_frame; });
}
};
/// This mailbox is different in that it will never discard rendered frames
@ -218,93 +257,6 @@ public:
}
};
static const char vertex_shader[] = R"(
in vec2 vert_position;
in vec2 vert_tex_coord;
out vec2 frag_tex_coord;
// This is a truncated 3x3 matrix for 2D transformations:
// The upper-left 2x2 submatrix performs scaling/rotation/mirroring.
// The third column performs translation.
// The third row could be used for projection, which we don't need in 2D. It hence is assumed to
// implicitly be [0, 0, 1]
uniform mat3x2 modelview_matrix;
void main() {
// Multiply input position by the rotscale part of the matrix and then manually translate by
// the last column. This is equivalent to using a full 3x3 matrix and expanding the vector
// to `vec3(vert_position.xy, 1.0)`
gl_Position = vec4(mat2(modelview_matrix) * vert_position + modelview_matrix[2], 0.0, 1.0);
frag_tex_coord = vert_tex_coord;
}
)";
static const char fragment_shader[] = R"(
in vec2 frag_tex_coord;
layout(location = 0) out vec4 color;
uniform vec4 i_resolution;
uniform vec4 o_resolution;
uniform int layer;
uniform sampler2D color_texture;
void main() {
color = texture(color_texture, frag_tex_coord);
}
)";
static const char fragment_shader_anaglyph[] = R"(
// Anaglyph Red-Cyan shader based on Dubois algorithm
// Constants taken from the paper:
// "Conversion of a Stereo Pair to Anaglyph with
// the Least-Squares Projection Method"
// Eric Dubois, March 2009
const mat3 l = mat3( 0.437, 0.449, 0.164,
-0.062,-0.062,-0.024,
-0.048,-0.050,-0.017);
const mat3 r = mat3(-0.011,-0.032,-0.007,
0.377, 0.761, 0.009,
-0.026,-0.093, 1.234);
in vec2 frag_tex_coord;
out vec4 color;
uniform vec4 resolution;
uniform int layer;
uniform sampler2D color_texture;
uniform sampler2D color_texture_r;
void main() {
vec4 color_tex_l = texture(color_texture, frag_tex_coord);
vec4 color_tex_r = texture(color_texture_r, frag_tex_coord);
color = vec4(color_tex_l.rgb*l+color_tex_r.rgb*r, color_tex_l.a);
}
)";
static const char fragment_shader_interlaced[] = R"(
in vec2 frag_tex_coord;
out vec4 color;
uniform vec4 o_resolution;
uniform sampler2D color_texture;
uniform sampler2D color_texture_r;
uniform int reverse_interlaced;
void main() {
float screen_row = o_resolution.x * frag_tex_coord.x;
if (int(screen_row) % 2 == reverse_interlaced)
color = texture(color_texture, frag_tex_coord);
else
color = texture(color_texture_r, frag_tex_coord);
}
)";
/**
* Vertex structure that the drawn screen rectangles are composed of.
*/
@ -354,9 +306,10 @@ RendererOpenGL::RendererOpenGL(Core::System& system, Frontend::EmuWindow& window
Frontend::EmuWindow* secondary_window)
: VideoCore::RendererBase{system, window, secondary_window}, driver{system.TelemetrySession()},
frame_dumper{system.VideoDumper(), window} {
window.mailbox = std::make_unique<OGLTextureMailbox>();
const bool has_debug_tool = driver.HasDebugTool();
window.mailbox = std::make_unique<OGLTextureMailbox>(has_debug_tool);
if (secondary_window) {
secondary_window->mailbox = std::make_unique<OGLTextureMailbox>();
secondary_window->mailbox = std::make_unique<OGLTextureMailbox>(has_debug_tool);
}
frame_dumper.mailbox = std::make_unique<OGLVideoDumpingMailbox>();
InitOpenGLObjects();
@ -365,10 +318,6 @@ RendererOpenGL::RendererOpenGL(Core::System& system, Frontend::EmuWindow& window
RendererOpenGL::~RendererOpenGL() = default;
MICROPROFILE_DEFINE(OpenGL_RenderFrame, "OpenGL", "Render Frame", MP_RGB(128, 128, 64));
MICROPROFILE_DEFINE(OpenGL_WaitPresent, "OpenGL", "Wait For Present", MP_RGB(128, 128, 128));
/// Swap buffers (render frame)
void RendererOpenGL::SwapBuffers() {
// Maintain the rasterizer's state as a priority
OpenGLState prev_state = OpenGLState::GetCurState();
@ -673,13 +622,13 @@ void RendererOpenGL::ReloadShader() {
if (Settings::values.render_3d.GetValue() == Settings::StereoRenderOption::Anaglyph) {
if (Settings::values.anaglyph_shader_name.GetValue() == "dubois (builtin)") {
shader_data += fragment_shader_anaglyph;
shader_data += HostShaders::OPENGL_PRESENT_ANAGLYPH_FRAG;
} else {
std::string shader_text = OpenGL::GetPostProcessingShaderCode(
true, Settings::values.anaglyph_shader_name.GetValue());
if (shader_text.empty()) {
// Should probably provide some information that the shader couldn't load
shader_data += fragment_shader_anaglyph;
shader_data += HostShaders::OPENGL_PRESENT_ANAGLYPH_FRAG;
} else {
shader_data += shader_text;
}
@ -687,22 +636,22 @@ void RendererOpenGL::ReloadShader() {
} else if (Settings::values.render_3d.GetValue() == Settings::StereoRenderOption::Interlaced ||
Settings::values.render_3d.GetValue() ==
Settings::StereoRenderOption::ReverseInterlaced) {
shader_data += fragment_shader_interlaced;
shader_data += HostShaders::OPENGL_PRESENT_INTERLACED_FRAG;
} else {
if (Settings::values.pp_shader_name.GetValue() == "none (builtin)") {
shader_data += fragment_shader;
shader_data += HostShaders::OPENGL_PRESENT_FRAG;
} else {
std::string shader_text = OpenGL::GetPostProcessingShaderCode(
false, Settings::values.pp_shader_name.GetValue());
if (shader_text.empty()) {
// Should probably provide some information that the shader couldn't load
shader_data += fragment_shader;
shader_data += HostShaders::OPENGL_PRESENT_FRAG;
} else {
shader_data += shader_text;
}
}
}
shader.Create(vertex_shader, shader_data.c_str());
shader.Create(HostShaders::OPENGL_PRESENT_VERT, shader_data);
state.draw.shader_program = shader.handle;
state.Apply();
uniform_modelview_matrix = glGetUniformLocation(shader.handle, "modelview_matrix");

View file

@ -1,253 +0,0 @@
// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <chrono>
#include <vector>
#include <fmt/chrono.h>
#include "common/logging/log.h"
#include "video_core/rasterizer_cache/rasterizer_cache_utils.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/texture_downloader_es.h"
#include "shaders/depth_to_color.frag"
#include "shaders/depth_to_color.vert"
#include "shaders/ds_to_color.frag"
namespace OpenGL {
/**
* Self tests for the texture downloader
*/
void TextureDownloaderES::Test() {
auto cur_state = OpenGLState::GetCurState();
OpenGLState state;
{
GLint range[2];
GLint precision;
#define PRECISION_TEST(type) \
glGetShaderPrecisionFormat(GL_FRAGMENT_SHADER, type, range, &precision); \
LOG_INFO(Render_OpenGL, #type " range: [{}, {}], precision: {}", range[0], range[1], precision);
PRECISION_TEST(GL_LOW_INT);
PRECISION_TEST(GL_MEDIUM_INT);
PRECISION_TEST(GL_HIGH_INT);
PRECISION_TEST(GL_LOW_FLOAT);
PRECISION_TEST(GL_MEDIUM_FLOAT);
PRECISION_TEST(GL_HIGH_FLOAT);
#undef PRECISION_TEST
}
glActiveTexture(GL_TEXTURE0);
const auto test = [this, &state](FormatTuple tuple, auto original_data, std::size_t tex_size,
auto data_generator) {
OGLTexture texture;
texture.Create();
state.texture_units[0].texture_2d = texture.handle;
state.Apply();
original_data.resize(tex_size * tex_size);
for (std::size_t idx = 0; idx < original_data.size(); ++idx) {
original_data[idx] = data_generator(idx);
}
GLsizei tex_sizei = static_cast<GLsizei>(tex_size);
glTexStorage2D(GL_TEXTURE_2D, 1, tuple.internal_format, tex_sizei, tex_sizei);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, tex_sizei, tex_sizei, tuple.format, tuple.type,
original_data.data());
decltype(original_data) new_data(original_data.size());
glFinish();
auto start = std::chrono::high_resolution_clock::now();
GetTexImage(GL_TEXTURE_2D, 0, tuple.format, tuple.type, tex_sizei, tex_sizei,
new_data.data());
glFinish();
auto time = std::chrono::high_resolution_clock::now() - start;
LOG_INFO(Render_OpenGL, "test took {}", std::chrono::duration<double, std::milli>(time));
int diff = 0;
for (std::size_t idx = 0; idx < original_data.size(); ++idx)
if (new_data[idx] - original_data[idx] != diff) {
diff = new_data[idx] - original_data[idx];
// every time the error between the real and expected value changes, log it
// some error is expected in D24 due to floating point precision
LOG_WARNING(Render_OpenGL, "difference changed at {:#X}: {:#X} -> {:#X}", idx,
original_data[idx], new_data[idx]);
}
};
LOG_INFO(Render_OpenGL, "GL_DEPTH24_STENCIL8 download test starting");
test(GetFormatTuple(PixelFormat::D24S8), std::vector<u32>{}, 4096,
[](std::size_t idx) { return static_cast<u32>((idx << 8) | (idx & 0xFF)); });
LOG_INFO(Render_OpenGL, "GL_DEPTH_COMPONENT24 download test starting");
test(GetFormatTuple(PixelFormat::D24), std::vector<u32>{}, 4096,
[](std::size_t idx) { return static_cast<u32>(idx << 8); });
LOG_INFO(Render_OpenGL, "GL_DEPTH_COMPONENT16 download test starting");
test(GetFormatTuple(PixelFormat::D16), std::vector<u16>{}, 256,
[](std::size_t idx) { return static_cast<u16>(idx); });
cur_state.Apply();
}
TextureDownloaderES::TextureDownloaderES(bool enable_depth_stencil) {
vao.Create();
read_fbo_generic.Create();
depth32_fbo.Create();
r32ui_renderbuffer.Create();
depth16_fbo.Create();
r16_renderbuffer.Create();
const auto init_program = [](ConversionShader& converter, std::string_view frag) {
converter.program.Create(depth_to_color_vert.data(), frag.data());
converter.lod_location = glGetUniformLocation(converter.program.handle, "lod");
};
// xperia64: The depth stencil shader currently uses a GLES extension that is not supported
// across all devices Reportedly broken on Tegra devices and the Nexus 6P, so enabling it can be
// toggled
if (enable_depth_stencil) {
init_program(d24s8_r32ui_conversion_shader, ds_to_color_frag);
}
init_program(d24_r32ui_conversion_shader, depth_to_color_frag);
init_program(d16_r16_conversion_shader, R"(
out highp float color;
uniform highp sampler2D depth;
uniform int lod;
void main(){
color = texelFetch(depth, ivec2(gl_FragCoord.xy), lod).x;
}
)");
sampler.Create();
glSamplerParameteri(sampler.handle, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glSamplerParameteri(sampler.handle, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
auto cur_state = OpenGLState::GetCurState();
auto state = cur_state;
state.draw.shader_program = d24s8_r32ui_conversion_shader.program.handle;
state.draw.draw_framebuffer = depth32_fbo.handle;
state.renderbuffer = r32ui_renderbuffer.handle;
state.Apply();
glRenderbufferStorage(GL_RENDERBUFFER, GL_R32UI, max_size, max_size);
glFramebufferRenderbuffer(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER,
r32ui_renderbuffer.handle);
glUniform1i(glGetUniformLocation(d24s8_r32ui_conversion_shader.program.handle, "depth"), 1);
state.draw.draw_framebuffer = depth16_fbo.handle;
state.renderbuffer = r16_renderbuffer.handle;
state.Apply();
glRenderbufferStorage(GL_RENDERBUFFER, GL_R16, max_size, max_size);
glFramebufferRenderbuffer(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER,
r16_renderbuffer.handle);
cur_state.Apply();
}
/**
* OpenGL ES does not support glReadBuffer for depth/stencil formats
* This gets around it by converting to a Red surface before downloading
*/
GLuint TextureDownloaderES::ConvertDepthToColor(GLuint level, GLenum& format, GLenum& type,
GLint height, GLint width) {
ASSERT(width <= max_size && height <= max_size);
const OpenGLState cur_state = OpenGLState::GetCurState();
OpenGLState state;
state.texture_units[0] = {cur_state.texture_units[0].texture_2d, sampler.handle};
state.draw.vertex_array = vao.handle;
OGLTexture texture_view;
const ConversionShader* converter;
switch (type) {
case GL_UNSIGNED_SHORT:
state.draw.draw_framebuffer = depth16_fbo.handle;
converter = &d16_r16_conversion_shader;
format = GL_RED;
break;
case GL_UNSIGNED_INT:
state.draw.draw_framebuffer = depth32_fbo.handle;
converter = &d24_r32ui_conversion_shader;
format = GL_RED_INTEGER;
break;
case GL_UNSIGNED_INT_24_8:
state.draw.draw_framebuffer = depth32_fbo.handle;
converter = &d24s8_r32ui_conversion_shader;
format = GL_RED_INTEGER;
type = GL_UNSIGNED_INT;
break;
default:
UNREACHABLE_MSG("Destination type not recognized");
}
state.draw.shader_program = converter->program.handle;
state.viewport = {0, 0, width, height};
state.Apply();
if (converter->program.handle == d24s8_r32ui_conversion_shader.program.handle) {
// TODO BreadFish64: the ARM framebuffer reading extension is probably not the most optimal
// way to do this, search for another solution
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D,
state.texture_units[0].texture_2d, level);
}
glUniform1i(converter->lod_location, level);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
if (texture_view.handle) {
glTexParameteri(GL_TEXTURE_2D, GL_DEPTH_STENCIL_TEXTURE_MODE, GL_DEPTH_COMPONENT);
}
return state.draw.draw_framebuffer;
}
/**
* OpenGL ES does not support glGetTexImage. Obtain the pixels by attaching the
* texture to a framebuffer.
* Originally from https://github.com/apitrace/apitrace/blob/master/retrace/glstate_images.cpp
* Depth texture download assumes that the texture's format tuple matches what is found
* OpenGL::depth_format_tuples
*/
void TextureDownloaderES::GetTexImage(GLenum target, GLuint level, GLenum format, GLenum type,
GLint height, GLint width, void* pixels) {
OpenGLState state = OpenGLState::GetCurState();
GLuint texture{};
const GLuint old_read_buffer = state.draw.read_framebuffer;
switch (target) {
case GL_TEXTURE_2D:
texture = state.texture_units[0].texture_2d;
break;
case GL_TEXTURE_CUBE_MAP_POSITIVE_X:
case GL_TEXTURE_CUBE_MAP_NEGATIVE_X:
case GL_TEXTURE_CUBE_MAP_POSITIVE_Y:
case GL_TEXTURE_CUBE_MAP_NEGATIVE_Y:
case GL_TEXTURE_CUBE_MAP_POSITIVE_Z:
case GL_TEXTURE_CUBE_MAP_NEGATIVE_Z:
texture = state.texture_cube_unit.texture_cube;
break;
default:
UNIMPLEMENTED_MSG("Unexpected target {:x}", target);
}
switch (format) {
case GL_DEPTH_COMPONENT:
case GL_DEPTH_STENCIL:
// unfortunately, the accurate way is too slow for release
return;
state.draw.read_framebuffer = ConvertDepthToColor(level, format, type, height, width);
state.Apply();
break;
default:
state.draw.read_framebuffer = read_fbo_generic.handle;
state.Apply();
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, texture,
level);
}
GLenum status = glCheckFramebufferStatus(GL_READ_FRAMEBUFFER);
if (status != GL_FRAMEBUFFER_COMPLETE) {
LOG_DEBUG(Render_OpenGL, "Framebuffer is incomplete, status: {:X}", status);
}
glReadPixels(0, 0, width, height, format, type, pixels);
state.draw.read_framebuffer = old_read_buffer;
state.Apply();
}
} // namespace OpenGL

View file

@ -1,36 +0,0 @@
// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "common/common_types.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
namespace OpenGL {
class OpenGLState;
class TextureDownloaderES {
static constexpr u16 max_size = 1024;
OGLVertexArray vao;
OGLFramebuffer read_fbo_generic;
OGLFramebuffer depth32_fbo, depth16_fbo;
OGLRenderbuffer r32ui_renderbuffer, r16_renderbuffer;
struct ConversionShader {
OGLProgram program;
GLint lod_location{-1};
} d24_r32ui_conversion_shader, d16_r16_conversion_shader, d24s8_r32ui_conversion_shader;
OGLSampler sampler;
void Test();
GLuint ConvertDepthToColor(GLuint level, GLenum& format, GLenum& type, GLint height,
GLint width);
public:
TextureDownloaderES(bool enable_depth_stencil);
void GetTexImage(GLenum target, GLuint level, GLenum format, const GLenum type, GLint height,
GLint width, void* pixels);
};
} // namespace OpenGL

View file

@ -1,138 +0,0 @@
// Copyright 2019 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
// modified from
// https://github.com/bloc97/Anime4K/blob/533cee5f7018d0e57ad2a26d76d43f13b9d8782a/glsl/Anime4K_Adaptive_v1.0RC2_UltraFast.glsl
// MIT License
//
// Copyright(c) 2019 bloc97
//
// Permission is hereby granted,
// free of charge,
// to any person obtaining a copy of this software and associated documentation
// files(the "Software"),
// to deal in the Software without restriction, including without limitation the rights to use,
// copy, modify, merge, publish, distribute, sublicense, and / or sell copies of the Software,
// and to permit persons to whom the Software is furnished to do so,
// subject to the following conditions :
//
// The above copyright notice and this permission notice shall be included in all copies
// or
// substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS",
// WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
// INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND
// NONINFRINGEMENT.IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#include "video_core/renderer_opengl/texture_filters/anime4k/anime4k_ultrafast.h"
#include "shaders/refine.frag"
#include "shaders/tex_coord.vert"
#include "shaders/x_gradient.frag"
#include "shaders/y_gradient.frag"
namespace OpenGL {
Anime4kUltrafast::Anime4kUltrafast(u16 scale_factor) : TextureFilterBase(scale_factor) {
const OpenGLState cur_state = OpenGLState::GetCurState();
vao.Create();
for (std::size_t idx = 0; idx < samplers.size(); ++idx) {
samplers[idx].Create();
state.texture_units[idx].sampler = samplers[idx].handle;
glSamplerParameteri(samplers[idx].handle, GL_TEXTURE_MIN_FILTER,
idx != 2 ? GL_LINEAR : GL_NEAREST);
glSamplerParameteri(samplers[idx].handle, GL_TEXTURE_MAG_FILTER,
idx != 2 ? GL_LINEAR : GL_NEAREST);
glSamplerParameteri(samplers[idx].handle, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glSamplerParameteri(samplers[idx].handle, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
}
state.draw.vertex_array = vao.handle;
gradient_x_program.Create(tex_coord_vert.data(), x_gradient_frag.data());
gradient_y_program.Create(tex_coord_vert.data(), y_gradient_frag.data());
refine_program.Create(tex_coord_vert.data(), refine_frag.data());
state.draw.shader_program = gradient_y_program.handle;
state.Apply();
glUniform1i(glGetUniformLocation(gradient_y_program.handle, "tex_input"), 2);
state.draw.shader_program = refine_program.handle;
state.Apply();
glUniform1i(glGetUniformLocation(refine_program.handle, "LUMAD"), 1);
cur_state.Apply();
}
void Anime4kUltrafast::Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) {
const OpenGLState cur_state = OpenGLState::GetCurState();
// These will have handles from the previous texture that was filtered, reset them to avoid
// binding invalid textures.
state.texture_units[0].texture_2d = 0;
state.texture_units[1].texture_2d = 0;
state.texture_units[2].texture_2d = 0;
const auto setup_temp_tex = [this, &src_rect](GLint internal_format, GLint format) {
TempTex texture;
texture.fbo.Create();
texture.tex.Create();
state.texture_units[0].texture_2d = texture.tex.handle;
state.draw.draw_framebuffer = texture.fbo.handle;
state.Apply();
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, texture.tex.handle);
glTexStorage2D(GL_TEXTURE_2D, 1, internal_format,
src_rect.GetWidth() * internal_scale_factor,
src_rect.GetHeight() * internal_scale_factor);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D,
texture.tex.handle, 0);
return texture;
};
auto XY = setup_temp_tex(GL_RG16F, GL_RG);
auto LUMAD = setup_temp_tex(GL_R16F, GL_RED);
state.viewport = {static_cast<GLint>(src_rect.left * internal_scale_factor),
static_cast<GLint>(src_rect.bottom * internal_scale_factor),
static_cast<GLsizei>(src_rect.GetWidth() * internal_scale_factor),
static_cast<GLsizei>(src_rect.GetHeight() * internal_scale_factor)};
state.texture_units[0].texture_2d = src_tex.handle;
state.texture_units[1].texture_2d = LUMAD.tex.handle;
state.texture_units[2].texture_2d = XY.tex.handle;
state.draw.draw_framebuffer = XY.fbo.handle;
state.draw.shader_program = gradient_x_program.handle;
state.Apply();
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
// gradient y pass
state.draw.draw_framebuffer = LUMAD.fbo.handle;
state.draw.shader_program = gradient_y_program.handle;
state.Apply();
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
// refine pass
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.draw.draw_framebuffer = draw_fbo.handle;
state.draw.shader_program = refine_program.handle;
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dst_tex.handle,
0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
cur_state.Apply();
}
} // namespace OpenGL

View file

@ -1,38 +0,0 @@
// Copyright 2019 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/texture_filters/texture_filter_base.h"
namespace OpenGL {
class Anime4kUltrafast : public TextureFilterBase {
public:
static constexpr std::string_view NAME = "Anime4K Ultrafast";
explicit Anime4kUltrafast(u16 scale_factor);
void Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) override;
private:
static constexpr u8 internal_scale_factor = 2;
OpenGLState state{};
OGLVertexArray vao;
struct TempTex {
OGLTexture tex;
OGLFramebuffer fbo;
};
std::array<OGLSampler, 3> samplers;
OGLProgram gradient_x_program, gradient_y_program, refine_program;
};
} // namespace OpenGL

View file

@ -1,117 +0,0 @@
//? #version 330
precision mediump float;
in vec2 tex_coord;
out vec4 frag_color;
uniform sampler2D HOOKED;
uniform sampler2D LUMAD;
const float LINE_DETECT_THRESHOLD = 0.4;
const float STRENGTH = 0.6;
// the original shader used the alpha channel for luminance,
// which doesn't work for our use case
struct RGBAL {
vec4 c;
float l;
};
vec4 getAverage(vec4 cc, vec4 a, vec4 b, vec4 c) {
return cc * (1.0 - STRENGTH) + ((a + b + c) / 3.0) * STRENGTH;
}
#define GetRGBAL(x_offset, y_offset) \
RGBAL(textureLodOffset(HOOKED, tex_coord, 0.0, ivec2(x_offset, y_offset)), \
textureLodOffset(LUMAD, tex_coord, 0.0, ivec2(x_offset, y_offset)).x)
float min3v(float a, float b, float c) {
return min(min(a, b), c);
}
float max3v(float a, float b, float c) {
return max(max(a, b), c);
}
vec4 Compute() {
RGBAL cc = GetRGBAL(0, 0);
if (cc.l > LINE_DETECT_THRESHOLD) {
return cc.c;
}
RGBAL tl = GetRGBAL(-1, -1);
RGBAL t = GetRGBAL(0, -1);
RGBAL tr = GetRGBAL(1, -1);
RGBAL l = GetRGBAL(-1, 0);
RGBAL r = GetRGBAL(1, 0);
RGBAL bl = GetRGBAL(-1, 1);
RGBAL b = GetRGBAL(0, 1);
RGBAL br = GetRGBAL(1, 1);
// Kernel 0 and 4
float maxDark = max3v(br.l, b.l, bl.l);
float minLight = min3v(tl.l, t.l, tr.l);
if (minLight > cc.l && minLight > maxDark) {
return getAverage(cc.c, tl.c, t.c, tr.c);
} else {
maxDark = max3v(tl.l, t.l, tr.l);
minLight = min3v(br.l, b.l, bl.l);
if (minLight > cc.l && minLight > maxDark) {
return getAverage(cc.c, br.c, b.c, bl.c);
}
}
// Kernel 1 and 5
maxDark = max3v(cc.l, l.l, b.l);
minLight = min3v(r.l, t.l, tr.l);
if (minLight > maxDark) {
return getAverage(cc.c, r.c, t.c, tr.c);
} else {
maxDark = max3v(cc.l, r.l, t.l);
minLight = min3v(bl.l, l.l, b.l);
if (minLight > maxDark) {
return getAverage(cc.c, bl.c, l.c, b.c);
}
}
// Kernel 2 and 6
maxDark = max3v(l.l, tl.l, bl.l);
minLight = min3v(r.l, br.l, tr.l);
if (minLight > cc.l && minLight > maxDark) {
return getAverage(cc.c, r.c, br.c, tr.c);
} else {
maxDark = max3v(r.l, br.l, tr.l);
minLight = min3v(l.l, tl.l, bl.l);
if (minLight > cc.l && minLight > maxDark) {
return getAverage(cc.c, l.c, tl.c, bl.c);
}
}
// Kernel 3 and 7
maxDark = max3v(cc.l, l.l, t.l);
minLight = min3v(r.l, br.l, b.l);
if (minLight > maxDark) {
return getAverage(cc.c, r.c, br.c, b.c);
} else {
maxDark = max3v(cc.l, r.l, b.l);
minLight = min3v(t.l, l.l, tl.l);
if (minLight > maxDark) {
return getAverage(cc.c, t.c, l.c, tl.c);
}
}
return cc.c;
}
void main() {
frag_color = Compute();
}

View file

@ -1,20 +0,0 @@
//? #version 330
precision mediump float;
in vec2 tex_coord;
out vec2 frag_color;
uniform sampler2D tex_input;
const vec3 K = vec3(0.2627, 0.6780, 0.0593);
// TODO: improve handling of alpha channel
#define GetLum(xoffset) dot(K, textureLodOffset(tex_input, tex_coord, 0.0, ivec2(xoffset, 0)).rgb)
void main() {
float l = GetLum(-1);
float c = GetLum(0);
float r = GetLum(1);
frag_color = vec2(r - l, l + 2.0 * c + r);
}

View file

@ -1,18 +0,0 @@
//? #version 330
precision mediump float;
in vec2 tex_coord;
out float frag_color;
uniform sampler2D tex_input;
void main() {
vec2 t = textureLodOffset(tex_input, tex_coord, 0.0, ivec2(0, 1)).xy;
vec2 c = textureLod(tex_input, tex_coord, 0.0).xy;
vec2 b = textureLodOffset(tex_input, tex_coord, 0.0, ivec2(0, -1)).xy;
vec2 grad = vec2(t.x + 2.0 * c.x + b.x, b.y - t.y);
frag_color = 1.0 - length(grad);
}

View file

@ -1,46 +0,0 @@
// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "video_core/renderer_opengl/texture_filters/bicubic/bicubic.h"
#include "shaders/bicubic.frag"
#include "shaders/tex_coord.vert"
namespace OpenGL {
Bicubic::Bicubic(u16 scale_factor) : TextureFilterBase(scale_factor) {
program.Create(tex_coord_vert.data(), bicubic_frag.data());
vao.Create();
src_sampler.Create();
state.draw.shader_program = program.handle;
state.draw.vertex_array = vao.handle;
state.draw.shader_program = program.handle;
state.texture_units[0].sampler = src_sampler.handle;
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
} // namespace OpenGL
void Bicubic::Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) {
const OpenGLState cur_state = OpenGLState::GetCurState();
state.texture_units[0].texture_2d = src_tex.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dst_tex.handle,
0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
cur_state.Apply();
}
} // namespace OpenGL

View file

@ -1,54 +0,0 @@
//? #version 330
precision mediump float;
in vec2 tex_coord;
out vec4 frag_color;
uniform sampler2D input_texture;
// from http://www.java-gaming.org/index.php?topic=35123.0
vec4 cubic(float v) {
vec4 n = vec4(1.0, 2.0, 3.0, 4.0) - v;
vec4 s = n * n * n;
float x = s.x;
float y = s.y - 4.0 * s.x;
float z = s.z - 4.0 * s.y + 6.0 * s.x;
float w = 6.0 - x - y - z;
return vec4(x, y, z, w) * (1.0 / 6.0);
}
vec4 textureBicubic(sampler2D sampler, vec2 texCoords) {
vec2 texSize = vec2(textureSize(sampler, 0));
vec2 invTexSize = 1.0 / texSize;
texCoords = texCoords * texSize - 0.5;
vec2 fxy = fract(texCoords);
texCoords -= fxy;
vec4 xcubic = cubic(fxy.x);
vec4 ycubic = cubic(fxy.y);
vec4 c = texCoords.xxyy + vec2(-0.5, +1.5).xyxy;
vec4 s = vec4(xcubic.xz + xcubic.yw, ycubic.xz + ycubic.yw);
vec4 offset = c + vec4(xcubic.yw, ycubic.yw) / s;
offset *= invTexSize.xxyy;
vec4 sample0 = texture(sampler, offset.xz);
vec4 sample1 = texture(sampler, offset.yz);
vec4 sample2 = texture(sampler, offset.xw);
vec4 sample3 = texture(sampler, offset.yw);
float sx = s.x / (s.x + s.y);
float sy = s.z / (s.z + s.w);
return mix(mix(sample3, sample2, sx), mix(sample1, sample0, sx), sy);
}
void main() {
frag_color = textureBicubic(input_texture, tex_coord);
}

View file

@ -1,28 +0,0 @@
// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/texture_filters/texture_filter_base.h"
namespace OpenGL {
class Bicubic : public TextureFilterBase {
public:
static constexpr std::string_view NAME = "Bicubic";
explicit Bicubic(u16 scale_factor);
void Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) override;
private:
OpenGLState state{};
OGLProgram program{};
OGLVertexArray vao{};
OGLSampler src_sampler{};
};
} // namespace OpenGL

View file

@ -1,47 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "video_core/renderer_opengl/texture_filters/nearest_neighbor/nearest_neighbor.h"
#include "shaders/nearest_neighbor.frag"
#include "shaders/tex_coord.vert"
namespace OpenGL {
NearestNeighbor::NearestNeighbor(u16 scale_factor) : TextureFilterBase(scale_factor) {
program.Create(tex_coord_vert.data(), nearest_neighbor_frag.data());
vao.Create();
src_sampler.Create();
state.draw.shader_program = program.handle;
state.draw.vertex_array = vao.handle;
state.draw.shader_program = program.handle;
state.texture_units[0].sampler = src_sampler.handle;
// linear minification still makes sense
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
}
void NearestNeighbor::Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) {
const OpenGLState cur_state = OpenGLState::GetCurState();
state.texture_units[0].texture_2d = src_tex.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dst_tex.handle,
0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
cur_state.Apply();
}
} // namespace OpenGL

View file

@ -1,12 +0,0 @@
//? #version 330
precision mediump float;
in vec2 tex_coord;
out vec4 frag_color;
uniform sampler2D input_texture;
void main() {
frag_color = texture(input_texture, tex_coord);
}

View file

@ -1,27 +0,0 @@
// Copyright 2022 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/texture_filters/texture_filter_base.h"
namespace OpenGL {
class NearestNeighbor : public TextureFilterBase {
public:
static constexpr std::string_view NAME = "Nearest Neighbor";
explicit NearestNeighbor(u16 scale_factor);
void Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) override;
private:
OpenGLState state{};
OGLProgram program{};
OGLVertexArray vao{};
OGLSampler src_sampler{};
};
} // namespace OpenGL

View file

@ -1,46 +0,0 @@
// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "video_core/renderer_opengl/texture_filters/scale_force/scale_force.h"
#include "shaders/scale_force.frag"
#include "shaders/tex_coord.vert"
namespace OpenGL {
ScaleForce::ScaleForce(u16 scale_factor) : TextureFilterBase(scale_factor) {
program.Create(tex_coord_vert.data(), scale_force_frag.data());
vao.Create();
src_sampler.Create();
state.draw.shader_program = program.handle;
state.draw.vertex_array = vao.handle;
state.draw.shader_program = program.handle;
state.texture_units[0].sampler = src_sampler.handle;
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
}
void ScaleForce::Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) {
const OpenGLState cur_state = OpenGLState::GetCurState();
state.texture_units[0].texture_2d = src_tex.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dst_tex.handle,
0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
cur_state.Apply();
}
} // namespace OpenGL

View file

@ -1,137 +0,0 @@
//? #version 320 es
// from https://github.com/BreadFish64/ScaleFish/tree/master/scale_force
// MIT License
//
// Copyright (c) 2020 BreadFish64
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.
precision mediump float;
in vec2 tex_coord;
out vec4 frag_color;
uniform sampler2D input_texture;
vec2 tex_size;
vec2 inv_tex_size;
vec4 cubic(float v) {
vec3 n = vec3(1.0, 2.0, 3.0) - v;
vec3 s = n * n * n;
float x = s.x;
float y = s.y - 4.0 * s.x;
float z = s.z - 4.0 * s.y + 6.0 * s.x;
float w = 6.0 - x - y - z;
return vec4(x, y, z, w) / 6.0;
}
// Bicubic interpolation
vec4 textureBicubic(vec2 tex_coords) {
tex_coords = tex_coords * tex_size - 0.5;
vec2 fxy = modf(tex_coords, tex_coords);
vec4 xcubic = cubic(fxy.x);
vec4 ycubic = cubic(fxy.y);
vec4 c = tex_coords.xxyy + vec2(-0.5, +1.5).xyxy;
vec4 s = vec4(xcubic.xz + xcubic.yw, ycubic.xz + ycubic.yw);
vec4 offset = c + vec4(xcubic.yw, ycubic.yw) / s;
offset *= inv_tex_size.xxyy;
vec4 sample0 = textureLod(input_texture, offset.xz, 0.0);
vec4 sample1 = textureLod(input_texture, offset.yz, 0.0);
vec4 sample2 = textureLod(input_texture, offset.xw, 0.0);
vec4 sample3 = textureLod(input_texture, offset.yw, 0.0);
float sx = s.x / (s.x + s.y);
float sy = s.z / (s.z + s.w);
return mix(mix(sample3, sample2, sx), mix(sample1, sample0, sx), sy);
}
mat4x3 center_matrix;
vec4 center_alpha;
// Finds the distance between four colors and cc in YCbCr space
vec4 ColorDist(vec4 A, vec4 B, vec4 C, vec4 D) {
// https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.2020_conversion
const vec3 K = vec3(0.2627, 0.6780, 0.0593);
const float LUMINANCE_WEIGHT = .6;
const mat3 YCBCR_MATRIX =
mat3(K * LUMINANCE_WEIGHT, -.5 * K.r / (1.0 - K.b), -.5 * K.g / (1.0 - K.b), .5, .5,
-.5 * K.g / (1.0 - K.r), -.5 * K.b / (1.0 - K.r));
mat4x3 colors = mat4x3(A.rgb, B.rgb, C.rgb, D.rgb) - center_matrix;
mat4x3 YCbCr = YCBCR_MATRIX * colors;
vec4 color_dist = vec3(1.0) * YCbCr;
color_dist *= color_dist;
vec4 alpha = vec4(A.a, B.a, C.a, D.a);
return sqrt((color_dist + abs(center_alpha - alpha)) * alpha * center_alpha);
}
void main() {
vec4 bl = textureLodOffset(input_texture, tex_coord, 0.0, ivec2(-1, -1));
vec4 bc = textureLodOffset(input_texture, tex_coord, 0.0, ivec2(0, -1));
vec4 br = textureLodOffset(input_texture, tex_coord, 0.0, ivec2(1, -1));
vec4 cl = textureLodOffset(input_texture, tex_coord, 0.0, ivec2(-1, 0));
vec4 cc = textureLod(input_texture, tex_coord, 0.0);
vec4 cr = textureLodOffset(input_texture, tex_coord, 0.0, ivec2(1, 0));
vec4 tl = textureLodOffset(input_texture, tex_coord, 0.0, ivec2(-1, 1));
vec4 tc = textureLodOffset(input_texture, tex_coord, 0.0, ivec2(0, 1));
vec4 tr = textureLodOffset(input_texture, tex_coord, 0.0, ivec2(1, 1));
tex_size = vec2(textureSize(input_texture, 0));
inv_tex_size = 1.0 / tex_size;
center_matrix = mat4x3(cc.rgb, cc.rgb, cc.rgb, cc.rgb);
center_alpha = cc.aaaa;
vec4 offset_tl = ColorDist(tl, tc, tr, cr);
vec4 offset_br = ColorDist(br, bc, bl, cl);
// Calculate how different cc is from the texels around it
float total_dist = dot(offset_tl + offset_br, vec4(1.0));
// Add together all the distances with direction taken into account
vec4 tmp = offset_tl - offset_br;
vec2 total_offset = tmp.wy + tmp.zz + vec2(-tmp.x, tmp.x);
if (total_dist == 0.0) {
// Doing bicubic filtering just past the edges where the offset is 0 causes black floaters
// and it doesn't really matter which filter is used when the colors aren't changing.
frag_color = cc;
} else {
// When the image has thin points, they tend to split apart.
// This is because the texels all around are different
// and total_offset reaches into clear areas.
// This works pretty well to keep the offset in bounds for these cases.
float clamp_val = length(total_offset) / total_dist;
vec2 final_offset = clamp(total_offset, -clamp_val, clamp_val) * inv_tex_size;
frag_color = textureBicubic(tex_coord - final_offset);
}
}

View file

@ -1,28 +0,0 @@
// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/texture_filters/texture_filter_base.h"
namespace OpenGL {
class ScaleForce : public TextureFilterBase {
public:
static constexpr std::string_view NAME = "ScaleForce";
explicit ScaleForce(u16 scale_factor);
void Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) override;
private:
OpenGLState state{};
OGLProgram program{};
OGLVertexArray vao{};
OGLSampler src_sampler{};
};
} // namespace OpenGL

View file

@ -1,10 +0,0 @@
//? #version 330
out vec2 tex_coord;
const vec2 vertices[4] =
vec2[4](vec2(-1.0, -1.0), vec2(1.0, -1.0), vec2(-1.0, 1.0), vec2(1.0, 1.0));
void main() {
gl_Position = vec4(vertices[gl_VertexID], 0.0, 1.0);
tex_coord = (vertices[gl_VertexID] + 1.0) / 2.0;
}

View file

@ -1,35 +0,0 @@
// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <string_view>
#include "common/common_types.h"
#include "common/math_util.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
namespace OpenGL {
class TextureRuntime;
class OGLTexture;
class TextureFilterBase {
friend class TextureFilterer;
public:
explicit TextureFilterBase(u16 scale_factor) : scale_factor(scale_factor) {
draw_fbo.Create();
};
virtual ~TextureFilterBase() = default;
private:
virtual void Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) = 0;
protected:
OGLFramebuffer draw_fbo;
const u16 scale_factor{};
};
} // namespace OpenGL

View file

@ -1,92 +0,0 @@
/// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <algorithm>
#include <functional>
#include <unordered_map>
#include "common/logging/log.h"
#include "video_core/renderer_opengl/texture_filters/anime4k/anime4k_ultrafast.h"
#include "video_core/renderer_opengl/texture_filters/bicubic/bicubic.h"
#include "video_core/renderer_opengl/texture_filters/nearest_neighbor/nearest_neighbor.h"
#include "video_core/renderer_opengl/texture_filters/scale_force/scale_force.h"
#include "video_core/renderer_opengl/texture_filters/texture_filter_base.h"
#include "video_core/renderer_opengl/texture_filters/texture_filterer.h"
#include "video_core/renderer_opengl/texture_filters/xbrz/xbrz_freescale.h"
namespace OpenGL {
namespace {
using TextureFilterContructor = std::function<std::unique_ptr<TextureFilterBase>(u16)>;
template <typename T>
std::pair<std::string_view, TextureFilterContructor> FilterMapPair() {
return {T::NAME, std::make_unique<T, u16>};
};
static const std::unordered_map<std::string_view, TextureFilterContructor> filter_map{
{TextureFilterer::NONE, [](u16) { return nullptr; }},
FilterMapPair<Anime4kUltrafast>(),
FilterMapPair<Bicubic>(),
FilterMapPair<NearestNeighbor>(),
FilterMapPair<ScaleForce>(),
FilterMapPair<XbrzFreescale>(),
};
} // namespace
TextureFilterer::TextureFilterer(std::string_view filter_name, u16 scale_factor) {
Reset(filter_name, scale_factor);
}
bool TextureFilterer::Reset(std::string_view new_filter_name, u16 new_scale_factor) {
if (filter_name == new_filter_name && (IsNull() || filter->scale_factor == new_scale_factor))
return false;
auto iter = filter_map.find(new_filter_name);
if (iter == filter_map.end()) {
LOG_ERROR(Render_OpenGL, "Invalid texture filter: {}", new_filter_name);
filter = nullptr;
return true;
}
filter_name = iter->first;
filter = iter->second(new_scale_factor);
return true;
}
bool TextureFilterer::IsNull() const {
return !filter;
}
bool TextureFilterer::Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect,
SurfaceType type) {
// Depth/Stencil texture filtering is not supported for now
if (IsNull() || (type != SurfaceType::Color && type != SurfaceType::Texture)) {
return false;
}
filter->Filter(src_tex, src_rect, dst_tex, dst_rect);
return true;
}
std::vector<std::string_view> TextureFilterer::GetFilterNames() {
std::vector<std::string_view> ret;
std::transform(filter_map.begin(), filter_map.end(), std::back_inserter(ret),
[](auto pair) { return pair.first; });
std::sort(ret.begin(), ret.end(), [](std::string_view lhs, std::string_view rhs) {
// sort lexicographically with none at the top
bool lhs_is_none{lhs == NONE};
bool rhs_is_none{rhs == NONE};
if (lhs_is_none || rhs_is_none)
return lhs_is_none && !rhs_is_none;
return lhs < rhs;
});
return ret;
}
} // namespace OpenGL

View file

@ -1,39 +0,0 @@
// Copyright 2020 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <memory>
#include <string_view>
#include <vector>
#include "video_core/rasterizer_cache/pixel_format.h"
#include "video_core/renderer_opengl/texture_filters/texture_filter_base.h"
namespace OpenGL {
class TextureFilterer {
public:
static constexpr std::string_view NONE = "Linear (Default)";
public:
explicit TextureFilterer(std::string_view filter_name, u16 scale_factor);
// Returns true if the filter actually changed
bool Reset(std::string_view new_filter_name, u16 new_scale_factor);
// Returns true if there is no active filter
bool IsNull() const;
// Returns true if the texture was able to be filtered
bool Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect, SurfaceType type);
static std::vector<std::string_view> GetFilterNames();
private:
std::string_view filter_name = NONE;
std::unique_ptr<TextureFilterBase> filter;
};
} // namespace OpenGL

View file

@ -1,94 +0,0 @@
// Copyright 2019 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
// adapted from
// https://github.com/libretro/glsl-shaders/blob/d7a8b8eb2a61a5732da4cbe2e0f9ad30600c3f17/xbrz/shaders/xbrz-freescale.glsl
// xBRZ freescale
// based on :
// 4xBRZ shader - Copyright (C) 2014-2016 DeSmuME team
//
// This file is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 2 of the License, or
// (at your option) any later version.
//
// This file is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with the this software. If not, see <http://www.gnu.org/licenses/>.
// Hyllian's xBR-vertex code and texel mapping
// Copyright (C) 2011/2016 Hyllian - sergiogdb@gmail.com
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
#include "video_core/renderer_opengl/texture_filters/xbrz/xbrz_freescale.h"
#include "shaders/xbrz_freescale.frag"
#include "shaders/xbrz_freescale.vert"
namespace OpenGL {
XbrzFreescale::XbrzFreescale(u16 scale_factor) : TextureFilterBase(scale_factor) {
const OpenGLState cur_state = OpenGLState::GetCurState();
program.Create(xbrz_freescale_vert.data(), xbrz_freescale_frag.data());
vao.Create();
src_sampler.Create();
state.draw.shader_program = program.handle;
state.Apply();
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glSamplerParameteri(src_sampler.handle, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
const GLint scale_loc = glGetUniformLocation(program.handle, "scale");
glUniform1f(scale_loc, static_cast<GLfloat>(scale_factor));
cur_state.Apply();
state.draw.vertex_array = vao.handle;
state.draw.shader_program = program.handle;
state.texture_units[0].sampler = src_sampler.handle;
}
void XbrzFreescale::Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) {
const OpenGLState cur_state = OpenGLState::GetCurState();
state.texture_units[0].texture_2d = src_tex.handle;
state.draw.draw_framebuffer = draw_fbo.handle;
state.viewport = {static_cast<GLint>(dst_rect.left), static_cast<GLint>(dst_rect.bottom),
static_cast<GLsizei>(dst_rect.GetWidth()),
static_cast<GLsizei>(dst_rect.GetHeight())};
state.Apply();
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, dst_tex.handle,
0);
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
cur_state.Apply();
}
} // namespace OpenGL

View file

@ -1,244 +0,0 @@
//? #version 330
precision mediump float;
in vec2 tex_coord;
in vec2 source_size;
in vec2 output_size;
out vec4 frag_color;
uniform sampler2D tex;
uniform lowp float scale;
const int BLEND_NONE = 0;
const int BLEND_NORMAL = 1;
const int BLEND_DOMINANT = 2;
const float LUMINANCE_WEIGHT = 1.0;
const float EQUAL_COLOR_TOLERANCE = 30.0 / 255.0;
const float STEEP_DIRECTION_THRESHOLD = 2.2;
const float DOMINANT_DIRECTION_THRESHOLD = 3.6;
float ColorDist(vec4 a, vec4 b) {
// https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.2020_conversion
const vec3 K = vec3(0.2627, 0.6780, 0.0593);
const mat3 MATRIX = mat3(K, -.5 * K.r / (1.0 - K.b), -.5 * K.g / (1.0 - K.b), .5, .5,
-.5 * K.g / (1.0 - K.r), -.5 * K.b / (1.0 - K.r));
vec4 diff = a - b;
vec3 YCbCr = diff.rgb * MATRIX;
// LUMINANCE_WEIGHT is currently 1, otherwise y would be multiplied by it
float d = length(YCbCr);
return sqrt(a.a * b.a * d * d + diff.a * diff.a);
}
bool IsPixEqual(const vec4 pixA, const vec4 pixB) {
return ColorDist(pixA, pixB) < EQUAL_COLOR_TOLERANCE;
}
float GetLeftRatio(vec2 center, vec2 origin, vec2 direction) {
vec2 P0 = center - origin;
vec2 proj = direction * (dot(P0, direction) / dot(direction, direction));
vec2 distv = P0 - proj;
vec2 orth = vec2(-direction.y, direction.x);
float side = sign(dot(P0, orth));
float v = side * length(distv * scale);
return smoothstep(-sqrt(2.0) / 2.0, sqrt(2.0) / 2.0, v);
}
#define P(x, y) textureOffset(tex, coord, ivec2(x, y))
void main() {
vec2 pos = fract(tex_coord * source_size) - vec2(0.5, 0.5);
vec2 coord = tex_coord - pos / source_size;
//---------------------------------------
// Input Pixel Mapping: -|x|x|x|-
// x|A|B|C|x
// x|D|E|F|x
// x|G|H|I|x
// -|x|x|x|-
vec4 A = P(-1, -1);
vec4 B = P(0, -1);
vec4 C = P(1, -1);
vec4 D = P(-1, 0);
vec4 E = P(0, 0);
vec4 F = P(1, 0);
vec4 G = P(-1, 1);
vec4 H = P(0, 1);
vec4 I = P(1, 1);
// blendResult Mapping: x|y|
// w|z|
ivec4 blendResult = ivec4(BLEND_NONE, BLEND_NONE, BLEND_NONE, BLEND_NONE);
// Preprocess corners
// Pixel Tap Mapping: -|-|-|-|-
// -|-|B|C|-
// -|D|E|F|x
// -|G|H|I|x
// -|-|x|x|-
if (!((E == F && H == I) || (E == H && F == I))) {
float dist_H_F = ColorDist(G, E) + ColorDist(E, C) + ColorDist(P(0, 2), I) +
ColorDist(I, P(2, 0)) + (4.0 * ColorDist(H, F));
float dist_E_I = ColorDist(D, H) + ColorDist(H, P(1, 2)) + ColorDist(B, F) +
ColorDist(F, P(2, 1)) + (4.0 * ColorDist(E, I));
bool dominantGradient = (DOMINANT_DIRECTION_THRESHOLD * dist_H_F) < dist_E_I;
blendResult.z = ((dist_H_F < dist_E_I) && E != F && E != H)
? ((dominantGradient) ? BLEND_DOMINANT : BLEND_NORMAL)
: BLEND_NONE;
}
// Pixel Tap Mapping: -|-|-|-|-
// -|A|B|-|-
// x|D|E|F|-
// x|G|H|I|-
// -|x|x|-|-
if (!((D == E && G == H) || (D == G && E == H))) {
float dist_G_E = ColorDist(P(-2, 1), D) + ColorDist(D, B) + ColorDist(P(-1, 2), H) +
ColorDist(H, F) + (4.0 * ColorDist(G, E));
float dist_D_H = ColorDist(P(-2, 0), G) + ColorDist(G, P(0, 2)) + ColorDist(A, E) +
ColorDist(E, I) + (4.0 * ColorDist(D, H));
bool dominantGradient = (DOMINANT_DIRECTION_THRESHOLD * dist_D_H) < dist_G_E;
blendResult.w = ((dist_G_E > dist_D_H) && E != D && E != H)
? ((dominantGradient) ? BLEND_DOMINANT : BLEND_NORMAL)
: BLEND_NONE;
}
// Pixel Tap Mapping: -|-|x|x|-
// -|A|B|C|x
// -|D|E|F|x
// -|-|H|I|-
// -|-|-|-|-
if (!((B == C && E == F) || (B == E && C == F))) {
float dist_E_C = ColorDist(D, B) + ColorDist(B, P(1, -2)) + ColorDist(H, F) +
ColorDist(F, P(2, -1)) + (4.0 * ColorDist(E, C));
float dist_B_F = ColorDist(A, E) + ColorDist(E, I) + ColorDist(P(0, -2), C) +
ColorDist(C, P(2, 0)) + (4.0 * ColorDist(B, F));
bool dominantGradient = (DOMINANT_DIRECTION_THRESHOLD * dist_B_F) < dist_E_C;
blendResult.y = ((dist_E_C > dist_B_F) && E != B && E != F)
? ((dominantGradient) ? BLEND_DOMINANT : BLEND_NORMAL)
: BLEND_NONE;
}
// Pixel Tap Mapping: -|x|x|-|-
// x|A|B|C|-
// x|D|E|F|-
// -|G|H|-|-
// -|-|-|-|-
if (!((A == B && D == E) || (A == D && B == E))) {
float dist_D_B = ColorDist(P(-2, 0), A) + ColorDist(A, P(0, -2)) + ColorDist(G, E) +
ColorDist(E, C) + (4.0 * ColorDist(D, B));
float dist_A_E = ColorDist(P(-2, -1), D) + ColorDist(D, H) + ColorDist(P(-1, -2), B) +
ColorDist(B, F) + (4.0 * ColorDist(A, E));
bool dominantGradient = (DOMINANT_DIRECTION_THRESHOLD * dist_D_B) < dist_A_E;
blendResult.x = ((dist_D_B < dist_A_E) && E != D && E != B)
? ((dominantGradient) ? BLEND_DOMINANT : BLEND_NORMAL)
: BLEND_NONE;
}
vec4 res = E;
// Pixel Tap Mapping: -|-|-|-|-
// -|-|B|C|-
// -|D|E|F|x
// -|G|H|I|x
// -|-|x|x|-
if (blendResult.z != BLEND_NONE) {
float dist_F_G = ColorDist(F, G);
float dist_H_C = ColorDist(H, C);
bool doLineBlend = (blendResult.z == BLEND_DOMINANT ||
!((blendResult.y != BLEND_NONE && !IsPixEqual(E, G)) ||
(blendResult.w != BLEND_NONE && !IsPixEqual(E, C)) ||
(IsPixEqual(G, H) && IsPixEqual(H, I) && IsPixEqual(I, F) &&
IsPixEqual(F, C) && !IsPixEqual(E, I))));
vec2 origin = vec2(0.0, 1.0 / sqrt(2.0));
vec2 direction = vec2(1.0, -1.0);
if (doLineBlend) {
bool haveShallowLine =
(STEEP_DIRECTION_THRESHOLD * dist_F_G <= dist_H_C) && E != G && D != G;
bool haveSteepLine =
(STEEP_DIRECTION_THRESHOLD * dist_H_C <= dist_F_G) && E != C && B != C;
origin = haveShallowLine ? vec2(0.0, 0.25) : vec2(0.0, 0.5);
direction.x += haveShallowLine ? 1.0 : 0.0;
direction.y -= haveSteepLine ? 1.0 : 0.0;
}
vec4 blendPix = mix(H, F, step(ColorDist(E, F), ColorDist(E, H)));
res = mix(res, blendPix, GetLeftRatio(pos, origin, direction));
}
// Pixel Tap Mapping: -|-|-|-|-
// -|A|B|-|-
// x|D|E|F|-
// x|G|H|I|-
// -|x|x|-|-
if (blendResult.w != BLEND_NONE) {
float dist_H_A = ColorDist(H, A);
float dist_D_I = ColorDist(D, I);
bool doLineBlend = (blendResult.w == BLEND_DOMINANT ||
!((blendResult.z != BLEND_NONE && !IsPixEqual(E, A)) ||
(blendResult.x != BLEND_NONE && !IsPixEqual(E, I)) ||
(IsPixEqual(A, D) && IsPixEqual(D, G) && IsPixEqual(G, H) &&
IsPixEqual(H, I) && !IsPixEqual(E, G))));
vec2 origin = vec2(-1.0 / sqrt(2.0), 0.0);
vec2 direction = vec2(1.0, 1.0);
if (doLineBlend) {
bool haveShallowLine =
(STEEP_DIRECTION_THRESHOLD * dist_H_A <= dist_D_I) && E != A && B != A;
bool haveSteepLine =
(STEEP_DIRECTION_THRESHOLD * dist_D_I <= dist_H_A) && E != I && F != I;
origin = haveShallowLine ? vec2(-0.25, 0.0) : vec2(-0.5, 0.0);
direction.y += haveShallowLine ? 1.0 : 0.0;
direction.x += haveSteepLine ? 1.0 : 0.0;
}
origin = origin;
direction = direction;
vec4 blendPix = mix(H, D, step(ColorDist(E, D), ColorDist(E, H)));
res = mix(res, blendPix, GetLeftRatio(pos, origin, direction));
}
// Pixel Tap Mapping: -|-|x|x|-
// -|A|B|C|x
// -|D|E|F|x
// -|-|H|I|-
// -|-|-|-|-
if (blendResult.y != BLEND_NONE) {
float dist_B_I = ColorDist(B, I);
float dist_F_A = ColorDist(F, A);
bool doLineBlend = (blendResult.y == BLEND_DOMINANT ||
!((blendResult.x != BLEND_NONE && !IsPixEqual(E, I)) ||
(blendResult.z != BLEND_NONE && !IsPixEqual(E, A)) ||
(IsPixEqual(I, F) && IsPixEqual(F, C) && IsPixEqual(C, B) &&
IsPixEqual(B, A) && !IsPixEqual(E, C))));
vec2 origin = vec2(1.0 / sqrt(2.0), 0.0);
vec2 direction = vec2(-1.0, -1.0);
if (doLineBlend) {
bool haveShallowLine =
(STEEP_DIRECTION_THRESHOLD * dist_B_I <= dist_F_A) && E != I && H != I;
bool haveSteepLine =
(STEEP_DIRECTION_THRESHOLD * dist_F_A <= dist_B_I) && E != A && D != A;
origin = haveShallowLine ? vec2(0.25, 0.0) : vec2(0.5, 0.0);
direction.y -= haveShallowLine ? 1.0 : 0.0;
direction.x -= haveSteepLine ? 1.0 : 0.0;
}
vec4 blendPix = mix(F, B, step(ColorDist(E, B), ColorDist(E, F)));
res = mix(res, blendPix, GetLeftRatio(pos, origin, direction));
}
// Pixel Tap Mapping: -|x|x|-|-
// x|A|B|C|-
// x|D|E|F|-
// -|G|H|-|-
// -|-|-|-|-
if (blendResult.x != BLEND_NONE) {
float dist_D_C = ColorDist(D, C);
float dist_B_G = ColorDist(B, G);
bool doLineBlend = (blendResult.x == BLEND_DOMINANT ||
!((blendResult.w != BLEND_NONE && !IsPixEqual(E, C)) ||
(blendResult.y != BLEND_NONE && !IsPixEqual(E, G)) ||
(IsPixEqual(C, B) && IsPixEqual(B, A) && IsPixEqual(A, D) &&
IsPixEqual(D, G) && !IsPixEqual(E, A))));
vec2 origin = vec2(0.0, -1.0 / sqrt(2.0));
vec2 direction = vec2(-1.0, 1.0);
if (doLineBlend) {
bool haveShallowLine =
(STEEP_DIRECTION_THRESHOLD * dist_D_C <= dist_B_G) && E != C && F != C;
bool haveSteepLine =
(STEEP_DIRECTION_THRESHOLD * dist_B_G <= dist_D_C) && E != G && H != G;
origin = haveShallowLine ? vec2(0.0, -0.25) : vec2(0.0, -0.5);
direction.x -= haveShallowLine ? 1.0 : 0.0;
direction.y += haveSteepLine ? 1.0 : 0.0;
}
vec4 blendPix = mix(D, B, step(ColorDist(E, B), ColorDist(E, D)));
res = mix(res, blendPix, GetLeftRatio(pos, origin, direction));
}
frag_color = res;
}

View file

@ -1,26 +0,0 @@
// Copyright 2019 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/texture_filters/texture_filter_base.h"
namespace OpenGL {
class XbrzFreescale : public TextureFilterBase {
public:
static constexpr std::string_view NAME = "xBRZ freescale";
explicit XbrzFreescale(u16 scale_factor);
void Filter(const OGLTexture& src_tex, Common::Rectangle<u32> src_rect,
const OGLTexture& dst_tex, Common::Rectangle<u32> dst_rect) override;
private:
OpenGLState state{};
OGLProgram program{};
OGLVertexArray vao{};
OGLSampler src_sampler{};
};
} // namespace OpenGL

View file

@ -1,17 +0,0 @@
//? #version 330
out vec2 tex_coord;
out vec2 source_size;
out vec2 output_size;
uniform sampler2D tex;
uniform lowp float scale;
const vec2 vertices[4] =
vec2[4](vec2(-1.0, -1.0), vec2(1.0, -1.0), vec2(-1.0, 1.0), vec2(1.0, 1.0));
void main() {
gl_Position = vec4(vertices[gl_VertexID], 0.0, 1.0);
tex_coord = (vertices[gl_VertexID] + 1.0) / 2.0;
source_size = vec2(textureSize(tex, 0));
output_size = source_size * scale;
}