
Links and Guides to Custom Shaders for Pete's OpenGL2 plugin

Hey, great job KrossX and SimoneT! Tks!

I recommend using 5xBR, which is faster and works well at 3x and 5x scale factor.

Besides, there's a "b" variant of that shader, which preserves fonts a bit better. You can get this variant by changing a single line in that shader:

Replace this:
Code:
	interp_restriction_lv1      = ((e!=f) && (e!=h));
With this:

Code:
	interp_restriction_lv1      = ((e!=f) && (e!=h) &&  ( f!=b && h!=d || e==i && f!=i4 && h!=i5 || e==g || e==c ) );
It's a bit slower, though. That "b" variant was made available in XML shader format in this thread.



Some screenshots of 5xBR running with 2D psx games would be interesting! :p

I'd like to know if this filter could be applied over textures of 3D games.


EDIT: Hey, that ColourLength function! I didn't see that! I'll look at what you did and possibly use it in my Cg shader to speed things up! :pP

BTW: I've made this bicubic fast shader you can port (if you like) to ePSXe: bicubic-fast.cg
 
I think it would be a good filter to have as a texture filter.
It seems alright. This "b" version is better for 2D RPGs where you need to read many text boxes.

It would be better to take screenshots at a perfect integer scale. So, as the PSX native resolution is 320x240, you'd need to take shots at 1600x1200 to get a perfect 5x scale factor.

For example, this is what I get when scaling Super Metroid (SNES) by exactly 5x: Super Metroid
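The integer-scale arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and 320x240 is simply the native size quoted in the post.

```python
def integer_scale_resolution(native_width, native_height, factor):
    """Return the output size for an exact integer scale of the native frame."""
    if factor < 1 or int(factor) != factor:
        raise ValueError("factor must be a positive integer")
    return native_width * int(factor), native_height * int(factor)


def exact_scale_factor(native, output):
    """Return the integer factor if output is an exact multiple of native, else None."""
    (nw, nh), (ow, oh) = native, output
    if ow % nw == 0 and oh % nh == 0 and ow // nw == oh // nh:
        return ow // nw
    return None


# 320x240 scaled 5x, as in the post above
size_5x = integer_scale_resolution(320, 240, 5)
```

A screenshot size such as 1920x1080 would return None from exact_scale_factor((320, 240), (1920, 1080)), meaning the scaler has to resample unevenly.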
 
Hey, I've tried those optimizations (ColourLength(), RGBtoHSV()) here in Cg, but it's 90% slower than my original code using matrices.

Besides, I forgot to delete some of those RGBtoYUV usages (LOL). That went unnoticed because my Cg compiler is smart enough to ignore the ones that are never used afterwards! :p

I'm using NVShaderPerf to profile here.
 
I have found a bug in your cg shader...

Ciao.
Thanks for the info. Maybe it's a problem on some graphics cards; on PS3 there's no bug of this sort. I think the profile is different for each piece of hardware and I can only test on PS3. I don't think there's a problem with that implicit cast, it's just the compiler complaining about something not relevant.

Have you noticed any graphical differences using this original code over your glsl code?
 
Your original code does dot(vec3(0.299, 0.587, 0.114),color), which is the luminance (Y) in the YIQ or YUV color space. It's faster than my code but it introduces some artifacts in some cases. My code uses an algorithm from http://www.compuphase.com/cmetric.htm, which is slower but more precise.
About the compiler warning, the function mul(half3x3,half3) returns a half3, not a float... ciao.
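To illustrate the trade-off described above, here is a small Python sketch comparing a luminance-only difference with the low-cost "redmean" approximation given on the compuphase page. Whether SimoneT's shader uses exactly this formula is an assumption; the function names are hypothetical and channels are taken as 0..255.

```python
import math

# Luminance weights quoted in the post above
Y_WEIGHTS = (0.299, 0.587, 0.114)


def luma(color):
    """Weighted sum of R, G, B: the dot() product from the shader."""
    return sum(w * c for w, c in zip(Y_WEIGHTS, color))


def luma_distance(c1, c2):
    """Difference in luminance only; cheap, but blind to pure hue changes."""
    return abs(luma(c1) - luma(c2))


def redmean_distance(c1, c2):
    """Weighted RGB distance whose red/blue weights depend on the mean red level."""
    rmean = (c1[0] + c2[0]) / 2.0
    dr, dg, db = (a - b for a, b in zip(c1, c2))
    return math.sqrt((2 + rmean / 256) * dr * dr
                     + 4 * dg * dg
                     + (2 + (255 - rmean) / 256) * db * db)


# Pure red versus a gray of the same luminance
red = (255.0, 0.0, 0.0)
gray = (luma(red),) * 3
```

For this pair the luminance difference is practically zero while the redmean distance is large, which is the kind of case where a luminance-only test can treat two clearly different colors as equal.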
Thanks for the links.

Those values for luminance are from the HQx implementations, which are the same as in the ITU (International Telecommunication Union) standards. I'll read up on your improvements and see if I can bring them into the Cg version.

It seems a very subtle bug, indeed. I'll see if it's worth fixing. Besides, if it doesn't harm the image too much and speed is a concern, then we can call it some kind of speed hack. :heh:

EDIT: That mul returns half3, but the abs function turns it into half (or float?), which is then converted to float.

There's another implementation for ePSXe. Could you look into the way Guest made it? It seems he got some optimizations too: Another xBR for ePSXe
 
The abs function returns the absolute value. Ex: abs(-0.54675) = 0.54675. The Cg compiler returns the first value of the half3. Ex: float3(0.1,0.2,0.3) = half(0.1). In your case (Cg compiler) it isn't a problem but, for example, the AMD_ATI parser will return an error. The NVIDIA GLSL compiler uses the Cg compiler to translate the GLSL shader to ASM. That's why it reports only a warning and not an error.
Humm, I didn't see that. I have to look at how to fix that without hurting speed...
 
Code:
float4 RGBtoYUV(half4x3 mat_color)
{
	float4 a = mul(mat_color, yuv_weighted[0]);
	return a;
}

Ciao.
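As a rough CPU-side picture of why the RGBtoYUV function above returns four values: multiplying a 4x3 matrix (four colors, one per row) by a 3-component weight vector gives one weighted sum per row. In this Python sketch, mul_4x3_by_3 is a hypothetical stand-in for Cg's mul, and the weights are assumed to be the plain luminance coefficients mentioned earlier in the thread, not necessarily the contents of yuv_weighted[0].

```python
# Assumed weights; the real yuv_weighted[0] row may be scaled differently
Y_WEIGHTS = (0.299, 0.587, 0.114)


def mul_4x3_by_3(matrix, vector):
    """Matrix-vector product: one dot product per row of the matrix."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


# Four colors packed as the rows of a 4x3 matrix
colors = (
    (1.0, 1.0, 1.0),  # white
    (0.0, 0.0, 0.0),  # black
    (1.0, 0.0, 0.0),  # red
    (0.0, 1.0, 0.0),  # green
)

# One luminance value per input color, all computed by a single mul
lumas = mul_4x3_by_3(colors, Y_WEIGHTS)
```

The result has four components, so storing it in a float4 needs no implicit narrowing.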
It worked! And there wasn't a speed hit! :thumb:

EDIT: Humm... it gave me an idea to use swizzle operators. Let's see...

EDIT 2: IT WORKED!! Now using swizzling it's 27% faster! Hey SimoneT, I love you! :pPP

EDIT 3: Here are the 3.6 versions -> xBR 3.6. There are three variants according to how corners are treated: a, b and c. "a" is rounded, "b" is semi-rounded and "c" is squared. You'll only see what I mean by testing them.
 
Yes, but I think the "a" version is good enough for most PSX games (and gamers). I'm working on a version that permits the use of the internal screen filtering. When I think it's ready I'll send you a PM.
Ciao.
Ok. Does the 'internal screen filtering' work over textures?

P.S.: have you tested my "vertex calculation" trick?
Not yet. But I've skimmed over it. It seems you know some clever shader tricks. :wub:

I'll try to translate to Cg later and see what happens. :thumb:
 
SimoneT, why don't you use variables instead of assuming a resolution of 1024x512?

I'll use something like this in my code (not tested yet):

Code:
	half2 ps = half2(1.0/IN.texture_size.x, 1.0/IN.texture_size.y);
	half dx  = ps.x;
	half dy  = ps.y;

//    A1 B1 C1
// A0  A  B  C C4
// D0  D  E  F F4
// G0  G  H  I I4
//    G5 H5 I5

	OUT.texCoord = texCoord;
	OUT.t1 = texCoord.xxxy + half4( -dx, 0, dx,-2.0*dy); // A1 B1 C1
	OUT.t2 = texCoord.xxxy + half4( -dx, 0, dx,    -dy); //  A  B  C 
	OUT.t3 = texCoord.xxxy + half4( -dx, 0, dx,      0); //  D  E  F 
	OUT.t4 = texCoord.xxxy + half4( -dx, 0, dx,     dy); //  G  H  I 
	OUT.t5 = texCoord.xxxy + half4( -dx, 0, dx, 2.0*dy); // G5 H5 I5
	OUT.t6 = texCoord.xyyy + half4(-2.0*dx,-dy, 0,  dy); // A0 D0 G0
	OUT.t7 = texCoord.xyyy + half4( 2.0*dx,-dy, 0,  dy); // C4 F4 I4
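
The same offset arithmetic can be checked outside the shader. The Python sketch below mirrors the t1..t7 layout from the code above; neighbor_coords is a hypothetical helper, and each tuple keeps the packing used in the shader (three x values plus one y for t1..t5, one x plus three y values for t6 and t7).

```python
def neighbor_coords(tex_coord, texture_size):
    """Compute the packed neighbor coordinates from the texture size."""
    x, y = tex_coord
    dx = 1.0 / texture_size[0]  # width of one texel
    dy = 1.0 / texture_size[1]  # height of one texel
    coords = {}
    # t1..t5: rows from two texels above to two texels below (x-dx, x, x+dx, row y)
    for name, row in zip(("t1", "t2", "t3", "t4", "t5"), (-2, -1, 0, 1, 2)):
        coords[name] = (x - dx, x, x + dx, y + row * dy)
    # t6 and t7: outer columns two texels left/right (column x, y-dy, y, y+dy)
    coords["t6"] = (x - 2 * dx, y - dy, y, y + dy)
    coords["t7"] = (x + 2 * dx, y - dy, y, y + dy)
    return coords


# With a 1024x512 texture the step is 1/1024 horizontally and 1/512 vertically
at_1024x512 = neighbor_coords((0.5, 0.5), (1024, 512))
at_512x256 = neighbor_coords((0.5, 0.5), (512, 256))
```

Halving the texture size doubles every offset, which is what a hard-coded 1024x512 would get wrong.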
 
SimoneT, I've seen you changed this code:

Code:
	half3 res = nc.x ? px.x ? F : H : nc.y ? px.y ? B : F : nc.z ? px.z ? D : B : nc.w ? px.w ? H : D : E;
To this:

Code:
if ((nc.x && px.x) || (nc.y && !px.y)) 
{
	E = F;
} 
else
if ((nc.y && px.y) || (nc.z && !px.z)) 
{
	E = B;
} 
else
if ((nc.z && px.z) || (nc.w && !px.w)) 
{
	E = D;
} 
else
if ((nc.w && px.w) || (nc.x && !px.x)) 
{
	E = H;
}
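This is wrong, because you changed the priorities. In the original, nc.x is tested first and settles the result on its own; in the new code, a set nc.y with px.y unset selects F before the nc.x case (H) is even reached. I expect some small artifacts to appear now.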
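
The difference between the two snippets above can be checked by brute force. This Python sketch transcribes both versions, using letters in place of the pixel colors and plain 4-tuples of booleans for nc and px; the function names are made up for the illustration.

```python
from itertools import product


def pick_original(nc, px):
    """Nested ternary chain: the first set nc flag decides, px picks the side."""
    if nc[0]:
        return "F" if px[0] else "H"
    if nc[1]:
        return "B" if px[1] else "F"
    if nc[2]:
        return "D" if px[2] else "B"
    if nc[3]:
        return "H" if px[3] else "D"
    return "E"


def pick_rewritten(nc, px):
    """If/else chain grouped by result color instead of by nc flag."""
    if (nc[0] and px[0]) or (nc[1] and not px[1]):
        return "F"
    if (nc[1] and px[1]) or (nc[2] and not px[2]):
        return "B"
    if (nc[2] and px[2]) or (nc[3] and not px[3]):
        return "D"
    if (nc[3] and px[3]) or (nc[0] and not px[0]):
        return "H"
    return "E"


def mismatches():
    """All (nc, px) combinations where the two versions pick different pixels."""
    flags = list(product((False, True), repeat=4))
    return [(nc, px) for nc in flags for px in flags
            if pick_original(nc, px) != pick_rewritten(nc, px)]
```

The two versions agree whenever at most one nc flag is set and only diverge when several are set at once, for example nc.x and nc.y both set with px.x and px.y unset (H in the original, F in the rewrite).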
 
I don't know if we can assume it always happens based on only one example picture.

What I was trying to demonstrate is that those pieces of code weren't equivalent anymore. We can test more, and if it becomes clear your optimizations fix some defects, then I will incorporate them into the Cg code. Improvements are always welcome!
 
Ok. I will test with more games and emulators (I have ported your shader to bsnes but I haven't released it...) and I will post some more screenshots.
Great. This code is so tight I always expect artifacts when something is changed. :D:D


Bsnes? Someone else already made it in this thread: xml shader

Maybe you wanna contribute there.

EDIT: Your implementation seems a bit different from mine. This is what I'm getting with my Cg shader on PS3:
 