Project64 Forums > General Discussion > Open Discussion

  #1241  
Old 22nd July 2014, 12:15 AM
RPGMaster RPGMaster is offline
Alpha Tester
Project Supporter
Super Moderator
 
Join Date: Dec 2013
Posts: 2,008
Default

Quote:
Originally Posted by GPDP View Post
Fair warning, though: it's quite prone to crashes, and despite using the Zilmar spec, a lot of plugins do not work or make the emulator crash. I can confirm this plugin as well as HatCat's Static Interpreter RSP plugin both work. Not sure about a working audio plugin, but I honestly didn't try to look for one very hard.
So I downloaded it, and ran into a problem. Since no HLE video plugin I've tried works, I try to use z64gl. It doesn't seem to be reading the config file. Kinda weird how almost no plugins work ;/ .
Quote:
Originally Posted by zuzma View Post
It looks like the svn revision that was copied to the mupen64plus github account is 1416 too. So there's source code for it too if anyone cares.
Hopefully I can see what they did to the core and just improve the original mupen. Idk wth they did to make it so buggy.
  #1242  
Old 22nd July 2014, 12:23 AM
GPDP GPDP is offline
Senior Member
 
Join Date: May 2013
Posts: 146
Default

If your plan is to backport the improvements back to original Mupen, why not just use the latest source from their repo?
  #1243  
Old 22nd July 2014, 12:34 AM
RPGMaster RPGMaster is offline
Alpha Tester
Project Supporter
Super Moderator
 
Join Date: Dec 2013
Posts: 2,008
Default

Quote:
Originally Posted by GPDP View Post
If your plan is to backport the improvements back to original Mupen, why not just use the latest source from their repo?
I may. It just depends how difficult a task it is to port. I figure for now, it would be easier to port changes from that old version first.

Lol is there even a speed limiter? For now I'm depending on the audio plugin to do that ;/ .

I wish someone sped up Mupen's recompiler ;/ . Then I wouldn't bother ever using a different emulator. Accurate audio + high compatibility is nice.
  #1244  
Old 22nd July 2014, 04:33 AM
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

Code:
    mask  = (right - center) | (left - right);
    mask &= (right - left) | (center - right);
...

Code:
    mask  = (right - center) | (left - right);
    mask &= -(left - right) | -(right - center);
...

Code:
    mask  = +(m) | +(n);
    mask &= -(n) | -(m);
...

Code:
    mask = (+m || +n) && (-n || -m);
Now, (+m & -m) and (+n & -n) are both logically impossible. For any integer x, (+x & -x) can't have the most significant bit set (the sign bit, as in m = (a - b < 0) or (a < b)), for a similar reason as to why (m & ~m) is always 0. The one exception is INT_MIN, which differences of 8-bit color values never reach.

This allows the following alternate form of writing:
Code:
    mask = (+m | +n) & (-n | -m);
...
    mask = (+m & -n) | (+n & -m);
Therefore:
Code:
    mask  = (right - center) | (left - right);
    mask &= (right - left) | (center - right);

...

    mask = (right - center)&(right - left) | (left - right)&(center - right);
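
A quick way to sanity-check the rewrite above is to brute-force it. The sketch below assumes 8-bit channel values (0 to 255) held in plain ints; the function names are hypothetical and not taken from the plugin source. It compares the sign bit of the two-step form, the sign bit of the expanded form, and the plain-comparison meaning ("right is strictly the smallest or strictly the largest of the three").

```c
/* Two-step form from the post: (m | n) & (-n | -m),
   with m = right - center and n = left - right. */
static int mask_two_step(int left, int center, int right)
{
    int mask;

    mask  = (right - center) | (left - right);
    mask &= (right - left) | (center - right);
    return mask;
}

/* Expanded form: (m & -n) | (n & -m). */
static int mask_expanded(int left, int center, int right)
{
    return ((right - center) & (right - left))
         | ((left - right) & (center - right));
}

/* What the sign bit is supposed to mean, written with comparisons. */
static int right_is_strict_extreme(int left, int center, int right)
{
    return (right < center && right < left)
        || (right > left && right > center);
}

/* Hypothetical helper: returns 1 if all three agree on every
   8-bit triple, 0 at the first disagreement. */
static int forms_agree(void)
{
    int l, c, r;

    for (l = 0; l < 256; l++)
        for (c = 0; c < 256; c++)
            for (r = 0; r < 256; r++) {
                int a = mask_two_step(l, c, r) < 0;
                int b = mask_expanded(l, c, r) < 0;

                if (a != b || a != right_is_strict_extreme(l, c, r))
                    return 0;
            }
    return 1;
}
```

Only the sign bits are compared here. The lower bits of the two forms can differ, which doesn't matter because only the sign bit is ever used.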
  #1245  
Old 22nd July 2014, 04:51 AM
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

Code:
(c - b)*(c - a) | (a - c)*(b - c)

(c^2 - a*c - b*c + b*a) | (a*b - a*c - b*c + c^2)

(c^2 - a*c - b*c + b*a) + (a*b - a*c - b*c + c^2)

c^2 - a*c - b*c + b*a + a*b - a*c - b*c + c^2

2*c^2 - 2*a*c - 2*b*c + 2*a*b

2*(c^2 - a*c - b*c + a*b)

2*(c*(c - a) - b*(c - a))

?? c&(c - a) - b&(c - a)
... aaaand
PROFIT!!!

eh, I think I'm happy with the way it is now lol.
Either way VI divot filtering has been greatly staticized over the original.
  #1246  
Old 22nd July 2014, 03:40 PM
ReyVGM ReyVGM is offline
Project Supporter
Senior Member
 
Join Date: Mar 2014
Posts: 212
Default

Quote:
Originally Posted by HatCat View Post
Just made some changes to down-scaling to a smaller screen size, like from the native 640x480 VI resolution to 320x240, 440x330, etc. Before, it was using GL_NEAREST for a pixelated downscale, which didn't work because when the VI read the image in, it was not simply doubling pixels to exactly the same RGB value in neighboring pixels. So I changed it to GL_LINEAR, and now it's even better than how glDrawPixels makes it look:

[image]

Before, it looked no better than this did when glPixelZoom was doing a raster downsize:

[image]

The animation of the picture still does show a bit of line interleaving with the scrolling text since it doesn't counteract pixel-accurate emulation of the upscaling to 640x480 done by the VI in the first place, but an improvement is still nice.
That first image looks great!! That means I can get the Goldeneye screenies with the correct res AND the filters now?
  #1247  
Old 22nd July 2014, 05:03 PM
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

It's an improvement on the side of down-scaling the pixel-accurate image through better filtering methods than what DirectDraw offered, but it's still only pixel-accurate. It may look better than before, but due to the pixel-accuracy you can still see the text font interleaving a couple of times in the animation of the text, in the way you were describing before.

The only thing I may be able to do about that is to do oddMLan's suggestion and try to implement a new option to use OpenGL to stretch the image rather than the real hardware's version of it.
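
To illustrate why the filter choice matters for the interleaving, here is a CPU-side sketch, not the plugin's OpenGL path. It assumes an exact 2:1 downscale (such as 640x480 to 320x240) of one column of 8-bit samples, and the function names are hypothetical. At that ratio a nearest-neighbour filter keeps one sample of each pair, while a linear filter ends up averaging the pair.

```c
#include <stddef.h>

/* Halve a column of 8-bit samples by keeping one sample of each pair,
   roughly what a nearest-neighbour filter does at an exact 2:1 ratio. */
static void halve_nearest(const unsigned char *src, size_t src_len,
                          unsigned char *dst)
{
    size_t i;

    for (i = 0; i + 1 < src_len; i += 2)
        dst[i / 2] = src[i];
}

/* Halve by averaging each pair (rounded), which is what a linear
   filter reduces to when the output is exactly half the input size. */
static void halve_linear(const unsigned char *src, size_t src_len,
                         unsigned char *dst)
{
    size_t i;

    for (i = 0; i + 1 < src_len; i += 2)
        dst[i / 2] = (unsigned char)((src[i] + src[i + 1] + 1) / 2);
}
```

With alternating lines such as 0, 255, 0, 255, the nearest version throws away every second line entirely, while the linear version blends each pair into a mid-gray. At other ratios like 440x330 the weights are no longer an even split, so this is only the simplest case.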
  #1248  
Old 22nd July 2014, 05:27 PM
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

Okay so old function:
Code:
static void divot_filter(
    CCVG* final, CCVG center, CCVG left, CCVG right)
{
    signed int leftr, leftg, leftb;
    signed int rightr, rightg, rightb;
    signed int centerr, centerg, centerb;

    *final = center;
    if ((center.rgba[CCVG_CVG]&left.rgba[CCVG_CVG]&right.rgba[CCVG_CVG]) == 7)
        return;

    leftr = left.rgba[CCVG_RED];
    leftg = left.rgba[CCVG_GRN];
    leftb = left.rgba[CCVG_BLU];
    rightr = right.rgba[CCVG_RED];
    rightg = right.rgba[CCVG_GRN];
    rightb = right.rgba[CCVG_BLU];
    centerr = center.rgba[CCVG_RED];
    centerg = center.rgba[CCVG_GRN];
    centerb = center.rgba[CCVG_BLU];

    if ((leftr >= centerr && rightr >= leftr) || (leftr >= rightr && centerr >= leftr))
        final->rgba[CCVG_RED] = leftr;
    else if ((rightr >= centerr && leftr >= rightr) || (rightr >= leftr && centerr >= rightr))
        final->rgba[CCVG_RED] = rightr;

    if ((leftg >= centerg && rightg >= leftg) || (leftg >= rightg && centerg >= leftg))
        final->rgba[CCVG_GRN] = leftg;
    else if ((rightg >= centerg && leftg >= rightg) || (rightg >= leftg && centerg >= rightg))
        final->rgba[CCVG_GRN] = rightg;

    if ((leftb >= centerb && rightb >= leftb) || (leftb >= rightb && centerb >= leftb))
        final->rgba[CCVG_BLU] = leftb;
    else if ((rightb >= centerb && leftb >= rightb) || (rightb >= leftb && centerb >= rightb))
        final->rgba[CCVG_BLU] = rightb;
    return;
}
and new function written by me:
Code:
static void divot_filter(
    CCVG* final, CCVG center, CCVG left, CCVG right)
{
    unsigned char possibilities[2];
    signed int l_rgb[4], c_rgb[4], r_rgb[4];
    signed int mask;
    const int sign_bit = 8*sizeof(int) - 1;

    *final = center;
    mask = center.rgba[CCVG_CVG] & left.rgba[CCVG_CVG] & right.rgba[CCVG_CVG];
    if (mask == 0x07)
        return;

    l_rgb[0] = left.rgba[CCVG_RED];
    l_rgb[1] = left.rgba[CCVG_GRN];
    l_rgb[2] = left.rgba[CCVG_BLU];
    r_rgb[0] = right.rgba[CCVG_RED];
    r_rgb[1] = right.rgba[CCVG_GRN];
    r_rgb[2] = right.rgba[CCVG_BLU];
    c_rgb[0] = center.rgba[CCVG_RED];
    c_rgb[1] = center.rgba[CCVG_GRN];
    c_rgb[2] = center.rgba[CCVG_BLU];

    possibilities[0] = right.rgba[CCVG_RED];
    possibilities[1] = final -> rgba[CCVG_RED];
    mask =
        (r_rgb[0] - c_rgb[0]) & (r_rgb[0] - l_rgb[0])
      | (l_rgb[0] - r_rgb[0]) & (c_rgb[0] - r_rgb[0]);
    final->rgba[CCVG_RED] = possibilities[(unsigned)mask >> sign_bit];
    possibilities[0] = left.rgba[CCVG_RED];
    possibilities[1] = final -> rgba[CCVG_RED]; /* refresh: final may now hold the right-hand value */
    mask =
        (l_rgb[0] - c_rgb[0]) & (l_rgb[0] - r_rgb[0])
      | (r_rgb[0] - l_rgb[0]) & (c_rgb[0] - l_rgb[0]);
    final->rgba[CCVG_RED] = possibilities[(unsigned)mask >> sign_bit];

    possibilities[0] = right.rgba[CCVG_GRN];
    possibilities[1] = final -> rgba[CCVG_GRN];
    mask =
        (r_rgb[1] - c_rgb[1]) & (r_rgb[1] - l_rgb[1])
      | (l_rgb[1] - r_rgb[1]) & (c_rgb[1] - r_rgb[1]);
    final->rgba[CCVG_GRN] = possibilities[(unsigned)mask >> sign_bit];
    possibilities[0] = left.rgba[CCVG_GRN];
    possibilities[1] = final -> rgba[CCVG_GRN]; /* refresh: final may now hold the right-hand value */
    mask =
        (l_rgb[1] - c_rgb[1]) & (l_rgb[1] - r_rgb[1])
      | (r_rgb[1] - l_rgb[1]) & (c_rgb[1] - l_rgb[1]);
    final->rgba[CCVG_GRN] = possibilities[(unsigned)mask >> sign_bit];

    possibilities[0] = right.rgba[CCVG_BLU];
    possibilities[1] = final -> rgba[CCVG_BLU];
    mask =
        (r_rgb[2] - c_rgb[2]) & (r_rgb[2] - l_rgb[2])
      | (l_rgb[2] - r_rgb[2]) & (c_rgb[2] - r_rgb[2]);
    final->rgba[CCVG_BLU] = possibilities[(unsigned)mask >> sign_bit];
    possibilities[0] = left.rgba[CCVG_BLU];
    possibilities[1] = final -> rgba[CCVG_BLU]; /* refresh: final may now hold the right-hand value */
    mask =
        (l_rgb[2] - c_rgb[2]) & (l_rgb[2] - r_rgb[2])
      | (r_rgb[2] - l_rgb[2]) & (c_rgb[2] - l_rgb[2]);
    final->rgba[CCVG_BLU] = possibilities[(unsigned)mask >> sign_bit];
    return;
}
So in the new function I've made it forward-extensible to SSE/vectorization by converting the "greater than or equal" comparisons into subtractions of 2 signed ints and checking the sign bit (MSB) of the result. The operands are organized as array elements with matching indices, so each subtraction lines up lane-for-lane with the vector register file.

Not only that but the code is smaller than it was before, and no longer does if/else-if dynamic branching. Absolute Crap seems to be only 1 VI/s faster...so no huge speed-up, but if it's noticeable + smaller code, it's worth keeping.
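
For anyone who wants to check that the branchless form picks the same value as the if/else-if form, here is a minimal single-channel sketch. It assumes 8-bit channel values held in ints and 8-bit bytes, and the helper names are hypothetical, not from the plugin. Note that the fallback slot is re-read before the second lookup, so a right-hand pick from the first lookup is not overwritten by the original center value.

```c
/* Reference: the if/else-if form of the old function, for one channel. */
static int divot_branching(int left, int center, int right)
{
    if ((left >= center && right >= left) || (left >= right && center >= left))
        return left;
    if ((right >= center && left >= right) || (right >= left && center >= right))
        return right;
    return center;
}

/* Returns 1 when x is strictly below or strictly above both a and b.
   The answer is the sign bit of the mask, shifted down to bit 0. */
static unsigned is_strict_extreme(int x, int a, int b)
{
    int mask = ((x - a) & (x - b)) | ((a - x) & (b - x));

    return (unsigned)mask >> (8 * sizeof(int) - 1);
}

/* Branchless form: two table lookups indexed by the sign bit. */
static int divot_branchless(int left, int center, int right)
{
    int pick[2];
    int result = center;

    pick[0] = right;
    pick[1] = result;   /* keep center if right is an extreme */
    result = pick[is_strict_extreme(right, center, left)];

    pick[0] = left;
    pick[1] = result;   /* re-read: result may now be right */
    result = pick[is_strict_extreme(left, center, right)];
    return result;
}

/* Hypothetical helper: 1 if both forms agree on every 8-bit triple. */
static int divot_forms_match(void)
{
    int l, c, r;

    for (l = 0; l < 256; l++)
        for (c = 0; c < 256; c++)
            for (r = 0; r < 256; r++)
                if (divot_branching(l, c, r) != divot_branchless(l, c, r))
                    return 0;
    return 1;
}
```

The comparison is on the returned value, not on which neighbor it came from, so ties between equal neighbors count as a match.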
  #1249  
Old 22nd July 2014, 07:19 PM
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

This game never seems to end sometimes lol.

With a little more immutability and memory indirection, SSE could pre-determine the results so that active negation isn't needed in the middle of storing the results.

Code:
static void divot_filter(
    CCVG* final, CCVG center, CCVG left, CCVG right)
{
    CCVG* possibilities[3];
    signed int l_rgb[4], c_rgb[4], r_rgb[4];
    signed int mask;
    const int sign_bit = 8*sizeof(int) - 1;

    *final = center;
    mask = center.rgba[CCVG_CVG] & left.rgba[CCVG_CVG] & right.rgba[CCVG_CVG];
    if (mask == 0x07)
        return;

    possibilities[0] = &left;
    possibilities[1] = final;
    possibilities[2] = &right;

    l_rgb[0] = left.rgba[CCVG_RED];
    l_rgb[1] = left.rgba[CCVG_GRN];
    l_rgb[2] = left.rgba[CCVG_BLU];
    r_rgb[0] = right.rgba[CCVG_RED];
    r_rgb[1] = right.rgba[CCVG_GRN];
    r_rgb[2] = right.rgba[CCVG_BLU];
    c_rgb[0] = center.rgba[CCVG_RED];
    c_rgb[1] = center.rgba[CCVG_GRN];
    c_rgb[2] = center.rgba[CCVG_BLU];

    mask =
        (r_rgb[0] - c_rgb[0]) & (r_rgb[0] - l_rgb[0])
      | (l_rgb[0] - r_rgb[0]) & (c_rgb[0] - r_rgb[0]);
    mask = (unsigned)(mask) >> sign_bit; /* mask = (mask < 0); */
    final->rgba[CCVG_RED] = possibilities[2 - mask] -> rgba[CCVG_RED];
    mask =
        (r_rgb[1] - c_rgb[1]) & (r_rgb[1] - l_rgb[1])
      | (l_rgb[1] - r_rgb[1]) & (c_rgb[1] - r_rgb[1]);
    mask = (unsigned)(mask) >> sign_bit;
    final->rgba[CCVG_GRN] = possibilities[2 - mask] -> rgba[CCVG_GRN];
    mask =
        (r_rgb[2] - c_rgb[2]) & (r_rgb[2] - l_rgb[2])
      | (l_rgb[2] - r_rgb[2]) & (c_rgb[2] - r_rgb[2]);
    mask = (unsigned)(mask) >> sign_bit;
    final->rgba[CCVG_BLU] = possibilities[2 - mask] -> rgba[CCVG_BLU];

    mask =
        (l_rgb[0] - c_rgb[0]) & (l_rgb[0] - r_rgb[0])
      | (r_rgb[0] - l_rgb[0]) & (c_rgb[0] - l_rgb[0]);
    mask = (unsigned)(mask) >> sign_bit;
    final->rgba[CCVG_RED] = possibilities[0 + mask] -> rgba[CCVG_RED];
    mask =
        (l_rgb[1] - c_rgb[1]) & (l_rgb[1] - r_rgb[1])
      | (r_rgb[1] - l_rgb[1]) & (c_rgb[1] - l_rgb[1]);
    mask = (unsigned)(mask) >> sign_bit;
    final->rgba[CCVG_GRN] = possibilities[0 + mask] -> rgba[CCVG_GRN];
    mask =
        (l_rgb[2] - c_rgb[2]) & (l_rgb[2] - r_rgb[2])
      | (r_rgb[2] - l_rgb[2]) & (c_rgb[2] - l_rgb[2]);
    mask = (unsigned)(mask) >> sign_bit;
    final->rgba[CCVG_BLU] = possibilities[0 + mask] -> rgba[CCVG_BLU];
    return;
}
At the point where we start doing (0 + mask), rather than (2 - mask), we can permanently modify the value of r_rgb[0, 1, 2] and c_rgb[0, 1, 2] as follows:
Code:
r_rgb[0] -= l_rgb[0];
r_rgb[1] -= l_rgb[1];
r_rgb[2] -= l_rgb[2];
/* r_rgb[3] -= l_rgb[3]; // for SIMD code generation */
c_rgb[0] -= l_rgb[0];
c_rgb[1] -= l_rgb[1];
c_rgb[2] -= l_rgb[2];
/* c_rgb[3] -= l_rgb[3]; */
As an example, instead of computing:
Code:
    mask =
        (l_rgb[0] - c_rgb[0]) & (l_rgb[0] - r_rgb[0])
      | (r_rgb[0] - l_rgb[0]) & (c_rgb[0] - l_rgb[0]);
    mask = (unsigned)(mask) >> sign_bit;
    final->rgba[CCVG_RED] = possibilities[0 + mask] -> rgba[CCVG_RED];
With the above permanent modification, we can now write:
Code:
    mask = (-c_rgb[0] & -r_rgb[0]) | (+r_rgb[0] & +c_rgb[0]);
    mask = (unsigned)(mask) >> sign_bit;
    final->rgba[CCVG_RED] = possibilities[0 + mask] -> rgba[CCVG_RED];
See even in C optimization is fun!
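
The pre-subtracted form can be checked the same way. This is a sketch under the same assumptions as before (8-bit channel values in ints, hypothetical function names): it applies the subtraction of the left value to local copies and confirms the shortened mask is bit-for-bit identical to the direct one.

```c
/* Left-hand mask computed directly from the three channel values. */
static int mask_direct(int left, int center, int right)
{
    return ((left - center) & (left - right))
         | ((right - left) & (center - left));
}

/* Same mask after r -= l and c -= l have been applied,
   done here on local copies instead of the real arrays. */
static int mask_rebased(int left, int center, int right)
{
    int r = right - left;
    int c = center - left;

    return (-c & -r) | (r & c);
}

/* Hypothetical helper: 1 if both masks are equal for every 8-bit triple. */
static int rebased_mask_matches(void)
{
    int l, c, r;

    for (l = 0; l < 256; l++)
        for (c = 0; c < 256; c++)
            for (r = 0; r < 256; r++)
                if (mask_direct(l, c, r) != mask_rebased(l, c, r))
                    return 0;
    return 1;
}
```

The in-place subtraction is only safe once the right-hand pass has finished, since that pass still needs the unmodified right and center values.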
  #1250  
Old 22nd July 2014, 10:11 PM
Bighead Bighead is offline
Alpha Tester
Project Supporter
Junior Member
 
Join Date: May 2009
Posts: 13
Default

Been trying to follow this thread, but it moves so fast. :P
I can't really offer any useful insight, but I can clear this one up:
Quote:
Originally Posted by RPGMaster View Post
I try to use z64gl. It doesn't seem to be reading the config file.
z64 only seems to read the config file from the "EmuFolder\Plugin" directory, which is probably hard-coded in the source. I first discovered this when attempting to use a global plugin directory for all N64 emulators. As long as it's in a "Plugin" folder (not Plugins, Plugin/RSP, etc.) it should work.