#361
Cash, money, baby.
From the start, Pau wouldn't stop pretending to blame me for allying with mupen64plus and porting my plugin over to Linux, even though HE COULD PLAINLY SEE IN THE POST HE LINKED THAT MUDLORD DID IT, not me. Fucking idiots pretending like they don't have any common sense. But honestly, there's nothing to be mad at mudlord about. If mudlord hadn't had ecsv port the plugin for mupen64plus, Pau would have found some other excuse to back out of a project too complex for him to comprehend anyway, so it doesn't matter.
__________________
http://theoatmeal.com/comics/cat_vs_internet
#362
Yet another bug accidentally found yet swiftly realized.
I'm kind of blogging these btw in case other people make my mistakes. Code:
    -#define BES(address) (address ^ ENDIAN&03)
    -#define HES(address) (address ^ ENDIAN&02)
    -#define MES(address) (address ^ ENDIAN&01)
    -#define WES(address) (address ^ ENDIAN&00)
    +#define BES(address) ((address) ^ ((ENDIAN) & 03))
    +#define HES(address) ((address) ^ ((ENDIAN) & 02))
    +#define MES(address) ((address) ^ ((ENDIAN) & 01))
    +#define WES(address) ((address) ^ ((ENDIAN) & 00))

I see ((((((code)expressions)like)this)all too often)) and it pisses me off, so I picked up the semi-bad habit of minimizing the nesting of my parenthetic expressions. That came at a cost, because the old definitions above (the removed "-" lines, red in the original post) caused a bug with this code:

Code:
    #define VR_B(v, e) (*(unsigned char *)(((unsigned char *)(VR + v)) + MES(e)))
    ...
    /*** Scalar, Coprocessor Operations (vector unit, scalar cache transfers) ***/
    INLINE void LS_Group_I(int direction, int length)
    { /* Group I vector loads and stores, as defined in SGI's patent. */
        register unsigned long addr;
        register int i;
        register int e = (inst.R.sa >> 1) & 0xF;
        const signed int offset = -(inst.W & 0x00000040) | inst.R.func;

        addr = (SR[inst.R.rs] + length*offset);
        if (direction == 0) /* "Load %s to Vector Unit" */
            for (i = 0; i < length; i++)
                VR_B(inst.R.rt, e+i | 0x0) = RSP.DMEM[BES(addr+i & 0xFFF)];
        else /* "Store %s from Vector Unit" */
            for (i = 0; i < length; i++)
                RSP.DMEM[BES(addr+i & 0xFFF)] = VR_B(inst.R.rt, e+i & 0xF);
        return;
    }

The bug is there because XOR has a slight edge over OR in operator precedence, so with the old MES the load line was read as:
    (e+i | (0x0 ^ (ENDIAN & 01)))
instead of:
    ((e+i | 0x0) ^ (ENDIAN & 01))
The store line was fine, since AND binds tighter than XOR.

Anyway, the only reason I put that null operation of "| 0x0" in there was solely for the artistic congruence of both the L?V and S?V Group I emulations. (The second line has & 0xF, so I threw in | 0x0 in the first line to make them more consistent, and equal-width, similar lines of code are somewhat easier to read anyway.)

How many more bugs are left lmao? I can go back and count later. It's 100% impossible that any of the vector operations have any bugs tho cause those are 100% untouched since the last stable release.
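To see the precedence problem in isolation, here is a minimal sketch. The `ENDIAN` value of 3, the `_OLD`/`_NEW` suffixes, and the wrapper function names are assumptions made up for this example; they are not taken from the plugin.

```c
#define ENDIAN 3 /* assumed value, for illustration only */

/* Old style: neither the argument nor the mask is parenthesized. */
#define MES_OLD(address) (address ^ ENDIAN&01)
/* Fixed style: every macro argument is wrapped in parentheses. */
#define MES_NEW(address) ((address) ^ ((ENDIAN) & 01))

/* Expands to (x | 0x0 ^ 3&01), which parses as x | (0x0 ^ (3 & 01)). */
int mes_old_or(int x) { return MES_OLD(x | 0x0); }

/* Expands to ((x | 0x0) ^ ((3) & 01)), the intended grouping. */
int mes_new_or(int x) { return MES_NEW(x | 0x0); }

/* With "& 0xF" the old macro happens to work: & binds tighter than ^. */
int mes_old_and(int x) { return MES_OLD(x & 0xF); }
int mes_new_and(int x) { return MES_NEW(x & 0xF); }
```

With these definitions, `mes_old_or(1)` evaluates to 1 while `mes_new_or(1)` evaluates to 0, so every odd element index lands on the wrong byte. The `& 0xF` variants agree with each other, which matches only the load path being broken.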
#363
Quote:

Thanks for the tip anyway though. I can see this catches way more iconoclasms than Visual Studio does. I would keep it turned on, but there are a few overly fixated things I cannot remove the warnings for yet at this phase....

Quote:
What about: Code:
    In file included from ../../rsp/rsp.c:7:0:
    ../../rsp/rsp.h: In function `trace_RSP_registers`:
    ../../rsp/rsp.h:182:9: warning: format `%X` expects argument of type `unsigned int`, but argument 3 has type `long unsigned int` [-Wformat]
    ...

etc., repeated

Code:
    fprintf(out, "SP_MEM_ADDR: %08X CMD_START: %08X\n", *RSP.SP_MEM_ADDR_REG, *RSP.DPC_START_REG);
    fprintf(out, "SP_DRAM_ADDR: %08X CMD_END: %08X\n", *RSP.SP_DRAM_ADDR_REG, *RSP.DPC_END_REG);
    fprintf(out, "SP_DMA_RD_LEN: %08X CMD_CURRENT: %08X\n", *RSP.SP_RD_LEN_REG, *RSP.DPC_CURRENT_REG);
    fprintf(out, "SP_DMA_WR_LEN: %08X CMD_STATUS: %08X\n", *RSP.SP_WR_LEN_REG, *RSP.DPC_STATUS_REG);
    fprintf(out, "SP_STATUS: %08X CMD_CLOCK: %08X\n", *RSP.SP_STATUS_REG, *RSP.DPC_CLOCK_REG);
    fprintf(out, "SP_DMA_FULL: %08X CMD_BUSY: %08X\n", *RSP.SP_DMA_FULL_REG, *RSP.DPC_BUFBUSY_REG);
    fprintf(out, "SP_DMA_BUSY: %08X CMD_PIPE_BUSY: %08X\n", *RSP.SP_DMA_BUSY_REG, *RSP.DPC_PIPEBUSY_REG);
    fprintf(out, "SP_SEMAPHORE: %08X CMD_TMEM_BUSY: %08X\n", *RSP.SP_SEMAPHORE_REG, *RSP.DPC_TMEM_REG);
    fprintf(out, "SP_PC_REG: 04001%03X\n\n", *RSP.SP_PC_REG & 0x00000FFF);

Do I have to write (unsigned int) next to every one of them now lmao?

Sure, maybe zilmar didn't have to use M$' `DWORD` typedef all the time, but it is also logical in a sense that he did, because `long` is the way to guarantee at least 32 bits of storage, which is a requirement for MIPS registers.

Also, -pedantic or -ansi or whatever doesn't like me saying (long long), for things like the 64-bit accumulator union addressing modes. How am I supposed to shut these warnings up without compromising the speed of my plugin XD (dropping the chance for 64-bit SSE writes to the acc., etc.)?
Last edited by HatCat; 27th August 2013 at 03:10 AM.
#364
Know something funny MarathonMan?
All this time this plugin won't even boot Super Mario 64 or Quest 64 without garbage vector graphics and screwed-up static audio, yet I just now realized that Conker's BFD RDP graphics are once again flawlessly handled by my plugin. Damn, I thought that would be one of the last games I would get working again on this plugin after the rewrite, not the first.

Anyway, now that I have a chance to finally benchmark your help:

Code:
    RSP Public Release 4 (stable)
        copyright text screen: 31-33 vi/s
        nintendo logo: 57-64 vi/s
        slowest part of chainsaw demo: 12-13 vi/s

    RSP Current Beta with help from MarathonMan
        copyright text screen: 30-32 vi/s
        nintendo logo: 56-64 vi/s
        slowest part of chainsaw demo: 12-13 vi/s

Plus I haven't implemented any of the pseudo-opcodes or special specifier cases that I can now take advantage of by using your method. SO I guess it's a matter of going back to this stuff. AND, I have not done the 2-D table for the vector ops.

Man, flabbergasting. I still can't believe that quest64/mario64/etc. have broken audio AND gfx, yet Conker's works great on my not-yet-100%-stable RSP rewrite.
#365
Quote:

    l => long
    h => short
    %hd = short int
    %ld = long int

etc.

Quote:
Last edited by MarathonMan; 27th August 2013 at 04:20 PM.
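Applied to the register trace, the `l` length modifier above gives two ways to quiet the `-Wformat` warning. This is only a sketch: `format_reg32` and `format_reg32_cast` are hypothetical helpers, and the cast variant assumes `unsigned int` is at least 32 bits wide on the target.

```c
#include <stdio.h>

/* Hypothetical helpers; buf should hold at least 17 bytes. */

/* Option 1: the argument is unsigned long, so say so with "l". */
int format_reg32(char *buf, unsigned long reg)
{
    return sprintf(buf, "%08lX", reg);
}

/* Option 2: keep "%08X" and convert the argument to unsigned int. */
int format_reg32_cast(char *buf, unsigned long reg)
{
    return sprintf(buf, "%08X", (unsigned int)(reg & 0xFFFFFFFFUL));
}
```

Option 1 changes only the format strings, so nothing needs an `(unsigned int)` cast at each call site.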
#366
Quote:

Take your working plugin and make a static FILE *. Initialize it with fopen("rsp.dump", "wb"), and, after every instruction, do fwrite(VR, 16, 32, file) and fwrite(SR, 4, 32, file). Then in your NEW plugin, fread both back from the file after every instruction and assert out if they're not equal. Use a debugger to determine the faulting instruction and the reason it occurred - it makes it super easy. I usually can't do such things, though, because I'm in essence bootstrapping my own plugins.

Branch folding time!

https://github.com/cxd4/rsp/blob/master/execute.h#L74
Just always force the result to zero. It doesn't matter if someone changed it.

https://github.com/cxd4/rsp/blob/master/execute.h#L79
Wut. Branch delay slots? Could use a LUT holding the mask, either 0xFFF or 0xFFC, and always force stage to zero.

Last edited by MarathonMan; 27th August 2013 at 04:19 PM.
#367
Thx
Quote:

Not without dropping 64-bit R/W to the 48-bit accumulators. D= Meh, can never 100% win with those things. Unless you think writing less than 64 bits at a time could be faster anyway.

Quote:
I just presumed you meant: run both versions of the interpreter in sync, constantly comparing all the data after each instruction. I could do it that way also, though I am a bit of a masochist. I was actually stepping through the RSP commands to manually track it myself, but I ended up fixing an entirely unrelated bug instead (something somewhat minor I didn't bother posting here).

I've pretty much sanity-checked everything. Most of the CP0/CP2 stuff is direct copy pasta off my stable build, and the only possible way in hell L/S BV,SV,LV,DV ?WC2 could have the bug is if I am passing the wrong segment length unit to my universal Group I handler function.... So IDK, might just have to do that. Funny how I'm so lazy I want to do things the hard way all the time instead of doing the work to make it easier on my laziness.

Quote:
Passing 0 as the message priority makes it a null statement because I have defined `MINIMUM_MESSAGE_PRIORITY` to 1. So if you check this in the x86 output, lines 74 and 75 are omitted by the compiler, as they should be. If I wanted to debug cases where games overwrite the zero register, I would decrement `MINIMUM_MESSAGE_PRIORITY` so that the check would compile in. I have already seen some such cases, to the extent necessary.

https://github.com/cxd4/rsp/blob/master/execute.h#L79

Quote:
And yes, it is branch delay slots. When J/JAL/BEQ/BNE/etc. schedules a branch, I set stage to 1 before the end of the function. stage is then <<= 1 to convert it to 2 for the next instruction, before re-starting the CPU loop. If, at any point before the <<= 1, it is already 2, it initiates the branch right away. It's loosely based on how zilmar organized delay slots in his RSP interpreter, only he used arbitrary numbers for the #define's, so there was no mathematical pattern you could use to take advantage of semi-static branch scheduling in this manner.
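A toy model of that stage counter, to make the 0, 1, 2 progression concrete. Everything here is hypothetical scaffolding (`toy_inst`, `run_toy`, a program made only of no-ops and unconditional jumps); it only mirrors the scheduling described above and is not the code in execute.h. It also ignores the case of a branch sitting inside a delay slot.

```c
/* Hypothetical toy instruction: a no-op or an unconditional jump. */
typedef struct {
    int is_jump;
    unsigned int target; /* byte address, used only when is_jump != 0 */
} toy_inst;

/* Runs `steps` instructions from `program` (indexed by PC / 4) and
 * records the PC of each executed instruction in `trace`. */
void run_toy(const toy_inst *program, int steps, unsigned int *trace)
{
    unsigned int pc = 0x000;
    unsigned int branch_pc = 0x000;
    int stage = 0; /* 0: no branch, 1: just scheduled, 2: in delay slot */
    int n;

    for (n = 0; n < steps; n++) {
        const toy_inst *inst = &program[pc >> 2];

        trace[n] = pc;
        if (inst->is_jump) { /* J/JAL/BEQ/BNE/etc. schedule the branch */
            branch_pc = inst->target & 0xFFC;
            stage = 1;
        }
        if (stage == 2) { /* the delay slot just ran: branch now */
            pc = branch_pc;
            stage = 0;
            continue;
        }
        pc = (pc + 0x004) & 0xFFC;
        stage <<= 1; /* 0 stays 0; 1 becomes 2 for the next instruction */
    }
}
```

For a program with a jump to 0x010 at address 0x004, the executed PCs are 0x000, 0x004, 0x008 (the delay slot), then 0x010; the instruction at 0x00C is skipped.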
#368
Quote:

EDIT: There must be a reason it's not faster. I'm certain of it. I'll take a look at the assembly dump of the main loop after work and see what intuition it provides.

Last edited by MarathonMan; 27th August 2013 at 07:15 PM.
#369
Quote:

I made it slower on purpose, several times, while adding your technique to make it faster. You can see this without the assembly output, just by looking at the C. Part of why I did this was to make my rewrites less prematurely hazard-prone. Another part is to balance your speed-up with slower code from me. It is stuff that I can go back and undo later to reclaim the speed-up.

The introduction of the PC scheduler phase in the main CPU loop is one of the things I did to slow this plugin down. Before, it just added 0x004 to PC whenever you did a branch, and used a speed hack to maintain the correct link offsets for JAL, JALR, BLTZAL, and BGEZAL. Now it adds 0x008, which is the actual, documented and correct algorithm, at a speed cost of constantly checking the branch scheduler frame every time in the CPU loop.

Other things I can think of that slow this plugin down:

LH, LHU, LW, SH, and SW no longer try to read or write 16/32 bits in one single instruction. They force 8-bit accesses and use a very, very slow software-forced zero- or sign-extension macro I wrote from the MIPS manual. (I know it's slower than just saying (signed short) type conversion or something; bear with me!)

Similar for LSV, LLV, LDV, SSV, SLV, and SDV. SDV is very frequently executed by games, but rather than doing 4 16-bit writes and only 1 AND mask by 0xFFF, it does 8 8-bit writes, 7 AND-masks by 0xFFF, and 7 increments on the DMEM address instead of hard-coding them into LEA ops or something.

So, yeah. Several reasons I can think of why this plugin is 1 VI/s slower than the last stable release, rather than 10-15% faster like you predicted. Maybe you are right though...perhaps it shouldn't have had that drastic of an effect. I am not sure there.
#370
Code:

    In file included from ../../rsp/rsp.c:7:0:
    ../../rsp/rsp.h: In function `trace_RSP_registers`:
    ../../rsp/rsp.h:230:13: warning: format `%hX` expects a machine `int` argument [-Wformat]
    ../../rsp/rsp.h:236:13: warning: format `%hX` expects a machine `int` argument [-Wformat]

Code:
    for (i = 0; i < 10; i++)
        fprintf(
            out,
            " $v%i: [%04hX][%04hX][%04hX][%04hX][%04hX][%04hX][%04hX][%04hX]\n",
            VR[i][00], VR[i][01], VR[i][02], VR[i][03],
            VR[i][04], VR[i][05], VR[i][06], VR[i][07]);
    for (i = 10; i < 32; i++) /* decimals "10" and higher with two characters */
        fprintf(
            out,
            "$v%i: [%04hX][%04hX][%04hX][%04hX][%04hX][%04hX][%04hX][%04hX]\n",
            VR[i][00], VR[i][01], VR[i][02], VR[i][03],
            VR[i][04], VR[i][05], VR[i][06], VR[i][07]);

VR is an array of short arrays, yet it tells me now it expects type `int`.
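One way to sidestep the `h` modifier entirely is sketched below. A `short` passed to a variadic function is promoted to `int` anyway, so this version converts each lane to `unsigned short` (wrapping negatives to their 16-bit pattern), widens to `unsigned int`, and prints with plain `%04X`. `format_vector` is a hypothetical helper, and the four-digit output assumes a 16-bit `short`.

```c
#include <stdio.h>

/* Hypothetical helper: formats one 8-lane vector as "[XXXX]...[XXXX]".
 * buf must hold at least 49 bytes (8 * 6 characters plus the NUL). */
int format_vector(char *buf, const short lanes[8])
{
    int len = 0;
    int i;

    for (i = 0; i < 8; i++)
        len += sprintf(buf + len, "[%04X]",
                       (unsigned int)(unsigned short)lanes[i]);
    return len;
}
```

The same double conversion can be written inline at each `VR[i][n]` argument if a helper function is not wanted.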