#351
I'm not sure yet that it converted the functions into the 2-D jump table as it should.
The section labels are there using the names, but ".cfi_startproc" makes me think it is keeping them all to their own function.
Code:
    LFE32:
        .p2align 2,,3
        .def _SLL; .scl 3; .type 32; .endef
    _SLL:
    LFB35:
        .cfi_startproc
        movb _inst+1, %al
        shrb $3, %al
        movzbl %al, %eax
        movb _inst+2, %dl
        andl $31, %edx
        movl _inst, %ecx
        shrl $6, %ecx
        movl _SR(,%edx,4), %edx
        sall %cl, %edx
        movl %edx, _SR(,%eax,4)
        ret
        .cfi_endproc
    LFE35:
        .p2align 2,,3
        .def _SRL; .scl 3; .type 32; .endef
    _SRL:
    LFB36:
        .cfi_startproc
        movb _inst+1, %al
        shrb $3, %al
        movzbl %al, %eax
        movb _inst+2, %dl
        andl $31, %edx
        movl _inst, %ecx
        shrl $6, %ecx
        movl _SR(,%edx,4), %edx
        shrl %cl, %edx
        movl %edx, _SR(,%eax,4)
        ret
        .cfi_endproc
    LFE36:
        .p2align 2,,3
        .def _SRA; .scl 3; .type 32; .endef
    _SRA:
    LFB37:
        .cfi_startproc
    ;# .........

Chiefly, because my jump table is fixed and consistent, always 64x64, I did not need a structure like you did to include the AND-mask value, only the shift.
Code:
    //in execute.h:
    EX_SCALAR[inst.J.op][(inst.W >> sub_op_table[inst.J.op]) & 077]();

    //because, in su.h:
    #define OFF_FUNCTION 0
    #define OFF_SA 6
    #define OFF_E 7
    #define OFF_RD 11
    #define OFF_RT 16
    #define OFF_RS 21
    #define OFF_OPCODE 26
    const int sub_op_table[64] = {
        OFF_FUNCTION, /* SPECIAL */
        OFF_RT, /* REGIMM */
        OFF_OPCODE, /* J */
        OFF_OPCODE, /* JAL */
        OFF_OPCODE, /* BEQ */
        OFF_OPCODE, /* BNE */
        OFF_OPCODE, /* BLEZ */
        OFF_OPCODE, /* BGTZ */
        OFF_OPCODE, /* ADDI */
        OFF_OPCODE, /* ADDIU */
        OFF_OPCODE, /* SLTI */
        OFF_OPCODE, /* SLTIU */
        OFF_OPCODE, /* ANDI */
        OFF_OPCODE, /* ORI */
        OFF_OPCODE, /* XORI */
        OFF_OPCODE, /* LUI */
        OFF_RS, /* COP0 */
        OFF_RS,
        OFF_RS, /* COP2 */
        OFF_RS,
        OFF_OPCODE, OFF_OPCODE, OFF_OPCODE, OFF_OPCODE,
        OFF_OPCODE, OFF_OPCODE, OFF_OPCODE, OFF_OPCODE,
        OFF_OPCODE, OFF_OPCODE, OFF_OPCODE, OFF_OPCODE,
        OFF_OPCODE,
        OFF_OPCODE, /* LB */
        OFF_OPCODE, /* LH */
        OFF_OPCODE,
        OFF_OPCODE, /* LW */
        OFF_OPCODE, /* LBU */
        OFF_OPCODE, /* LHU */
        OFF_OPCODE,
        OFF_OPCODE,
        OFF_OPCODE, /* SB */
        OFF_OPCODE, /* SH */
        OFF_OPCODE,
        OFF_OPCODE, /* SW */
        OFF_OPCODE, OFF_OPCODE, OFF_OPCODE, OFF_OPCODE, OFF_OPCODE,
        OFF_RD, /* LWC2 */
        OFF_OPCODE, OFF_OPCODE, OFF_OPCODE, OFF_OPCODE,
        OFF_OPCODE, OFF_OPCODE, OFF_OPCODE,
        OFF_RD, /* SWC2 */
        OFF_OPCODE, OFF_OPCODE, OFF_OPCODE, OFF_OPCODE, OFF_OPCODE
    };

Like LB, LH and LW for example, I can make versions of those functions where inst.R.rs is zero, so we save the time of decoding (SR[base] + offset) & 0xFFF, because we always know that SR[0] is fixed to 0.

In the meantime, wow lmao, the DLL size jumped up from 80 KB to 104 KB. zilmar's RSP emulator was always way too bulky and over-sized because of deficient build settings. When I modified his RSP emulator I got it down to as low as 34 KB (the one in 2.0 I think is like 70 something), but now it seems I have to go the opposite way and accept the fact that his DLL is always going to be smaller than mine from now on.

As for performance, I did not even wait to test it before making this post.
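To illustrate the idea for anyone following along, here is a minimal sketch of a fixed 64x64 dispatch where a per-opcode shift selects the second index. Everything named here (`dispatch`, `init_tables`, `shift_for_op`, `handlers`, the `h_*` stubs and `last_called`) is made up for the example and is not the plugin's actual code; only the offset values and the `& 077` mask come from the post above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; a real table would hold one handler per instruction. */
typedef void (*handler_t)(void);

static int last_called = 0;              /* records which stub ran (illustration only) */
static void h_reserved(void) { last_called = 0; }
static void h_sll(void)      { last_called = 1; }
static void h_addi(void)     { last_called = 2; }

enum { OFF_FUNCTION = 0, OFF_OPCODE = 26 };

static int shift_for_op[64];             /* plays the role of sub_op_table */
static handler_t handlers[64][64];       /* plays the role of EX_SCALAR */

static void init_tables(void)
{
    int op, sub;

    for (op = 0; op < 64; op++) {
        shift_for_op[op] = OFF_OPCODE;   /* default: second index repeats the opcode */
        for (sub = 0; sub < 64; sub++)
            handlers[op][sub] = h_reserved;
    }
    shift_for_op[0] = OFF_FUNCTION;      /* SPECIAL: second index is the function field */
    handlers[0][0] = h_sll;              /* SPECIAL, function 0 */
    handlers[8][8] = h_addi;             /* ADDI: only the [op][op] slot is ever reached */
}

static void dispatch(uint32_t word)
{
    unsigned op  = word >> 26;                          /* 6 bits, always 0..63 */
    unsigned sub = (word >> shift_for_op[op]) & 077;    /* masked to 0..63 */

    handlers[op][sub]();                 /* one indirect call, no range checks */
}
```

Because the first index is a 6-bit field and the second is masked to 6 bits, neither can leave the table, so nothing has to be checked before the indirect call. The cost is that rows using OFF_OPCODE only ever use one of their 64 slots.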
__________________
http://theoatmeal.com/comics/cat_vs_internet
Last edited by HatCat; 23rd August 2013 at 06:58 PM.
#352
Quote:
Now that each RSP instruction is contained within its own C function, the interpreter will be able to dispatch functions with ease, since it's a single, flat indirect jump without any checks needed in regards to the bounds of the jump range or things like that.
Quote:
Quote:
#353
Quote:
It is possibly still better this new way, I guess. Before, I avoided the assumption that inner sub-opcodes were necessarily common in the ROM, so I only did a switch() on the primary opcode to avoid calling any functions at all. Vector instructions, though, went through a function pointer table regardless, before even reading in the primary opcode. Only one way to know for sure here: get the run-time stable!
Quote:
As long as it's not like a megabyte or anything, there is nothing really concerning about it to me. What I AM picky about is making the file size nice and aligned. The old size of 80 KB was not the best example, but at least it was 2*2*2*2*5 KB. It's just an obsession of mine.
Quote:
I knew all along that it was going to be broken. I did not even need to run my plugin in PJ64 just now for a test drive to know, before I even made that post, that I had to have broken something. All I really concentrated on was stabilizing all the stuff under LWC2/SWC2, the scalar loads/stores and almost every other opcode anyway. What I know I didn't concentrate on was maintaining the stability of the Jumps and Branches, and the BREAK opcode SP_STATUS_HALT reader for continuing the RSP CPU loop. I almost intentionally broke those things, because I was in a hurry to just rewrite the damn thing quickly! But those are all the things that are easy to go back to and fix. I spent most of the rewrite time on the opcodes that are not easy to go back and fix in the case of unexpected/invisible bugs, as you only get one shot to catch those things before such interpreter bugs become extremely hard to find. Should have it fixed over today.
#354
Sorry, did not even try to fix it today because actually I completed (and I think perfected) my own dynamic RSP disassembler instead!
It would have come in handy so long ago.... The RSP disassembler I wrote is the new file $rsp/matrix.h . Besides official info, the algorithm came mostly from my own ideas. In terms of accuracy, here is how it compares to zilmar's RSP disassembler built into RSP 1.7.0.9 for Project64 2.x:
#355
Have still not fixed it yet!
I underestimated the number of interpreter bugs I would have to go back to. I am at the same time making the core more stable than the current public release anyway, so the final result won't have any new weaknesses. Some things I have run into:
* had to fix the branch-and-link scheduler phase mechanism, which I broke purposely anyway
* lazy copy-pasta of bit-wise AND into the XORI function from ANDI
* shouldn't use sizeof(VR) as a computational constant for 16-byte LS V lengths
* also purposely broke the SP_STATUS_BROKE exception handler and fixed it
* union bit-field decoder bug with I-type, which MM helped me fix today
* was not supposed to shift the offset for scalar loads and stores, only vector-DMEM transactions
Quote:
Code:
    INLINE void LS_Group_I(int direction, int length)
    { /* Group I vector loads and stores, as defined in SGI's patent. */
        register unsigned long addr;
        register int i;
        register int e = (inst.R.sa >> 1) & 0xF;
        const signed int offset = -(inst.W & 0x00000040) | inst.R.func;

        addr = (SR[inst.R.rs] + length*offset);
        if (direction == 0) /* "Load %s to Vector Unit" */
            for (i = 0; i < length; i++)
                VR_B(inst.R.rt, e+i | 0x0) = RSP.DMEM[BES(addr+i & 0xFFF)];
        else /* "Store %s from Vector Unit" */
            for (i = 0; i < length; i++)
                RSP.DMEM[BES(addr+i & 0xFFF)] = VR_B(inst.R.rt, e+i & 0xF);
        return;
    }
Code:
    void LLV(void)
    {
        LS_Group_I(0, sizeof(long) > 4 ? 4 : sizeof(long));
        return;
    }
    void LDV(void)
    {
        LS_Group_I(0, sizeof(long long) > 8 ? 8 : sizeof(long long));
        return;
    }
    void SBV(void)
    {
        LS_Group_I(1, sizeof(unsigned char));
        return;
    }
    void SSV(void)
    {
        LS_Group_I(1, sizeof(short) > 2 ? 2 : sizeof(short));
        return;
    }
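The offset line in LS_Group_I is compact, so here is a small standalone sketch of what it appears to compute: bit 6 of the instruction word acts as the sign of a 7-bit offset whose low 6 bits are the function field, and the result is scaled by the element length before being added to the base. The helper names `offset7` and `ls_address` are invented for this example, and wrapping the final address to the 4 KB DMEM range here (rather than per byte, as the code above does) is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 7 bits of an instruction word (hypothetical helper). */
static int offset7(uint32_t word)
{
    int func = (int)(word & 0x3F);        /* low 6 bits, like inst.R.func */

    /* -(bit 6) is either 0 or -64; OR-ing in the low 6 bits gives -64..63. */
    return -(int)(word & 0x40) | func;
}

/* Vector load/store address: base + length * offset, wrapped to 4 KB. */
static uint32_t ls_address(uint32_t base, uint32_t word, int length)
{
    return (base + (uint32_t)(length * offset7(word))) & 0xFFF;
}
```

With these helpers, `offset7(0x7F)` is -1 and `offset7(0x40)` is -64, and `ls_address(0x010, 0x7F, 8)` steps one 8-byte element back from 0x010 to 0x008. Per the bug list above, this scaling by length is only for the vector transactions; the scalar loads and stores take the offset unshifted.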
#356
And another bug fixed, caused again by the 144-hour-or-so frustrating rewrite of the entire scalar unit inducing tiredness and laziness.
Even though I rewrote like half of the file, yet again lmao, the actual fix was a simple one-liner for SH: Code:
    *(short *)(RSP.DMEM + addr - HES(0x000)*(addr%4 - 1));
    *(short *)(RSP.DMEM + addr - HES(0x000)*(addr%4 - 1)) = (short)(SR[rt]);
The fortunate thing about these bugs is that it's impossible for me to not notice them because a game is always bound to exploit them, so there won't be any invisible/unnoticeable bugs when I am done sorting through this. I spent about:
Anyway, it boots Quest 64 and shows all the intro, start menu etc. graphics perfectly and flawlessly. But there is a new bug by some rare RSP opcode that waited until now to execute, probably something under SWC2 Groups II and III somewhere. So it's not enough to boot Mario64 3-D geometry and compare MarathonMan's 2-D jump table to my old RSP interpreter's speed. That should be an easy fix tomorrow when I wake up. I mean today when I wake up. *passes out*
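For readers puzzling over the index arithmetic in the SH one-liner, here is a small sketch of how it maps addresses. It ASSUMES that HES(x) is `x ^ 2` (a halfword endian swap for DMEM kept as host-order 32-bit words on a little-endian machine); the macro's real definition is not shown in this thread. The names `sh_index`, `store_half` and `dmem` are invented for the example.

```c
#include <assert.h>
#include <string.h>

#define HES(a) ((a) ^ 2)   /* ASSUMPTION: not the plugin's confirmed definition */

static unsigned char dmem[0x1000];   /* stand-in for RSP.DMEM */

/* Byte index used by the one-liner for a halfword store at addr. */
static int sh_index(int addr)
{
    return addr - HES(0x000) * (addr % 4 - 1);
}

/* Store a halfword; memcpy avoids the unaligned pointer cast. */
static void store_half(int addr, short value)
{
    /* addr % 4 == 3 straddles two words and is not covered by this sketch. */
    assert(addr >= 0 && addr < 0x1000 && addr % 4 != 3);
    memcpy(dmem + sh_index(addr), &value, sizeof value);
}
```

Under that assumption the expression gives addr + 2 when addr % 4 is 0, addr itself when it is 1, and addr - 2 when it is 2, so for halfword-aligned addresses it matches `HES(addr)`.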
#357
Quote:
Compile with:
Code:
    gcc -Wall ...
If you really want to be standards-conformant and have clean code:
Code:
    gcc -Wall -Wextra -pedantic -ansi ...
#358
your love for gnu/linux and anything made by stallman including gcc sickens me.
go peddle your foss manlove elsewhere.
#359
Actually those are for the most part in common with Visual Studio also, which has the /Wall parameter.
And if you hate GNU/GCC/Stallman so bad then why keep porting my plugins to mupen64plus? No doubt that uses GCC and is FOSS.
Last edited by HatCat; 26th August 2013 at 04:34 PM.
#360
Quote: