#171
With Jabo's LLE graphics I can get in-game in Star Wars: Rogue Squadron, but I need to use the interpreter; otherwise the textures get badly twisted.
#172
There is a small texture bug at the beginning of the Rogue Squadron intro, caused by the VR4300 recompiler CPU of Project64 1.6 (I don't recall whether 2.0 fixed it); switching to the interpreter CPU makes it go away.
Unless you mean the recompiler/interpreter CPU setting of the pj64 RSP plugin.
__________________
http://theoatmeal.com/comics/cat_vs_internet
#173
Yes, to avoid confusion I don't bother with per-game plugin settings while testing; I stick to global. This upcoming weekend I am going to attempt to play Ocarina of Time from start to finish using your LLE RSP plugin with z64gl. I would like to use D3D8, but there are lines when using it.
__________________
Intel core i5 3470 @ 3.9 8gb ddr 3 ram GTX 460 1gb
#174
#175
Last edited by Lithium; 9th May 2013 at 09:10 PM.
#176
Code:
    _VAND:
    LFB32:
        pushl   %esi
        pushl   %ebx
        movl    12(%esp), %eax
        movl    16(%esp), %edx
        sall    $4, %edx
        leal    _VR(%edx), %ecx
        sall    $4, %eax
        leal    _VR(%eax), %ebx
        leal    _VR+16(%edx), %esi
        cmpl    %esi, %ebx
        jb      L76
    L74:
        movdqa  _VR(%edx), %xmm0
        pand    _VC, %xmm0
        movdqa  %xmm0, _VR(%eax)
        popl    %ebx
        popl    %esi
        ret
    L76:
        ; # non-SSE variation in pure software, using 4 AND(longword) ops
Code:
    #include "vu.h"

    static void VAND(int vd, int vs, int vt, int e)
    {
        register int i;

     // SHUFFLE_VECTOR(vt, e);
        for (i = 0; i < 8; i++) /* Try to write 128 b:  *(__int128 *)VR[vd] = ... */
            VR[vd][i] = VR[vs][i] & VC[i];
    /*
        for (i = 0; i < 8; i++)
            VACC[i].s[LO] = VR[vd][i];
    */
        return;
    }
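For comparison, here is a rough, portable sketch of the non-SSE path that the assembly listing only mentions in a comment (four 32-bit AND operations over one 128-bit vector). The function name `vand_longwords` and the flat array parameters are made up for illustration; they are not taken from the plugin source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * AND two vectors of eight 16-bit lanes as four 32-bit longwords.
 * Hypothetical helper: the real plugin works on its own VR[] file.
 */
static void vand_longwords(int16_t vd[8], const int16_t vs[8],
                           const int16_t vt[8])
{
    uint32_t a[4], b[4];
    int i;

    /* The four longwords must cover exactly eight 16-bit lanes. */
    assert(sizeof a == 8 * sizeof vs[0]);

    /* memcpy sidesteps alignment and strict-aliasing problems. */
    memcpy(a, vs, sizeof a);
    memcpy(b, vt, sizeof b);
    for (i = 0; i < 4; i++)
        a[i] &= b[i];
    /* Sources were copied first, so vd may alias vs or vt safely. */
    memcpy(vd, a, sizeof a);
}
```

Because AND works bit by bit, grouping the lanes into wider words does not change any lane's result, whatever the host byte order.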
#177
Code:
    #ifdef PARALLELIZE_VECTOR_TRANSFERS
    #define VR_T(i) VC[i]
    #else
    #define VR_T(i) VR[vt][ei[e][i]] // 2-D look-up buffer for scalar decodes
    #endif
    #ifdef PARALLELIZE_VECTOR_TRANSFERS
    #define ACC_S(i) VR[vd][i]
    #define ACC_D(i) VACC[i].s[00] // assume low 16-bit slice of each acc
    #else
    #define ACC_S(i) VACC[i].s[00]
    #define ACC_D(i) VR[vd][i]
    #endif

    /*
     * If we want to parallelize vector transfers, we probably also want to
     * linearize the register files.  (VR dest. reads from VR src. op. VR trg.)
     * Lining up the emulator for VR[vd] = VR[vs] & VR[vt] is a lot easier than
     * doing it for VACC[i](15..0) = VR[vs][i] & VR[vt][i] inside of some loop.
     * However, the correct order in vector units is to update the accumulator
     * register file BEFORE the vector register file.  This is slower but more
     * accurate and even required in some cases (VMAC* and VMAD* operations).
     * However, it is worth sacrificing if it means doing vectors in parallel.
     */
    int sub_mask[16] = {
        0x0,
        0x0, 0x1, 0x1,
        0x3, 0x3, 0x3, 0x3,
        0x7, 0x7, 0x7, 0x7, 0x7, 0x7, 0x7, 0x7
    };

    inline void SHUFFLE_VECTOR(int vt, int e)
    {
        register int i, j;
    #if (0 == 1) /* speed mode (not yet stabilized) */
        j = sub_mask[e];
        e = j ^ 07;
        for (i = 0; i < 8; i++)
            VC[i] = VR[vt][(i & e) | j];
    #else /* compatibility mode (temporary choice) */
        if (e & 0x8)
            for (i = 0; i < 8; i++)
                VC[i] = VR[vt][(i & 00) | (e & 0x7)];
        else if (e & 0x4)
            for (i = 0; i < 8; i++)
                VC[i] = VR[vt][(i & 04) | (e & 0x3)];
        else if (e & 0x2)
            for (i = 0; i < 8; i++)
                VC[i] = VR[vt][(i & 06) | (e & 0x1)];
        else // e == 0b0000 || e == 0b0001
            for (i = 0; i < 8; i++)
                VC[i] = VR[vt][(i & 07) | (e & 0x0)];
    #endif
        return;
    }
Code:
    static void VAND(int vd, int vs, int vt, int e)
    {
        register int i;

        for (i = 0; i < 8; i++)
            ACC_S(i) = VR[vs][i] & VR_T(i);
        for (i = 0; i < 8; i++)
            ACC_D(i) = ACC_S(i);
        return;
    }

If PARALLELIZE_VECTOR_TRANSFERS is not defined, the algorithm is done accurately in pure, iterative software. Write-backs to the destination vector register are gated through the accumulator, which acts as a data-transfer hazard barrier, so they cannot conflict with reads from the source/target vector registers.

The new vector half of the RSP CPU loop from `execute.h`:
Code:
    EX:
        if (inst >> 25 == 0x25) /* is a VU instruction */
        {
            const int vd = (inst & 0x000007C0) >>  6;
            const int vs = (inst & 0x0000FFFF) >> 11;
            const int vt = (inst & 0x001F0000) >> 16;
            const int e  = (inst & 0x01E00000) >> 21;

    #ifdef PARALLELIZE_VECTOR_TRANSFERS
            SHUFFLE_VECTOR(vt, e);
    #endif
            SP_COP2_C2[inst %= 64](vd, vs, vt, e);
            continue;
        }
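To make the hazard concrete, here is a small self-contained sketch that decodes the same bit fields as the loop above and compares a direct write-back against one staged through the accumulator. The names `decode_vu`, `element_index`, `vand_direct`, `vand_staged` and `ACC_LO`, and the flat register layout, are illustrative assumptions; `element_index` simply mirrors the compatibility-mode branches of `SHUFFLE_VECTOR`.

```c
#include <assert.h>
#include <stdint.h>

static int16_t VR[32][8];  /* vector register file (layout assumed) */
static int16_t ACC_LO[8];  /* low 16-bit slice of each accumulator lane */

struct vu_fields { int vd, vs, vt, e; };

/* Same bit positions as the decode in the CPU loop above. */
static struct vu_fields decode_vu(uint32_t inst)
{
    struct vu_fields f;

    f.vd = (int)((inst >>  6) & 0x1F);
    f.vs = (int)((inst >> 11) & 0x1F);
    f.vt = (int)((inst >> 16) & 0x1F);
    f.e  = (int)((inst >> 21) & 0x0F);
    return f;
}

/* Source lane of VR[vt] that destination lane i reads for element e. */
static int element_index(int e, int i)
{
    assert(e >= 0 && e < 16 && i >= 0 && i < 8);
    if (e & 0x8) return e & 0x7;               /* one lane broadcast */
    if (e & 0x4) return (i & 0x4) | (e & 0x3); /* quarters */
    if (e & 0x2) return (i & 0x6) | (e & 0x1); /* halves */
    return i;                                  /* plain vector operand */
}

/* Direct write-back: may overwrite a lane that is still to be read. */
static void vand_direct(int vd, int vs, int vt, int e)
{
    int i;

    for (i = 0; i < 8; i++)
        VR[vd][i] = VR[vs][i] & VR[vt][element_index(e, i)];
}

/* Staged write-back: all lanes land in the accumulator first. */
static void vand_staged(int vd, int vs, int vt, int e)
{
    int i;

    for (i = 0; i < 8; i++)
        ACC_LO[i] = VR[vs][i] & VR[vt][element_index(e, i)];
    for (i = 0; i < 8; i++)
        VR[vd][i] = ACC_LO[i];
}
```

With `vd == vt` and a broadcast element such as `e = 9`, the direct version overwrites lane 1 before lanes 2 through 7 have read it, while the staged version produces the lane-wise result.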
#178
I have released the source of rsp_interface (the modified LLE RSP) that I used in the HleAudio plugin.
rsp_interface source here or from HleAudio thread here. Let me know if you think there are more efficient ways to use the RSP, or if you find errors or possible improvements. Thanks.
__________________
CPU: Intel U7300 1.3 GHz
GPU: Mobile Intel 4 Series (on board)
AUDIO: Realtek HD Audio (on board)
RAM: 4 GB
OS: Windows 7 - 32 bit
Last edited by shunyuan; 10th May 2013 at 10:09 AM.
#179
The implementation seems simple enough to secure, though I don't really have any experience with setting up audio HLE from an LLE RSP.
I see you based a new template of zilmar's InitiateRSP on the one from the plugin specs, using only the elementary pointer symbols needed for audio, DMA and such. I was curious whether this would work in your audio HLE plugin? Code:
    // #undef  SEMAPHORE_LOCK_CORRECTIONS
    /* The CPU-RSP semaphore is a lock defining synchronization with the host.
     * As of the time in which bpoint reversed the RSP, host interpretation of
     * this lock was incorrect.  The problem has been inherent for a very long
     * time until a rather recent update applied between Project64 1.7:2.0.
     *
     * If this is on, 1964 and Mupen64 will have no sound for [any?] games.
     * It will be almost completely useless on Project64 1.6 or older.
     * The exception is HLE audio, where it will work for almost every game.
     *
     * Keep this off when using audio LLE or playing games booting off the
     * NUS-CIC-6105 chip (also uses the semaphore); keep it on with Project64 2.0.
     */
    #define SEMAPHORE_LOCK_CORRECTIONS // Recommended only for CPUs supporting it

For your own analysis, this macro is only used in $(rsp)/su/cop0/mfc0.h, reading from SP_SEMAPHORE_REG.
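As a rough illustration of what a macro-gated semaphore read could look like, here is a minimal sketch. It assumes test-and-set semantics (a read returns the old value and takes the lock, any write releases it); that behaviour, the variable `sp_semaphore` and both function names are assumptions for the example and are not copied from `mfc0.h`.

```c
#include <assert.h>
#include <stdint.h>

#define SEMAPHORE_LOCK_CORRECTIONS

static uint32_t sp_semaphore; /* hypothetical stand-in for SP_SEMAPHORE_REG */

/* MFC0-style read: returns the previous value of the semaphore. */
static uint32_t read_semaphore(void)
{
    const uint32_t old = sp_semaphore;

    assert(old <= 1); /* the lock is a single flag */
#ifdef SEMAPHORE_LOCK_CORRECTIONS
    sp_semaphore = 1; /* assumed: reading also acquires the lock */
#endif
    return old;
}

/* MTC0-style write: assumed to release the lock whatever the value. */
static void write_semaphore(uint32_t value)
{
    (void)value;
    sp_semaphore = 0;
}
```

Under these assumptions, a first read returns 0 and a second returns 1 until something writes to the register. With the macro undefined, every read would return 0, which matches the "off" configuration described in the comment.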
Last edited by HatCat; 10th May 2013 at 03:03 PM.
#180
Ah, I might have forgotten something.
In RunRSP you applied this: Code:
    void RunRSP()
    {
        unsigned int TaskType;

        // clear all registers
        memset(&RspRegs, 0, sizeof(RspRegs));
        ...

First, most of the RSP registers power on in a randomized/undefined state. The exceptions include the 8 RSP accumulator segments and some bits of the system control (CP0) registers and status flags. The game software is always responsible for initializing these registers after system startup, or else their value is undefined (not necessarily zeroed). So is it really necessary to zero the memory for RspRegs every time there is an RSP task to execute?

--------

Second, I'm not sure whether this check is really necessary.
Code:
    TaskType = *((unsigned int *)(RSP.DMEM + 0xFC0));
    if (TaskType == 2)
        run_microcode();

If the RSP plugin judges it to be an audio task, it can request your audio plugin to HLE it. So you're checking here to make sure it's an audio task in case the RSP plugin was wrong about that. This is stable and safe and fine, but slower, because your if() causes an extra branch every time we want to run something in HLE. The branch frame could be worked around like this:
Code:
    if (TaskType != 2)
     // MessageBoxA(NULL, "RSP thought this task was aud?", NULL, 0x00000030);
        return;
    else
        run_microcode(); // Rename this to `run_task()` for my next release
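For anyone who wants to try the early-return form in isolation, here is a minimal standalone sketch. The `dmem` buffer, the `tasks_run` counter, the stub `run_microcode()` and the wrapper `run_rsp_task()` are placeholders invented for the example; only the offset 0xFC0 and the task type value 2 come from the snippets above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TASK_TYPE_OFFSET 0xFC0 /* offset used in the snippet above */
#define TASK_AUDIO       2     /* value the snippet compares against */

static unsigned char dmem[0x1000]; /* hypothetical stand-in for RSP.DMEM */
static int tasks_run;              /* counts calls, for illustration only */

static void run_microcode(void)
{
    tasks_run++;
}

/* Returns 1 if the task was run as audio, 0 if it was rejected. */
static int run_rsp_task(void)
{
    uint32_t task_type;

    assert(TASK_TYPE_OFFSET + sizeof task_type <= sizeof dmem);
    /*
     * memcpy reads the word in host byte order, like the pointer cast
     * in the original, but without alignment or aliasing concerns.
     */
    memcpy(&task_type, dmem + TASK_TYPE_OFFSET, sizeof task_type);
    if (task_type != TASK_AUDIO)
        return 0;
    run_microcode();
    return 1;
}
```

Returning a status code instead of `void` is a design choice of this sketch: it lets the caller notice a mismatched task without the plugin popping up a message box.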