Go Back   Project64 Forums > General Discussion > Open Discussion

  #171  
Old 9th May 2013, 06:35 PM
Lithium's Avatar
Lithium Lithium is offline
Alpha Tester
Project Supporter
Member
 
Join Date: Aug 2012
Location: hue hue br br
Posts: 68
Default

Quote:
Originally Posted by suanyuan View Post
Need to use LLE graphics plugin, such as z64gl.
With Jabo's LLE graphics I can get in-game in Star Wars: Rogue Squadron, but I need to use the interpreter; otherwise the textures get very twisted.
  #172  
Old 9th May 2013, 06:41 PM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

There is a small texture bug at the beginning of the intro of Rogue Squadron, caused by the VR4300 recompiler CPU of Project64 1.6 (don't recall if 2.0 fixed it), which the interpreter CPU made go away.

Unless you mean the RSP CPU recompiler/interpreter of pj64 RSP
  #173  
Old 9th May 2013, 06:47 PM
mesk14's Avatar
mesk14 mesk14 is offline
Junior Member
 
Join Date: Sep 2010
Posts: 19
Default

Quote:
Originally Posted by FatCat View Post
Hey, did you make sure the globally set RSP plugin is set to my RSP plugin?

Yes, to avoid confusion I don't bother with per-game plugin settings while testing; I stick to global.

Over this upcoming weekend I am going to attempt to play Ocarina of Time from start to finish using your LLE RSP plugin with z64gl. I would like to use D3D8, but there are lines on screen when using it.


__________________
Intel core i5 3470 @ 3.9
8gb ddr 3 ram
GTX 460 1gb
  #174  
Old 9th May 2013, 06:58 PM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

Quote:
Originally Posted by mesk14 View Post
Yes, to avoid confusion I don't bother with per-game plugin settings while testing; I stick to global.
Right. Well, no idea why that RunRSP template issue happens, but feel free to work around the pj64 interface by using another emulator like Mupen64 to play Indiana Jones.

Quote:
Originally Posted by mesk14 View Post
Over this upcoming weekend I am going to attempt to play Ocarina of Time from start to finish using your LLE RSP plugin with z64gl.
I actually just finished doing that a few days ago. But feel free to play OOT anyway, haven't done MM yet.
  #175  
Old 9th May 2013, 09:04 PM
Lithium's Avatar
Lithium Lithium is offline
Alpha Tester
Project Supporter
Member
 
Join Date: Aug 2012
Location: hue hue br br
Posts: 68
Default

Quote:
Originally Posted by FatCat View Post
There is a small texture bug at the beginning of the intro of Rogue Squadron, caused by the VR4300 recompiler CPU of Project64 1.6 (don't recall if 2.0 fixed it), which the interpreter CPU made go away.

Unless you mean the RSP CPU recompiler/interpreter of pj64 RSP
I am using the interpreter CPU; on the recompiler CPU the textures are twisted.

Last edited by Lithium; 9th May 2013 at 09:10 PM.
  #176  
Old 10th May 2013, 01:20 AM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

Code:
_VAND:
LFB32:
	pushl	%esi
	pushl	%ebx
	movl	12(%esp), %eax
	movl	16(%esp), %edx
	sall	$4, %edx
	leal	_VR(%edx), %ecx
	sall	$4, %eax
	leal	_VR(%eax), %ebx
	leal	_VR+16(%edx), %esi
	cmpl	%esi, %ebx
	jb	L76
L74:
	movdqa	_VR(%edx), %xmm0
	pand	_VC, %xmm0
	movdqa	%xmm0, _VR(%eax)
	popl	%ebx
	popl	%esi
	ret
L76:
; # non-SSE variation in pure software, using 4 AND(longword) ops
This is a very short SSE emulation (`movdqa` and `pand` on XMM registers are SSE2 instructions) of the RSP Vector AND operation, assuming a pre-shuffled target vector coefficient and a delayed accumulator write, as follows:

Code:
#include "vu.h"

static void VAND(int vd, int vs, int vt, int e)
{
    register int i;

    /* SHUFFLE_VECTOR(vt, e);  (VC is assumed to be shuffled already) */
    for (i = 0; i < 8; i++) /* Try to write 128 b:  *(__int128 *)VR[vd] = ... */
        VR[vd][i] = VR[vs][i] & VC[i];
    /* Delayed accumulator write, disabled here:
    for (i = 0; i < 8; i++)
        VACC[i].s[LO] = VR[vd][i];
    */
    return;
}
It would be the most accurate emulation of any vector op but not necessarily the fastest depending on the method used for shuffling the vector (discussed before).
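The "try to write 128 b" idea from the comment above can be sketched in portable C without relying on the compiler's auto-vectorizer or on `__int128`. This is only an illustration: `VR` and `VC` here are hypothetical stand-ins for whatever `vu.h` really declares, and `vand_wide` is a made-up name. Because AND is bitwise, treating the eight 16-bit slices as two 64-bit words gives the same result regardless of byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the register files declared in vu.h. */
static int16_t VR[32][8]; /* 32 vector registers, 8 x 16-bit slices each */
static int16_t VC[8];     /* pre-shuffled copy of the target vector */

/* VAND over the whole 128-bit register, done as two 64-bit words. */
static void vand_wide(int vd, int vs)
{
    uint64_t s[2], t[2];

    assert(vd >= 0 && vd < 32 && vs >= 0 && vs < 32);
    memcpy(s, VR[vs], sizeof s); /* copy out first, so vd == vs is safe */
    memcpy(t, VC, sizeof t);
    s[0] &= t[0];
    s[1] &= t[1];
    memcpy(VR[vd], s, sizeof s); /* one whole-register write-back */
}
```

Copying the source out before writing also sidesteps the overlap check (`cmpl`/`jb L76`) that the compiler had to emit in the listing above.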
  #177  
Old 10th May 2013, 04:17 AM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

Code:
#ifdef PARALLELIZE_VECTOR_TRANSFERS
#define VR_T(i) VC[i]
#else
#define VR_T(i) VR[vt][ei[e][i]] // 2-D look-up buffer for scalar decodes
#endif

#ifdef PARALLELIZE_VECTOR_TRANSFERS
#define ACC_S(i)    VR[vd][i]
#define ACC_D(i)    VACC[i].s[00] // assume low 16-bit slice of each acc
#else
#define ACC_S(i)    VACC[i].s[00]
#define ACC_D(i)    VR[vd][i]
#endif
/*
 * If we want to parallelize vector transfers, we probably also want to
 * linearize the register files.  (VR dest. reads from VR src. op. VR trg.)
 * Lining up the emulator for VR[vd] = VR[vs] & VR[vt] is a lot easier than
 * doing it for VACC[i](15..0) = VR[vs][i] & VR[vt][i] inside of some loop.
 * However, the correct order in vector units is to update the accumulator
 * register file BEFORE the vector register file.  This is slower but more
 * accurate and even required in some cases (VMAC* and VMAD* operations).
 * However, it is worth sacrificing if it means doing vectors in parallel.
 */

int sub_mask[16] = {
    0x0,
    0x0,
    0x1, 0x1,
    0x3, 0x3, 0x3, 0x3,
    0x7, 0x7, 0x7, 0x7, 0x7, 0x7, 0x7, 0x7
};

static inline void SHUFFLE_VECTOR(int vt, int e)
{
    register int i, j;
#if (0 == 1) /* speed mode (not yet stabilized) */
    j = sub_mask[e];
    e = j ^ 07;
    for (i = 0; i < 8; i++)
        VC[i] = VR[vt][(i & e) | j];
#else /* compatibility mode (temporary choice) */
    if (e & 0x8)
        for (i = 0; i < 8; i++)
            VC[i] = VR[vt][(i & 00) | (e & 0x7)];
    else if (e & 0x4)
        for (i = 0; i < 8; i++)
            VC[i] = VR[vt][(i & 04) | (e & 0x3)];
    else if (e & 0x2)
        for (i = 0; i < 8; i++)
            VC[i] = VR[vt][(i & 06) | (e & 0x1)];
    else // e == 0b0000 || e == 0b0001
        for (i = 0; i < 8; i++)
            VC[i] = VR[vt][(i & 07) | (e & 0x0)];
#endif
    return;
}
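For comparison, here is a sketch of a branch-free shuffle that is meant to match the compatibility-mode branches above. The idea is to keep the high bits of the slice index `i` and take the low bits from `e`, with `sub_mask[e]` saying how many low bits that is. The names `shuffle_masked` and `ref_index` are made up for this sketch, and `VR`/`VC` are stand-in arrays. Note that the disabled speed-mode branch ORs in `sub_mask[e]` itself rather than `e & sub_mask[e]`; as far as I can tell, that would pick the odd slices for e = 2 instead of the even ones, which may be part of why it is "not yet stabilized".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real register files. */
static int16_t VR[32][8];
static int16_t VC[8];

static const int sub_mask[16] = {
    0x0, 0x0,
    0x1, 0x1,
    0x3, 0x3, 0x3, 0x3,
    0x7, 0x7, 0x7, 0x7, 0x7, 0x7, 0x7, 0x7
};

/* Branch-free: high bits of the index come from i, low bits from e. */
static void shuffle_masked(int vt, int e)
{
    int i;
    const int j = sub_mask[e & 15];

    assert(vt >= 0 && vt < 32);
    for (i = 0; i < 8; i++)
        VC[i] = VR[vt][(i & ~j & 7) | (e & j)];
}

/* Reference index, written with the same branches as compatibility mode. */
static int ref_index(int i, int e)
{
    if (e & 0x8) return e & 0x7;
    if (e & 0x4) return (i & 04) | (e & 0x3);
    if (e & 0x2) return (i & 06) | (e & 0x1);
    return i & 07;
}
```

Checking all 16 element codes against `ref_index` is cheap, so it is worth doing before swapping anything like this in.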
Example using VAND:

Code:
static void VAND(int vd, int vs, int vt, int e)
{
    register int i;

    for (i = 0; i < 8; i++)
        ACC_S(i) = VR[vs][i] & VR_T(i);
    for (i = 0; i < 8; i++)
        ACC_D(i) = ACC_S(i);
    return;
}
So, per my implementation, if `PARALLELIZE_VECTOR_TRANSFERS` is defined, we try to execute all 8 vector slice transactions in parallel (simultaneously) by maintaining the VD = VS (op) VT register file match-up before buffering to the accumulator crossbar.

If it is not defined, the algorithm is done accurately in pure, iterative software, securing proper write-backs to the destination vector register file in a way that does not conflict with the source/target vector register files by means of gating the writes through the accumulator crossbar as a data transfer hazard barrier.
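To make the hazard concrete, here is a small sketch of what goes wrong when the destination register is also the target register and one element is broadcast to all slices. It assumes a plain scalar broadcast (slice `i` reads `VR[vt][elem]`), and the function names and the local `acc` buffer (standing in for the low accumulator slices) are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static int16_t VR[32][8]; /* hypothetical stand-in for the vector file */

/* Direct write-back: VR[vd] is modified while VR[vt] is still being read. */
static void vand_direct(int vd, int vs, int vt, int elem)
{
    int i;

    for (i = 0; i < 8; i++)
        VR[vd][i] = VR[vs][i] & VR[vt][elem];
}

/* Gated write-back: every slice is computed before anything is stored. */
static void vand_gated(int vd, int vs, int vt, int elem)
{
    int16_t acc[8]; /* plays the role of the accumulator low slices */
    int i;

    assert(elem >= 0 && elem < 8);
    for (i = 0; i < 8; i++)
        acc[i] = VR[vs][i] & VR[vt][elem];
    memcpy(VR[vd], acc, sizeof acc);
}
```

With vd == vt and element 0 broadcast, `vand_direct` overwrites `VR[vt][0]` on the first iteration, so slices 1 to 7 are ANDed with the wrong value, while `vand_gated` is unaffected.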

The new vector half of the RSP CPU loop from `execute.h`:
Code:
EX:
        if (inst >> 25 == 0x25) /* is a VU instruction */
        {
            const int vd = (inst & 0x000007C0) >>  6;
            const int vs = (inst & 0x0000FFFF) >> 11;
            const int vt = (inst & 0x001F0000) >> 16;
            const int e  = (inst & 0x01E00000) >> 21;

#ifdef PARALLELIZE_VECTOR_TRANSFERS
            SHUFFLE_VECTOR(vt, e);
#endif
            SP_COP2_C2[inst %= 64](vd, vs, vt, e);
            continue;
        }
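The field extraction in that loop can be checked in isolation with a sketch like the one below. The `vu_fields` struct and the `decode_vu`/`encode_vu` helpers are hypothetical and only exist for this example; the bit positions are the ones used in the loop above. One detail worth noting: masking `vs` with `0x0000FFFF` before shifting right by 11 gives the same value as masking with `0x0000F800`, since the shift discards the low 11 bits anyway.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
typedef struct {
    int is_vu; /* nonzero if bits 31..25 mark a vector unit operation */
    int op;    /* low 6 bits, the index into the function table */
    int vd, vs, vt, e;
} vu_fields;

static vu_fields decode_vu(uint32_t inst)
{
    vu_fields f;

    f.is_vu = (inst >> 25) == 0x25;
    f.op = (int)(inst & 0x3F);
    f.vd = (int)((inst >> 6) & 0x1F);
    f.vs = (int)((inst >> 11) & 0x1F);
    f.vt = (int)((inst >> 16) & 0x1F);
    f.e  = (int)((inst >> 21) & 0xF);
    return f;
}

/* Inverse of decode_vu, only used to build test words. */
static uint32_t encode_vu(int op, int vd, int vs, int vt, int e)
{
    assert(op >= 0 && op < 64 && e >= 0 && e < 16);
    assert(vd >= 0 && vd < 32 && vs >= 0 && vs < 32 && vt >= 0 && vt < 32);
    return ((uint32_t)0x25 << 25) | ((uint32_t)e << 21) | ((uint32_t)vt << 16)
         | ((uint32_t)vs << 11) | ((uint32_t)vd << 6) | (uint32_t)op;
}
```

A round trip through `encode_vu` and `decode_vu` is a quick way to catch a wrong mask or shift before it shows up as a graphics or audio glitch.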
  #178  
Old 10th May 2013, 09:59 AM
shunyuan's Avatar
shunyuan shunyuan is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Apr 2013
Posts: 491
Default

I have released the source of rsp_interface (the modified LLE RSP) that I used in the HleAudio plugin.

rsp_interface source here or from HleAudio thread here

Let me know if you think there are more efficient ways to use rsp, or errors, or improvements.

Thanks.
__________________
---------------------
CPU: Intel U7300 1.3 GHz
GPU: Mobile Intel 4 Series (on board)
AUDIO: Realtek HD Audio (on board)
RAM: 4 GB
OS: Windows 7 - 32 bit

Last edited by shunyuan; 10th May 2013 at 10:09 AM.
  #179  
Old 10th May 2013, 03:00 PM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

The implementation seems simple enough to secure, though I don't really have any experience with setting up audio HLE from an LLE RSP.

I see you based your new template of zilmar's InitiateRSP on the one from the plugin specs, using only the elementary pointer symbols needed for audio, DMA and such.

I was curious if this would work in your audio HLE plugin?

Code:
// #undef  SEMAPHORE_LOCK_CORRECTIONS
/* The CPU-RSP semaphore is a lock defining synchronization with the host.
 * As of the time in which bpoint reversed the RSP, host interpretation of
 * this lock was incorrect.  The problem has been inherent for a very long
 * time until a rather recent update applied between Project64 1.7:2.0.
 *
 * If this is on, 1964 and Mupen64 will have no sound for [any?] games.
 * It will be almost completely useless on Project64 1.6 or older.
 * The exception is HLE audio, where it will work for almost every game.
 *
 * Keep this off when using audio LLE or playing games booting off the NUS-
 * CIC-6105 chip (also uses the semaphore); keep it on with Project64 2.0.
 */

#define SEMAPHORE_LOCK_CORRECTIONS // Recommended only for CPUs supporting it
To execute on LLE it needs Project64 2.x's CPU, but I'm not sure what happens if you try it in HLE...

For your own analysis, this macro is only used in $(rsp)/su/cop0/mfc0.h, reading from SP_SEMAPHORE_REG.
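For context on what the lock is meant to do, here is a minimal sketch of the semaphore behavior as it is commonly documented for the N64 hardware: a read returns the current flag and sets it, and any write clears it. This is an assumption-labeled illustration only; the names are hypothetical and this is not the code from mfc0.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical backing store for SP_SEMAPHORE_REG: 0 = free, 1 = taken. */
static uint32_t sp_semaphore;

/* Read: hand back the old flag, then mark the semaphore as taken. */
static uint32_t semaphore_read(void)
{
    const uint32_t old = sp_semaphore;

    assert(old <= 1);
    sp_semaphore = 1;
    return old;
}

/* Write: any value written releases the semaphore. */
static void semaphore_write(uint32_t value)
{
    (void)value; /* the written value is ignored in this sketch */
    sp_semaphore = 0;
}
```

A host that never performs the write never releases the lock, so every later read sees 1; that would be one way for a correct RSP-side implementation to hang on a host that handles the register incorrectly.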

Last edited by HatCat; 10th May 2013 at 03:03 PM.
  #180  
Old 10th May 2013, 03:22 PM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

Ah, I might have forgotten something.

In RunRSP you applied this:
Code:
void RunRSP()
{
	unsigned int TaskType;

	// clear all registers
	memset(&RspRegs, 0, sizeof(RspRegs));
...
Again, I have no HLE-in-the-plugins experience here, but this raises a couple questions to me.

First, most of the RSP registers power on in a randomized/undefined state. The exceptions include the 8 RSP accumulator segments and some bits of the system control (CP0) registers and status flags.
The game software is always responsible for initializing these registers after system startup, or else their value is undefined (not necessarily zeroed).

So is it really necessary to zero the memory for RspRegs every time there is an RSP task to execute?

--------

Second, I'm not sure whether this check is really necessary.
Code:
	TaskType = *((unsigned int *)(RSP.DMEM + 0xFC0));

	if (TaskType == 2)
		run_microcode();
When the CPU host knows there is a SP task, it tells the RSP plugin first.
If the RSP plugin judges it to be an audio task, it can request your audio plugin to HLE it.

So you're checking here to make sure it's an audio task in case the RSP plugin was wrong about it being an audio task.
This is stable and safe and fine, but slower because your if() causes an extra branch every time we want to run something in HLE.

The branch frame could be worked around like this:
Code:
if (TaskType != 2)
    // MessageBoxA(NULL, "RSP thought this task was aud?", NULL, 0x00000030);
    return;
else
    run_microcode(); // Rename this to `run_task()` for my next release
Some of that other code (clear SP_STATUS_HALT, reset the program counter) might also be obsolete/redundant since the main CPU (pj64) or indirectly the RSP plugin itself is responsible for maintaining that security, but again you might keep it there just to be safe.
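If the check stays in, the task-type read could also be written without the pointer cast. The sketch below uses `memcpy` instead, which avoids alignment and aliasing questions; it assumes, as the original cast already does, that the emulator keeps DMEM as host-order 32-bit words. The constant and function names are made up for the example, while the offset 0xFC0 and the value 2 for audio tasks come from the code above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    DMEM_SIZE        = 0x1000, /* 4 KiB of RSP data memory */
    TASK_TYPE_OFFSET = 0xFC0,  /* where the task type word sits in DMEM */
    TASK_AUDIO       = 2       /* value the code above treats as audio */
};

/* Read the task type as a native 32-bit word, without a pointer cast. */
static uint32_t read_task_type(const unsigned char *dmem)
{
    uint32_t type;

    assert(dmem != NULL);
    memcpy(&type, dmem + TASK_TYPE_OFFSET, sizeof type);
    return type;
}

static int is_audio_task(const unsigned char *dmem)
{
    return read_task_type(dmem) == TASK_AUDIO;
}
```

Compilers generally turn a fixed-size `memcpy` like this into a single load, so it should cost nothing compared to the cast.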