#221
Quote:
Were you meaning to highlight something particular? :/
__________________
http://theoatmeal.com/comics/cat_vs_internet |
#222
Son of a bitch.
I can't trust Windows Explorer to handle file copies anymore. It just copies all the root files and skips everything under the subdirectories that I need. I only noticed because everything under rsp/su/* and rsp/vu/* is out-of-date source code from the old release. I'll fix it in the next release, which, thanks to jelta's bug report, I am bound to put out soon.
__________________
http://theoatmeal.com/comics/cat_vs_internet |
#223
Quote:
Why do you swap the definitions of ACC_R(i) and ACC_W(i) when PARALLELIZE_VECTOR_TRANSFERS is defined?
__________________
--------------------- CPU: Intel U7300 1.3 GHz GPU: Mobile Intel 4 Series (on board) AUDIO: Realtek HD Audio (on board) RAM: 4 GB OS: Windows 7 - 32 bit |
#224
Because parallelization is easier to achieve when the results computed from the VR source registers are first stored to a destination in the same register file (VR[vd]) and then written out to the vector accumulator (VACC[i]) than when it is done in the other order.
Which one you read from and which you then write to is a trade-off between accuracy and efficiency, depending on whether parallelism can be maintained.
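To make the two orderings concrete, here is a minimal sketch, not the plugin's actual code. It assumes a simplified register file `VR[32][8]` of 16-bit lanes and a hypothetical `ACC_LO[8]` array standing in for the low accumulator slice (`VACC[i].s[00]` in the real source). The function names, `add16`, and passing `vt` as a plain array are all made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8

static int16_t VR[32][LANES];   /* simplified vector register file (assumption) */
static int16_t ACC_LO[LANES];   /* hypothetical stand-in for VACC[i].s[00] */

/* 16-bit add truncated to 16 bits. The narrowing conversion is
 * implementation-defined for out-of-range values, but wraps on common compilers. */
static int16_t add16(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Order described for PARALLELIZE_VECTOR_TRANSFERS:
 * results land in VR[vd] first, then are copied out to the accumulator. */
static void vadd_register_first(int vd, int vs, const int16_t vt[LANES])
{
    int i;
    assert(vd >= 0 && vd < 32 && vs >= 0 && vs < 32);
    for (i = 0; i < LANES; i++)
        VR[vd][i] = add16(VR[vs][i], vt[i]);
    for (i = 0; i < LANES; i++)
        ACC_LO[i] = VR[vd][i];
}

/* Opposite order: results land in the accumulator first,
 * then are copied to VR[vd]. */
static void vadd_accumulator_first(int vd, int vs, const int16_t vt[LANES])
{
    int i;
    assert(vd >= 0 && vd < 32 && vs >= 0 && vs < 32);
    for (i = 0; i < LANES; i++)
        ACC_LO[i] = add16(VR[vs][i], vt[i]);
    for (i = 0; i < LANES; i++)
        VR[vd][i] = ACC_LO[i];
}
```

For a plain add like this both orders leave the same values in `VR[vd]` and `ACC_LO`; the sketch only shows which buffer is written first, which is the thing the `ACC_R`/`ACC_W` swap selects.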
__________________
http://theoatmeal.com/comics/cat_vs_internet |
#225
Now that's interesting.
Animal Forest / Doubutsu no Mori uses the F3DZEX graphics microcode. I always thought only the two Zelda64 games used F3DZEX (hell, what else could the Z mean?). But nope: Zelda OOT uses F3DZEX 2.06H, Zelda MM uses F3DZEX 2.08I, and Zelda Master Quest (GC) AND Doubutsu no Mori use F3DZEX 2.08J. I suppose it's possible (but highly unlikely, because we have a special CFB/semaphore task for this particular game) that I may be fixing some part of Zelda MQ at the same time. Anyway, I see the problem and am fixing it; it should take half an hour. I hope to have a new public release posted to this thread within the next 24 hours so jelta can test.
__________________
http://theoatmeal.com/comics/cat_vs_internet |
#226
Quote:
The slowdown is not due to re-compilation, but is caused by graphics rendering (RDP emulation). I didn't see much difference in re-compilation between my code and glN64 or RiceVideo. z64gl emulates RDP commands in a more generic way, but some RDP commands are very slow and some cause errors or crashes. Any chance you could look at the RDP emulation of z64gl?
__________________
--------------------- CPU: Intel U7300 1.3 GHz GPU: Mobile Intel 4 Series (on board) AUDIO: Realtek HD Audio (on board) RAM: 4 GB OS: Windows 7 - 32 bit |
#227
Quote:
But it seems that only you know how to do that at the moment.
__________________
http://theoatmeal.com/comics/cat_vs_internet |
#228
Code:
/* #define EMULATE_VECTOR_RESULT_BUFFER */
/* #define PARALLELIZE_VECTOR_TRANSFERS */

#ifdef EMULATE_VECTOR_RESULT_BUFFER
static short Result[8];
#endif

#ifdef EMULATE_VECTOR_RESULT_BUFFER
#ifdef PARALLELIZE_VECTOR_TRANSFERS
#define ACC_R(i) Result[i]
#define ACC_W(i) VACC[i].s[00]
#else
#define ACC_R(i) VACC[i].s[00]
#define ACC_W(i) Result[i]
#endif
#else
#ifdef PARALLELIZE_VECTOR_TRANSFERS
#define ACC_R(i) VR[vd][i]
#define ACC_W(i) VACC[i].s[00]
#else
#define ACC_R(i) VACC[i].s[00]
#define ACC_W(i) VR[vd][i]
#endif
#endif

Code:
EX:
    if (inst >> 25 == 0x25) /* is a VU instruction */
    {
        const int vd = (inst & 0x000007C0) >> 6;
        const int vs = (inst & 0x0000FFFF) >> 11;
        const int vt = (inst & 0x001F0000) >> 16;
        const int e  = (inst & 0x01E00000) >> 21;
#ifdef PARALLELIZE_VECTOR_TRANSFERS
        SHUFFLE_VECTOR(vt, e); /* *(__int128 *)VC = shuffle(VT, mask(e)); */
#endif
        SP_COP2_C2[inst %= 64](vd, vs, vt, e);
#ifdef EMULATE_VECTOR_RESULT_BUFFER
        memcpy(VR[vd], Result, 16);
#endif
        continue;
    }
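For anyone who wants to check the field extraction above in isolation, here is a small standalone sketch. The `VuFields` struct and the `is_vu_instruction`, `decode_vu` and `encode_vu` helpers are hypothetical names invented for this example; only the masks and shifts mirror the snippet. It uses `0x0000F800` for `vs`, which selects the same bits 11 to 15 as `(inst & 0x0000FFFF) >> 11`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields (not from the plugin source). */
typedef struct {
    unsigned vd, vs, vt, e, func;
} VuFields;

/* Top 7 bits equal to 0x25 marks a vector-unit operation, as in the snippet. */
static int is_vu_instruction(uint32_t inst)
{
    return (inst >> 25) == 0x25;
}

/* Pull out the same fields the snippet extracts. */
static VuFields decode_vu(uint32_t inst)
{
    VuFields f;
    assert(is_vu_instruction(inst));
    f.vd   = (inst & 0x000007C0u) >> 6;   /* bits 6..10  */
    f.vs   = (inst & 0x0000F800u) >> 11;  /* bits 11..15 */
    f.vt   = (inst & 0x001F0000u) >> 16;  /* bits 16..20 */
    f.e    = (inst & 0x01E00000u) >> 21;  /* bits 21..24 */
    f.func = inst & 0x3Fu;                /* same value as inst % 64 */
    return f;
}

/* Inverse helper, only used to build test words. */
static uint32_t encode_vu(unsigned func, unsigned vd, unsigned vs,
                          unsigned vt, unsigned e)
{
    return (0x25u << 25)
         | ((uint32_t)(e  & 0x0Fu) << 21)
         | ((uint32_t)(vt & 0x1Fu) << 16)
         | ((uint32_t)(vs & 0x1Fu) << 11)
         | ((uint32_t)(vd & 0x1Fu) << 6)
         | (func & 0x3Fu);
}
```

Building a word with `encode_vu` and passing it to `decode_vu` should return the fields that went in, which is a quick way to confirm the masks line up.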
__________________
http://theoatmeal.com/comics/cat_vs_internet |
#229
A new public release has been posted to this thread for testing.
Get it from the latest attachment to the first post. Thanks to jelta for digging up that bug with Animal Forest. And thanks again to Olivieryuyu/mesk14 for those additional duplicated unaligned-address handles. View the list of changes in this version.
#230
I have sent you a PM.
__________________
--------------------- CPU: Intel U7300 1.3 GHz GPU: Mobile Intel 4 Series (on board) AUDIO: Realtek HD Audio (on board) RAM: 4 GB OS: Windows 7 - 32 bit |