#1091
Lol I don't blame you for getting mixed up. Even I used to be confused about it. Lucky you! I could use a new PC ;/ .
#1092
Quote:
Last edited by oddMLan; 29th October 2014 at 05:24 PM.
#1093
Quote:
Don't really need SSE3 or later though; everything the N64 RSP does can be directly mapped to SSE2, which is why I'm hoping there won't even be a reason to have an SSSE3 version of the plugin.
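To illustrate the mapping idea: an RSP vector register holds eight 16-bit lanes, which is the same shape as an SSE2 `__m128i` viewed as eight `int16_t` values. Below is a minimal sketch of a clamped lane-wise add using only SSE2. The function name `add_clamp_s16x8` is made up for this example and is not from the plugin source, and a real RSP VADD also involves carry flags that this sketch leaves out.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the plugin): adds eight signed 16-bit
 * lanes and clamps each sum to the range [-32768, 32767]. */
void add_clamp_s16x8(int16_t dst[8], const int16_t a[8], const int16_t b[8])
{
    assert(dst != NULL && a != NULL && b != NULL);

    /* Unaligned loads, so callers do not need 16-byte aligned arrays. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* One SSE2 instruction handles all eight lanes with signed saturation. */
    __m128i sum = _mm_adds_epi16(va, vb);

    _mm_storeu_si128((__m128i *)dst, sum);
}
```

Nothing here needs SSE3 or SSSE3; the saturating add has been part of SSE2 from the start.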
__________________
http://theoatmeal.com/comics/cat_vs_internet
Last edited by HatCat; 29th October 2014 at 05:08 PM.
#1094
FWIW I never had any issues with the SSSE3 variant, but I wasn't really doing much specific testing, so maybe there's some quirkass RSP usage in some game that has an issue with SSSE3 :P
#1095
Probably not. The gap seems to be smaller with the current source, although I'm using the Intel compiler since I'm unable to compile with GCC. It's giving me trouble, saying there are "conflicting types for DllMain".
#1096
Oh I'm sure that would have to have been a bug with the GNU C compiler.
All the vector stuff was optimized 95% or more on the C level; any responsibility beyond that went to the compiler (which actually did a poor job with the multiply packs). It's a little more under my control and less the compiler's these days since some inline-assembly-ish intrinsics are being used (inflexible and unportable code ftw?).
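For anyone wondering what "inline-assembly-ish intrinsics" look like for a multiply, here is a rough sketch, taking "multiply packs" loosely to mean packed 16-bit multiplies. SSE2 has no single instruction that produces eight full 32-bit products from 16-bit lanes, so the low and high halves are computed separately and interleaved. The name `mul_s16x8_to_s32` is hypothetical and this is not the plugin's actual code.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the plugin): full signed 32-bit products
 * of eight 16-bit lane pairs. */
void mul_s16x8_to_s32(int32_t prod[8], const int16_t a[8], const int16_t b[8])
{
    assert(prod != NULL && a != NULL && b != NULL);

    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    __m128i lo = _mm_mullo_epi16(va, vb);  /* low 16 bits of each product  */
    __m128i hi = _mm_mulhi_epi16(va, vb);  /* high 16 bits, signed multiply */

    /* Interleaving (lo, hi) pairs rebuilds each little-endian 32-bit
     * product: lanes 0..3 first, then lanes 4..7. */
    _mm_storeu_si128((__m128i *)&prod[0], _mm_unpacklo_epi16(lo, hi));
    _mm_storeu_si128((__m128i *)&prod[4], _mm_unpackhi_epi16(lo, hi));
}
```

Spelling out the mullo/mulhi pair by hand is the kind of control the intrinsics give, at the cost of tying the code to x86.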
#1097
Quote:
Code:
ld --shared --entry _DllMain@12 -o ../rsp/bin/rsp.dll ../rsp/obj/module.o ../rsp/obj/su.o ../rsp/obj/vu.o ../rsp/obj/multiply.o ../rsp/obj/add.o ../rsp/obj/select.o ../rsp/obj/logical.o ../rsp/obj/divide.o -lkernel32
#1098
Quote:
thnx
#1099
There is no change in performance, just in method of expressing the algorithm.
It can however be said that the binary itself has "multiplied performance". The algorithm is adjusted to fixate itself to present-day compiler limitations for Intel SSE2. I'm sure the native C version of the algorithm would beat out the inline-assembly-ish intrinsics for many other operations.
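As a small illustration of "same algorithm, different method of expressing it", here is a lane-wise AND written both ways. This is only a sketch with made-up function names (`and_lanes_c`, `and_lanes_sse2`), not code from the plugin. Both versions produce identical results; whether a compiler turns the plain loop into the same single instruction depends on the compiler and its flags.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Native C expression: a plain loop the compiler is free to vectorize. */
void and_lanes_c(int16_t dst[8], const int16_t a[8], const int16_t b[8])
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (int i = 0; i < 8; i++)
        dst[i] = (int16_t)(a[i] & b[i]);
}

/* Intrinsic expression: the same operation pinned to one SSE2 instruction. */
void and_lanes_sse2(int16_t dst[8], const int16_t a[8], const int16_t b[8])
{
    assert(dst != NULL && a != NULL && b != NULL);
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)dst, _mm_and_si128(va, vb));
}
```

The loop version stays portable to non-x86 targets, while the intrinsic version fixes the instruction choice regardless of what the optimizer would have picked.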
#1100
Quote:
Ugh these confusing names