#201
What's wrong with the type of COLOR? You can't make the color components 8-bit: the color combiner operates on sign.9 values. If you attempt this, you'll break the color of the big rotating "N" in the Zelda OoT intro.
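A minimal sketch of why an 8-bit component is too narrow, assuming only that intermediate combiner values can go negative (the "sign.9" range mentioned above). The helper names `diff_wide` and `diff_narrow` are hypothetical, and this does not model the actual RDP combiner.

```c
#include <stdint.h>

/* Hypothetical helper: difference of two components kept in a wide signed type. */
int32_t diff_wide(int32_t a, int32_t b)
{
    return a - b;
}

/* The same difference forced through an 8-bit unsigned component:
 * the sign is lost and the value wraps modulo 256. */
uint8_t diff_narrow(int32_t a, int32_t b)
{
    return (uint8_t)(a - b);
}
```

With a = 0x10 and b = 0x20, the wide version keeps -16 while the narrow one wraps to 240, so anything computed from it afterwards is off.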
Last edited by angrylion; 19th June 2013 at 03:19 AM. |
#202
Quote:
Even still, a UINT16 is better than a UINT32, no? On 64-bit archs, you could pass around a full RGBA value instead of a pointer and save yourself the cost of dereferencing.
Last edited by MarathonMan; 19th June 2013 at 03:42 AM. |
#203
Maybe, but I don't recommend trusting anything without experimenting with a profiler. Last I heard, 32-bit memory accesses and arithmetic were generally faster than 16-bit on x86-32. If you want to avoid dereferencing pointers for the combiner, does that mean you want to have a set of "combiner input" COLORs and copy the same colors to them twice in 2-cycle mode?
#204
Quote:
I thought that was just to save space/program size. AFAIK the fastest data type is just `int`, not long, short, char, w/e. On Win32 that is the same size as the INT32 or UINT32 macro, so maybe that's why MESS chose UINT32? Not sure, but I didn't think UINT16 performed any better than UINT32 on a 32-bit system.
I don't mind if he uses UINT32, but if he wants it optimized it needs to be movzbl like you said or, if doing register-to-register communication instead, the MOV EAX, DH like I said (move the upper 8 bits of 16-bit DX into 32-bit EAX as the resulting color1.r structure member).
Quote:
(Btw I never say "UINT8" instead of "unsigned char" because I prefer relying on the pure, internal C keywords. Yes, you're more than welcome to accuse me of making ugly code to scare people away there.)
This is perfectly fine ultimately: you're still shifting a 32-bit int to the right and trapping the resulting low 8 bits. I don't like to write it this way, however, because it issues an ANSI C warning (an insignificant warning, but a warning nonetheless): it shifts 32 bits right by 8 and stores the result into an 8-bit target without an explicit conversion to (unsigned char). It will be just as optimized, but to get rid of the warning some compilers might brag about, I would say byte = (unsigned char)(shit >> 8) instead of just byte = (shit >> 8), only to be careful/explicit. You don't have to, though. It does emphasize even more strongly that we're moving 8 bits (and even more strongly that we're moving SHIT!), so in that sense I consider the intended optimization more visible to the reader as well as the compiler.
Quote:
As you were saying, target = (bitstring & 1) ? 0xFF : 0x00 ... compiles to that assembly code you posted earlier. I wrote the C version of that asm code, which is why I find it more readable. My argument is really as simple as that, but it doesn't prove that I'm right or that you're wrong in your view of it. It reminds humans of the intended, necessary optimization the compiler SHOULD make, whether or not it's one of the modern ones that does. I see it as a habit/practice; I'm not saying everyone likes it.
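For readers who missed the earlier page, here is a rough sketch of what branch-free C versions of that ternary can look like. The exact code from the earlier posts isn't reproduced in this part of the thread, so the function names and both branch-free forms below are assumptions, picked to match the NEG and IMUL variants mentioned elsewhere in the discussion.

```c
#include <stdint.h>

/* The ternary form quoted in the post. */
uint8_t mask_ternary(uint32_t bitstring)
{
    return (bitstring & 1) ? 0xFF : 0x00;
}

/* Negate form: 0 - 1 wraps to all ones in unsigned arithmetic,
 * and the cast keeps only the low 8 bits. */
uint8_t mask_negate(uint32_t bitstring)
{
    return (uint8_t)(0u - (bitstring & 1u));
}

/* Multiply form: the low bit (0 or 1) scaled by 0xFF. */
uint8_t mask_multiply(uint32_t bitstring)
{
    return (uint8_t)((bitstring & 1u) * 0xFFu);
}
```

All three return 0xFF when the low bit is set and 0x00 otherwise. Whether a given compiler emits different machine code for them is something to check in the disassembly rather than assume.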
__________________
http://theoatmeal.com/comics/cat_vs_internet |
#205
Quote:
Last edited by angrylion; 19th June 2013 at 04:09 AM. |
#206
Quote:
You're shifting a 32-bit value by 8. The compiler would be DUMB to use AH access, because AH is an 8-bit register. All you specified was to take a 32-bit value and shift it right by 8; that does not say to discard the upper 24 or 16 bits, and for the compiler to make that assumption for you would be wrong. So no, you must specify (unsigned char) or & 0xFF if you want it to use AH access. If you prefer to go by the manual and not do that, fine.
Also, I didn't see your post on the last page before, and I don't have time to address other replies just yet.
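A small sketch of the difference being argued here, with hypothetical function names and `texel` as an assumed parameter name. It only shows what values the three spellings produce, not which instructions a particular compiler picks.

```c
#include <stdint.h>

/* Shift only: the result is still 32 bits wide, so the upper bits survive. */
uint32_t shift_only(uint32_t texel)
{
    return texel >> 8;
}

/* Shift plus explicit cast: only bits 8..15 of the input remain. */
uint8_t shift_cast(uint32_t texel)
{
    return (uint8_t)(texel >> 8);
}

/* Shift plus mask: same value as the cast, spelled with & 0xFF. */
uint8_t shift_mask(uint32_t texel)
{
    return (texel >> 8) & 0xFF;
}
```

For an input of 0x12345678, `shift_only` gives 0x00123456, while `shift_cast` and `shift_mask` both give 0x56.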
#207
Ah, you're partially correct: for some reason I shift a 16-bit unsigned value in some texel fetching functions and a 32-bit unsigned value in others. I'll see if I have the inclination to change this to always shift a 16-bit unsigned value.
Last edited by angrylion; 19th June 2013 at 05:09 AM. |
#208
This is just the thing that I was talking about.
Relying on what the compiler does implicitly is not everything. Sometimes it's just better for maintainability of the code to always make sure of the type.
Code:
/* Two names so both versions can live in the same file. */
UINT8 function16(UINT16 uint16)
{
    return ((unsigned char)(uint16 >> 8));
}
UINT8 function32(int uint32)
{
    return ((unsigned char)(uint32 >> 8));
}
Either way, why remove it? You know that you desire exactly 8 bits. The more information you explicitly give the compiler, the less likely it is to miss an optimization. Even for the first of those two functions, where (unsigned char) isn't needed, it's less hazardous to future maintenance of the code (maybe later you change it from UINT16 into the fast int type) and minimizes external changes when you make your code portable. It's just safer design IMO, and no less readable. And if I'm wrong about the MOV EAX, DH 8-bit register access, OK, the compiler won't generate that code. It will pick something better if I'm wrong.
Quote:
Back then MOV AX, DH was beautiful. You didn't need to shift by 8 because you had this internal workaround. There also was no "E"AX, "E"BX... it was just the base register names. If things have changed since then, fine, the compiler should know I'm wrong. That doesn't change my C advice, as it should increase the likelihood that correct output is generated, as well as code security.
Quote:
It was just to demonstrate that branch management/weighing could be eliminated, and as I already admitted at the end of the post, it was better to use a branch and the NEG opcode, not IMUL.
[^ in reply to me predicting this would happen about the (var & ~03) >> 8 thing, being really just the same thing as (var >> 6)]
Which is one reason why I didn't change it. It's easy for compiler theory to understand that (var & ~03) >> 8 is redundant. If, however, you also consider that more readable than just saying (var >> 6), then I think you have a very strong opinion of readability vs. compiler theory. Ultimately, either way, the real reason I'm not making my own suggested change to that code is that it's wisest to keep it in sync with yours, letter-for-letter, to make merging those more colossal source updates on Google Code easier.
Quote:
It was a valid example of how to remove a branch, but it was not a good example of removing a branch in a situation where you should. So midway, I cancelled the multiply code and commented it out, instead showing that it could be more pre-optimized even in the case where you must branch.
#209
Quote:
I was saying that, on x86_64 (because IDGAF about IA-32 anymore)
Code:
struct COLOR {
    UINT16T r,g,b,a;
};
Code:
foo:
    movl 0(%rdi), %eax   ; red component
    movl 4(%rdi), %ebx   ; green component
    ...
    ; prologue; write the (modified parts) of the color back out
    movl %eax, 0(%rdi)
    movl %ebx, 4(%rdi)
Code:
foo:
    ; COLOR is already in %rdi
    ; prologue is cheap now!
    movq %rdi, %rax
    ret
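A rough C-level sketch of the same idea, assuming four 16-bit components packed into 8 bytes. The type name `color64` and both functions are hypothetical and not taken from any emulator source. Under the System V x86-64 calling convention a struct like this is normally passed and returned in a single integer register, but whether that beats the pointer version is something to measure with a profiler, not assume.

```c
#include <stdint.h>

/* Hypothetical packed color: four 16-bit components, 8 bytes in total. */
typedef struct {
    uint16_t r, g, b, a;
} color64;

/* By-value version: the caller's copy is untouched and the callee has
 * no pointer to dereference. */
color64 halve_red(color64 c)
{
    c.r = (uint16_t)(c.r >> 1);
    return c;
}

/* Pointer version for comparison: modifies the caller's color in place. */
void halve_red_ptr(color64 *c)
{
    c->r = (uint16_t)(c->r >> 1);
}
```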
Last edited by MarathonMan; 19th June 2013 at 01:25 PM. |
#210
Crazy.
You are wild with SSE ideas.
Well, 64-bit programming might benefit some things. But I think of it this way: the MIPS CPU is a 32-bit processor. The RCP CPU is 32-bit. A Win32 machine is 32-bit. I would just do the emulator in 32-bit. There are lots of things, though, like the RSP vector accumulators and that color struct idea you mentioned, that could in fact benefit from 64-bit programming and the SSE/SSE2 support it entails.
The problem I have with SSSE3 is that the Nintendo 64 has supported those complex vector operations since 1995. The PC with SSSE3 only starts supporting *some* of those RSP vector ops, like, 20 years later? That really gets to me.