  #851  
Old 21st June 2014, 01:15 AM
RPGMaster's Avatar
RPGMaster RPGMaster is offline
Alpha Tester
Project Supporter
Super Moderator
 
Join Date: Dec 2013
Posts: 2,008
Default

I know a little about object linking. I just never felt a need to do a large function in pure assembly yet. Then again I never made anything big yet lol. I should try it out sometime I guess. My issue is, what about inline functions? How could I do those in assembly? Calling very small functions is not my style, even though calling a function solves the compiler issue.

What is your assembler of choice? I really need to learn the syntax for special features.
  #852  
Old 21st June 2014, 02:34 AM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

First, you don't need an inline property for assembly languages. You can just declare an `extern' of the INLINE function's prototype in the C source and have the definition in asm, and any in-lining can be done at link-time code generation across modules. Second, C is unreliable compared to assembly languages when it comes to inlining or not inlining functions when you want them to. Third, if you can't handle programming more than a single, small function in an assembly language then you would probably do better to give up on both asm and inline asm and just stick to C for the duration of the project. Fourth, I don't have a favorite assembler, because they all function nearly the exact same way anyway. I alternate between them all only to test syntax compatibility.
  #853  
Old 21st June 2014, 05:31 AM
RPGMaster's Avatar
RPGMaster RPGMaster is offline
Alpha Tester
Project Supporter
Super Moderator
 
Join Date: Dec 2013
Posts: 2,008
Default

I just like finding shortcuts, but I've reached a dead end. I can handle writing many functions in assembly; I'd just rather write most of it in C if the end result will be the same. Sometimes inline assembly works exactly how I want it to. So in those cases, it's easier for me to just do inline assembly, especially when it's only for a small part of a function. When I run into problems with inline assembly, then I'll probably just write the entire function in assembly. I actually just tried the __declspec(naked) feature since 1964 Audio is locked to MSVC anyway. It is quite useful.

I'd love to make 1964 Audio work on other compilers, but I'll have to learn how to substitute some of the MSVC exclusive features. I also need to figure out why it works on Mupen64 and 1964, but not PJ64 1.6. It's also weird how the GUI only works in 1964 too.

I was just curious about the syntax and features of the different assemblers. I guess I'll have to look around to figure stuff out. I haven't tried linking obj files yet, since I need to get used to MASM again and learn its syntax; I don't know it too well outside of the basics. I like taking advantage of useful features in a compiler or assembler.

You're right about C not being reliable with inlining functions. I just write macros for small pieces of code. I keep forgetting the functionality difference between an inline function and a macro.
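As a reminder of that difference, here is a minimal sketch, assuming a C99 compiler (older MSVC in C mode spells the keyword `__inline`). The names `SQUARE_MACRO`, `square_inline`, and `next_value` are made up for illustration. A macro pastes its argument text in wherever the parameter appears, so an argument with side effects can run more than once; a function evaluates its argument exactly once and type-checks it.

```c
#include <assert.h>

/* Macro: pure text substitution, so the argument expression appears twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline function: the argument is evaluated once, then passed by value. */
static inline int square_inline(int x)
{
    return x * x;
}

static int calls; /* counts how many times next_value() has run */

static int next_value(void)
{
    ++calls;
    return 3;
}

void compare_macro_and_inline(void)
{
    int r;

    calls = 0;
    r = SQUARE_MACRO(next_value());  /* expands to two calls */
    assert(r == 9 && calls == 2);

    calls = 0;
    r = square_inline(next_value()); /* exactly one call */
    assert(r == 9 && calls == 1);
}
```

Whether the function actually gets inlined is still up to the compiler, which is the unreliability mentioned above; the macro is always expanded in place.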
  #854  
Old 21st June 2014, 05:49 PM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

A rather double-ended view on language preference.

On one hand, you prefer to write most of it in C if "the end result will be the same".
On the other hand, you'd rather use inline assembly than to write ALL of it in C, because for some reason you don't like intrinsics.

So basically, you'd rather code entire functions in C than in assembly for times in which the resulting executable would still be the same, yet you'd rather mix inline assembly and only use partial, non-portable and non-standards-compliant C than to use 100% C with intrinsics (necessary for asm tricks like byte-swapping etc.) when it still maintains the same output?

If an SDK hands you an intrinsic for solving an entire problem (e.g. SSE instructions, some intrinsics for rotates/swaps), there's a good reason why you're being offered those resources. You don't ignore those features and use inline ass, because that's ass. By hard-coding the machine instructions in the middle of high-level/functional C code, you're making it way harder than it has to be for the compiler to optimize the program *as a whole*. It can't take in the entire algorithm if you throw off the C logic with asm logic and expect one to not dominate the other.

Either be consistent and code *at least* an entire function in the same, consistent language throughout, or don't trust the C compiler and code wholly in asm and not C. :P Personally, I prefer the latter. Intrinsics are ugly code that show no algorithm, but making the algorithm even less natural and symbolic of a fundamental computer system with inline ass is no better.
  #855  
Old 21st June 2014, 07:51 PM
RPGMaster's Avatar
RPGMaster RPGMaster is offline
Alpha Tester
Project Supporter
Super Moderator
 
Join Date: Dec 2013
Posts: 2,008
Default

You could say I'm just very picky lol. I like to use whatever gets the job done the best while also factoring in convenience. Sometimes I don't always follow this philosophy and go out of my way to the extreme for either convenience or performance, depending on my mood.

I'm not 100% against intrinsics. I don't mind using them to move data from one location to another. It's just that I feel I can do a better job sometimes because I get to make assumptions that the compiler has no knowledge of. I do find SSE intrinsics to be very ugly. At least for right now, SSE assembly code is more readable to me. Another problem I have with intrinsics is that they also seem to be able to screw up the compiler. I tested this by using the NOP intrinsic, and the compiler acted the same way it did when I used inline assembly for NOPs. I also could not get the byteswap intrinsic to work ;/

Byteswapping is one of those things I'd rather do in C if the output was the same, because it's just moving data. I can't see the compiler doing a worse job than I would.

I've gotten to the point where I have a good sense of when to use inline assembly and when not to. Generally if the function isn't too large, then a small amount of assembly code will not be very harmful if written properly. My only issue with mixing assembly right now is the fact that if you use it enough, then you're stuck with that compiler and compiler settings as well. Then again even a pure assembly function can have this problem too when doing fastcall, although it's not as bad. Unless there's something I'm missing, since to my knowledge, you can't guarantee what registers the compiler will use to pass parameters.

I think GCC does a better job with mixing assembly code and C, but I've heard that it can change your code around.

I really need to test out linking obj files. Hopefully that's even better than using __declspec(naked), aside from the increase in portability. The one thing I'm not happy about with __declspec(naked) functions is that the compiler still does a suboptimal job with passing parameters. I was trying to write the SSE equivalent of certain functions in 1964's HLE implementation, and for some reason, instead of passing the third parameter in eax, it moves it onto the stack and then, at the beginning of the function, moves it back into eax. Not sure if linking objects will be better though.

I don't get why people are against heavily using ifdef's. What's the reason for this?
  #856  
Old 21st June 2014, 08:03 PM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

Quote:
Originally Posted by RPGMaster View Post
I'm not 100% against intrinsics. I don't mind using them to move data from one location to another. It's just that I feel I can do a better job sometimes because I get to make assumptions that the compiler has no knowledge of.
But none of these assumptions have to do with small tasks, like swapping bytes or SSE operations. If you used a C-style byteswap intrinsic to swap bytes on a 32-bit variable set to 0x00000000, the compiler could detect at a high level that there was no point in swapping bytes. However, if you used in-line ass (note that in this particular case, we're talking about a particular version of inline asm called inline ass) to hardcode a bswap instruction, it wouldn't necessarily detect this. So actually you have it switched backwards: it's C-level intrinsics that are capable of making safe assumptions that your inline assembly code has no knowledge of.

I know how much you hate the saying that "you can't beat the compiler", but for individual steps, small enough stages of operations to fit into a C intrinsic function, you would be hard-pressed to find counterexamples.
  #857  
Old 21st June 2014, 08:24 PM
RPGMaster's Avatar
RPGMaster RPGMaster is offline
Alpha Tester
Project Supporter
Super Moderator
 
Join Date: Dec 2013
Posts: 2,008
Default

Right. Sometimes the compiler makes more assumptions than I could make. I was more referring to things like knowing the guaranteed value of a register at that exact moment in time. Like I said, I'd use the byteswap intrinsic if I could get it to work lol. I already use intrinsics for moving data. I admit I don't have too much experience with intrinsics, so maybe for general cases where I can't make assumptions about the values of registers, it would be better to use intrinsics. I'll have to see how well or badly MSVC generates code using intrinsics.

Based on my experience with non-SSE code, I get the feeling I can sometimes do a better job with register management, although I've learned tricks to improve compiler-generated code. I still don't get why sometimes the compiler will do unnecessary memory stores on variables that won't be used again, but if I use another temporary local variable, the compiler won't do those memory stores. I believe it's a technique called aliasing, right? Or maybe I'm just getting mixed up with the terms I've read about.

I guess I'll try comparing intrinsics to my assembly routines for the few functions I've tweaked in 1964 Audio.
  #858  
Old 24th June 2014, 10:58 PM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

bump.............

Quote:
Originally Posted by RPGMaster View Post
Right. Sometimes the compiler makes more assumptions than I could make. I was more referring to things like knowing the guaranteed value of a register at that exact moment in time.
Same as what I said, an intrinsic would know the guaranteed properties of a variable better than inline assembly.

Quote:
Originally Posted by RPGMaster View Post
Like I said, I'd use the byteswap intrinsic if I could get it to work lol.
Why not?
The byteswap intrinsic isn't the same on all compilers...I know it's supposed to be one thing on GCC, something else on the other. It should be working for you though. I know MarathonMan uses it. I don't though, so I can't tell ya offhand.
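For reference, a minimal sketch of hiding that compiler difference behind one function. The wrapper name `swap32` is made up for illustration; `_byteswap_ulong` (MSVC, declared in `<stdlib.h>`) and `__builtin_bswap32` (GCC and Clang) are the usual spellings, with plain shifts as a fallback for anything else.

```c
#include <assert.h>
#include <stdint.h>
#if defined(_MSC_VER)
#include <stdlib.h> /* declares _byteswap_ulong */
#endif

/* swap32 is a hypothetical wrapper name, not part of any SDK. */
static uint32_t swap32(uint32_t x)
{
#if defined(_MSC_VER)
    return (uint32_t)_byteswap_ulong(x);
#elif defined(__GNUC__)
    return __builtin_bswap32(x);
#else
    /* Portable fallback: move each byte to its mirrored position. */
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}

void check_swap32(void)
{
    assert(swap32(0x11223344u) == 0x44332211u);
    assert(swap32(0x00000000u) == 0x00000000u);       /* nothing to swap */
    assert(swap32(swap32(0xDEADBEEFu)) == 0xDEADBEEFu); /* swapping twice is a no-op */
}
```

Because the compiler sees the operation rather than an opaque asm block, it is free to fold a swap of a known constant at compile time; whether a given compiler and optimization level actually does so is worth confirming in the generated assembly.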

Quote:
Originally Posted by RPGMaster View Post
I still don't get why sometimes the compiler will do unnecessary memory stores on variables that won't be used again, but if I use another temporary local variable, the compiler won't do those memory stores. I believe it's a technique called aliasing, right?
wait, wut? </filler>
  #859  
Old 24th June 2014, 11:34 PM
RPGMaster's Avatar
RPGMaster RPGMaster is offline
Alpha Tester
Project Supporter
Super Moderator
 
Join Date: Dec 2013
Posts: 2,008
Default

Quote:
Originally Posted by HatCat View Post
Same as what I said, an intrinsic would know the guaranteed properties of a variable better than inline assembly.
Hmm, well I'll need to practice it more then. So far, I've been doing as you suggested and writing pure ASM functions. I'll try using intrinsics to see what the difference is.
Quote:
Originally Posted by HatCat View Post
Why not?
The byteswap intrinsic isn't the same on all compilers...I know it's supposed to be one thing on GCC, something else on the other. It should be working for you though. I know MarathonMan uses it. I don't though, so I can't tell ya offhand.
It's been a while; it either did some inefficient swapping or didn't work at all, can't remember. You're probably right about it not being the same on all compilers. I doubt MarathonMan uses MSVC. With MSVC, you have to be a lot more explicit and can't rely on the compiler nearly as much, unless of course optimizations aren't too important. I'll try again today, now that I'm more familiar with stuff.
Quote:
Originally Posted by HatCat View Post
wait, wut? </filler>
Kinda hard to explain. There are times where the compiler does a poor job with optimizing register usage. This is especially true with global variables. IIRC, when I used a global variable as a counter in a for loop, it did a poor job compared to when I copied the value of a global variable into a local and used that as the counter. Sometimes even with local variables, the compiler does dumb things.

One example was I once did something like
Code:
var = var | 0x20202020;
then checked its value in an if statement. Keep in mind this was near the end of a medium sized function, and the variable was never used again. Yet it saved the value to a memory location. If I did
Code:
temp_var = var | 0x20202020;
it never saved the value into a memory location. Sure this is so small it doesn't matter. But why does this happen in the first place lol? I usually don't go this far, but these are good techniques to learn for when every cycle counts.
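One plausible explanation for the global-counter case, sketched below with made-up names (`g_count`, `fill_global_counter`, `fill_local_counter`): a store through an `int *` might alias an `int` global as far as the compiler can tell, so it generally has to keep the global up to date in memory. A local whose address is never taken cannot be aliased, so it can stay in a register. This is an illustration of the aliasing rule, not a reconstruction of the 1964 Audio code.

```c
#include <assert.h>

int g_count; /* hypothetical global used as the loop counter */

/* dst might point at g_count for all the compiler knows (both are int),
 * so g_count typically gets reloaded and stored on every iteration. */
void fill_global_counter(int *dst, int value)
{
    for (g_count = 0; g_count < 64; g_count++)
        dst[g_count] = value;
}

/* The address of i is never taken, so no store can alias it and it can
 * live in a register; the global is written once, after the loop. */
void fill_local_counter(int *dst, int value)
{
    int i;

    for (i = 0; i < 64; i++)
        dst[i] = value;
    g_count = i;
}

void check_fill(void)
{
    int buf[64];

    fill_global_counter(buf, 7);
    assert(g_count == 64 && buf[0] == 7 && buf[63] == 7);

    fill_local_counter(buf, 9);
    assert(g_count == 64 && buf[0] == 9 && buf[63] == 9);
}
```

Both versions produce the same result; only the generated loop body differs, which is easiest to see by comparing the assembly listings at the same optimization level.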
  #860  
Old 24th June 2014, 11:43 PM
HatCat's Avatar
HatCat HatCat is offline
Alpha Tester
Project Supporter
Senior Member
 
Join Date: Feb 2007
Location: In my hat.
Posts: 16,236
Default

One thing I like about global variables is that they are handy for benchmarking loops of small things: even optimizing compilers won't optimize stores to them out.

For example if you do this in C:
Code:
int main(void)
{
    int ass;

    ass = 5;
    return 0;
}
then it just compiles to xor eax, eax; ret; because ass is defined but unused.

But even with a compiler that smart it knows better than to do that for this:
Code:
int ass;

int main(void)
{
    ass = 5;
    return 0;
}
then it compiles instead to something like

mov DWORD PTR ass[rip], 5;
xor eax, eax;
ret;

...which is much more spicy
but more to the point, useful for debugging huge loops and benchmarking things.
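A minimal sketch of that benchmarking trick, assuming standard C `clock()` for timing. The names `sink` and `bench_xor_loop` and the multiplier constant are made up for illustration. Writing the loop's result to an externally visible global keeps the computation observable, so the compiler cannot discard it as dead code.

```c
#include <assert.h>
#include <time.h>

unsigned int sink; /* externally visible, so the final store must be kept */

/* Runs a small loop and returns the elapsed CPU time in seconds. */
double bench_xor_loop(unsigned long iterations)
{
    clock_t start = clock();
    unsigned int acc = 0;
    unsigned long i;

    for (i = 0; i < iterations; i++)
        acc ^= (unsigned int)i * 2654435761u; /* arbitrary work to time */

    sink = acc; /* publish the result so the loop is not dead code */
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

void check_bench(void)
{
    double seconds = bench_xor_loop(1000UL);

    assert(seconds >= 0.0);
}
```

The compiler may still vectorize or otherwise restructure the loop, and it can hoist a per-iteration global store out of the loop; if every single store has to survive, declaring the global `volatile` is the usual way to force it.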
All times are GMT. The time now is 03:16 AM.

