Project64 Forums > General Discussion > Open Discussion

  #101  
12th April 2014, 04:06 AM
HatCat
Alpha Tester
Project Supporter
Senior Member

Join Date: Feb 2007
Location: In my hat.
Posts: 16,236

Yay for Microdox!

Welp, it sounds like I need to do a mid-update or something that throws in the .lib's, and hopefully some updated DLLs with one or two screenshot superpowers.

And this time the source code will be in C, not C++. That was like, the very first commit from the original code I did this time.

Oh, and if you select "Options/Configure Graphics Plugin..." in Project64 or w/e, it'll instantly toggle all the VI filters on and off. Kind of fun to play around with actually. Anyway, I hope I'm not just being all talk here; I've got to finish the damn 32-bit screenshot method before I talk about it.

  #102  
12th April 2014, 05:44 AM
RPGMaster
Alpha Tester
Project Supporter
Super Moderator

Join Date: Dec 2013
Posts: 2,008

Lol I finally got it working. Turns out that the lib file I downloaded was the issue.

Now the fun begins! Btw HatCat, have you tried compiling it with gcc? After what I've seen, I don't see how MSVC would compile better code lol. I just like how easy things are with MSVC.

Here's another example of how inefficient MSVC's compiler is. Also, it looks like GCC doesn't always do the best job.
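Code:
variable = (i & 1) ? 7 : 3; //MSVC
mov  ecx,dword ptr [esp+8]      ; ecx = i
and  ecx,1                      ; ecx = i & 1
add  ecx,ecx                    ; ecx *= 2
add  ecx,ecx                    ; ecx *= 2 again, so 0 or 4
or   ecx,3                      ; 3 or 7

variable = (i & 1) ? 7 : 3; //Intel
mov  eax,dword ptr [esp]        ; eax = i
and  eax,1                      ; eax = i & 1
lea  edx,[eax*4+3]              ; edx = (i & 1) * 4 + 3

variable = (i & 1) ? 7 : 3; //GCC
mov  eax,dword ptr [esp + 0x1c] ; eax = i
and  eax,1                      ; eax = i & 1
cmp  eax,1                      ; carry set only when eax == 0
sbb  eax,eax                    ; eax = -1 if carry was set, else 0
and  eax, 0xfffffffc            ; eax = -4 or 0
add  eax,7                      ; 3 or 7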

Last edited by RPGMaster; 12th April 2014 at 05:33 PM.

  #103  
12th April 2014, 02:05 PM
ReyVGM
Project Supporter
Senior Member

Join Date: Mar 2014
Posts: 212

Quote:
Originally Posted by HatCat View Post
The wait? I thought the reason you wanted an update was because there was no screenshot feature.

I never was the most attentive person though.
Yes, the wait will be longer because now I want it to be released even faster :P

So wait, you're working on this plugin again or in your own branch? Will the whole resolution thing be fixed on your branch or in this one?

Last edited by ReyVGM; 12th April 2014 at 03:22 PM.

  #104  
12th April 2014, 08:12 PM
HatCat

I updated the original post with a sort of mid-release to address the linker concerns RPGMaster was having.

This time, remember to try out "Options/Configure Graphics Plugin" in Project64 while playing. Shunyuan already forces all the VI filtering off, so this time around I figured I'd give you guys the simple option of toggling it on and off. It yields something like a 10 VI/s speed boost, depending on how much the RDP is to blame.

So it should be about as fast as SoftGraphic by now, but I didn't test.

Quote:
Originally Posted by ReyVGM View Post
Yes, the wait will be longer because now I want it to be released even faster :P
Ended up updating the build to address some issues anyway.

While I did successfully implement BMP screenshots, the feature of raw RDRAM access is still experimental. angrylion is doing some weird shit with his pre-scale pixel table adjustments, so you'll usually get screenshots at something like 640x237 rather than 320x237. You can fix this by doing a non-filtered, raw pixel resize in the image editor of your choice until I tweak angrylion's pre-scale code to not mess with the BMP sources.
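
If you'd rather script that resize than open an image editor, a non-filtered halving is just "keep every other column". A minimal sketch, assuming 32-bit pixels in a flat row-major buffer and assuming the extra width really is a plain 2x column duplication (halve_width_nearest is a made-up name, not something in the plugin):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the plugin source: shrink an image to
 * half its width by keeping every other column (nearest-neighbor, no
 * filtering).  src holds src_width * height pixels, row-major;
 * dst must have room for (src_width / 2) * height pixels.
 */
static void halve_width_nearest(const uint32_t *src, uint32_t *dst,
                                size_t src_width, size_t height)
{
    const size_t dst_width = src_width / 2;
    size_t x, y;

    for (y = 0; y < height; y++)
        for (x = 0; x < dst_width; x++)
            dst[y * dst_width + x] = src[y * src_width + 2 * x];
}
```

For a 640x237 capture that would be halve_width_nearest(src, dst, 640, 237), leaving 320x237 pixels in dst.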

Quote:
Originally Posted by ReyVGM View Post
So wait, you're working on this plugin again or in your own branch? Will the whole resolution thing be fixed on your branch or in this one?
I don't understand much about branches or version control systems for that matter.
I use Git, not Google Code like angrylion does. Anyway Google sucks.

I think it would be accurate to say that I'm forking angrylion's code and improving upon it. I have yet to delve into most of the VI stuff about fixing angrylion's resolution since I mostly cared about performance.

Still, this build is all angrylion's original code. I pretty much started over fresh to organize things better.

Quote:
Originally Posted by RPGMaster View Post
Lol I finally got it working. Turns out that the lib file I downloaded was the issue.
That's what I thought.

No way in fuck would Microsoft make you link to ddraw.dll during run-time to get a pointer to the function.

I don't really care what their page says; it sounds to me like they worded it wrong.
In any case, I anticipated the worst case scenario for you and provided the latest libs Visual Studio 2013 gave me from the Windows 8 API.

Quote:
Originally Posted by RPGMaster View Post
Now the fun begins! Btw HatCat, have you tried compiling it with gcc?
The GCC build is faster than the Visual Studio build but only with my fork of the plugin. In the stuff I am posting to this thread where I started over fresh, the Visual Studio build is faster, so I don't bother to include the GCC binary just yet.

It's partly due to angrylion's explicit uses of STRICTINLINE all the time, causing overrides against GCC's better judgement.

Quote:
Originally Posted by RPGMaster View Post
Here's another example of how inefficient MSVC's compiler is. Also, it looks like GCC doesn't always do the best job.
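Code:
variable = (i & 1) ? 7 : 3; //MSVC
mov  ecx,dword ptr [esp+8]      ; ecx = i
and  ecx,1                      ; ecx = i & 1
add  ecx,ecx                    ; ecx *= 2
add  ecx,ecx                    ; ecx *= 2 again, so 0 or 4
or   ecx,3                      ; 3 or 7

variable = (i & 1) ? 7 : 3; //Intel
mov  eax,dword ptr [esp]        ; eax = i
and  eax,1                      ; eax = i & 1
lea  edx,[eax*4+3]              ; edx = (i & 1) * 4 + 3

variable = (i & 1) ? 7 : 3; //GCC
mov  eax,dword ptr [esp + 0x1c] ; eax = i
and  eax,1                      ; eax = i & 1
cmp  eax,1                      ; carry set only when eax == 0
sbb  eax,eax                    ; eax = -1 if carry was set, else 0
and  eax, 0xfffffffc            ; eax = -4 or 0
add  eax,7                      ; 3 or 7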
Yeah that's pretty weird. Thanks for the comparison!

I would just do something like:
Code:
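MOV     eax, 7          ; start from the odd-case result (assumes ecx already holds i)
NOT     ecx             ; ecx = ~i
AND     ecx, 1          ; ecx = 1 when i is even, 0 when i is odd
SHR     eax, cl         ; 7 >> 1 = 3 for even i; a variable shift count has to be in CL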
or maybe mov eax, 3 and that lea 4*x thing Intel did
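
The same branch-free forms can be written straight in C and checked against the ternary. A rough sketch (the function names are made up for illustration, and whether any given compiler turns these into better asm is not something this shows):

```c
/* Reference form from the comparison above. */
static int pick_ternary(int i)
{
    return (i & 1) ? 7 : 3;
}

/* What the MSVC output computes: (i & 1) * 4, then OR in the low bits.
 * Intel's LEA does the same thing with an add instead of an OR. */
static int pick_shift_or(int i)
{
    return ((i & 1) << 2) | 3;
}

/* The shift-right idea: 7 >> 1 == 3 when the low bit of i is clear. */
static int pick_shift_right(int i)
{
    return 7 >> (~i & 1);
}
```

All three return 3 for even i and 7 for odd i.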

  #105  
12th April 2014, 09:01 PM
RPGMaster

I've been studying compiler output for various things. I'm quite disappointed in some of my findings. In order to have good code, you have to write in a way that best suits the compiler you are using. For example, strcpy usually does a lousy job when using MSVC or Intel, but does a great job with GCC. This is a pain when portability is a concern. Even GCC isn't perfect and will sometimes do worse. The only way afaik to overcome this portability issue is to have a lot of #ifdef's. Good thing I can quickly find and replace text though, for pieces of code that are used a lot.
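
One way to keep the #ifdef's from spreading everywhere is to hide them behind a single macro and use that macro in the rest of the code. A minimal sketch, relying on the predefined macros _MSC_VER and __GNUC__; FORCE_INLINE and clamp_u8 are illustrative names, not anything from angrylion's source:

```c
/* Pick the compiler-specific "always inline" spelling in one place. */
#if defined(_MSC_VER)
#define FORCE_INLINE static __forceinline
#elif defined(__GNUC__)
#define FORCE_INLINE static __inline__ __attribute__((always_inline))
#else
#define FORCE_INLINE static /* unknown compiler: leave it to the optimizer */
#endif

/* Example user of the macro: clamp a value into the 0..255 range. */
FORCE_INLINE int clamp_u8(int x)
{
    if (x < 0)
        return 0;
    if (x > 255)
        return 255;
    return x;
}
```

The rest of the code then only ever says FORCE_INLINE, so swapping compilers means touching one block instead of doing a find and replace.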

Inline assembly usually changes the surrounding code, even if you write something like _asm nop. So in many cases there is no convenient way for me to efficiently mix assembly code with C code. That's why I'm trying to get better with C and learn new tricks.

What's the difference between an inline function and a macro? I tried inlining a very small and simple function just to see how function inlining works and MSVC wouldn't even inline it... That's part of the reason I just stick to macros lol.

I'm going to try profiling your current project to see where the hotspots are.

  #106  
12th April 2014, 09:41 PM
HatCat

A macro is more reliable for forcing code inline, since the preprocessor pastes it in no matter what the optimizer decides. For general maintenance and less hazard-prone design, though, it's generally best to keep macros for smaller algorithms and use inline functions for larger, more complicated code.
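
To put the hazard part concretely: a macro is plain text substitution, so an argument with side effects can get evaluated more than once, while a function evaluates each argument exactly once and type-checks it. A small sketch (SQUARE_MACRO, square_func, next_value, and calls are made-up names):

```c
/* Macro: text substitution, so the argument expression appears twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Function: the argument is evaluated once.  Being static and tiny, it is
 * a candidate for inlining, but the compiler makes the final call. */
static int square_func(int x)
{
    return x * x;
}

/* Counts how often it gets called, to make the difference visible. */
static int calls = 0;
static int next_value(void)
{
    calls++;
    return 3;
}
```

SQUARE_MACRO(next_value()) and square_func(next_value()) both give 9, but the macro version bumps calls twice and the function version only once.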

Thanks for helping out btw. I tend to rely on my own instincts a lot, so I don't bother to use code profilers or benchmark utilities...most of what I use is just compiler and wit. I can tell you right now though, without you having to run a profiler, that the greatest bottleneck is the render spans functions.

The complex RDP triangle commands, as well as tile and rectangle commands, end with a function call to one of several possible render spans functions. (The outdated codebase Shunyuan used had it all unified under a single render_spans function for both 16 and 32 bits.)

The second-greatest bottleneck next to that is the VI filtering. The zilmar spec function UpdateScreen calls rdp_update, and that reads the RDRAM pixels written by the RDP commands, and filters them using the VI code. By removing these filters using the new feature in my plugin, parts of games using no *complex* RDP commands now run at 200-something VI/s instead of 40. So that's a big frekkin' speed bump, more than what you'd ever gain just by optimizing the RDP for simplicity.

Still, overall the real challenge is the render spans code.

  #107  
12th April 2014, 10:01 PM
RPGMaster

Lol I guess I don't need to profile it since you already know what the bottleneck is.

What I'll do is take a look at the render spans assembly code to see if there are any issues with the compiler output.

Edit: nvm about the errors. I should have linked the d3d include files.

Last edited by RPGMaster; 12th April 2014 at 10:07 PM.

  #108  
12th April 2014, 10:08 PM
HatCat

Erm...yeah. The project files were all made with Visual Studio 2013. I recently upgraded from 2010 to 2013 for improvements, but no, you don't have to upgrade to compile it. The project files are easy to set up yourself; I only supplied them for convenience and encouragement to use the latest Microsoft compilers. You probably should upgrade it anyway though, judging by some of that redundant shit you found in the VS2010 asm output.

Either that, or by all means, use GCC/MinGW, with a VS2010 project alternative for you to test the Visual Studio debugging environment. I actually regularly use both GCC and VS to compile the source, so I alternate between them.

Quote:
unresolved external symbol "void __cdecl capture_screen_to_file(char *)" (?capture_screen_to_file@@YAXPAD@Z) referenced in function _CaptureScreen
It should be compiling bitmap.c and linking against bitmap.obj to locate that symbol.
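
Side note: the decorated name in that error, ?capture_screen_to_file@@YAXPAD@Z, is a C++-mangled symbol, which suggests the calling file was being built as C++ while bitmap.c exports a plain C name. If that turns out to be the cause, the usual fix is an extern "C" guard around the prototype. A sketch under that assumption (the stub body and the capture_calls counter are placeholders for illustration, not the plugin's screenshot code):

```c
/* Prototype as a shared header would declare it.  The guard only kicks in
 * when a C++ compiler reads it, and tells it to use the unmangled C name. */
#ifdef __cplusplus
extern "C" {
#endif

void capture_screen_to_file(char *filename);

#ifdef __cplusplus
}
#endif

/* Placeholder definition so the sketch links on its own. */
static int capture_calls = 0;

void capture_screen_to_file(char *filename)
{
    (void)filename; /* a real implementation would write the image here */
    capture_calls++;
}
```

Built as C, the guard compiles away to nothing, so it costs nothing to leave in.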

I couldn't be arsed to include bloated or official winapi headers for something so simple and uncompressed as the bitmap image format, so I just hardcoded my own BMP image compliance to official specifications by hand. Reduces dependencies. PNG screenshots I have no idea how to do; I would probably have to link against even more unnecessary dependencies.
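
For anyone curious how little it takes, here is a rough sketch of a dependency-free BMP writer. It is not the plugin's bitmap.c: it assumes 24-bit BGR pixels with the top row first, and write_bmp24 and put_le32 are hypothetical names. Checking of the individual fwrite calls is left out to keep it short.

```c
#include <stdio.h>
#include <string.h>

/* Store a 32-bit value as little-endian bytes, as the BMP format requires. */
static void put_le32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/*
 * Write a bottom-up, uncompressed 24-bit BMP.  pixels holds
 * width * height BGR triplets, top row first.
 * Returns 0 on success, -1 if the file could not be opened or closed.
 */
static int write_bmp24(const char *path, const unsigned char *pixels,
                       unsigned long width, unsigned long height)
{
    unsigned char header[54];
    static const unsigned char padding[3] = { 0, 0, 0 };
    const unsigned long row_bytes = width * 3;
    const unsigned long pad = (4 - row_bytes % 4) % 4; /* rows align to 4 bytes */
    const unsigned long image_bytes = (row_bytes + pad) * height;
    unsigned long y;
    FILE *f;

    f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    memset(header, 0, sizeof(header));      /* compression stays 0 (BI_RGB) */
    header[0] = 'B';                        /* file header: signature */
    header[1] = 'M';
    put_le32(header + 2, 54 + image_bytes); /* total file size */
    put_le32(header + 10, 54);              /* offset to the pixel data */
    put_le32(header + 14, 40);              /* info header size */
    put_le32(header + 18, width);
    put_le32(header + 22, height);          /* positive height: bottom-up rows */
    header[26] = 1;                         /* color planes */
    header[28] = 24;                        /* bits per pixel */
    put_le32(header + 34, image_bytes);     /* size of the pixel data */
    fwrite(header, 1, sizeof(header), f);

    for (y = height; y-- > 0; ) {           /* bottom row of the image goes first */
        fwrite(pixels + y * row_bytes, 1, row_bytes, f);
        fwrite(padding, 1, pad, f);
    }
    return (fclose(f) == 0) ? 0 : -1;
}
```

The whole format comes down to a 54-byte header plus padded rows, which is why skipping the winapi headers is practical here. PNG is a different story since it needs compression.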

  #109  
12th April 2014, 10:20 PM
RPGMaster

The error I had earlier had to do with me being too lazy to include the DirectX SDK directory. I'm going to be more thorough from now on.

Well I guess I should test the 2013 compiler. I don't like its UI or debugger. For some reason the newer MSVC doesn't display variable names in the disassembly. I'll probably keep both versions I guess.

Anyway I still have 1 problem with the project. I cannot compile as C, because the source code has initialized variables in the middle of a function. If I compile as C++, I have to type cast the malloc and make a few other changes. I was able to get it to compile in C++ but I'd rather not have to do that. I'm wondering how you were able to compile it in C.
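
For reference, the two C versus C++ differences in question look roughly like this. A sketch only (sum_first_n is a made-up function): keeping every declaration at the top of its block is what strict C89 compilers want, and C does not need the cast on malloc that C++ insists on.

```c
#include <stdlib.h>

static int sum_first_n(int n)
{
    /* C89: all declarations come before the first statement of the block. */
    int *values;
    int i;
    int total;

    if (n <= 0)
        return 0;

    /* No cast needed in C: void * converts implicitly.
     * C++ would require (int *)malloc(...). */
    values = malloc((size_t)n * sizeof(*values));
    if (values == NULL)
        return -1;

    total = 0;
    for (i = 0; i < n; i++) {
        values[i] = i + 1;
        total += values[i];
    }
    free(values);
    return total;
}
```

Writing "int total = 0;" after the if statement instead would be fine in C99 and C++ but is rejected by a C89-only compiler.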

Last edited by RPGMaster; 12th April 2014 at 10:23 PM.

  #110  
12th April 2014, 10:32 PM
HatCat

You're partly right. To convert from C++ to C I had to reorder a lot of variable decl's to the top of the block scopes in functions and such, and I also had to make function pointer initializers more static. However, it compiles just fine in C, using the project files I uploaded, in VS2013. Or have you not tested that out yet?

I suppose the reason it compiles in VS2013 but not VS2010 is because Microsoft later added C99 support. But yes, in ANSI C, the code is probably not going to compile. I forgot about attending to that.

Try GCC anyway; that seems to compile it just fine. I did not intend for this code to only work on C99; I will address that minor concern later.
All times are GMT.