Elysian Shadows

Posted: **Sun Aug 21, 2011 12:07 am**

Hey guys,

Just a thought-provoking question.

Normally I avoid declaring unneeded local variables, and often end up with statements such as:

Code: Select all

_whatevs.insert(a.someMethod()->GetX().ToPolar().Invert(), new ZombieObject(35, 150, std::string()));

when in reality we could write the same code in a much more readable form:

Code: Select all

auto x = a.someMethod()->GetX().ToPolar().Invert();
auto *y = new ZombieObject(35, 150, std::string());

_whatevs.insert(x, y);

I'm curious: is the first example in any way more efficient than the second?

I ask because in the first code sample all sub-expressions must be evaluated, and there must be some form of storage for each sub-expression before the final result is inserted into _whatevs. I would guess that in the first code example, each sub-expression result is stored on the stack.

In the second example we explicitly name the temporary storage (x and y), whereas in the first example we do not.

Does the second method (in the general case) have any performance difference?

Clarification: By sub-expression I mean evaluating each parameter of the insert call (i.e., a.someMethod()->GetX().ToPolar().Invert() and the ZombieObject construction).

Posted: **Sun Aug 21, 2011 8:32 am**

That's a good question. Although the second is usually more readable, I'll often use the first. The gain in readability isn't worth it to me to declare so many extra variables all the time unless readability is hurt severely by not doing so. I have no idea of the performance implications though, if any.

Posted: **Sun Aug 21, 2011 8:47 am**

I guess looking at the disassembly would show what's going on in both cases. Although I'd guess the compiler will be able to optimize the case you describe, I'm not basing that on any facts whatsoever.

Posted: **Sun Aug 21, 2011 12:43 pm**

In this particular scenario, the assembly generated by the first code segment (without the temporary variables) would most likely be SLIGHTLY faster due to two fewer copies from the two local variables to the invoked function's parameters.

HOWEVER, this is generally not the case. If you did something like:

Code: Select all

_whatevs.insert(a.someMethod()->GetX().ToPolar().Invert(), new ZombieObject(35, 150, std::string()));
_whatevs.insert(a.someMethod()->GetX().ToPolar().Invert(), new ZombieObject(35, 150, std::string()));

vs

Code: Select all

auto x = a.someMethod()->GetX().ToPolar().Invert();
auto *y = new ZombieObject(35, 150, std::string());

_whatevs.insert(x, y);
_whatevs.insert(x, y);

where you are using the temporary variables more than just once, the chances are that the second segment (with the cached variables) would be faster. You are saving yourself from having to evaluate the intermediate value of a.someMethod()->GetX().ToPolar().Invert() twice.

If you are using the cached value more than once, and it has to be evaluated (rather than just being a constant), the cached local variable will generally be faster. Especially as the Java-style nightmare of accessors grows: x = a.Java.SomeShit().OtherShit().CodeForMe().PrintSomething().HelloWorld(), your cached value is saving more and more.
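To make the "evaluated twice" point concrete, here is a minimal sketch. Everything in it is made up for illustration: expensiveChain() stands in for a.someMethod()->GetX().ToPolar().Invert(), and the global counter only exists so you can see how many times the chain actually runs. A compiler can only merge the two calls on its own if it can prove the chain has no side effects, which it often cannot.

```cpp
#include <cassert>

// Hypothetical counter, only here to observe how often the chain is evaluated.
static int g_evaluations = 0;

// Hypothetical stand-in for a.someMethod()->GetX().ToPolar().Invert().
int expensiveChain() {
    ++g_evaluations;
    return 42;
}

// Hypothetical consumer, standing in for _whatevs.insert().
int consume(int value) { return value + 1; }

// No named local: the chain is evaluated once per use.
int withoutLocal() {
    return consume(expensiveChain()) + consume(expensiveChain());
}

// Named local: the chain is evaluated once and the result is reused.
int withLocal() {
    const int x = expensiveChain();
    return consume(x) + consume(x);
}
```

Both functions return the same value; the only difference is how many times expensiveChain() runs.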

short wrote:I would guess in the first code example, each sub-expression result is stored on the stack.

They're both stored on the stack. In the first example, the values are ONLY stored in whatevs.insert()'s stack frame. In the second example, they are stored in both the caller's stack frame AND the callee's stack frame.

Posted: **Sun Aug 21, 2011 1:32 pm**

Cool, thanks for that insight Falco.

Posted: **Sun Aug 21, 2011 5:44 pm**

If you have data that is declared in a higher scope, and it remains persistent over multiple iterations of a block of code, is there a good way of guaranteeing caching of that data? Here's an example of what I mean:

Code: Select all


void update() {
    auto x = a.someMethod()->GetX().ToPolar().Invert();
    auto *y = new ZombieObject(35, 150, std::string());

    for (int i = 0; i < loopCount; ++i) {  // loopCount: some arbitrary number of loops
        doStuff(x, y);
        doMoreStuff();
        doOtherStuff();
    }
}

If there's a lot of stuff that happens in doMoreStuff() and doOtherStuff(), is there a way of ensuring your x and *y remain cached between loops? Are you at the mercy of your processor (having a large enough and intelligent enough cache) or is there something you can do as the programmer to help the data stay cached?

Posted: **Sun Aug 21, 2011 8:57 pm**

Aaaaah... Now we're talking about cache optimization, that's a whole science within itself.

Fortunately for you, that's an extremely trivial case. You can essentially consider any stack-allocated local variable "cache-optimized." That's the beauty of a "run-time" stack. As it grows, all of the data within a certain range of the current stack frame is going to be cached. So depending on the size of your cache lines, at the very least a few function calls (stack frames) back from your stack pointer are still going to be cache hits. Unless you do something like:

Code: Select all

void functionA() {
    int array[CACHE_LINE_SIZE];
    // blah blah
}

void functionB() {
    int variable;
    functionA();
}

where function B calling A pushes the stack pointer so far down that B's "variable" may no longer be cached, you will never need to worry about this (and even then, you don't have many options). A stack is an EXTREMELY good structure for exploiting "spatial locality" (where values located near a cache hit are also cached).

This is also one of the reasons that a C-style approach tends to outperform C++ with respect to the cache. The implicit "this" pointer for member functions can thrash the cache. Also, tending to group heterogeneous data into arrays of "objects" rather than having multiple homogeneous arrays tends to have poorer cache performance (especially for iteration).
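As a rough sketch of the "arrays of objects" versus "multiple homogeneous arrays" point, compare the two layouts below. The Particle and Particles types and their fields are invented for this example. When a loop only touches x, the first layout drags the other fields through the cache alongside it, while the second walks one tightly packed array.

```cpp
#include <cassert>
#include <vector>

// Array of structures: each element mixes all fields together (hypothetical type).
struct Particle {
    float x, y;
    float vx, vy;
    int health;
};

// Sums x only, but every cache line fetched also carries y, vx, vy and health.
float sumXAoS(const std::vector<Particle>& ps) {
    float total = 0.0f;
    for (const Particle& p : ps) total += p.x;
    return total;
}

// Structure of arrays: one homogeneous array per field (hypothetical type).
struct Particles {
    std::vector<float> x, y, vx, vy;
    std::vector<int> health;
};

// Sums x by walking a single contiguous array of floats.
float sumXSoA(const Particles& ps) {
    float total = 0.0f;
    for (float v : ps.x) total += v;
    return total;
}
```

Whether the second layout actually wins depends on the access pattern; if every loop touches every field, the difference may shrink or vanish, so measure before restructuring.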

Where you do need to worry about caching is when you are performing operations on data that is NOT on the stack. At a C/++ level, there are actually a variety of things that you can do:
1) use structures that tend to exploit the spatial locality of the cache well (contiguous structures, arrays)
2) use SIMD-style approaches (C data organization, not C++)
3) prefetches in loops

and many other things. I'm certainly not an "expert" in the field, but I'll show you who is: http://research.scea.com/research/pdfs/ ... 8Mar03.pdf. This is a presentation given by the developers of "God of War" for the PS2. They know how to exploit the fuck out of the cache.
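For item 3, a prefetch in a loop might look roughly like the sketch below. It assumes GCC or Clang, where __builtin_prefetch is available; on other compilers the macro compiles to nothing. The PREFETCH macro, the function name, and the look-ahead distance of 16 elements are all made up for illustration, and the right distance has to be found by measuring.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: GCC/Clang builtin; other compilers get a no-op.
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(addr) __builtin_prefetch(addr)
#else
#define PREFETCH(addr) ((void)0)
#endif

// Sums the array while hinting that an element further ahead will be needed soon.
long sumWithPrefetch(const std::vector<int>& data) {
    const std::size_t kAhead = 16;  // hypothetical look-ahead distance, tune by profiling
    long total = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (i + kAhead < data.size())
            PREFETCH(&data[i + kAhead]);  // a hint only; the result is the same without it
        total += data[i];
    }
    return total;
}
```

For a plain linear walk like this one, the hardware prefetcher usually does the job already; explicit prefetches tend to matter more for irregular access such as pointer chasing or indexed gathers.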

Posted: **Mon Aug 22, 2011 4:13 pm**

GyroVorbis wrote:Aaaaah... Now we're talking about cache optimization, that's a whole science within itself.

Question: Couldn't declaring local variables in functions and/or methods be avoided in the following ways?

Code: Select all

// Functions.
int my_int = 0;
void Pointless()
{
    my_int = 0;
    while( my_int < 5 )
        my_int++;
}

// Methods.
class My_Class
{
    int my_int;
public:
    void Pointless()
    {
        my_int = 0;
        my_int++;
    }
};

And for iterating over and/or doing whatever with an object/variable:

Code: Select all

// I'm not going to show the function version of this because you just have to remove the class parts.
class My_Class
{
    unsigned int i;
    // Note: 5 is arbitrary. Also, I'm naming them the same thing because it will work either way in the example.
    /* You can use a pointer because it takes up almost no memory and you don't have to keep making and destroying them, copying objects, etc. */
    // These are for copying objects.
    // I'm going to skip the example where you can use variables.
    std::vector<Object> Copies;
    Object Copies[5];
public:
    // THESE MUST BE HANDLED WITH CARE!!!!
    // They are public so you don't have to use a method to add an element to them.
    // Variable.
    int* Array[5];
    std::vector<int*> Array;
    // Object. Note: this could even be a base class pointer.
    Object* Array[5];
    std::vector<Object*> Array;
    void Create_Copy( Object Obj )
    {
        Copies.push_back( Obj );
        // With an array (pretend there is an integer tracking the number of objects in the array).
        Copies[track] = Obj;
    }
    void Destroy_Copy()
    {
        // TO DO: Create method body.
    }
    void Pointless()
    {
        i = 0;
        // Again, pretend int Size.
        while( i < Size && Size != 0 )
        {
            // Skipping variable example.
            Array[i]->Update();
            i++;
        }
    }
};

Forgive me if this is a bad example of what I mean.

Posted: **Mon Aug 22, 2011 4:24 pm**

...why in god's name would you ever want to do that?

Sure, you can definitely avoid using function-local variables by using "functors" (function objects), but my point was that that is far less efficient and far less cache friendly. That's a great example of when a C approach would outperform a C++-style approach.
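For comparison, here is a minimal sketch of the same counting loop written with a global counter and with a plain local. Both function names and the global g_i are hypothetical. The two return the same result, but the local version gives the compiler the freedom to keep the counter in a register, whereas the global generally has to live in memory and makes the function non-reentrant.

```cpp
#include <cassert>

int g_i = 0;  // hypothetical global counter, shared by every caller

// Counter lives outside the function: each update may have to go through memory.
int countWithGlobal(int limit) {
    g_i = 0;
    while (g_i < limit) ++g_i;
    return g_i;
}

// Counter is a stack local: private to this call and easy to keep in a register.
int countWithLocal(int limit) {
    int i = 0;
    while (i < limit) ++i;
    return i;
}
```

What the optimizer really does with either version depends on the compiler and flags, so the disassembly is the final word.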

Posted: **Mon Aug 22, 2011 4:52 pm**

GyroVorbis wrote:...why in god's name would you ever want to do that?

Sure, you can definitely avoid using function-local variables by using "functors" (function objects), but my point was that that is far less efficient and far less cache friendly.

So you are saying that my example is less efficient? If so please explain.

GyroVorbis wrote: That's a great example of when a C approach would outperform a C++-style approach.

It could also be done without a class (C-Style):

Code: Select all

unsigned int i;
// Note: 5 is arbitrary. Also, I'm naming them the same thing because it will work either way in the example.
/* You can use a pointer because it takes up almost no memory and you don't have to keep making and destroying them, copying objects, etc. */
// These are for copying objects.
// I'm going to skip the example where you can use variables.
Object Array[5];
std::vector<Object> Array;
std::vector<Object> Copies;
Object Copies[5];
Object* Ptrs[5];
std::vector<Object*> Ptrs;
void Pointless()
// If using copies:
Object Pointless()
{
    i = 0;
    // Again, pretend int Size.
    while( i < Size && Size != 0 )
    {
        // Skipping variable example.
        Array[i].Update();
        // Copies would be done the same way.
        // With pointers:
        Ptrs[i]->Update();
        i++;
    }
    // If you are using copies:
    return Copies[DesiredObj];
}

Agreed that a C-approach is much better.
Please note I would not normally make all the pointers and stuff; I would usually just do:

Code: Select all

class My_Class
{
    unsigned int i;
public:
    // Fills the first Size elements of Array with 10 and returns the same array.
    int* Pointless( int Array[5], unsigned int Size )
    {
        i = 0;
        if( Size != 0 )
        {
            while( i < Size )
            {
                Array[i] = 10;
                i++;
            }
        }
        return Array;
    }
};

Posted: **Mon Aug 22, 2011 5:00 pm**

THe Floating Brain wrote:So you are saying that my example is less efficient? If so please explain.

THe Floating Brain wrote:Agreed that a C-Approach is much better.

Seriously? Other than the fact that the segment you gave makes absolutely no sense, I see two static arrays, two vectors, and an array dereference in a loop vs merely declaring something on a stack?

Posted: **Mon Aug 22, 2011 5:15 pm**

GyroVorbis wrote:
THe Floating Brain wrote:So you are saying that my example is less efficient? If so please explain.

THe Floating Brain wrote:Agreed that a C-Approach is much better.
Seriously? Other than the fact that the segment you gave makes absolutely no sense, I see two static arrays, two vectors, and an array dereference in a loop vs merely declaring something on a stack?

I declared the same things multiple times to cover all the hypothetical scenarios.
Are you referencing:

Code: Select all

           while( i < Size && Size != 0 )
           {
               //Skipping variable example.//
               Array[i].Update();
               //Copied would be done the same way.//
               //With pointers.//
               Ptrs[i]->Update();
               i++;
            }

for the array dereference? If so, I would not be calling both Array.Update(); and Ptrs->Update();. They are both shown for hypothetical scenarios.
---EDIT---
Just realised I made the stupid mistake of declaring

Code: Select all

int Array[5];

in

Code: Select all

class My_Class

when it should be external to it.

Posted: **Mon Aug 22, 2011 5:40 pm**

Good read with that pdf! Where do you find useful documents like that?

-edit
Woah!! After implementing some cache awareness my project is running on average 45ms, down from 85!! LOL

Posted: **Tue Aug 23, 2011 9:48 am**

Rebornxeno wrote:Good read with that pdf! Where do you find useful documents like that?

I spend half of my life researching... What can I say? I'm a lifelong scholar.

Rebornxeno wrote:Woah!! After implementing some cache awareness my project is running on average 45ms, down from 85!! LOL

JESUS CHRIST. Do you mind posting what you changed? I'm sure it would help a bunch of people out, and I'm just dying of curiosity now. You must have taken the cache out on a date and shown her a good time.

Posted: **Tue Aug 23, 2011 10:01 am**

Rebornxeno wrote: -edit
Woah!! After implementing some cache awareness my project is running on average 45ms, down from 85!! LOL

yeah holy shit that's a ridiculous improvement. Sounds like you got a homerun on your date ;]

Minimizing local variable declarations
