## Monday, April 2, 2012

### Limits of Named Return Value Optimization

I decided to test how well MSVC 10 can perform return value optimization.

Here's the code for my test:

#include <iostream>

struct A {
    A() { std::cout << "A() Constructor\n"; }
    A(int) { std::cout << "A(int) Constructor\n"; }
    A(const A&) { std::cout << "RVO Failed (copy)\n"; }
    A(A&&) { std::cout << "RVO Failed (move)\n"; }
    // Assignment operators must return *this; falling off the end is undefined behavior.
    A& operator=(const A&) { std::cout << "RVO Failed (copy assignment)\n"; return *this; }
    A& operator=(A&&) { std::cout << "RVO Failed (move assignment)\n"; return *this; }
    A operator+(const A&) const { return A(0); }
};

A rvo(int i)
{
    if (i < 0) {
        return A();
    } else {
        return A(i);
    }
}

A rvo2(int i)
{
    if (i < 0) {
        return A();
    } else {
        return A()+A();
    }
}

A nrvo(int i)
{
    A a;
    if (i < 0) {
        return a;
    } else {
        return a;
    }
}

A nrvo2(int i)
{
    A a;
    if (i < 0) {
        return a;
    } else {
        A b(i);
        return b;
    }
}

A hybrid(int i)
{
    A a;
    if (i < 0) {
        return a;
    } else {
        return A(i);
    }
}

int main()
{
    std::cout << "rvo(-1):\n";
    rvo(-1);
    std::cout << "\n";

    std::cout << "rvo(1):\n";
    rvo(1);
    std::cout << "\n";

    std::cout << "rvo2(-1):\n";
    rvo2(-1);
    std::cout << "\n";

    std::cout << "rvo2(1):\n";
    rvo2(1);
    std::cout << "\n";

    std::cout << "nrvo(-1):\n";
    nrvo(-1);
    std::cout << "\n";

    std::cout << "nrvo(1):\n";
    nrvo(1);
    std::cout << "\n";

    std::cout << "nrvo2(-1):\n";
    nrvo2(-1);
    std::cout << "\n";

    std::cout << "nrvo2(1):\n";
    nrvo2(1);
    std::cout << "\n";

    std::cout << "hybrid(-1):\n";
    hybrid(-1);
    std::cout << "\n";

    std::cout << "hybrid(1):\n";
    hybrid(1);
    std::cout << "\n";
}

The function rvo tests the simplest application of return value optimization.
rvo2 does the same, but builds the returned temporary from a more complicated expression in one path.
nrvo tests named return value optimization in the simplest case.
nrvo2 returns two different named objects, depending on the execution path.
hybrid attempts NRVO in one path and RVO in the other.

If RVO/NRVO is applied, no move or copy constructor should be invoked. Since MSVC 10 supports move semantics, a failed optimization should fall back to a move rather than a copy, so we don't expect to see any copy operations at all.

Here's the result:

rvo(-1):
A() Constructor

rvo(1):
A(int) Constructor

rvo2(-1):
A() Constructor

rvo2(1):
A() Constructor
A() Constructor
A(int) Constructor

nrvo(-1):
A() Constructor

nrvo(1):
A() Constructor

nrvo2(-1):
A() Constructor
RVO Failed (move)

nrvo2(1):
A() Constructor
A(int) Constructor
RVO Failed (move)

hybrid(-1):
A() Constructor
RVO Failed (move)

hybrid(1):
A() Constructor
A(int) Constructor

The results are clear:

RVO is performed successfully whenever the unnamed object is created in the return statement. The object can also be the result of a more complicated expression (as in rvo2); the only requirement seems to be that it is an rvalue.

NRVO succeeds when you declare the object to be returned and return that object in all paths.
If you return different objects in different paths, no NRVO is performed for any path (this is not too surprising if you know how NRVO is implemented).
The hybrid test shows that these two conclusions can be combined: if a function performs RVO in one path and attempts NRVO in another, then by the rule above no NRVO is performed, but RVO still is. This is precisely the result you see in the hybrid test.
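To make the remark about how NRVO is implemented a bit more concrete, here is a hand-written sketch of the usual strategy: the caller reserves storage for the result and passes its address as a hidden argument, and under NRVO the named local is constructed directly in that storage. The names Tracer, single_local_into, two_locals_into and demo are made up for this illustration, and a real compiler does this internally rather than through placement new.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical stand-in for A that counts moves instead of printing.
struct Tracer {
    static int moves;
    int value;
    Tracer(int v = 0) : value(v) {}
    Tracer(Tracer&& other) : value(other.value) { ++moves; }
};
int Tracer::moves = 0;

// Roughly what "Tracer f(int i) { Tracer a(i); ...; return a; }" becomes
// under NRVO: the caller passes the address of the return slot, and the
// named local is constructed directly in it.
Tracer* single_local_into(void* slot, int i) {
    Tracer* a = new (slot) Tracer(i);  // "Tracer a(i);" lives in the caller's storage
    a->value += 1;                     // the function works on it as usual
    return a;                          // "return a;" has nothing left to copy or move
}

// Shaped like nrvo2: a and b can be alive at the same time, so they cannot
// both occupy the one slot. Here neither does, and the returned object is
// moved into the slot instead.
Tracer* two_locals_into(void* slot, int i) {
    Tracer a;
    if (i < 0) {
        return new (slot) Tracer(std::move(a));
    }
    Tracer b(i);
    return new (slot) Tracer(std::move(b));
}

// Usage: the caller owns the storage for the result.
void demo() {
    alignas(Tracer) unsigned char slot[sizeof(Tracer)];

    Tracer::moves = 0;
    Tracer* r1 = single_local_into(slot, 41);
    assert(r1->value == 42 && Tracer::moves == 0);  // built in place
    r1->~Tracer();

    Tracer* r2 = two_locals_into(slot, 7);
    assert(r2->value == 7 && Tracer::moves == 1);   // one move into the slot
    r2->~Tracer();
}
```

A compiler could in principle still pick one of the two locals for the slot, but the MSVC 10 output above suggests it simply gives up on NRVO for the whole function.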

It should be noted, however, that move operations are usually very fast, as they only copy and zero out a couple of pointer values (but not always; see my previous post about small string optimization and move operations).
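As a minimal sketch of why a move is usually cheap, consider a heap-backed class whose move constructor just takes over the source's pointer. Buffer and buffer_demo are hypothetical names for this illustration, and the code uses C++11 features (noexcept, = delete) that MSVC 10 itself does not support.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical heap-backed buffer, for illustration only.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}
    ~Buffer() { delete[] data_; }

    // Move constructor: copy the pointer and the size, then zero out the
    // source. No elements are copied and nothing is allocated.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    // Copying is disabled to keep the sketch short.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    const int* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};

// Usage: after the move, the new object owns the original allocation.
void buffer_demo() {
    Buffer a(1000);
    const int* original = a.data();
    Buffer b(std::move(a));
    assert(b.data() == original);   // same allocation, just a new owner
    assert(a.data() == nullptr);    // source was zeroed out
}
```

The cost of the move does not depend on the number of elements, which is why a failed NRVO that falls back to a move is often not worth worrying about.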

Conclusion:

When you can, construct the rvalue object directly in the return statement (this only works for simple cases). In these tests, RVO kicked in every time.

If you need to construct an object differently based on different paths, declare it first, then let the different paths do whatever they need, and return that same named object. NRVO will kick in as long as you make sure to always return the same named object. In this case, do not mix in RVO by returning an unnamed object from another path, as that disables NRVO for the whole function.
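Here is a small sketch of that pattern: one named object declared up front, adjusted per path, and returned from every path. Counted, make and counted_demo are hypothetical names, and the counters replace the printing used in the test above. NRVO is permitted but not required by the C++ standard, so whether the move count stays at zero depends on the compiler and its flags; what the language does ensure here is that a move, not a copy, is the fallback.

```cpp
#include <cassert>

// Hypothetical counterpart of A that counts copies and moves.
struct Counted {
    static int copies;
    static int moves;
    int value;
    Counted() : value(0) {}
    Counted(const Counted& o) : value(o.value) { ++copies; }
    Counted(Counted&& o) : value(o.value) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// One named object, declared first, modified per path, returned everywhere.
Counted make(int i) {
    Counted result;
    if (i < 0) {
        result.value = -1;
        return result;
    }
    result.value = i * 2;
    return result;
}

// Usage: the value arrives intact and no copy constructor runs.
void counted_demo() {
    Counted c = make(21);
    assert(c.value == 42);
    assert(Counted::copies == 0);  // worst case is a move, never a copy
}
```

A compiler that applies NRVO here leaves Counted::moves at zero as well; printing the counter after a call is an easy way to check what your own toolchain does.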

Put more compactly: use all RVO or all NRVO. Don't combine. Or don't bother too much, as move operations are usually fast anyway.

1. Does this work?

A nrvo3(int i)
{
    if (i < 0) {
        A a;
        return a;
    } else {
        A a(i);
        return a;
    }
}

1. No, it will not invoke NRVO. Using the same name in two different scopes does not alter the fact that they are two separate objects. Either use RVO, or modify and return the same object:

A a;
if (i < 0) {
    return a; // nrvo
} else {
    a = 1; // assuming such an assignment has been defined
    return a; // nrvo
}