C++ references, because pointers are overrated


I once read a quote in a C++ book: “References are the sweets of C++”. I could almost swear it belongs to Bjarne Stroustrup, but my memory is weak and I have no evidence. That quote is the motivation behind this blog post.

GOALS

  • Shed light on the beauty of C++ references
  • Suggest Do/Don’t guidelines and good practices for using references
  • Indirectly beg you not to use pointers everywhere just to look geeky

NON-GOALS

  • Another cplusplus.com tutorial on references or pointers
  • Another stackoverflow question in the regex form of: * Pointers * References *?
  • Illustrate how dumb it is to write using namespace in header files

TO KNOW

[o] References and pointers produce identical assembly for dereferencing and member function calls

In case you thought that one has a performance advantage over the other: they differ only in semantics

    int x = 10;
00D453E8  mov         dword ptr [x],0Ah  
    int& xRef = x;
00D453EF  lea         eax,[x]  
00D453F2  mov         dword ptr [xRef],eax  
    int* xPtr = &x;
00D453F5  lea         eax,[x]  
00D453F8  mov         dword ptr [xPtr],eax  

    xRef = 20;
00D453FB  mov         eax,dword ptr [xRef]  
00D453FE  mov         dword ptr [eax],14h  
    *xPtr = 30;
00D45404  mov         eax,dword ptr [xPtr]  
00D45407  mov         dword ptr [eax],1Eh  
    Foo foo;
    Foo& fooRef = foo;
009E50BD  lea         eax,[foo]  
009E50C0  mov         dword ptr [fooRef],eax  
    Foo* fooPtr = &foo;
009E50C3  lea         eax,[foo]  
009E50C6  mov         dword ptr [fooPtr],eax  

    fooRef.Bar();
009E50C9  mov         ecx,dword ptr [fooRef]  
009E50CC  call        Foo::Bar (09E1429h)  
    fooPtr->Bar();
009E50D1  mov         ecx,dword ptr [fooPtr]  
009E50D4  call        Foo::Bar (09E1429h)  
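
To make the "they differ only in semantics" point concrete, here is a minimal sketch of the two semantic differences that matter most in practice: a reference must be bound when it is declared and can never be rebound, while a pointer can be re-pointed at will. The function name ReferenceVsPointerDemo is made up for this illustration.

```cpp
// Hypothetical demo function (not from the original post).
// Returns the final value of x.
int ReferenceVsPointerDemo()
{
    int x = 10;
    int y = 99;

    int& xRef = x;   // must be initialized; stays bound to x for its whole lifetime
    int* xPtr = &x;  // could have been left null, and can be re-pointed later

    xRef = 20;       // writes 20 into x
    *xPtr = 30;      // writes 30 into x

    xRef = y;        // does NOT rebind xRef: copies the value of y into x
    xPtr = &y;       // rebinds the pointer: xPtr now points at y, x is untouched
    *xPtr = 5;       // writes 5 into y

    return x;        // x still holds the value copied from y
}
```

Calling ReferenceVsPointerDemo() returns 99: the assignment through xRef copied y into x, and the later write through the re-pointed xPtr only touched y.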

[o] Pointer inheritance/polymorphism rules apply

class A { public: virtual void Foo() { /* Do A Stuff */ } };
class B : public A { public: void Foo() override { A::Foo(); /* Do B Stuff */ } };
B b;
A& a = b; // Reference acts like a pointer
a.Foo(); // Foo is virtual, so B::Foo is called, which in turn calls A::Foo
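
One consequence worth spelling out is that polymorphism survives a pass by reference but not a pass by value. The sketch below uses made-up types (Shape, Circle) and helper functions to show the difference; it is an illustration, not code from any particular codebase.

```cpp
#include <string>

// Hypothetical types for illustration only.
struct Shape
{
    virtual ~Shape() = default;
    virtual std::string Name() const { return "Shape"; }
};

struct Circle : Shape
{
    std::string Name() const override { return "Circle"; }
};

// By reference: the dynamic type is preserved, the override is called.
std::string NameByRef(const Shape& s) { return s.Name(); }

// By value: the argument is sliced down to a plain Shape copy.
std::string NameByValue(Shape s) { return s.Name(); }
```

Given a Circle c, NameByRef(c) yields "Circle" while NameByValue(c) yields "Shape", because the by-value parameter is a freshly copied Shape that no longer knows it came from a Circle.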

DO

[v] Pass Out and In/Out parameters by reference and In parameters by const reference

A pass by non-const reference is a contract between the caller and callee that a parameter is either write-only (Out) or read/write (In/Out). To distinguish between the two in Visual Studio, use the SAL annotations _Out_ and _Inout_. More on SAL in the Microsoft documentation. On the other hand, a pass by const reference is a contract between the caller and callee that a parameter is read-only (In); you don’t need SAL to specify that, it is implicit.
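
As a rough sketch of that contract in portable C++ (no SAL, so it compiles outside Visual Studio too), the hypothetical function below takes one In, one Out, and one In/Out parameter. The name FindLongest and its parameters are invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical function showing the three parameter roles.
bool FindLongest(const std::vector<std::string>& names,  // In: read-only
                 std::string& longest,                   // Out: overwritten on success
                 std::size_t& totalChars)                // In/Out: read and updated
{
    if (names.empty())
        return false;  // Out and In/Out parameters are left untouched

    longest = names.front();
    for (const std::string& name : names)
    {
        totalChars += name.size();          // accumulates on top of the caller's value
        if (name.size() > longest.size())
            longest = name;
    }
    return true;
}
```

The signature alone tells the caller which arguments may change: anything passed as const& is safe, anything passed as a plain & should be expected to be written to.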

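[v] Hold STL container indexing/accessor results by reference

In C++11, auto& works for the element accessors of most STL containers (std::vector<bool> is a notable exception, because its operator[] returns a proxy object rather than a real reference). Use plain auto only when you intentionally need to work with a copy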

// Assume: vector< vector<Tile> > grid;
// This is a bad design for a matrix grid
for (size_t y = 0; y < grid.size(); ++y)
{
    for (size_t x = 0; x < grid[y].size(); ++x)
    {
        Tile& tile0 = grid[y][x];
        // Or the C++11 way
        auto& tile1 = grid[y][x];
        // This is a tile copy
        auto tile2 = grid[y][x];

        /*
        A chunk of code operating on tile
        No need to spam writing grid[y][x] everywhere,
        indexing is cheap for vector, but a lookup may not be for map or set
        */
    }
}

[v] Use auto& to avoid unintended value copy (C++11)

auto is a bit tricky: you might think it is smart enough to deduce a reference when the initializer is one, but it isn’t. Plain auto drops the reference (and any top-level const) and gives you a copy

One example is the std::map indexing operator []: its return type is a reference to the mapped value

// Assuming an empty std::map<string, Member> students
// s is a value copy of Member
auto s = students["KokoElDa3ef"];
// Change the s copy, not the original member record
s.email = "koko@wawa.com";
// Outputs nothing
cout << students["KokoElDa3ef"].email;

// sRef is a reference to a Member
auto& sRef = students["KokoElDa3ef"];
// Change the original member record
sRef.email = "koko@wawa.com";
// Outputs koko@wawa.com
cout << students["KokoElDa3ef"].email;

// Assuming: std::map<int, Actor> sceneActors;

// actor is a reference to the stored std::pair<const int, Actor>
for (auto& actor : sceneActors)
{ /* something goes here */ }

// actor is a copy of the stored pair, so the Actor inside it is
// copy-constructed from the one stored in the sceneActors map
for (auto actor : sceneActors)
{ /* something goes here */ }
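
The snippets above are fragments, so here is a self-contained sketch of the same idea that can be compiled and checked. Member is a stand-in struct with a single email field, and the three helper functions are invented for this illustration.

```cpp
#include <map>
#include <string>

// Hypothetical record type standing in for the Member used above.
struct Member
{
    std::string email;
};

// Edits a copy: the record stored in the map is left untouched.
std::string EditThroughCopy(std::map<std::string, Member>& students)
{
    auto s = students["KokoElDa3ef"];   // deduced as Member (a copy)
    s.email = "koko@wawa.com";
    return students["KokoElDa3ef"].email;
}

// Edits through a reference: the stored record changes.
std::string EditThroughReference(std::map<std::string, Member>& students)
{
    auto& s = students["KokoElDa3ef"];  // deduced as Member&
    s.email = "koko@wawa.com";
    return students["KokoElDa3ef"].email;
}

// Range-for elements of a map are std::pair<const Key, Value>;
// auto& lets the loop edit the stored values in place.
void ClearAllEmails(std::map<std::string, Member>& students)
{
    for (auto& entry : students)
        entry.second.email.clear();
}
```

Starting from an empty map, EditThroughCopy returns an empty string, while EditThroughReference returns the new email because the write went to the stored record.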

[v] Use pointer parameters when null is expected and legal

void Foo(Bar* barPtr)
{
	// This is dangerous: it assumes that you will always get
	// a non-null barPtr. Be defensive with if (nullptr == barPtr) return;
	barPtr->bar();
}

void Foo(_In_opt_ Bar* barPtr)
{
	// _In_opt_ says that the pointer can be null.
	// It is still unsafe to use it without validation, but
	// at least the contract is clearer with SAL.
	// Use _In_ if you have a guarantee that barPtr
	// will always be valid (the caller guarantees that).
	// You can't always trust the caller; that's why
	// a defensive if/return is a must, especially if you are
	// writing an API that will be used by 3rd party code.
	// Use a reference to make it clear without SAL
}

void Foo(Bar& barRef)
{
	// You can access the bar instance safely.
	// The reference contract says that it won't be null.
	// If bar was a pointer, you would guard yourself with
	// if (barPtr == nullptr) return;
	// Using SAL with references is still good and makes
	// your code ultimately clear
}
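
Put side by side without SAL, the two contracts might look like the sketch below. Bar here is a made-up struct with a name field, and both function names are invented for this example.

```cpp
#include <string>

// Hypothetical type for illustration only.
struct Bar
{
    std::string name;
};

// Pointer parameter: null is a legal way of saying "there is no Bar",
// so the function checks for it and handles it.
std::string DescribeOptional(const Bar* barPtr)
{
    if (barPtr == nullptr)
        return "<none>";
    return barPtr->name;
}

// Reference parameter: the signature itself says a Bar is required,
// so there is nothing to validate.
std::string DescribeRequired(const Bar& barRef)
{
    return barRef.name;
}
```

A caller can write DescribeOptional(nullptr) on purpose, while DescribeRequired cannot even be called without an object in hand.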

[v] Use reference to an object for a cleaner operator call syntax

// Assume a pointer to a 2D dynamic matrix: vector< vector<int> >* matrixPtr
// and two for loops i,j operating on m(i,j) using the indexing operator
int cell = matrixPtr->operator[](i).operator[](j);

// Assume a reference to a 2D dynamic matrix: vector< vector<int> >& matrixRef
// and two for loops i,j operating on m(i,j) using the indexing operator
int cell = matrixRef[i][j];

[v] Capture by reference [&var] () {} for lambda closures when a reference makes sense (C++11)

Unless you really want your closure to hold a copy of one of its non-basic-type captures, capture by reference, especially if the captures will be used as read-only. Just make sure the closure does not outlive the variables it captures by reference

string str = "Hello World";
auto displayStr = [&str] () { /* something goes here with str */ };
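
A small sketch of both capture modes, using invented function names: the first function captures a string by reference to avoid copying it and a counter by reference so the lambda can update it; the second shows that a by-value capture is a snapshot taken when the lambda is created.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts the lines that contain needle.
std::size_t CountContaining(const std::vector<std::string>& lines,
                            const std::string& needle)
{
    std::size_t hits = 0;
    std::for_each(lines.begin(), lines.end(),
        // needle: captured by reference to avoid a string copy (read-only use)
        // hits:   captured by reference so the lambda can update the caller's counter
        [&needle, &hits](const std::string& line)
        {
            if (line.find(needle) != std::string::npos)
                ++hits;
        });
    return hits;
}

// Hypothetical helper: contrasts a by-value snapshot with a by-reference view.
int SnapshotVersusView()
{
    int n = 1;
    auto byValue = [n]  { return n; };  // copies n now
    auto byRef   = [&n] { return n; };  // reads n when called
    n = 2;
    return byValue() * 10 + byRef();    // the snapshot still sees 1, the view sees 2
}
```

Both lambdas here are used while the captured variables are still alive, which is the condition that makes capture by reference safe.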

[v] Return read-only class properties by reference/const reference to specify object ownership

I consider this point the most important of them all. Unless null is allowed as a class property return value, return the property value by reference/const reference. Doing so specifies that the object owns what the property returns in terms of memory management, AND that the returned value (whether it is another object or a simple type) lives exactly as long as the owning object and dies when the owning object dies. This practice will make your native programs less prone to memory leaks and make your life easier. Examples will make the concept clearer

// Assuming that Mat4x4 is a 16 float sized object
class GameObject
{
public:
	// Bad: copies 16 floats on every call, const ref is the way to go
	Mat4x4 Transform1() const { return m_transform; }

	// Bad for read/write transform (Can I get a nullptr matrix? Can I delete it?)
	Mat4x4* Transform2() { return &m_transform; }

	// Bad for read-only transform (Can I get a nullptr matrix?)
	const Mat4x4* Transform3()  const { return &m_transform; }

	// Good for read/write transform
	Mat4x4& Transform4() { return m_transform; }

	// Good for read-only transform
	const Mat4x4& Transform5() const { return m_transform; }
	

	// Can I get the RigidBody pointer and delete it and let you crash and burn?
	PhysicsRigidBody* RigidBody1() { return m_pRigidBody; }

	// You can read the RigidBody but not modify it, yet the pointer still
	// leaves open whether it can be nullptr and who is allowed to delete it
	const PhysicsRigidBody* RigidBody2() const { return m_pRigidBody; }

	// Bad: Will you crash if m_pRigidBody is nullptr?
	// Can I hold a reference onto it for a while? Will it stay valid?
	const PhysicsRigidBody& RigidBody3() const { return *m_pRigidBody; }

	// Kinda Good: You can get around the nullptr problem by using
	// the Null Object design pattern
	const PhysicsRigidBody& RigidBody4() const
	{
		if (nullptr == m_pRigidBody)
			return PhysicsRigidBody::Null();
		return *m_pRigidBody;
	}
	
private:
	Mat4x4 m_transform;
	PhysicsRigidBody *m_pRigidBody;
};
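
The listing above calls PhysicsRigidBody::Null() without showing it. One possible way to provide it is sketched below, under the assumption that a rigid body can be reduced to a single mass value for the sake of the example. Mass(), IsNull() and AttachRigidBody() are invented here and are not part of the original class.

```cpp
// Minimal sketch of the Null Object pattern; all members are hypothetical.
class PhysicsRigidBody
{
public:
    explicit PhysicsRigidBody(float mass) : m_mass(mass) {}

    float Mass() const { return m_mass; }

    // True only for the shared placeholder instance.
    bool IsNull() const { return this == &Null(); }

    // One shared, immutable placeholder that is always safe to read.
    static const PhysicsRigidBody& Null()
    {
        static const PhysicsRigidBody nullBody(0.0f);
        return nullBody;
    }

private:
    float m_mass;
};

class GameObject
{
public:
    // Non-owning: the caller keeps the body alive (assumption of this sketch).
    void AttachRigidBody(PhysicsRigidBody* body) { m_pRigidBody = body; }

    // Never dereferences nullptr: falls back to the null object instead.
    const PhysicsRigidBody& RigidBody() const
    {
        if (nullptr == m_pRigidBody)
            return PhysicsRigidBody::Null();
        return *m_pRigidBody;
    }

private:
    PhysicsRigidBody* m_pRigidBody = nullptr;
};
```

Callers that care can ask RigidBody().IsNull(); callers that don't can read Mass() and get a harmless zero instead of a crash.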

DON’T

[x] Use const reference for In parameters of basic types

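Most basic types have a size <= the size of a pointer (i.e. sizeof(void*) or sizeof(size_t)), e.g. char, short, int, unsigned char, unsigned short, etc. Passing them by value is at least as cheap as passing a reference to them

[x] Pass read-only basic type parameters by reference

It only makes sense to pass a read-only parameter by const reference if its type size is > pointer size. Why? Observe the assembly and judge: reading a by-value parameter takes a single load, while reading through a pointer or a reference needs an extra load to fetch the address first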

void Foo(char c)
{
// Omitted assembly goes here
    char a = c;
002E394E  mov         al,byte ptr [c]  
002E3951  mov         byte ptr [a],al   
};

void Tar(const char*c)
{
// Omitted assembly goes here
    char a = *c;
002E3A0E  mov         eax,dword ptr [c]  
002E3A11  mov         cl,byte ptr [eax]  
002E3A13  mov         byte ptr [a],cl 
};

void Bar(const char& c) 
{ 
// Omitted assembly goes here
    char a = c;
002E2C3E  mov         eax,dword ptr [c]  
002E2C41  mov         cl,byte ptr [eax]  
002E2C43  mov         byte ptr [a],cl  
};
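
The size rule can be written down as a compile-time check. The trait below, PreferPassByValue, is a made-up helper for this post (it is not a standard library trait), and the extra trivially-copyable condition is my own assumption to keep types with expensive copy constructors out. Mat4x4 is modelled as 16 floats, as described earlier.

```cpp
#include <type_traits>

// Hypothetical trait: true when a read-only parameter of type T is
// small and trivially copyable, so passing it by value is reasonable.
template <typename T>
struct PreferPassByValue
    : std::integral_constant<bool,
          std::is_trivially_copyable<T>::value && sizeof(T) <= sizeof(void*)>
{
};

// Stand-in for the 16 float matrix mentioned earlier in the post.
struct Mat4x4
{
    float m[16];
};

static_assert(PreferPassByValue<char>::value, "char fits in a pointer: pass by value");
static_assert(PreferPassByValue<short>::value, "short fits in a pointer: pass by value");
static_assert(!PreferPassByValue<Mat4x4>::value, "Mat4x4 is 64 bytes: pass by const reference");
```

This is only a rule of thumb encoded as a trait; the exact threshold that pays off depends on the compiler and calling convention.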

CONCLUSION

  • Consider passing by reference as your default choice unless:
    • A NULL object/value is a valid case, for which you pass by pointer
    • The parameter is read-only and sizeof(Type) <= sizeof(void*), for which you pass by value instead of by const reference. You can safely pass most basic types by value without worrying about a performance hit.
    • A parameter is meant to be copied, for which you pass by value
  • Consider using SAL annotations to help clarify your contracts even if you are not going to run Visual Studio static code analysis

*Please notify me if you find any mistakes. Your feedback is very welcome.
