Sacred Cows: Reference types and value types

One of the standard interview questions we ask is "What is the difference between reference types and value types?" The usual answer we get is "Reference types are allocated on the heap and value types are allocated on the stack". That is fairly common currency among .NET developers and in articles on the topic. When I then respond "Is that always true?", I usually get some puzzled looks and maybe a stab at boxing. But what I am really searching for is an understanding that the common wisdom is, at best, only part of the truth.

Definitions

Let us get some definitions out of the way:

The Stack: Also called the call stack, it is an area of memory reserved for information about the currently executing calls. The main job of the stack is to store the return address, which we pop off the stack when we return (called unwinding the stack). You can see the stack in the Visual Studio IDE via the Call Stack window. Local storage can also live on the stack, which is much faster than allocating memory from the heap. Variables declared on the stack are destroyed when the method returns, so they don’t need to be garbage collected (see also stackalloc).

The Heap: This is a block of memory used for dynamic memory allocation. It is more expensive to allocate here and, as we have to free the memory we are using, variables declared on the heap need to be garbage collected.

Reference Type: A pointer to the location of an object on the heap. Note the use of indirection: this is not the object itself. Hence the name reference type, a reference (pointer) to an object of that type.

Value type: An object in memory, accessed without indirection, i.e. the thing itself.

Note that parameter passing is always by value unless you declare it by reference (ref or out). That means we copy: for a value type we copy the value itself, and for a reference type we copy the reference (the pointer). That is why changes made through a reference-type parameter are visible to the caller when the method returns, while changes to a value-type parameter are not, unless you pass it by reference. It is also why assigning a new object to a reference-type parameter is not reflected when you exit the method: you have only changed the copy of the pointer, unless you pass by reference.
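The four cases described above can be sketched in a few lines of C#. This is only an illustration; the type and method names (PointValue, PointRef, ParameterPassingDemo and its methods) are made up for the example.

```csharp
using System;

struct PointValue { public int X; }   // value type (hypothetical example type)
class PointRef { public int X; }      // reference type (hypothetical example type)

static class ParameterPassingDemo
{
    // Receives a copy of the struct; the caller's value is untouched.
    public static void MutateValue(PointValue p) { p.X = 99; }

    // Receives a copy of the reference; both copies point at the same object.
    public static void MutateObject(PointRef p) { p.X = 99; }

    // Reassigns only the method's own copy of the reference.
    public static void Reassign(PointRef p) { p = new PointRef { X = 42 }; }

    // With ref, the caller's variable itself is reassigned.
    public static void ReassignByRef(ref PointRef p) { p = new PointRef { X = 42 }; }

    public static void Main()
    {
        var v = new PointValue { X = 1 };
        MutateValue(v);
        Console.WriteLine(v.X);   // 1: the method changed a copy

        var r = new PointRef { X = 1 };
        MutateObject(r);
        Console.WriteLine(r.X);   // 99: the change was made through the copied pointer

        Reassign(r);
        Console.WriteLine(r.X);   // 99: only the copy of the pointer was redirected

        ReassignByRef(ref r);
        Console.WriteLine(r.X);   // 42: the caller's pointer was redirected
    }
}
```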

Reference Types, Indirection, and Value Types 

Still here? Good, then let us talk about where value types and reference types are stored. First the easy part:

The object pointed to by a reference type always goes on the heap. That is easy enough, and it is what we usually mean by "reference types are allocated on the heap". Note that we mean the memory that we point to, not the pointer itself. The lifetime of the pointer and the lifetime of the object on the heap are different.

Now it gets a little harder.

Local value types (value types declared within a method body) are allocated on the stack. Their lifetime is the life of the method, so we can allocate them on the stack (because it is okay to destroy them when the method returns). Reference types that are declared within a method body are allocated on the stack too! But only the pointer to the object, not the object itself. The object it points to is always on the heap, and may become available for garbage collection if the last pointer to it has just vanished. The pointer itself is not garbage collected; it lives on the stack if the declaration is within a method body.

Value types and pointers declared as members of a reference type are allocated on the heap along with the reference type that they are part of. The lifetime of the parts of the object is the same as the lifetime of the object. If we could allocate them on the stack, then parts of our object would be de-allocated when the method returned, while other parts would remain valid on the heap. If we had another extant pointer to that object (say an argument passed by the calling method), then attempting to access those contained value types and reference types would access memory that had been de-allocated. Bang! So value types and pointers declared as members of a reference type are allocated on the heap along with the reference type that they are part of.
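As a rough sketch of the lifetime argument above: a value type that is a member of a class must outlive the method that created the object, because the caller can still reach it afterwards. The types Counter and Holder and the class MemberLifetimeDemo are hypothetical names for this example, and the comments describe the storage model discussed in this post rather than anything the code itself can prove.

```csharp
using System;

struct Counter { public int Count; }   // value type (hypothetical example type)

class Holder                            // reference type (hypothetical example type)
{
    public Counter Stats;               // value type member: stored as part of the Holder object
    public string Name = "";            // pointer member: stored as part of the Holder object
}

static class MemberLifetimeDemo
{
    public static Holder Create()
    {
        int local = 5;                  // local value type: only needed for the life of this call
        var h = new Holder { Name = "demo" };
        h.Stats.Count = local;          // the value is copied into the object on the heap
        return h;                       // the locals go away on return; the object does not
    }

    public static void Main()
    {
        Holder h = Create();
        // The members are still valid after Create has returned.
        Console.WriteLine(h.Stats.Count);   // 5
        Console.WriteLine(h.Name);          // demo
    }
}
```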

Boxing

Boxing is used to refer to a value type via a pointer. When a variable is boxed, memory for it is allocated on the heap and the state of the value type is copied into that memory. We then have a pointer to that heap-allocated memory as a result of the boxing operation. That is what it means to convert from a value type to a reference type. The heap memory we point to is garbage collected, so when the last pointer to the boxed value type goes, it will be available for collection.

Note that this is why changing the state of a boxed value doesn’t change the state of the original value type, and vice versa. It’s a copy. It is a bit disingenuous to say that in this case "value types are on the heap", as some candidates do. What we have done instead is to create a copy of the value type on the heap, which our reference type points to.
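The copy made by boxing can be seen in a minimal sketch like the following. BoxingDemo and its method names are made up for the example.

```csharp
using System;

static class BoxingDemo
{
    // Box a value, then change the original variable.
    public static int BoxThenChangeOriginal()
    {
        int original = 1;
        object boxed = original;   // boxing: the value is copied into a new object on the heap
        original = 2;              // changes only the local variable, not the boxed copy
        return (int)boxed;         // unboxing copies the value back out
    }

    // Unbox into a local, then change that local.
    public static int UnboxThenChangeCopy()
    {
        object boxed = 10;         // boxes the value 10
        int copy = (int)boxed;     // unboxing gives us yet another copy
        copy = 20;                 // the boxed value on the heap is untouched
        return (int)boxed;
    }

    public static void Main()
    {
        Console.WriteLine(BoxThenChangeOriginal());   // 1
        Console.WriteLine(UnboxThenChangeCopy());     // 10
    }
}
```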

What we are looking for

So the reason for asking this question is to flush out whether the developer knows anything about the difference between the stack and the heap, and about indirection (pointers). It’s one way of differentiating skill levels.
