The Weird and Wonderful World of Garbage Collection

Fundamentals

Each process has its separate address space. The CLR allocates a segment in memory to store and manage objects. This address space is called managed memory heap as opposed to a native heap. All threads in the process allocate objects on the managed object heap

How Garbage Collection Works

The garbage collection is an automatic process of reclaiming memory from the objects that are no longer in use. It provides several benefits:
· Enables you to develop without having to explicitly fee memory, using ‘delete’ and thus eliminates the potential bugs associated with this manual process whereby a developer deletes and objects still in use or forgets to do so leading to memory leaks

· Significantly improves the allocation performance. In order to allocate an object in CLR world, all the framework has to do is to advance the next object pointer, relying on the fact that the memory is compacted

The garbage collection occurs when the system becomes low on memory; the size of the managed heap surpasses the acceptable threshold or ‘GC.Collect()‘ function is called, triggering the collection.

Generations

The managed heap is further segregated into a large object heap LOH and a small object heap. The small object heap is split into 3 generations: Gen0, Gen1 and Gen2 however this depended on the platform, for example if you’re developing using Xamarin for Android you only have 2 generations.

The generations dictate how often the garbage collection is performed.

Generation 0

This is the youngest generation almost all objects are initially allocated in this generation, unless they are larger than 85,000 bytes in which case they are allocated in LOH.

Most objects are reclaimed by garbage collection in generation 0 however the ones that do survive move on to generation 1.

Generation 1

This generation is a buffer between short lived objects and long-lived objects

Generation 2

This generation contains long lived objects. GC tends to collect objects in this area of memory space quite infrequently.

What Happens During Garbage Collection

A garbage collection has the following steps:
· Marking phase – traverses the graph of all objects and marks them as alive. Each object (‘class’) has a special flag that enables this. Structs in turn do not have this field since they don’t live on the heap

· Relocation of the references to the objects that will be compacted. It should be noted that the GC works around the pinned objects. Pinned objects are usually the ones following by the fixed(…) statement. This pins the object in memory thus allowing to pass the pointer to that object to unmanaged code and at the same time guarantee that the pointer wouldn’t change. This is critical to the correct operation, it also hinders the performance of the garbage collector, hence the use of pinned objects should be minimal.

· Compacting phase that reclaims the space occupied by the dead objects. It moves all objects to the beginning of the memory segment and makes next object pointer, point to the end of the segment.

Originally the LOH was never compacted, which lead to fragmentation and excessive memory usage; however since version of 4.5.1 CLR provides the ability to defragment the LOH by setting ‘GCSettings.LargeObjectHeapCompactionMode

The garbage collector uses the following information to determine whether objects are live:
· Stack roots. Stack variables provided by the just-in-time (JIT) compiler and stack walker.

· Garbage collection handles. Handles that point to managed objects and that can be allocated by user code or by the common language runtime.

· Static data. Static objects in application domains that could be referencing other objects. Each application domain keeps track of its static objects.

Finilizers and Managing Unmanaged Resources

If your object uses any of the unmanaged resources it has to provide the ability to free them. The common pattern is to use ‘IDisposable’ interface to deterministically dispose of unmanaged resources. In case the ‘Dispose’ method isn’t used, the developer should provide a backup, in the form of a Finilizer. The finalizer should only be called if the client code never called the ‘Dispose()’ method. So you logic should handle the case that if the dispose method was called the finaliser never executes. You can achieve that by calling ‘GC.SuppressFinalize(this)’. If you don’t the object will be in freechable queue, then in finalisation queue and even though your object is in Gen0 it will be cleaned up after several garbage collections. ‘GC.SuppressFinalize(this)’ removed the object from the freechable queue eliminating almost all performance drawbacks of having a finilizer. Why almost? Well there is a small catch. When an object has Finilizer it adds itself into freechable queue. If the objects are allocated from different threads and add themselves to freechable queue, that creates contention, theoretically slowing the allocation down due to synchronisation.

To be continued…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: