6

I understand that in a managed language like Java or C# there is this thing called a garbage collector that every once in a while checks to see if there are any object instances that are no longer referenced, and thus are completely orphaned, and clears then out of memory. But if two objects are not referenced by any variable in a program, but reference each other (like an event subscription), the garbage collector will see this reference and not clear the objects from memory.

Why is this? Why can't the garbage collector figure out that neither of the objects can be reference by any active part of the running program and dispose them.

Eric Anastas
  • 20,455
  • 32
  • 134
  • 223

5 Answers5

24

Your presumption is incorrect. If the GC 'sees' (see below) a cyclic reference of 2 or more objects (during the 'Mark' phase) but are not referenced by any other objects or permanent GC handles (strong references), those objects will be collected (during the 'Sweep' phase).

An in-depth look at the CLR garbage collector can be found in this MSDN article and in this blog post.

Note: In fact, the GC doesn't even 'see' these types of objects during the mark phase since they are unreachable, and hence get collected during a sweep.

Jason
  • 26,250
  • 10
  • 62
  • 62
  • Well ok that makes things a lot easier. So my program has a tree structure made up of Nodes, each of which is displayed through a custom TreeNode in a treeview. If nothing references the Node or the TreeNode then I don't have to worry about detaching the events or references between them. – Eric Anastas Aug 03 '09 at 20:04
7

Most GCs don't work with reference counting anymore. They usually (and this is the case both in Java and .NET) work with reach-ability from the root set of objects. The root set of objects are globals and stack referenced instances. Anything reachable from that set directly or indirectly is alive. The rest of the memory is unreachable and thus prone to be collected.

David Rodríguez - dribeas
  • 192,922
  • 20
  • 275
  • 473
  • Actually, i think it is wrong to call Reference-Counting a form of Garbage-Collection. They are 2 separate forms of memory management. – Henk Holterman Aug 03 '09 at 19:52
  • Reference counting (RC) is the form of garbage collection (GC). Simple RC algorithm cannot deal with cyclic references, but there are Bobrow's technique, weak pointer algorithm and partial mark-sweep algorithms that can. – mtasic85 Aug 05 '09 at 11:39
2

I would like to add that issues surrounding event subscription usually revolve around the fact that the subscriber & publisher have very different lifecycles.

Attach yourself e.g. to the App.Idle event in Windows Forms and your object will be kept alive for the remaining application lifetime. Why? That static App will hold a reference (albeit indirectly through a delegate) to your registered observer. Even though you may have disposed of your observer, it is still attached to App.Idle. You can construct many of those examples.

flq
  • 20,838
  • 4
  • 49
  • 72
2

This is a major drawback of traditional reference counting garbage collection. The property of a garbage collector describing this behavior is an incomplete collector. Other collectors largely fall into a category called tracing garbage collectors, which include traditional mark-sweep, semi-space/compacting, and generational hybrids, and don't suffer from these drawbacks (but face several others).

All of the JVM and CLI implementations I'm aware of use complete collectors, which means they don't suffer from the specific problem you are asking about here. To my knowledge, of those Jikes RVM is the only one supplying a reference counting collector (one of its many).

Another interesting thing to note is there are solutions to the completeness problem in reference counting garbage collection, and the resulting collectors demonstrate some interesting performance properties that are tough to get out of tracing collectors. Unfortunately, the highest performing reference-counting garbage collection algorithms and most completeness modifications rely on compiler assistance, so bringing them to C++'s shared_ptr<T> are difficult/not happening. Instead, we have weak_ptr<T> and documented rules (sorry about the sub-optimal link - apparently the documentation eludes me) about simply avoiding the problems. This isn't the first time (another mediocre link) we've seen this approach, and the hope is the extra work to prevent memory problems is less than the amount of work required to maintain code that doesn't use shared_ptr<T>, etc.

The mediocre links are because much of my reference material is scattered in notes from last semester's memory management class.

Community
  • 1
  • 1
Sam Harwell
  • 92,171
  • 18
  • 189
  • 263
1

The other answers here are certainly correct; .NET does it's garbage collection based on reachability of an object.

What I wanted to add: I can recommend reading Understanding Garbage Collection in .NET (simple-talk article by Andrew Hunter) if you want a little more in-depth info.

Thorarin
  • 43,849
  • 11
  • 70
  • 108