Garbage collection is hard (GObject memory management)

I'm beginning to understand why garbage collection is quite a hard thing to implement.

I'm working on memory management in cl-gtk2-object (fixing bugs and trying to make it clearer) and keep bumping into various issues.

GObject uses reference counting to manage memory, and it has three kinds of references:

  • strong references, which are the normal kind of reference;
  • weak references;
  • toggle references.

The first two are more or less clear, but toggle references are a more exotic beast. Their main use is to integrate GObject's reference counting with the garbage collectors of other languages, which is very useful when writing Gtk+ bindings for a garbage-collected language.

“Toggle” references differentiate between two states:

  • the object is only referenced by a toggle reference; and
  • the object is referenced by at least one other reference.

In the first case, when the object is referenced only by the toggle reference, only the host language (the one that embeds the GObject runtime) can hold references to that object. This lets the language runtime be sure that the object can't be resurrected. Without this kind of reference, a GObject binding could not be sure that it is dropping the last reference to an object, and the same pointer could resurface some time later, which would force the binding to recreate an object proxy that had already been discarded. The proxy that was gone could have carried some other state (for example, it could have been of a derived type not known to the GObject class system), and once that proxy has been deleted, the binding would have no choice but to return an instance of the GObject class. In a nutshell, “toggle” references let us hold on to a reference to a Lisp-side object to ensure that it is available when necessary. And when there are no “foreign” references, we are allowed to downgrade our reference (contained within a hashtable that maps from GObject pointers to Lisp-side object references) to a weak one.

It is quite easy to deal with this kind of reference. Basically, we have two hashtables: one for strong references and one for weak references. When searching for an object by foreign pointer, we look in both tables. And when the toggle reference is toggled, we transfer the corresponding pointer-object pair from one hashtable to the other.
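The two-table scheme can be sketched in a few lines. This is only an illustration in Python, not the cl-gtk2 code: `Proxy`, `ProxyRegistry` and `on_toggle` are hypothetical names, the foreign pointer is modelled as a plain integer, and I assume the toggle callback receives a flag that is true when the toggle reference has become the only reference (as the `is_last_ref` argument of GObject's toggle notification does).

```python
import gc
import weakref


class Proxy:
    """Hypothetical stand-in for a Lisp-side proxy wrapping a foreign pointer."""

    def __init__(self, pointer):
        self.pointer = pointer


class ProxyRegistry:
    """Maps foreign pointers to proxies through one strong and one weak table."""

    def __init__(self):
        self.strong = {}                            # keeps proxies alive
        self.weak = weakref.WeakValueDictionary()   # does not keep proxies alive

    def register(self, proxy):
        # Assumption: a freshly registered object may have foreign references,
        # so it starts out in the strong table.
        self.strong[proxy.pointer] = proxy

    def lookup(self, pointer):
        # Look in both tables, strong first.
        proxy = self.strong.get(pointer)
        if proxy is None:
            proxy = self.weak.get(pointer)
        return proxy

    def on_toggle(self, pointer, is_last_ref):
        # is_last_ref true: only the binding references the object, so the
        # table entry may become weak. Otherwise it must become strong again.
        if is_last_ref:
            proxy = self.strong.pop(pointer, None)
            if proxy is not None:
                self.weak[pointer] = proxy
        else:
            proxy = self.weak.pop(pointer, None)
            if proxy is not None:
                self.strong[pointer] = proxy
```

Once an entry has been moved to the weak table and the host program drops its own references to the proxy, the garbage collector is free to reclaim it, and a later `lookup` simply returns nothing.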

In the course of this, I discovered a bug in object instantiation: sometimes extra references to objects were retained.

Another issue I discovered is more insidious. To store references to event handler callbacks, a technique named “stable pointers” is used. A “stable pointer” is just a fancy name for an integer index into an array of object references. The trouble is that the references in that array are always strong ones. If an event handler references the object whose event it handles, then the object will never be collected, because it is strongly referenced by the event handler, which in turn is held by the stable pointers array.
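To make the leak concrete, here is a minimal sketch of a stable pointer table and of a handler that captures its own object. It is written in Python purely for illustration; `StablePointers`, `Button` and `connect_self_referencing_handler` are hypothetical names, and the real table in the binding may be organised differently.

```python
import gc
import weakref


class StablePointers:
    """Hands out integer indices that stand for strongly held objects."""

    def __init__(self):
        self._slots = []

    def allocate(self, obj):
        # Reuse a freed slot if one exists, otherwise grow the array.
        for index, slot in enumerate(self._slots):
            if slot is None:
                self._slots[index] = obj
                return index
        self._slots.append(obj)
        return len(self._slots) - 1

    def get(self, index):
        return self._slots[index]

    def free(self, index):
        # Dropping the slot is the only way the stored object is released.
        self._slots[index] = None


class Button:
    """Hypothetical proxy object whose event is being handled."""


def connect_self_referencing_handler(table, button):
    def on_clicked():
        return button  # the closure captures the button strongly

    # The table now holds the handler, and the handler holds the button.
    return table.allocate(on_clicked)
```

If the program drops its last reference to a `Button` after connecting such a handler, the button still stays reachable through the table, and it is released only after `free` is called on the handler's index.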

The object graph looks something like this: image.

The solution to this issue is to make the corresponding signal handlers weak when the object reference is toggled into the weak state. In this case, the object would not be strongly referenced by the signal handler and it will be collected by the garbage collector (along with the handlers of its events). Now all that remains is to implement this. Or, as another option, store the event handlers in the object itself:

image
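The second option, keeping the handlers inside the object, might look roughly like the sketch below. Again this is Python used only for illustration, with hypothetical names (`Widget`, `connect`, `emit`, `make_widget_with_self_reference`). The point is that the only strong reference to a handler lives in the object, so a handler that refers back to its object forms a self-contained cycle that a tracing collector can reclaim.

```python
import gc
import weakref


class Widget:
    """Hypothetical proxy that owns its event handlers."""

    def __init__(self):
        self.handlers = []  # strong references live on the object itself

    def connect(self, handler):
        # The handler id is just an index into this object's own list.
        self.handlers.append(handler)
        return len(self.handlers) - 1

    def emit(self, handler_id):
        return self.handlers[handler_id](self)


def make_widget_with_self_reference():
    widget = Widget()

    def on_event(sender):
        return widget  # the handler refers back to its own object

    widget.connect(on_event)
    return widget
```

With this layout nothing outside the widget keeps the handler alive. The foreign side would need only the object's pointer and the handler index to find the callback, which is an assumption of this sketch rather than a description of the existing code.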