Gagallium : Thoughts about subject/observer, publisher/subscriber, and self types in Java

I am neither a Java aficionado nor a Java guru, but I use it as a vehicle for teaching programming at an undergraduate level.

In this post, I describe a simple situation where the need for a self type arises in Java. I present a way of simulating a self type in Java, and also suggest that in this case, by changing the code slightly, one can avoid the need for a self type in the first place. None of these ideas is new, but perhaps they deserve to be more widely known.

The Java library (package java.util) offers a simple implementation of the subject/observer design pattern. It takes the form of an Observable class, which maintains a list of observers, and an Observer interface, which states (in short) that an observer must be able to receive a message.

A subject is essentially a publisher

In the subject/observer design pattern, an observer is supposed to be notified only when the state of the subject changes. Java’s Observable class maintains a private Boolean field called changed, which is read and written through the methods hasChanged, setChanged, and clearChanged. The method notifyObservers does nothing unless changed is set; if it is set, notifyObservers clears it and notifies the observers. This relatively simple logic is independent of the point that interests me, so I will omit it from this discussion.

As a result of this omission, the subject/observer pattern degenerates and becomes essentially a publisher/subscriber pattern, where a subject can decide at any time to send a message to all of its observers.

A key point of interest, though, is that the subject sends itself as the message (or as part of the message).

Java’s Observer and Observable are not generic

Have a look at Java’s Observer interface. The update method expects two arguments: the subject that sends the message, and the message itself.

public interface Observer {
  void update (Observable subject, Object message);
}

This is coarse, and slightly unsatisfactory. When someone implements the Observer interface, they will have in mind a specific type of subject (a subclass of Observable) and a specific type of message. Thus, they will be forced to use an inelegant and potentially unsafe downcast instruction.
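To make the downcast concrete, here is a minimal sketch built on java.util.Observer and java.util.Observable. The classes Thermometer and ThermometerDisplay, and the field degrees, are hypothetical names that I introduce for illustration only.

```java
import java.util.Observable;
import java.util.Observer;

// Hypothetical subject: a thermometer that sends itself to its observers.
@SuppressWarnings("deprecation") // java.util.Observable is deprecated since Java 9
class Thermometer extends Observable {
  private double degrees;

  double getDegrees() { return degrees; }

  void setDegrees(double degrees) {
    this.degrees = degrees;
    setChanged();        // without this call, notifyObservers does nothing
    notifyObservers();   // invokes update(this, null) on every observer
  }
}

// Hypothetical observer that only makes sense for Thermometer subjects.
@SuppressWarnings("deprecation")
class ThermometerDisplay implements Observer {
  double lastSeen = Double.NaN;

  @Override
  public void update(Observable subject, Object message) {
    // The static type of subject is Observable, so a downcast is needed.
    // It throws ClassCastException at runtime if this observer is ever
    // registered with another kind of subject.
    Thermometer t = (Thermometer) subject;
    lastSeen = t.getDegrees();
  }
}
```

Nothing in the types prevents a ThermometerDisplay from being added to an unrelated Observable, which is why the cast is only potentially safe.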

A generic Observer

In order to avoid this, it seems obvious that one should create a parameterized version of the Observer interface.

public interface Observer<M> {
  void notify (M message);
}

I have slightly over-simplified the interface by deciding that notify takes a single parameter: a message. In principle, this is sufficient. If one wishes to convey the identity of the subject to the observer, then one can send the subject itself as the message. If one wishes to convey both the identity of the subject and some piece of data, then the message can be a pair of these two values.
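As a rough illustration of the last remark, a message that carries both a sender and a piece of data could look like the sketch below. The class Tagged and the observer PairLogger are hypothetical; the sender is represented by a plain String only to keep the example short.

```java
// The generic observer interface from the text.
interface Observer<M> {
  void notify(M message);
}

// Hypothetical immutable pair: who sent the message, and what was sent.
final class Tagged<S, D> {
  final S sender;
  final D data;

  Tagged(S sender, D data) {
    this.sender = sender;
    this.data = data;
  }
}

// Hypothetical observer whose message type is a (sender, data) pair.
class PairLogger implements Observer<Tagged<String, Integer>> {
  final StringBuilder log = new StringBuilder();

  @Override
  public void notify(Tagged<String, Integer> message) {
    // Both components are available with their precise types; no cast.
    log.append(message.sender).append('=').append(message.data).append(';');
  }
}
```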

Of course, parameterizing the Observer interface does not solve the problem. It only moves the problem to the implementation of the Subject class.

A basic Subject

We can now implement a basic version of the Subject class. In the definition of notifyObservers, we decide that the message sent to the observers will be this, that is, the subject itself. Thus, it seems that every observer must have type Observer<BasicSubject>.

import java.util.LinkedList;
import java.util.List;

public abstract class BasicSubject {

  private final List<Observer<BasicSubject>> observers
    = new LinkedList<Observer<BasicSubject>> ();

  public void addObserver (Observer<BasicSubject> o)
  {
    observers.add(o);
  }

  public void notifyObservers ()
  {
    for (Observer<BasicSubject> o : observers)
      o.notify(this);
  }

}

This works, but is again not satisfactory. Someone who implements the interface Observer<BasicSubject> will again be forced to cast from the type BasicSubject down to some specific subclass.

What am I? or, the need for a self type

A Scala programmer would know how to solve this problem. We need a self type. That is, we would like the observers to have type Observer<Self>, where Self is the type of this. In other words, Self is an as-yet-undetermined subtype of Subject.

In Scala, one can introduce Self as a type parameter and constrain it to stand for the type of this, via a constraint of the form this : Self => ....

In OCaml, the same thing is possible. (Thanks to Gabriel Scherer for pointing this out.) The subject/observer pattern can be implemented as follows:

class type ['m] observer = object
  method notify : 'm -> unit
end

class subject = object (self : 'self)
  val mutable observers : 'self observer list = []
  method add_observer o =
    observers <- o :: observers
  method notify_observers () =
    List.iter (fun o -> o#notify self) observers
end

Scala and OCaml are cool, but I teach Java, so let’s go back to it.

Simulating a Self type

As of version 7, Java does not have this feature, but we can simulate it by declaring an abstract method, named self, whose return type is Self, and which we intend to implement (in a concrete subclass) as return this.

The code is now:

import java.util.LinkedList;
import java.util.List;

public abstract class Subject<Self> {

  private final List<Observer<Self>> observers
    = new LinkedList<Observer<Self>> ();

  public void addObserver (Observer<Self> o)
  {
    observers.add(o);
  }

  public void notifyObservers ()
  {
    for (Observer<Self> o : observers)
      o.notify(self());
  }

  public abstract Self self ();

}

We could add the constraint that Self extends Subject<Self>, but it is not required here.
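For the curious, here is a sketch of what the constrained variant might look like. The names BoundedSubject, Humidity, and CountingObserver are hypothetical; the rest mirrors the Subject class above.

```java
import java.util.LinkedList;
import java.util.List;

interface Observer<M> {
  void notify(M message);
}

// Variant of Subject where Self must itself be a BoundedSubject<Self>.
abstract class BoundedSubject<Self extends BoundedSubject<Self>> {

  private final List<Observer<Self>> observers
    = new LinkedList<Observer<Self>> ();

  public void addObserver (Observer<Self> o) { observers.add(o); }

  public void notifyObservers ()
  {
    for (Observer<Self> o : observers)
      o.notify(self());
  }

  public abstract Self self ();
}

// Accepted: Humidity is a subtype of BoundedSubject<Humidity>.
final class Humidity extends BoundedSubject<Humidity> {
  @Override public Humidity self () { return this; }
}

// Rejected by the compiler, because String is not within the bound:
// class Bogus extends BoundedSubject<String> { ... }

// Hypothetical observer that remembers the last subject it received.
class CountingObserver implements Observer<Humidity> {
  Humidity last = null;
  int count = 0;
  @Override public void notify (Humidity h) { last = h; count++; }
}
```

The bound rules out some nonsensical instantiations, but not all of them: a class declared as extending BoundedSubject<Humidity> without being Humidity itself would still be accepted, so the intended reading of Self remains a convention.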

When we later implement a concrete subclass of Subject, say Temperature, we implement the method self, as follows.

public class Temperature extends Subject<Temperature> {

  @Override public Temperature self ()
  {
    return this;
  }

}

This may seem a bit heavy, and indeed it is, but at least we have been able to simulate a self type. One can now implement the interface Observer<Temperature> without a downcast.
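Here is a sketch of such an observer, assembled into a self-contained example. The field celsius, its getter and setter, and the class TemperatureDisplay are assumptions of mine; the Temperature class in the text has no state.

```java
import java.util.LinkedList;
import java.util.List;

interface Observer<M> {
  void notify(M message);
}

abstract class Subject<Self> {

  private final List<Observer<Self>> observers
    = new LinkedList<Observer<Self>> ();

  public void addObserver (Observer<Self> o) { observers.add(o); }

  public void notifyObservers ()
  {
    for (Observer<Self> o : observers)
      o.notify(self());
  }

  public abstract Self self ();
}

// Temperature as in the text, plus a hypothetical celsius field.
class Temperature extends Subject<Temperature> {
  private double celsius;

  double getCelsius () { return celsius; }

  void setCelsius (double celsius)
  {
    this.celsius = celsius;
    notifyObservers();   // every observer receives this Temperature
  }

  @Override public Temperature self () { return this; }
}

// Hypothetical observer: the parameter already has type Temperature,
// so getCelsius can be called directly, without a downcast.
class TemperatureDisplay implements Observer<Temperature> {
  double lastSeen = Double.NaN;

  @Override public void notify (Temperature t)
  {
    lastSeen = t.getCelsius();
  }
}
```

In contrast with an observer of BasicSubject, registering a TemperatureDisplay with a subject of some other type is now a compile-time error.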

Publishers are simpler than subjects

The need for self types arises because a subject sends itself as a message to an observer. If we did not make this decision at the level of the super-class, the code would be simpler, and we would still be able to make this decision at the level of the subclass.

Let’s see.

A subject becomes just a publisher, and the type parameter Self becomes M, the type of the message that is sent. The type M is entirely undetermined at this point.

import java.util.LinkedList;
import java.util.List;

public abstract class Publisher<M> {

  private final List<Observer<M>> observers
    = new LinkedList<Observer<M>> ();

  public void addObserver (Observer<M> o)
  {
    observers.add(o);
  }

  public void notifyObservers (M message)
  {
    for (Observer<M> o : observers)
      o.notify(message);
  }

}

When we later implement a concrete subclass of Publisher, say Pressure, we instantiate M with Pressure itself. Then, we implement a new version of notifyObservers, which does not take a parameter, by invoking the inherited notifyObservers with this as a parameter.

public class Pressure extends Publisher<Pressure> {

  public void notifyObservers ()
  {
    notifyObservers(this);
  }

}

The end result is the same as in the Subject/Temperature example. However, because we no longer need a self method, this version of the code is perhaps easier to explain to a non-expert programmer.
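To check that claim, here is a small self-contained sketch of the publisher version in use. The field hectopascals, its accessors, and the observer PressureLog are hypothetical additions; Publisher and Pressure otherwise follow the code above.

```java
import java.util.LinkedList;
import java.util.List;

interface Observer<M> {
  void notify(M message);
}

abstract class Publisher<M> {

  private final List<Observer<M>> observers
    = new LinkedList<Observer<M>> ();

  public void addObserver (Observer<M> o) { observers.add(o); }

  public void notifyObservers (M message)
  {
    for (Observer<M> o : observers)
      o.notify(message);
  }
}

// Pressure as in the text, plus a hypothetical hectopascals field.
class Pressure extends Publisher<Pressure> {
  private int hectopascals;

  int getHectopascals () { return hectopascals; }

  void setHectopascals (int hectopascals)
  {
    this.hectopascals = hectopascals;
    notifyObservers();
  }

  // The decision to send this is taken here, in the subclass,
  // where the type of this is known to be Pressure.
  public void notifyObservers ()
  {
    notifyObservers(this);
  }
}

// Hypothetical observer; as in the Temperature case, no downcast is needed.
class PressureLog implements Observer<Pressure> {
  int lastSeen = -1;

  @Override public void notify (Pressure p)
  {
    lastSeen = p.getHectopascals();
  }
}
```

Since M is unconstrained, the same Publisher class could also be instantiated with a message type other than the publisher itself, which is something the Subject version does not allow.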