Category Archives: Vex

Processing Data in Tree Structures

In Vex there exist several tree structures, the most prominent are the XML DOM (semantics) and the box model (representation). Both structures consist of a heterogeneous set of node types which serve different purposes. The one thing that all nodes in a tree have in common is the responsibility to form the tree structure itself. Most tasks that are not related to the structure only involve a subset of node types.

The simplest thing that might work

Let’s start with a simple example: you have an arbitrary node and want to get the text it contains. This obviously works only for nodes of type IText because all other nodes do not contain any text. A very simple approach would be the use of instanceof:

public String getNodeText(INode node) {
    if (node instanceof IText) {
        return ((IText) node).getText();
    } else {
        return "";
    }  
}

This approach is using reflection, which has kind of a bad reputation, but in the end, it does what it should do. For this simple use case it is an appropriate solution.

There is one thing you should keep in mind: when using instanceof you need discipline. Since it is a very generic mechanism, it is easy to mix up different concerns in one place. Imagine a class TableCellNode which implements several interfaces INode, ITableCell and ITextContainer. There is nothing that prevents you from adding just another little if node instanceof... to handle things concerning the node structure in the same breath with things related to table cells or text containers. Clearly a “crack in the pane”.

Polymorphism

Going to the other extreme, you can use polymorphism. This means, there is this generic getNodeText method with a contract, that defines its external behavior. Each and every node type then has to implement getNodeText according to the contract. In our example, most nodes will just return the empty String.

Polymorphism is a very strong mechanism that comes with object orientation. It provides a way to diversify behavior by setting up a custom type system. The key is, that all members of this type system share the same responsibilities, but fulfill them in a different way.

In our example this would mean that each node type has a useful textual representation. But this is not true. The responsibility shared among node types is to build the tree structure. Only one type, IText, has the special feature to provide textual content. Code related to the textual content will only be interested in nodes of type IText. Putting the getNodeText method into the base of the node types breaks the SRP and is not the best solution for our problem. But if the required functionality was related to the tree structure (the common responsibility of all nodes), polymorphism would have been a proper way to go.

The Visitor pattern

We have seen so far that you can either implement the functionality inside your types using polymorphism, or outside using some kind of type checking mechanism like instanceof. What you do highly depends if the functionality you want to implement affects all of your node types, or only some of them.

Another, more object oriented approach to do type checking is the well known Visitor pattern. It allows you, to implement type specific behavior in dedicated methods. Ideally you implement a visitor class for one single purpose. So instead of adding more and more methods with different responsibilities to each node types, you get more and more visitor classes. Each one contains only the type specific behavior related to one responsibility.

How would our example look like, using the Visitor pattern? As a first step, we will implement it as simple as possible, then refine this solution step by step:

public String getNodeText(INode node) {
    final String[] nodeText = new String[1];
    nodeText[0] = "";
    node.accept(new INodeVisitor() {
        public void visit(IText text) {
            nodeText[0] = text.getText();
        }
        // ... here follows one empty visit method for each other node type
    });
    return nodeText[0];
}

Wow. Not really simple, isn’t it? Because we are using an anonymous implementation of INodeVisitor, we have to use a common trick to get information out of the anonymous class into the getNodeText method. This adds clutter. And we have a lot of empty method implementations (which I left out), that add even more clutter and obfuscate the original purpose of the method.

We will attack the superfluous empty visit methods first. To get rid of the, we just create an implementation of INodeVisitor that contains only empty implementations of all visit methods:

public class BaseNodeVisitor implements INodeVisitor {
    public void visit(IText text) {}
    // ... here follows one empty visit method for each other node type
}

This class is then used to derive an anonymous subclass in place of the anonymous implementation of INodeVisitor:

public String getNodeText(INode node) {
    final String[] nodeText = new String[1];
    nodeText[0] = "";
    node.accept(new BaseNodeVisitor() {
        @Override
        public void visit(IText text) {
            nodeText[0] = text.getText();
        }
    });
    return nodeText[0];
}

That was easy. But how to get rid of this array? Therefor we have to extend the vanilla visitor implementation a little bit. We need an interface that provides methods which return a result:

public interface INodeVisitorWithResult<T> {
    T visit(IText text);
    // ... here follows one visit method for each other node type
}

Of course to be able to use this, INode needs an accept method which also returns the result of the visitor, exemplary implemented for IText:

public <T> T accept(INodeVisitorWithResult<T> visitor) {
    return visitor.visit(this);
}

Equipped with this and a base implementation we can now remove the tricky array from our solution:

public String getNodeText(INode node) {
    return node.accept(new BaseNodeVisitorWithResult<String>() {
        @Override
        public void visit(IText text) {
            return text.getText();
        }
    });
}

So that’s it. Surely more complicated than instanceof. You need to understand the Visitor pattern, you need to understand anonymous implementation in Java, you need to know about the base implementation of your visitor interface. In exchange you get all the amenities of Java’s static type system. For example, with support of your IDE, you can easily find all places where type specific behavior is implemented. Casting is not necessary, because Java’s dynamic dispatching is taking care of routing your request to the right visit method.

Traversal

Traversal is a very common set of functionality needed when working with tree structures. Depending on the task at hands there are various ways to traverse a tree: breadth-first, depth-first in pre-order, in-order, or post-order, and a lot more. The Visitor pattern allows you to implement a base class that provides the generic traversal functionality you need (e.g. depth-first pre-order). To put it in use, you then just need to subclass it and implement the type specific behaviour by overriding the visit methods.

For a detailed example, see DepthFirstTraversal and how it is used in various locations to collect information from Vex’s box model.

switch..case

The well known alternative to the Visitor pattern is using switch..case in combination with a set of constants representing the possible types. For example the W3C’s DOM supports this kind of type checking:

public String getNodeText(org.w3c.dom.Node node) {
    switch node.getNodeType() {
    case Node.TEXT_NODE:
        return ((Text) node).getWholeText();
    default:
        return "";
    }
}

Here you need casting again, because, except the use of reflection, we are technically very close to instanceof. If you use an enum instead of int to define the type constants, you can make sure that nothing gets mixed up with implementing type specific behaviour for your polymorphic node type.

switch..case vs. Visitor

switch..case is much easier to understand than the Visitor pattern. You don’t need the anonymous implementations and all this small Visitor classes. The type constants replicate the polymorphic type system. This is, on the one hand, duplicate information (in the sense of DRY). On the other hand you get more flexibility. The Visitor pattern works only for the leaf types of type hierarchies. You cannot have a visit method for an intermediate type T and at the same time visit methods for subclasses of T. With type constants, any combination is possible.

Visitor instances are objects while type specific behaviour using switch..case (or instanceof) is usually implemented in form of a method. This means a Visitor instance is easier to combine with other patterns (like Chain of Responsibility or Command) to build behaviour dynamically at runtime. Visitor objects are also able to carry state information, if your algorithm requires this.

Choose wisely

As usual in software development, you should not rely on a golden hammer. Use the right tool instead. Having some alternatives and knowing their pros and cons always makes your life easier.

How the Cursor Learned to Move

While implementing the new box model for Vex, I came to the point, where I needed a representation for the cursor. I started with a very simple class that knew how to draw the cursor at a given offset. That’s easy to understand, one simple responsibility in one class: SRP check!

But then there happend what always happens when you have a simple and elegant solution in place: there was a new requirement: I needed the ability to move the cursor. Just add that damn feature

The first idea could be to add new methods to Cursor in order to extend Cursor‘s behaviour. The cursor should move, then let’s add move methods for that.

I know this is too easy and you don’t want to go for this solution. But I also know that there was a time where this would have been the only solution I would have seen. So I will at least mention here why it is not the solution to go for, just for the case.

Moving is a whole new responsibility for the Cursor class, that has nothing in common with the already existing responsibility of painting. And, there are a lot of different kinds of movement: left/right, up/down, page up/down, one word forward/backward, just to name a few.

Adding just new methods for each distinct movement will clutter the small elegant Cursor class. It will make the class harder to understand. It will lower the inhibition to add further features, which will make the class even harder to understand and in the end we have a Big Ball of Mud, a God Class or however you would call that mess.

There has to be a better solution.

Separate responsibilities

The SRP and “Composition over Inheritance” lead to another approach: put all the movement implementations into a separate class that takes the responsibilty for moving the cursor. This will leave the Cursor class as it is, isolating it against changes that are related to movement, which also makes the OCP happy – at least in this regard. So let’s try it and call this class CursorMovement for the further discussion.

When starting with the implementation of CursorMovement, it is clear very soon, that this approach also has some flaws. The different kinds of movement need different information about the current cursor position:

For left/right it is sufficient to know the current offset, because moving left or right is just a matter of decreasing or increasing the offset.

Moving the cursor in vertical direction requires detailed information about the spatial relation of boxes and the current graphical representation of the cursor. Having the implementation of the movement outside of the Cursor class means that this information somehow has to be made available through Cursor‘s public interface. This leads to a very tight coupling between Cursor and the CursorMovement class.

The need for access to the graphical representation revealed another flaw that was already there in the first approach. Information provided by the graphic framework (e.g. font sizes, text length in pixels) is only available during painting (see Double Buffering in SWT for more details), but the cursor movement is triggered by a key stroke – totally independent of the painting. To solve this in a clean way, vertical movement has to be postponed until the next paint event.

Content-related movement (e.g. one word forward/backward) requires information about the textual content of the document, totally independent of the graphical representation. This adds another dependency from CursorMovement to a new, so far unrelated entity.

All in all there is no cohesion in CursorMovement and no reuse of code. The class simply piles up unrelated methods and dependencies. We had this already. Together with the unsolved temporal dependency between the graphical information and the paint event, we can come to the conclusion that this approach is also not suitable.

At this point there are a lot of indicators that something is wrong, but it is difficult to see the way out of this. Some impulse from outside might be helpful.

Solutions have to be simple in order to work

Rich Hickey’s talk “Simple Made Easy” gave me this impulse: queues are simple and a great way to do temporal decoupling. This is the missing part in the puzzle. Now the solution is really straight forward:

  • We put each single move into its own class, implementing a simple functional interface (ICursorMove).
  • When triggered by the key strokes, those move instances are added to a queue in Cursor.
  • After adding the move to the queue, also a redraw of the widget is triggered.
  • In its paint method, Cursor first applies the queued moves and then does the actual painting.

This use of (some variant of) the command pattern in conjunction with the queue solved most of the problems:

  • The access to internal data of the Cursor class is restricted to the call of the execution method of the move implementations. No need to provide data through the public interface of Cursor.
  • Each move implementation can have only the dependencies that are needed. Dependencies of a certain kind of move are more explicit.
  • There is no single class that piles up unrelated move implementations and dependencies.
  • Moves are executed within the paint handler, this resolves the temporal dependency between the move and the availability of information from the graphic framework.
  • Adding new moves is just a matter of adding new classes, no changes in existing classes are required. It is even possible to have an extension point which allows other bundles to provide special move implementations for certain document structures (e.g. semantic tables).

Conclusion

It really pays off to know about clean code development. If you listen to your code and you are able to understand the signs, it will always show you the way to a better solution.

If you interested in more details about the implementation, just have a look into the package org.eclipse.vex.core.internal.cursor in Vex’s git repository.

Double Buffering in SWT

To draw the contents of a custom widget in SWT, you have to implement a PaintListener. The paint method is the only place where you have access to the widget’s GC (aka “Graphic Context”), which provides the actual drawing methods.Paint events are handled in the Display thread. This means, the UI is not responding while you are drawing. If you do complex calculations while drawing, this results in flickering.

To prevent this flickering and to keep the UI responsive, the usual approach is to use double buffering: you use two buffers, one is used for drawing and one is visible. When you are finished with drawing a complete frame, you swap the two buffers: the buffer for drawing is now visible and vice versa.

In SWT, you don’t have an explicit buffer you are working on, but you have the GC which provides the methods to manipulate the contents of the buffer. You get the GC instance for your widget visible on the screen as part of the PaintEvent that is provided to the paint callback method. To obtain an independent GC for drawing, you have to do a little trick: create an Image with the size of your widget. For this Image, you can create a GC instance which is usable independently of the UI thread.

Putting all pieces together, you use two Image instances which are used alternating for drawing and showing. The paint callback method just draws the one image that is currently used for showing to the GC of the widget.

SWT snippet 48 shows a basic version of this approach. For Vex, I started to implement a more gerneral version, applicable to Scrollable instances, that uses a dedicated rendering thread for drawing: the DoubleBufferedRenderer.

DoubleBufferedRenderer

The DoubleBufferedRenderer handles the paint events for an instance of Scrollable using double buffering as described above. It allows you to schedule a list of IRenderStep instances, a simple functional interface to implement the distinct steps of your drawing logic, that depend on access to the GC.

Internally, the execution of the render steps happens in a dedicated rendering thread. When all steps are executed, a redraw of the widget is triggered. The finished Image is then used in the paint callback method until a new image is ready.

Usage

Example: When the control is resized, we have to layout the content and paint it. Therefor we schedule two IRenderStep instances:

renderer.schedule(layoutContent(), paintContent());

To make it more readible, I put the creation of the IRenderStep into factory methods, which return an anonymous implementation:

private IRenderStep layoutContent() {
   return new IRenderStep() {
     public void render(final Graphics graphics) {
       rootBox.layout(graphics);
       cursor.reconcile(graphics);
       updateVerticalBar();
    }
  };
}

The anonymous class looks cumbersome. In Java 8 this can be expressed in a much more elegant way, but unfortunately the Eclipse platform is still on Java 6.

For more examples of how DoubleBufferedRenderer can be used, you should have a look at the implementation of BoxWidget.

Outlook

This all is still work in progress. For example the schedule method does not queue the lists of IRenderStep instances, it just knows about the currently executed list of steps and the following list of steps. Each call to schedule overwrites the latter.

Also the usage of threads is still very ineffective. If you have multiple widgets, each using its own instance of DoubleBufferedRenderer, each one will run its own thread. Using a thread pool or more sophisticated techniques will help here.

If you still want to give it a try, there is one dependency to Vex that you have to be aware of: IRenderStep provides an instance of Graphics, an abstraction of the graphics framework in use. For use outside of Vex, you can just replace this with an instance of GC.