Wednesday, April 3, 2013

Java Generics Trivia: A Tale of Syntax Discovery and Clean Code

I have spent years working with Java, although never as intensively as during my graduate school years. A significant part of my effort in graduate school was in gaining a deep understanding of the semantics of Java. There have been a lot of changes in Java since then—generics being among them. Today, while doing some programming, I came across something that I didn't know you could do.
I am writing a generic fixed-size grid class: it contains width*height elements that are indexed by <column,row> pairs. When faced with the problem of making a grid in a specific size, my first thought was to do it this way:
class Grid<T> {

  public static Grid create(int width, int height) {
    return new Grid(width,height);
  }

  private Grid(int width, int height) {
    ...
  }

}

However, I am also trying to follow Clean Code as closely as I can. In particular, I wanted to reduce all my methods to zero or one parameter, so I decided to introduce a builder here instead of the static factory method. In fact, I had just shown a similar example in CS222 last week—though, notably, without generics.

My refactored version, then, looked like this:
class Grid<T> {
  public static <T> Builder<T> width(int width) {
    return new Builder<T>(width);
  }

  public static final class Builder<R> {
    private final int width;

  private Builder(int width) {
    this.width = width;
  }

  public Grid<T> height(int height) {
    return new Grid<T>(width, height);
  }
}

Isn't it pretty? No method has more than one parameter, and the structure of the method calls enforces that I cannot get a Grid object without specifying its width and height. I returned to my calling code—a private createGrid utility method within my test suite for this class—and replaced the call to the static factory method with the following. (Note that Integer here is the data stored in the grid and is not related to the int width and height; the dimensions are always int regardless of the stored data since I am dealing with a discrete grid.)
Grid<Integer> grid = Grid.width(10).height(10);

I was surprised when the compiler complained. It sure looks a lot like the handy type-inferring methods of Guava such as Lists.newArrayList, that keep me from having to pepper new and angular braces throughout my code. Yet, the compiler insists that it cannot convert a Grid<Object> into a Grid<Integer>.

Clearly the problem is with the generics. To test my understanding of type inference, I converted the code to the following:
Grid.Builder<Integer> builder = Grid.width(10);
Grid<Integer> grid = builder.height(10);

This worked fine, but it obviously is not the kind of code I want to write when creating grid objects. If I was happy with this kind of awkwardness, I wouldn't have pedantically followed Clean Code in the first place.

I could not see a clear escape from here, but I began to wonder what kind of syntactic stunts I could pull to get this to work in fewer characters. I tried the following on a whim:
Grid<Integer> grid = Grid.<Integer>width(10).height(10);

If this kind of code showed up in Naftalin and Wadler's book on generics that I read six years ago, I certainly don't remember it. I pride myself on my mental Java compiler, but my brain still stutters when I look at this code.

The good news is that it works, and I learned something about the language in which I like to claim expertise. (I suppose I knew enough to know what to try, and that says something.) The bad news is that there is still a DRY violation in the generic types. If you can see a way to eliminate that, let me know!

10 comments:

  1. Indeed this is a side effect of generic methods. The assignment provides the implicit context and chaining the builder method requires the explicit type to be provided as "height()" is called before assignment.

    The solution is to provide a bounds for the generic method width() and if that is Number then only numbers can be passed in with the associated autoboxing. The neat thing is that whatever type is passed into width() then forces the same type to be applied to height() from the builder. You can't pass in an int and then a float for instance.

    The solution is here:
    http://pastebin.com/QCNySDE6

    I use a lot of advanced generics / generic methods in TyphonRT and it is definitely knowledge / application that I consider at least that has bumped me up to an expert understanding of Java.

    I'm sure you've seen the excellent Java generics FAQ; definitely good reading:
    http://www.angelikalanger.com/GenericsFAQ/JavaGenericsFAQ.html

    ReplyDelete
  2. I suppose I might add that the reason one can pass a primitive int / float / double, etc into width() and have the type inferred implicitly is because of autoboxing to the actual number object type which is subsequently chained through the further generics of the builder.

    A final fix for clarity as I left out making width / height of Grid public final and the height assignment in the constructor:
    http://pastebin.com/tsKWmxPa

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. Hi Michael!

      (Previous attempt ate my angular brackets; let me try again.)

      Your post makes me realize that my original example is a bit misleading, since I'm using int as a my parameter and Integer as my data type. In fact, I always want int sizes, since it's a discrete grid, but I want the payload of each cell to vary. I'm using Integers because it's easy to test equality. The grid will actually contain some other type of game object, so a better example might be:

      Grid<GameObject> g = Grid.<GameObject>width(10).height(10);

      Long story short, the builder methods don't carry the payload type as an actual parameter, so it looks like I need to explicitly send it as a generic binding to width, as above.

      Delete
    3. Indeed a bit cryptic with the Integer as the original listed generic type.

      | Long story short, the builder methods don't carry the payload type as an actual parameter

      Unfortunately even passing the explicit type via the brackets before the method name is not going to be good enough because you will not be able to instantiate any of the objects for the grid. IE you can't do "T.class.newInstance()". You must explicitly pass in the type as a class reference and modify the builder accordingly.

      I've included a sample below for kicks since it was easy to pump out:
      http://pastebin.com/rRPWAwF8

      Note the use of the ThreadLocal to cache the builder utilizing type erasure to store one BuilderW reference per thread instead of creating new ones. Of course this may have made more sense with the Number / Integer / Float grid example.

      For kicks here is the Number example sealed up:
      http://pastebin.com/PaMtdLaP


      If this were TyphonRT I'd use the generic recycler for the builders as systems and the grid / game objects as data components filling it.

      Delete
    4. Very cool trick! I had not gotten far enough into this code for the grid to take care of initializing its contents: in my unit tests, I was just letting each cell start as null, and then plopping values in via mutator methods. Pretending "null" is a meaningful value, of course, violates Clean Code too---so I think my next step will be to refactor toward your solution! Thanks for the tip!

      Delete
  3. Cool, another thing to keep in mind regarding clean code and more advanced Java apps that focus on modularization is to not do bare reflection like my example does. IE you need to create new objects through whatever module framework one is using. I accomplish this in TyphonRT with a general module / class loading system interface instead of bare reflection usage.

    The example I posted does make me think about adding such a grid data structure class to TyphonRT because in the last month or so I figured out a way to store / retrieve typed components from a component manager, so it's possible to start making generic typed components. Also some insight on how I plan to transform the Grid class I wrote making it more component oriented is to make the Grid class a component manager. For the set / get methods and even the toString method along with equals and hashcode instead of only providing a default operation these methods will delegate to a special GridLifecycle system that can potentially provide specialization depending on the type of components stored in the grid and override the default behavior. IE the Grid is marked final so it can't be extended, but one stuffs GridLifecycle systems into it to specialize. For instance you can filter the get / set methods x & y values to offset them. IE a grid from -5 to 5 on x / y which properly offsets to the proper positive array indexes. It might also be interesting to see if there are some potential matrix like operations that can be performed on the grid across the objects / components contained. Another idea is to provide custom iterators for all the elements, row / column iterators, or other grid oriented iteration patterns.

    Making "game of life" with your example? Seems to be a good example of the grid concept. I suppose an interesting component idea in this regard is to create a customized event bus system (event bus pattern) and when a cell posts an event to the bus the event bus can look up the surrounding cells to figure out event delivery such that one doesn't use the observer / listener pattern and distribution is handled by an intermediate system.

    ReplyDelete
    Replies
    1. I am making a cellular automaton, like Game of Life but inspired by computational physics. I'm tinkering with a few ideas to see what games lie within; if I get something up and running, I'll post about it, though that probably won't be until Summer.

      I had not considered using an entity system framework for this system. I had some simple demo code in Swing that I wrote on a flight, and I ported that over the PlayN to see how it performed on Android and HTML5. Turns out the performance was quite bad, which made me very happy to have tried it early rather than build on a house of cards! I'm torn now between rewriting it in an ES architecture in order to gain the myriad benefits of that, vs. writing it quick-and-dirty to get performance increases and quickly get something testable---even if it means I have to throw it away. Not sure which approach carries more risk yet.

      Delete
    2. Neat on the cellular automaton direction. I decided to make the grid component manager in TyphonRT.

      A new demo / code lesson uses it for game of life.

      **"I ported that over the PlayN to see how it performed on Android and HTML5. Turns out the performance was quite bad..."

      Any chance you can qualify the above statement. Do you think it was a PlayN issue or did you do any profiling? It's hard to get a feel for how this could be slow on Android.

      It seems a quick and dirty (~week long) approach to get a demo up and running is a good idea then convert it to an ES approach. This also provides some metrics to test the ES version against for performance.

      With my efforts with TyphonRT the nice thing is that I have a well defined component architecture insofar that if the default manager / component classes are used (vs interfaces / bespoke implementation) the managers have a way to inject whatever via a package private method into the child component being added / removed. This is how the parent / child relationship is established fast; IE a manager injects itself into child components being added. Where this is also very useful is with the GridManager since it injects itself as a manager and the multidimensional array into child components added. This opens flexibility of any child component to directly manipulate the backing array. So say you have an "IMatrixOps" system interface. You can create an implementation of that interface and the system is injected with the backing array so the system can directly modify the grid elements in a custom way; IE do whatever "invert()" means for the potential data components being stored.

      One can do grid.getAs(SYSTEM, IMatrixOps.class).invert();

      It's super fast too. So I am curious what you find if / when you profile the code you created. For Android I just use DDMS for the info it provides and create memory dumps and such and use the YourKit Java profiler which has been quite good.

      Delete
    3. I did not do formal profiling. The test application was supposed to populate five random cells per second. It ran fine on my beefy development workstation, but once I put it on both my very slow tablet and my Nexus 4, it was very slow when using a 101x101 grid. Once I dropped that to 50x50, it was fine.

      The original implementation contained some intentional inefficiencies, or rather, design optimizations rather than execution optimizations. For example, each cell was indexed by a Coordinates instance rather than a pair of ints, (as in, "grid.at(Coordinates.x(5).y(7))"). I had not taken the time to pool the Coordinates objects: they were just created as needed. The backing data structure was a mapping of coordinates to cells, not a two-dimensional array. This allowed me to iterate over all the cells with "for (Cell cell : grid.cells()) ... " which looks great but is surely less performant than "for (i=0;i<width;i++) ..."

      The rendering code was visiting every cell every frame. I tried this with both an ImmediateLayer and with a Surface, with no major difference in performance. I was curious as to whether straight GL calls would make a difference, but I am a bit rusty there. I implemented a variant in which cells were filled in a surface only when their state changed, and that surface was rendered, and then the performance was fine, even with 101x101 with 10 random fills per second.

      Long story short: it wasn't designed to be production code, just some experiments. I was talking with one of my students yesterday about game design, and I mentioned that what I should really do here is stop tinkering with architecture and just get my interactive cluster growth system working, so that I can figure out if there's any "game" in there. Truth is, architecture can be relaxing for me when programming after long days of work. The semester will end soon enough, and then I can turn my eye to this problem with more gusto.

      Delete