Saturday, November 28, 2015

Two kinds of equality: Thinking about Java's equals method

My planning document for this coming Monday's CS222 class read something like, "Do something with equals/hashcode/toString." I took some time this morning to flesh that out into a real lesson plan, and I came across some interesting tidbits along the way. This started as a Facebook post, but I realized I wanted something more archival, so I've upgraded to the blog.

StackOverflow led me to EqualsVerifier, a recent open-source project to automate testing of the contract between equals and hashCode in Java. It looks interesting, although I have not yet dug into its documented possible false positives. The same post led me to EqualsTester, which has been part of Guava for quite some time, although I had never come across it before. I use Guava in practically every Java project, so it looks like EqualsTester is going into my bag of tricks. Writing tests for equals is kind of a drag and, in retrospect, clearly automatable, although I never really thought about automating it before, so I never looked for library support.
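To make "clearly automatable" concrete, here is a minimal sketch of the kind of checks such a tool might run. It uses only the standard library. The class EqualsContractSketch and its method firstViolation are hypothetical names made up for this illustration; this is not the API or the implementation of EqualsTester or EqualsVerifier, and it covers only a few clauses of the contract.

```java
import java.util.List;

/** Hypothetical helper for illustration; not how EqualsTester or EqualsVerifier work. */
class EqualsContractSketch {

    /**
     * Checks a group of objects that are all expected to be equal to one another.
     * Returns a description of the first violation found, or null if none is found.
     */
    static String firstViolation(List<?> equalGroup) {
        for (Object a : equalGroup) {
            // Reflexivity: every object must equal itself.
            if (!a.equals(a)) {
                return "not reflexive: " + a;
            }
            // No object may claim to be equal to null.
            if (a.equals(null)) {
                return "claims to equal null: " + a;
            }
            // Every ordered pair is visited, so symmetry within the group is covered too.
            for (Object b : equalGroup) {
                if (!a.equals(b)) {
                    return "expected equal: " + a + " and " + b;
                }
                // Equal objects must report equal hash codes.
                if (a.hashCode() != b.hashCode()) {
                    return "equal objects with different hash codes: " + a + " and " + b;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Two distinct String instances with the same characters satisfy the checks.
        System.out.println(firstViolation(List.of(new String("Blogging"), new String("Blogging"))));
        // Two plain Objects are equal only to themselves, so this group is reported.
        System.out.println(firstViolation(List.of(new Object(), new Object())));
    }
}
```

A real library checks much more than this, such as transitivity and behavior across unequal groups, which is exactly why it is worth reaching for one instead of hand-rolling the checks.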

The most fascinating thing I came across, however, is this 2009 article by Odersky et al., which I understand to be an updated extract from Programming in Scala. The article describes four common pitfalls when writing equality-testing methods in Java, and as they point out, three of the four are covered in Bloch's classic Effective Java. That fourth one, though, hit me hard, forcing me to recognize that I had been conflating two distinct concepts in my Java programs. The first I will call content equality, which is when two objects should be considered equal because they represent the same concept in the problem domain. The second I will awkwardly call JRE equality, because it's the kind of equality that the equals method contract really specifies.

It's easy to illustrate this with an example. Ignoring all other design considerations for a moment, we can create a class like this:

public class Achievement {

    private String name;
    
    public Achievement(String name) {
        this.name=name;
    }
    
    public void setName(String name) {
        this.name=name;
    }
    
}


It might be reasonable then to expect a test like this to pass:
    @Test
    public void testEquals() {
        Achievement a1 = new Achievement("Blogging");
        Achievement a2 = new Achievement("Blogging");
        assertTrue(a1.equals(a2));
    }

If you've been doing Java for a while, you know this will fail with the default equals() implementation, which compares object identity rather than field values. Let's take the same approach that most of my students tend toward: let Eclipse generate the equals and hashCode methods for us!
public class Achievement {

    private String name;
    
    public Achievement(String name) {
        this.name=name;
    }
    
    public void setName(String name) {
        this.name=name;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Achievement other = (Achievement) obj;
        if (name == null) {
            if (other.name != null)
                return false;
        } else if (!name.equals(other.name))
            return false;
        return true;
    }
    
}

OK, now the test passes, so we are happy, right? Well, maybe not. What if we add this test?
    @Test
    public void testCollectionsIntegration() {
        Set<Achievement> set = new HashSet<>();
        Achievement a = new Achievement("Blogging");
        set.add(a);
        a.setName("Writing");
        assertTrue(set.contains(a));
    }

This test fails, because we've hit what Odersky et al. list as Common Equality Pitfall #3: defining equals in terms of mutable fields. Changing the name of the achievement object alters its hash code, so the HashSet looks for it under the new hash code rather than the one it was stored under, and the object has effectively "disappeared" from the set.
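The "disappearing" object can be observed directly. The following is a small sketch under my own assumptions: MutableKeyDemo and NamedThing are hypothetical stand-ins for the Achievement class, with equals and hashCode written by hand using java.util.Objects instead of the Eclipse-generated bodies. The object is still stored in the set, and iteration finds it, but a hash-based lookup does not.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutableKeyDemo {

    /** Hypothetical stand-in for Achievement: equals and hashCode depend on a mutable field. */
    static class NamedThing {
        private String name;

        NamedThing(String name) {
            this.name = name;
        }

        void setName(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            return Objects.equals(name, ((NamedThing) obj).name);
        }
    }

    /** Adds an object to a HashSet, renames it, then asks the set whether it contains it. */
    static boolean containsAfterRename() {
        Set<NamedThing> set = new HashSet<>();
        NamedThing a = new NamedThing("Blogging");
        set.add(a);
        a.setName("Writing"); // the hash code now differs from the one used at insertion
        return set.contains(a);
    }

    /** Same scenario, but searches by walking the set, which ignores hash codes. */
    static boolean foundByIterationAfterRename() {
        Set<NamedThing> set = new HashSet<>();
        NamedThing a = new NamedThing("Blogging");
        set.add(a);
        a.setName("Writing");
        for (NamedThing each : set) {
            if (each == a) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("contains after rename: " + containsAfterRename());
        System.out.println("found by iteration after rename: " + foundByIterationAfterRename());
    }
}
```

The set never removed anything. It simply cannot find the entry again, because the lookup starts from the current hash code.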

On the surface, it looks like there's no way out: you cannot make both tests pass. This makes it a beautiful error! The problem was never in the code; it's in how we think about the problem. The word "equals" is overloaded, as any student of programming languages knows, but that doesn't mean we can walk away from our fuzzy English understanding of the concept. The problem is that the seemingly innocuous unit test I introduced first assumes that Java's equals method should represent content equality, but that's not true. That is, the method is not about our tacit understanding of "equality": it's about making a complex runtime environment work in predictable ways.

It is fascinating that modern IDEs like Eclipse make it so incredibly easy to write these methods incorrectly. Indeed, the format of the equals method provided by Eclipse looks a lot like the template that Bloch himself provides in Effective Java.

The solution suggested by Odersky et al. is simple and elegant: because there are two distinct concepts of equality, provide two methods. The equals method inherited from Object will continue to do what it needs to do, and we introduce a new method to represent content equality. Following their example, we might introduce a method called equalsContent, which we could test using code almost identical to our earlier misconceived test:

    @Test
    public void testEqualContent() {
        Achievement a1 = new Achievement("Blogging");
        Achievement a2 = new Achievement("Blogging");
        assertTrue(a1.equalsContent(a2));
    }

This leads to a simple implementation of our domain class:

public class Achievement {

    private String name;
    
    public Achievement(String name) {
        this.name=name;
    }
    
    public void setName(String name) {
        this.name=name;
    }
    
    public boolean equalsContent(Achievement other) {
        if (other==null) return false;
        if (other==this) return true;
        // Null-safe comparison, so an Achievement with a null name does not throw.
        return java.util.Objects.equals(this.name, other.name);
    }
    
}

Nice and clean, without all that crufty tool-generated code to boot, and both testEqualContent and testCollectionsIntegration pass.
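One consequence of this design is that collection methods such as contains keep using identity, so a search by content has to be spelled out. Here is a minimal sketch of how that might look. The wrapper class ContentLookupDemo and the helper containsContent are hypothetical names of my own, and the nested Achievement is a compact copy of the class above that uses a null-safe comparison of names.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class ContentLookupDemo {

    /** Compact copy of Achievement: equals and hashCode are inherited from Object. */
    static class Achievement {
        private String name;

        Achievement(String name) {
            this.name = name;
        }

        void setName(String name) {
            this.name = name;
        }

        boolean equalsContent(Achievement other) {
            if (other == null) return false;
            if (other == this) return true;
            return Objects.equals(this.name, other.name);
        }
    }

    /** Hypothetical helper: linear search by content, since contains() relies on equals. */
    static boolean containsContent(Collection<Achievement> items, Achievement probe) {
        return items.stream().anyMatch(item -> item.equalsContent(probe));
    }

    public static void main(String[] args) {
        Set<Achievement> set = new HashSet<>();
        Achievement a = new Achievement("Blogging");
        set.add(a);
        a.setName("Writing");

        // Identity-based lookup is unaffected by the rename.
        System.out.println(set.contains(a)); // true
        // A different object with the same name is not "in" the set as far as equals is concerned.
        System.out.println(set.contains(new Achievement("Writing"))); // false
        // Content-based search finds it.
        System.out.println(containsContent(set, new Achievement("Writing"))); // true
    }
}
```

The content search is linear in the size of the collection, which is the price of keeping mutable fields out of equals and hashCode.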

I know I cannot bring a cohort of sophomores with me on this adventure on Monday, so they will only be dipping their toes into it.

Wednesday, November 11, 2015

Quoting myself through another's notebook

I had a student last year who recorded many choice quotations from my courses. Often these were silly or whimsical thoughts I shared with my students to keep the spirit light, although she also captured bits of wisdom. Today, she shared this one with me:
Make opportunities, not excuses. In the real world, no one cares why you're late, or why your part of the project isn't finished. All they know is that you didn't do what you were supposed to. So, what they want to know, is how you plan to make it right—how you plan to contribute despite your failure, and how will that contribution move the project along?
That's a nice quotation, but there's a story behind it that's even more interesting. This student took two courses with me last year—my colloquium on game design in the Fall and my game development studio in the Spring. As she was reviewing her notes in preparation for a presentation, she found similar quotations each semester. However, according to her, the one from Spring was much better articulated than the one from Fall.

The idea represented by this quotation is one that I fall back on regularly when dealing with student teams, but I do not rehearse any particular articulation of the idea: when the time is right to bring this up, I talk about it extemporaneously. I wonder, then, was the difference in articulation simply random, or was there something about the environment or context that inspired me more in the Spring? Both were among my favorite teaching experiences, although one was more like a conventional course than the other. The Fall meeting was late in the day, while the Spring meeting was early in the morning. In Fall, everyone had their own projects, whereas in Spring we were one team.

I realize that the difference may be inconsequential, but I cannot help but wonder, and so I am glad she shared this story with me.