Friday, December 21, 2012

The Development of "The Underground Railroad in the Ohio River Valley"

I am just finishing my role as Technical Director of an educational game project, The Underground Railroad in the Ohio River Valley. My friend and colleague Ronald Morris received a grant from the Entertainment Software Association to produce a game on this theme, and he and his team worked through Summer on research and design. Early in the project, I offered the services of my CS315/515 Game Programming class to serve as the technical team, knowing that this would be a good immersive project for them.

Preproduction

Some of Ron's students did preliminary research and design before summer, but it was during the summer semester that most of the design happened. I was not heavily involved in this stage: I did a bit of consulting with the design team, primarily on what scope was appropriate for my team to complete in the Fall. At the end of the summer, I was given a set of design notes, which I transcribed and interpreted into a design wiki. The original format was a PowerPoint slideument, which was wholly inappropriate for the task, in part because there was no way to tell by looking at the printed copy that some of the data tables exceeded the size of the page. In retrospect, I wish I had remembered to point the design team toward Stone Librande's one-page design approach—not that they necessarily needed one-page designs, but it certainly would have gotten them thinking more critically about how to communicate this information effectively. Regardless, by copying the information into a structured design wiki, I was able to build my own mental model of the game, codify nomenclature, and identify missing pieces. As usual, I used Google Sites as a lightweight wiki. The fact that the design was essentially static meant that I didn't have to deal with refactoring or maintaining the wiki: it was primarily for structured presentation to the technical team.

Lead designer Michael Smith demonstrates the paper prototype to community partner Jeannie Regan-Dinnius of the Indiana Department of Natural Resources Division of Historic Preservation and Archaeology

Jeannie explores the physical prototype
My experience with the first semester of Morgan's Raid production taught me that wrangling 24 undergraduates took all of my time—time that would be better spent helping a smaller group to learn and make progress on the game. The chair of my department is very supportive of this kind of immersive learning work, and he permitted me to cap the enrollment of CS315/515 and institute an application process. This allowed me to communicate my expectations to students before the semester started, so every student who applied knew that we would be working in a studio space, with a client, and with the expectation of nine hours per week dedicated to production. Thanks again to the support of my chair, the team was given the use of our undergraduate research space, RB368.

RB368, before the start of the semester

Team logistics

My team of twelve gathered on the first day of the semester to begin work. I had prepped them over email and as part of the application process, and we used the first week to talk about project management using Scrum and the basics of Unity3D. I had used both of these in my semester at the Virginia B. Ball Center for Creative Inquiry and felt comfortable applying them to this new project. We set up a task board and got to work, planning seven two-week sprints to bring us to a successful completion.

The class was scheduled at 9AM Monday, Wednesday, and Friday, and this was when we held our "daily" stand-up meetings. The fact that many people had 10AM classes very quickly became an impediment, albeit an expected one. Unlike at the VBC, and unlike the fortunate scheduling of the Morgan's Raid Spring Team, these students had non-complementary schedules. As a result, there was very little whole-team collocated work outside of 9AM MWF. After-hours meetings were regularly initiated by the team, but I was rarely able to attend. Each student was on their honor to give nine hours of work per week to the project, but it would have been nicer to have more collocated time.

The technical team hard at work
The lack of collocation was a particular impediment with respect to art production. While the technical team was working for academic credit, the artists and musicians were hourly workers hired through the ESA grant. Ron was responsible for recruiting and managing them, although this caused some communication problems when the artists and musicians did not know to whom they were accountable—the professor paying their checks or the professor giving them work. I do not doubt that this would have been ameliorated had they worked in RB368 alongside the technical team in cross-functional teams, as they had been directed; instead, each retreated to his or her own corner of campus. The predictable result was that a lot of time was wasted producing assets that were not usable.

After failing to complete any user stories in their first sprint, the technical team was able to pull itself up by its proverbial bootstraps and get productive. We conducted retrospectives at the end of each sprint to reflect on what was working and what wasn't working. The same issues tended to come up time and again, but I do feel like the team made progress in self-management and accountability during the semester. One of the perennial problems I have in dealing with students—which I suspect is a subset of dealing with humans—is their love of generality. Students will very gladly speak in platitudes and generalities, rather than nailing down specific people for specific problems. As a result, everyone tends to smile and nod and feel that a reflection was useful, but not necessarily hold each other accountable, since there's always a way to wiggle out. I got a copy of Patrick Kua's The Retrospective Handbook during the semester but have not had the chance to read it yet; I hope that I can find some practices within that I can bring to next semester's project (more on that in another post).

Managing and teaching

Managing these projects can be stressful. Each sprint, the team talks about trying to make steady progress, yet almost every sprint, it feels like there is no way they will complete their tasks on time. The team's retrospective notes show that they grew to desire steady progress. However, there was still a strong tendency to fall for the illusion of progress in the first half of the sprint, doing a lot of talking and hand-waving, and then rushing through the second half. As a result, validation was inconsistent, and tasks were rarely "done done." Even at the end of the project, a major defect still existed in the build, and the response from the students working on this feature was, essentially, that the defect was enshrined in their implementation and we'd have to live with it. (NB: I fixed it in about half a day's work, but see the section on Expertise below.)

The stress this semester was compounded by my nagging sense that I was not spending enough time sitting with the students and creating software with them. In the three hours per week that I saw the whole team, my time was spent on project management: clarifying design ideas, coordinating communication, pulling on the artists for sketches, and barely having time to look over shoulders at source code before people ran off to their next meetings. My personal schedule did not permit me to come in for after-hours meetings, in large part because of the welcome addition of my third son. I needed to prioritize family over working extra hours, and I have no regrets about this! Yet, I cannot shake the feeling that there had to be a way to influence the students more strongly toward best practices. I had put together a team programming manual in an attempt to codify collaborative practice, and though this document was distributed before the semester and reviewed in the first week, it went almost completely unheeded. Actually, it was a bit worse than that: the students religiously followed the parts that made life easier and ignored the parts that actually would have made the implementation better. In particular, I borrowed two bits of Robert C. Martin's Clean Code:
  • Don't use comments: they rot and indicate a failure of attention to careful design.
  • Don't repeat yourself: replace copy-paste code with better abstractions.
I'm sure you can guess which one of those two they followed. Hint: we had a lot of copy-pasted code with no comments to explain it. To be clear, I am not blaming the students for being novices. I think that a major factor here is that they could not see the master working his craft, and so I fear that many did not learn some of the deeper lessons I hoped for them. My principal contrast here is with the Morgan's Raid Spring Team, a team in which I was embedded for about nine hours per week in Spring 2011.

I found several missed opportunities for good software design as I worked directly with the code this week. One of the most egregious is in the handling of percentage tables. The core game design involves tables of outcomes determined by percent likelihood, in the vein of a tabletop wargame. Code like this can be found throughout the implementation:

Random rand = new Random();
float roll = rand.nextFloat();
if (roll >= 0.0f && roll <= 0.05f) {
  // do something
}
else if (roll > 0.05f && roll <= 0.10f) {
  // do something else
}
// etc.

This happens a lot in the code, including the redundant lower-bounds checking. A simple table abstraction, with methods to add entries and compute percentages, would have made this much more readable and maintainable. I'm not sure how to help students perceive these affordances, but it's something I want to try to spend more time on in both CS315/515 and CS222, which is the prerequisite.
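
To make the idea concrete, here is a minimal sketch of what I mean; the class and method names are my own invention for this post, not anything from the project's source.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// A minimal sketch of a weighted-outcome table (names invented for this post).
public class PercentageTable<T> {

  private class Entry {
    final float weight;
    final T outcome;
    Entry(float weight, T outcome) {
      this.weight = weight;
      this.outcome = outcome;
    }
  }

  private final List<Entry> entries = new ArrayList<Entry>();
  private final Random random = new Random();

  // Associates an outcome with a share of the table, e.g. 0.05f for 5%.
  public PercentageTable<T> add(float weight, T outcome) {
    entries.add(new Entry(weight, outcome));
    return this;
  }

  // Rolls once and returns the outcome whose cumulative range contains the roll.
  public T roll() {
    float roll = random.nextFloat();
    float upperBound = 0f;
    for (Entry entry : entries) {
      upperBound += entry.weight;
      if (roll < upperBound) {
        return entry.outcome;
      }
    }
    return entries.get(entries.size() - 1).outcome; // guard against rounding at the top end
  }
}

With something like this, the nested if/else chains collapse into calls like table.add(0.05f, ...).add(0.10f, ...), the redundant bounds checks disappear, and the rolling logic lives in exactly one place.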

There were several times when I was able to push students in the direction of better design, and some of these came up as significant learning outcomes in our semester retrospective. I worked with students to incorporate the Builder pattern in the domain model and frequently assisted with functional decomposition. I also guided the team to use a formal state machine analysis of the game and then use the State design pattern to implement it; this change impacted the entire team for the better. These interactions, in which I guided students toward great ideas of software design, showed up positively in course evaluations and the semester retrospective.
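
For readers unfamiliar with the pattern, the core of it is small; this sketch uses hypothetical state names rather than the ones from the actual game.

// Each state handles one frame of the game and reports which state comes next.
interface GameState {
  GameState update(float deltaSeconds);
}

// Hypothetical example: a paused screen that returns to whatever state preceded it.
class PausedState implements GameState {
  private final GameState resumeTo;
  private boolean unpauseRequested; // set by an input handler elsewhere

  PausedState(GameState resumeTo) {
    this.resumeTo = resumeTo;
  }

  void requestUnpause() {
    unpauseRequested = true;
  }

  @Override
  public GameState update(float deltaSeconds) {
    // The state itself decides its successor; the game loop just delegates:
    //   currentState = currentState.update(delta);
    return unpauseRequested ? resumeTo : this;
  }
}

The appeal is that the states on the whiteboard diagram and the classes in the code can correspond one-to-one, which keeps the transition logic out of giant switch statements.
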
The team considers the task board and the state diagram
The picture above demonstrates the recurrence of a theme that shows up in my scholarship: the importance of dedicated space for these kinds of projects. It shows how the whiteboard served as an information radiator: the tasks were always posted, and critical diagrams such as the state diagram stayed up for weeks. (In fact, the team seemed to really like the term "information radiator," incorporating it into their dialog and writing after I introduced it to them.) The team could have erased and reused the board for other purposes at any time—and they frequently did early in the semester—until this kind of information began being posted. Then, the team recognized the need to keep these models available for easy reference at all times. Consider also the game's various dialog boxes: they underwent many revisions before the team settled on a standardized way of handling them, and the visual designs occupied a side board for about half the semester.

Dialog box design standards

Working with clients and the community

Ron had a class at the same time as CS315/515, and so he could not attend our Sprint Review meetings. Knowing that we needed his feedback, we held Sprint Review Redux meetings at 10AM, and about four or five team members would stick around for it. It didn't come up until the course evaluations, but the students who stayed really valued this interaction. I had not thought much about the impact of this meeting on the students, primarily thinking of it as something I needed to ensure we were going in the right direction. After reading the evaluations, I realized that such meetings are critical to the team's morale: they helped the students feel like valuable contributors to a bigger purpose rather than programming automata or hired help. I wonder whether a different teaching schedule, one that let the whole team build more empathy for Ron's and the players' needs, would have led to the team being more consistent in completing tasks with the highest quality.

The technical team conducted two external playtests of the game late in the semester, knowing that the design team had already tested the core mechanics. These were after-hours, and participation was encouraged but optional. I think that, as with the meetings with Ron, these meetings helped the team to understand better the context and impact of their work. The conventional use of such playtesting is to identify problems, but in this case, much of the design was done before my students got their hands on it. I think that the primary benefit here was not in usability, but in helping my students remember what elementary school students are like, and that there were real end users for this project.

Team members Matt Brewer and Daniel Wilson playtesting at Motivate Our Minds

On expertise

The game is scheduled to "ship" on December 31, coinciding with the end of the ESA grant. Long ago, I had promised Ron that I would help out during the break if there was anything left to be done, and that's what I've been doing this week. Some of the artists are still on contract to finish assets, and so I continue to direct their efforts while also integrating their work into the project, fixing defects, and adding features. Working intimately with my students' code has forced me to think about the nature of novices and experts.

So far this week, I have spent twenty hours working on the project. This is more than the amount of attention I asked my students to spend on the project in a whole sprint, based on the number of credit-hours they earned. There is no definitive ratio that says how much more productive an expert is than a novice; I've encountered claims ranging from five to twenty. If we take the conservative estimate for the sake of discussion, then twenty expert-hours correspond to roughly a hundred novice-hours, which at nine hours per week comes to about eleven weeks of work, or roughly 2/3 of a semester. If we further consider the productivity costs of interruptions—students having to manage four or five other courses, jobs, and relationships, whereas I'm working from home and still relying on my wife to wrangle the boys—then I've easily done more than I could expect a single student to do in an entire semester.

Putting it in this perspective, while I am frustrated to see some low-quality code in the project, I think it's important to still count the learning experience as a success. Some of these students have only been programming for two years, and that while enrolled as full-time students! The students certainly learned a lot about working on an interdisciplinary team, with all the joys and pains that come with leadership, accountability, trust, empathy, and professionalism. They got to build a real software system, large enough that they were able to watch it nearly collapse under the weight of their own bad decisions yet still have time to fix it. They saw the need for something better: for formal state-based analysis, for design patterns, for object-oriented and functional decomposition, for having high standards. They got to see how hard it is to actually make a game, and that tools like Unity3D only barely manage the complexity—there are no silver bullets. They worked in C#, which was a new experience for many of them, and they got a taste of how it is like Java while still getting to tinker with delegates and properties. Sounds like a win to me.

Personal conclusions

I took the better part of the day to write this reflection because I knew that by writing it, I would be able to better articulate what I learned from this experience. I have divided my conclusions into two sections. First, the ones that we all already know to be true but are good to be reminded of:
  • Given the opportunity to make something significant, students will rise to the challenge.
  • Having a dedicated space is critical.
  • Collocation is important, especially across disciplines.
  • Face-to-face communication is always preferable.
  • Many of the outcomes of immersive learning would go unarticulated without making the time for reflection.
Here are some more specific notes that I should keep in mind:
  • I should encourage future teams strongly toward one-page design documents, if for no other reason than to get them thinking about how best to communicate their designs.
  • I need to notice when I perceive an affordance that my students do not. When this happens, I need to point out not just what action can be taken, but how I recognized it, so that they can learn to see them as well. 
  • I need to be conscientious about modeling professional behavior so that my students can learn to think, design, and communicate as professionals. Scheduling and executing code reviews is a good place to start.
  • To improve the impact of reflections and retrospectives, I should find a way to encourage specificity and accountability.
  • It is good to keep all the team members in regular contact with the community, including clients, playtesters, and other stakeholders.
  • It is better if all team members share goals and motivations. More specifically, it's better if everyone is working for credit or everyone for pay, not a mixture of the two; mixing them leads to conflicts of priority within the team, particularly given the natural rhythm of the semester.
  • While I can lead a team in the production of someone else's design, I prefer leading teams of students through the holistic design process, from inspiration to finished product.

Current status

As of this writing, the game is in public beta and playable for free online. There is some work-in-progress art that comes up when you cross the Ohio River, but that should be replaced by the end of the day. A few team members have volunteered to do some QA over the break, and I believe all the defects have been ironed out.
The game's main title screen

Monday, December 17, 2012

Students' pride in their work: A CSEdWeek 2012 Post

Two weeks ago, I received an invitation from an undergraduate to come to the presentation of his independent study project. It turns out it was part of a series of presentations connected to a colleague's graph theory courses. Some of the presentations were connected with educational mobile applications, which piqued my curiosity enough that I decided to attend. I was one of three or four guests in the classroom, and the students demonstrated some decent technical prototypes. I provided a little bit of feedback from HCI and design perspectives.

The experience made me think about the good work that my students are doing. The following Friday, I related the story to my CS222 class, who were a day away from the deadline on their six-week team projects. The teams and projects are self-selected, and this semester, I had three teams, each of which created a video game. I told the students, honestly, that their work was more interesting, and I asked whether they would like me to invite the rest of the department to come to their final project presentations. They responded immediately and excitedly that we should, and so I posted the announcement on the department's Facebook page.

Last Monday were the final presentations, and we brought in about eight outsiders—not bad for a class of twelve students! My recollection is that there were three faculty/staff there and around five undergraduates, all of whom had taken CS222 in the past. This is significant since these student-attendees came in with realistic expectations, having gone through the six-week project in a similar format themselves.

The three presentations went well. I had given them a presentation evaluation rubric ahead of time, and this rubric emphasized three categories: the executable release itself, achievement of milestone #2 objectives, and software architecture. Each of the groups covered these categories quite clearly in their allotted fifteen minutes. Note that game design was explicitly not part of the course: I made it clear to the students at the outset that they were welcome to create games in their six-week project, but that I would only formally evaluate them as software artifacts and not for their game design qualities, since that was outside the syllabus. Still, most of the questions from the audience were related to game design, and the teams provided responses that showed a good general understanding of game design; this is good for me, since hopefully I can recruit these students into my game design and game programming courses in the future!

It is well known that many people fear public speaking, and my students are not exceptional in this regard. I think that in most cases, these students would not volunteer to speak in front of a room of students, faculty, and staff; yet, when presented with the opportunity, they jumped at it. This speaks to the power of providing students with motivating contexts for their work. By giving students the freedom to choose a project that they wanted to complete, not only did I get them to commit to the requisite technical and collaborative tasks—they also gained the important social experience of public presentation.

I know I'm two days late in my CSEdWeek pledge to write a related blog post, but I hope that this post helps point in a fruitful direction. I've been using student-directed projects in my teaching for quite a few years with success, but I don't always make the opportunity for public presentation. Seeing how my students learned from this experience, I am going to try harder in future CS222 offerings to open up the final presentations. By putting it here on my blog, I'm making sure you can hold me to it next semester.

Tuesday, November 27, 2012

Applying Burgun's Lens of Game Design

I recently read Keith Burgun's Game Design Theory, an intriguing manifesto on game design and philosophy. The author argues that reasoned discourse about games requires a shared vocabulary, and to this end, he offers the following hierarchy of interactive systems.
Burgun's hierarchy of interactive systems
(Taken from What Makes a Game?)
Burgun defines a Game as "a system of rules in which agents compete by making ambiguous decisions." More specifically, these decisions have to be endogenously meaningful in terms of the game mechanics. Whether or not one agrees with his definition is less important, in my opinion, than the fact that a vocabulary helps us move forward in the science of design.1


The hierarchy of interactive systems permits Burgun to explain what he's not talking about: because he wants to address game design, he can cut out problems of interactive system design, puzzle design, and contest design. Many of his game design recommendations echo what others have to say. However, it's Burgun's zealotry that makes his work so valuable: he makes fewer and stronger recommendations. In the introduction, he discusses how he sees his work in relation to other books on game design:
For those who might defend these books by saying that they're only giving readers wiggle room or that they're allowing readers to come to their own conclusions about what games are: readers do not explicitly need to be given permission to do this. Thinking persons will come to their own conclusions, regardless of whether they read something wishy-washy, or something pointed... (Introduction, page xx)
This sets the tone for the rest of his book. He is unapologetic about his philosophy of game design and leaves it to the reader to decide whether they agree or not. In fact, I don't think anyone who is serious about game design could read the book without being either uplifted or offended.

In an attempt to better understand Burgun's philosophy, I decided to apply his lens to some of my work and my students' projects. The following analyses assume familiarity with his philosophy, and while this is best presented in the book, there is an overview in the freely-available Gamasutra article, What Makes a Game?


Morgan's Raid

According to Burgun's lens, Morgan's Raid is a Puzzle2 because there is no randomness: a play experience can always be replicated by repeating a series of decisions. The goal of the puzzle is to maximize score, which is themed in the game as Morgan's reputation. The impact of a player's raiding decisions on reputation is not immediately clear: a player must choose all of his orders prior to seeing their combined effect, although Basil Duke does provide thematic hints.
Basil Duke informs the player that he will help explain the puzzle.
There must be a series of decisions that maximizes reputation, but no one on the development team knows what it is. From our observations, groups of players will gladly make it a contest to see who gets the highest score, although their interest in the game wanes well before anyone finds the optimal path.

It is interesting to note that the original Spring Team design for Morgan's Raid involved more interesting behavior for the Union troops chasing Morgan, behavior that would have made the project a Game. The original plan was for the Union's movement to be like Morgan's, making heuristic decisions in each town in an attempt to capture the player. However, working within our time constraints, we simplified the Union behavior so that the troops remain a fixed integer distance from Morgan. This distance is modified by the player's decisions, but in a fixed and predictable way.


Museum Assistant: Design an Exhibit

Museum Assistant is also a puzzle, albeit one with multiple solutions. Players get themed feedback based on the solution chosen; for example, creating an exhibit with African scientific artifacts from three different periods yields the generated exhibit title, "African Science through the Ages." The themes provide reason for players to try alternate paths, but from a mechanics point of view, one solution is as good as any other.


As with Morgan's Raid, Museum Assistant underwent a design change that resulted in its moving from Game to Puzzle on Burgun's hierarchy. Details of this design are described in my MeaningfulPlay paper, but to summarize, there were systems of input and output randomness that made it so the same series of game actions could produce different results. However, in the major redesign we agreed that we needed one good play experience, and that balancing the ambitious original design was outside of our scope. In terms of Burgun's hierarchy, the team decided to make a good Puzzle rather than a bad Game.


Equations Squared

While the previous two examples are student work, Equations Squared is my own, and it's certainly a Game. The player makes strategic decisions about placement of digits and operations, in terms of which to use and where to place them. Not all sequences are legal equations, and the scoring system rewards more complex equations. There is input randomness: the sequence of digits and operations you receive is different each time you play the game, so you very likely will never play the same game twice.



Auralboros

Auralboros is an experimental make-your-own-rhythm-game toy. You can make the experience as simple, as challenging, or as ridiculous as you want. To this end, Auralboros is simply an Interactive System.
Auralboros encourages players to make their own Contests out of matching keystrokes in rhythm. The system rewards such behavior with visual feedback. There are no ambiguous decisions: you either make and match rhythms or you don't. In fact, a successful strategy for seeing all the visual bells-and-whistles is to spam a single key—a useful debugging technique discovered by co-developer Ryan Thompson. However, this strategy is not much fun, as you end up just making a bad Contest.


EEClone

I still occasionally install and play Every Extend (though it seems the original download site is now gone), usually after explaining to students what an amazing experience it is. EEClone is my academic knockoff, designed to explore and teach how design patterns occur in game engine software.


Like its inspiration and namesake, EEClone is a Game. The timing and orientation of incoming obstacles is not known, and the player has to make meaningful and ambiguous decisions about maneuvering and the timing of explosions in order to succeed. Of all these analyses, this one is the simplest, but it also shows how things that are obviously Games fit nicely into the hierarchy.


Conclusions

Most of my effort the past few years has been on serious games. As I use the term, serious games are those that are designed to have a particular real-world impact on the player. For example, Museum Assistant is designed to encourage players to think about collecting and curating. Applying Burgun's lens to my students' projects gives rise to an intriguing contradiction: serious games need not be "games" at all. However, Museum Assistant is no less successful in meeting its design constraints for its being classified a Puzzle. This is because the real constraint for serious games is serious, not game. It is difficult to say whether or not these projects would better meet their goals (however one defines "better") if they were Games, because this would fundamentally change them. For example, we know that Morgan's Raid could be a better Game if the maps were randomized, but this violates the goal of familiarizing the player with actual Indiana geography.3

The Morgan's Raid and Museum Assistant teams recognized that there were opportunities to make a better game—or from a strict reading of Burgun, to make Systems and Puzzles into Games. Both teams eliminated randomness in the face of time constraints, knowing that balancing games would be much more time-consuming than testing puzzles. This was shared knowledge among the team, although they didn't have Burgun's concise language to communicate the sentiment. In a similar vein, the Auralboros team was aware that we weren't really making a game at all. It is interesting to note that Equations Squared is a Game by Burgun's hierarchy, but it is not a serious game by my own definition. For the player, it is simply supposed to be fun. The serious aspect of it is in the assessment of player's behavior, an assessment conducted by someone outside the magic circle but facilitated by score, badges, and demerits.

Applying Burgun's lens to these projects has helped me to understand his philosophy. However, since much of his philosophy is prescriptive, there is not much extrinsic value in applying the lens to completed projects. That is, I do not think I gained any new insight into these projects, but then again, as an academic, I've already studied them inside and out. I do look forward to having Burgun's philosophy in my utility belt for future design projects, particularly as a lens for identifying and discussing decisions that could alter a project's position in the hierarchy. Next semester, I will be leading an experimental six-credit interdisciplinary game design and development studio, and you can be sure I'll try to keep up my reflective practice here on the blog.


1 At MeaningfulPlay, I got into a bit of a debate with a gentleman over the definition of "fun." He argued that a friend's autobiographical game designed around the theme of depression and abuse, in which you decide whether or not to commit fantasized patricide, could not be fun. I said that if it was a game, and if I was using Koster's operational definition of fun as learning and mastery of a system, then it could be fun, though perhaps not in the informal "enjoyment" sense. My point was that if we defined these terms, we can be clear about our meaning and avoid the baggage. He didn't talk to me any more. I tell this story to demonstrate that I am surely in Burgun's design philosophy camp, and that there is a dangerous cultural divide in "game design" that prevents communication across traditions. See also Daniel Cook's excellent essay contrasting secular and mystical approaches to game design.

2 As a typographical convention, I will capitalize the layers of Burgun's hierarchy so as to distinguish the layer "Puzzle" from the general use of the word.

3 We are currently conducting an empirical study on the effectiveness of Morgan's Raid, and I will report the results here when we have them.

Tuesday, November 20, 2012

The PlayN Experience

In an online discussion, a student asked me the following with respect to Equations Squared:
How does PlayN compare to say Slick2D? Is development with it straightforward? Did you have any major issues along the way? Most of the example games listed don't even work in Firefox, but I noticed Equations Squared does. Was this extra work or are others just lazy? Answers to any of these questions would be greatly appreciated.
My summary of the design and development of Equations Squared mentions little about its technical architecture aside from the fact that I used PlayN. Briefly, PlayN is a game programming library that allows you to write games using a standard Java toolchain and then cross-compile them to several different platforms, including HTML5, Android, and iOS. It is based on the same fundamental technology as Google Web Toolkit, which permits development of AJAX Web applications from pure Java source; GWT is used to implement such powerful Web applications as GMail.

I got interested in PlayN because of the potential to build HTML5 applications while leveraging my knowledge of Java. I've done very little Javascript, and I know from personal and vicarious experience that browser-specific quirks are a special kind of hell. Yet, the browser is the common modern networked operating system, and I figured that a pure HTML5+JS solution would improve the chances of both longevity and adoption for a math assessment game. As much as I love standard Java and Unity3D, both require third-party plug-ins; my experience working with schools is that teachers who want to use such software often do not have permission or the knowledge to install and configure it. The fact that PlayN is based on GWT was an easy sell for me, as I've been telling students for years that GWT represents the state-of-the-art.

Getting started with PlayN requires you to dive into Apache Maven. It took me some time to wrap my head around Maven, but now that I've used it, it's hard to imagine life without it. In a nutshell, it manages project dependencies and build lifecycles. Previously, I was used to this sequence:

  1. Find a library I want to use in a project.
  2. Download the binaries and documentation jars.
  3. Put these into lib and doc folders in my project.
  4. Adjust the Eclipse build path and, if necessary, native library locations.
Now, with Maven, the process looks like this:
  1. Find a library I want to use in a project.
  2. Edit my POM file.
Nice. While I was working on my game, new versions of some of my dependencies were released. Getting the new versions into my project was literally as easy as changing a version number in the POM.
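
For the curious, a dependency entry in the POM is only a few lines of XML; the coordinates below are Guava's, and the version number is just illustrative:

<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>13.0.1</version> <!-- illustrative; upgrading means editing only this line -->
</dependency>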

It's not all milk and honey with Maven. The Eclipse integration allowed me to import my project without too much trouble, but I found the integration imperfect, and I ended up doing a combination of developing and compiling in Eclipse while building release candidates on the command line. Fortunately, you don't have to start by mastering all of Maven: the PlayN Getting Started Guide provides just enough to get you started. As with anything, follow those steps, and then take some time to look at what really happened with your project.

Perhaps the biggest conceptual hurdle to getting into PlayN is the rendering model. There is a complex system of layers, and I found myself having to refer to the documentation time and time again to ensure that I was using each properly. I ended up making extensive use of GroupLayers to handle the hierarchical nature of the game. For example, each of the dashboards is a group layer consisting of nested layers, and moving a digit from the right-hand side tray to the board means reparenting it from one layer to another. Once I got into the flow, it was no trouble at all, but I'm not confident I could sit down right now and whip up a demo without re-reading the documentation. Contrast Slick, for example, where I know I could whip up a little demo that makes a guy dance on the screen in almost no time; more on Slick later.
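
To give a flavor of the layer system, here is roughly what that reparenting looks like, sketched from memory against the PlayN 1.x API rather than copied from the game's source:

import static playn.core.PlayN.assets;
import static playn.core.PlayN.graphics;

import playn.core.GroupLayer;
import playn.core.Image;
import playn.core.ImageLayer;

// Inside the game's initialization code (a fragment, not a complete Game class):
GroupLayer tray = graphics().createGroupLayer();
GroupLayer board = graphics().createGroupLayer();
graphics().rootLayer().add(tray);
graphics().rootLayer().add(board);

Image digitImage = assets().getImage("images/digit.png"); // hypothetical asset path
ImageLayer digit = graphics().createImageLayer(digitImage);
tray.add(digit); // the digit starts out parented to the tray

// Later, when the player places the digit on the board:
tray.remove(digit);
board.add(digit); // now positioned in the board group's coordinate space
digit.setTranslation(120f, 40f);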

Screenshot of the game
I used the excellent TriplePlay, Pythagoras, and React libraries alongside PlayN, along with my old favorite, Guava. All the logic of the game was written using test-driven development, which saved me several times from introducing regression defects, particularly in the parsing of various kinds of arithmetic expressions.
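
The tests themselves were nothing exotic: plain JUnit assertions against the expression model. In spirit they looked something like this, though the class and method names are illustrative rather than the actual ones from Equations Squared:

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

// ExpressionParser and isValidEquation are stand-in names for this example.
public class ExpressionParserTest {

  @Test
  public void acceptsBalancedEquation() {
    assertTrue(ExpressionParser.parse("3+4=7").isValidEquation());
  }

  @Test
  public void rejectsUnbalancedEquation() {
    assertFalse(ExpressionParser.parse("3+4=8").isValidEquation());
  }
}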

While TriplePlay made handling text much easier, I did uncover cross-browser problems when using this library—some of which have been fixed since then. There are subtle problems with text alignment, which caused me trouble when the main tiles of my game were being dynamically generated. To avoid having to write crufty browser-specific code, I ended up replacing the dynamic tiles with static images from a sprite sheet. I'm guessing that the failure of some PlayN projects to work in Firefox is due to similar issues combined with a lack of QA. Once again, thanks to all my testers!

The sprite sheet that replaced dynamically-generated tiles
The student's question asks about a comparison of PlayN to Slick2D, the latter of which was used to develop Morgan's Raid. All the important technical aspects of Morgan's Raid could have been done in PlayN; in fact, some may have been easier if I had known about Maven at the time, since we used a build server with continuous integration and test-driven development. Neither Slick nor PlayN saves you from having to do some fiddling with project configurations and native libraries. Slick is almost certainly faster to get up and running, assuming you don't run into LWJGL version problems—which you almost certainly will at some point. Slick+Maven may be worth investigating.

The real decision between PlayN and Slick2D comes down to target platform and choosing where you want to fight with it. Slick will always require the client to have Java installed, or you'll have to heavily invest in making custom installers. Remember that installing Java—or anything—is seriously scary or impossible for some people.  With PlayN, you can deploy straight to HTML5+JS, but in my experience, you still need to do platform testing even if you do everything the documentation tells you: some things still won't work right. You can also build to recent versions of Android: I tried this and it worked well for me, but I didn't have time to design the game for the various mobile screen sizes. I cannot comment from personal experience on the Flash or iOS support in PlayN, though there seem to be some problems with the Flash compiler based on discussions I've seen on the mailing list.

I hope that this summary of my experience with PlayN, and the comparison to my Slick experience, is useful to you. Feel free to leave comments below and I'll do my best to get back to you.

Friday, November 9, 2012

Student-created badges for milestone presentation evaluation

The students in my advanced programming class have their Milestone 1 presentations coming up next Wednesday. It's a three week increment, and so most of their attention the last two weeks has been on their projects. I was going to publish a rubric for the Milestone 1 presentations, but then I realized that it would be far more interesting for the students to come up with the assessment plan. It turns out that all three of my teams are making simple games as their projects—a fluke that's not happened in this class before—and so I decided to dovetail on that and work with them to do achievement-based assessment.

Normally, I spend the few minutes before 1PM setting up my laptop. I knew I wouldn't need it today, so at 12:55, I was standing at the podium and ready to begin. All but one of the students were already there, so I started an informal discussion about why classes are so tightly tied to time. It was just a passing thought to me, the relationship of time-locked learning to the educational needs of the industrial revolution, when I realized that there was a path to achievements from here.

I challenged them to think about what education might look like if we weren't slaves to the clock. A student mentioned that he had a class where they covered a chapter per three-hour meeting, and if they got done early, they just went home. I noted that textbook-oriented learning is part of the same phenomenon, having emerged from a time when information was scarce and structured, but that this generation of undergraduates has lived through the transition to information's being abundant and unstructured. One of the students had been homeschooled in his earlier years, and he described how he had content-oriented tasks, and he could play whenever he was done with them. We agreed this was still basically the same as chapter-per-meeting, "content"-based design, but with more scheduling freedom.

When I challenged them to think about how learning was authenticated, a student mentioned portfolios as an alternative to transcripts, which I agreed was a good idea. Another mentioned that one way to demonstrate knowledge is to teach it. Aha! From there, I explained how there was a peer-to-peer connection here: the one teaching the material could verify that the learner had learned it, but the learner could also verify that the teacher knew it. How a third party would interpret such peer-oriented credentialing would depend on how much they trusted the person who signed it. That is, a network of trust supplants the hierarchy—an idea that was certainly inspired by my recent watching of Manuel Lima's RSA Animate talk.

This got us into a discussion of the Badges for Lifelong Learning initiative, as well as a brief overview of the idea of badge-creation as a form of reflective practice. This latter idea came directly from my discussions with Lucas Blair at MeaningfulPlay. I challenged the students to think of badges that we could use for the Milestone 1 presentations. There were some thoughtful looks but mostly confusion, so I led with an example, starting with the criteria, then polling for what to title it, and finally, plying my incredible chalk-art skills to make an icon—a sequence of development I encouraged them to follow.


I got the students into cross-team groups to come up with some of their own, inviting them to share their badges on the board. As they got started, I reminded them that I didn't care if one person gave the milestone presentation or if they all rotated speakers. Personally, I think speaker rotation is silly—and I told them this—but it seems someone has convinced them before coming into my class that speaker rotation is good.

Here's what we designed for achievements, in the sequence we discussed them.





Those icons are all student-created, except for the bald Mr. Clean guy. I can't believe they used a broom and not the iconic strong cross-armed bald guy. I was a little disappointed that one of the groups did not fully realize the value of icons, as I value the analogical thinking necessary to invent them. Note, for example, the visual pun in Home on the Range. (Assuming I'm interpreting that correctly.)

As they were designing these badges, a student asked if they could make "negative badges," and I told them to do so as they wished. When we got to talking about Train Wreck, I asked them what they thought about the negative badge. A student spoke up and said, "Well, it's still a learning experience." ¡Qué alegría! I told them that I could not have said it any better myself and we moved on.

When we got to Orator, I pointed out that it was significantly different from the rest, as it was competitive. They seemed to agree with me, though silently, that it was OK to have some competitive badges in addition to the ones anyone could earn.

With just a few minutes left in the session, I pointed out that what we had done for the last half an hour was exactly reflective practice, and that this was a major goal of the learning experience. I asked if they thought this set of badges would suffice for Wednesday, and they agreed. I told them I'd collate the badges and provide them with sheets for Wednesday's presentations, and that afterwards we would talk about the process.

As it turns out, I had an Ed.D. student observing that day as well. As we walked out of the room, he asked, "Is that normal? ... Are you normal?" I laughed and explained that certainly some days are better than others, but today, we had everybody deeply engaged, and it was a great meeting.

Sunday, October 21, 2012

MeaningfulPlay 2012, Day 3

(Day 1 | Day 2)

Saturday was the third and final day of the conference. The opening keynote was from John Ferrara, author of the recently-published Playful Design and creator of Fitter Critters, which took second place in the Apps for Healthy Kids challenge. He began with a discussion of the hype cycle and described how he believes serious games are nearing the peak of inflated expectations—a belief with which I tend to agree.

After a strong introduction in his talk, I was disappointed with Fitter Critters. I question the interface design, which was built upon showing detailed nutritional information to 8-12 year-olds. One of the features of which he was most proud was that players can scroll through the dozens of food choices and combine them to make new recipes, which, if healthy, can be sold for more than the cost of the ingredients. I have to wonder, can you make chocolate-covered broccoli?

Ferrara claimed that realistic data is a critical element for serious game design, but I remain unconvinced. A question from the audience hit the nail on the head (although it was asked somewhat awkwardly and I don't think Ferrara caught this interpretation). Why should we believe that putting detailed nutritional information into a serious game will make children learn to make healthy decisions when we know that putting detailed military information into first-person shooters does not teach kids to shoot each other?

This train of thought—and conversations with my colleagues—helped me to articulate one of the problems I was having with the conference: there was much more talk about design justification than about design. Put another way, there was a lot of hype around ideas, but not much discussion of what players are actually doing. I could point to some specific examples, but I'm going to wait on that until I have time to write proper critical analyses. (I'm still sitting on some design frustrations from the game showcase at FDG!)

I missed the start of the next session, in part because the intersession break was reduced to 15 minutes. As I was considering which session to jump into, I was fortunate to run into Casey O'Donnell, who had asked a question after my talk that stuck in my head. The more I thought about it, the more I was convinced that I did not fully understand his question. We ended up talking for some time about game design, research, and higher education. I am looking forward to reading his work in which he conducted ethnographic studies of a commercial game studio, since this will provide an excellent counterpart to my and Brian McNely's study of the VBC environment.

I expressed to O'Donnell how I was still trying to understand who these "game studies" people were and what they valued. He described the ecosystem as involving game studies, industry, and makers as three parts of a Venn diagram, and that MeaningfulPlay was positioned roughly in between. I wondered at the lack of references to some of my favorite designer/writers—Koster, Cook, Burgun, Schell—and he helped me to understand something about "game studies" that I would never have thought of: many people in game studies intentionally separate themselves from makers, as a form of removing bias. In my mind, as a maker, I had been mistakenly characterizing their approach as unaligned with design, but in fact, it might be that it's more orthogonal. I'm going to have to read more of the non-maker game studies literature to try to build empathy for them, since as I mention below, we seem to value vastly different things.

But first, a few words about Lucas Blair of Little Bird Games. His talk Friday about achievements was very interesting: he designed a study around measuring the effectiveness of achievements, in various combinations, on player performance and retention in a serious game about PTSD. He presented his original hypotheses, many of which turned out to be false, and then described how he dove further into the research to try to understand why. I love it when scholars admit that it's not a rosy ride each time: in fact, if you don't report mistakes, I suspect you're either not honest or not doing challenging enough work!

In any case, I had a fascinating discussion with Blair and his colleague, Danielle Chelles. He provided more context about his BadgeForge project, a fascinating part of the Badges for Lifelong Learning initiative that is devoted to user-created badges. Blair's background in instructional design was clearly an asset in this endeavor, and I found myself in deep agreement with his fundamental premise: that the best way to promote lifetime learning was to allow learners to craft their own badges. The margins are too small to include all the details of our conversation, but it gave me a lot to think about—both as a professor and as a parent—and I look forward to following up with the good folks at Little Bird.

The closing keynote was given by Michael John of EA, who has had a long career in the games industry and, notably for this conference, has recently transitioned into a leadership position in GLASS, an attempt to throw AAA-commercial-scale resources at the problems of serious games. His presentation was engaging, as he told his story of growing up with coin-op arcades, being a 3D level designer in the dawn of commodity 3D graphics, and then moving into a leadership position at EA.

After the keynote, there was a short closing ceremony, including the awarding of prizes from the games showcase on Thursday. Once again, I don't want to point fingers—at least not yet—but some of the judges' choices for winners and runners-up really astounded me. Some of them had such bad interfaces as to be nearly unplayable, and not because of the complexity of the game, but simply because of a failure of human-centric design. Others struck me as pretentious concept pieces more appropriate for Ludum Dare or Global Game Jam. (I should know; I made The Escape.) Yet others struck me as clearly leading to goals completely contrary to the designer's intention, although with a lot of glam and glitz, earned thumbs-up from the judges. On the positive side, at least TiltFactor was recognized for their good work. 

To be clear, I'll be the first person to point out the flaws in my students' designs as well. That's my job, after all. I wonder what the judges' job is, or what it is they value: concept? execution? packaging? To me, the only important thing for serious games is this: is it fit for the purpose? That is, does it work, does the play experience lead to the desired outcome? To answer that, one has to consider what the player does, since this is what drives the learning. That certainly syncs up with everything I know about the science of teaching and learning, anyway. To this end, one who judges a game without playing is not judging the game at all but a marketing pitch.

I had a good time at MeaningfulPlay, although it did not match my expectations. Most disappointingly, and beyond disciplinary differences, some of the research presented simply did not match the criteria for scholarship presented in Glassick et al.—one of my favorite books on higher education. It was good for me to learn about the values of the community, especially with respect to the self-identified "game studies" scholars. The best part of the conference was that I met some really amazing people from all over the country, kindred spirits in industry and academia whose work I will certainly watch. I had a good laugh with a fellow associate professor about the post-tenure self-discovery process, and it's good to find role models in this space of academic makers. But reflecting on which path I want to take for the next phase of my career is a topic for another day.

Friday, October 19, 2012

MeaningfulPlay 2012, Day 2

Hot on the heels of finishing my Day 1 notes, here's day 2.

The morning keynote by Phaedra Boinodiris was excellent. She talked about the history, design, and development of IBM's Innov8 series of games, and then went into more recent work with the military. These case studies provided her a platform to express her vision of serious games and meaningful play.

As I understand it, her vision is that in the near future, we will see more multiplayer networked serious games that are built upon real live data. She pointed out that the scientific models underlying the games are carefully designed but not intended to be perfect: rather, part of the game is recognizing when it is unrealistic, peeling back the interface layer, understanding the problems with the model, and thereby identifying new scientific questions. This is a powerful model, and I note that it relies upon computational thinking on the part of the player/analyst. That is, to get the most out of the game, one needs to understand how these systems are assembled and made to work together. This seems like a great opportunity for clever interface design, to empower the maximal number of players to interact most effectively with the game system.

Phaedra presented a five-step approach to serious game design that I found interesting. (Sorry, no photographs—the lighting was terrible.) She referred to these as a "five-step approach to saving a LOT of time and money." Here's her list and my notes.

  1. ROI. She mentioned more than once that, to sell the idea of serious games to any potential partner, one needs to focus on return-on-investment.
  2. Learning/Pain Points. That is, identify the problem to be solved through serious games. The ordering of these first two is the opposite of how I usually frame the discussion: I like to talk about the problem I'm solving, and then (sometimes) about how it might save some money. Maybe I'll try turning this around next time I pitch a project to an industrial connection.
  3. Puzzles/Experience to Teach and Motivate. If I understood her correctly, this is how she described the core design process: matching the game design to the problem to be solved. Like one of yesterday's presentations, she referred to avoiding "chocolate-covered broccoli," but I find myself feeling bad for broccoli. I guess I tend to root for the underdogs.
  4. Genre. 
  5. Platform. 'nuff said on these two.
After a break was the session in which I was speaking. 
The Green Room, about two minutes before the start of my presentation
The paper I presented is coauthored with Brian McNely, and it is titled, "A case study of a five-step design thinking process in educational museum game design." This is the second paper in a series based on my VBC seminar, the first having been published and presented at SIGDOC. This paper traced one thread of design from initial inspirations through to the finished Museum Assistant game. I was happy with the presentation and I think it was well received. I won't say anything else about it here, since the point of the post is to talk about the rest of my experience, but feel free to email if you'd like to know more about the work.

One of the talks after mine was given by Konstantin Mitgutsch, whom I had seen present at FDG. At MeaningfulPlay, he presented an analysis of several interviews he conducted with serious games designers about serious games. I found this work very interesting, and three specific points stand out in my memory. First, more than one of his interviewees identified Dungeons & Dragons as an inspirational serious game from their own youths. Second, there was significant variance in how these designers defined "serious game," despite their acclaim in this arena. Third, a vast majority of those he interviewed claimed to have no interest in formal assessment. I asked Konstantin later if he thought that this disregard for assessment was related to a conflation of "assessment" with "attempts at quantitative measurement," and he thought that was certainly part of it, as the designers generally expressed desires that their games "work" on the player.

I caught lunch with a group including some of the other speakers from my session. Three other gentlemen and I had an interesting conversation, much of it rooted in the challenges of games and politics, especially with respect to uncomfortable historical topics. We started a conversation about the problems of semantics in words like "fun," but I feel like that dropped off because people didn't want to touch it, which I think is a lost opportunity: I would like to see more rigorous treatment of syntax and semantics from the game studies perspective. Discrediting the word "fun" because it's used in widely different ways is not useful without a plan for representing the concepts associated with it, and I refer specifically here to Koster's use of "fun" to mean that particular kind of enjoyment that comes from learning—clearly distinct and distinguishable from silliness or glee. 

In the afternoon, I went to a session on mechanics, where the highlight for me was a paper on hybrid games that respect the human nature of play. Gifford Cheung described how digital implementations of games generally don't allow for much of what happens in a human play experience, such as modifying rules, allowing do-overs, etc. His particular project involved augmenting a smart phone with NFC and then using chip-enhanced playing cards. It looked a bit clunky, but I love the idea and was glad to hear about their philosophical approach. He used automatic bowling scoring systems as an example of a good design, in that you can go in and change a frame right away if something is inaccurate; you don't have to accept the error or wait until the end of the game. This was interesting to me, as I find these devices an abomination when the solution to this problem already exists: paper and math. Call me old fashioned, but to me, a big part of bowling is drawing on the scoreboard and filling in X's and slashes.

In the same session was an interesting paper about energy consumption challenges. I had not heard of these before: they are games in which buildings on college campuses compete to reduce their energy usage. The authors identified several problems with naive approaches to scoring such competitions. Not much else to say here except that it was interesting and well-presented, and you can find out more at the KukuiCup site.

In the next afternoon session, the best bit was Lucas Blair's paper investigating the impact of achievements, and specifically how they integrate with gameplay, on what players get out of the play experience. I respect a scholar who is happy to tell the audience that his hypotheses were wrong, because this can be as interesting as when they are right.

I was hoping to talk to Lucas about his taxonomy of games, since I was surprised to hear him refer to achievements as metagame despite their clear integration with the game design process and, particularly, their utility as a feedback mechanism. Instead, on a bit of a whim, I struck up a conversation with Scott Nicholson of Syracuse University. We ended up having a wide-ranging conversation, including a detailed discussion of what he's doing at Syracuse to get people involved in game design and development without creating a new academic program, while also involving the wider community. I'll definitely be following up with him.

We talked with Scott for so long that it was dinner time, so my colleagues and I went out to eat. After a bit of searching, I was able to find the place where I met some MSU colleagues a year or two ago for beer rhetoric—Beggar's Banquet, where I had an excellent burger and a smooth Left Hand Milk Stout.

I would be remiss if I did not say that I started the day with coffee and Wi-Fi at COSI, a delicious soup and sandwich for lunch at COSI, and after dinner, tea and Wi-Fi at COSI. The staff have been very friendly, and the music is perfectly unoffensive and unobtrusive to both conversation and blogging. Thanks, COSI!

MeaningfulPlay 2012 Day 1

I am writing from the MeaningfulPlay 2012 conference, a biennial conference held at Michigan State University. I have two travelling companions: Nick, an undergraduate who was on my VBC team, and Michael, a graduate student and president of the BSU Society for Game Design and Development.

The opening keynote was by Donald Brinkman of Microsoft Research. It was an interesting talk, although there was not much new there for me: it was an excellent presentation of ideas that I think are important for anyone in serious games to know, with a focus on his participation in OpenBadges. I have been on the fence regarding badges, originally dismissing them as extrinsic motivators, but since then I've realized that they're a tool that can be used for good or evil.

Donald Brinkman discussing the continuum of gamification from Bogostian to McGonigalian.
The best point Brinkman made was that badges give us data for longitudinal study. If a kid has an awesome learning experience, he should feel intrinsic satisfaction in the process; giving a badge simultaneously allows us to track how this particular person uses (or doesn't use) this learning in the future. Since our current system only tracks numbers, which we know don't actually represent learning, this gives a powerful new vector for information processing at Internet scale.

He also included an excellent distinction between education and training, a presentation of the contrast that I had not previously seen. It comes from James Carse's Finite and Infinite Games.


To be prepared against surprise is to be trained.
To be prepared for surprise is to be educated.
Education discovers an increasing richness in the past, because it sees what is unfinished there.
Training regards the past as finished and the future as to be finished.
Education leads to a continuing self-discovery; training leads toward a final self-definition.
Training repeats a completed past in the future. Education continues an unfinished past into the future.
Along these lines, he raised the evocative question of "nomic badges" for the infinite learning game. His own guidelines for infinite games are that (a) participation has to be optional, (b) failure is certain, and (c) progress is angular. I certainly feel like Brinkman is a kindred spirit, but I did not have the opportunity to find him later in the day. I got the impression he was only going to be around for the first day of the conference, but I'll keep my eyes open. I would like to talk to him particularly about the plan for dealing with the overwhelming bureaucracy of public education.

Good quotation: "... High stakes assessment. Get the hell away from it."

Turns out that Brinkman has also been fighting the good fight against tchotchkes and has been pushing Microsoft toward more ecologically-responsible solutions. He gave away a dozen epiphytes to those who asked questions, complete with Microsoft-branded misters. In fact, Nick got one, shown below.



After the keynote, I went to a panel organized by the people who write the PlayThePast blog. It was good to see these people in person, as I am a regular reader and even a contributor. I was most excited to see Ethan Watrall, whom I had met before, and Roger Travis, whom I had never met. It turns out Ethan was double-booked and couldn't make the panel, and Roger was being beamed in via Skype. Still, I got to chat with Roger a bit after the presentation by talking loudly and awkwardly into a computer screen, which was better than nothing. He had some interesting ideas on how to tie classics and humanities into a nascent paper I'm writing on games and software development... stay tuned.

The panel itself had interesting content, but it was too much "panel talks to you" and not enough "panel talks with you." There were only five minutes for questions at the end, at which point it may as well have been an ad hoc paper session. Still, I liked the content, especially James' deconstruction of both published and fan-modded versions of historical board games. I need to think more about what this means from a designer's and academic's perspective.

After the panel and a not-quick-but-cheap lunch at a local Thai restaurant, I went to a paper session, which was OK. As with FDG, I was disappointed by the number of bullet points on slides. Maybe even worse, several presenters had complex data tables that they clicked through in under three seconds. It displays a disregard for the attendee, a failure to consider the needs of the observer. I hope that in my presentation, despite my 35 slides in 15 minutes (to be given on day 2), people will be inspired to get more details from the paper. I was also disappointed in a particular presentation in which the presenter showed non-statistically-significant results, pointed out they were not significant, mentioned that they should not be presented because of the lack of significance, then referred to them again later, and brought them up in the conclusions, again pointing out that they weren't significant. Sounds like fishing to me.

The papers are given in 15 minute blocks, four of them in an hour session. This is a crazy pace. Questions are held until the end, at which point there's no time left. Malcolm Ryan was on a roll when he ran out of time, but I'm glad he took the extra minute to click through to his conclusions slide:

Well put.
The day's closing keynote was from Ann DeMarle, who talked about her project, BreakAway, a game ostensibly about preventing violence against women by changing the attitudes of youth. The game had an impossible set of constraints, among which were that it could not depict violence, it could not depict women as victims, and it had to be playable by anybody. For the narrative, they picked the universal language of soccer, which is really quite clever. DeMarle had a compelling story to tell, but we did not really get to see the core gameplay, and so I'm left wondering how the mechanics are tied to the learning objectives.

After a dinner break, there was a reception with a game showcase and research poster session. This is the session where my students presented Morgan's Raid and Museum Assistant. We got positive feedback from both, and once again, I felt great pride in my students and their work.

Nick is sporting a Root Beer Float Studio shirt, alternative design.
Michael is a pro at contextualizing the game in the history.
Also, there was an open bar. Note to myself: all conferences should have open bars.
A beer I drank, and a unicorn head I did not wear.
Also also, there was garlic & rosemary olive oil and dipping bread. Note to myself: not mandatory for all conferences, but highly recommended.

I spoke to a lot of interesting people, but the highlight was definitely talking to Geoff Kaufman from the TiltFactor lab at Dartmouth. He had several games on his tables; the first one I saw was Buffalo, a game that looks superficially like Apples to Apples and prompted me to ask whether he was familiar with Bill Rapaport's famous sentence. (He was not.)

Geoff mentioned that the game is designed to reduce stereotypes. Now, there are a lot of people here making claims that are way beyond what's reasonable—especially at the showcase. I asked if they had studies that showed that the game worked, and to my pleasant surprise, they sure did. Turns out Geoff is a postdoc and actually runs these studies. We had a great conversation about how his lab operates, and he was eager to answer questions about all the games he brought. Long story short, I was blown away by how absolutely right these guys are doing it. By contrast, many of the other projects I saw clearly had spent more time on design justification than on design; maybe I'll write about them another time, but not now. Anyway, kudos to TiltFactor.

Monday, October 1, 2012

The Story of Equations Squared

It was the middle of May when I first wrote about the ETS Math Assessment Game Challenge and my search for fun games that use traditional mathematical notation. At the time, I was contemplating which of several projects to undertake for my Summer of Professional Development. I had decided the previous Fall semester that I would not take on any extra obligations for Summer 2012, using it instead as a "summer sabbatical," knocking a few items off of my personal and professional punch lists.

As I considered where to invest my Summer efforts, I found my mind returning to the design of a math assessment game. I have been working with students and colleagues at Ball State for the last few years on educational games—that is, games that produce predictable learning outcomes—but I had not considered the challenges of making games for assessment.

There were two ways in which the contest sponsors made this more appealing to consider. First, there were two published learning progressions, based on sound educational research, so I knew I wouldn't have to try to invent a taxonomy of learning. Given the progressions, I framed the design problem as one of mapping the learning progression levels back to measurable in-game behaviors. The second factor was the appealing prizes: a cash prize for first place, and a trip for the top three entrants to talk about games and education research with ETS staff. Cash is a powerful motivator when one has turned down work offers in order to focus on personal growth for the summer months, and professional networking is always in season.

I decided to pursue an entry to the challenge, figuring it would take about two weeks to generate a testable digital prototype. Spoiler alert: I had forgotten to take into account Hofstadter's Law.

First Attempt
I had been playing a lot of Flash Duel, and this inspired my first attempt. I worked out a game that embraced the board-as-number-line property of Flash Duel and Engarde, giving the player a hand of cards representing both numbers and numeric operations (represented by the Jack above). I wanted to take advantage of two-player competitive gameplay, perhaps even using combat as a metaphor, and this would also have allowed me to hit another item on my punchlist: make an online multiplayer game. However, I had a hard time making this prototype fun. I decided to put my Lego away for now and switch tactics.

Second Attempt
My wife had found out about the game Equate, which appears to marry Scrabble and algebra. I still have not played it, but I liked the idea of reinforcing well-formed mathematical "sentences" in the same way that Scrabble requires (and assesses) knowledge of well-formed English words. I became particularly interested in scoring systems that reflect increasingly mature understandings of algebraic relationships, particularly with respect to variables. I spent enough time with this prototype to try a digital version. This prototype had a nice feature: it could be built as a single-player game in case there were problems with multiplayer.

I had recently become aware of the PlayN library for making cross-platform games. I love the technological wizardry of GWT, which allows you to write your application in one language and deploy to many, hiding the seamy side of cross-browser Web development; PlayN is essentially that, for games. Working with PlayN required me to learn Maven, which had also been on my imaginary punch list, though very far down. After having invested significant time into understanding how Maven works, it's hard to imagine doing another serious project without it, despite the headaches. Learning a new API and project automation platform took time and energy away from core game design and development tasks, but I have no regrets along these lines.

As I tinkered with technology, I also considered how to align learning outcomes with in-game actions. Achievements seemed a natural way to do this, and I developed several different achievement taxonomies over the subsequent weeks and months. The learning progressions necessarily combine evidence of learning and evidence of ignorance, and to capture this, I used a system of demerits and badges in my game. The complete details for these are provided on the "For Educators" section of the game's Web site.
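For the curious, here is a minimal sketch of what I mean, in plain Java. This is not the shipped code: the class names, rule names, and rules themselves are hypothetical, but they show the general shape of turning one observed in-game action into badges and demerits.

import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: accumulate badges and demerits from observed play.
public class AchievementTracker {

    /** What the game observed about a single placed equation. */
    public static class Observation {
        public final boolean wellFormed;
        public final boolean usesVariable;
        public Observation(boolean wellFormed, boolean usesVariable) {
            this.wellFormed = wellFormed;
            this.usesVariable = usesVariable;
        }
    }

    private final List<String> badges = new ArrayList<String>();
    private final List<String> demerits = new ArrayList<String>();

    /** Evaluate one observation against some simple evidence rules. */
    public void record(Observation o) {
        if (!o.wellFormed) {
            demerits.add("Ill-formed equation");                 // evidence of a misconception
        } else if (o.usesVariable) {
            badges.add("Well-formed equation with a variable");  // evidence of a higher level
        } else {
            badges.add("Well-formed arithmetic equation");       // evidence of a lower level
        }
    }

    public List<String> badges()   { return badges; }
    public List<String> demerits() { return demerits; }
}

Roughly speaking, each placed equation becomes one observation, and the accumulated evidence is what rolls up into the demerit and badge summaries described on the "For Educators" page.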

Implementing the game was an opportunity to practice what I preach. I started with my domain model, developing it via TDD. Very quickly, I hit my first major impediment: in order to implement the achievements I had sketched, I would need to write my own expression parser. I started with an ad hoc approach that worked for simple, non-variable cases, and over time, I revised this to use the Shunting-yard algorithm to build an in-memory parse tree. This parse tree is traversed by various visitors to determine progress towards demerits and badges.

Debugging the parse tree
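Here is a condensed, hypothetical sketch of the approach in plain Java: single-character tokens, no parentheses and no variables, just enough to show the shunting-yard construction of a tree and a visitor that walks it. The production parser handles much more (variables, multi-digit numbers, equality), so treat this as an illustration rather than the real thing.

import java.util.ArrayDeque;
import java.util.Deque;

public class TinyParser {

    public interface Node { <T> T accept(Visitor<T> v); }

    public interface Visitor<T> { T visit(Num n); T visit(Op o); }

    public static class Num implements Node {
        public final int value;
        Num(int value) { this.value = value; }
        public <T> T accept(Visitor<T> v) { return v.visit(this); }
    }

    public static class Op implements Node {
        public final char symbol;
        public final Node left, right;
        Op(char symbol, Node left, Node right) {
            this.symbol = symbol; this.left = left; this.right = right;
        }
        public <T> T accept(Visitor<T> v) { return v.visit(this); }
    }

    private static int precedence(char op) { return (op == '+' || op == '-') ? 1 : 2; }

    /** Parse expressions like "1+2*3": single-digit numbers and +, -, *, / only. */
    public static Node parse(String expr) {
        Deque<Node> operands = new ArrayDeque<Node>();
        Deque<Character> operators = new ArrayDeque<Character>();
        for (char c : expr.toCharArray()) {
            if (Character.isDigit(c)) {
                operands.push(new Num(c - '0'));
            } else {
                // Reduce any pending operators of equal or higher precedence first.
                while (!operators.isEmpty() && precedence(operators.peek()) >= precedence(c)) {
                    reduce(operands, operators);
                }
                operators.push(c);
            }
        }
        while (!operators.isEmpty()) {
            reduce(operands, operators);
        }
        return operands.pop();
    }

    /** Pop one operator and two operands, push the resulting subtree. */
    private static void reduce(Deque<Node> operands, Deque<Character> operators) {
        Node right = operands.pop();
        Node left = operands.pop();
        operands.push(new Op(operators.pop(), left, right));
    }

    /** Example visitor: evaluate the tree. */
    public static class Evaluator implements Visitor<Integer> {
        public Integer visit(Num n) { return n.value; }
        public Integer visit(Op o) {
            int l = o.left.accept(this), r = o.right.accept(this);
            switch (o.symbol) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                default:  return l / r;
            }
        }
    }
}

Parsing "1+2*3" with this sketch and handing the tree to the Evaluator yields 7; in the game, the visitors look for evidence of demerits and badges rather than computing values.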
Without unit tests to save me from regression defects, I would certainly not have completed the project. Summer 2012 was supposed to be Summer Sabbatical Happy Learning Time, but it turned out to be Summer of Unexpected Interruptions. The good news is that we discovered that our master bathroom floor was rotting away before anyone fell through it. The bad news is that we had contractors in and around the house for about seven weeks to fix it, along with other planned maintenance and improvement projects. Combined with a death in the family and associated travel, it was not the focused and productive three months I hoped it would be. We got to spend some quality time with loved ones, which really was more important than implementing a multiplayer mode for a summer project. In any case, keeping a written task list and a suite of passing unit tests allowed me to leave the project for over a week, then drop back into productive development very quickly. This worked much better than when, in the ignorance of youth, I tried to keep project plans entirely in my head—projects which never saw the light of day.
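As a token example of the kind of regression guard I mean, here is a JUnit test written against the hypothetical parser sketched above. The real suite exercises the actual domain model, but the idea is the same: any refactoring that breaks precedence handling fails immediately.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class TinyParserTest {

    @Test
    public void multiplicationBindsTighterThanAddition() {
        TinyParser.Node tree = TinyParser.parse("1+2*3");
        int result = tree.accept(new TinyParser.Evaluator());
        assertEquals(7, result); // would be 9 if precedence handling regressed
    }
}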

Unit tests cannot save one from bad user-interface design decisions. My first fully-playable prototype used a "drag and drop" motif for placing tiles on the board. As I manually tested each build, I was able to get the feature working properly, and I was quite pleased with myself. Then, I tried playing through a game or two. It is really tedious to drag and drop tiles with the mouse. It wasn't quite as bad on the Android tablet, but it was still awkward, especially because mine is pretty clunky and unresponsive. I decided to revise the user interface to use a click-select, click-place model. The result was a much improved user experience. Unfortunately, the drag-and-drop assumption had been buried very deeply into my software architecture, and I had to rewrite nearly all of the user-interface system.
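For illustration, the click-select, click-place model boils down to a tiny bit of state. The sketch below uses hypothetical names and leaves out the PlayN event wiring, but it captures the interaction: the first click remembers a tile, the second click places it.

public class TilePlacementController {

    public interface Rack  { boolean contains(String tile); void remove(String tile); }

    public interface Board { boolean isEmpty(int row, int col); void place(int row, int col, String tile); }

    private final Rack rack;
    private final Board board;
    private String selected; // null means nothing is currently selected

    public TilePlacementController(Rack rack, Board board) {
        this.rack = rack;
        this.board = board;
    }

    /** First click: pick up a tile from the player's rack. */
    public void rackClicked(String tile) {
        if (rack.contains(tile)) {
            selected = tile;
        }
    }

    /** Second click: drop the selected tile onto an empty board cell. */
    public void boardClicked(int row, int col) {
        if (selected != null && board.isEmpty(row, col)) {
            board.place(row, col, selected);
            rack.remove(selected);
            selected = null;
        }
    }
}

The nice thing about this shape is that the controller knows nothing about rendering, so swapping interaction models does not have to ripple through the drawing code, which is exactly the property my drag-and-drop version lacked.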

Screenshot of the final version

The working title for the game was Algebra Game. Catchy, I know. Brainstorming with my wife helped me to realize I really wanted to highlight the fact that the game is about equations. Since the game takes place on squares, I ended up with the slightly punny Equations Squared. I hope no math zealots are upset that one cannot actually create quadratic equations in the game.

I had decided from early in the project that even though PlayN supported many different platforms, I would focus on the HTML target. Everybody has a browser, so I hoped this would give the biggest impact; I'm still interested in how the game can be adjusted for Android and iOS, but right now, it's HTML-only. PlayN builds upon the GWT compiler technology, and it has similar dependencies on browser implementations of CSS. Web developers: you can see where this is going. Alignment of symbols and handling of canvas were predictable on Chromium-based browsers and Firefox, but IE had problems. For the alignment of text on tiles, I ended up replacing the dynamically-created tiles with an image sheet, which is all but guaranteed to work on all modern browsers. After my project was submitted to the challenge, I got a very helpful fix for other IE problems, but I decided to leave the code alone during judging; I ended up putting up a little warning message regarding IE not being fully supported, which hopefully does not drive too many people away.

Example boards
Late in the Summer, I started work on the Website for the game. I knew I wanted to host it on my departmental Web server because it was easy to integrate secure file transfer into an automated build process; this could have been done with other hosts, but I already had SSH set up for these machines. After a bit of fumbling around, I came across JQueryUI, which was a joy to use. There are a few niggling details that I wanted to fix on the site, but I'm happy with the result. The best part of it is under the hood. I wrote an internal domain-specific language in JavaScript to abstract the construction of sample boards. For example, that first board is created with the following call:

new Board(5).horizontal(0,2,'1+1=2').caption('Horizontal')

This allowed me to separate the construction of sample boards from the code that creates the table of CSS-colored content. With a little more TLC, I could have factored out the first two arguments that specify the starting coordinate of the equation, yielding a cleaner DSL, but despite this, I'm pretty happy with the approach. "If it can be automated, it should be automated!" In case you're curious, this is similar to, but not exactly the same as, the Java-based DSL I use in the production code to create and test game board configurations.
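I won't reproduce the production classes here, but a simplified stand-in for that Java-based DSL might look like the following; the names are illustrative rather than the actual production code.

import java.util.HashMap;
import java.util.Map;

// Illustrative fluent builder for laying out board fixtures in tests.
public class BoardFixture {

    private final int size;
    private final Map<String, Character> cells = new HashMap<String, Character>();

    public BoardFixture(int size) { this.size = size; }

    /** Lay an equation out left-to-right starting at (row, col). */
    public BoardFixture horizontal(int row, int col, String equation) {
        for (int i = 0; i < equation.length(); i++) {
            cells.put(key(row, col + i), equation.charAt(i));
        }
        return this;
    }

    /** Lay an equation out top-to-bottom starting at (row, col). */
    public BoardFixture vertical(int row, int col, String equation) {
        for (int i = 0; i < equation.length(); i++) {
            cells.put(key(row + i, col), equation.charAt(i));
        }
        return this;
    }

    public Character at(int row, int col) { return cells.get(key(row, col)); }

    public int size() { return size; }

    private static String key(int row, int col) { return row + "," + col; }
}

A test can then build a board with new BoardFixture(5).horizontal(0, 2, "1+1=2") and make assertions against the resulting cells, which mirrors the JavaScript call shown above.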

Here's a little quantitative breakdown. By the end of the project, I had 710 changesets in my Mercurial repository. According to SLOCCount, I ended up with 7,086 lines of platform-independent production code and 2,559 lines of unit tests, plus about 200 more to handle platform-specific idiosyncrasies. (A brief aside: I had never used SLOCCount before, but it installed easily from the Ubuntu repositories. It claims the COCOMO total estimated cost to develop this software was $293,470. I guess I work cheap.)

A few days before the submission deadline, I put together an introductory video, as required for the competition. I spent hours searching for and tinkering with video editing software, but nothing seemed to work the way I wanted. I ended up doing this with RecordMyDesktop and a hand-held mic. This was about the thirtieth take.


The project took much longer than planned, but I did end up with a complete submission. I am happy with both the results and the process. I hope that the results are useful to others as well, and if you do end up using Equations Squared in a learning environment, please do let me know. I tell my students regularly that shipping is hard. It's easy to have a few ideas and write some throwaway code; it's a different matter altogether to actually build something of value. There are several ways in which I've considered extending the project, including multiplayer modes and mobile-native versions. Whether any of these are undertaken really depends on demand.

A big and public Thank you! to my alpha and beta testers. They provided invaluable feedback on platform issues, usability, and defect detection. Finally, thank you to my wife and kids for indulging my desire to spend the summer creating.

Author's Note: At the time of publication, there are less than 2.5 hours until the winner of the ETS Math Assessment Game Challenge is announced. In the spirit of unbiased personal reflection, I wanted to get this post up before the winners are announced. There are a few more details I was hoping to write about, such as the use of Google Web Fonts to do some nifty dynamic font loading, but I will delay that for another day.

Update: Good news, everyone! Equations Squared won the grand prize! The contest entry page now has a pretty gold ribbon proclaiming the news.