Wednesday, January 2, 2013

Fog of War: An obfuscated defect

One of the major technical contributions I made at the end of the Underground Railroad project was to fix the fog of war feature. The designer's intent was that a player would only see the current county and one county away; if the player was at a depot of the Underground Railroad, they could see one farther. The team had already implemented a map and an invisible overlay to handle movement. The overlay had the same topography as the map that the player sees, but it was color-coded. When the player clicks on the map, we look up the corresponding coordinates in the overlay to determine the county, as well as whether the countryside or the county seat was selected. The same system drives the popup hints: look up the mouse position in the overlay, and if it's a legal target, inform the player.

The theory behind the fog of war, then, was simple. We added an opaque layer on top of the map, which I will call the fog layer.1 When the player's position changes, first determine which counties should be shown. Then, look for those counties in the color-coded overlay. Because the invisible overlay is the same shape as the actual map, turn the corresponding pixels transparent in the fog layer. Presto! We've cut out exactly the right shape in the fog layer to see only the counties that should be shown. Because the map was so big, the team came up with some appropriate optimizations, restricting the search for matching colors to those areas on or near the edges of the current camera view.
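
To make the pixel pass concrete, here is a minimal sketch of the reveal step, assuming both textures are readable, the same size, and that the color keys compare exactly. All of the names here are hypothetical illustrations, not the project's actual code.

using System.Collections.Generic;
using UnityEngine;

public static class FogOfWarReveal
{
    // Make transparent every fog pixel whose overlay pixel matches
    // the color key of a county that should currently be visible.
    public static void Reveal (Texture2D overlay, Texture2D fog,
                               HashSet<Color32> visibleCountyKeys)
    {
        Color32[] overlayPixels = overlay.GetPixels32 ();
        Color32[] fogPixels = fog.GetPixels32 ();
        for (int i = 0; i < overlayPixels.Length; i++)
        {
            if (visibleCountyKeys.Contains (overlayPixels[i]))
                fogPixels[i].a = 0; // punch a transparent hole in the fog
        }
        fog.SetPixels32 (fogPixels);
        fog.Apply ();
    }
}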

Starting the game in Jackson County, KY. The player only sees one county away.
(Freedom is North, so we don't let the player go deeper into the South.)
The team created a very nice domain model of the game counties, assembled using the Builder pattern in a fluent internal DSL. A representative line looks like this:

Add(Make ()
  .WithID (CountyID.DelawareIN)
  .Coordinates (40.211893f, -85.396077f)
  .SetDepot (1)
  .IsOnRiver ()
  .NorthernIndiana ())
.WithColor (35, 50, 0);

This is adding a new county to the registry. In particular, this is Delaware County, whose county seat has particular GPS coordinates, and which has one depot, is on a river, and is in the northern half of Indiana. The last part constructs the color key in the county registry. Note that the actual visual color is arbitrary, since the player never sees it: all we're doing here is saying that the pixels colored as the RGB triple <35,50,0> correspond to Delaware County.

Here's where things get interesting. Unity3D has two color classes: Color and Color32. The former represents colors as four-dimensional floating-point vectors in the range [0,1], while the latter uses integer vectors in the range [0,255]. Each class has a directly corresponding constructor:

static function Color (r : float, g : float, b : float, a : float) : Color

static function Color32 (r : byte, g : byte, b : byte, a : byte) : Color32

Also, both classes support implicit conversion to the other. This means that if you have an object of one type but need the other, it will automatically do the sensible conversion—or the most sensible conversion it can.

I remember the students who worked on the domain model had some trouble understanding these different classes very early in the semester, but then I assumed that everything was taken care of. However, in trying to track down defects in the fog of war feature, I saw some very strange color processing code. As I poked around the code, I noticed that both Color and Color32 were being used in different places. Wherever an implicit conversion should have sufficed, the same piece of bit-fiddling code appeared instead. That certainly shouldn't be necessary, since there's no way the conversion API was broken. After some exploration, I tracked it all back to the county registry builder. Here's the original implementation of WithColor:

public Builder WithColor (float r, float g, float b)
{
    Color key = new Color (r, g, b, 0);
    _county.Color = key;
    _builder.map.Add (key, _county);
    return _builder;
}

The astute reader may already be able to identify the error. Spoilers ahead.

The call to WithColor sends in a triple of integers—35, 50, and 0 in my original example. These int values are silently converted into float, since that's what the method expects. Then, these floating-point values are passed to Color, which also expects float arguments. Remember how Color is defined? It's a four-dimensional vector of floating-point values in the range [0,1]. Yet, the Color class happily accepts the 35, 50, and 0. If you inspect this Color object or ask it to print itself, you find out that it's <35.0,50.0,0.0>, as you might expect. Then, if you convert it to a Color32, that object is <255,255,0>.
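
You can verify the behavior with a few lines in any script; this is a sketch, and ColorConversionDemo is a hypothetical name:

using UnityEngine;

public class ColorConversionDemo : MonoBehaviour
{
    void Start ()
    {
        // Color expects channels in [0,1], but it happily accepts anything.
        Color key = new Color (35f, 50f, 0f, 0f);
        Debug.Log (key);   // RGBA(35.000, 50.000, 0.000, 0.000)

        // The implicit conversion clamps each channel to [0,1] before
        // scaling to [0,255], so 35 and 50 both come out as 255.
        Color32 key32 = key;
        Debug.Log (key32); // RGBA(255, 255, 0, 0)
    }
}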

This also makes sense, if you think about it for a while. The Color class only really cares about values in the range [0,1], and it interprets anything above that range as being equivalent to one. So, if you drew the Color that is <35.0,50.0,0.0>, you would get bright yellow, not dull reddish green. If my students ever fully understood this, they certainly didn't articulate it or move to fix it: instead, they developed a kludge to pull the Color values out as floats and pack those back into a Color32 manually.
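
The root cause, by contrast, admits a one-line repair. Here is one possibility, a sketch that keeps the builder's signature but normalizes the incoming [0,255] channels into the range that Color actually expects:

public Builder WithColor (float r, float g, float b)
{
    // Callers pass channels in [0,255]; Color wants [0,1].
    Color key = new Color (r / 255f, g / 255f, b / 255f, 0);
    _county.Color = key;
    _builder.map.Add (key, _county);
    return _builder;
}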

There are several lessons to this story. Here's what I got out of it:
  1. Read the API docs. There's no good reason to send floating-point values like 35.0 to the Color constructor.
  2. Make sure your API produces appropriate warnings when being used in such a weird way. There could have been a warning either upon sending the out-of-range values to Color or when they were used in the automatic conversion.
  3. Don't use primitive types when they are not semantically appropriate. Color should not actually take floats as arguments when what it really wants is values in the range [0,1]. So, make a class to represent that concept, and use that (see the sketch after this list). Before you complain that this will impact performance, remember that premature optimization is the root of all evil.
  4. Refactor. The bizarre color-handling kludge showed up in at least two places, clearly the result of copy-paste coding. Had the original developer stopped and refactored this away, he would have at least made a better abstraction for handling the problem. In the best case, he would have recognized the problem and fixed the root cause.
  5. Automatic conversion is awful. You might think you're improving the readability of your code, but only if you and the reader share the same mental model and expectations. Better to make it explicit. To me, it's similar to the argument for preferring static factory methods over explicit constructor calls: good naming can reveal your intention.
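For lesson 3, here is a sketch of what such a type might look like; UnitInterval is a hypothetical name, not anything from the project:

public struct UnitInterval
{
    public readonly float Value;

    public UnitInterval (float value)
    {
        // Fail fast on the domain error instead of clamping silently.
        if (value < 0f || value > 1f)
            throw new System.ArgumentOutOfRangeException (
                "value", "Expected a value in [0,1] but got " + value);
        Value = value;
    }
}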
By the way, The Underground Railroad in the Ohio River Valley is now open to the public. Enjoy!



1 I was surprised that the students didn't understand "fog of war" as a metaphor. The original opaque image looked like literal fog, and at first I couldn't understand why they had chosen it. I had used a technical term from game design, thinking it was common vocabulary, and this led to behavior from the team that I did not expect. Usually when this happens to me, it's computing jargon, not game design jargon!

Tuesday, January 1, 2013

Reflecting on the Fall 2012 Game Design Colloquium: Expectations and Reality

Background

In Fall 2012, Ronald Morris and I team-taught an honors colloquium about serious game design. At the start of the semester, I wrote about the achievement-based grading system we intended to use and how it fits into a larger academic-year game production project. To make a long story short, we had a team of students from a variety of majors, and each was expected to make an educational non-digital game for the Indiana State Museum. I took today to compose my reflection on the course, which is now shared with you in four parts: my expectations when we designed the course; what actually happened when we ran the course; insights gleaned from the final exam; and what I might do differently next time.

This is my third attempt at composing this reflection. The first was a collection of notes and observations that were too raw to share. The second was a narrative that attempted to map expectations, execution, and lessons learned into nice triads. This ended up being a mess, as the relationships are much richer than simple one-to-one mappings. I mention this because it strikes me as significant to my reflective practice: in taking the time to write carefully, I have been able to understand the semester a bit better and make peace with some things that had been bothering me.

Expectations

Students would enroll because they were interested either in the topic or in immersive learning experiences generally. By explaining all of the expectations in the course description, including an explicit statement regarding nine hours of attention per week, students would know that they needed to take the course seriously and commit themselves appropriately. Achievement-based grading would motivate students to engage with the material. Given that this was an honors colloquium—and hence only open to students in the Honors College—they would be responsible enough to set a good pace of achievements throughout the semester. The achievements leaderboard was posted in a shared location, which would lead to positive peer pressure for everyone to keep up. Some achievements rewarded students for reading and commenting on each others' essays, and this would lead to interesting discussion, deeper thinking, and better critical analysis.

Since the students would have set up their own schedules based on interest and other personal commitments, their independent activity during the first third of the semester would complement the in-class discussion of game design fundamentals. This activity would then strengthen their work in the next third of the semester, which would be devoted to iterative prototyping. By the final third, the students would be focused wholly on producing their games, with all the other achievement-based background work done.

To help students with the Socializer achievement—as well as getting through the canon of genres—I would announce all of the meetings of the Society for Game Design and Development as well as the Game Days at the Muncie Public Library. Students would approach these events with some expected trepidation, but once they realized the friendliness of the community, they would continue to attend these events, both for personal pleasure and to gain exposure to new game design ideas.

At the end of the semester, we would showcase our games to the Indiana State Museum. Some of the games may be usable right away, but all would serve as media for communicating important design ideas. Given their expertise in production, the ISM could very likely produce better looking or more durable game bits. This means that clarity would be a key concern in the students' designs.

Each student would make their own game deliverable, but the designs themselves would not be graded. That would be too subjective, as well as dependent upon prior experience and extracurricular digital production skills. Instead, we would give students a mark based on their following a good iterative process. As explained in the course description, we expected each design to go through several iterations, and failure to present a prototype in each iteration would result in grade reduction. Hence, the only sources of grades for the students were the achievements and the iterative design process, meaning that students could really choose their grade based on their level of commitment. Because we did not need any other grades, the final exam would be a reflective exercise.

Reality

We have little information about why the students enrolled in the colloquium. Most of the students had little to no background knowledge in game design, and little passion for the topic. One student told me that she enrolled because "design" was outside of her major and experience: she identified it as an area for personal growth during her undergraduate studies. This is a great reason to take a course, and she proved herself capable during the semester. Another confided that he took the course only because of the time it met and to satisfy the Honors College's two-colloquium requirement.

For the first third of the semester, almost no achievements were completed. Ron and I gave the students some class time without us to collectively develop a plan, and this partially worked: they started holding meetings with their classmates to play the games in the canon. Unfortunately, this happened after the transition from fundamentals to prototyping in the course schedule. Because they were completely unfamiliar with many established genres, they could not draw upon these in their own prototypes.

As we got into the prototyping phase of the semester, we started by formally scheduling students every two weeks or so. The students preferred a more ad hoc approach in which individuals would bring prototypes as they were ready. This caused a conflict with the course description, in which we intended calendar-based iterations with students presenting a new prototype each cycle. The students were right, however, that some prototypes took longer than others, especially in those cases where students completely changed topics. We agreed to let the students follow this process, entrusting them to hold each other accountable, even though it meant our iteration-based grade penalty system would become unenforceable. 

As the semester went on, we witnessed design stagnancy. Many students brought essentially the same prototype after having had a week or two to work on it, and it became clear from their discourse that there was little to no playtesting happening outside the class meetings. A few students were able to recover from this with some not-so-gentle pushing by Ron and me, ending up with good designs but only after having fought against our feedback.

At midsemester, there was still very little progress on the achievements, with a few exceptions. Ron and I held individual conferences with the students, during which we provided some feedback on their designs and also encouraged them to enact a plan for meeting the achievements. There was a lot of smiling and promising, but very little material evidence that these meetings were worthwhile. Most of the achievements ended up being completed in the last two weeks of the semester, many appearing rushed and certainly not contributing to the students' all-but-finalized designs. Because we had these individual meetings, though, we know that it was procrastination and a failure to plan on the students' part that led to this situation, as opposed to a fundamental flaw with our expectations.

Some of the achievements simply took a long time to complete: students made progress on playing the games in the canon for several weeks before marking the achievement as complete. However, the one that was delayed the longest, which caused me the most frustration, was the Socializer. Only one student completed this achievement before the last two weeks of the semester. In their reflective essays, students acknowledged—after the fact—that it would have been better to go earlier and more often.

Near the end of the semester, we engaged in two rounds of external playtesting. The first was with Motivate Our Minds, and the second, College Mentors for Kids. Our students highly regarded these opportunities, even in those cases where the young playtesters were not quite in the target age group for the game. I would have liked to see more preparation on the part of my students: some had not defined all of the terminology in their games, and others were missing key physical pieces of their games—which, again, betrayed a lack of playtesting.

We had three in-class presentations in the last two weeks of the semester for the Scholar achievement, one each on A Theory of Fun for Game Design, The Art of Game Design: A Book of Lenses, and Homo Ludens. I was surprised by how easily the students slipped into discussion mode, engaging each other and voicing opinions. At the same time, it was disappointing at the end of the semester that some of them were still falling prey to very naive notions of game design and learning. I opted against steering these conversations, instead listening for evidence of who had integrated knowledge from their experience and who had not. I would like to claim that those in the former group made better games as well, but I think my recollection of the conversation is too colored by my post hoc opinion of their games.

The students made an impressive display of their games for the community partners. We invited representatives from the ISM to campus, and each student explained their game's theme and demonstrated gameplay. The ISM staff responded most positively to those that were simple and effectively described. This is predictable but unfortunate for the students, since the games that actually contained the best designs were not necessarily given the attention they deserved. What I mean by "best designs" are those that reflect what is known about games and learning, and not coincidentally, these were mostly the ones that went through the most revisions.

Most of the games demonstrate good design elements, and I believe that we can call this a successful outcome. Comparing them to the students' first critical analyses and initial designs, we can see a general transition from novice toward advanced beginner. However, there are also design decisions that appear never to have received critical analysis; of course, learning how to ask these questions is part of learning the craft. We are currently considering which of the designs to move forward with into digital production in the Spring, and there are several that I think would make great starting points.

Final exam reflection

The final exam consisted of the following questions:
1a: Pick a specific exhibit at a museum you visited this semester, not including the one for which you made a game already. Now consider that the museum approaches you to create a game for use in a summer camp, where students engage with thematic ideas (history, science, etc.) in a day camp format. Describe the process by which you would deliver such a game. Note that we are not asking you to design the game, but to describe the design process for the game.
1b: Explain, in detail and drawing on your personal experience, why you believe the answer to 1a to be true. 
2: What was the relationship between the course structure and your learning this semester? Be sure to specifically address peer evaluation of prototypes and essays as well as the achievement system. You might also consider what elements you would change or add. 
3: How many hours did you spend on this class each week? Did this change during the semester, and why? 
4: Excluding yourself, who did an outstanding job this semester? Consider both their contributions to the class and their product.
The first was designed to get them thinking about what they did this semester and to consider what they have learned. Part 1b is more important to the learning process, as this requires metacognition. The answers to these questions were all similar: the students would do it the same way we did it this semester. While the students were able to summarize all the various steps we had taken, and they appeared to have faith that this was a good way to proceed, I was a little disappointed that there was not more critical discussion of what worked and what didn't work. In particular, only one student nailed the importance of research for serious game design. That is, to make a game that matches a museum theme, the designer needs to have a good understanding of that theme.

The second question was designed to get specific feedback on course structures. Almost every student mentioned that they liked the achievement system but that they would have liked some deadlines in order to help them manage their time more effectively. In particular, they wanted the Scholar achievement moved earlier in the semester to provide more foundation for their work. If we had had one more meeting, I would have pointed out to them that this course of action was available to them all semester, and that it was their procrastination that pushed this to the end of the semester. I hope that they recognize this and that it was a good learning experience, but in the future, I would rather prevent the students from having to learn this lesson, so they can focus on course ideas instead. One of the best ideas shared by a student was to have achievements for playing more than one game in a genre, rather than just having them for playing different genres.

The third and fourth questions were Ron's ideas, and I am glad we added them. I assume that their answers were honest, since there was nothing to be gained at this point—remember that this "exam" did not contribute to the students' grades at all. According to their answers, about three students spent the expected nine hours per week on the course, and the rest spent less, some much less. This is interesting, as there was some complaining during the semester about how long it takes to learn and play some of the games in the canon. Several students admitted to spending no time on the course during the first third, as we expected. I think this shows that the level of rigor we expected of them was appropriate, even if only some of them really rose to the challenge, and it indicates that students may do better with more scaffolding.

As for the fourth question, it reflected a similar phenomenon to the ISM's visit: the students seemed to value highly their peers who were most vocal, even in cases where—in my expert opinion—they had nothing to say. I take this as an indication that we needed more opportunity for the students to talk about specific design details: this would prevent the generalities and platitudes that students are so willing to share.

Next time

With an ounce of luck and a grant, I should be able to teach a similar course next Fall. With this in mind, here is a summary of what I would consider doing differently next time.
  • I need to plan on a class full of partially-interested novices rather than hope for a few with passion and background knowledge. Should such students show up, I'm sure I can capitalize on it, but I need to make sure I'm not quite so surprised when initial critical analyses are all about kids' games.
  • The students should do more, smaller designs as a way to explore the design space and get used to the designer's mindset. In particular, I should have them make and share one-page design documents on shared themes, a trick I picked up during the semester from a conversation with Lucas Blair at Meaningful Play.
  • Keep the achievement system, but add deadlines. Scott Nicholson—who I also met at Meaningful Play—recommended a system in which students have a few dates during the semester by which they must complete a threshold number of achievements in order to earn a certain grade. For example, completing three achievements by each of the dates would earn an A. I will likely do something similar, probably sending Scott a friendly request for more details as the hypothetical future course gets closer.
  • Decompose and decouple achievements. Rather than have one that takes weeks and five designer games to complete, make this five different achievements. My colleague Evan Snider has explained to me how he chains together smaller achievements into "quests," which is something I might consider, although I do worry about taking the metaphor too far.
  • Only use positive grading, not punitive grading. Our original model had grade penalties based on student infraction of policies, but this implies that students have points to lose. I prefer a model where students cannot lose credit, only gain it.
  • Push them harder, individually if possible. The best products, and the best learning, came about when Ron and I contacted students individually to push them to do better. Each rose to the challenge.
  • Include a request for feedback as part of the final exam. Since the university moved to digital course evaluations, their usefulness has plummeted. Most students do only perfunctory evaluations and do not provide the kind of useful feedback they once gave on paper and in class. The comments from the exam were much more useful to me than the course evaluations, even though they ostensibly asked the same questions.

Friday, December 21, 2012

The Development of "The Underground Railroad in the Ohio River Valley"

I am just finishing my role as Technical Director of an educational game project, The Underground Railroad in the Ohio River Valley. My friend and colleague Ronald Morris received a grant from the Entertainment Software Association to produce a game on this theme, and he and his team worked through the summer on research and design. Early in the project, I offered the services of my CS315/515 Game Programming class to serve as the technical team, knowing that this would be a good immersive project for them.

Preproduction

Some of Ron's students did preliminary research and design before summer, but it was during the summer semester that most of the design happened. I was not heavily involved in this stage: I did a bit of consulting with the design team, primarily on what scope was appropriate for my team to complete in the Fall. At the end of the summer, I was given a set of design notes, which I transcribed and interpreted into a design wiki. The original format was a PowerPoint slideument, which was wholly inappropriate for the task, in part because there was no way to tell by looking at the printed copy that some of the data tables exceeded the size of the page. In retrospect, I wish I had remembered to point the design team toward Stone Librande's One-Page Design approach—not that they necessarily need one-page designs, but it certainly would have gotten them thinking more critically about how to communicate this information effectively. Regardless, by copying the information into a structured design wiki, I was able to build my own mental model of the game, codify nomenclature, and identify missing pieces. As usual, I used Google Sites as a lightweight wiki. The fact that the design was essentially static meant that I didn't have to deal with refactoring or maintaining the wiki: it was primarily for structured presentation to the technical team.

Lead designer Michael Smith demonstrates the paper prototype to community partner Jeannie Regan-Dinnius of the Indiana Department of Natural Resources Division of Historic Preservation and Archaeology

Jeannie explores the physical prototype
My experience with the first semester of Morgan's Raid production taught me that wrangling 24 undergraduates took all of my time—time that would be better spent helping a smaller group to learn and make progress on the game. The chair of my department is very supportive of this kind of immersive learning work, and he permitted me to cap the enrollment of CS315/515 and institute an application process. This allowed me to communicate my expectations to students before the semester started, so every student who applied knew that we would be working in a studio space, with a client, and with the expectation of nine hours per week dedicated to production. Thanks again to the support of my chair, the team was given the use of our undergraduate research space, RB368.

RB368, before the start of the semester

Team logistics

My team of twelve gathered on the first day of the semester to begin work. I had prepped them over email and as part of the application process, and we used the first week to talk about project management using Scrum and the basics of Unity3D. I had used both of these in my semester at the Virginia B. Ball Center for Creative Inquiry and felt comfortable applying them to this new project. We set up a task board and got to work, planning seven two-week sprints to bring us to a successful completion.

The class was scheduled at 9AM Monday, Wednesday, and Friday, and this was when we held our "daily" stand-up meetings. The fact that many people had 10AM classes very quickly became an impediment, albeit an expected one. Unlike at the VBC, and unlike the fortunate scheduling of the Morgan's Raid Spring Team, these students had non-complementary schedules. As a result, there was very little whole-team collocated work outside of 9AM MWF. After-hours meetings were regularly initiated by the team, but I was rarely able to attend. Each student was on their honor to give nine hours of work per week to the project, but it would have been nicer to have more collocated time.

The technical team hard at work
The lack of collocation was a particular impediment with respect to art production. While the technical team was working for academic credit, the artists and musicians were hourly workers hired through the ESA grant. Ron was responsible for recruiting and managing them, although this caused some communication problems when the artists and musicians did not know to whom they were accountable—the professor paying their checks or the professor giving them work. I do not doubt that this would have been ameliorated had they worked in RB368 along with the technical team in crossfunctional teams, as they had been directed; instead, each retreated to his or her own corner of campus. The predictable result was that a lot of time was wasted producing assets that were not usable.

After failing to complete any user stories in their first sprint, the technical team was able to pull itself up by its proverbial bootstraps and get productive. We conducted retrospectives at the end of each sprint to reflect on what was working and what wasn't working. The same issues tended to come up time and again, but I do feel like the team made progress in self-management and accountability during the semester. One of the perennial problems I have in dealing with students—which I suspect is a subset of dealing with humans—is their love of generality. Students will very gladly speak in platitudes and generalities, rather than nailing down specific people for specific problems. As a result, everyone tends to smile and nod and feel that a reflection was useful, but not necessarily hold each other accountable, since there's always a way to wiggle out. I got a copy of Patrick Kua's The Retrospective Handbook during the semester but have not had the chance to read it yet; I hope that I can find some practices within that I can bring to next semester's project (more on that in another post).

Managing and teaching

Managing these projects can be stressful. Each sprint, the team talks about trying to make steady progress, yet almost every sprint, it feels like there is no way they will complete their tasks on time. The team's retrospective notes show that they grew to desire steady progress. However, there is still a strong tendency to fall to the illusion of progress in the first half of the sprint, doing a lot of talking and hand-waving, and then rushing through the second half. As a result, validation was inconsistent, and tasks were rarely "done done." Even at the end of the project, a major defect still existed in the build, and the response from the students working on this feature was, essentially, that the defect was enshrined in their implementation and we'd have to live with it. (NB: I fixed it in about half a day's work, but see the section on Expertise below.)

The stress this semester was compounded by my nagging doubt that I was not spending enough time sitting with the students and creating software with them. In the three hours per week that I saw the whole team, my time was spent on project management: clarifying design ideas, coordinating communication, pulling on the artists for sketches, and barely having time to look over shoulders at source code before people ran off to their next meetings. My personal schedule did not permit me to attend after-hours meetings, in large part because of the welcome addition of my third son. I needed to prioritize family over working extra hours, and I have no regrets about this! Yet, I cannot shake the feeling that there had to be a way to influence the students more strongly toward best practices. I had put together a team programming manual in an attempt to codify collaborative practice, and though this document was distributed before the semester and reviewed in the first week, it went almost completely unheeded. Actually, it was a bit worse than that: the students religiously followed the parts that made life easier and ignored the parts that actually would have made the implementation better. In particular, I borrowed two bits of Robert C. Martin's Clean Code:
  • Don't use comments: they rot and indicate a failure of attention to careful design.
  • Don't repeat yourself: replace copy-paste code with better abstractions.
I'm sure you can guess which one of those two they followed. Hint: we had a lot of copy-pasted code with no comments to explain it. To be clear, I am not blaming the students for being novices. I think that a major factor here is that they could not see the master working his craft, and so I fear that many did not learn some of the deeper lessons I hoped for them. My principal contrast here is with the Morgan's Raid Spring Team, a team in which I was embedded for about nine hours per week in Spring 2011.

I found several missed opportunities for good software design as I worked directly with the code this week. One of the most egregious is in the handling of percentage tables. The core game design involves tables of outcomes determined by percent likelihood, in the vein of a tabletop wargame. Code like this can be found throughout the implementation:

System.Random rand = new System.Random ();
float roll = (float) rand.NextDouble ();
if (roll >= 0.0f && roll <= 0.05f) {
  // do something
}
else if (roll > 0.05f && roll <= 0.10f) {
  // do something else
}
// etc.

This happens a lot in the code, including the redundant lower-bounds checking. A simple table abstraction, with methods to add entries and compute percentages, would have made this much more readable and maintainable. I'm not sure how to help students perceive these affordances, but it's something I want to try to spend more time on in both CS315/515 and CS222, which is the prerequisite.
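
Something like the following sketch is what I have in mind; PercentageTable is a hypothetical name, not code from the project:

using System;
using System.Collections.Generic;

public class PercentageTable<T>
{
    // Entries store the running cumulative likelihood with each outcome.
    private readonly List<KeyValuePair<float, T>> _entries =
        new List<KeyValuePair<float, T>> ();
    private readonly Random _random = new Random ();
    private float _total;

    // Associate an outcome with its likelihood, e.g. Add (0.05f, ...).
    public void Add (float likelihood, T outcome)
    {
        _total += likelihood;
        _entries.Add (new KeyValuePair<float, T> (_total, outcome));
    }

    // Roll once and return the outcome whose cumulative range contains
    // the roll. This replaces the chain of redundant range checks.
    public T Roll ()
    {
        float roll = (float) _random.NextDouble () * _total;
        foreach (KeyValuePair<float, T> entry in _entries)
            if (roll <= entry.Key)
                return entry.Value;
        return _entries[_entries.Count - 1].Value; // guard against rounding
    }
}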

There were several times when I was able to push students in the direction of better design, and some of these came up as significant learning outcomes in our semester retrospective. I worked with students to incorporate the Builder pattern in the domain model and frequently assisted with functional decomposition. I also guided the team to use a formal state machine analysis of the game and then use the state design pattern to implement it; this change impacted the entire team for the better. These interactions, in which I guided students toward great ideas of software design, showed up positively in course evaluations and the semester retrospective.
The team considers the task board and the state diagram
The picture above demonstrates the recurrence of a theme that shows up in my scholarship: the importance of dedicated space for these kinds of projects. It shows how the whiteboard served as an information radiator: the tasks were always posted, and critical diagrams such as the state diagram stayed up for weeks. (In fact, the team seemed to really like the term "information radiator," incorporating it into their dialog and writing after my introducing it to them.) The team could have erased and reused the board for other purposes at any time—and they frequently did early in the semester—until this kind of information began being posted. Then, the team recognized the need to keep these models available for easy reference at all times. Consider also the game's various dialog boxes: they underwent many revisions before the team settled on a standardized way of handling them, and the visual designs occupied a side board for about half the semester.

Dialog box design standards

Working with clients and the community

Ron had a class at the same time as CS315/515, and so he could not attend our Sprint Review meetings. Knowing that we needed his feedback, we held Sprint Review Redux meetings at 10AM, and about four or five team members would stick around for it. It didn't come up until the course evaluations, but the students who stayed really valued this interaction. I had not thought much about the impact of this meeting on the students, primarily thinking of it as something I needed to ensure we were going in the right direction. After reading the evaluations, I realized that such meetings are critical to the team's morale: they helped the students feel like valuable contributors to a bigger purpose rather than programming automata or hired help. I wonder whether a different teaching schedule, one that let the whole team build more empathy for Ron's and the players' needs, would have led to the team completing tasks more consistently and at higher quality.

The technical team conducted two external playtests of the game late in the semester, knowing that the design team had already tested the core mechanics. These were after-hours, and participation was encouraged but optional. I think that, as with the meetings with Ron, these meetings helped the team to understand better the context and impact of their work. The conventional use of such playtesting is to identify problems, but in this case, much of the design was done before my students got their hands on it. I think that the primary benefit here was not in usability, but in helping my students remember what elementary school students are like, and that there were real end users for this project.

Team members Matt Brewer and Daniel Wilson playtesting at Motivate Our Minds

On expertise

The game is scheduled to "ship" on December 31, coinciding with the end of the ESA grant. Long ago, I had promised Ron that I would help out during the break if there was anything left to be done, and that's what I've been doing this week. Some of the artists are still on contract to finish assets, and so I continue to direct their efforts while also integrating their work into the project, fixing defects, and adding features. Working intimately with my students' code has forced me to think about the nature of novices and experts.

So far this week, I have spent twenty hours working on the project. This is more than the amount of attention I asked my students to spend on the project in a whole sprint, based on the number of credit-hours they earned. There is no definitive ratio that says how much more productive an expert is than a novice, and I've encountered claims from five to twenty. If we take the conservative estimate for the sake of discussion, then twenty expert-hours correspond to about one hundred novice-hours; at nine hours per week, that means I accomplished this week what I could expect one of my students to do in about eleven weeks of work, or roughly 2/3 of a semester. If we further consider the productivity costs of interruptions—students having to manage four or five other courses, jobs, and relationships, whereas I'm working from home and still relying on my wife to wrangle the boys—then I've easily done more than I could expect a single student to do in an entire semester.

Putting it in this perspective, while I am frustrated to see some low quality code in the project, I think it's important to still qualify the learning experience as a success. Some of them have only been programming for two years, and that has been while enrolled as a full-time student! The students certainly learned a lot about working on an interdisciplinary team, with all the joys and pains that come with leadership, accountability, trust, empathy, and professionalism. They got to build a real software system, large enough that they were able to watch it nearly collapse under the weight of their own bad decisions yet still have time to fix it. They saw the need for something better: for formal state-based analysis, for design patterns, for object-oriented and functional decomposition, for having high standards. They got to see how hard it is to actually make a game, and that tools like Unity3D only barely manage the complexity—there are no silver bullets. They worked in C#, which was a new experience for many of them, and they got a taste of how it is like Java while also getting to tinker with delegates and properties. Sounds like a win to me.

Personal conclusions

I took the better part of the day to write this reflection because I knew that by writing it, I would be able to better articulate what I learned from this experience. I have divided my conclusions into two sections. First, the ones that we all already know to be true, but it's good to be reminded:
  • Given the opportunity to make something significant, students will rise to the challenge.
  • Having a dedicated space is critical.
  • Collocation is important, especially across disciplines.
  • Face-to-face communication is always preferable.
  • Many of the outcomes of immersive learning would go unarticulated without making the time for reflection.
Here are some more specific notes that I should keep in mind:
  • I should encourage future teams strongly toward one-page design documents, if for no other reason than to get them thinking about how best to communicate their designs.
  • I need to notice when I perceive an affordance that my students do not. When this happens, I need to point out not just what action can be taken, but how I recognized it, so that they can learn to see them as well. 
  • I need to be conscientious about modeling professional behavior so that my students can learn to think, design, and communicate as professionals. Scheduling and executing code reviews is a good place to start.
  • To improve the impact of reflections and retrospectives, I should find a way to encourage specificity and accountability.
  • It is good to keep all the team members in regular contact with the community, including clients, playtesters, and other stakeholders.
  • It is better if all team members share goals and motivations. More specifically, it's better if everyone is working for credit or for pay, not a mixture of both. This leads to conflicts of priority within the team, particularly with the natural rhythm of the semester.
  • While I can lead a team in the production of someone else's design, I prefer leading teams of students through the holistic design process, from inspiration to finished product.

Current status

As of this writing, the game is in public beta and playable for free online. There is some work-in-progress art that comes up when you cross the Ohio River, but that should be replaced by the end of the day. A few team members have volunteered to do some QA over the break, and I believe all the defects have been ironed out.
The game's main title screen

Monday, December 17, 2012

Students' pride in their work: A CSEdWeek 2012 Post

Two weeks ago, I received an invitation from an undergraduate to come to the presentation of his independent study project. Turns out it was part of a series of presentations connected to a colleague's graph theory courses. Some of the presentations were connected with educational mobile applications, which piqued my curiosity enough that I decided to attend. I was one of three or four guests in the classroom, and the students demonstrated some decent technical prototypes. I provided a little bit of feedback from HCI and design perspectives.

The experience made me think about the good work that my students are doing. The following Friday, I related the story to my CS222 class, who were a day away from the deadline on their six-week team projects. The teams and projects are self-selected, and this semester, I had three teams, each of which created a video game. I told the students, honestly, that their work was more interesting, and I asked whether they would like me to invite the rest of the department to come to their final project presentations. They responded immediately and excitedly that we should, and so I posted the announcement on the department's Facebook page.

Last Monday were the final presentations, and we brought in about eight outsiders—not bad for a class of twelve students! My recollection is that there were three faculty/staff there and around five undergraduates, all of whom had taken CS222 in the past. This is significant since these student-attendees came in with realistic expectations, having gone through the six-week project in a similar format themselves.

The three presentations went well. I had given them a presentation evaluation rubric ahead of time, and this rubric emphasized three categories: the executable release itself, achievement of milestone #2 objectives, and software architecture. Each of the groups covered these categories quite clearly in their allotted fifteen minutes. Note that game design was explicitly not part of the course: I made it clear to the students at the outset that they were welcome to create games in their six-week project, but that I would only formally evaluate them as software artifacts and not for their game design qualities, since that was outside the syllabus. Still, most of the questions from the audience were related to game design, and the teams provided responses that showed a good general understanding of game design; this is good for me, since hopefully I can recruit these students into my game design and game programming courses in the future!

It is well known that many people fear public speaking, and my students are not exceptional in this regard. I think that in most cases, these students would not volunteer to speak in front of a room of students, faculty, and staff; yet, when presented with the opportunity, they jumped at it. This speaks to the power of providing students with motivating contexts for their work. By giving students the freedom to choose a project that they wanted to complete, not only did I get them to commit to the requisite technical and collaborative tasks—they also gained the important social experience of public presentation.

I know I'm two days late in my CSEdWeek pledge to write a related blog post, but I hope that this post helps point in a fruitful direction. I've been using student-directed projects in my teaching for quite a few years with success, but I don't always make the opportunity for public presentation. Seeing how my students learned from this experience, I am going to try harder in future CS222 offerings to open up the final presentations. By putting it here on my blog, you can hold me to it next semester.

Tuesday, November 27, 2012

Applying Burgun's Lens of Game Design

I recently read Keith Burgun's Game Design Theory, an intriguing manifesto on game design and philosophy. The author argues that reasoned discourse about games requires a shared vocabulary, and to this end, he offers the following hierarchy of interactive systems.
Burgun's hierarchy of interactive systems
(Taken from What Makes a Game?)
Burgun defines a Game as "a system of rules in which agents compete by making ambiguous decisions." More specifically, these decisions have to be endogenously meaningful in terms of the game mechanics. Whether or not one agrees with his definition is less important, in my opinion, than the fact that a vocabulary helps us move forward in the science of design.1


The hierarchy of interactive systems permits Burgun to explain what he's not talking about: because he wants to address game design, he can cut out problems of interactive system design, puzzle design, and contest design. Many of his game design recommendations echo what others have to say. However, it's Burgun's zealotry that makes his work so valuable: he makes fewer and stronger recommendations. In the introduction, he discusses how he sees his work with relation to other books on game design:
For those who might defend these books by saying that they're only giving readers wiggle room or that they're allowing readers to come to their own conclusions about what games are: readers do not explicitly need to be given permission to do this. Thinking persons will come to their own conclusions, regardless of whether they read something wishy-washy, or something pointed... (Introduction, page xx)
This sets the tone for the rest of his book. He is unapologetic about his philosophy of game design and leaves it to the reader to decide whether they agree or not. In fact, I don't think anyone who is serious about game design could read the book without being either uplifted or offended.

In an attempt to better understand Burgun's philosophy, I decided to apply his lens to some of my work and my students' projects. The following analyses assume familiarity with his philosophy, and while this is best presented in the book, there is an overview in the freely-available Gamasutra article, What Makes a Game?


Morgan's Raid

According to Burgun's lens, Morgan's Raid is a Puzzle2 because there is no randomness: a play experience can always be replicated by repeating a series of decisions. The goal of the puzzle is to maximize score, which is themed in the game as Morgan's reputation. The impact of a player's raiding decisions on reputation is not immediately clear: a player must choose all of his orders prior to seeing their combined effect, although Basil Duke does provide thematic hints.
Basil Duke informs the player that he will help explain the puzzle.
There must be a series of decisions that maximizes reputation, but no one on the development team knows what it is. From our observations, groups of players will gladly make it a contest to see who gets the highest score, although their interest in the game wanes well before anyone finds the optimal path.

It is interesting to note that the original Spring Team design for Morgan's Raid involved more interesting behavior for the Union troops chasing Morgan, which would have made the project a Game. The original plan was for the Union's movement to be like Morgan's, making heuristic decisions in each town in an attempt to capture the player. However, working within our time constraints, we simplified the Union behavior to make them a fixed integer distance from Morgan. This distance is modified by the player's decisions, but in a fixed and predictable way.


Museum Assistant: Design an Exhibit

Museum Assistant is also a puzzle, albeit one with multiple solutions. Players get themed feedback based on the solution chosen; for example, creating an exhibit with African scientific artifacts from three different periods yields the generated exhibit title, "African Science through the Ages." The themes provide reason for players to try alternate paths, but from a mechanics point of view, one solution is as good as any other.


As with Morgan's Raid, Museum Assistant underwent a design change that resulted in its moving from Game to Puzzle on Burgun's hierarchy. Details of this design are described in my MeaningfulPlay paper, but to summarize, there were systems of input and output randomness that made it so the same series of game actions could produce different results. However, in the major redesign we agreed that we needed one good play experience, and that balancing the ambitious original design was outside of our scope. In terms of Burgun's hierarchy, the team decided to make a good Puzzle rather than a bad Game.


Equations Squared

While the previous two examples are student work, Equations Squared is my own, and it's certainly a Game. The player makes strategic decisions about placement of digits and operations, in terms of which to use and where to place them. Not all sequences are legal equations, and the scoring system rewards more complex equations. There is input randomness: the sequence of digits and operations you receive is different each time you play the game, so you very likely will never play the same game twice.



Auralboros

Auralboros is an experimental make-your-own-rhythm-game toy. You can make the experience as simple, as challenging, or as ridiculous as you want. As such, Auralboros is simply an Interactive System.
Auralboros encourages players to make their own Contests out of matching keystrokes in rhythm. The system rewards such behavior with visual feedback. There are no ambiguous decisions: you either make and match rhythms or you don't. In fact, a successful strategy for seeing all the visual bells and whistles is to spam a single key—a useful debugging technique discovered by co-developer Ryan Thompson. However, this strategy is not much fun, as you end up just making a bad Contest.


EEClone

I still occasionally install and play Every Extend (though it seems the original download site is now gone), usually after explaining to students what an amazing experience it is. EEClone is my academic knockoff, designed to explore and teach how design patterns occur in game engine software.


Like its inspiration and namesake, EEClone is a Game. The timing and orientation of incoming obstacles is not known, and the player has to make meaningful and ambiguous decisions about maneuvering and the timing of explosions in order to succeed. Of all these analyses, this one is the simplest, but it also shows how things that are obviously Games fit nicely into the hierarchy.


Conclusions

Most of my effort the past few years has been on serious games. As I use the term, serious games are those that are designed to have a particular real-world impact on the player. For example, Museum Assistant is designed to encourage players to think about collecting and curating. Applying Burgun's lens to my students' projects gives rise to an intriguing contradiction: serious games need not be "games" at all. However, Museum Assistant is no less successful in meeting its design constraints for its being classified a Puzzle. This is because the real constraint for serious games is serious, not game. It is difficult to say whether or not these projects would better meet their goals (however one defines "better") if they were Games, because this would fundamentally change them. For example, we know that Morgan's Raid could be a better Game if the maps were randomized, but this violates the goal of familiarizing the player with actual Indiana geography.3

The Morgan's Raid and Museum Assistant teams recognized that there were opportunities to make a better game—or from a strict reading of Burgun, to make Systems and Puzzles into Games. Both teams eliminated randomness in the face of time constraints, knowing that balancing games would be much more time-consuming than testing puzzles. This was shared knowledge among the team, although they didn't have Burgun's concise language to communicate the sentiment. In a similar vein, the Auralboros team was aware that we weren't really making a game at all. It is interesting to note that Equations Squared is a Game by Burgun's hierarchy, but it is not a serious game by my own definition. For the player, it is simply supposed to be fun. The serious aspect of it is in the assessment of player's behavior, an assessment conducted by someone outside the magic circle but facilitated by score, badges, and demerits.

Applying Burgun's lens to these projects has helped me to understand his philosophy. However, since much of his philosophy is prescriptive, there is not much extrinsic value in applying the lens to completed projects. That is, I do not think I gained any new insight into these projects, but then again, as an academic, I've already studied them inside and out. I do look forward to having Burgun's philosophy in my utility belt for future design projects, particularly as a lens for identifying and discussing decisions that could alter a project's position in the hierarchy. Next semester, I will be leading an experimental six-credit interdisciplinary game design and development studio, and you can be sure I'll try to keep up my reflective practice here on the blog.


1 At MeaningfulPlay, I got into a bit of a debate with a gentleman over the definition of "fun." He argued that a friend's autobiographical game designed around the theme of depression and abuse, in which you decide whether or not to commit fantasized patricide, could not be fun. I said that if it was a game, and if I was using Koster's operational definition of fun as learning and mastery of a system, then it could be fun, though perhaps not in the informal "enjoyment" sense. My point was that if we defined these terms, we can be clear about our meaning and avoid the baggage. He didn't talk to me any more. I tell this story to demonstrate that I am surely in Burgun's design philosophy camp, and that there is a dangerous cultural divide in "game design" that prevents communication across traditions. See also Daniel Cook's excellent essay contrasting secular and mystical approaches to game design.

2 As a typographical convention, I will capitalize the layers of Burgun's hierarchy so as to distinguish the layer "Puzzle" from the general use of the word.

3 We are currently conducting an empirical study on the effectiveness of Morgan's Raid, and I will report the results here when we have them.

Tuesday, November 20, 2012

The PlayN Experience

In an online discussion, a student asked me the following with respect to Equations Squared:
How does PlayN compare to, say, Slick2D? Is development with it straightforward? Did you have any major issues along the way? Most of the example games listed don't even work in Firefox, but I noticed Equations Squared does. Was this extra work or are others just lazy? Answers to any of these questions would be greatly appreciated.
My summary of the design and development of Equations Squared mentions little about its technical architecture aside from the fact that I used PlayN. Briefly, PlayN is a game programming library that allows you to write games using a standard Java toolchain and then cross-compile them to several different platforms, including HTML5, Android, and iOS. It is based on the same fundamental technology as Google Web Toolkit, which permits development of AJAX Web applications from pure Java source; GWT is used to implement such powerful Web applications as Gmail.

I got interested in PlayN because of the potential to build HTML5 applications while leveraging my knowledge of Java. I've done very little Javascript, and I know from personal and vicarious experience that browser-specific quirks are a special kind of hell. Yet, the browser is the common modern networked operating system, and I figured that a pure HTML5+JS solution would improve the chances of both longevity and adoption for a math assessment game. As much as I love standard Java and Unity3D, both require third-party plug-ins; my experience working with schools is that, very often, a teacher who wants to use such software does not have the permission or the knowledge to install and configure it. The fact that PlayN is based on GWT was an easy sell for me, as I've been telling students for years that GWT represents the state-of-the-art.

Getting started with PlayN requires you to dive into Apache Maven. It took me some time to wrap my head around Maven, but now that I've used it, it's hard to imagine life without it. In a nutshell, it manages project dependencies and build lifecycles. Previously, I was used to this sequence:

  1. Find a library I want to use in a project.
  2. Download the binaries and documentation jars.
  3. Put these into lib and doc folders in my project.
  4. Adjust the Eclipse build path and, if necessary, native library locations.
Now, with Maven, the process looks like this:
  1. Find a library I want to use in a project.
  2. Edit my POM file.
Nice. While I was working on my game, new versions of some of my dependencies were released. Getting the new versions into my project was literally as easy as changing a version number in the POM.
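
For concreteness, a dependency entry in the POM is only a few lines of XML. The snippet below is a sketch from memory; the coordinates match what I recall of PlayN's published artifacts, but treat the version number as illustrative:

<dependency>
  <groupId>com.googlecode.playn</groupId>
  <artifactId>playn-core</artifactId>
  <!-- Upgrading the library means editing this one line. -->
  <version>1.2</version>
</dependency>

Maven fetches the jar (and, with the right flags, its sources and documentation) from a central repository, so steps 2 through 4 of the old workflow simply disappear.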

It's not all milk and honey with Maven. The Eclipse integration allowed me to import my project without too much trouble, but I found the integration imperfect, and I ended up doing a combination of developing and compiling in Eclipse while building release candidates on the command line. Fortunately, you don't have to start by mastering all of Maven: the PlayN Getting Started Guide provides just enough to get you started. As with anything, follow those steps, and then take some time to look at what really happened with your project.

Perhaps the biggest non-technical hurdle to getting into PlayN is the rendering model. There is a complex system of layers, and I found myself having to refer to the documentation time and time again to ensure that I was using each properly. I ended up making extensive use of GroupLayers to handle the hierarchical nature of the game. For example, each of the dashboards is a group layer consisting of nested layers, and moving a digit from the right-hand side tray to the board entails reparenting it from one layer to another. Once I got into the flow, it was no trouble at all, but I'm not confident I could sit down right now and whip up a demo without re-reading the documentation. Contrast this with Slick, where I know I could whip up a little demo that makes a guy dance on the screen in almost no time; more on Slick later.
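
To give a flavor of the layer system, here is a minimal sketch of reparenting, written from memory of the PlayN API rather than lifted from the Equations Squared source; the class, layer names, image path, and coordinates are all my own inventions:

import playn.core.GroupLayer;
import playn.core.Image;
import playn.core.ImageLayer;
import static playn.core.PlayN.assets;
import static playn.core.PlayN.graphics;

public class LayerSketch {
  public static void demo() {
    // Two group layers: one for the digit tray, one for the board.
    GroupLayer trayLayer = graphics().createGroupLayer();
    GroupLayer boardLayer = graphics().createGroupLayer();
    graphics().rootLayer().add(trayLayer);
    graphics().rootLayer().add(boardLayer);

    // A digit tile starts life in the tray...
    Image digitImage = assets().getImage("images/digit7.png");
    ImageLayer digit = graphics().createImageLayer(digitImage);
    trayLayer.add(digit);

    // ...and moving it to the board is a matter of reparenting:
    // adding a layer to a new group removes it from its old parent.
    boardLayer.add(digit);
    digit.setTranslation(120f, 80f);
  }
}

The pleasant property of group layers is that transforming the parent transforms all of its children, which is what makes nested structures like the dashboards manageable.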

Screenshot of the game
I used the excellent TriplePlay, Pythagoras, and React libraries along with PlayN, as well as my old favorite, Guava. All the logic of the game was written using test-driven development, which saved me several times from introducing regression defects, particularly in the parsing of various kinds of arithmetic expressions.

While TriplePlay made handling text much easier, I did uncover cross-browser problems when using this library—some of which have been fixed since then. There are subtle problems with text alignment, which caused me trouble when the main tiles of my game were being dynamically generated. To avoid having to write crufty browser-specific code, I ended up replacing the dynamic tiles with static images from a sprite sheet. I'm guessing that the failure of some PlayN projects to work in Firefox is due to similar issues combined with a lack of QA. Once again, thanks to all my testers!
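
The sprite-sheet fix amounts to cutting pre-rendered regions out of one image instead of rendering text in the browser at runtime. Another sketch from memory of the PlayN API, with an invented class, file name, and tile size:

import playn.core.Image;
import playn.core.ImageLayer;
import static playn.core.PlayN.assets;
import static playn.core.PlayN.graphics;

public class TileSketch {
  public static ImageLayer makeTile(int column, int row) {
    // Every tile is pre-rendered into one sheet, so the browser's
    // text rendering (and its cross-browser quirks) never runs.
    Image sheet = assets().getImage("images/tiles.png");
    // Cut out one 40x40 tile as a region of the larger sheet.
    Image.Region tile = sheet.subImage(column * 40, row * 40, 40, 40);
    return graphics().createImageLayer(tile);
  }
}

Since every client sees pixels generated offline, identical rendering across browsers comes for free.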

The sprite sheet that replaced dynamically-generated tiles
The student's question asks about a comparison of PlayN to Slick2D, the latter of which was used to develop Morgan's Raid. All the important technical aspects of Morgan's Raid could have been done in PlayN; in fact, some may have been easier if I had known about Maven at the time, since we used a build server with continuous integration and test-driven development. Neither Slick nor PlayN saves you from having to do some fiddling with project configurations and native libraries. Slick is almost certainly faster to get up and running, assuming you don't run into LWJGL version problems—which you almost certainly will at some point. Slick+Maven may be worth investigating.

The real decision between PlayN and Slick2D comes down to target platform and where you want to do your fighting. Slick will always require the client to have Java installed, or you'll have to invest heavily in making custom installers. Remember that installing Java—or anything—is seriously scary or impossible for some people. With PlayN, you can deploy straight to HTML5+JS, but in my experience, you still need to do platform testing even if you do everything the documentation tells you: some things still won't work right. You can also build for recent versions of Android: I tried this and it worked well for me, but I didn't have time to design the game for the various mobile screen sizes. I cannot comment from personal experience on the Flash or iOS support in PlayN, though there seem to be some problems with the Flash compiler, based on discussions I've seen on the mailing list.

I hope that this summary of my experience with PlayN, and the comparison to my Slick experience, is useful to you. Feel free to leave comments below and I'll do my best to get back to you.

Friday, November 9, 2012

Student-created badges for milestone presentation evaluation

The students in my advanced programming class have their Milestone 1 presentations coming up next Wednesday. It's a three-week increment, and so most of their attention the last two weeks has been on their projects. I was going to publish a rubric for the Milestone 1 presentations, but then I realized that it would be far more interesting for the students to come up with the assessment plan. It turns out that all three of my teams are making simple games as their projects—a fluke that's not happened in this class before—and so I decided to dovetail on that and work with them to do achievement-based assessment.

Normally, I spend the few minutes before 1PM setting up my laptop. I knew I wouldn't need it today, so at 12:55, I was standing at the podium and ready to begin. All but one of the students were already there, so I started an informal discussion about why classes are so tightly tied to time. It was just a passing thought to me, the relationship of time-locked learning to the educational needs of the industrial revolution, when I realized that there was a path to achievements from here.

I challenged them to think about what education might look like if we weren't slaves to the clock. A student mentioned that he had a class where they covered a chapter per three-hour meeting, and if they got done early, they just went home. I noted that textbook-oriented learning is part of the same phenomenon, having emerged from a time when information was scarce and structured, but that this generation of undergraduates has lived through the transition to information's being abundant and unstructured. One of the students had been homeschooled in his earlier years, and he described how he had content-oriented tasks, and he could play whenever he was done with them. We agreed this was still basically the same as chapter-per-meeting, "content"-based design, but with more scheduling freedom.

When I challenged them to think about how learning was authenticated, a student mentioned portfolios as an alternative to transcripts, which I agreed was a good idea. Another mentioned that one way to demonstrate knowledge is to teach it. Aha! From there, I explained how there was a peer-to-peer connection here: the one teaching the material could verify that the learner had learned it, but the learner could also verify that the teacher knew it. How a third party would interpret such peer-oriented credentialing would depend on how much they trusted the person who signed it. That is, a network of trust supplants the hierarchy—an idea that was certainly inspired by my recent watching of Manuel Lima's RSA Animate talk.

This got us into a discussion of the Badges for Lifelong Learning initiative, as well as a brief overview of the idea of badge-creation as a form of reflective practice. This latter idea came directly from my discussions with Lucas Blair at MeaningfulPlay. I challenged the students to think of badges that we could use for the Milestone 1 presentations. There were some thoughtful looks but mostly confusion, so I led with an example, starting with the criteria, then polling for what to title it, and finally, plying my incredible chalk-art skills to make an icon—a sequence of development I encouraged them to follow.


I got the students into cross-team groups to come up with some of their own, inviting them to share their badges on the board. As they got started, I reminded them that I didn't care if one person gave the milestone presentation or if they all rotated speakers. Personally, I think speaker rotation is silly—and I told them this—but it seems someone convinced them, before they came into my class, that speaker rotation is good.

Here's what we designed for achievements, in the sequence we discussed them.


Those icons are all student-created, except for the bald Mr. Clean guy. I can't believe they used a broom and not the iconic strong cross-armed bald guy. I was a little disappointed that one of the groups did not fully realize the value of icons, as I value the analogical thinking necessary to invent them. Note, for example, the visual pun in Home on the Range. (Assuming I'm interpreting that correctly.)

As they were designing these badges, a student asked if they could make "negative badges," and I told them to do so as they wished. When we got to talking about Train Wreck, I asked them what they thought about the negative badge. A student spoke up and said, "Well, it's still a learning experience." What joy! I told them that I could not have said it any better myself, and we moved on.

When we got to Orator, I pointed out that it was significantly different from the rest, as it was competitive. They seemed to agree with me, though silently, that it was OK to have some competitive badges in addition to the ones anyone could earn.

With just a few minutes left in the session, I pointed out that what we had done for the last half hour was exactly reflective practice, and that this was a major goal of the learning experience. I asked if they thought this set of badges would suffice for Wednesday, and they agreed. I told them I'd collate the badges and provide them with sheets for Wednesday's presentations, and that afterwards we would talk about the process.

As it turns out, I had an Ed.D. student observing that day as well. As we walked out of the room, he asked, "Is that normal? ... Are you normal?" I laughed and explained that certainly some days are better than others, but today, we had everybody deeply engaged, and it was a great meeting.