Saturday, March 9, 2024

Painting ISS Vanguard

It has been a long time since I finished a full set of miniatures. I have been painting off and on, but it hasn't been complete sets. For example, my sons and I have painted a few Frosthaven and Oathsworn figures but not all of them. I have some Massive Darkness 2 figures primed that have been sitting on a box awaiting inspiration for some time. 

In any case, I'm here to break the streak with ISS Vanguard. My older boys and I really enjoyed Etherfields. It came with a promotional comic about ISS Vanguard that piqued my interest. Somehow, I heard about how Awaken Realms had re-opened the pledge manager for the game between Wave 1 and Wave 2, so I jumped in. I received my copy many weeks ago, but I knew I did not want to bust it open until we finished Oathsworn. We are now two chapters away from doing so, and so I have finished this core set of figures just in time.

ISS Vanguard Box Art

Most of the things I paint are based on concept art, and I like to match the colors of the figures to the artwork. This is especially important in board games so that figures are easily distinguishable. The eight human figures in ISS Vanguard don't have any published concept art, however, so I had to come up with a different way to handle them. Awaken Realms offers a service where they "sundrop" miniatures, which essentially means that they are given a high-quality wash. Their approach makes it clear that the figures come in pairs, two for each of the four playable sections: security, recon, science, and engineering. I liked the idea of painting them in matching pairs, and searching for inspiration online revealed that many other painters did as well. This Reddit post was my favorite: the painter featured the distinct section colors on each model but otherwise used a high-contrast scheme with white armor and dark detailing. Coincidentally, a friend shared a video on Facebook last night called "White plastic and blinking lights: the sci fi toys of the late 1970s and early 1980s." I didn't watch the whole thing, but it did get me reflecting on why I believed that white armor with bright colors should match a science fiction setting.

The figures were slightly frustrating to paint. There are many fiddly details on the armor, but not all of them are meaningful. It seems like the kind of thing that works well for sundropping, since that treatment leaves everything as monochromatic detail without having to be specific about which pieces logically or thematically connect. I referred to the aforementioned Reddit post regularly to plan out where I wanted splashes of color. There are a few parts that I would consider recoloring if I had the paints on hand, but many of the colors were custom mixes, and recoloring wasn't worth the risk of a bad match.

I used zenithal priming from the airbrush to prep the figures. I then painted all of them with a slightly warm off-white color, mostly white with a dot of grey and of buff. I used a wash over the whole figure to darken the recesses, then hit the highlights with the off-white armor color.

Let's look at the figures in pairs.

Engineering Section

All miniature painters know that yellow is a challenging color. Fortunately, a white undercoat made it manageable. The one on the left could probably use slightly more yellow, but I do like how it looks in isolation, and one will never have both of these out in the same mission anyway.

Security Section

I thought a lot about whether the little "pet" on the left should be white like the armor or a different color for contrast. I ended up keeping it white to suggest that it's made of the same stuff as the bulk of the armor. 

I like the poses of these two. These figures all make good use of scenic bases. Part of me prefers blank bases, since I can then decide whether or not I want to add features and suggest that the characters are in particular settings, as I did most elaborately in my Temple of Elemental Evil set... whose images sadly seem to have been eaten by a grue, in a horrible example of why you should not trust "the cloud." Here, however, we can see that a pose like that of the recon figure on the right would really not be possible any other way. The engagement with the scenery makes it worthwhile there, whereas on the engineering figures the scenery feels more like it's in the way.

Recon Section

The Recon Section also has wonderful, dynamic poses. I was worried that the smoky jet trail of the one on the right might be too much, but I think it turned out fine. I chose yellow for the flowery thing on the left figure in part to complement the dark blue of the strap and mask details and in part so that it has similar colors to the jetpack character.

Science Section

The yellow figures may have taken the most time because of the troubles getting yellow to be bright enough, but the Science Section was awfully close because of all the stuff in their scenes. I had some similar thoughts about the claw arm as the security section's pet, and I ended up going the same way here: if white plastic is what they're using to build lightweight rigid armor, then let's use it for the claw arm and the pets, too.

The alien biomatter being picked up by the one on the right looked fungous, so I picked out some colors inspired by that. Of course, a giant mushroom here on earth would not also have green leaves sticking out of it. 

ISS Vanguard (?)

The last figure in the box is a big space station. I presume it is the titular ISS Vanguard, but the parts of the rulebook that I have read don't actually reference it at all. It's not clear to me whether it is used in play. I wish it were, though, since that would have given me some idea of how much effort I should spend painting it.

As with the human characters, I looked around online and found a few ideas for painting this piece. I kept the "nearly white with spots of color" motif. One of the challenges here is that a space station would be lit quite differently from an away team, but I didn't want to paint it so starkly. I ended up using a cold off-white to differentiate it subtly from the warm off-white of the characters. A wash deepened some of the recesses, and then I added some highlights and spot colors. It's fine. I waffled a bit on whether to just paint over the silly translucent bit, but I chose against it in part because I have no idea if it is significant to the story. Who knows, maybe the campaign plot hinges on understanding that people are using pure translucent blue as a power source? I wanted the blue to match the beautiful tone used on the box cover, but I didn't quite get it. It's not purple enough, but it does match the translucent parts.

All Eight Characters

Thanks for checking out the photos and the story here. I'll include some more individual pictures below for people who want to see more detail, including the backs. 

Thursday, February 15, 2024

Reaping the benefits of automated integration testing in game development

This academic year, I am working on a research and development project: a game to teach middle-school and early high-school youth about paths to STEM careers. I have a small team, and we are funded by the Indiana Space Grant Consortium. It's been a rewarding project that I hope to write more about later.

In the game, the player goes through four years of high school, interacting with a small cast of characters. We are designing narrative events based on real and fictional stories around how people get interested in STEM. Here is an example of how the project looked this morning:

This vignette is defined by a script that encodes all of the text, the options the player has, the options' effects, and whether the encounter is specific to a character, location, and year. 

We settled on the overall look and feel several months ago, and in that discussion, we recognized that there was a danger in the design: if the number of lines of text in the options buttons (in the lower right) was too high, the UI would break down. That is, we needed to be sure that none of the stories ever had so many options, or too much text, that the buttons wouldn't fit in their allocated space.

The team already had integration tests configured to ensure that the scripts were formatted correctly. For example, our game engine expects narrative elements to be either strings or arrays of strings, so we have a test that ensures this is the case. The tests are run as pre-commit hooks as well as on the CI server before a build. My original suggestion was to develop a heuristic that would tell us if the text was likely too long, but my student research assistant took a different tack: he used our unit testing framework's ability to test the actual in-game layout to ensure that no story's text would overrun our allocated space.
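For readers curious what the script-format check might look like, here is a minimal sketch in Python with pytest-style assertions. The directory name, file format, and "narrative" field name are assumptions on my part, and the student's layout test is omitted because it depends on the game engine's own APIs.

import json
from pathlib import Path

SCRIPT_DIR = Path("stories")  # assumed location of the story scripts

def is_valid_narrative(value):
    # A narrative element must be a string or an array of strings.
    if isinstance(value, str):
        return True
    return isinstance(value, list) and all(isinstance(item, str) for item in value)

def test_narrative_elements_are_strings_or_string_arrays():
    # Run over every story script so a malformed file fails the build.
    for script_path in SCRIPT_DIR.glob("*.json"):
        story = json.loads(script_path.read_text(encoding="utf-8"))
        assert is_valid_narrative(story["narrative"]), (
            script_path.name + ": narrative must be a string or a list of strings")

A check like this runs quickly enough to live in both the pre-commit hook and the CI pipeline.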

In yesterday's meeting, the team's art specialist pointed out that the bottom-left corner of the UI would look better if the inner blue panel were rounded. She mentioned that doing so would also require moving the player stats panel up and over a little so that it didn't poke the rounded corner. I knew how to do this, so I worked on it this morning. It's a small and worthwhile improvement: a cleaner UI with just a little bit of configuration. 

I ran the game locally to make sure it looked right, and it did. Satisfied with my contribution, I typed up my commit message and then was surprised to see the tests fail. How could that be, when I had not changed any logic of the program? Looking at the output, I saw that it was the line-length integration test that had failed, specifically on the "skip math" story. I loaded that one up to take a look. Sure enough, the 10-pixel change in the stat block's position had changed the line-wrapping in this one particular story. Here's how it looked:


Notice how the stat block is no longer formatted correctly: it has been stretched vertically because the white buttons next to it have exceeded their allocated space. 

This is an unmitigated win for automated testing. Who knows if or when we would have found this defect through manual testing? We have a major event coming up on Monday where we will be demonstrating the game, and it would have been embarrassing for this to come up then. Not only does this show the benefit of automated testing, it is also a humbling reminder that my heuristic approach likely would not have caught this error, while the student's more rigorous approach did.

I tweaked the "skip math" story text, and you can see the result below. This particular story can come from any character in any location, and so this time, it's Steven in the cafeteria instead of Hilda in the classroom.

We will be formally launching the project before the end of the semester. It will be free, open source, and playable in the browser.

Friday, February 9, 2024

Tales of Preproduction: Refining the prototyping procedure

I am teaching the Game Preproduction class for the second time this semester, and this time I am joined by Antonio Sanders as a team-teacher from the School of Art. There are already a lot of interesting things happening, as we have a class that is now half Computer Science majors and half Animation majors. We have also extended the class time to a "studio" duration, so we meet twice a week for three hours per meeting instead of the 75 minutes I had with my inaugural group last year.

Given that quick summary of our context, I want to share a significant change that we made to the prototyping process from last year. Last year, the team adjusted the schedule because we hadn't dedicated enough ideation time to prototyping, so this year we set aside five days for it. Each day, the students are supposed to bring in a prototype that answers a design question. I remember this also being challenging last year, and it wasn't until late in that process that we remembered the seven questions that Lemarchand poses about prototypes in A Playful Production Process.

In an effort to get the students thinking more critically about their prototypes, we have required them to write short prototype reports that address Lemarchand's seven questions. The last of the questions, which Lemarchand himself typesets in bold to show its importance, is, "What question does this prototype answer?" The reports help reveal something that was harder to see last year: cases where the questions themselves were either malformed or unanswerable. That is, students are going into prototyping without a good idea of what prototyping is. Several times, I've seen students show their prototypes, and when I ask what design question they answer, the students have to look it up in their reports. This is pretty strong evidence that the questions were developed post hoc. What's most troubling is that, after having completed four of the planned five rounds, these problems are still rampant.

Early in the process, my teaching partner suggested students think about design questions in the form "Is X Y?" where X is a capability being prototyped and Y is a design goal. For example, "Is holding the jump button down to fly giving the player a sense of freedom?" While this heuristic proved helpful, a lot of students struggled with it: in part, I think, because they didn't understand that it was only a heuristic, and in part because they haven't practiced the analysis skills required to pull a design question out of an inspiration. If I were to use this again, I'd follow the obvious-in-retrospect need to rename those variables to something like "Does this player action produce this design goal?" (Unfortunately, the discussion of design goals comes up later in the book, so maybe even this idea is too fuzzy for the students.)

Many of the questions that students want to pursue are actually research questions. I mean this in both the colloquial and the academic senses. A question like, "Does adding a sudden sound make the player scared when they see the monster?" is obviously answered in the affirmative: one need only look at games that induce jump-scares to see that this is effective. Questions like, "Do timers increase player stress?" are simple design truisms that are not worth prototyping. In yesterday's class, I tried to explain to the students that if the question is generic, then it's a research question, and that design questions are always about specifics. In science, we approach general questions through specific experiments designed to answer them; in design, we answer specific questions directly.

Reflecting on these problems, it becomes clear that the earlier parts of the semester were not goal-directed enough. Students acknowledged after our in-class brainstorming session that they were not brainstorming game ideas (but that's a topic for another post). When the students did research, much of it was also not goal-directed. Now, in prototyping, we can more easily see what students are interested in, and we can point out to them that their interests and issues can and should be addressed by blue-sky ideation or by research. However, we haven't baked that into these first five weeks. Put another way, we took a waterfall approach to ideation, whereas perhaps next year we should try an iterative one.

We're in the process of collecting summaries of all the students' prototypes. I put together a form that uses this template for students to self-describe prototypes that are viable for forming teams around:

This game will be a GENRE/TYPE where the player CORE MECHANISM to GOAL/THEME. 

I'm eager to see if this was a helpful hook for the students. I will have to ask them about it on Tuesday and then see if it's something we can use with next year's cohort.

Tuesday, January 2, 2024

Notes from "Grading for Growth"

I read portions of Grading for Growth as part of my preparations for the Spring's preproduction class. The book makes a case for "alternative grading," and the authors establish four pillars for such efforts. These are:

  1. Clearly defined standards
  2. Helpful feedback
  3. Marks indicate progress
  4. Reattempts without penalty
I've been reading Talbert's blog for some time, and it's that last one that gives me some difficulty. I was hoping that reading the book would help me understand some practical matters such as grading management and dealing with skills that build upon each other. However, I found myself taking more notes about CS222 Advanced Programming and CS315 Game Programming than about CS390 Game Studio Preproduction.

I have read Nilson's Specification Grading and many articles on alternative grading, so I skipped through some of the content and case studies. The first case study is the one that was most relevant to me: a case of a calculus class in which the professor used standards-based grading (SBG). This was contributed by Joshua Bowman at Pepperdine University.

One of the tools that I had not fully considered before is the gateway exam. Bowman gives a ten-question exam early in the semester, and students must pass at least nine questions to pass the exam. Students get five chances to retake it, and passing it is required to earn a B- or better in the course. This is potentially useful for dealing with some of the particular problems I have faced in CS222, where students come in with high variation in their understanding of programming fundamentals while also suffering from second-order ignorance. A formalized assessment could very well help with this.

Another useful idea from the reading is the distinction between a revision and a new attempt. In my own teaching, I have allowed revisions, but in CS222 I frequently find myself suggesting that students begin assignments anew with new code or contexts. This was never a clear requirement but rather a strong suggestion. Separating these two ideas could increase clarity about the significance of an error or misunderstanding. In particular, this could help with a particular failure mode that I have seen in CS222: a student submits a source code evaluation, I critique the evaluation, and the student resubmits an evaluation that restates what I just pointed out in the critique. This masks the distinction between a student who has learned the material and one who can effectively parrot my commentary. The problem could be avoided if I required new attempts in cases where I am using my feedback to direct the student's attention to what they have missed rather than to point out small oversights.

Regular readers may recall that I experimented with specifications-based grading in my section of CS222 in Fall 2023. I only laid out cases for A, B, C, D, and F grades, similarly to how I have implemented specs grading in CS315. The reading suggested that +/- grades can also be laid out in a specification, using them for "in between" cases.

I regularly air my frustrations with the incoherent concept of "mid-semester grades," but a piece of advice from the book struck me as useful. There was a recommendation to only give A, C, or F grades at midsemester. This is probably the right level of granularity for the task. The alternative, which I also recently came across in a blog somewhere, was to have students write their own mid-semester evaluations as a reflective exercise.

Bowman and others separate their standards into core and auxiliary. This could be useful in both CS222 and CS315, where I tend to weave together content that students are required to know from the syllabus with those that I think are useful from my experience.

The authors directly address the problem that reassessments have to be meaningful. Unlimited resubmissions will inevitably lead to students' throwing mediocre attempts at the problem in hopes that it goes away. The authors suggest two techniques for ensuring assessments are meaningful. The first is to gate the possibility for reassessment behind meaningful practice, which probably works better in courses with more objective content such as mathematics courses. The other is to require a reflective cover sheet. I have required students to give memos explaining the resubmission, but I've never given them a format for what this entails. This has led to my accepting many "memos" that show little evidence of understanding, usually when my patience is exhausted. Formalizing the memo process would benefit everyone involved.

Those are all helpful ideas for this summer, when I will likely take elements of CS222 and CS315 back to the drawing board, but what about the resubmission rate issue that I was actually looking for? Well, I found quite a surprise. The authors suggest exactly what I have been doing for years: using a token-based system or throttling resubmissions. The real puzzle, then, is what exactly they mean by "reattempts without penalty," since it's not what those words actually mean together. Only being able to reattempt a subset of substandard assignments is a penalty, since from a pure learning point of view, there's no essential reason to prevent it. That is, the penalty comes from the practical matter that teachers cannot afford to teach every student as if they were their only responsibility. This finding was anticlimactic, but part of me expected that it would be what I found. There's no silver bullet, and if I have neither seen nor invented something better in 20+ years of alternative grading experience, then it does not exist.

(It's funny to actually type out "20+ years of alternative grading experience," but it's true. It's also one of those things that's making me feel old lately.)

Monday, January 1, 2024

The Games of 2023

In 2023, I logged 408 plays of 71 different board games. I am surprised how much lower that is than the last two years, but I think it also points to playing more heavy games rather than multiple light ones. My youngest son is almost nine, and he will join in any game we invite him to. Just this morning, we rang in the new year by playing Massive Darkness 2, and he did great with one of the most complicated character classes. We can probably unload some of the kids' old games to make room for... well, honestly, the games we already have that just don't have a home on a shelf.

Here are the games that I played at least ten times this past year:

  • Frosthaven (54)
  • Clank! Catacombs (32)
  • Railroad Ink (26)
  • Everdell (19)
  • Res Arcana (16)
  • Terraforming Mars: Ares Expedition (16)
  • Oathsworn: Into the Deepwood (12)
  • Cribbage (11)
  • So Clover! (11)
  • Ark Nova (10)
  • Thunderstone Quest (10)
We haven't gotten Frosthaven to the table in months, so it was a shock to see it so high on the list. My son and I have played through practically all of the main storyline, though we have not unlocked all the characters. I am a little disappointed that, after the main quests are done, there's not much pull to go back into the game. It's not that the game mechanisms changed much, but once there was no narrative hook to move forward, it stopped feeling like it mattered whether we collected the materials to build the buildings to get more materials to save a settlement that was actually fine.

Cribbage is a game I played a lot as a kid and watched my parents play with their friends. It's a comforting game. A glass of wine and a game with my wife always makes for a good evening.

I did not realize we had played Ark Nova as much last year as we did. Maybe that expansion is in our future.

As of the start of the year, my game h-index is 33, meaning that there are 33 games that I have played at least 33 times. This will certainly go up this year, as it seems I only need one more play of Castles of Mad King Ludwig to increase to 34, and that's a game I love to play. My player h-index is 19, meaning that there are 19 players with whom I have played at least 19 games. This one seems much harder to increase! 
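For the curious, here is a minimal sketch of how the game h-index can be computed, using the 2023 list above as sample data rather than my full play logs:

# Minimal sketch of the game h-index: the largest h such that at least
# h games have been played at least h times. The sample counts come from
# the 2023 list above, not my complete records.
def h_index(play_counts):
    counts = sorted(play_counts, reverse=True)
    h = 0
    for rank, plays in enumerate(counts, start=1):
        if plays >= rank:
            h = rank
        else:
            break
    return h

sample_2023_counts = [54, 32, 26, 19, 16, 16, 12, 11, 11, 10, 10]
print(h_index(sample_2023_counts))  # prints 10 for this truncated list

The same function works for the player h-index if you feed it plays-per-player instead of plays-per-game.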

I'll conclude by sharing the most-played games in my collection, continuing a tradition for this blog series.
  • Race for the Galaxy (112)
  • Clank (102)
  • Thunderstone Quest (102)
  • Crokinole (88)
  • Kingdomino (82)
  • My City (67)
  • Gloomhaven (66)
  • The Quacks of Quedlinburg (65)
  • Arcadia Quest (61)
  • Frosthaven (61)
  • Carcassonne (60)
  • Animal Upon Animal (56)
  • Quiddler (56)
  • Camel Up (51)
  • Terraforming Mars: Ares Expedition (47)
  • Rhino Hero: Super Battle (43)
  • Cribbage (41)
  • The Crew (40)
  • Just One (40)
  • Mage Knight Board Game (40)
  • Runebound Third Edition (40)
It may look like Race wins, but if we tally together Clank, Clank Legacy, and Clank Catacombs, the Clank family dominates with 158 total plays (102+17+39, respectively).

Thanks for reading. Happy New Year and Happy Gaming!

Wednesday, December 20, 2023

Reflecting on CS222 Advanced Programming, Fall 2023 Edition

As you may have noticed, I tried something a little different this year and extracted topical reflections into their own posts [1,2] rather than embed them into a lengthier reflection about a class. Aside from those concerns already expressed, CS222 went quite smoothly this semester. I had a small section, and I feel like I had a good rapport with the students. 

I had most recently been teaching the course on an MWF schedule, but this semester it was back to my preferred Tuesday-Thursday schedule. This meant more time in one session to dive into a topic, but it also meant I touched on fewer topics. This is a worthwhile exchange, but I didn't get to all the extras I like to cover during the semester. For example, we didn't get a chance to explore state management in Flutter as much as I would have liked. That particular context is where we get into the Observer design pattern, which this batch of students will not know by name. I also did not get a chance to talk at all about software licensing and intellectual property aside from a quick, hand-waving statement that the students own the rights to their projects.

I also added a fourth week to the pre-project portion of the class, cutting a week off of the final project to compensate. This gave more time in the early part of the semester, where students tend to struggle with the basics. Shortening the iteration lengths for the final project did have the anticipated positive effect that students worked more consistently. That is, reducing the time between deliverables gave students fewer opportunities to procrastinate.

The most surprising finding this semester was that the first Clean Code assignment was too easy. I've been giving this assignment for years: read the first chapter of Clean Code and write a paragraph reflecting on which definition of "clean code" most resonates with you. It is intended as a warm-up exercise to get students used to the unconventional method of documenting and submitting work for the course. One of my students pointed out that it gave him a false sense of what to expect from assignments, all of which take orders of magnitude more effort than this first one. I am thinking of simply dropping the assignment in favor of more meaningful ones.

I teach CS222 almost every semester, but I have a break next semester while I work on a funded research project. It will be good to have a little break from it, and I imagine I will be back on rotation some time next academic year. We also had a new faculty member teach the course this Fall, but I haven't made the opportunity to talk to him about the experience yet. I will do that in Spring.

Tuesday, December 19, 2023

On the ethics of obscurity

Years ago, I experimented with what is now called "specifications grading" in CS222. I set up a table that explained to a student how their performance in each category would affect their final grade. These are not weighted averages of columns but declarations of minima. For example, to get an A may require earning at least a B on all assignments, an A on the final project, and a passing grade on the exam. This gave a clarity to the students that was lacking when using more traditional weighted averages. While publishing weighted-average formulae for students technically makes it possible for them to compute their grades themselves, in practice I have rarely, if ever, found a student willing to do that level of work. Hence, weighted averages, even public ones, leave grades obscure to the students, whereas specification tables make grades obvious.
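To make the minima idea concrete, here is a minimal sketch in Python; the category names and thresholds are illustrative, not my actual CS222 table.

# Illustrative sketch of a specifications table as declarations of minima.
# The categories and thresholds here are made up for the example.
GRADE_ORDER = ["F", "D", "C", "B", "A"]

def at_least(earned, required):
    # True if the earned mark meets or beats the required minimum.
    return GRADE_ORDER.index(earned) >= GRADE_ORDER.index(required)

# Each final grade lists the minimum mark required in every category.
SPEC_TABLE = {
    "A": {"assignments": "B", "final_project": "A", "exam": "D"},
    "B": {"assignments": "C", "final_project": "B", "exam": "D"},
    "C": {"assignments": "D", "final_project": "C", "exam": "D"},
}

def final_grade(marks):
    # Return the highest grade whose minima are all satisfied.
    for grade in ["A", "B", "C"]:
        if all(at_least(marks[cat], req) for cat, req in SPEC_TABLE[grade].items()):
            return grade
    return "D"  # simplification: everything below the C row

# A C on assignments caps this student at a B no matter how strong the
# final project is, which is exactly the motivation problem described below.
print(final_grade({"assignments": "C", "final_project": "A", "exam": "B"}))  # prints B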

What my experiment found was that specifications grading made students work less than weighted averages did. The simple reason is that if a student sees that their work in one category has capped their final grade, they have no material or extrinsic (that is, grade-related) reason to work in the other columns. Using the example above, a student who earns a C on an assignment and can no longer earn an A in the class sees that they may as well just get a B on the final project, too, since an A would not affect their final grade.

This semester in CS222, I decided to try specifications-based final grades again. It probably does not surprise you, dear reader, that I got the same result: students lost motivation to do their best on the final project because of their poor performance on other parts of the class. It's worse than that, though: the final project is completed by teams, and some team members were striving for and could still earn top marks while other team members had this door closed to them. That's a bad situation, and I am grateful for the students who candidly shared the frustration this caused them.

The fact is that students can and do get themselves into this situation with weighted averages as well. A student's performance in individual assignments may have doomed them to a low grade in the class despite their performance on the final project, for example. However, as I already pointed out, this is obscured to them because of their unwillingness to do the math. What this means—and I have seen it countless times—is that students will continue to work on what they have in front of them in futile hope that it will earn them a better grade in the course.

And that's a good thing.

The student's ends may be unattainable, but the means will still produce learning. That is, the student will be engaged in the actual important part of the class. 

Good teaching is about encouraging students to learn. That is why one might have readings, assignments, projects, quizzes, and community partners: these things help engage students in processes that result in learning. It is a poor teacher whose goal is to help students get grades rather than to help them learn. Indeed, every teacher who has endeavored to understand the science of learning at all knows that extrinsic rewards destroy intrinsic motivation. 

What are the ethical considerations of choosing between a clear grading policy that yields less student learning and an obscure one that yields more? It seems to me that if learning is the goal, then there is no choice here at all. How far can one take this—how much of grading can we remove without damaging the necessary feedback loops? This is the essential question pursued by the ungrading movement, which I need to explore more thoroughly. 

I also wonder, why exactly haven't we professors banded together and refused to participate in grading systems that destroy intrinsic motivation?