Tuesday, November 26, 2024

Bloom's Taxonomy, Teaching, and LLMs

Recent discussions of LLMs in the classroom have me reflecting on Bloom's Taxonomy of the Cognitive Domain. Here's a nice visual summary of its revised version.

Blooms Taxonomy of the Cognitive Domain
(By Tidema - Own work, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=152872571)

Bloom's Taxonomy, as it is called, is a standard reference model among teachers. The idea behind it is that a learner starts from the bottom and works their way upward. As far as I know, it has not been empirically validated: it's more of a thought piece than science. This is reflected in the many, many variations I've seen in the poster sessions of games conferences, where some young scholar proposes a play-based inversion that moves some piece into a different position on the trajectory. All that is to say, take it with a grain of salt. The fact remains that this model has had arguably outsized influence on the teaching profession. (Incidentally, I prefer the SOLO taxonomy.)

There's been a constant refrain the past few decades among a significant number of educators and pundits that technology has made obsolete the remember stage. Why memorize this table of values when I can look them up? Why remember how this word is spelled? Spellcheck will fix it for me. My skepticism of the concept has only increased as I have worked with more and more students who use digital technology as a crutch rather than a precision instrument.

LLM-generated code comes up in almost every conversation I have with teachers and practitioners in software development. There are ongoing studies into the short- and long-term implications of using these tools. My observations are more anecdotal, but it's no exaggeration to say that every professional developer and almost every educator I have talked with has landed in the same place: LLMs can generate useful code, but knowing what to do with it requires prior knowledge. That is, the errors within the LLM-generated code are often subtle, and catching them requires knowledge of both software engineering and the problem domain.

From the perspective of Bloom's taxonomy, a developer with a code-generating LLM is evaluating its output. They come to their evaluation by building upon the richness of cognitive domain skills that undergird it. At the very fundamental level, they bring to bear a vast amount of facts about the praxis of software development that they have remembered and understood.

If Bloom is right, then among the worst things we could do in software development education is throw students at LLMs before they have the capacity for viable evaluation. Indeed, before LLMs, the discussion around the water cooler was often about how to stop students from just searching Stack Overflow for answers and submitting those. Before Stack Overflow, it was that students were searching the web for definitions rather than remembering them. My hypothesis for learning software development then is something like this:

  • Google search eliminates the affordance for learning to remember.
  • Stack Overflow eliminates the affordance for learning to understand.
  • LLMs eliminate the affordance for learning to apply.

This hypothesis frames the quip that I share when an interlocutor discovers that I am a professor and, inevitably, asks what I think about students using ChatGPT. My answer is that I'm considering banning spellcheck.

Monday, November 25, 2024

Walking away from a November game project: A reflection on NoGaDeMon 2024, Dart, Flutter, and Bloc

I would hate to make this a tradition, but it seems that I once again entered NoGaDeMon. National Game Design Month (NaGaDeMon) is November, and for several years, I created interesting little projects during the month. Last year, I was not able to pull a project together, and I'm afraid that's the case this year as well. However, I was able to learn a bit through the attempt, so I want to capture some of it here before it slips away.

Before November, I had been tinkering with an intersection of ideas related to posts in the last few months: interactive narrative games like my The Endless Storm of Dagger Mountain, which drew from the Powered by the Apocalypse tabletop RPG space, built around some concepts from Blades in the Dark and Scum & Villainy. I figured that, for November, I would try building a very small slice of the idea. For various reasons, I also wanted to try building and releasing a game using Dart and Flutter. I dug in and started making reasonable progress for a side project.

A few days into November, John Harper released Deep Cuts, a campaign and rules expansion for Blades in the Dark. I bought a copy and was quite surprised at the rules changes. I had expected little tweaks and balancing maneuvers, but Deep Cuts actually provides a complete overhaul of the most fundamental Blades action resolution system. This was too cool not to play with, so I rehashed my planned NaGaDeMon project, essentially starting from scratch to support some of the Deep Cuts ideas.

Before last week, I was able to get a very small version of the game working, letting the player experience a single, badly written game scene. The user interface was just awful, so bringing the game together would have required adding a ton of content as well as designing and implementing a complete player experience. Both of those would be tedious efforts, especially the latter, since I am not very fast with Flutter UI development. Part of the inspiration for choosing Flutter was to gain more practice with engaging UIs.

About two weeks ago, the work of one of my committees exploded into taking most of my unassigned work hours, and this was not altogether unexpected. We also just got the good news that we will be hosting family for several days around Thanksgiving. This will be wonderful, although it also means these won't be hobby-project days. The result is that I've decided to put this project to rest. I did learn quite a bit going this far into the project, and that is the topic for the remainder of this post.

First of all, the obvious lesson is that if I wanted to really focus on learning to make a top-notch interactive Flutter UI, I should have chosen something with zero other design risks. I knew that the best I could do in one month was to make something just functional, yet I am not sure I was honest with myself about how ugly that would likely end up. Maybe I will find a game jam that will let me get a better handle on combining turn-based game timing with implicit animations.

Prior to November, I had been tinkering with some of these design inspirations in Godot Engine, which is of course the engine I used to build The Endless Storm of Dagger Mountain. I was using a rather conventional mutable-state object-oriented architecture. I found myself frequently frustrated by the lack of good refactoring tools for GDScript. This is a significant hindrance to evolving an appropriate design. This is part of what made me switch over to Dart, which is a joy to work with in part because of the excellent tooling support from Android Studio. 

A few summers ago, I spent a great deal of time studying Filip Hracek's egamebook repository. Nothing shippable came out of my efforts—I don't think I ever even blogged about it—but I did learn a lot. I was struck by how Hracek separated the layers of his architecture, and it was the first time I spent a lot of time in a game that used immutable data models. At the time, I had looked into the Bloc architecture and struggled to make sense out of it.

Approaching this November's project, I decided to dig deeper into Bloc. I spent a lot of time with the official tutorials and puzzling over this seemingly simple diagram:


The simple tutorials are simple, which is convenient, but the more robust ones separate the "data" component into a data provider and repository. It seemed clear that the game state could be conceived of as data, but I struggled to conceptualize where the game rules should live. The game rules can be considered part of the domain model, and as such, should be separated from the bloc. This would mean that a response from the domain model may be the modified game state, which then is echoed back through the bloc to the UI with a bloc state change. However, it's also reasonable to conceive of the game state itself as the data layer and the "business logic" as being the transformations of that state. Indeed, this seems to be the difference between the simple and more complex tutorials: the simple ones deal with simple in-memory state, and the more complex ones draw data from different sources and transform them in the data layer. 

Of course, there is no silver bullet. Given the tight time constraints on the project, I simply considered the immutable game state to be my data layer, and I put the game logic in a bloc. I also simply passed the game state along to the UI, but in a more robust solution, I would have had clearer separation between layers. Including a dependency between the UI and the data layers was a matter of expedience and the intentional incurring of technical debt.
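
A minimal sketch of that arrangement follows, written in TypeScript rather than Dart so that it runs without any packages. Every name in it (GameState, GameEvent, GameBloc, applyEvent) is hypothetical, the class only imitates the shape of a bloc and is not the bloc library's API, and the cost of pushing yourself is an assumed value for illustration.

```typescript
// The "data layer": an immutable snapshot of the game (hypothetical fields).
interface GameState {
  readonly stress: number;
  readonly dicePool: number;
}

// Events describe what the player asked for.
type GameEvent =
  | { readonly kind: "pushYourself" }
  | { readonly kind: "resetPool" };

// The game rules: a pure function from (state, event) to a new state.
function applyEvent(state: GameState, event: GameEvent): GameState {
  switch (event.kind) {
    case "pushYourself":
      // Assumed cost for illustration: two stress buys one extra die.
      return { stress: state.stress + 2, dicePool: state.dicePool + 1 };
    case "resetPool":
      return { stress: state.stress, dicePool: 0 };
  }
}

// A tiny bloc-like wrapper: events go in, new states go out to listeners.
class GameBloc {
  private current: GameState;
  private readonly listeners: Array<(state: GameState) => void> = [];

  constructor(initial: GameState) {
    this.current = initial;
  }

  currentState(): GameState {
    return this.current;
  }

  listen(listener: (state: GameState) => void): void {
    this.listeners.push(listener);
  }

  add(event: GameEvent): void {
    // The old state is never mutated; it is replaced by a new snapshot.
    this.current = applyEvent(this.current, event);
    for (const listener of this.listeners) {
      listener(this.current);
    }
  }
}

// The "UI" here is just a listener that records every state it is handed.
const bloc = new GameBloc({ stress: 0, dicePool: 2 });
const seen: GameState[] = [];
bloc.listen((state) => {
  seen.push(state);
});
bloc.add({ kind: "pushYourself" });
```

Handing the whole GameState to the listener mirrors the shortcut described above: the UI depends directly on the data layer. A stricter layering would map the game state to a separate view state before emitting it.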

My first pass at the implementation had me writing my game states and bloc states by hand. The Equatable package meant that I didn't have to fret over writing some of the boilerplate that's necessary to do state comparisons, and it was easy to integrate this in Android Studio using Felix Angelov's Bloc plugin. When scouring the Web for help with Bloc, one quickly also comes across discussions of Freezed, a library that is also integrated into Angelov's plugin. I had tinkered with Freezed in my egamebook-inspired explorations, but I have not shipped anything that uses it. After having built up my understanding of Bloc using Equatable, Freezed was an obvious next step. Next time, I would jump right into using it for cases like this.

Writing a functional Flutter user interface was straightforward using BlocBuilder. I found this to be a convenient way to conceptualize the game, especially since it had very clear states. For example, in my original explorations (before Deep Cuts), I had the player choosing an action from a list, then customizing the action with various options from Blades in the Dark, such as pushing yourself to trade stress for dice. After rolling the dice, the player is now in a different state of the game in which they are responding to the result, such as by resisting its consequences. This was elegant to express in the code, and I am confident that with enough effort, I could make a compelling user experience out of it. By contrast, Dagger Mountain used an architecture that was inspired by MVP but depended too heavily on the undocumented, unenforceable behavior of coroutines. Both of these are "only jam projects," but they are helping me to conceive of how I would approach something more significant in this problem domain. The aforementioned coroutines were my solution to synchronizing the model and view states (for example, to finish an animation before continuing to the next step of the narrative); I'm fairly certain I understand how I can do that with bloc's events and states, but since the November project will remain unfinished, there is risk.
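
The "very clear states" idea can be sketched as a set of distinct state types with one view per state. The following is an illustration in TypeScript, not the project's Dart code: the phase names, their fields, and the describePhase function (a stand-in for what a BlocBuilder's builder callback would do) are all assumptions.

```typescript
// Hypothetical phases, loosely following the flow described above.
interface ChoosingAction {
  readonly kind: "choosingAction";
  readonly actions: readonly string[];
}

interface CustomizingAction {
  readonly kind: "customizingAction";
  readonly action: string;
  readonly pushed: boolean;
}

interface RespondingToResult {
  readonly kind: "respondingToResult";
  readonly action: string;
  readonly highestDie: number;
}

type Phase = ChoosingAction | CustomizingAction | RespondingToResult;

// Stand-in for a builder function: each phase gets its own view.
function describePhase(phase: Phase): string {
  switch (phase.kind) {
    case "choosingAction":
      return `Choose an action: ${phase.actions.join(", ")}`;
    case "customizingAction":
      return `${phase.action}${phase.pushed ? " (pushing yourself)" : ""}: roll when ready`;
    case "respondingToResult":
      return `${phase.action} rolled a ${phase.highestDie}: resist the consequences?`;
  }
}

// Transitions only accept the phase they apply to, so the compiler
// rejects, say, rolling dice before an action has been chosen.
function chooseAction(phase: ChoosingAction, action: string): CustomizingAction {
  if (phase.actions.indexOf(action) === -1) {
    throw new Error(`Unknown action: ${action}`);
  }
  return { kind: "customizingAction", action: action, pushed: false };
}

function pushYourself(phase: CustomizingAction): CustomizingAction {
  return { kind: "customizingAction", action: phase.action, pushed: true };
}

// The die result is passed in so that the sketch stays deterministic.
function resolveRoll(phase: CustomizingAction, highestDie: number): RespondingToResult {
  return { kind: "respondingToResult", action: phase.action, highestDie: highestDie };
}

const start: ChoosingAction = { kind: "choosingAction", actions: ["Skirmish", "Prowl"] };
const customizing = pushYourself(chooseAction(start, "Prowl"));
const responding = resolveRoll(customizing, 4);
```

Because each transition returns a new value, an intermediate state such as "animation playing" could be added as another member of the union without touching the existing phases.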

All this exploratory coding meant that I did not follow a test-driven process. I ended up not getting into the testing libraries specifically for bloc. It's possible that this would have helped me better to conceptualize the business logic versus the domain layer, but that remains future work. 

There are still a lot of questions about the game design itself. Indeed, this entire exploration is inspired by design questions around the adaptation of Blades in the Dark tabletop gameplay into a digital experience. Citizen Sleeper is the only project I know of that has worked in this space, and it's a fantastic interpretation. I only became aware of Citizen Sleeper after I started doodling my own ideas, and it's interesting to see where they converge and where they diverge. I hope to dive back into this design space later, but for now, my attention must go toward wrapping up this semester, planning for next semester, and enjoying the upcoming Thanksgiving break.

Wednesday, November 13, 2024

What people believe you need to do to be an independent game developer

Aspiring game developers are starving for advice. I recently attended a meetup of game developers where an individual gave a formal presentation about how to become an indie. The presentation was thoughtfully crafted and well delivered, and it was entirely structured around imperatives—the things that you, the audience member, need to do if you want to be a successful independent game developer. The audience ate it up and asked for more. They were looking for the golden key that would unlock paradise.

There are two problems here, one overt and one subtle. The overt one is that there is no golden key. There is no set of practices that, if followed, will yield success. I imagine most of the audience knew this and were sifting for gold flakes. However, it was also clearly a mixed crowd, some weathered from years of experience and some fresh-faced hopefuls. I hope the latter were not misled.

The subtler problem was made manifest during the question and answer period when it became clear that the speaker was not actually a successful indie game developer at all. Their singular title had been in development for three years and had just entered beta. They had no actual experience from which to determine if the advice was reasonable or not. The speaker seemed to wholeheartedly believe the advice they were giving despite not being in a position to draw conclusions about their efficacy.

Once I saw the thrust of the presentation, I started taking notes about the kinds of advice the speaker was sharing. 
  • Document everything, and specifically create:
    • Story and themes document
    • Art and design document
    • MDA document
  • Have a strong creative vision
  • Be a role model for the work environment you want
  • Consider these pro tips for hiring staff:
    • Use a report card to score your candidates
    • Look for ways to get to know what it would be like to work with them
    • Try collaborating with them as part of the interview
    • Always have a back-up candidate, not a top candidate but someone you know you could work with
    • Being their best friend does not mean you should work with them
  • Thank people for their contributions and efforts
  • Use custom tools to help you work better
    • Use the Asset Store in Unity
    • Use tools to help you test
    • Automate as much as you can to save you time
    • Learn to prompt so you can use generative AI
      • It allows an artist to be a developer by removing coding barriers
      • LLMs can replace tedious use of YouTube, Google, Reddit, etc.
  • When pitching to publishers, have two versions of your slide deck:
    • pitch slides: the version you send
    • pitch presentation: the version you present
  • Take budgeting seriously
    • Budget for specific deadlines
    • Don't spend your own money if you can get money from someone else (e.g. publisher)
    • Get a job so that you can support yourself until you can get funding from someone else for the game project
      • Quoting one of his professors: "To make money, you need to spend money, and to spend money, you need money."
  • Don't get distracted by others (e.g. on social media)

These aren't the things you need to do to be an indie game developer. These are the things that an audience believed you need to do to be an indie game developer or the things that someone with a modicum of experience thought would be worth telling indie hopefuls. It seems to me that this is the advice you would get if you spent an afternoon collecting advice by searching the Internet. It's helpful for me to have a list of what people are likely to believe from consuming popular advice. Sometimes advice is popular because it is accurate; sometimes people tell you to make your game state global.

Three other things jumped out at me about the presentation. First was the unspoken assumption that one would be using Unity. There was no indication from the speaker that this was even a choice, and none of the questions reflected on it. Second, the speaker acknowledged the importance of automation and automated testing, which was great to see. Third, no one pushed back regarding the use of Copilot or other LLMs to help with coding, whereas I suspect there would have been a riot had he suggested using the same tech to generate artwork. There's a study in there.

Tuesday, November 12, 2024

Serendipity

As mentioned in yesterday's post, I was at Meaningful Play 2024 a few weeks ago, and I'm finally processing the many pages of notes that I took there. 

Sabrina Culyba gave the morning keynote on the last day of the conference. She spoke about serendipity in game design, sharing a compelling story about the development of Diatoms. The talk was brilliantly prepared and executed. She summarized research findings around serendipity that show that the following factors can affect its likelihood:

  • Having a prepared mind
  • Openness
  • Being connection-prone
  • Belief in serendipity

These are really interesting, and if I didn't have a pile of other research projects in the hopper, I'd be curious to dive into the literature here. The first item sounds like a variation on the maxim, "Luck favors the prepared." The second sounds to me like the eponymous Big Five personality trait that tracks with creativity.

I don't have much else to contribute to the discussion, but it's a neat idea that I don't want to waste away in my notebook.

Monday, November 11, 2024

Fantasy heartbreakers

I am currently reading William White's Tabletop RPG Design in Theory and Practice at the Forge: 2001-2012 after having met the author at Meaningful Play. This excerpt from Chapter 3 made me shout with delight at having a name for a phenomenon.

A fantasy heartbreaker was [Ron Edwards'] term for an independent game that contained interesting innovations, usually without realizing that they were in fact innovative, but whose designers had failed to fully examine their underlying design assumptions—thus producing games that were highly derivative of D&D, whether or not that was actually a design goal of the game—and who were either naïve or overambitious in their expectations for success in the marketplace. (p.93)

Ron Edwards' original post on the topic is cited, but I haven't made the time to read the source yet. White's summary was enough to excite me and make me want to share it here.

Tuesday, October 1, 2024

Paper!

I have a pile of things to grade, seemingly unlimited committee work to complete, and major decisions to make. I am having a bit of a stressful week. But you know what I just did that made me so happy that it's worth taking the time to write a blog post?

I graded something on paper.

My new coworker Travis Faas shared with me a format he uses for peer critiques during his game programming class. It's something I want to draw into that class. Today, in CS222 Advanced Programming, my students were to showcase their two-week project submissions. I've traditionally done this in an unstructured way, something like an academic poster session. Just a few minutes before class, I thought to myself, "What if I tried out that crit format here?" I literally did not have time to lay out even the simplest of templates, so I just grabbed a stack of blank white paper and headed downstairs to class.

I told the students that, during their showcase, they had to write at least three outcomes from their discussions. I suggested (following Travis) that these could take the form, "I learned X," or, "Y is something I want to learn more about." I also foreshadowed that there would be a secret final step.

As always, they walked around with real interest in what each other had done. This time, however, they paused after each station and jotted little notes on their paper. What might otherwise be fleeting thoughts were tracked, held on to.

Once we were done—and gave out the Audience Choice award, of course—I gave them the final step: to write down some action that they plan to take next that relates to the outcomes of their discussion. I gave them two or three minutes to do this before collecting their papers.

Both of my Tuesday/Thursday classes had major deadlines today, so it was quiet during office hours. I sat down in my chair, grabbed my favorite pen, picked up the stack of papers, and read through them. On each, I gave a little, hand-written affirmation, encouraging students or providing tips on how they might move toward their goals.

Paper! Wonderful paper!

I am looking forward to handing back their papers on Thursday. I wonder when they last had such a human experience as handing a teacher their ideas and then waiting, waiting without a chance of hearing back about them before the next meeting. No anxiety about checking grades. No notifications. Quiet, from which comes a chance for peace.

Paper!

Tuesday, September 24, 2024

Grading rather than improving

I talked too much today. I had back-to-back 75-minute class meetings, first of CS222 Advanced Programming and then of CS315 Game Programming. Both times, I spoke almost the whole time. I would much rather have had structured exercises to help teach what I wanted to show. It wouldn't have been that hard to set them up, just an hour or two each of setting up a template project that demonstrates what I want to show. I don't have an hour or two for each meeting for each class. I have filled my allocated class time with grading. This is partially due to the new grading system I am using. I'm having a lot of back-and-forth with my students. Turns out that getting them to mastery is a lot harder than giving them partial credit. I believe it's bearing fruit. But it's also taking all or more of the time I can give to a class. 

I am not sure what the path forward is. I will do less grading later as both classes move from individual lessons to large project integrations. Then, however, it's too late: we will have passed the point in the semester where a strong introduction is better than 75 minutes of my talking.

Thursday, September 19, 2024

CS222 and CC17

It has been many years since I have required my CS222 Advanced Programming students to read chapter 17 of Robert Martin's Clean Code. This chapter is entitled "Smells and Heuristics," and it contains a wonderful collection of common code problems and potential solutions. This year, I had my students read the chapter just before starting our two-week project, and I gave them the challenge to pick three items from the reading that were particularly interesting to them. Their responses were fun for me to read, displayed thoughtful reflection on programming, and to top it all off, were easy to grade.

Some of my favorites showed up in the students' responses, such as the advice to extract conditionals into named functions, to replace magic numbers with named constants, and to avoid selector arguments. Feature envy showed up more than once, which surprised me. Students recognized that some of their previous courses actually habituated them to these smells rather than their cleaner alternatives.
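
For readers who have not seen the chapter, here is a small sketch of the three heuristics named above. The domain (late submissions and score formatting) and every identifier are invented for illustration; none of this is taken from Clean Code or from student work.

```typescript
interface Submission {
  readonly daysLate: number;
  readonly hasExtension: boolean;
}

// Named constant instead of a bare 3 scattered through the code.
const MAX_DAYS_LATE = 3;

// The conditional is extracted into a function whose name states the intent,
// so callers read "if (isAcceptable(submission))" rather than the raw test.
function isAcceptable(submission: Submission): boolean {
  return submission.hasExtension || submission.daysLate <= MAX_DAYS_LATE;
}

// Instead of one function with a boolean selector argument, such as a
// hypothetical formatScore(score, total, true), there are two functions
// whose names say what they do.
function formatScoreAsPercent(score: number): string {
  return `${Math.round(score * 100)}%`;
}

function formatScoreAsFraction(score: number, total: number): string {
  return `${Math.round(score * total)}/${total}`;
}
```

At the call site, formatScoreAsPercent(0.5) is self-explanatory in a way that a trailing true or false never is.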

I need to remember to keep this assignment. I plan to ask my students today whether they think this chapter would have made a good introduction to our reading rather than a capstone on it. Because the chapter is so accessible, it's possible that reading it first might help them get better faster, and to do so before they get into the trickier topics such as SRP (Chapter 10) and the distinction between objects and data structures (Chapter 6).


Wednesday, September 18, 2024

Docs is code

Clint Hocking's birthday blog post led me to look at the EXP tabletop roleplaying game, and in turn, that got me looking at AsciiDoc and the Docs as Code movement. I understand completely the arguments that AsciiDoc makes against Markdown. Regular readers will recall that I experimented with converting my course plans to GitHub-hosted Markdown and quickly backed away from it: Markdown almost immediately requires a polyglot approach for anything significant. However, I don't see either AsciiDoc or Docs as Code as addressing what I consider the most important tool for technical writing: the ability to embed scripts.

I have been using lit-html for years (and Polymer before that). What it lets me do is separate the structure of my writing from its display. For example, when I write an assignment for my students, I might conceive of it as having a list of objectives. In Markdown, AsciiDoc, or even HTML, I could easily represent that information as an ordered or unordered list. Later, however, I might decide to change the representation, instead showing it as a definition list, or making sure the name of the objective is bold, or generating unique links to each individual objective. In any of those plain markup environments, I have to do this by hand or, worse, with a regular expression.

What I don't see from Docs as Code, although I admit I haven't done more than a cursory search through their materials, is the observation that docs is code. If I separate my model and my view, I gain a robustness that any journeyman programmer understands. For example, using lit-html, I can create a simple JavaScript data structure that represents a goal, with a name and a description. Either or both of these can be HTML templates, not just strings. With that structure defined, I can create a list of them for an assignment. Now, on my first pass, I show them as a list by iterating through the list and dropping the data into list items in an ordered list. When my requirements change, as they always do, I can modify my script and make the same data into a definition list, section headings, etc. If I need to change the actual definition of an assignment goal, I can make that change explicit.
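
The separation described above can be sketched without any dependencies. In this TypeScript illustration, plain template strings stand in for lit-html's templates so that the code runs on its own; the Goal shape, the sample goals, and both render functions are assumptions, and the sketch does no HTML escaping, which real content would need.

```typescript
// The model: what a goal is, independent of how it is displayed.
interface Goal {
  readonly name: string;
  readonly description: string;
}

// Hypothetical sample data for one assignment.
const goals: readonly Goal[] = [
  { name: "Refactoring", description: "Improve the design of existing code." },
  { name: "Testing", description: "Write automated unit tests." },
];

// First pass at a view: the goals as an ordered list.
function renderAsOrderedList(items: readonly Goal[]): string {
  const listItems = items.map((goal) => `<li>${goal.name}: ${goal.description}</li>`);
  return `<ol>${listItems.join("")}</ol>`;
}

// A later requirement: the same data as a definition list, with bold
// names and an id on each term so that it can be linked to directly.
function renderAsDefinitionList(items: readonly Goal[]): string {
  const entries = items.map(
    (goal, index) =>
      `<dt id="goal-${index + 1}"><b>${goal.name}</b></dt><dd>${goal.description}</dd>`
  );
  return `<dl>${entries.join("")}</dl>`;
}

const firstDraft = renderAsOrderedList(goals);
const revised = renderAsDefinitionList(goals);
```

Switching presentations means calling a different function on the same data. The goals themselves are untouched, and a change to a goal's definition shows up as its own diff in version control.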

Of course, the whole thing is in version control with sensible commit messages.

I have taken a similar approach in the past to build documents using LaTeX, coordinating the execution of multiple scripts through GNU Make. That works when LaTeX is needed for document output, but it feels less elegant to me than being able to generate the HTML directly from the JavaScript.

If you know of an approach in the AsciiDoc or Markdown vein that gives the same level of robustness as what I can do with lit-html, please let me know.

Tuesday, September 3, 2024

A Morning with Scum & Villainy

After writing about my first experience with Blades in the Dark, I heard from a friend who recommended that I also look into Scum & Villainy. It is a sci-fi interpretation of the Blades in the Dark rules following the Forged in the Dark license. I use "sci-fi" intentionally since the rules and setting lend themselves to space westerns or space operas—anything with scoundrels on spaceships—but it would be difficult to do science fiction with them. The rulebook makes it clear that it's drawing on the "rag-tag group of outlaws traveling across the sector" trope as seen in Firefly and Cowboy Bebop. Both are clearly space westerns.

This theme is a good thing, and those two shows are among my favorites. It is a shame, then, that one of the first things one notices on opening the book is that it doesn't mesh with these themes. Blades in the Dark sings out its theme in graphic design and illustration. In contrast, Scum & Villainy feels like it cannot decide what it wants to be. This is exacerbated by the fact that the structure of the book and much of the copy itself are taken verbatim from Blades in the Dark. None of this is inherently bad, but it gave a negative first impression after I had been given such a strong recommendation to read it.

The few mechanisms added to Blades are quite good. Developing your ship rather than your headquarters captures the theme well, also lending the feeling of an episodic series. Getting bonus dice from gambits is a welcome addition to my group, since they have a penchant for beating the odds by rolling consistently low.

My favorite addition consists of the three starting scenarios, one for each of the ships. Playing Blades in the Dark, or even just reading and imagining it, it wasn't quite clear where to start. Scum & Villainy gives more tightly scripted introductory scenarios. At least two of them boil down to simple chase sequences, but there is nothing wrong with that. Each of these scenarios has just enough background to fill in details as needed, and each one provides clear hooks into the next episode. On top of that, there are three outlines for other, unrelated jobs that are fit for the theme of the selected ship.

We got the game to the table yesterday morning, and I played with my three eldest sons. My third son had previously expressed disinterest in tabletop roleplaying games, having not enjoyed whatever fantasy game we had tried together once years ago. I convinced him to try this one, knowing that he's a storyteller at heart, and he and his two older brothers had a great time. They chose to be bounty hunters on a Cerberus ship, playing a Scoundrel, a Pilot, and a Mechanic. We played the recommended starting mission: tracking down a member of the Ashen Knives gang with multiple bounties on his head. There was a little hiccup due to an ambiguity in the scenario description, but once we got into that, we had a blast. The two older boys had a handle on the action resolution protocol as well as the role of flashbacks. They used flashbacks much more successfully as part of the storytelling than in our two Blades games, using them to set up a two-pronged assault on the mark's location, and then using one to set up and soup up hoverbikes for a big chase scene. A glorious failure by the Mechanic led to a potentially disastrous desperate situation for the Pilot, but he used a gambit and pushed himself to ace it. It was exactly the kind of thing one wants out of a chase scene.

There are a few places where Scum & Villainy falls short of its august predecessor, doomed perhaps by its own lineage. For example, in Blades in the Dark, there are sensible limits on how much coin (or value in coin) a person can carry or that one can stash. This is actually quite interesting, a point that I don't remember seeing before: you can only carry so much money, and you can only have so much liquid cash, particularly in a Victorian setting. Scum & Villainy borrows this mechanism despite describing money as being kept as software on credsticks. Also, while both games admit that any given action may be sensible under multiple action ratings, the action articulation in Scum & Villainy feels more forced than Blades'. I suppose, due to my career, I am particularly puzzled by how the Doctor action rating is for "doing science" and the Study action rating is for "doing research." It makes me wonder about how I would take the Forged in the Dark idea and put it into a setting of my own choosing, as many others have done. For example, making Kapow! years ago was a great exercise in understanding how Powered by the Apocalypse ideas could apply to campy 1960s superhero action.

I find the setting of Blades in the Dark to be more intriguing, but the setting of Scum & Villainy appeals to me personally while also being a "safer" space to explore with my boys. One could do a sci-fi criminal gang drama, but maintaining a ship vs. expanding gang turf really pushes toward the Firefly vibe. Scum & Villainy certainly stands alone, although there are parts whose rationale would make more sense if one is familiar with Blades in the Dark. Reading the Blades book was enough to see why there has been so much excitement about it, even though I know I'm late to the party. The Scum & Villainy book may have lacked some of this pizzazz, but the table doesn't lie, and we had a great time.

Initial reflection on Bowman-style grading

I had my first batch of submitted student work last week, and I would like to share some reflections on exploring a new grading system. As I mentioned over the summer [1,2], I have revised two of my courses to use a new grading scheme. CS222 Advanced Programming and CS315 Game Programming are both using a technique that I have lifted from Joshua Bowman's work. This technique looks at each goal and places a student's contribution into one of four categories:

  • Successful
  • Minor revisions needed
  • New attempt required
  • Incomplete
The first and the last are clear, but I found myself tripping up between the middle two. I think this is due in large part to an important distinction between this technique and Rapaport-style triage grading, which I have used for years. In that model, you have four categories as well:
  • Done and correct
  • Done and partially correct
  • Done and clearly incorrect
  • Not done
The distinction between "partially correct" and "clearly incorrect" is very clear to me, and these are the second and third categories for Rapaport. I started using that as a heuristic to differentiate between "Minor revisions needed" and "New attempt required," but I don't think that's right. With Rapaport's approach, "partially correct" captures a huge category of errors that one would put into the "C" letter grade bin: such a submission has some elements of correctness but significant flaws. I think Bowman's "Minor revisions needed" is much closer to Rapaport's "Done and correct." Clearing up the differences between these two rubrics caused me to have to re-grade many submissions.

Bowman's philosophy, which I am also bringing to bear in my classes, is grounded in mastery learning. Hence, recognizing the affordance for resubmission is fundamental to understanding the system. I knew I wanted to throttle my students' resubmissions, so I set up a two-tier system. With minor revisions needed, students could make the necessary tweaks within three business days, then get full credit for their submission. With new attempt required, or if they didn't make minor revisions within three business days, they could resubmit at most once per week.
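The two-tier rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the courses actually track resubmissions: the function names and return labels are hypothetical, holidays are ignored, "within three business days" is read as inclusive of the third day, and the sketch only decides which tier a resubmission falls under rather than enforcing the weekly limit.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Assumption: business days are Monday through Friday; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


def resubmission_tier(category: str, returned_on: date, resubmitted_on: date) -> str:
    """Classify a resubmission under the two-tier throttle (hypothetical labels)."""
    if category == "Minor revisions needed":
        deadline = add_business_days(returned_on, 3)
        if resubmitted_on <= deadline:
            # Tweaks made in time earn full credit.
            return "minor revision"
        # Missed the window: falls back to the throttled tier.
        return "weekly resubmission"
    if category == "New attempt required":
        # Limited to at most one resubmission per week.
        return "weekly resubmission"
    # The post does not say how the other two categories are handled.
    raise ValueError(f"no resubmission rule sketched for: {category}")
```

For example, work returned on a Friday with "Minor revisions needed" would have until the following Wednesday under this reading; anything later would count against the once-per-week limit.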

I switched to Bowman's model in an attempt to clarify evaluation, and I'm already confused. I think this kind of system could work brilliantly if there were any tool support for it, but every gram of this technique fights against Canvas. Not only does Canvas lack robustness to anything but the least interesting of point-based pedagogic models, it and its LMS ilk breed an intellectual laziness among the students. The student usage pattern is to look at how many points were earned and then ignore any formative evaluation. My conclusion so far is that doing this on paper would be a great improvement over using Canvas if it weren't for the fact that my students' submissions are often inherently digital and not just accidentally digital.

It is early in the semester, but I have yet to see that Bowman's approach is going to be any more clear than Rapaport's. I've been using Rapaport-with-resubmissions, and that fills the middle ground between a clear representation of points and clear feedback about which parts are wrong. I will have to give it another two or three weeks to see how students respond before I make any systemic changes: there hasn't been ample time to get complete submit-evaluate-resubmit-evaluate loops from enough students yet.

Last year, I experimented with EMRF grading and ended up quickly dropping it. Canvas had no clear way to express this system either, and I did not see any clear benefit from distinguishing between "excellent (E)" work and "meets requirements (M)". It's easy to blame the tool for its shortcomings, and in this case, that's exactly the right thing to do. I know folks who "make it work" with tricks and hackery, but in my mind, there is no excuse for having a system that demands that the only real part of a class is something that has points and contributes to a pool of points. It's not how learning works, and it's never been how teaching should work.

Monday, August 26, 2024

An Afternoon with Blades in the Dark

I heard about John Harper's Blades in the Dark tabletop role-playing game from a talented undergraduate student around 2018. He was creating his own RPG as part of a games research group I was running, and he regularly brought up Blades along with the Powered by the Apocalypse movement as inspirations. I came across it again when looking for information about non-hit-point damage systems, which Blades has, although not via inventory manipulation—the particular topic of my investigation.

I bought a copy of the rules, read them while on family vacation at the end of the summer, and found them quite inspirational. It made me want to run a session or two in order to see the systems in motion. As anyone who enjoys tabletop games knows, it's one thing to read the rules and another thing to try running them: the latter exposes one's incomplete knowledge from the former. I hesitated to invite my boys to play though because of the vicious nature of the game. Blades in the Dark is a game in which you play a scoundrel who is part of a criminal crew. You advance in the game through illegal and immoral activity. I prefer to encourage fantasies about heroic living. Yet, I found myself thinking about how one can learn from stories of heroes and of villains, and if nothing else, I knew my boys would also enjoy experiencing the game and exploring its setting.

In my retelling of our experience, I want to highlight places where I felt unsupported by the book and online resources. This is meant as constructive criticism that I hope others and I can use to improve future sessions of this and other games. Note that in this blog post, I will be freely referencing rules and lore from Blades in the Dark. If you are the type who enjoys reading or playing role-playing games, I recommend you pick up a copy. That said, you can also get an overview of the rules from the public System Reference Document.

We played for about three hours on Sunday afternoon, during which time we created characters and crew and completed one score. I had downloaded and printed the recommended materials, and so we dove into character creation. I had also crammed a lot of lore into my memory, which made it harder to introduce the game at a high level. I would have liked a canned paragraph I could read to set up the experience for new players. It was also too late for coffee, which could have been a contributing factor. Also, my boys, because they are my boys, are probably not familiar with any of the cultural touchstones that are referenced in the rulebook: we just don't watch stories about criminals and antiheroes, and I myself had not heard of most of the things in the list.

The playbooks were useful for letting the boys start creating their characters. One picked a Leech (saboteur, tinkerer, alchemist) and the other, a Hound (sharpshooter, tracker). Among the first decisions one makes in character creation is choosing a heritage, and here is another area where a handout would be useful. The book has short descriptions of each, but the playbook only lists the name. A simple handout that gives a single sentence about each would be sufficient for a table of players to pick the one they like; otherwise, the GM has to explain each while players hold the lore in their heads. When it came to choosing vices, we had a similar problem: I explained that one had to choose the category of vice, then the particular vice and its purveyor (for example, "Poker at Spades' Tavern"). The boys looked at me rather blankly: without knowing more about the world, it was not at all clear what kind of creative boundaries they had for this. I remembered that there was a list of vice purveyors in the appendix, so I turned to page 299. They readily chose from this list, and this makes me think that this, too, should be a handout in the starting materials.

When we turned to creating the crew with the corresponding playbooks, I realized that we should have inserted a step before the character creation: a quick discussion about what kind of game we wanted to play. They didn't talk much during character creation, but when we got to crew selection, it became clear that the Leech wanted to do sabotage and the Hound wanted to do assassinations. I ended up encouraging them to compromise and create a Shadows crew, which leans more into Leech styles but should have room for the Hound as well. In part, I was thinking about an initial score that would resonate with their crew playbook, and I did not want to open with an assassination.

We had some trouble with defining the crew's hunting grounds. The district map from the downloadable resources was useful, and the players figured that their lair could be in Crow's Foot while their hunting grounds were across the river in Whitehall. After all, wouldn't a band of spies want to spy on something worthwhile? I looked up more details about the district in the rulebook, and I found that it was listed as having maximum wealth and maximum security. That doesn't seem like a reasonable target for a crew that is just starting out. Mechanically, the Shadows were Tier 0 but their targets would all be much higher tier. If there were a recommendation for new players, we could have just taken it. In the absence of this, setting up the crew felt overwhelming, being high-stakes and made almost blindly. We ended up shifting the hunting grounds to also be in Crow's Foot.

A related complication came up in the required decision of how to deal with the faction that controls the turf containing your hunting grounds. Unfortunately, there is no concise summary in the rules about which factions control which hunting grounds. For Crow's Foot, I remembered that the Crows claimed control over the whole district, but that there were also smaller factions who were trying to take it over. When the crew thought they would use Whitehall as their hunting grounds, I had no idea who controlled it. Would the Bluecoats—the corrupt law enforcement officers—be the faction that gets paid off? This is another case where a simple reference or a table of defaults would really help new GMs who don't have the spare cycles to memorize the litany of factions. A GM can always override a default, but in the absence of a default, I felt stranded in 150 pages of lore.

The book gives a recommended starting score that brings in three competing factions and gives the players some choices about whom to trust and whom to target. I was intimidated at the thought of doing this because it introduces several important NPCs and multiple factions, and the scenario is still likely to require improvising believably within this complex setting of Doskvol. I had previously searched for tips about how to start a Blades game, and I had read an article by Justin Alexander about alternative starting situations. I liked the simplicity of his "Aim at a Clock" advice. The book provides long-term goals, with progress clocks, for each of the factions. Given that, Alexander recommends picking a faction whose goal plays into something the crew could do, then having them pick up a score on behalf of that faction. This felt more controllable, and so as the players were finishing up their crew details, I had already started pulling pieces together: the Red Sashes and the Lampblacks each want each other eliminated, the Lampblacks are pushing some new drugs in Red Sashes territory, and the Red Sashes want the players to stop the production of these drugs. This would advance the Red Sashes' long-term goal to eliminate the Lampblacks, and it would give the upstart players a powerful ally. However, the players had already decided that their crew had paid off the Crows for their hunting grounds, which also introduces a little conflict, since the Red Sashes and the Crows both want control of the district. Also, while arson is hardly virtuous, I liked the idea of having my boys focus on stopping the manufacture of drugs rather than, say, assassinating a union leader.

Part of the art of running and playing Blades in the Dark is knowing how much planning is too much planning. It is so important that the designer put the planning constraints right onto the character playbooks: choose a plan and provide the detail. The crew knew where the drugs were being manufactured, but they knew they did not want to go in guns-blazing. They asked where the raw materials came from, which is a great question. In the moment, I decided that this was an Information Gathering move. That would make this the first dice roll of the afternoon, and in retrospect, I don't like it. Information Gathering is a roll without stakes. I would not have recognized how this put the wrong foot forward until reading Matthew Cmiel's thought-provoking (although hyperbolically titled) article, "The Unbearable Problem of Blades in the Dark." I like his heuristic that dice rolls should always be with stakes, but Information Gathering just gives you better results the higher you roll. Also, it wasn't clear to me if the fact that an action rating was being used for Information Gathering meant you could aid each other, take it as a group action, or push yourself. Looking back at it (in light of Cmiel's analysis and other reading), I think the intended answer is no. In any case, the players rolled, and they discovered that some materials come by carriage regularly and some come by ferry intermittently.

After having thought about it, I think this step could have been a small score of its own. It would have made a decent introductory mission to gather this information as part of a long-term plan to take down the factory. Indeed, this would have helped me meet my own goal of understanding the whole Blades in the Dark system, including downtime. As it is, we did not have time to wrap up the score or do the downtime actions since real-life obligations interrupted the session—including the need for the dinner table.

The crew decided that this would be a stealth mission, sneaking into the factory via the river, starting a fire, and then getting out. A few times, the players wanted to get into details such as whose gondola they could use, but I assured them that Blades wants us to get to the action. They made a standard engagement roll, and we picked up the action with them silently sliding their boat into the factory. I described an enclosed dock with several rowboats moored to it along with two thugs, chatting and smoking. The players and I had a good short discussion about how to use the game's rules to indicate the character's goal in a fiction-first approach, and they decided that their goal was to sneak past the guards and into the main body of the facility. They succeeded at this but with the complication that the factory floor was just beyond the crates. 

The players talked about trying to sidle up to the work tables and pretend to be laborers, and we discussed how a flashback could be used to set up an insider. Instead, they decided to go for broke, with the Leech tossing a vial of flammable oil into the midst of the work area while the Hound fired off a few rounds to cause a panic. I told them that this was a desperate move. Here is where things started to go badly for our Shadows. The Leech badly failed the roll, getting all ones and twos. Since it was a desperate move, I described how he botched the throw and spilled the oil mostly on himself, inflicting Tier 3 Harm. This allowed me to introduce the rules for Resistance rolls as well as Armor, and by using both, he reduced the consequences to minor burns, Tier 1 Harm.

As part of my post-play reflection, I realize now that I violated one of the GM rules for Blades in the Dark: "Don't make the PCs look incompetent." I treated the roll like a critical failure in part because of the incredible number of ones that the player rolled. It was also funny, in a tragic sort of way. However, if I could do it all again, I would have had him throw the vial and have it hit something else, something dangerous to them but not immediately deadly, and certainly not something as incompetent as wandering onto a factory floor and setting himself on fire. Alternatively, since I had already established that the workers were dealing with open flames as part of the production process, they could have immediately followed a fire suppression protocol.

Their cover blown, the Hound decided to use his special ability to lay down suppressing fire to buy them some time. Unfortunately, despite having taken a Devil's Bargain that this action would anger the Crows who claimed control of the district, the Hound botched this roll, too. This was clearly a desperate move, and after accounting for armor, he took a bullet in the chest, Tier 2 Harm. 

At this, the crew decided to beat a hasty retreat while trying to start a fire near the docks. The Leech had plenty of fire oil to attempt this. They wanted to escape, but they also wanted to succeed, so I offered them another Devil's Bargain: they fling the fire oil recklessly and end up setting fire to the very boat they came in on. The Leech took it and got a partial success, so I described how this area went up in flames, but several Lampblacks from the work floor were charging at them, wielding pistols and clubs.

The crew charged at the two guards who were still standing by the rowboats, and the Hound incapacitated them with some quick shooting. The complication for this filled up the clock I had started for the Crows' tolerance. At this point, the big faction controlling the region was going to take action against our Shadows for causing such chaos. The crew was more concerned at this point about survival, so they tried to unmoor a boat and get out before the charging reinforcements arrived. You guessed it, they botched this roll, too, and both of them took a beating in the attempt (Tier 2 Harm). 

Faced with no other viable option, they undertook a desperate maneuver and dived into the water to swim away. This, dear reader, resulted in the first and only six that they rolled the entire afternoon. Despite their burns, bruises, and bullets, they swam out of the dock area and into the river. To me, the fiction demanded that some of the Lampblacks grab a boat and chase them, but I also realize that this was a place where we could have made their exact goal more precise in our discussion: did they think that their diving into the water was to get completely safe or to simply get out of the immediate scrap? I interpreted it as the latter, but we could have been more clear.

I started a four-slot clock for them to evade the Lampblacks and gave them one tick for swimming out into the river. The players thought their only choice was to swim for shore, and I pointed out that there were some other options, such as swimming out into the river, or pleading for their lives. That said, swimming for shore made the most sense in the moment, so they tried... and botched the roll. The Lampblacks in the rowboat got into the river and took a few shots at them. This was enough to max out the Leech's Stress, and he took the Trauma of being Unstable, which is completely understandable given how badly this mission had gone. 

Now we were in a strange situation. I had established a progress clock for the crew's escape, although it was down to just the Hound now. He said he would just swim for shore, but I recognized that this would violate the Blades GM advice, "Don't roll twice for the same thing." It felt like it would just be "Swim again, but better this time." That didn't feel right, so I retconned the previous situation so that the Lampblacks had brought their boat between the swimmers and the shore. Hence, the Hound could do something like swim out into the ocean (with his punctured lung and bruises) or do something else, like beg for his life. He chose the latter, and I don't blame him. I offered him a Devil's Bargain on this attempt to sway the ruffians: he could have an extra die on his attempt if they let out their bloodlust by killing the Leech. To my surprise, he took it. The Hound knew that they were both as good as dead anyway. It was better for one of them to live than for both of them to die. The Hound succeeded at a cost, so they beat him with Tier 3 Harm and left him for dead on the shore.

We completed neither the score wrap-up nor the downtime activities. Both seemed moot with half of the crew dead, and as I previously mentioned, there were real-world pressures to clean up the table and get one of the boys to a youth group meeting. I plan on reading through the rules regarding how to wrap up a score later today so that I can do a mental walkthrough of how it would go. If the three of us play again, I think we'll just start afresh with a better understanding of the world and the rules.

And I would play again. Despite the game going badly for the crew, we all enjoyed the experience. It was a little rocky at times when I had to reference rules or lore, but that's the way it goes when you learn a new system. Every review of Blades that I have read says that you have to play several sessions before you really get into its way of playing. Although I would play again, I do not currently have plans to play again. I think it would be great fun to play with an adult group with beer and snacks, but getting a bunch of fathers together for a game night is already a desperate move where the dice are loaded. In the meantime, my boys and I got to share a fun afternoon together, and now they have a story to tell about what happens to those who turn to a life of crime.

Thursday, August 22, 2024

User Stories, Being Able To Do, and Philosophy

I just encountered something so delightful that I wanted to share it. As I thought about who else would enjoy this, I realized that it may just be me. I decided to share it here on the blog in hopes that I forget it, search for it, and find it again later. (It wouldn't be the first time that has happened.)

I guide students through a lot of user story analysis, including but not limited to games-related courses. Years ago, I noticed a tendency for them to write a user story statement like, "I want Mario to be able to jump." I am pretty sure I used to write them this way, too. At some point, it dawned on me that players don't want Mario to be able to jump: players want Mario to jump. Once I realized that, I saw the strangely passive "to be able to" in practically all of my students' stories. I've been on the lookout for this structure ever since, finding it akin to passive voice in prose: best to be eliminated.

This morning, I found myself reading a part of the Summa Theologiae as part of research into classical definitions of vice and virtue. In it, Aquinas tackles the question of whether a vice is worse than a vicious act. His response, in Sum I-II, 71, iii, co., includes the following.

For it is better to do well than to be able to do well, and in like manner, it is more blameworthy to do evil than to be able to do evil.

There you have it: a classical argument against the passive "to be able to" in user stories. 

Tuesday, August 20, 2024

Notes from Cal Newport's Deep Work

Some time ago, I had two people recommend Cal Newport's Deep Work to me in the space of one week. In fact, one of the two people assumed I had already read it. This was enough for me to put it on my reading list, and I got to it this past summer.

One of the delightful surprises early in the book is that Newport is a Computer Science Professor at Georgetown. Knowing nothing about him nor the book before getting into it, this was fun to come across. Although I haven't met him, I am happy that he's been able to find success in both technical research work and mass-market work like Deep Work. That said, I find myself wishing that Deep Work was written with more academic convention, including specific references at the points they are needed. Instead, references are given as notes in an appendix, but I don't know what that gains. By contrast, I am currently reading Edward Castronova's Life is a Game, which is very accessible but does not shy away from being specific about its citations. (More on that book another time.)

The premise of the book is that real value comes from deep work, the kind of work that requires focused attention and time to make progress. Newport pulls in a foundation from performance psychology to support this, pointing to Ericsson's deliberate practice as critical scholarship in that area. Deep work requires expertise and insight that is idiosyncratic and cannot be automated nor replaced. Newport acknowledges the value and role of shallow work as well, but he recommends establishing a shallow work budget so that it does not eat into deep work time. He recommends 30-50% as a reasonable budget, especially since the research indicates one cannot get more than four or so deep hours in a workday.

Newport cites Baumeister and Tierney's Willpower, which reports on the finding that people have a finite amount of willpower that is depleted as they fight desires throughout the day. They point to routines and rituals as methods of sustaining or automatizing willpower in the face of desire. This section of the book has stuck in my mind, and I find myself ruminating on willpower as a diminishing resource in my family life, my work life, and as a game designer.

I am intrigued by Newport's discussion of David Dewane's architectural conception of an office space that maximizes deep work potential, which he calls the Eudaimonia Machine. Details can be found once you search for those terms. The idea is that the space is a progression of depth, including intentional movement through inspirational spaces, from communal toward individual. I am not surprised at its monastic qualities, but I do not know if Dewane has discussed this connection or not.

Newport bookends his own work days with startup and shutdown routines. The last few work days, I have tried his startup routine: blocking out the day's hours, populating regions from my task list, updating the schedule to deal with unexpected twists in the day, and annotating blocks where deep work has happened. So far, so good. I've been doing informal estimations like this for years, and so my days have worked out pretty much as intended. Doing this deliberately has made me consider prioritization more explicitly than I would otherwise. 

What I have not done, and what I have never done well, is have a shutdown routine. Newport's involves a final email check, managing the task list, making rough plans for the next few days, and then stepping away until the next day. Deep Work gives me a name for a problem I am sure we all regularly face: the Zeigarnik effect. This states that incomplete tasks dominate our attention. They sure do. Even in the few days I've been trying the startup routine, I have had more than one case where I "check my email" only to find messages that then impose a drain on my attention. (I need to take my own medicine here. I regularly tell my students never to check email, only to process it.) This leads to my most significant Achilles Heel: a unified inbox. I decided decades ago to manage one inbox for all my messages so that I could always find what I needed. The problem is that personal and business messages both end up in the same place. When I'm looking for an update on a Kickstarter board game, I don't want to find a request to serve on a committee. The other side of the coin, though, is that if I'm looking for information about that board game and one of my students has a problem that I can mentor them through in two or three sentences, I don't mind helping them out. Unfortunately, there's no way to eat this cake and have it too. It is an experiment I wouldn't mind running, but changes in provider interfaces would mean there is no going back.

Several concepts in the book had me reflecting on the weirdness of higher education, particularly public higher education. (Newport is surprisingly mum about the problems of academia, something that must have taken great restraint.) One such concept is the Principle of Least Resistance: "In a business setting, without clear feedback on the impact of various behaviors to the bottom line, we tend toward behaviors that are easiest in the moment." The bottom line at a state university is so far removed from faculty's daily activity that it is rarely discussed. Newport also has a lot to say about the dangers of "network tools," which seems to mean any electronic communication medium but is especially focused on social media. Here, he describes the Any-Benefit Approach to Network Tool Selection, which argues for using any network tool if it will provide any benefit at all, regardless of cost. This is another trap where academia is particularly prone to capture and for similar reasons.

Regarding network tools, it is an oversight when he lumps blogging in with the likes of Facebook and Twitter. He treats blogging as if it is to build an audience, but there are many of us who do it for ourselves. Someone told me years ago, and I have found it to be true, that doing writing work in public encourages quality and clarity that could otherwise be illusory. (Indeed, it took me a few minutes to get that very sentence how I wanted it.)

Newport mentions Covey's The 4 Disciplines of Execution, a book that addresses how execution is harder than strategizing. The titular disciplines are: focus on the wildly important; act on lead measures; keep a compelling scoreboard; create a cadence of accountability. ("Lead measures" are in contrast to "lag measures." An example of the latter is customer satisfaction, while of the former, free samples given out. You control the lead measures directly and the lag measures follow.) It is no surprise that these align with agile software development practices, but I should keep this in mind should I end up in conversation with someone from the College of Business and we're looking for common ground.

The author's "Law of the Vital Few" is his spin on the 80/20 Rule or the Pareto Principle. His advice regarding it is surprisingly practical: if you are working toward a goal, and you're not in the effective 20%, then you should consider choosing a different goal. There's a business orientation here that may be valuable, but as with the discussion of network tools and blogging, it is also dangerously reductionist. It may just be my strange position as an academic, but it seems to me that one ought to consider one's intrinsic motivation, satisfaction, and sense of purpose in addition to measuring external factors. Some of my best academic works have zero or few citations and won't move the needle on any discussions, but I am a better person for having written them.

The last nugget I would like to share is Newport's heuristic for distinguishing between deep and shallow work. It surprised me that this came so late in the book. Yet, he presents it as a self-test, and I struggled to find the right answers, so there must be a wisdom to his placement. His heuristic (spoiler ahead) is, "How long would it take (in months) to train a smart recent college graduate with no specialized training in my field to complete this task?" Knowing that experts can only maintain about four hours a day of deep work, this heuristic can be useful in scheduling to ensure enough deep work gets done.

These are my notes and not a review, but everyone I've talked to about the book has asked, "Is it worth reading?" I think that if one is looking for tips on improving personal efficiency, then it is worth it. The book is breezy, and this comes with the necessary caveat that complex ideas are treated superficially: one has to recognize that to really understand, say, willpower as a diminishing resource, one would need to take a deeper dive into it. Newport's motivating principle is true: that deep work is necessary and valuable, and that one can learn to do it better. I think it would be worthwhile to combine a reading of Deep Work with something to remind ourselves that joy will never be found in the pursuit of success.

Tuesday, July 23, 2024

Mulberry Mead

It's time again for What is Paul Drinking? Today's notes are from my latest batch of mulberry mead, which I mentioned in my notes about making lattes. I think this is my second batch, with my first having been made in 2022. I found a bottle of 2022 in my cabinet, and if I've made other batches, they are lost to time. 

I put about a quart of mulberries into a saucepan with enough water to cover them. I cooked them a while to soften them, then gently muddled the berries, turning the water a beautiful purple. What I should have done (and what my wife recommended I do) is get as much juice out of the berries as possible, then just use that in primary. Instead, I had the idea that I wanted the whole berries in the fermentation. I put all the solids into a mesh bag and dropped it into my usual mix of three pounds of honey and D47 yeast.

Unfortunately, the bubbling action of the fermentation lifted the bag right up and out of the water. In retrospect, that is predictable. I would have needed to use some weights to keep the bag submerged. However, I only wanted to keep the fruit in the fermenter a few days, and if I submerged it, it would have to stay until racking. In short, I had made a problem for myself.

Next time, just smash the juice out of the berries and use that.

In any case, the result is a lovely color. It has a subtle berry flavor, which I understand to come from the fact that the fruit was added in primary. It's quite different from when I infuse a mead with fruit, which picks up the fruit flavor more intensely. It also came out quite dry. Sometimes that is what I want, and sometimes I add just a splash of simple syrup to the glass before drinking. That's much simpler than formal backsweetening with no danger of exploding bottles from restarted fermentation.



Friday, July 12, 2024

Summer course revisions 2024: CS222 Advanced Programming

I made a few significant structural changes to CS222 for the Fall semester. The course plan has just been uploaded, so feel free to read it for the implementation details. The motivation for all the changes was the same: reduce friction. The course has always had a lot of stuff going on in it, and students seem less able to manage this than they could in the past. For example, it used to be that I could explain triage grading such that most of the students understood it, but as students become more brainwashed into the LMS way of running a class, they become less able to conceive of alternatives.

I decided to use the same grading scheme in this class as I am trying in Game Programming. Each assignment will be graded on the scale Successful, Minor Revisions Needed, New Attempt Required, or Incomplete, following Bowman's case study. The EMRF approach that I tried last year did not work, and I am hopeful that this alternative alternative will patch some of the leaks. I considered breaking down the CS222 assignments into individual goals, as Bowman does in his math courses and as I have done in Game Programming, but I found it to be unnecessarily complicated to do so. Instead, I have taken each day's work and consolidated it into a single assignment with multiple graded parts. I hope that this, too, simplifies the students' experiences.

I am still using achievements, but I have changed how they are assigned and assessed. For many years, I have had an open submission policy, where students can complete achievements at any time, and their final grade is based on the quantity and quality submitted. This gave students one more thing to manage, and it was something that could not easily be represented in Canvas. My wishing that students didn't delegate or subjugate their planning to Canvas won't change the fact that they do. Hence, I'm just asking students to do three achievements during the semester. It will be like choosing an assignment from a menu. Since they are otherwise a normal kind of assignment, I don't need special policies for resubmission, either. Maintaining this parallel structure between achievements and assignments also made me remove the star system evaluations. Previously, students could claim one star through self evaluation, two through peer evaluation, and three through expert evaluation. I love the idea of having students review each other's work in this way, but since I don't have this kind of peer evaluation on other assignments, I have removed it here in the name of streamlining.

From the beginnings of CS222, I have used Google Docs to manage submissions so that students can see and comment on each other's work. I used to spend time in class doing more peer review in this way, but this got cut out as new "content" was added to the course. Google Docs stayed as a convenient way for me to see student work and especially for students to do the peer reviews required for achievements. Taking those away means there's no real good reason to make students go through the process of submitting through Google Docs. As students' general computing literacy has declined, I have had more and more trouble with students understanding how to use Google Docs and the browser according to the instructions. Now there's no reason besides tradition to keep it, so out goes Google Docs.

I still want to keep my course plans online and publicly available rather than having them stashed away on Canvas. However, my old approach to managing the course site as an SPA made it impossible for me to link directly to specific parts of a document. Somewhere between .htaccess configurations and shadow DOM, I could just not make it work. This was especially frustrating since this is so simple in vanilla HTML: just link to a named anchor. With the change in how I am assigning and evaluating work, I decided it was time to make this work. I have spent about two work days fighting with web development and finally ended up with the solution you can find on the course plan. I have kept lit-html and Web Components because of the powerful automation tools they provide: I can define the data of an achievement, for example, and use JavaScript and HTML templates to generate the code that displays it. I have stopped using the open-wc generators and npm. I looked into trying to use the open-wc generator and rollup without the SPA configuration, but it turns out that the instructions for doing this are not up to date: they produce a conflicting dependency error. Hence, I just went with a simple deployment solution that copies my source files and a minified JS lit-html library to the web server. Even though I already wrote about my frustrations with maintaining my Game Programming site, and how they led me to migrate the site to GitHub, I am thinking about revisiting that decision based on the work I've done to get the CS222 page working properly.
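To illustrate the data-to-markup idea, here is a minimal sketch in Python rather than the JavaScript and lit-html that the site actually uses. The achievement fields, the sample achievement, and the function names are all hypothetical; the point is only that each rendered block carries an id that a plain link can target.

```python
import html
import re


def slug(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def render_achievement(achievement: dict) -> str:
    """Render one achievement as a section whose id serves as a link target."""
    anchor = slug(achievement["title"])
    return (
        f'<section id="{anchor}">'
        f'<h3>{html.escape(achievement["title"])}</h3>'
        f'<p>{html.escape(achievement["description"])}</p>'
        f"</section>"
    )


# Hypothetical achievement data, not taken from the real course plan.
studious = {"title": "Studious", "description": "Read & summarize a chapter."}
markup = render_achievement(studious)
```

With markup like this in the page, a link ending in #studious would jump straight to that achievement, which is the vanilla HTML behavior the SPA setup made difficult.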

Friday, July 5, 2024

O teach me

I recently read Romeo & Juliet for the first time since the early 1990s. I was struck by this particular line by Romeo in Act 1, Scene 1:

    O teach me how I should forget to think

At the time, he is smitten by unrequited love, yearning for a woman who he has convinced himself will bring him complete joy. It inspired me to make this.

UPDATE: A friend's commentary on this idea was too brilliant not to pursue, so here are a few more for the album. Maybe I will make up some T-shirts next semester.



I wonder if one even needs Benny on there? It would probably work just as well without. Here's a set you can use for your own satirical ends.





Wednesday, July 3, 2024

Thoughts on Koenitz's "Understanding Interactive Digital Narrative"

Hartmut Koenitz's latest book, Understanding Interactive Digital Narrative, quickly establishes itself as being postmodern and political. I am grateful for his overt framing, although significant portions of the book contradict the tenets of moral relativism and subjective truth, as I will discuss below. Whether populism truly is a "cancer" that only leads to violence and trouble does not come up again in the text beyond the introduction.

A primary contribution of the book is Koenitz's System-Process-Product (SPP) model for analyzing interactive digital narrative (IDN). SPP recognizes that the system is created by the developers, that it is reified through a process of user interaction, and that this results in a product, which can be the discourse about the experience or a recording thereof. SPP includes a "triple hermeneutic" that users would bring to the experience, recognizing the interpretation of possibilities for interaction, the interpretation of instantiated narrative, and the reflection on prior traversals, which entails using memory from prior traversals. The explanation of SPP draws explicitly on familiar concepts from object-oriented software development, describing how IDN systems are instantiated through interaction as being like how objects are instantiated from classes. I remain surprised that this model would be considered revolutionary since it is exactly how any reasonable game developer would think of their work: design ideas are captured in code and assets; the player interacts with the dynamic system; and as a result of that experience, players can talk about it or share their playthrough.

Koenitz brings up the cautionary tale of Microsoft's Tay, the chatbot that was taken down after only 16 hours due to its absorbing and then repeating racist content. In a book that is otherwise about the boundless potential of IDN, the author here exhorts the reader that there must be protections in place to prevent IDN from exhibiting such behavior. This reveals a significant gap in his analytical model. SPP has no affordance to talk about morals and ethics outside of a participant's or scholar's subjective interpretations. The analytical framework in the text lacks the epistemological power to claim that any player activity is ethical or unethical. The claim that some interactions are universally unethical reveals that the author is using a different interpretive lens than the one he describes.

I appreciate his lengthy treatment of the narratology vs. ludology wars and its numerous references. I transitioned into games scholarship when this conversation was cooling. Koenitz claims that the ludologists' primary mistake was narrative fundamentalism: because they believed in only one kind of narrative, they misunderstood the narratologists, who had special knowledge of the avant-garde and the multiplicity of narrative forms. I am not conversant enough in this literature to support or critique his arguments, but the unyielding insistence that the opposition has no merit leaves me wanting to hear a bit more from the other side.

The book includes a discussion of the interpretation of Bandersnatch. He explains how interactors have created different mappings of this IDN's formal structure based on their experience, pointing out that none of them are "the structure of Bandersnatch" but rather are each "an interpretation of the structure of Bandersnatch." He also claims, however, that "unless the original design documentation is released, we cannot be sure, and therefore different interpretations of the underlying structure exist." 

Two things struck me about this claim. The first is that he presumes the existence of an authoritative and correct "original design documentation." This seems like the same fallacy that desires design bibles in games and BDUF in software development. My experience is that there may be some original design documentation but that the design-as-such is only definitively manifested in the system. Anything that specifies the possible player experiences at the fidelity matching actual player experience is homologous to the system itself. (Incidentally, one of the reasons most of my projects are released as open source is to allow the curious to study the actual system and not just interpretations of it.)

Second, there is a contradiction inherent in defending the recognition that Bandersnatch has a structure and that all interpretations are valid. He states that the differences in mappings "do not mean that any of these interpretations are wrong in absolute terms, but rather that we need to be aware of their epistemological status as post-factum interpretations." How can the interpretations not be wrong and yet the difference in interpretations be contingent upon the original design documentation not being released? That is, there is an implicit acknowledgement that there is an absolute and authoritative structure, and that these interpretations are approximations of it, such that if one had the former, some of the latter could be shown to be wrong. It is possible that a commitment to postmodernism requires one to admit the viability of demonstrably-wrong structural interpretations, but if that's the argument here, it's awfully subtle. If there is a difference between someone's structural analysis of Bandersnatch and its actual structure, then that means that it can be demonstrated with formal analysis or automated tests. I would call that interpretation of the structure "wrong." Aside from this engineering-design perspective, one can see the problem from the lens of constructivism: the interpretation yields a non-viable mental model because it makes incorrect observations about the world. It reminds me of Koster's point about Monopoly: you can use house rules all you want, but you have to acknowledge that you're playing a different game with the same pieces. If someone's model of Bandersnatch leads to contradictions against the actual thing, then it's either wrong or it's a model of something that doesn't exist.
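The claim that a mismatch can be demonstrated with automated tests can be sketched in a few lines of Python. The graph below is an invented toy structure, not the structure of Bandersnatch, and the function name is hypothetical; it only shows that comparing a proposed mapping against an actual system is a mechanical check.

```python
# Hypothetical toy structure: each node maps to the nodes reachable from it.
# The node names are invented and are not taken from Bandersnatch.
ACTUAL = {
    "start": {"a", "b"},
    "a": {"end"},
    "b": {"a"},
    "end": set(),
}

# A hypothetical interpretation that gets one transition wrong.
MODEL = {
    "start": {"a", "b"},
    "a": {"end"},
    "b": {"end"},
    "end": set(),
}


def contradictions(model: dict, actual: dict) -> tuple:
    """Return (edges the model claims that do not exist, real edges it misses)."""
    claimed = {(src, dst) for src, dsts in model.items() for dst in dsts}
    real = {(src, dst) for src, dsts in actual.items() for dst in dsts}
    return sorted(claimed - real), sorted(real - claimed)


invented, missed = contradictions(MODEL, ACTUAL)
# invented == [("b", "end")] and missed == [("b", "a")]
```

A mapping that produces two empty lists agrees with the system on every transition; anything else is a specific, checkable disagreement.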

Koenitz uses a cooking analogy to distinguish between the prescription of a recipe (a specification) and the description of food (a product of the experience). It's a reasonable metaphor, but here is where he also writes, "Far too long have we tried to learn how to cook from descriptions of finished meals." There is no referent for "we," but I don't consider myself included. When I decided I wanted to get better at making games, I didn't turn to descriptions of games: I turned to the writings of people who talk about why and how to make games. Indeed, it's hard to imagine how one could expect to get better at any art form by only looking at descriptions of experiences of that art form. Who is "we" then? He must mean "my community of IDN designers."

The penultimate section of the book provides advice on how to design IDNs. It is what any seasoned designer would expect, and it repeats what has been documented in countless books on video game design: specify goals, create prototypes of increasing fidelity, produce the software, and test. This is the conventional production process that has been talked about in games since at least Cerny's Method talk in 2002 and well before that in user-centered design. The approach is so well established that it allows scholars like me to interrogate it to determine where agile software development methods can improve it.

Reading Understanding Interactive Digital Narrative helped me understand both IDN and the community of IDN scholars. I learned some new ideas from it that will certainly be helpful in my thinking and writing, including narrative fundamentalism, narrative ambivalence, and the cognitive turn in narratology. I applaud Koenitz for his insistence that precise definitions of words like "story" and "narrative" are necessary and that lazy or colloquial use holds back progress. Indeed, I respect that he doesn't insist that people use his definitions necessarily, but that one has to define their terms in order to ensure that they are understood. I believe the SPP model will provide a useful starting point for learners who wish to analyze IDNs, including games, especially those learners who don't have a background in systems design. However, for my students who want to get better at writing for games, I will continue to recommend Bateman's collection.

Tuesday, July 2, 2024

Representing character damage through loss of skills and equipment

I was surprised to come across two recent tabletop RPGs that both eschew "hit points" as a means of representing character damage: Lester Burton's Grok?! and Runehammer's Crown and Skull. Neither cites the other nor any common inspiration, which makes me think that there's an interesting games history project hiding in here: is there a common ancestor or is it convergent evolution?

In Grok?!, the player has seven resource slots that can hold items. When a character suffers duress, there are a few possible outcomes. The player may choose to create, remove, or change one of their items, or they may take a condition that uses up a resource slot. These are intended to be temporary, but if a character has no more slots, then the character is incapacitated and the condition instead becomes a permanent trait. A player may also voluntarily add conditions to their character in order to roll additional dice after failing a check. Grok?! is clearly a story-focused game, using an elegant universal resolution system that invites creativity and narration.
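As a rough illustration of how the slot mechanic might look in software, here is a minimal Python sketch. It is based only on the summary above, not on the rulebook, and the class, field, and method names are hypothetical.

```python
from dataclasses import dataclass, field

SLOTS = 7  # the seven resource slots described above


@dataclass
class Character:
    slots: list = field(default_factory=list)   # items and temporary conditions
    traits: list = field(default_factory=list)  # permanent traits
    incapacitated: bool = False

    def take_condition(self, condition: str) -> None:
        """Fill a free slot, or incapacitate and make the condition permanent."""
        if len(self.slots) < SLOTS:
            self.slots.append(condition)
        else:
            self.incapacitated = True
            self.traits.append(condition)


# Hypothetical character carrying six items, leaving one free slot.
hero = Character(slots=["rope"] * 6)
hero.take_condition("winded")      # uses the last free slot
hero.take_condition("broken arm")  # no slots left: permanent trait
```

After the two calls, the character is incapacitated and "broken arm" sits in the permanent traits, while "winded" remains a temporary condition occupying a slot.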

Crown and Skull has an intricate point-based character-creation system in which players determine a character's skills and gear. Taking damage involves crossing off skills and gear, which is temporary, and sometimes destroying gear, which is permanent. Damage is classified by whether it targets skills, equipment, or both, and it is further classified by whether it is a random target or whether players choose. Runehammer describes this as an attrition system, and it's easy to see how it invites more interesting narration than "You lose five hit points." Crown and Skull is presented as a game that the players themselves get better at, learning more about it by playing it. Part of the challenge of the game is learning to create and manage a versatile, robust, survivable character.

I have played a lot of CRPGs, but I don't remember ever seeing a system like this, one where damage is exclusively represented by the temporary or permanent loss of gear or skills. Could such a design be adapted into a satisfying video game experience, or are these formal systems too strongly coupled with the improvisational storytelling of tabletop games?

Friday, June 14, 2024

The Endless Storm of Dagger Mountain: A short adventure game that is Powered by the Apocalypse

Introduction and background

Last night, I released a new game into the wild: The Endless Storm of Dagger Mountain. I submitted it to Crossroads Jam 2024, a statewide game jam sponsored by the Indiana Gamedevs community. 

This game scratches a creative itch that I've had for over two years: what happens when you apply Apocalypse World style rules in a digital game? I've had this as a component of a few different design explorations, none of which bore any fruit—sometimes because they weren't fun and sometimes because their scope exploded. In fact, in May, I started work on a project that was growing too large, and it included PbtA elements. By the last week of May, I had put that side project to rest. I decided that I could use Crossroads Jam as an excuse to isolate just this single design idea—digital PbtA—and package it up into a jam-sized game. Readers may be interested in looking at my previous exploration of tabletop PbtA, which took the form of Kapow! The Campy Superhero Role-Playing Game. I also wrote an essay comparing the math of PbtA and d20 systems.

I was a little disappointed that the theme "severe weather" won the polling on the Indy Gamedevs Discord. Since this is the first Crossroads Jam, it seemed like a great opportunity to highlight something positive about the state. You know, like corn. More seriously, there are a lot of great things about Indiana, including globally-recognized events like the Indianapolis 500. And corn. But I digress, and others preferred "severe weather." I had been wanting to explore some pulp fantasy writing a la Robert E. Howard's Conan stories, which I read a few years ago. This presented a good opportunity: a lone, stoic hero, making a long journey up to the top of a mountain where dark magic has brought about the destruction of the innocent.

Game and Narrative Design

Most of the writing is really just a first draft. Despite years of gamedev, I have done barely any game writing. It felt good to get my hands dirty and create enough content to carry the gameplay. I estimate I spent about 15 hours just writing content for a game that takes a few minutes to play. The writing was enjoyable, but especially as I got tired, I couldn't shake the feeling that much of the text was low stakes. Each dice check has at least three paths—succeed, succeed at a cost, and fail—and I tried to keep them equally interesting. I do not spend a lot of time with interactive fiction, but I quickly ran right into the same problem that any narrative designer has: with chance or with agency comes the loss of authorial control. There are a few specific scenes in the game that I would like to spend more time on, to make the story more compelling and to evoke Howard more strongly.
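To give a sense of how often each of the three paths comes up, here is a small Python sketch that enumerates the odds. It assumes the standard Apocalypse World resolution of 2d6 plus a modifier, with 10 or more as a success, 7 to 9 as a success at a cost, and 6 or less as a failure; whether Dagger Mountain uses exactly these dice and thresholds is an assumption, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def outcome_odds(modifier: int = 0) -> dict:
    """Exact odds of each outcome band for 2d6 + modifier."""
    counts = {"fail": 0, "succeed at a cost": 0, "succeed": 0}
    for first, second in product(range(1, 7), repeat=2):
        total = first + second + modifier
        if total >= 10:
            counts["succeed"] += 1
        elif total >= 7:
            counts["succeed at a cost"] += 1
        else:
            counts["fail"] += 1
    return {band: Fraction(count, 36) for band, count in counts.items()}


flat = outcome_odds(0)   # fail 5/12, succeed at a cost 5/12, succeed 1/6
plus_one = outcome_odds(1)  # fail 5/18, succeed at a cost 4/9, succeed 5/18
```

Under these assumptions, no band is rare at low modifiers, which is part of why the text for every branch has to carry its weight.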

I was not able to pull in all of the tabletop inspiration that I originally wanted. In tabletop games, I love the idea of the Countdown Clock or what Runehammer calls Timers. A simple timer that ticks down becomes a source of tension for the player and, in the system, it's another formal element to manipulate. In my first draft of Dagger Mountain, I had a timer that, if it ran out, would cause the game to end before the player could summit the mountain. This worked well as a penalty, especially when asking the player to choose between attribute reduction and advancing the timer. However, the game ended up being too short to make the timer meaningful. In fact, one of the problems that inspired me to think about removing the timer was the trouble of visually representing a "timer" as something with only two or three units. I could not reasonably balance it and give it a significant value: a timer with two clicks feels more like it's depleting some other resource rather than feeling like time.

The other common PbtA element that I didn't add was what Apocalypse World calls "reading a sitch." The idea here is that a player can spend an action trying to understand a situation, where success or partial success determines how many questions they can ask about it. Systematically, this is simple: I could have a preconstructed list of questions and answers, and these could provide lore and setting information. As I got into building the narrative, this felt like it would not have a good return on investment: every other decision produced changes in the world state, such as modifications to attributes, and I didn't want "reading the sitch" to be a wellspring of mechanical benefits. It would be relatively easy to add this into my software since it was in mind from the beginning, but it did not find its way into this project.

Speaking of attributes, I still have something of a pipe dream that one could make an RPG attribute system that is consistent with Thomistic philosophy. Consider that the legacy of Dungeons & Dragons presents a sort of dualism, that the mind and the body are separate. Yet, as confirmed by a recent conversation with a weight-lifting friend of mine, the two must work together: it's not clear that Strength (as physical might) and Wisdom (as willpower) are independent variables, for example. I couldn't find a way to distill a more Aristotelian view of the human person into three or four attributes, but reviewing Apocalypse World's attributes, I was reminded that they describe how one does something rather than what someone is. That's a great hook for future design work. In the meantime, for this project, I made a list of actions that I wanted the player to perform, knowing the genre and setting, and I categorized these into the three attributes that are in Dagger Mountain: bold, determined, and savvy. I admit that there are a few stretches in the game where penalties might be hard to classify in these ways, but I will be curious to hear what players think about them as a trio.

Technical Considerations

Inspired by Knights of San Francisco and the beauty of Dart, I started writing Dagger Mountain in Flutter. It's a beautiful way to write applications, but there's a significant difference between declarative and imperative UI programming, and I find that I stutter a bit when I hop between them. I set up the essential architecture and was enjoying myself until I tried to implement some UI features, specifically the scrolling list of text and buttons along with placeholder animations. There is a lot of typing required, animations were hard to debug, and it wasn't always obvious where my problems came from. I am sure there was a lot to learn from the endeavor, but I was on a deadline and wanted to get things up and running quickly. In transitioning back to the comfort of Godot Engine, I realized something: while Dart's asynchronous programming features are wonderfully expressive, there is a real power in GDScript's simple signal syntax. It is hard to get more terse than that, although it comes at the cost of not having explicit control over things like Futures. I returned to Godot Engine, re-creating everything I had written in Dart in very little time. With the design decisions made, I just had to interpret it in the new environment and type it up.

I spent too much time in Godot Engine adding dynamic font resizing. I knew I wanted the game to run comfortably on a desktop or mobile browser, and giving the player control of font size seemed the best way to do this. It required a lot of shenanigans with theme overrides, and as I added more visual elements such as the visible dice, it got more and more convoluted. Near the end of development, I just gutted this feature from the play experience and put a configuration on the main menu, which allowed me to just fiddle with the values in the main theme rather than deal with distributed theme overrides. What really irked me was when I started working on deployment, realizing that the cleanest solution for the player would be to use the browser's built-in font rendering and resizing... the way that a Flutter app would have done. Sigh.

Here are a few summary observations along these lines. Flutter is great for its static typing, robust asynchronous programming support, autoformatting, built-in browser font resizing support, spread operator, null safety, and most importantly, refactoring support. Godot Engine clearly wins on terseness of signals and tweens and the ability to rapidly build and test scenes independently of each other. To clarify that last point, I regularly decompose my Godot Engine programs into scenes that I can run and configure by themselves, confident then in how they will work when instantiated as part of a larger system. In Flutter, I wish I could easily say, "Spin up one of these widgets by itself and let me tinker with it," but I have not found anything that comes close to Godot Engine's rapid development support this way.

Incidentally, I did briefly consider other options than writing my own engine. I am intrigued particularly by ink, which I have never used. I was hesitant to jump into something with such a different syntax, although I am sure I could learn a lot from it, too. What killed the deal for me though was that it wasn't clear to me that I could easily plug in the PbtA aspects that I wanted. I discovered a Godot Engine integration, so perhaps I will investigate that later this summer. It wasn't until my family was testing the game that one of them mentioned Dialogic, which I haven't used since Godot 3.x. I haven't looked at it to see if it could have been modded for my purpose. However, writing for Dagger Mountain made me appreciate why narrative designers need better tools than just piles of scripts and a notebook sketch.

Dagger Mountain is my first released game that uses an event queue to isolate the game rules from the interface. I have tinkered with this pattern in several abandoned projects. Two summers ago, I spent a lot of time studying egamebook and its architecture, and I learned a lot from it even though that particular summer project was never released. 

My approach separates the software into three parts. The module is the content of the adventure itself, the story of Dagger Mountain. Each scene in Dagger Mountain is a GDScript file that is given a reference to an Adventure object. The next part is the rules engine, which is manifest in an Adventure object. The module tells it to do things like show text, modify attributes, or present a series of choices to the player. Internally, the rules engine generates events to correspond to these interactions, posting them to the event queue. The final part is the presentation layer, which subscribes to the event queue. It dequeues events, processes them, and then notifies the rules engine when it is complete. The code is all free, so feel free to look at the prelude scene for an example of how this works.
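The three-part separation can be sketched as follows. This is a minimal Python and asyncio approximation, not the game's GDScript; the Event class, the method names, the starting attribute values, and the scene text are all hypothetical stand-ins.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Event:
    kind: str             # e.g. "text" or "attribute"
    payload: dict
    done: asyncio.Future  # resolved by the presentation layer when finished


class Adventure:
    """Rules engine: turns module requests into events on a queue."""

    def __init__(self):
        self.queue = asyncio.Queue()
        # Hypothetical starting values for the three attributes.
        self.attributes = {"bold": 1, "determined": 1, "savvy": 1}

    async def _post(self, kind, **payload):
        done = asyncio.get_running_loop().create_future()
        await self.queue.put(Event(kind, payload, done))
        await done  # wait until the presentation layer reports completion

    async def show_text(self, text):
        await self._post("text", text=text)

    async def modify_attribute(self, name, delta):
        self.attributes[name] += delta
        await self._post("attribute", name=name, value=self.attributes[name])


async def presenter(adventure, log):
    """Presentation layer: dequeue events, 'render' them, notify the engine."""
    while True:
        event = await adventure.queue.get()
        log.append((event.kind, event.payload))
        event.done.set_result(None)


async def prelude(adventure):
    """Module layer: a hypothetical scene that only talks to the Adventure."""
    await adventure.show_text("You begin to climb.")
    await adventure.modify_attribute("determined", -1)


async def run_demo():
    adventure, log = Adventure(), []
    task = asyncio.create_task(presenter(adventure, log))
    await prelude(adventure)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log, adventure.attributes
```

Running asyncio.run(run_demo()) yields a log with one text event followed by one attribute event. The scene never touches the presenter directly, which is the isolation the event queue is meant to provide.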

I decided early in the project that I would repurpose GDScript as my narrative scripting language rather than create an independent data format that would be interpreted. The primary reason for this decision was the pressures of time: GDScript is already a scripting language, so using its support for functions and conditionals would be faster than writing my own. This is true, but I hadn't considered all of the costs at the time. The game runs through function calls in the module layer, which is nice for terseness but actually makes it hard to test in a modular fashion. I am sure that if I had used TDD, I would have had a more testable architecture. I would much rather have test coverage of the whole state space of the game; instead, I have to hope that my manual testing was adequate.

I did add integration tests near the end because of the need to await basically every call in the module layer. Missing a single instance will break the player experience. I wrote a test that reads through the module layer and looks for cases where await is missing. It took a little time to get the test working, but it immediately found a case that I had missed, so that was worthwhile.
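A check of this kind might look something like the following Python sketch, which scans script text line by line. It is not the actual test from the project: it assumes, hypothetically, that module scripts reach the engine through a variable named adventure, and the sample scene is invented.

```python
import re

# Assumption: module scripts call the engine through a variable named `adventure`.
CALL = re.compile(r"\badventure\.\w+\(")
AWAITED = re.compile(r"\bawait\s+adventure\.\w+\(")


def find_missing_awaits(source: str) -> list[int]:
    """Return 1-based line numbers with an engine call that is not awaited."""
    missing = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Drop GDScript comments (naive: a '#' inside a string is also cut).
        code = line.split("#", 1)[0]
        if CALL.search(code) and not AWAITED.search(code):
            missing.append(number)
    return missing


# Hypothetical scene script with one call that forgot its await.
SCENE = """\
func run(adventure):
    await adventure.show_text("You begin to climb.")
    adventure.modify_attribute("bold", -1)
    # adventure.show_text("commented out")
"""

print(find_missing_awaits(SCENE))  # → [3]
```

A line-based scan like this is crude, but for a small jam-sized codebase it is cheap to write and catches exactly the mistake that would otherwise only show up during play.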

Conclusions

I enjoyed building The Endless Storm of Dagger Mountain, and I hope you enjoy playing it. I think I will go and tweak some of that text with this morning's remaining coffee. Despite its small scope and shortcomings, I feel good about having built it. Not only does it explore digital PbtA in a way that I've been imagining for a few years, it also gave me an opportunity to do some creative writing and build more empathy for narrative designers. 

Regarding digital PbtA, I think the jury is still out, since for every promise it has, it comes with a dramatic increase in content creation costs. For interactive fiction, using PbtA resolution requires writing an enormous amount of text, much of which will never be seen by players. It is, of course, not the same feeling as the give and take of a tabletop RPG. However, I can see opportunities for using this resolution system if there were more supporting systems. In a larger game, for example, one could put back in "reading the sitch" style actions that give clues to puzzles. I prefer losing attribute points over the abstraction of hit points, particularly for a narrative-focused game, but I think this would benefit from more explicit representation. Gaining statuses like "confused" or "twisted ankle" would help carry the narrative forward, but then these would be most meaningful if worked into the other systems or stories of the game. All that being said, I appreciate how the PbtA elements feel more like a description of a whole human person than do hit points, armor classes, and the six classic attributes.

Thanks for reading. Let me know what you think of the game!