Paul Gestwicki's Blog: 2024

Monday, December 30, 2024

Repo Deleter: A utility to batch-delete repositories from GitHub organizations

TL;DR: I created a tool to help batch-delete repositories from GitHub organizations. You can find the source repository at https://github.com/doctor-g/repo-deleter-flutter.

I have a few GitHub organizations that I re-use every semester. I set them up through an academic account years ago. At the start of the semester, I add all my current students, and they push their work to repositories within this organization. This way, not only can I easily access students' work, they can also help each other out. For example, we can do peer code reviews in class across organizations without requiring anyone to use public repositories. I can also share all my sample code for the semester in the organization, and only those in the organization can get to it.

The downside to this approach is that the organizations require significant cleanup after a semester ends. Although I always instruct students how to move their work from the class organization into their own accounts, there are inevitably dozens of repositories left unattended. Deleting repositories manually through GitHub's web interface is mindless and tedious. There are a few online tools that claim to support batch-deletion of repositories, but I never had great luck with them.

After selecting a repository, going to its settings, scrolling to the bottom, selecting the delete option, and confirming that you want to delete the repository, you then also get to type in its name for super extra confirmation. Doing it once is not bad. Doing it fifty times is awful.

To make my life a little easier, two years ago, I created a little command-line tool to manage the process. I created it in Dart using the github package, which wraps GitHub's Web API. This little tool required you to go into the source code to modify the organization name and any special rules about which repositories to list. For example, I have had semesters where students had to name their projects in a pattern "PX-username" where X is the project number and username is a BSU username. The tool then had two different paths, which I would comment out alternately: the first printed the names of the repositories that it would delete, and the other would delete those repositories. It was not a great utility, and it needed manual cleanup for the repositories that didn't follow the patterns, but it did save me some manual work on GitHub's Web interface.
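The pattern-based selection and the two alternating paths (list versus delete) could be folded into one function with a dry-run flag. What follows is a minimal Python sketch rather than the original Dart. The names `select_repos` and `run`, the regular expression, and the sample repository names are all assumptions for illustration, and the actual GitHub API call is left out and replaced by a plain callable.

```python
import re

# Hypothetical pattern for names like "P2-jdoe": project number, then username.
PROJECT_PATTERN = re.compile(r"^P(\d+)-([A-Za-z0-9]+)$")


def select_repos(names, pattern=PROJECT_PATTERN):
    """Split repository names into those matching the pattern and the rest."""
    matched = [n for n in names if pattern.match(n)]
    unmatched = [n for n in names if not pattern.match(n)]
    return matched, unmatched


def run(names, delete, dry_run=True):
    """List the repositories that would be deleted, or actually delete them.

    `delete` is any callable that takes a repository name. In a real tool it
    would wrap the GitHub API call; here it is a stand-in.
    """
    matched, unmatched = select_repos(names)
    for name in matched:
        if dry_run:
            print(f"would delete: {name}")
        else:
            delete(name)
    return matched, unmatched


# Usage: record "deletions" in a list instead of touching GitHub.
deleted = []
matched, leftovers = run(
    ["P1-asmith", "P2-bjones", "final-project", "sample-code"],
    delete=deleted.append,
    dry_run=False,
)
```

The names that do not match are returned rather than dropped, which corresponds to the manual cleanup still needed for repositories that did not follow the pattern.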

After a couple of semesters of dealing with that tool, I decided it was time to make something better, and so today I released Repo Deleter at https://github.com/doctor-g/repo-deleter-flutter. This new version includes a graphical user-interface powered by Flutter. Like its predecessor, it requires using a GitHub personal access token with the appropriate permissions; the details for this are given in the project README file. With the proper credentials in place, Repo Deleter allows you to select one of your GitHub organizations. Then it shows you all of the organization's repositories, both public and private. The user can select any number of these, and then, with the click of a button, delete them.

Repo Deleter screenshot (student names blurred out)

In addition to solving a proximal problem, there are two technical aspects to this project that I found rewarding. The first and most important one is that this application uses the bloc pattern. I mentioned my experiments with bloc as part of my tinkering with Dart and Flutter for creating tabletop-inspired videogames. That work is hidden away in a handful of private repositories, and because nothing became of them, it was hard to assess my own understanding of the pattern. I used bloc for the Repo Deleter as well, and it felt quite comfortable. I wonder how a bloc expert would critique the particular states and events that I used, but as a proof of concept, it definitely works. I suppose the proof of the pudding may be in six months when I have to open the project again and inevitably want to add a feature or two. Will I be able to read and make sense out of the code? Time will tell.

The less important but still interesting aspect of Repo Deleter is that it's the first place where I used a formal logging framework in Flutter. It is not fancy: it's just the stock logging package, and I'm only echoing logs to a print command. Still, it eliminates the compiler warnings I had from the handful of print statements I had peppered in as ad hoc debugging aids.

One thing I would like to have done, but did not, was to have developed it via TDD. My early prototypes used the github package libraries throughout the application, with no adapter layers. Isolating the data layer from the logic layer would have facilitated testing layers without making actual network requests. I had idle hopes of using this as an example for my students, but in the end, I made the decision to just build on a working prototype rather than engineer something more robust.
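To make the idea of an adapter layer concrete, here is a minimal Python sketch of what isolating the data layer might look like. It is not the Repo Deleter code: `RepoGateway`, `FakeRepoGateway`, and `delete_selected` are hypothetical names, and a real adapter would wrap the github package instead of an in-memory dictionary.

```python
from typing import Protocol


class RepoGateway(Protocol):
    """The only interface the application logic would know about."""

    def list_repos(self, org: str) -> list[str]: ...

    def delete_repo(self, org: str, name: str) -> None: ...


class FakeRepoGateway:
    """In-memory stand-in used by tests; makes no network requests."""

    def __init__(self, repos):
        self._repos = {org: list(names) for org, names in repos.items()}

    def list_repos(self, org):
        return list(self._repos.get(org, []))

    def delete_repo(self, org, name):
        self._repos[org].remove(name)


def delete_selected(gateway: RepoGateway, org: str, selected: list[str]) -> list[str]:
    """Delete the selected repositories and return what remains."""
    for name in selected:
        gateway.delete_repo(org, name)
    return gateway.list_repos(org)


# Usage: exercise the logic against the fake.
fake = FakeRepoGateway({"cs222": ["P1-asmith", "P2-bjones", "sample-code"]})
remaining = delete_selected(fake, "cs222", ["P1-asmith", "P2-bjones"])
```

With this shape, a test-driven approach could drive out `delete_selected` first and leave the network-backed gateway for later.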

Right now, the repository only has Linux platform support, but it's easy enough to add more using the Flutter tools. I simply run it from Android Studio because it's easy to set an environment variable for a specific run configuration.

Tuesday, December 24, 2024

Experimenting with software architectures for video games inspired by tabletop roleplaying games

I have been tinkering the last several months with a videogame prototype inspired by some of the tabletop roleplaying games that my boys and I have been playing. Similar to how my game The Endless Storm of Dagger Mountain explored PbtA mechanisms, I've wondered about the strengths and weaknesses of interpreting Forged in the Dark systems into a text-based videogame. Last week, once I put away most of the work of the Fall semester, I was able to dive more deeply into work on a prototype. I felt really good about it until a few days ago, when I came to doubt—not for the first time—some decisions I had made in the software architecture. So, in this, my sixth December blog post, I want to unpack some of the considerations that I have put into these efforts so that I might stop programming in circles.

I decided to use Dart and Flutter for the game. I teach with Dart and Flutter in CS222 because I legitimately enjoy the technology stack. I am competent with them but would not call myself an expert. I have only built two public systems with these tools: my Thunderstone Quest Randomizer and a little timer utility to help with Promotion and Tenure Committee meetings. The former is much larger than the latter, and if I were to build it again, I would do it differently, but I keep maintaining it for myself and other fans of the card game.

I appreciate Dart's static typing, named parameters, pattern matching, and sealed classes, and Flutter's declarative approach can simplify otherwise complex UI logic. Something else that draws me toward Dart and Flutter, besides the elegance of the language and framework, is the inspirational work of Filip Hracek. His Knights of San Francisco is similar to some of the experimentation I have been doing, and his writings about Flutter's performance and the ethics of software design are interesting and insightful. I spent most of a summer working through his open source egamebook repository, trying to understand how a serious Dart programmer uses the language to accomplish his game design goals.

However, the choice to use Dart and Flutter over Godot Engine is never fully settled in my heart of hearts. Whereas Dart is dreamy for game logic, Godot Engine makes it dead simple to create juicy bits of design. Its AnimationPlayer is brilliant for little effects, whereas setting up an AnimatedBuilder in Flutter takes a whole lot of typing. Godot's node-based approach means that individual parts of the program can easily be run in isolation and tested, and tool scripts allow customization of the editor itself. Unfortunately, GDScript has no refactoring support, and this is a significant impediment to a test-driven approach: changing my mind about a name or a design choice in GDScript has nasty rippling effects. Type hints in GDScript are invaluable, but they are no replacement for real static typing. Also, creating simple data structures in GDScript is much more arduous than in Dart. All this is to say that when I'm dealing with game logic in GDScript, I find myself thinking, "This would be easier in Dart," and when I'm working on simple UI tweaks in Flutter, I think, "This would be easier in Godot Engine." I know that there's no silver bullet, yet I cannot silence the little fear that maybe I chose the wrong environment for this project.

State management is at the heart of any game software. The official Flutter documentation explains the basics, and the list of advanced options makes it clear that there is not one right way. I have long been intrigued by Bloc and decided to try using it as a state management solution for my experimentation. I spent a lot of time the past several weeks reading the official tutorials, and I believe I have a good sense of the system now. Crucial to Bloc is a separation of concerns: Flutter widgets provide a humble view of the UI state, which is managed in a bloc (business logic component), and this is separate yet from the domain layer. For Internet-connected apps, the domain layer involves a repository layer, but for my purposes, it was simple enough to roll these together. For my first Bloc-powered prototype, I followed the tutorials' approach and used equatable to generate some of the boilerplate required. Searching the Web reminded me of freezed, and once I understood how Bloc and equatable worked together, I happily switched to freezed for its excellent code generation support. Using the bloc and freezed snippets plugins for Android Studio is practically a necessity here. Once my experimental coding was done, I felt like I could move forward with a more rigorous TDD approach, since now I could think about the features separately from the underlying architecture. I was inspired as well by Dave Farley's commentary about how a layered approach to unit tests means that developers can change their minds about implementation strategies without breaking all of their tests. Knowing that I would continue to change my mind as I explored the design space, I moved forward.

One of my early experiments explored whether I might just consider the whole game to be "business logic" that belongs in the bloc. That is, I considered cutting out the separate domain layer and putting all the game logic in the bloc. This was of limited viability as I quickly ran into two problems. One was that I found myself having to put game logic in the Flutter widgets since they could not simply read UI state from the bloc. This was clearly counter to the spirit of the architecture. The other problem came up when dealing with threat rolls. In the Deep Cuts rules expansion to Blades in the Dark, players roll dice and assign them to consequences, which are negative effects like taking damage or losing items. Assigning dice to consequences mitigates their impact. It struck me that assigning dice was purely UI state and not game state. That is, a player might experiment with different assignments of dice to consequences, but nothing in the game domain model actually changes until those arrangements are committed.

Armed with this realization, I extracted the game rules into their own module, and I gave this module its own immutable state. The state could be modified by a few public methods that were called by either the bloc or my unit tests. For example, the method commitDice took the assignment of dice to consequences and computed the resulting change in the game world state. This also let me separate that state from the widgets entirely: whereas I had been sending the world state to the Flutter widgets, now I could add a layer of abstraction related to UI state. For example, rather than sending the game world state to the view from the bloc, I could send only those details that mattered for the state, such as which buttons were enabled, or what text should be shown in a label. This meant I could have tests on the bloc and trust that a humble view would work as anticipated.
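The split between draft dice assignments (UI state), the committed world (domain state), and what the view receives can be sketched roughly as follows. This is Python rather than the prototype's Dart, and almost everything in it is an assumption: the `Consequence`, `WorldState`, and `ViewState` types, the `to_view` helper, and especially the rule that a die of 4 or more cancels a consequence, which is made up for illustration and is not taken from Deep Cuts.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Consequence:
    name: str
    harm: int


@dataclass(frozen=True)
class WorldState:
    """Immutable domain state; only commit_dice produces a new one."""
    harm: int = 0


@dataclass(frozen=True)
class ViewState:
    """Only the details the widgets need, not the whole world."""
    commit_enabled: bool
    harm_label: str


def commit_dice(world, consequences, assignment):
    """Return a new WorldState after applying unmitigated consequences.

    `assignment` maps a consequence name to the die placed on it.
    Made-up rule for illustration: a die of 4 or more cancels the consequence.
    """
    total = 0
    for consequence in consequences:
        die = assignment.get(consequence.name)
        if die is None or die < 4:
            total += consequence.harm
    return replace(world, harm=world.harm + total)


def to_view(world, draft):
    """Derive UI state from the world and the uncommitted draft."""
    return ViewState(commit_enabled=bool(draft), harm_label=f"Harm: {world.harm}")


world = WorldState()
consequences = [Consequence("burned", 2), Consequence("lost rope", 1)]

# UI state: the player tries one arrangement, then another.
# The world is untouched until the arrangement is committed.
draft = {"burned": 3}
draft = {"burned": 5}

before_view = to_view(world, draft)
after = commit_dice(world, consequences, draft)
```

Because `commit_dice` is a plain function over immutable values, it can be called from a bloc or directly from a unit test, and tests on `to_view` can stand in for tests on the widgets.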

My pleasure at this transition made it even more disheartening when, earlier this week, I sat down to add a new feature and realized I had programmed myself into a corner. After assigning dice to consequences and before their effects are committed to the game world, a player can also opt to "push themselves" to mitigate consequences. This results in another dice roll whose outcome determines how much stress the pushing causes to the character. It means that between the committing of dice assignments and the final changes to the state is another step in which players might push themselves to alter the outcomes. However, this means that the changes to the world state might be coming from unmitigated consequences, dice-assignment-based mitigation, or pushing-based mitigation. The game world simply needs to change, but a good player experience in the UI should distinguish among these.

A fair criticism at this point would be that I should have foreseen that pushing would require a more robust handling of actions and consequences. In fact, I was aware of this, but I was also trying to push the limits of narrow slicing and Farley-style TDD/BDD combined with emergent architecture. I wanted to complete a well-factored feature (in this case, dice assignment) before increasing the complexity by adding a new feature. Despite my efforts, I can see now that revising the core action resolution system will have significant ripple effects on my test layers.

Just before exploring the pushing mechanism, I had stubbed in an approach for dealing with the outcome of progress clock expiration. I needed to attach arbitrary game effects to a clock, and so I sketched in a Command pattern. In particular, I encapsulated the idea that the main clock would end the game by creating an EndGameEffect and attaching that to the clock. I used freezed for the Command objects to facilitate future serialization. With this design pattern fresh in my mind, as I faced the bigger problem of state management, I found myself thinking I should be queuing game state change events rather than just making world changes. This would work, but it also made me realize that all I really wanted was to give a command to the world like "mark two stress on the character and reduce the effect of this consequence." That sounds like a couple of method calls to me.
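For readers unfamiliar with the Command pattern, attaching an effect to a clock might look something like the Python sketch below. Only the name EndGameEffect comes from the prototype as described; the `Clock` class, its `tick` method, and the dictionary used as a world are assumptions made to keep the example short, and frozen dataclasses stand in for freezed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EndGameEffect:
    """A command object: data describing a change, plus how to apply it."""
    reason: str

    def apply(self, world):
        # Return a new world rather than mutating the old one.
        return {**world, "game_over": True, "reason": self.reason}


@dataclass
class Clock:
    """A progress clock that fires its attached effects when it fills."""
    segments: int
    effects: list = field(default_factory=list)
    filled: int = 0

    def tick(self, world):
        if self.filled >= self.segments:
            return world  # already expired; effects fire only once
        self.filled += 1
        if self.filled == self.segments:
            for effect in self.effects:
                world = effect.apply(world)
        return world


# Usage: a two-segment clock that ends the game when it fills.
clock = Clock(segments=2, effects=[EndGameEffect("the main clock filled")])
world = {"game_over": False}
after_first = clock.tick(world)
after_second = clock.tick(after_first)
```

The clock knows nothing about what its effects do, which is the appeal of the pattern. The cost is an extra layer of objects where a direct method call would otherwise serve.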

Casey Yano of MegaCrit (Slay the Spire) reflected on his company's evaluation of Godot Engine following the colossal leadership failures at Unity. The sample code he shares uses a combination of a stateful model with asynchronous invocations: await FighterCmd.GainHp(owner, 2, owner). Clearly, he's going through a presentation layer that implements all the fundamental game verbs as asynchronous calls, giving these methods the responsibility to both change the model and display the state change to the user. By contrast, Flutter's declarative approach leans toward having the UI detect a change to the model and then animate the feedback. The latter gives a clear separation of layers that facilitates testing. In practice, though, the game's UI and the game's logic are tightly coupled, and now the code for a feature like "update the health bar when taking damage" is split into disparate places.

In The Endless Storm of Dagger Mountain, which was written in Godot Engine, I managed the state and UI as in Yano's example, writing code like await adventure.show_text('It was a dark and stormy night...'). The use of await makes the call a coroutine, but Godot has no other syntactic indication that a function should be called this way. This means that forgetting a single await will break a chain of intended asynchronous calls, and that's exactly what led to a post-jam patch for that project. I didn't have exhaustive test coverage, nor did I prioritize running through all paths of the game. The result was that a missing await call made at least one of the paths completely lock up for the players. To me, this reflects a weakness of the GDScript language design; by contrast, Dart's use of async, await, and futures makes it clear at compile time which invocations are asynchronous and which are not. (Incidentally, Yano is using C# instead of GDScript. I did experiment with Godot's C# bindings, but a few things held me back from using them: many of the strengths of GDScript, such as elegant signal management, are lost in C#; there is a lot more boilerplate required; there is no Web export for Godot 4.x when using C#; and Rider is so much better than the alternatives, but because it is justifiably commercial, it would mean losing money and time to my experiments.)

I had hoped that by this point in my prototyping, I would have a minimal interactive system to which I could focus on adding content and visual flourish. Instead, I have several abandoned experimental architectures. This narrative has been my attempt to explain how I got here. I have learned more about some aspects of Flutter and Dart, but I am also holding two paradoxical ideas in my head: Fred Brooks' observation that you should build a system to throw away because you're going to anyway, and the knowledge that the last 10% of a project takes another 90% of the effort. That is, any understanding I claim to have is on a sandy foundation if the project itself has not shipped. Dagger Mountain may have had critical post-launch patches, but at least I understand exactly why. Whether any particular bloc-like or asynchrony-based approach would be better for this other project is still uncertain. I continue to second-guess myself, but I am also hopeful that having written this, I can return to prototyping after the Christmas break with a fresh perspective.

Monday, December 23, 2024

Happy Camper: A December 2024 FamJam Game

TL;DR: Check out our new game, Happy Camper.

On Saturday, my eldest son led the family in a one-day fam jam as part of fulfilling a Scouting merit badge. We had attended the Indy Indies 2024 Showcase the previous night and played the eight games featured there. Playing Hardcore Cottagecore in particular got the boys talking about wanting to make a single-stick shooter. The older boys had played Vampire Survivors, but the others really just used Hardcore Cottagecore as their genre example. We laid out responsibilities and got to work a little after 8:00AM. We wrapped up work just before 5:00PM, giving us enough time to talk about our experience over dinner and then get out to see Christmas Carol at Muncie Civic Theatre.

The result of our work is Happy Camper, which you can play in the browser as long as you have a keyboard and mouse. The game is free software and you can browse the source code on GitHub.

I told my friend a little about our experience yesterday, and he had some questions about what technology we used to create the game. To that end, here are some explanations and links. These are all no-cost, free, open source tools.

We used the Godot game engine to build the game, including its GDScript programming language.
The art was made using Piskel, which is perfect for the pixel art aesthetic.
The music was created using LMMS.
The sound effects were recorded using Audacity.

My wife commented on how much smoother these jams go now that everyone has more experience using these various tools. There are two things that I myself need to remember from the experience. One is that I need to talk to the younger boys about how to think like a musician when approaching LMMS. Both have a tendency to try to "make it work" rather than trying to model how composition is approached, doing things like aligning notes to beats and measures. Admittedly, the piano roll interface in LMMS is less clear here than staff, and so maybe I need to look into showing them something like MuseScore or Rosegarden, both of which give you access to a traditional notation editor.

The other observation I had was with respect to communication, internally and through the medium of game design. My second son took the task of designing a series of weapons that would work well together. He had a list that he considered finished, but I encouraged him to write them up in a way that they could be used as a specification, together with illustrations of how they would work. He did this, but he still described all seven weapons in half a small sketch book page, cramming them all together and including indecipherable drawings of the design intention. We talked briefly about how the task was not merely to inscribe his ideas onto a page, but to do so in a way that invited others to comment, edit, and learn from them. That is, there had to be more room for annotation, more space for people to read the diagrams together. He had succeeded at the "invention" part but was weak on the "communication" part (see Cockburn's argument that software development is a cooperative game of invention and communication). It's all part of the development of teamwork and game design, and I'm glad we had a chance to talk about it. I would like to have an opportunity soon to give him another similar task and see if he can apply our conversation.

This relates to a similar story from later in the day, when he and his elder brother were trying to figure out what to work on before we shipped the game. They seemed blind to the fact that, in its current state, the game was unlearnable and not fun to anyone. As makers of the game, they could play for about ten seconds, and there was no scaffolding for anyone who didn't know all the implementation details. I pushed them on this point, that unless we were making the game only for ourselves, we had to think about the perspective of new users—people who didn't know what the enemies or weapons looked like, where they would come from, or how these systems worked together. Giving them that charge, I left them for about an hour. When they pushed their changes, the game was much more enjoyable, with better balance and escalation without needing massive changes to the implementation. Of course, if we were not in a one-day jam, we could have done even more work here, but within our constraints, I think they did a great job, and I told them so. It was only later that I realized that this was in the same class of feedback as I had given my son earlier: to recognize that "done" needs to be considered from the perspective of the consumer, whether that is the reader of a design document or the player of a game.

It had been almost a year since our last Fam Jam. Some of us will certainly participate in Global Game Jam in January, but I hope it's not another year before we get the whole family involved. I'm not sure what will happen to our Fam Jam tradition once the boys start leaving the house.

Thursday, December 12, 2024

Reflecting on CS315, Fall 2024 Edition

As described in my course revision post in June, the overall structure of CS315 Game Programming was unchanged from previous semesters: half the semester was spent on weekly projects designed to build skills and confidence, and half the semester was spent on larger projects.

The most significant change was in how those weekly assignments were evaluated. The past several years, I have used checklist-based evaluation, but I was hoping to find a fix for the problem of students doing the checklists wrong. This takes something simple and makes it into more work for me than if it was just a point-based rubric. Unfortunately, the strategy I used did not make things any simpler. Instead of checklists, I gave students a list of the criteria that needed to be met in order to be satisfactory. Their work then was assessed as Satisfactory, Needs Minor Revision (fix within 48 hours), or New Attempt Required. New attempts could be made at the rate of one per week, as I've done for years in most of my non-studio courses. I ran into a bit of the same problem as I wrote about yesterday, where Canvas' "Complete/Incomplete" assessment combined with no-credit assignments leads to a bad user experience, but it was not among the dominant frustrations. Those frustrations were two: students not submitting satisfactory work, and students not submitting work.

The first of those is the most disconcerting. As with checklist-based grading, I gave the students the precise criteria on which a submission would be graded. All they had to do was to meet those, and most of them did. Sometimes it took minor revisions or a new attempt or two, but these were no big deal: handling and correcting misconceptions is exactly what the system is supposed to do. The real problem came from students who submitted things that were wrong multiple times after I had told them what was wrong. In a strict reading of the evaluation scheme, this means the work was still simply unsatisfactory, whereas in other schemes (including checklist-based) they might have gotten a D or C for the work. I am still torn on this issue: was the system unfair to students of lower ability or was it the only fair thing to do with them? Put another way, is it better to give a student a C when they still have serious misunderstandings, or is it better to clearly tell them that they should not advance until they understand it? I don't interpret any of the criteria I gave as strictly "A"-level. That is, it did not require excellence to meet those criteria. What it required was rigor.

The other problem, of students not resubmitting work that needed to be resubmitted, seems unrelated to the evaluation scheme chosen. Speaking with professors across campus and institutions, this seems to be part of a generational wave of challenges. I have a few hypotheses about root causes, but the point of this blog post is not to opine on that topic.

Some of my early-semester assignments take the form of multi-week projects. For example, one set of assignments involves creating an Angry Birds clone. It is submitted as a series of three assignments with increasing complexity, and the complexity is scaffolded so that someone who has never made a game before can follow along. I had a student in the class this semester who fell behind, and then he wondered if he could just submit the final iteration of that three-week project as long as it showed mastery of each week's content. I ended up declining the request. One of my reasons is that the assignments double as a sort of participation credit. It makes me wonder though if it's worth my separating these things. For example, something I've done in other courses in the past is make it so that the final iteration's grade supersedes earlier ones if it is higher.
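The rule that a final iteration's grade replaces earlier ones when it is higher is simple to state as code. This Python sketch assumes numeric grades on a common scale, which is not how the satisfactory/unsatisfactory scheme above works, and the function name `effective_grades` is hypothetical.

```python
def effective_grades(grades):
    """Raise each earlier grade to the final iteration's grade if that is higher.

    `grades` is the list of iteration grades in order; the last entry is the
    final iteration.
    """
    final = grades[-1]
    return [max(grade, final) for grade in grades[:-1]] + [final]


# A student who missed week two but finished strong.
caught_up = effective_grades([60, 0, 85])

# A strong start is never lowered by a weaker finish.
slipped = effective_grades([90, 70, 80])
```

Under this rule, the student who fell behind would recover full credit for the earlier weeks, which is exactly why it sits in tension with treating those weekly submissions as participation credit.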

This was the first semester that a colleague offered a different section of CS315 during the same semester. Looking at his students' games, as well as some recent conversations in the game production studio, made me realize that I should probably emphasize the build process more in my section. Rather than simply running their games in the editor, I should ensure that they know how to create an executable or a web build. It's an important skill that's easy to miss, and there's a lot to be learned by seeing the differences between running in the editor and outside of it.

Now that we've grown the number of games-related faculty in my department, there's a chance I may not teach game programming again until 2026. I expect I will come back to these notes around that time. The biggest pedagogic design question I will need to consider is whether to return to checklist-based grading (with its concomitant frustrations) or move to something else, like a simple point distribution.

Wednesday, December 11, 2024

Reflecting on CS222, Fall 2024 Edition

I had a little break from teaching CS222 last semester as I wrapped up work on STEM Career Paths. I have not blogged much about that project, but you can read all about it in my 2024 Meaningful Play paper, which I understand will be published soon. In any case, here I want to capture a few of the highlights and setbacks from the Fall 2024 class, and I promise, I'm trying not to rant about Canvas more than I have to.

Regular readers may recall that I tried a different evaluation scheme this semester, which I wrote about back in July. In September, I wrote a detailed post about some of my initial frustrations with the system as well as a shorter one about how I felt my attention being pecked away. I don't want to bury the lede, so I'll just mention here that to compute final grades, I went back to my 2022 approach, the tried and true, the elegant and clean system that I learned from Bill Rapaport at UB: triage grading. Between my failed experiment this semester and the similarly failed EMRF experiment from last year or so, I feel like I'm looking for a silver bullet that doesn't exist. It reinforces to me, yet again, that I should really be running some kind of workshops for local people here to learn about what makes triage grading superior.

I still want to track some of the specific problems of the semester, though, so that readers (including future self) won't walk into them. First, I tried to set up a simple labeling system in Canvas such that I could mark work as being satisfactory, needing a minor revision, or needing a new attempt. I made no headway here in part because of Canvas' intolerable insistence that courses are made up of points. I talked with a respected colleague who is willing to toil over Canvas more than I about his approach, and he mentioned that he encodes this information into orders of magnitude, something like 10 points for satisfactory, 1 point for minor revisions, and 0.1 points for new attempt required. Combining these together, students get a weird combination of numeric and symbolic feedback. He acknowledged that it wasn't perfect.
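One way to read the order-of-magnitude trick is that a point total can be decoded back into counts of each status, as long as the two smaller counts each stay below ten. The Python sketch below illustrates that reading. It is an interpretation of a secondhand description, not my colleague's actual setup, and the names `WEIGHTS`, `encode`, and `decode` are assumptions. It uses `Decimal` so that 0.1 does not pick up floating-point error.

```python
from decimal import Decimal

# Point values as described above; listed from largest to smallest.
WEIGHTS = {
    "satisfactory": Decimal("10"),
    "minor_revision": Decimal("1"),
    "new_attempt": Decimal("0.1"),
}


def encode(counts):
    """Turn counts of each status into a single point total."""
    return sum((WEIGHTS[label] * n for label, n in counts.items()), Decimal("0"))


def decode(total):
    """Recover the counts from a point total.

    Only unambiguous while the minor-revision and new-attempt counts
    each stay below ten.
    """
    remaining = Decimal(str(total))
    counts = {}
    for label, weight in WEIGHTS.items():
        count, remaining = divmod(remaining, weight)
        counts[label] = int(count)
    return counts


total = encode({"satisfactory": 3, "minor_revision": 2, "new_attempt": 4})
statuses = decode(total)
```

Here a total of 32.4 reads as three satisfactory submissions, two needing minor revisions, and four needing new attempts, which shows both the cleverness of the encoding and why students might find it a weird thing to see in a gradebook.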

What I tried to do instead was to use Canvas' built-in support for grading as "complete/incomplete." Because that was all I cared about, I set the assignments to be worth zero points. When I used SpeedGrader, sure enough, the work was labeled properly. It wasn't until midsemester that I downloaded all the grades as a spreadsheet and saw that it only gave me the zero points. That is, whether the work was complete or incomplete was stripped from the exported data set. There wasn't so much data that I couldn't eyeball it to give students midsemester grades, which was facilitated by my recent transition to only giving A, C, or D midsemester grades (which are epistemologically vacuous anyway).

It wasn't until weeks later that it dawned on me that my students almost certainly had the same problem: Canvas was showing them zeroes instead of statuses. Of course, all my policies for the course were laid out in the course plan, and I do not have any qualms about considering those to be the responsibility of my students. However, when the university's mandated "learning management system" actively disrupts their ability to think about the course, it becomes more of a shared responsibility. About two weeks ago, I went in and re-graded all of the work to use triage grading instead, which allowed me to distinguish not only between complete and incomplete, but also between things that were submitted-but-incorrect and things that were not even attempted.

One positive change that I made this semester was counting achievements as regular assignments. This made processing them simpler for me, and I suspect it made thinking about them easier for the students too. While they have a different shape than the other assignments, they are "assigned" in the sense that I expect people to do them to demonstrate knowledge. I also set specific deadlines for them, spaced out through the semester. This reduced stress for the students by providing clear guidelines, since they could still miss one and resubmit it later by the usual one-resubmission-per-week policy. It also helped me communicate to them that the intention behind the achievements is that they give you a little side quest during the project-oriented portion of the course.

I had a really fun group of students this semester, as I mentioned in yesterday's post. There were still some mysteries around participation, though. I had several students withdraw a few weeks into the semester without ever having talked to me. It is not clear to me if they decided the course was not for them or if they were simply scared. By contrast, I know I had at least one student who was likewise scared early on, but who stuck with it, and ended up learning a lot. It is not clear to me if there is more I can do to help the timid students lean toward that mindset. Also, despite excellent in-meeting participation, I had many students who just didn't do a lot of the assigned work. I have some glimmers of insight here, but it still puzzles me: how many times do I need to say, "Remember to resubmit incomplete work?" I hope that some of the simplifications I have made to the course will help streamline students' imagination about it, but more than that, I am thinking about the role of the creative imagination. I am sure that a lot of students come into this required sophomore-level class without a good sense of what it means to study, to work, or to learn. My friends in the Biology department recently took their required senior-level professionalism course, in which students do things like make resumes, and made it a sophomore-level course. I wonder if we can do something similar to help the many students we have who are not well formed.

Tuesday, December 10, 2024

What we learned in CS222, Fall 2024 edition

My students are currently typing away, writing their responses to the final exam questions for CS222. As per tradition, the first step was to set a 20-minute timer and ask them to list off anything they learned this semester that was related to the course. This was an enthusiastic group with hardly a quiet moment. They listed 130 items in 20 minutes. I gave them each six votes, and these were the top six:

TDD (9 votes)
SRP (8 votes)
Code cleanliness (6 votes)
DRY (6 votes)
Git (6 votes)
GitHub (6 votes)

Here are all the items they listed, together with the number of votes each earned, if any. There are some interesting items here that point to interesting stories of personal growth. It was really a fun group of students to work with, even though several of them exhibited some behaviors I still cannot quite explain, such as a failure to take advantage of assignment resubmission opportunities.

Flutter (1)
Code cleanliness (6)
TDD (9)
A new sense of pain
How to set up Flutter (1)
DRY (6)
SRP (8)
Mob programming (2)
Pair programming (1)
Git (6)
Version control (2)
Future builder
Setting up your environment
Asynchronous programming (1)
UI design (3)
GitHub (6)
Code review (1)
Defensive programming
Working with APIs (1)
Model-View Layers (2)
Teamwork (4)
Better testing (1)
What "testing" is (2)
Explaining code with code instead of with comments (1)
Understandable and readable code
Agile development (1)
Naming conventions
Functional vs Nonfunctional Requirements
User stories (2)
Paper prototyping
CRC Cards
User acceptance testing
Programming paradigms
How to write a post-mortem
Resume writing
Knowing when something is done (3)
Debugger (1)
Time management (3)
Using breakpoints
Test coverage (1)
Modularization
Distribution of work (1)
Communication skills (1)
Discord
Dart
commits on git
pull using git
Flutter doctor
pub get
Configuring the dart SDK
Rolling back commits
Checking out commits
Going to office hours early
Commit conventions
CLI tools
Don't use strings for everything
Structuring essays
Enumerated types
Sealed classes
Better note-taking
Humans are creatures of habit
Parse JSON data
JSON
Refactoring (5)
How often wikipedia pages change
Data tables
OOP (2)
URL vs URI
One wrong letter can lead to the program not working
How data are handled in memory
FIXME comments (1)
Widgets
State management
Encapsulation (1)
Abstraction (2)
Presenting projects
Coming up with project ideas
Reflection (2)
pubspec management
.env files
Hiding files from GitHub
Serializing JSON
Personal strengths & weaknesses
Falling behind sucks
Software craftsmanship
Work fewer jobs
Finding internships
Remember to email about accommodations
Accepting criticism on resubmissions (1)
Procedural programming
You don't have to take three finals on one day
Painting miniatures
GitHub has a comic book
Being flexible
Dead code
Holding each other to standards
Bad and good comments
Aliasing
Reading a textbook thoroughly
Rereading
No nested loops (no multiple levels of abstraction)
Using classes is not the same as OOP (1)
SMART
A bit about the Gestwicki family
Places to eat in NY
Getting ink to the front of an Expo marker
How to clean a whiteboard properly
New York Politics
Data structures vs DTOs vs Objects (1)
Conditions of satisfaction
Setting up ShowAlertDialog
Handling network errors
Handling exceptions
Build context warnings
CORS errors
Semantic versioning
Dealing with Flutter error reporting
Test isolation (1)
Don't make multiple network calls when testing
Improving test speed
Always run all the tests
You can test a UI
Writing 'expect' statements
Running tests on commit
Autoformatting in Android Studio
Testing in clean environments
Creating dart files
Hard vs soft warnings
Functioning on 0-3 hours of sleep
Configuring git committer names

Tuesday, November 26, 2024

Bloom's Taxonomy, Teaching, and LLMs

Recent discussions of LLMs in the classroom have me reflecting on Bloom's Taxonomy of the Cognitive Domain. Here's a nice visual summary of its revised version.

Blooms Taxonomy of the Cognitive Domain
(By Tidema - Own work, CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=152872571)

Bloom's Taxonomy, as it is called, is a standard reference model among teachers. The idea behind it is that a learner starts from the bottom and works their way upward. As far as I know, it has not been empirically validated: it's more of a thought piece than science. This is reflected in the many, many variations I've seen in the poster sessions of games conferences, where some young scholar proposes a play-based inversion that moves some piece into a different position on the trajectory. All that is to say, take it with a grain of salt. The fact remains that this model has had arguably outsized influence on the teaching profession. (Incidentally, I prefer the SOLO taxonomy.)

There's been a constant refrain the past few decades among a significant number of educators and pundits that technology has made obsolete the remember stage. Why memorize this table of values when I can look them up? Why remember how this word is spelled? Spellcheck will fix it for me. My skepticism of the concept has only increased as I have worked with more and more students who use digital technology as a crutch rather than a precision instrument.

LLM-generated code comes up in almost every conversation I have among teachers and practitioners in software development. There are ongoing studies into the short- and long-term implications of using these tools. My observations are more anecdotal, but it's no exaggeration to say that every professional developer and almost every educator has landed in the same place: LLMs can generate useful code, but knowing what to do with it requires prior knowledge. That is, the errors within the LLM-generated code are often subtle and require knowledge of both software engineering and the problem domain.

From the perspective of Bloom's taxonomy, a developer with a code-generating LLM is evaluating its output. They come to their evaluation by building upon the richness of cognitive domain skills that undergird it. At the very fundamental level, they bring to bear a vast amount of facts about the praxis of software development that they have remembered and understood.

If Bloom is right, then among the worst things we could do in software development education is throw students at LLMs before they have the capacity for viable evaluation. Indeed, before LLMs, the discussion around the water cooler was often about how to stop students from just searching Stack Overflow for answers and submitting those. Before Stack Overflow, it was that students were searching the web for definitions rather than remembering them. My hypothesis for learning software development then is something like this:

Google search eliminates the affordance for learning to remember.
Stack Overflow eliminates the affordance for learning to understand.
LLMs eliminate the affordance for learning to apply.

This hypothesis frames the quip that I share when an interlocutor discovers that I am a professor and, inevitably, asks what I think about students using ChatGPT. My answer is that I'm considering banning spellcheck.

Monday, November 25, 2024

Walking away from a November game project: A reflection on NoGaDeMon 2024, Dart, Flutter, and Bloc

I would hate to make this a tradition, but it seems that I once again entered NoGaDeMon. National Game Design Month (NaGaDeMon) is November, and for several years, I created interesting little projects during the month. Last year, I was not able to pull a project together, and I'm afraid that's the case this year as well. However, I was able to learn a bit through the attempt, so I want to capture some of it here before it slips away.

Before November, I had been tinkering with an intersection of ideas related to posts in the last few months: interactive narrative games like my The Endless Storm of Dagger Mountain, which drew from the Powered by the Apocalypse tabletop RPG space, built around some concepts from Blades in the Dark and Scum & Villainy. I figured that, for November, I would try building a very small slice of the idea. For various reasons, I also wanted to try building and releasing a game using Dart and Flutter. I dug in and started making reasonable progress for a side project.

A few days into November, John Harper released Deep Cuts, a campaign and rules expansion for Blades in the Dark. I bought a copy and was quite surprised at the rules changes. I had expected little tweaks and balancing maneuvers, but Deep Cuts actually provides a complete overhaul of the most fundamental Blades action resolution system. This was too cool not to play with, so I rehashed my planned NaGaDeMon project, essentially starting from scratch to support some of the Deep Cuts ideas.

Before last week, I was able to get a very small version of the game working, letting the player experience a single, badly written game scene. The user interface was just awful, so bringing the game together would have required adding a ton of content as well as designing and implementing a complete player experience. Both of those would be tedious efforts, especially the latter, since I am not very fast with Flutter UI development. Part of the inspiration for choosing Flutter was to gain more practice with engaging UIs.

About two weeks ago, the work of one of my committees exploded into taking most of my unassigned work hours, and this was not altogether unexpected. We also just got the good news that we will be hosting family for several days around Thanksgiving. This will be wonderful, although it also means these won't be hobby-project days. The result is that I've decided to put this project to rest. I did learn quite a bit going this far into the project, and that is the topic for the remainder of this post.

First of all, the obvious lesson is that if I wanted to really focus on learning to make a top-notch interactive Flutter UI, I should have chosen something with zero other design risks. I knew that the best I could do in one month was to make something just functional, yet I am not sure I was honest with myself about how ugly that would likely end up. Maybe I will find a game jam that will let me get a better handle on combining turn-based game timing with implicit animations.

Prior to November, I had been tinkering with some of these design inspirations in Godot Engine, which is of course the engine I used to build The Endless Storm of Dagger Mountain. I was using a rather conventional mutable-state object-oriented architecture. I found myself frequently frustrated by the lack of good refactoring tools for GDScript. This is a significant hindrance to evolving an appropriate design. This is part of what made me switch over to Dart, which is a joy to work with in part because of the excellent tooling support from Android Studio.

A few summers ago, I spent a great deal of time studying Filip Hracek's egamebook repository. Nothing shippable came out of my efforts—I don't think I ever even blogged about it—but I did learn a lot. I was struck by how Hracek separated the layers of his architecture, and it was the first time I spent a lot of time in a game that used immutable data models. At the time, I had looked into the Bloc architecture and struggled to make sense out of it.

Approaching this November's project, I decided to dig deeper into Bloc. I spent a lot of time with the official tutorials and puzzling over this seemingly simple diagram:

The simple tutorials are simple, which is convenient, but the more robust ones separate the "data" component into a data provider and repository. It seemed clear that the game state could be conceived of as data, but I struggled to conceptualize where the game rules should live. The game rules can be considered part of the domain model, and as such, should be separated from the bloc. This would mean that a response from the domain model may be the modified game state, which then is echoed back through the bloc to the UI with a bloc state change. However, it's also reasonable to conceive of the game state itself as the data layer and the "business logic" as being the transformations of that state. Indeed, this seems to be the difference between the simple and more complex tutorials: the simple ones deal with simple in-memory state, and the more complex ones draw data from different sources and transform them in the data layer.

Of course, there is no silver bullet. Given the tight time constraints on the project, I simply considered the immutable game state to be my data layer, and I put the game logic in a bloc. I also simply passed the game state along to the UI, but in a more robust solution, I would have had clearer separation between layers. Including a dependency between the UI and the data layers was a matter of expedience and the intentional incurring of technical debt.

My first pass at the implementation had me writing my game states and bloc states by hand. The Equatable package meant that I didn't have to fret over writing some of the boilerplate that's necessary to do state comparisons, and it was easy to integrate this in Android Studio using Felix Angelov's Bloc plugin. When scouring the Web for help with Bloc, one quickly also comes across discussions of Freezed, which library is also integrated into Angelov's plugin. I had tinkered with Freezed in my egamebook-inspired explorations, but I have not shipped anything that uses it. After having built up my understanding of Bloc using Equatable, Freezed was an obvious next step. Next time, I would jump right into using it for cases like this.

Writing a functional Flutter user-interface was straightforward using BlocBuilder. I found this to be a convenient way to conceptualize the game, especially since it had very clear states. For example, in my original explorations (before Deep Cuts), I had the player choosing an action from a list, then customizing the action with various options from Blades in the Dark, such as pushing yourself to trade stress for dice. After rolling the dice, the player is now in a different state of the game in which they are responding to the result, such as by resisting its consequences. This was elegant to express in the code, and I am confident that with enough effort, I could make a compelling user experience out of it. By contrast, Dagger Mountain used an architecture inspired by MVP but that depended too heavily on the undocumented, unenforceable behavior of coroutines. Both of these are "only jam projects," but they are helping me to conceive of how I would approach something more significant in this problem domain. The aforementioned coroutines were my solution to synchronizing the model and view states (for example, to finish an animation before continuing to the next step of the narrative); I'm fairly certain I understand how I can do that with bloc's events and states, but since the November project will remain unfinished, there is risk.
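The "very clear states" described above can be sketched as a pure transition function over an immutable game state. This is only an illustration under stated assumptions: it is written in plain JavaScript rather than Dart, it does not use the bloc library's API, and the phase names, event names, and fields are all hypothetical rather than taken from the actual project.

```javascript
// Hypothetical game state: frozen so that every change produces a new object.
const initialGameState = Object.freeze({
  phase: "choosingAction", // assumed phase names, not from the real project
  action: null,
  pushed: false,
  roll: null,
});

// Pure function: given the current state and an event, return the next state.
// Events that do not apply in the current phase leave the state unchanged.
function nextGameState(state, event) {
  switch (event.type) {
    case "actionChosen":
      if (state.phase !== "choosingAction") return state;
      return Object.freeze({ ...state, phase: "customizingAction", action: event.action });
    case "pushedSelf":
      if (state.phase !== "customizingAction") return state;
      return Object.freeze({ ...state, pushed: true });
    case "diceRolled":
      if (state.phase !== "customizingAction") return state;
      return Object.freeze({ ...state, phase: "respondingToResult", roll: event.roll });
    default:
      return state;
  }
}
```

A UI layer would then only need to look at `phase` to decide which screen to build, which is roughly the role that BlocBuilder plays in the Flutter version.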

All this exploratory coding meant that I did not follow a test-driven process. I ended up not getting into the testing libraries specifically for bloc. It's possible that this would have helped me better to conceptualize the business logic versus the domain layer, but that remains future work.

There are still a lot of questions about the game design itself. Indeed, this entire exploration is inspired by design questions around the adaptation of Blades in the Dark tabletop gameplay into a digital experience. Citizen Sleeper is the only project I know of that has worked in this space, and it's a fantastic interpretation. I only became aware of Citizen Sleeper after I started doodling my own ideas, and it's interesting to see where they converge and where they diverge. I hope to dive back into this design space later, but for now, my attention must go toward wrapping up this semester, planning for next semester, and enjoying the upcoming Thanksgiving break.

Wednesday, November 13, 2024

What people believe you need to do to be an independent game developer

Aspiring game developers are starving for advice. I recently attended a meetup of game developers where an individual gave a formal presentation about how to become an indie. The presentation was thoughtfully crafted and well delivered, and it was entirely structured around imperatives—the things that you, the audience member, need to do if you want to be a successful independent game developer. The audience ate it up and asked for more. They were looking for the golden key that would unlock paradise.

There are two problems here, one overt and one subtle. The overt one is that there is no golden key. There is no set of practices that, if followed, will yield success. I imagine most of the audience knew this and were sifting for gold flakes. However, it was also clearly a mixed crowd, some weathered from years of experience and some fresh-faced hopefuls. I hope the latter were not misled.

The subtler problem was made manifest during the question and answer period when it became clear that the speaker was not actually a successful indie game developer at all. Their singular title had been in development for three years and had just entered beta. They had no actual experience from which to determine if the advice was reasonable or not. The speaker seemed to wholeheartedly believe the advice they were giving despite not being in a position to draw conclusions about its efficacy.

Once I saw the thrust of the presentation, I started taking notes about the kinds of advice the speaker was sharing.

Document everything, and specifically create:

Story and themes document
Art and design document
MDA document

Have a strong creative vision
Be a role model for the work environment you want
Consider these pro tips for hiring staff:

Use a report card to score your candidates
Look for ways to get to know what it would be like to work with them
Try collaborating with them as part of the interview
Always have a back-up candidate, not a top candidate but someone you know you could work with
Being their best friend does not mean you should work with them

Thank people for their contributions and efforts
Use custom tools to help you work better

Use the Asset Store in Unity
Use tools to help you test
Automate as much as you can to save you time
Learn to prompt so you can use generative AI

It allows an artist to be a developer by removing coding barriers
LLMs can replace tedious use of YouTube, Google, Reddit, etc.

When pitching to publishers, have two versions of your slide deck:

pitch slides: the version you send
pitch presentation: the version you present

Take budgeting seriously

Budget for specific deadlines
Don't spend your own money if you can get money from someone else (e.g. publisher)
Get a job so that you can support yourself until you can get funding from someone else for the game project

Quoting one of his professors: "To make money, you need to spend money, and to spend money, you need money."

Don't get distracted by others (e.g. on social media)

These aren't the things you need to do to be an indie game developer. These are the things that an audience believed you need to do to be an indie game developer or the things that someone with a modicum of experience thought would be worth telling indie hopefuls. It seems to me that this is the advice you would get if you spent an afternoon collecting advice by searching the Internet. It's helpful for me to have a list of what people are likely to believe from consuming popular advice. Sometimes advice is popular because it is accurate; sometimes people tell you to make your game state global.

Three other things jumped out at me about the presentation. First was the unspoken assumption that one would be using Unity. There was no indication from the speaker that this was even a choice, and none of the questions reflected on it. Second, the speaker acknowledged the importance of automation and automated testing, which was great to see. Third, no one pushed back regarding the use of Copilot or other LLMs to help with coding, whereas I suspect there would have been a riot had he suggested using the same tech to generate artwork. There's a study in there.

Tuesday, November 12, 2024

Serendipity

As mentioned in yesterday's post, I was at Meaningful Play 2024 a few weeks ago, and I'm finally processing the many pages of notes that I took there.

Sabrina Culyba gave the morning keynote on the last day of the conference. She spoke about serendipity in game design, sharing a compelling story about the development of Diatoms. The talk was brilliantly prepared and executed. She summarized research findings around serendipity that show that the following factors can affect its likelihood:

Having a prepared mind
Openness
Being connection-prone
Belief in serendipity

These are really interesting, and if I didn't have a pile of other research projects in the hopper, I'd be curious to dive into the literature here. The first item sounds like a variation on the maxim, "Luck favors the prepared." The second sounds to me like the eponymous Big Five personality trait that tracks with creativity.

I don't have much else to contribute to the discussion, but it's a neat idea that I don't want to waste away in my notebook.

Monday, November 11, 2024

Fantasy heartbreakers

I am currently reading William White's Tabletop RPG Design in Theory and Practice at the Forge: 2001-2012 after having met the author at Meaningful Play. This excerpt from Chapter 3 made me shout with delight at having a name for a phenomenon.

A fantasy heartbreaker was [Ron Edwards'] term for an independent game that contained interesting innovations, usually without realizing that they were in fact innovative, but whose designers had failed to fully examine their underlying design assumptions—thus producing games that were highly derivative of D&D, whether or not that was actually a design goal of the game—and who were either naïve or overambitious in their expectations for success in the marketplace. (p.93)

Ron Edwards' original post on the topic is cited, but I haven't made the time to read the source yet. White's summary was enough to excite me and want to share it here.

Tuesday, October 1, 2024

Paper!

I have a pile of things to grade, seemingly unlimited committee work to complete, and major decisions to make. I am having a bit of a stressful week. But you know what I just did that made me so happy that it's worth taking the time to write a blog post?

I graded something on paper.

My new coworker Travis Faas shared with me a format he uses for peer critiques during his game programming class. It's something I want to draw into that class. Today, in CS222 Advanced Programming, my students were to showcase their two-week project submissions. I've traditionally done this in an unstructured way, something like an academic poster session. Just a few minutes before class, I thought to myself, "What if I tried out that crit format here?" I literally did not have time to lay out even the simplest of templates, so I just grabbed a stack of blank white paper and headed downstairs to class.

I told the students that, during their showcase, they had to write at least three outcomes from their discussions. I suggested (following Travis) that these could take the form, "I learned X," or, "Y is something I want to learn more about." I also foreshadowed that there would be a secret final step.

As always, they walked around with real interest in what each other had done. This time, however, they paused after each station and jotted little notes on their paper. What might otherwise be fleeting thoughts were tracked, held on to.

Once we were done—and gave out the Audience Choice award, of course—I gave them the final step: to write down some action that they plan to take next that relates to the outcomes of their discussion. I gave them two or three minutes to do this before collecting their papers.

Both of my Tuesday/Thursday classes had major deadlines today, so it was quiet during office hours. I sat down in my chair, grabbed my favorite pen, picked up the stack of papers, and read through them. On each, I gave a little, hand-written affirmation, encouraging students or providing tips on how they might move toward their goals.

Paper! Wonderful paper!

I am looking forward to turning back their papers on Thursday. I wonder when the last time was for them that they had such a human experience as handing a teacher their ideas and then waiting, waiting without a chance of hearing from me about them before our next meeting. No anxiety about checking grades. No notifications. Quiet, from which comes a chance for peace.

Paper!

Tuesday, September 24, 2024

Grading rather than improving

I talked too much today. I had back-to-back 75-minute class meetings, first of CS222 Advanced Programming and then of CS315 Game Programming. Both times, I spoke almost the whole time. I would much rather have had structured exercises to help teach what I wanted to show. It wouldn't have been that hard to set them up, just an hour or two each of setting up a template project that demonstrates what I want to show. I don't have an hour or two for each meeting for each class. I have filled my allocated class time with grading. This is partially due to the new grading system I am using. I'm having a lot of back-and-forth with my students. Turns out that getting them to mastery is a lot harder than giving them partial credit. I believe it's bearing fruit. But it's also taking all or more of the time I can give to a class.

I am not sure what the path forward is. I will do less grading later as both classes move from individual lessons to large project integrations. Then, however, it's too late: we will have passed the point in the semester where a strong introduction is better than 75 minutes of my talking.

Thursday, September 19, 2024

CS222 and CC17

It has been many years since I have required my CS222 Advanced Programming students to read chapter 17 of Robert Martin's Clean Code. This chapter is entitled "Smells and Heuristics," and it contains a wonderful collection of common code problems and potential solutions. This year, I had my students read the chapter just before starting our two-week project, and I gave them the challenge to pick three items from the reading that were particularly interesting to them. These were fun for me to read, displayed thoughtful reflection on programming, and to top it all off, were easy to grade.

Some of my favorites showed up in the students' responses, such as the advice to extract conditionals into named functions, to replace magic numbers with named constants, and to avoid selector arguments. Feature envy showed up more than once, which surprised me. Students recognized that some of their previous courses actually habituated them to these smells rather than their cleaner alternatives.
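The three heuristics named above are easier to see in a small example. The following is a minimal sketch in JavaScript; the function names, field names, and the scenario are invented for illustration and do not come from Clean Code or from the students' submissions.

```javascript
// Replace magic numbers with named constants: the policy is named once.
const RESUBMISSIONS_ALLOWED_PER_WEEK = 1;

// Extract a conditional into a named function, so the call site states intent
// instead of repeating the boolean expression.
function canResubmit(submission) {
  return (
    submission.status === "incomplete" &&
    submission.resubmissionsThisWeek < RESUBMISSIONS_ALLOWED_PER_WEEK
  );
}

// Avoid selector arguments: rather than formatName(student, true), where the
// boolean silently selects a behavior, provide one function per behavior.
function formatNameForRoster(student) {
  return `${student.lastName}, ${student.firstName}`;
}

function formatNameForGreeting(student) {
  return `${student.firstName} ${student.lastName}`;
}
```

A caller can then write `if (canResubmit(submission))` and read the policy directly, without decoding a bare `1` or a bare `true`.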

I need to remember to keep this assignment. I plan to ask my students today whether they think this chapter would have made a good introduction to our reading rather than a capstone on it. Because the chapter is so accessible, it's possible that reading it first might help them get better faster, and to do so before they get into the trickier distinctions such as SRP (Chapter 10) and the distinction between objects and data structures (Chapter 6).

Wednesday, September 18, 2024

Docs is code

Clint Hocking's birthday blog post led me to look at the EXP tabletop roleplaying game, and in turn, that got me looking at AsciiDoc and the Docs as Code movement. I understand completely the arguments that AsciiDoc makes against Markdown. Regular readers will recall that I experimented with converting my course plans to GitHub-hosted Markdown and quickly backed away from it: Markdown almost immediately requires a polyglot approach for anything significant. However, I don't see either AsciiDoc or Docs as Code as addressing what I consider the most important tool for technical writing: the ability to embed scripts.

I have been using lit-html for years (and Polymer before that). What it lets me do is separate the structure of my writing from its display. For example, when I write an assignment for my students, I might conceive of it as having a list of objectives. In Markdown, AsciiDoc, or even HTML, I could easily represent that information as an ordered or unordered list. Later, however, I might decide to change the representation, instead showing it as a definition list, or making sure the name of the objective is bold, or generating unique links to each individual objective. In any of those plain markup environments, I have to do this by hand or, worse, with a regular expression.

What I don't see from Docs as Code, although I admit I haven't done more than a cursory search through their materials, is the observation that docs is code. If I separate my model and my view, I gain a robustness that any journeyman programmer understands. For example, using lit-html, I can create a simple JavaScript data structure that represents a goal, with a name and a description. Either or both of these can be HTML templates, not just strings. With that structure defined, I can create a list of them for an assignment. Now, on my first pass, I show them as a list by iterating through the list and dropping the data into list items in an ordered list. When my requirements change, as they always do, I can modify my script and make the same data into a definition list, section headings, etc. If I need to change the actual definition of an assignment goal, I can make that change explicit.
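Here is a minimal sketch of that model/view separation. To keep it self-contained, it builds plain strings with template literals instead of using lit-html's `html` tag, so it is a stand-in for the idea rather than my actual course-plan code; the goal names and descriptions are made up for the example.

```javascript
// Model: assignment goals as plain data (sample content, invented for this sketch).
const assignmentGoals = [
  { name: "Naming", description: "Choose names that reveal intent." },
  { name: "Testing", description: "Write a failing test before the production code." },
];

// View, first pass: each goal becomes an item in an ordered list.
function renderGoalsAsOrderedList(goals) {
  const items = goals.map((g) => `<li>${g.name}: ${g.description}</li>`).join("");
  return `<ol>${items}</ol>`;
}

// View, after requirements change: the same data as a definition list
// with bold goal names. The model above is untouched.
function renderGoalsAsDefinitionList(goals) {
  const entries = goals
    .map((g) => `<dt><b>${g.name}</b></dt><dd>${g.description}</dd>`)
    .join("");
  return `<dl>${entries}</dl>`;
}
```

Note that plain template literals do no escaping; with lit-html, the `html` tagged template handles text bindings safely and allows a name or description to be a template itself rather than a string.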

Of course, the whole thing is in version control with sensible commit messages.

I have taken a similar approach in the past to build documents using LaTeX, coordinating the execution of multiple scripts through GNU Make. That works when LaTeX is needed for document output, but it feels less elegant to me than being able to generate the HTML directly from the JavaScript.

If you know of an approach in the AsciiDoc or Markdown vein that gives the same level of robustness as what I can do with lit-html, please let me know.

Tuesday, September 3, 2024

A Morning with Scum & Villainy

After writing about my first experience with Blades in the Dark, I heard from a friend who recommended that I also look into Scum & Villainy. It is a sci-fi interpretation of the Blades in the Dark rules following the Forged in the Dark license. I use "sci-fi" intentionally since the rules and setting lend themselves to space westerns or space operas (anything with scoundrels on spaceships), but it would be difficult to do science fiction with them. The rulebook makes it clear that it's drawing on the "rag-tag group of outlaws traveling across the sector" trope as seen in Firefly and Cowboy Bebop. Both are clearly space westerns.

This theme is a good thing, and those two shows are among my favorites. It is a shame, then, that one of the first things one notices on opening the book is that it doesn't mesh with these themes. Blades in the Dark sings out its theme in graphic design and illustration. In contrast, Scum & Villainy feels like it cannot decide what it wants to be. This impression is exacerbated by the fact that the structure of the book and much of the copy itself are taken verbatim from Blades in the Dark. None of this is inherently bad, but it gave a negative first impression after having been given such a strong recommendation to read it.

The few mechanisms added to Blades are quite good. Developing your ship rather than your headquarters captures the theme well, also lending the feeling of an episodic series. Getting bonus dice from gambits is a welcome addition to my group, since they have a penchant for beating the odds by rolling consistently low.

My favorite addition consists of the three starting scenarios, one for each of the ships. Playing Blades in the Dark, or even just reading and imagining it, it wasn't quite clear where to start. Scum & Villainy gives more tightly scripted introductory scenarios. At least two of them boil down to simple chase sequences, but there is nothing wrong with that. Each of these scenarios has just enough background to fill in details as needed, and each one provides clear hooks into the next episode. On top of that, there are three outlines for other, unrelated jobs that are fit for the theme of the selected ship.

We got the game to the table yesterday morning, and I played with my three eldest sons. My third son had previously expressed disinterest in tabletop roleplaying games, having not enjoyed whatever fantasy game we had tried together once years ago. I convinced him to try this one, knowing that he's a storyteller at heart, and he and his two older brothers had a great time. They chose to be bounty hunters on a Cerberus ship, playing a Scoundrel, a Pilot, and a Mechanic. We played the recommended starting mission: tracking down a member of the Ashen Knives gang with multiple bounties on his head. There was a little hiccup due to an ambiguity in the scenario description, but once we got into that, we had a blast. The two older boys had a handle on the action resolution protocol as well as the role of flashbacks. They used flashbacks much more successfully as part of the storytelling than in our two Blades games, using them to set up a two-pronged assault on the mark's location, and then using one to set up and soup up hoverbikes for a big chase scene. A glorious failure by the Mechanic led to a potentially disastrous desperate situation for the Pilot, but he used a gambit and pushed himself to ace it. It was exactly the kind of thing one wants out of a chase scene.

There are a few places where Scum & Villainy falls short of its august predecessor, doomed perhaps by its own lineage. For example, in Blades in the Dark, there are sensible limits on how much coin (or value in coin) a person can carry or that one can stash. This is actually quite interesting, a point that I don't remember seeing before: you can only carry so much money, and you can only have so much liquid cash, particularly in a Victorian setting. Scum & Villainy borrows this mechanism despite describing money as being kept as software on credsticks. Also, while both games admit that any given action may be sensible under multiple action ratings, the action articulation in Scum & Villainy feels more forced than Blades'. I suppose, due to my career, I am particularly puzzled by how the Doctor action rating is for "doing science" and the Study action rating is for "doing research." It makes me wonder about how I would take the Forged in the Dark idea and put it into a setting of my own choosing, as many others have done. For example, making Kapow! years ago was a great exercise in understanding how Powered by the Apocalypse ideas could apply to campy 1960s superhero action.

I find the setting of Blades in the Dark to be more intriguing, but the setting of Scum & Villainy appeals to me personally while also being a "safer" space to explore with my boys. One could do a sci-fi criminal gang drama, but maintaining a ship vs. expanding gang turf really pushes toward the Firefly vibe. Scum & Villainy certainly stands alone, although there are parts whose rationale would make more sense if one is familiar with Blades in the Dark. Reading the Blades book was enough to see why there has been so much excitement about it, even though I know I'm late to the party. The Scum & Villainy book may have lacked some of this pizzazz, but the table doesn't lie, and we had a great time.

Initial reflection on Bowman-style grading

I had my first batch of submitted student work last week, and I would like to share some reflections on exploring a new grading system. As I mentioned over the summer [1,2], I have revised two of my courses to use a new grading scheme. CS222 Advanced Programming and CS315 Game Programming are both using a technique that I have lifted from Joshua Bowman's work. This technique looks at each goal and assesses a student's contribution into one of four categories:

Successful
Minor revisions needed
New attempt required
Incomplete

The first and the last are clear, but I found myself tripping up between the middle two. I think this is in large part due to an important distinction between this technique and Rapaport-style triage grading, which I have used for years. In that model, you have four categories as well:

Done and correct
Done and partially correct
Done and clearly incorrect
Not done

The distinction between "partially correct" and "clearly incorrect" is very clear to me, and these are the second and third categories for Rapaport. I started using that as a heuristic to differentiate between "Minor revisions needed" and "New attempt required," but I don't think that's right. With Rapaport's approach, "partially correct" captures a huge category of errors that one would put into the "C" letter grade bin: such a submission has some elements of correctness but significant flaws. I think Bowman's "Minor revisions needed" is much closer to Rapaport's "Done and correct." Clearing up the differences between these two rubrics caused me to have to re-grade many submissions.

Bowman's philosophy, which I am also bringing to bear in my classes, is grounded in mastery learning. Hence, recognizing the affordance for resubmission is fundamental to understanding the system. I knew I wanted to throttle my students' resubmissions, so I set up a two-tier system. With minor revisions needed, students could make the necessary tweaks within three business days, then get full credit for their submission. With new attempt required, or if they didn't make minor revisions within three business days, they could resubmit at most once per week.

I switched to Bowman's model in an attempt to clarify evaluation, and I'm already confused. I think this kind of system could work brilliantly if there were any tool support for it, but every gram of this technique fights against Canvas. Not only does Canvas lack robustness to anything but the least interesting of point-based pedagogic models, it and its LMS ilk breed an intellectual laziness among the students. The student usage pattern is to look at how many points were earned and then ignore any formative evaluation. My conclusion so far is that doing this on paper would be a great improvement over using Canvas if it weren't for the fact that my students' submissions are often inherently digital and not just accidentally digital.

It is early in the semester, but I have yet to see that Bowman's approach is going to be any more clear than Rapaport's. I've been using Rapaport-with-resubmissions, and that fills the middle ground between a clear representation of points and clear feedback about which parts are wrong. I will have to give it another two or three weeks to see how students respond before I make any systemic changes: there hasn't been ample time to get complete submit-evaluate-resubmit-evaluate loops from enough students yet.

Last year, I experimented with EMRF grading and ended up quickly dropping it. Canvas had no clear way to express this system either, and I did not see any clear benefit from distinguishing between "excellent (E)" work and "meets requirements (M)". It's easy to blame the tool for its shortcomings, and in this case, that's exactly the right thing to do. I know folks who "make it work" with tricks and hackery, but in my mind, there is no excuse for having a system that demands that the only real part of a class is something that has points and contributes to a pool of points. It's not how learning works, and it's never been how teaching should work.

Monday, August 26, 2024

An Afternoon with Blades in the Dark

I heard about John Harper's Blades in the Dark tabletop role-playing game from a talented undergraduate student around 2018. He was creating his own RPG as part of a games research group I was running, and he regularly brought up Blades along with the Powered by the Apocalypse movement as inspirations. I came across it again when looking for information about non-hit-point damage systems, which Blades has, although not via inventory manipulation—the particular topic of my investigation.

I bought a copy of the rules, read them while on family vacation at the end of the summer, and found them quite inspirational. It made me want to run a session or two in order to see the systems in motion. As anyone who enjoys tabletop games knows, it's one thing to read the rules and another thing to try running them: the latter exposes one's incomplete knowledge from the former. I hesitated to invite my boys to play though because of the vicious nature of the game. Blades in the Dark is a game in which you play a scoundrel who is part of a criminal crew. You advance in the game through illegal and immoral activity. I prefer to encourage fantasies about heroic living. Yet, I found myself thinking about how one can learn from stories of heroes and of villains, and if nothing else, I knew my boys would also enjoy experiencing the game and exploring its setting.

In my retelling of our experience, I want to highlight places where I felt unsupported by the book and online resources. This is meant as constructive criticism that I hope others and I can use to improve future sessions of this and other games. Note that in this blog post, I will be freely referencing rules and lore from Blades in the Dark. If you are the type who enjoys reading or playing role-playing games, I recommend you pick up a copy. That said, you can also get an overview of the rules from the public System Reference Document.

We played for about three hours on Sunday afternoon, during which time we created characters and crew and completed one score. I had downloaded and printed the recommended materials, and so we dove into character creation. I had also crammed a lot of lore into my memory, which made it harder to introduce the game at a high level. I would have liked a canned paragraph I could read to set up the experience for new players. It was also too late for coffee, which could have been a contributing factor. Also, my boys, because they are my boys, are probably not familiar with any of the cultural touchstones that are referenced in the rulebook: we just don't watch stories about criminals and antiheroes, and I myself had not heard of most of the things in the list.

The playbooks were useful for letting the boys start creating their characters. One picked a Leech (saboteur, tinkerer, alchemist) and the other, a Hound (sharpshooter, tracker). Among the first decisions one makes in character creation is choosing a heritage, and here is another area where a handout would be useful. The book has short descriptions of each, but the playbook only lists the name. A simple handout that gives a single sentence about each would be sufficient for a table of players to pick the one they like; otherwise, the GM has to explain each while players hold the lore in their heads. When it came to choosing vices, we had a similar problem: I explained that one had to choose the category of vice, then the particular vice and its purveyor (for example, "Poker at Spades' Tavern"). The boys looked at me rather blankly: without knowing more about the world, it was not at all clear what kind of creative boundaries they had for this. I remembered that there was a list of vice purveyors in the appendix, so I turned to page 299. They readily chose from this list, and this makes me think that this, too, should be a handout in the starting materials.

When we turned to creating the crew with the corresponding playbooks, I realized that we should have inserted a step before the character creation: a quick discussion about what kind of game we wanted to play. They didn't talk much during character creation, but when we got to crew selection, it became clear that the Leech wanted to do sabotage and the Hound wanted to do assassinations. I ended up encouraging them to compromise and create a Shadows crew, which leans more into Leech styles but should have room for the Hound as well. In part, I was thinking about an initial score that would resonate with their crew playbook, and I did not want to open with an assassination.

We had some trouble with defining the crew's hunting grounds. The district map from the downloadable resources was useful, and the players figured that their lair could be in Crow's Foot while their hunting grounds were across the river in Whitehall. After all, wouldn't a band of spies want to spy on something worthwhile? I looked up more details about the district in the rulebook, and I found that it was listed as having maximum wealth and maximum security. That doesn't seem like a reasonable target for a crew that is just starting out. Mechanically, the Shadows were Tier 0 but their targets would all be much higher tier. If there were a recommendation for new players, we could have just taken it. In the absence of this, setting up the crew felt overwhelming, being high-stakes and made almost blindly. We ended up shifting the hunting grounds to also be in Crow's Foot.

A related complication came up in the required decision of how to deal with the faction that controls the turf containing your hunting grounds. Unfortunately, there is no concise summary in the rules about which factions control which hunting grounds. For Crow's Foot, I remembered that the Crows claimed control over the whole district, but that there were also smaller factions who were trying to take it over. When the crew thought they would use Whitehall as their hunting grounds, I had no idea who controlled it. Would the Bluecoats—the corrupt law enforcement officers—be the faction that gets paid off? This is another case where a simple reference or a table of defaults would really help new GMs who don't have the spare cycles to memorize the litany of factions. A GM can always override a default, but in the absence of a default, I felt stranded in 150 pages of lore.

The book gives a recommended starting score that brings in three competing factions and gives the players some choices about whom to trust and whom to target. I was intimidated at the thought of doing this because it introduces several important NPCs and multiple factions, and the scenario is still likely to require improvising believably within this complex setting of Doskvol. I had previously searched for tips about how to start a Blades game, and I had read an article by Justin Alexander about alternative starting situations. I liked the simplicity of his "Aim at a Clock" advice. The book provides long-term goals, with progress clocks, for each of the factions. Given that, Alexander recommends picking a faction whose goal plays into something the crew could do, then having them pick up a score on behalf of that faction. This felt more controllable, and so as the players were finishing up their crew details, I had already started pulling pieces together: the Red Sashes and the Lampblacks each want the other eliminated, the Lampblacks are pushing some new drugs in Red Sashes territory, and the Red Sashes want the players to stop the production of these drugs. This would advance the Red Sashes' long-term goal to eliminate the Lampblacks, and it would give the upstart players a powerful ally. However, the players had already decided that their crew had paid off the Crows for their hunting grounds, which also introduces a little conflict, since the Red Sashes and the Crows both want control of the district. Also, while arson is hardly virtuous, I liked the idea of having my boys focus on stopping the manufacture of drugs rather than, say, assassinating a union leader.

Part of the art of running and playing Blades in the Dark is knowing how much planning is too much planning. It is so important that the designer put the planning constraints right onto the character playbooks: choose a plan and provide the detail. The crew knew where the drugs were being manufactured, but they knew they did not want to go in guns-blazing. They asked where the raw materials came from, which is a great question. In the moment, I decided that this was an Information Gathering move. That would make this the first dice roll of the afternoon, and in retrospect, I don't like it. Information Gathering is a roll without stakes. I would not have recognized how this put the wrong foot forward until reading Matthew Cmiel's thought-provoking (although hyperbolically titled) article, "The Unbearable Problem of Blades in the Dark." I like his heuristic that dice rolls should always be with stakes, but Information Gathering just gives you better results the higher you roll. Also, it wasn't clear to me if the fact that an action rating was being used for Information Gathering meant you could aid each other, take it as a group action, or push yourself. Looking back at it (in light of Cmiel's analysis and other reading), I think the intended answer is no. In any case, the players rolled, and they discovered that some materials come by carriage regularly and some come by ferry intermittently.

After having thought about it, I think this step could have been a small score of its own. It would have made a decent introductory mission to gather this information as part of a long-term plan to take down the factory. Indeed, this would have helped me meet my own goal of understanding the whole Blades in the Dark system, including downtime. As it is, we did not have time to wrap up the score or do the downtime actions since real-life obligations interrupted the session—including the need for the dinner table.

The crew decided that this would be a stealth mission, sneaking into the factory via the river, starting a fire, and then getting out. A few times, the players wanted to get into details such as whose gondola they could use, but I assured them that Blades wants us to get to the action. They made a standard engagement roll, and we picked up the action with them silently sliding their boat into the factory. I described an enclosed dock with several rowboats moored to it along with two thugs, chatting and smoking. The players and I had a good short discussion about how to use the game's rules to indicate the character's goal in a fiction-first approach, and they decided that their goal was to sneak past the guards and into the main body of the facility. They succeeded at this but with the complication that the factory floor was just beyond the crates.

The players talked about trying to sidle up to the work tables and pretend to be laborers, and we discussed how a flashback could be used to set up an insider. Instead, they decided to go for broke, with the Leech tossing a vial of flammable oil into the midst of the work area while the Hound fired off a few rounds to cause a panic. I told them that this was a desperate move. Here is where things started to go badly for our Shadows: the Leech badly failed the roll, getting all ones and twos. Since it was a desperate move, I described how he botched the throw and spilled the oil mostly on himself, inflicting Tier 3 Harm. This allowed me to introduce the rules for Resistance rolls as well as Armor, and by using both, he reduced the consequences to minor burns, Tier 1 Harm.

As part of my post-play reflection, I realize now that I violated one of the GM rules for Blades in the Dark: "Don't make the PCs look incompetent." I treated the roll like a critical failure in part because of the incredible number of ones that the player rolled. It was also funny, in a tragic sort of way. However, if I could do it all again, I would have had him throw the vial and have it hit something else, something dangerous to them but not immediately deadly, and certainly not something as incompetent as wandering onto a factory floor and setting himself on fire. Alternatively, since I had already established that the workers were dealing with open flames as part of the production process, they could have immediately followed a fire suppression protocol.

Their cover blown, the Hound decided to use his special ability to lay down suppressing fire to buy them some time. Unfortunately, despite having taken a Devil's Bargain that this action would anger the Crows who claimed control of the district, the Hound botched this roll, too. This was clearly a desperate move, and after accounting for armor, he took a bullet in the chest, Tier 2 Harm.

At this, the crew decided to beat a hasty retreat while trying to start a fire near the docks. The Leech had plenty of fire oil to attempt this. They wanted to escape, but they also wanted to succeed, so I offered them another Devil's Bargain: they fling the fire oil recklessly and end up setting fire to the very boat they came in on. The Leech took it and got a partial success, so I described how this area went up in flames, but several Lampblacks from the work floor were charging at them, wielding pistols and clubs.

The crew charged at the two guards who were still standing by the rowboats, and the Hound incapacitated them with some quick shooting. The complication for this filled up the clock I had started for the Crows' tolerance. At this point, the big faction controlling the region was going to take action against our Shadows for causing such chaos. The crew was more concerned at this point about survival, so they tried to unmoor a boat and get out before the charging reinforcements arrived. You guessed it, they botched this roll, too, and both of them took a beating in the attempt (Tier 2 Harm).

Faced with no other viable option, they undertook a desperate maneuver and dived into the water to swim away. This, dear reader, resulted in the first and only six that they rolled the entire afternoon. Despite their burns, bruises, and bullets, they swam out of the dock area and into the river. To me, the fiction demanded that some of the Lampblacks grab a boat and chase them, but I also realize that this was a place where we could have made their exact goal more precise in our discussion: did they think that their diving into the water was to get completely safe or to simply get out of the immediate scrap? I interpreted it as the latter, but we could have been more clear.

I started a four-slot clock for them to evade the Lampblacks and gave them one tick for swimming out into the river. The players thought their only choice was to swim for shore, and I pointed out that there were some other options, such as swimming out into the river, or pleading for their lives. That said, swimming for shore made the most sense in the moment, so they tried... and botched the roll. The Lampblacks in the rowboat got into the river and took a few shots at them. This was enough to max out the Leech's Stress, and he took the Trauma of being Unstable, which is completely understandable given how badly this mission had gone.

Now we were in a strange situation. I had established a progress clock for the crew's escape, although it was down to just the Hound now. He said he would just swim for shore, but I recognized that this would violate the Blades GM advice, "Don't roll twice for the same thing." It felt like it would just be "Swim again, but better this time." That didn't feel right, so I retconned the previous situation so that the Lampblacks had brought their boat between the swimmers and the shore. Hence, the Hound could do something like swim out into the ocean (with his punctured lung and bruises) or do something else, like beg for his life. He chose the latter, and I don't blame him. I offered him a Devil's Bargain on this attempt to sway the ruffians: he could have an extra die on his attempt if they let out their bloodlust by killing the Leech. To my surprise, he took it. The Hound knew that they were both as good as dead anyway. It was better for one of them to live than for both of them to die. The Hound succeeded at a cost, so they beat him with Tier 3 Harm and left him for dead on the shore.

We completed neither the score wrap-up nor the downtime activities. Both seemed moot with half of the crew dead, and as I previously mentioned, there were real-world pressures to clean up the table and get one of the boys to a youth group meeting. I plan on reading through the rules regarding how to wrap up a score later today so that I can do a mental walkthrough of how it would go. If the three of us play again, I think we'll just start afresh with a better understanding of the world and the rules.

And I would play again. Despite the game going badly for the crew, we all enjoyed the experience. It was a little rocky at times when I had to reference rules or lore, but that's the way it goes when you learn a new system. Every review of Blades that I have read says that you have to play several sessions before you really get into its way of playing. Although I would play again, I do not currently have plans to play again. I think it would be great fun to play with an adult group with beer and snacks, but getting a bunch of fathers together for a game night is already a desperate move where the dice are loaded. In the meantime, my boys and I got to share a fun afternoon together, and now they have a story to tell about what happens to those who turn to a life of crime.