Thursday, August 28, 2025

An evening with Torchbearer

I mentioned my interest in Torchbearer in my post about goblins and game design, and I was glad to be able to run my first session of it last night. The group consisted of my two older boys and two of their friends. We had a good time, and I want to capture a few thoughts about the experience here. What follows is a mix of a session report together with my reflections on the game.

Background

I cannot remember all the details of how I became aware of Torchbearer, but I can provide a little background about my interest. It goes back to an interest in Burning Wheel, which is always discussed with reverence and respect despite its small player base. The Burning Wheel rulebook is a wealth of brilliant and unique ideas for tabletop roleplaying, but it also feels too intimidating to run without some experience. More recently, I have heard amazing things about Mouse Guard RPG, which is rooted in the ideas of Burning Wheel. I tried for a while to get the Mouse Guard RPG box set for my family, but I heard about it after it went out of print. I even ordered one from a reseller who, after a month, acknowledged that they did not have a copy after all and refunded my money. Torchbearer is a riff on Mouse Guard RPG with a setting that appeals more to me, so I decided to get a copy of the core books for myself.

Torchbearer is recognized for having interesting, interlocking systems, and it did not seem too overwhelming for me to run it without having played it. I am grateful that my players accepted that the game might occasionally get choppy. Good DMs know that rulings are more important than rules, but my players understood that one of my goals in running the game was to understand it.

Character Creation and the Call to Adventure

We started with by-the-books character creation. I expected this to take an hour, but it took almost 90 minutes. I did not present a complete rules explanation ahead of time, but instead taught the pieces of the game that we needed as we needed them. We ended up with a Warrior, Outcast, Burglar, and Magician. The players had just a little trouble articulating beliefs, but reading the ones from the sample characters was helpful. They had more difficulty with instincts because writing these effectively requires some knowledge of how the game works. The instinct format restrictions helped: for example, one player wanted the instinct, "I always know where the exits are," and I was able to help him convert it into something active, like "I always look for the exit." One of the instincts was, "I always strike first," which sounds exciting and appropriate but is not really workable in a game without initiative. (When we got into our only conflict of the evening, I ended up giving him +1s on his first strike as compensation, and I figured we would just change his instinct if we play again.) The players' goals were reasonable given what they knew from the adventure seed, although no one ended up making progress toward them by the end.

The players decided that the characters did not know each other but were in the right place to be recruited for the "Tower of Stars" adventure from the Cartographer's Compendium. It is shorter than the one in the core books, so I thought it might fit better as a one-off. In fact, we only got through about half of it. I modified one of the introductions slightly to get the party headed toward the tower to deliver a letter to the reclusive Beholder of Fates, allowing us to start the adventure at the base of the tower.

The Adventure

Their initial plan was to split the party, half exploring the surrounding area and half climbing into the ruins. This was a good opportunity for me to suggest that, especially in this kind of dungeoncrawl, you don't split the party. They succeeded at getting inside, which gave me the opportunity to explain the Grind. This is a beautiful aspect of Torchbearer in which the characters gain negative conditions every four turns, and any roll is what makes a turn advance. It made it clearer to them why instincts were powerful and why it's better to have A Good Idea than to rely on skills.
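The Grind is simple enough to sketch in a few lines of Python. This is my own illustrative model after one session, not the book's rules text: the condition list is abbreviated and its order is a placeholder, and the real game has more nuance about which actions count as turns.

```python
# A sketch of the Grind: any test advances the turn counter, and every
# fourth turn the whole party takes the next negative condition. The
# condition list here is illustrative and abbreviated, not the book's
# full progression.

CONDITIONS = ["Hungry and Thirsty", "Angry", "Tired", "Exhausted"]

class Grind:
    def __init__(self):
        self.turn = 0
        self.party_conditions = []

    def advance(self):
        """Call whenever any player makes a roll."""
        self.turn += 1
        gained = len(self.party_conditions)
        if self.turn % 4 == 0 and gained < len(CONDITIONS):
            self.party_conditions.append(CONDITIONS[gained])

grind = Grind()
for _ in range(4):
    grind.advance()            # four rolls pass...
print(grind.party_conditions)  # ['Hungry and Thirsty']
```

Laying it out this way makes the design pressure obvious: every roll is a resource spent, which is exactly why instincts, which as I understand them let characters act without burning a turn, matter so much.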

In the ruined ground level of the tower, while searching for treasure, the Warrior discovered the remains of a broken basalt statue with a magical rune emblazoned on its forehead. The Magician wanted to decipher the rune, but when he pulled together his dice pool, it was clear that he had little chance. At this point, I explained how he could use a trait against himself, further reducing his odds of success but earning a "check" that would be useful in camp. He described how his Quick Witted trait might lead him to jump to the wrong conclusion. Sure enough, he failed the roll, and it provided me a glorious opportunity to deploy one of the adventure's recommended twists. I described how in the back of his mind, he could hear the name of the rune, but as he thought more about it, he realized he was hearing it chanted all around him. This was when the party discovered that some kind of troll rats had emerged from the rubble, and they were chanting in a mysterious ancient language. Once they were recognized, the rats attacked the party, leading us into the evening's conflict.

Here, I explained to the group a few of my favorite pieces of Torchbearer. First, there are different kinds of conflicts depending on the party's goals. In our case, this was a Drive Off conflict since the goal was merely to get rid of the rats, not necessarily to kill or capture them. (In retrospect, I should have left the rats chanting menacingly and given the party the opportunity to take the initiative here, to decide if they wanted to chase them off or leave them as creepy watchers.) Second, I explained how disposition works, that characters don't have "hit points" until they are in a conflict, and these hit points are only relevant to the current goal. Third, I explained the conflict system itself in which the GM and the players each queue up three actions, each of which are then executed simultaneously.

I could not remember, nor quickly find, how to compute the disposition for a group of enemies. I mistakenly thought I would multiply their Drive Off disposition by the number appearing, but this is wrong: I should have used the base disposition and then added one for each extra participant. Because these enchanted troll rats had such low disposition, it was only a four-point difference. Unfortunately, I also could not remember the rules for whether damage would distribute across multiple enemies once one was eliminated. It does, but I treated it as if it didn't, which was unfortunate since the opening attack by the Outcast was a brilliant and brutal success, and it should have taken out at least two of the rats instead of just one. The battle went on a little longer than it should have, and the Burglar's elimination made me take away the party's Fresh condition as a minor compromise. Even though we didn't get all the rules right, the players enjoyed the excitement and narrative structure of the conflict. They could see that this would be a successful conflict from early on, especially given the Might difference between the groups, but they also saw that they could not be overconfident. During the conflict, we also saw the Magician use Beginner's Luck to take an unexpected swipe at the rats, which was much more memorable than watching him roll a d20 and hope for a high roll.
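For other new GMs, the calculation I fumbled is simple once written down. This sketch reflects my after-the-fact reading of the rule (base disposition plus one per additional enemy); the numbers in the example are invented for illustration, not taken from the adventure.

```python
def group_disposition(base: int, count: int) -> int:
    """Disposition for a group of enemies: the base disposition for the
    conflict type, plus one for each participant beyond the first."""
    return base + (count - 1)

# Invented example numbers: a Drive Off conflict against five rats.
base, rats = 3, 5
mistaken = base * rats                   # 15 -- what I did at the table
correct = group_disposition(base, rats)  # 7
```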

With the rats eliminated, the Outcast proceeded to work his way up to the crack in the ceiling, fail the Dungeoneering test to get through, and got stuck. This happened to trigger the Grind, and everyone became hungry and thirsty. The ceiling was only ten feet high here, and so the Magician decided to try to push the Outcast through with his staff, soliciting the help of the Warrior. I explained that this was a test of pure strength, so it was a Health check. The Magician's player, seeing that his character had only 2 Health, said that if it's a Health check, he won't bother. I explained that he was already doing it and had him roll. Again, this is a case where a simple rule—if you say you're doing it, you're doing it—gives the game a lovely gravitas. Naturally, he failed the roll. Rather than throw a new twist at them, I decided to give a condition. I described how he was grunting and shoving, standing next to the smelly and crude Warrior, looking up at the Outcast's rear end, and so although they got the dwarf out of his predicament, the Magician came away Angry. The table loved it. The Magician's player had earlier expressed how uncomfortable he was with role-playing and talking in character, but here, right away, he picked up on the series of things that compounded to make him angry: the tower is ruined, I don't know these people, the dwarf got stuck, the Warrior burps constantly, we got assaulted by magical rats... It was wonderful. Unfortunately, I forgot to give lesser conditions to the Warrior and Outcast, both of whom helped. I am still learning the ropes.

The party got up into the next room, where there happened to be a decaying corpse. The Warrior was very excited to check it out given his instinct ("Always look for loot.") and his being corpse-wise. He was surprised, and so was I, that his being corpse-wise didn't actually help him in evaluating the corpse: it would have been useful for him to help someone else do it, but in the absence of fate or persona points, it provided no benefit to him. This was an unfortunate bit of ludonarrative dissonance, and it's also where we decided to call it a night after almost four hours together. All the guys thanked me for running the game, and many expressed being impressed with Torchbearer. I think they all had come to understand why I was excited to run it, and they came away with some good stories to tell.

Thoughts on Torchbearer

I love the simultaneous actions in Torchbearer; it makes d20-style combat look like a tennis match.  Not only do simultaneous actions provide interesting opportunities for storytelling, but at the table, you get that lovely feeling of flipping over a card to reveal a secret. It's not obvious to me how much of the action selection is really tactical rather than blind choice, but it almost doesn't matter: it's still fun. On the side, I have been working on an OSR-style dungeoncrawl video game inspired by ICRPG and Knave, but witnessing the joys of simultaneous actions makes me waver in my dedication to traditional turn-based combat.

I was nervous about remembering all the moving parts of Torchbearer. I printed up two different cheatsheets, but I'm not sure either one helped as much as I had hoped. One particular thing I was concerned about was how Fate and Persona points work. It wasn't until my prep the day of the session that I realized that new characters don't have these anyway. I share this for other new GMs so they can reduce their cognitive load on the first session. On the flip side, I had forgotten how Nature works, that when a player is facing a situation where they have no skill, they can opt to use their Nature instead. The party may have missed an opportunity to use these in a narratively interesting way.

The Grind is a brilliant system for putting pressure on the party. It presents time dramatically, tying together time and action, rather than in a simulated way. This is another aspect of the game that inspires me to think more about tabletop-videogame crossover. It's easy to run any number of simulations in a computer, but where is it better to deploy a dramatic tool instead? For example, I think it's more engaging to say that a torch lasts for two interesting moments than for sixty simulated minutes.

The Torchbearer books explain that it is designed to produce stories that unfold over ten to twelve sessions. Mine was a one-off learning session with a collection of players that will be hard to reproduce. I had a great experience though, and it makes me want to run a more sustained campaign to see more about how these systems interact.

Tuesday, August 26, 2025

Notes from Tynan Sylvester's "Designing Games"

I just finished reading Tynan Sylvester's Designing Games. The book was published in 2013, but Sylvester is probably best known as the creator of Rimworld, which came out a few years later and is not mentioned in the book. I have read many game design books, but I still learned a lot from this one. This inspired me to share some of my notes here.

Sylvester's core conceit is that games are artificial systems for generating experiences. He presents a model early in the book that undergirds the rest of the text, that a game provides mechanics wrapped in fiction, which produces events with fictional meaning, which produces emotions, which leads to an experience. I am a collector of definitions for "game," and I think this may be the strongest formalized definition I have encountered. I appreciate that it highlights what makes games different from other media, which is a theme that also comes up frequently in the text.

An early chapter discusses emotions and their role in game design. He uses the term "human values" to refer to anything that is important to humans that can move through multiple states, such as life/death, victory/defeat, and friend/stranger/enemy. I am not keen on his terminology here since it sounds relevant to morals or ethics, but I like the hook for thinking about design. It reminds me of the binary opposites that are essential to imaginative education. Sylvester provides a catalog of emotional triggers that can be used here, each of which is given appropriate exposition in the text: learning, character arcs, challenge, social interaction, acquisition, music, spectacle, beauty, environment, newfangled technology, primal threats, and sexual signals. 

I have long been familiar with the idea that players cannot truly answer what caused different emotions, but that they will construct a narrative to explain the phenomenon to themselves. I was not previously aware of the technical explanation known as misattribution of arousal and the psychological studies around it. Sylvester focuses on the "bridge" studies, and a little searching online finds that this is a robust finding. A related psychological concept he draws upon is the two-factor theory of emotion, which explains that what we call "emotion" is a combination of physiological arousal and a cognitive label. This explains how the same physiological stress response could show up in Tetris and in Doom, but the latter's fiction is what leads us to call it "fear."

I would have preferred that Sylvester give more precise definitions around his use of "elegance" and "emergence" since he does not adequately address Kate Compton's 10,000 bowls of oatmeal problem. It's one of a few chapters where I wondered how Sylvester might approach a second edition. No Man's Sky came out a few years after the book was published, and Sylvester's own work on procedural storytelling undoubtedly gave him new insights into these ideas.

Sylvester uses apophenia to describe how humans find patterns in noise, but I wonder if he uses it a bit too broadly. Apophenia describes phenomena like "lucky streaks" in gambling. Seeing a face in the clouds strikes me as different because our visual cortex is programmed specifically to find faces. Similarly, anthropomorphising a non-human character seems like it appeals to our narrative sense, not to a matter of pattern-matching. I wanted to track this observation from my notebook because it's an area where I think precision matters, but I realize I am not completely confident in my own ability to describe the edges of these concepts.

The need for frequent playtesting also recurs throughout the book. I am glad that it does since it is such fundamental advice. In his discussion of balance, he points out that what we should be gathering from playtesting is experiences and not suggestions. Again, the essence of this advice is the same in any game design text, but Sylvester presents it with admirable clarity and succinctness, especially given that he is working within a specific and explicit definition for "experiences." He points out that each playtest is a story, but you need to playtest until you can see past the stories into the systems. This gives me an interesting heuristic that I plan to share with my game production students.

His chapter on multiplayer games includes a section on game theory. I am no expert in this area, and I appreciate his layman's coverage of Nash equilibria. He points out that rock-paper-scissors is the only elegant symmetric game without pure Nash equilibria and matching pennies is the only elegant asymmetric game without pure Nash equilibria. All other (game theoretic) games add more entries to the decision matrix without fundamentally changing the structure and are therefore inelegant. I don't usually turn to game theory in my designs, but I enjoyed this section for its clarity and perspective.
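The rock-paper-scissors claim is easy to verify with a brute-force check. The sketch below enumerates every pure strategy profile of zero-sum rock-paper-scissors and confirms that in each one, at least one player profits by unilaterally deviating, so no pure Nash equilibrium exists.

```python
import itertools

# Row player's payoffs in zero-sum rock-paper-scissors: 1 win, 0 tie, -1 loss.
MOVES = ["rock", "paper", "scissors"]
PAYOFF = [
    [ 0, -1,  1],  # rock     vs rock, paper, scissors
    [ 1,  0, -1],  # paper
    [-1,  1,  0],  # scissors
]

def pure_nash_equilibria(payoff):
    """Return (row, col) move pairs where neither player gains by
    unilaterally switching strategies."""
    n = len(payoff)
    equilibria = []
    for r, c in itertools.product(range(n), repeat=2):
        row_best = all(payoff[r][c] >= payoff[alt][c] for alt in range(n))
        # Zero-sum: the column player's payoff is the negation of the row's.
        col_best = all(-payoff[r][c] >= -payoff[r][alt] for alt in range(n))
        if row_best and col_best:
            equilibria.append((MOVES[r], MOVES[c]))
    return equilibria

print(pure_nash_equilibria(PAYOFF))  # []
```

Running the same check on the 2x2 matching pennies matrix also comes back empty, which matches the other half of Sylvester's observation.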

I was a little surprised to see him adopt the term Yomi from David Sirlin to describe how players "read" each other—how they predict, deceive, and outwit each other in multiplayer games. I used to follow Sirlin more closely, and I recently uncovered my copy of Flash Duel. It was fun to see his name show up, and I see he's working on a second version of the eponymous Yomi.

I am guilty of colloquially referring to dopamine as a pleasure-causing drug, but Sylvester is more careful in the book to distinguish between motivation and pleasure. Dopamine causes motivation, not pleasure. Remove dopamine from rats, and they will do nothing, not even eat, even though they can still enjoy sugar syrup that is fed to them. It is an important distinction and one I should be careful not to blur since this biochemical understanding allows us to talk more clearly about how games can motivate us to play past the point of enjoyment.

The third and final part of the book is devoted to process. I expected a recommended series of steps to create a game as offered by other texts. Instead, Sylvester opens the section by pointing out that a lot of our terms around process and management are actually borrowed from other disciplines: director, preproduction/production/postproduction. Here, he applies the same perspective to the process design problem as he does to the rest of game design: one needs to recognize assumptions and evaluate them, realistically but not cynically. Although he is right, he also later uses terms like "production" as if we all know what they mean, and I found myself wishing he were more prescriptive here, as he was in the section on emotions and experience.

His advice on process comes down to the same observation made by the signatories of the Manifesto for Agile Software Development: work in tight iterations with good feedback loops. This appeals to my sense of lean production, that we should use only as much management as necessary. When it comes to sequencing, his advice is to figure out the dependencies among needs and then work upward from there. Here, I would augment his advice with some observations from story maps a la Jeff Patton, but I don't disagree with him. Sylvester is explicit about keeping the game as low fidelity as possible (that is, grayboxing) throughout so that time is not wasted re-creating assets. He mentions that working with an artist or level designer is important, but he is mum about what those other folks might be doing while this grayboxing is going on. I would have liked to know his thoughts about this, given that I mentor fixed-staff student teams.

The chapter on authority is beautiful, and I plan to use it as an assigned reading. Although he does not use these terms, what he describes is an approach to creative management rooted in natural law and subsidiarity. In particular, he talks about how a team member has natural authority over their work, and how it is a mistake to arrogate that. (Note: I learned the word "arrogate" from reading this book even though I've been battling arrogators in ISS Vanguard for some time without looking it up.) This chapter dovetails nicely with the next, which is on motivation. It is clear that he believes the only way a team can succeed is in an environment of trust and respect. Indeed, he goes so far as to say that one can only succeed by "getting people you can trust and then trust them." I used to recruit undergraduates for special community-engaged game development projects, but now, any student can elect into the game development track that puts them into the game production sequence; I am not sure what it implies for my teaching if Sylvester is right. 

Speaking of teaching, I am reminded of the essential challenges presented by the need to grade students in creative projects. Grades are an external reward, and we know that external rewards can kill intrinsic motivation. What then is a professor to do? 

One of the final points in the book is that the best way to motivate a creative person is to ensure that they have constant, small, visible progress. This is the progress principle. It turns my mind again toward considering physical task boards to replace the convenient digital ones my teams used the last two years.

I thoroughly enjoyed the book, and I am grateful for how much I learned from it. I will certainly return to my notes once I start pulling together my plans for the next game production sequence, assuming I get assigned to teach a section or two. The book inspires me to draw more from my own learning to build a process that I believe in, which will be easier after having stepped away from team-teaching for now. I am not sure how much of the book would resonate with beginning designers, who, in my experience, need more structured explanation and exercises to get them into the right mindset. I think it's valuable reading for someone who has already moved beyond the amateur steps, though, once one has come to tacitly understand iteration and the challenges of communication and motivation.

Saturday, August 23, 2025

On Goblins and Game Design

I recently purchased the rulebooks for Torchbearer 2nd Edition. Something reminded me of the project a few weeks ago, and the free introductory chapter captured my imagination. I have not had a chance to play the game yet, but I hope to do so in the coming weeks.

The Scholar's Guide includes a bibliography akin to AD&D's Appendix N and DCC's eponymous homage. I decided that I would add some of the references to my ever-growing pile of books to read. If nothing else, it will give me a good talking point about "remedial fantasy" during my sabbatical presentation. I started with Lord Dunsany's The King of Elfland's Daughter because it was easy to download from Project Gutenberg while I was on vacation. It contains the most poetic description of Elfland I have encountered: it is a place where poetry lives and some places can only be described in song. Yesterday, I finished The Wizard of Earthsea, which I have known about for decades but never took the time to read. It was an enjoyable story even if it was in the tired child-of-prophecy genre.

The Torchbearer Scholar's Guide describes various creatures that a brave adventurer might find lurking in ruins and caves. Many are classics of the TTRPG hobby described with the default setting's Norse flair. The description of the humble goblin blew me away. 

It is questionable whether goblins are alive in the same fashion as humans or halflings. Rather than being born, a goblin springs from the shadows each time a child tells a lie to its mother or a grandchild steals from its grandparent. They age but do not die from senescence or disease. They can be slain or driven off, but soon after they regather on the margins, hungry for more mayhem.

This is wonderfully mythical. Goblins are not just little green people: they are something else entirely. They are fearsome creatures born of sin. They reflect a dark world where evil is not just a privation of good but itself a creative force. It is both poetic and frightening to the core.

I got onto the Burning Wheel Discord server and asked whether Torchbearer's goblins were inspired by any particular work in the bibliography. I got into a discussion with the coauthors, Luke Crane and Thor Olavsrud. Below is Crane's description of how he designed the goblins, quoted with permission.

The inspiration came while editing tb2e and, as it often is for me, it was born of frustration. I have seen so many goblins slaughtered in my time as game master in D&D. Goblins as vulnerable diminutive anthromorphs might make sense from an evolutionary niche perspective, but it’s entirely unsatisfying to me in terms of a supernatural cosmology in a fantastical world.

As Thor points out in his example, categorizing supernatural beings is never easy. By their nature, these beings defy classification. The difference between trolls, giants and ogres, for example, is best left for the academics to debate. Adventurers should be more concerned with more pressing matters.

So for goblins, in the editing process, I needed a way to use Thor’s taxonomy of beings that demonstrated the vibrancy of these Others and gave goblins a reason to be. They needed a supernatural niche, not an ecological one. So I cackled to myself (out loud!) and gleefully muttered: Spirits! What if they’re spirits?

Since we were developing the spirit conflicts in the LMM [Lore Master's Manual] at the same time, I knew this classification would create problems and possibilities for adventurers.

To support this idea with the description, I attempted to reach into tropes found in folklore. What if those warnings to children about not lying and stealing were true? A second cackle emerged as I imagined goblins sprouting like weeds in the shadows of towns and steadings throughout Middarmark, while grandmothers fruitlessly wagged their fingers and plead with their charges to behave.

Even better, this supernatural provenance sketches a supernatural economy. Why should the simple folk of this land tolerate magicians, theurges, shamans and sorcerers? They are the only ones capable of banishing this incessant plague of goblins. Or what of a witch-queen who inveigles children to lie and steal for her, and so creates an army of mischief?

The possibilities are many, and they wear a different mask than that of the fearsome descendants of Azog and Bolg.

I love how his explanation combines mythmaking and systems. To me, that is the essence of good design, where the narrative and the mechanisms support each other, creating an engine for interesting experiences.

Monday, August 18, 2025

The Pyramid Point Games Summit

My family recently completed a lovely road trip around the Great Lakes. One of the most spectacular views was found at Pyramid Point on the Sleeping Bear Dunes National Lakeshore. The photograph doesn't do it justice.


When we reached the lookout point, a young couple was just turning to go back down the trail, and an older couple was behind us on their way up. The young man in front of me wore a University of Michigan cap with "CSE" emblazoned on it. Upon my inquiry, he confirmed my suspicion that it stood for "Computer Science and Engineering," and he told me that he was a doctoral student there. I told him that I had completed my Ph.D. in the University at Buffalo's CSE department, and I asked about his specialization. He told me that he studied game theory. I laughed and told him that I worked in game design and development, which was a different kind of "game." At this point, the other couple had caught up, and the gentleman joined us, telling us that he worked for the Michigan Gaming Control Board on the certification of electronic gaming (that is, gambling) machines.

And so it happened that we held the Pyramid Point Games Summit, where, by incredible chance, the three men on the top of the dune represented specialists in the three different definitions of "game." We laughed at this unlikely coincidence, enjoyed the view, and then parted ways.

Thursday, July 3, 2025

Some thoughts on Fate and Forged in the Dark games

Yesterday, my two older boys and I played a tabletop RPG session with a friend and his two oldest children. We had done a one-off OSR game years ago, and I was surprised when my son reminisced about it and wanted to do something similar again. After some discussions about his hopes and goals, I decided to try a session of Fate, which I haven't done in years. I played Fate once five years ago, and though that was my most recent experience with it, I find myself referencing it frequently as a touchstone in RPGs. I think it would be interesting to play Fate under the direction of a GM who had mastered the system. In the absence of such experience, as with my last play, I felt a little underprepared, despite the system being relatively light.

I settled on Fate Accelerated (FAE) as a ruleset for a one-shot game with our friends, and we had a fun time exploring the work of a secret, elite, international organization in the 1930s that was funded by a mysterious benefactor to pursue justice. It was very pulpy and inspired in part by a cousin's fondness for Doc Savage.

I looked for some advice online about how best to use FAE for a one-shot, which reminded me to skim the excellent Book of Hanz. An old Reddit thread suggested using the Phase Trio from Fate Core for character creation, and that sounded like so much fun that I decided to try it. It turned out to be essentially a narrative minigame in itself. It took a while to get everyone involved, in part due to unfamiliarity (mine and theirs), and I think it may have also shown different players' comfort levels with this style of character creation and with the tropes of 1930s pulp. If I do it again, I should probably send out some introductory materials and preparation suggestions before the game to help get players in the right mindset.

We ended up with a ragtag team of specialists who had some complications in their backstory. The group played into the idea that character aspects can be strengths and weaknesses, and this was a lot of fun, although it's harder to play this up equitably in an ensemble cast. I tried to spread the spotlight around, but five players is hard to manage (and one more than the rulebook's recommended maximum). Many scenes took much longer than expected, but one marvelous outcome was that the group discovered a completely unexpected way to resolve the main plot. The climactic ending involved explosions and petty vengeance, which is an excellent way to end a pilot episode.

A quick turnaround of events meant that we did not have much time for a debriefing conversation after the game, but in a short chat with my boys, I found myself comparing Fate and the Forged in the Dark games. Both are designed to give a fiction-first adventure, but they accomplish it in different ways. Fate encourages a cinematic approach, but its mechanisms do not demand it. For example, in our game yesterday, the team received a call that had them flying to Havana. Rather than cutting to the flight, some players wanted to start itemizing what gear they were bringing along in case they needed it. Now in a movie, you'd have a Chekhov's gun scenario: if the camera shows a character packing a particular gadget, that gadget had better be used later in the show. With a player-driven story, though, how is one to know what will be useful later? The instinct is good from a player's point of view, and it's not exactly anticinematic, but it is dull. I pointed out that any player could always spend a Fate Point later to simply declare that they had brought some particular gadget with them, and I realized immediately that I had just turned Fate Points into Blades in the Dark's flashback and load system. This helped the players understand the various uses and robustness of Fate Points, but in reflecting on the experience, it also pointed out how brilliantly Blades ties flashbacks to Planning and Load. Get the competent characters into a situation with little fanfare and let the details be clarified through play. I like that better than having players try to lay out the perfect plan and hope it works.

The other piece from Blades in the Dark that I missed was player-facing rolls. In truth, I struggled to remember the distinctions between Fate's Challenges, Contests, and Conflicts, terms that are nearly synonymous in English. I had hoped that the players would face one of each, but we never had much of a direct Conflict. Or, put another way, we could have, but I ended up handling it using what feels like a faster player-facing, elective-order system. I find the idea of tallying each side's Speed and then taking turns with each NPC to be tedious. It's the kind of thing that I may have enjoyed in my youth DMing D&D 2e, but those games had a more tactical and simulationist bent. Now, I much prefer the rhythm of letting the players act and watching the world react.

Still, one of the most compelling ideas in Fate is the robust aspect system, and I quite like the action of creating advantages. We had only one good example of that, but I think it's because a player could see that there were interesting on-screen ways that his character could lead another character to an exciting spotlight moment. I have not seen any Forged in the Dark games that have something equivalent.

One other criticism that I have of Fate Accelerated in particular is that, after playing through the game, I don't like approaches over skills. I like the idea of describing how characters approach problems, but practically speaking, it was too easy to alter one word of a story to get a systemic boost. If you're in a boat race, are you necessarily driving "speedily"? It seems to me that there's not much difference between saying you're doing it speedily, flashily, cleverly, or even forcefully. What really matters here is that you know how to drive a boat. (Do you "drive" boats? Whatever.) So, one of my takeaways is that I want to take a closer look at Fate Condensed as an alternative. I think it would have really helped some of the players, especially in the large group, to have an obvious spot where their particular skills could be brought to the table—that they can do a thing others cannot do, not just approach something in a way that's essentially the same as their neighbor.

In preparing for the session, I came across the It's Not My Fault! cards, which I have ordered and am excited to try, so this will certainly not be my last experience with the Fate system. I suspect that this set may be a better tool for one-shots, although I have no regrets about exploring the Phase Trio for character creation. However, it also makes me want to get Scum & Villainy back to the table. I want to try running it using the Deep Cuts system changes to Blades in the Dark. I also want to rewatch Cowboy Bebop, and maybe these things are coupled.

That's my hodgepodge of thoughts collected from an enjoyable game yesterday. My boys have been on an unexpected TTRPG kick lately, and I need to talk to them about whether they want to pick up the Scum & Villainy campaign we started last Autumn. Incidentally, folks who like Blades in the Dark should check out the recently released video game Cyber Knights: Flashpoint, which is obviously inspired by it and draws liberally on its systems.

Monday, June 23, 2025

Stones

My family gave me this lovely collection of stones for Father's Day.

They are all stones from Clank!, one of my family's favorite board game series. In reading order, they are sapphire, ruby, emerald, black tourmaline, (Herkimer) diamond, smoky quartz, amethyst, rose quartz, and topaz.

Wednesday, May 14, 2025

A proposal for the disclosure of the use of generative AI in scholarly writing

I recently encountered a scholarly manuscript in which the author acknowledged ChatGPT as being instrumental in composing the work. This is fraught with problems similar to those of students' use of such generative AI tools: how is the reader to trust that the author actually means what is written?

I propose that those who wish to publish such manuscripts must also release their original draft, prior to its being passed through AI tools. This recognizes that the purpose of these tools is to translate between natural language dialects, and so they should be treated like other translations. When I read an English translation of François Mauriac's Viper's Tangle, I can consult the original French in order to distinguish between the connotations of the original and the interpretation of the translator.

Wednesday, May 7, 2025

Reading "Fifty Quick Ideas to Improve Your User Stories"

I recently finished Fifty Quick Ideas to Improve Your User Stories by Gojko Adzic and David Evans. The book was strongly recommended by Dave Farley on his Modern Software Engineering channel. I have been using user stories for many years as part of my teaching and practice with agile software development, and I hoped that the book might give me some new ideas and perspectives. I read it with a particular interest in helping students manage game development projects.

The book itself is laid out simply: every two-page spread presents one of their recommendations. Each has a narrative, a summary of benefits, and then practical recommendations for getting started. From the title, I expected the book to be short and punchy like The Pocket Universal Principles of Design, but it contains a lot more detail. The authors are clear that the book is not an introduction to user stories; they assume the reader is already familiar with user stories and the jargon of agile software development. This is good for a reader like me, except that I had earlier lent the book to a student who was just learning the fundamentals. I would use a different resource for that purpose if I could do it again.

Here are suggestions from the book that struck me as potentially helpful to my teaching and mentoring. I have included the page numbers for ease of future reference.

Describe a behavior change (p14)

This is the second time recently that I have come across Bill Wake's INVEST mnemonic for user stories—that stories should be Independent, Negotiable, Valuable, Estimable, Small, and Testable. The other place was Vasco Duarte's No Estimates, which I wrote about earlier this year.

The idea behind this recommendation is to quantify change. For new capabilities, "start to..." or "stop..." are simple quantifiers. I feel like there's something useful in here for introducing students to user stories.

Approach stories as survivable experiments (p18)

This tip is all about the perspectives of what stories allow. I often see students mistake stories for traditional requirements, probably because traditional requirements look more like school. Framing the stories as experiments may help students see that this is more creative and exploratory.

Put a "Best Before..." date on stories (p24)

I wish I had thought of this one. I have mentored student teams where they have a story in the backlog that only makes sense to do by a certain date or milestone. It's an easy one to remember, and it falls into a common pattern in the book: make the stories your own.

Aside: It reminds me of the advice about managing my curriculum vitae that I received from my colleague Mellisa Holtzman many years ago. She said that your vita is your own, and that you should use it to tell a story rather than slavishly follow a template. She was right, both philosophically and rhetorically. My university recently moved to using a one-size-fits-none digital solution, and I was disappointed that there has been no discussion about the impoverished epistemology of such a move.

Set deadlines for addressing major risks (p28)

This is similar to the previous one, and it references it as a technique: one can use learning stories that have "Best Before..." dates.

The authors distinguish between learning stories and earning stories, referencing an old presentation by Alistair Cockburn. This is a nice distinction as well that I am sure I can use.

Use hierarchical backlogs (p30)

Having spent the last year mentoring three teams with flat backlogs, I found myself missing story maps, and I look forward to going back to them. Story maps come up explicitly on page 36 under the recommendation Use User Story Maps, which is just after the recommendation on page 34 to Use Impact Maps. Gojko Adzic also wrote a book on impact mapping. I have read most of it and find it intriguing, and I can see how it could be useful in a business environment. It didn't strike me as immediately helpful for my needs teaching game design and development or the basics of agile software development.

Set out global concerns at the start of a milestone (p40)

The authors acknowledge that user stories cannot and should not capture everything that a team has to do. Writing emails or reading applicant vitae are examples of crucial work that is not directly related to a user story. 

Cross-cutting concerns such as security, performance, and usability should still be discussed, but they should manifest as something like a checklist that accompanies the whole process—not embedded or wedged into a user story.

Use low-tech for story conversations (p54)

The recommendation for whiteboards and sticky notes comes as no surprise to me. What did surprise me was that they caution against using a room with a large central table. Such a configuration makes people sit and opine rather than converse, and the artifacts that come out of the conversation (what we often call the "user stories" themselves) are just markers of that conversation.

Diverge and merge (p58)

When user story analysis is being done by a large team, they recommend breaking the team into smaller groups. Each group comes up with examples of user stories, then the groups compare their results. They suggest 10-15 minutes of divergence. 

In last year's game production studio, one team was much larger than the others, and I recommended that they break down story analysis by feature domain. I wanted each group to have representatives from Computer Science and from Animation. It worked reasonably well, but it was slow, and there were still a lot of holes in the analysis: cross-cutting ideas were lost. (Also, most of the students didn't know anything about user stories, and some harbored significant misunderstandings.) Next time, I will keep this advice in mind and have them work on the same context rather than different ones. I regularly have students push back on this kind of activity as redundant, but this is because they have not yet experienced the productivity drop that comes from realizing later that the requirements were wrong.

Measure alignment using feedback exercises (p62)

This one jumped out to me because it sounds like it came right from a book about active learning techniques for the classroom. 

During user story analysis, rather than ask if anyone has questions, give the discussants a sample edge case and ask them an open-ended question such as, "What happens next?" Each writes their response, and then the team compares them. This shows whether there is shared understanding or not.

Reading this gets me fired up to use it in all of my teaching from now on. That's awkward since I won't be teaching again until Spring 2026. That gives me all the more reason for this blog post.

Split learning from earning (p90)

As mentioned above, this separates two kinds of stories. Learning stories still need a concrete goal that is valuable to stakeholders. An interesting corollary to this advice is that every story should be either learning or earning; if not, it is not really a user story at all.

Don't push everything into stories (p98)

I foreshadowed this one earlier as well. They advise strongly against tracking non-story work with user stories. An example they give is insightful: if such work is tracked in the backlog, and the stakeholders sort the backlog, then they will deprioritize work that does not generate value. The non-story work won't get addressed, then, until it is a crisis, at which point it becomes the highest priority, so there was no value in having it in the backlog.

Avoid using numeric story sizes (p102)

The two-page spread summarizes some of the key arguments behind the No Estimates movement. Here, they recommend that anyone using T-shirt sizes use at most three different sizes. Better, they prefer Goldilocks sizing: a story is either too big, too small, or just right. Their conclusion is similar to Duarte's: just count stories.

Estimate capacity based on analysis time (p106)

If we know that planning for an iteration takes 5%-10% of the iteration time, then we can timebox the actual planning and then only work on the things that could be covered in that planning meeting. That is, use the planning meeting's duration as a heuristic for the scope of an iteration. This is clever, and the authors acknowledge that the moderator needs to prevent the meeting from getting too far into the weeds.
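As a back-of-the-envelope sketch (my own illustration, not code from the book), the heuristic could be expressed like this. The 5%-10% figure comes from the passage above; the function name, the 0.075 default, and the two-hour meeting are all hypothetical:

```python
def iteration_capacity_hours(planning_hours, planning_fraction=0.075):
    """Estimate how many hours of iteration work a timeboxed planning
    meeting can cover, assuming planning consumes a fixed fraction of
    the iteration (5%-10% per the authors; 0.075 splits the difference).
    """
    return planning_hours / planning_fraction

# A two-hour planning meeting would then scope roughly 27 hours of work.
print(round(iteration_capacity_hours(2)))  # 27
```

The point of the heuristic is that when the timebox expires, whatever was analyzed in the meeting is the iteration's scope; nothing unanalyzed gets pulled in.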

Split UX improvements from consistency work

They recommend having UX changes explored by a "mini-team" that explores and prototypes. The output of this mini-team's work is the user stories for the whole team, not exhaustive design specifications. The work of this exploratory team is on the learning side of the learning/earning divide. The mini-team can slipstream back into the main team to help with implementation.

This interested me considering how often I see teams struggle with questions of gameplay and engagement. "Will this be more fun?" is a good design question that can be approached through prototyping. Unfortunately, my teams often want to do this work as if the answer is a fait accompli. I wonder if this recommendation, together with the learning/earning split, will help me frame better for them what it means to answer the design question with a prototype first. The outcome would be the stories for the whole team to actually engineer a solution. I suspect I will encounter much of the same resistance I mentioned previously, where teams assume working on the "product" is more efficient than working in prototypes because they have never seen a product fail under bad design or bad engineering. I think this is why I prefer working with students who have tried hard things and seen how mistakes have consequences beyond a scolding or a bad grade.

Check outcomes with real users (p116)

The authors proclaim this as potentially the most important thing a team does. The important corollary here is that this requires one to write user stories the way one approaches TDD, or even how one ought to design learning experiences: focus from the beginning on an assessment plan.

Saturday, May 3, 2025

The outcomes of the 2024-2025 Game Production Studio

I have been working with a multidisciplinary cohort of students from Computer Science, Art, and Music in the Game Production Studio for the past three semesters, team-teaching the experience with Antonio Sanders from Animation. At the end of this process, we had three games released for free on itch.io:
I am proud of the work these students have done, but it was a very different experience from the previous cohort. Our third cohort of students in the Game Design & Development concentration are currently wrapping up preproduction, but I have not been involved in that course. I am out of the sequence for a year as Travis Faas steps up to lead the sequence, which frees me up for Fall's sabbatical that I mentioned yesterday.

One of the primary differences between last year's and this year's cohorts was scale. Mentoring a single team of six students is very different from mentoring 25 students on three teams. Some things I had done intuitively needed to be made more structured. The clearest example is the asset pipeline. Both years, there was a studio policy that everyone needed to be able to open and edit assets and get them into the engine. It was easy to confirm this with my smaller team, and for the larger one, I took students' word for it. It became clear to me late this Spring that a significant number of students never learned how to do this: instead, they relied on manual hand-off procedures. The point here is that, with a small team, I could simply observe them in every step and know they had met expectations, but with the larger team, that did not scale. "Trust" and "verify" needed to be coupled, especially since it wasn't even clear that the students who had previously certified their compliance with the protocols understood what it was all about. Release engineering and quality assurance were two other areas, along with asset pipelines, where I know that I will need more formalized approval processes in the future. I don't know yet how much those processes will look like industrial ones, with some kind of threat of firing or demotion, or academic ones, wearing traditional pedagogic trappings.

Managing three simultaneous teams made it difficult to determine the timing of iterations. We ended up keeping all the students on the same schedule, which meant that there was one day when each team was doing review and retrospective meetings. One of the best decisions my co-teacher and I made was to have both of us sit in on the review meetings, and that's something I need to keep for next time. However, in reflecting on the semester, I think I should have also been there for the retrospective meetings, at least at the beginning of production. This would help me model the process for the students. Also, it made me reflect on how in my immersive learning classes, it was the retrospective—not the review—where I made a lot of my pedagogic insertions. Once a team identified a pain point, I could direct them toward an industry standard solution via just-in-time teaching. If I have another large section, I need to consider staggering their iterations so that I can be at both meetings.

We gave our students a short written final exam that was inspired by the questions we used during team retrospectives. I took notes about their responses to the final question, which was to identify something that they would do differently during production, knowing what they know now. I consider this one of the most important outcomes of the entire three-semester process, and so I will summarize their responses here. I have clustered their responses and removed any identifying information, of course.

Work management. We used HacknPlan to manage work, but each team had slightly different stories about how this didn't go as well as they would have liked. Several students commented that using physical, visible task tracking would have been far preferable. It makes me want to bring in the same kind of task boards that I used in RB368 for my immersive learning classes. How to do this equitably in a shared lab is an issue to figure out.

Scope management. Several students commented on how their scope was not well under control. I have been thinking about this as well, especially since I recently read Vasco Duarte's No Estimates book, in which he advocates for having proximal work in higher focus than distal work. This is somewhat contrary to the advice given by Lemarchand in Playful Production Process, which is our reference text for the studio. Lemarchand advocates for having a full production plan at the end of preproduction, but at the same time, he recommends breaking down a production phase into iterations—without really explaining how one might do that. I think Duarte's model is going to be more helpful here, and I'm thinking about abandoning the entire "macro chart" element of preproduction in favor of epic stories.

Code quality. A few students commented on how they needed to have better control over their technical implementations. Some mentioned a bad habit of pulling the first tutorial they could find into their production code, and how this would lead to an unmanageable mess of ideas. Others talked about how they should not have tried to reuse the preproduction code since it was too quick-and-dirty. A few recognized that introducing more automation would have saved them a lot of trouble, but also that their failure to refactor meant that their code was not amenable to testing. 

Iteration. Some students recognized that they needed more iteration on their core design loops and on their assets. Our students had a bad habit of taking an inspiration directly into production without low fidelity prototyping and experimentation. It was good to see them talk about how they should have done more sketching and playtesting. I do wish they had appended "as you instructed us to do" to their answer.

Vision management. One of the strangest things to me is that, despite my repeated suggestion and admonition, I don't think any team had a vision statement: they didn't articulate design goals or experience goals. This either caused or is a symptom of significant problems. Conversations about design alternatives became about force of personality rather than alignment with a vision. It was good to see that some of the students recognized this and highlighted the need to articulate vision as something they would do differently. It is the kind of thing that, in my producer hat, I can demand of future teams, and in my professor hat, stick a grade on so they know I am serious.

Marketing. Some students recognized that decisions made in preproduction lingered far longer than they should have. This is strongly related to the previous point about vision management. From my perspective, it looked like these students were on the road to recognize that they were rationalizing the past rather than being intentional about the future. I see this frequently with my students, that they will tell themselves a story about what their vision must have been because it aligns with what they already did. It's very human. It's also dangerous for collaborative creative work.

Specificity. This final pattern in the students' responses caught me by surprise, and I was delighted with how students articulated such an interesting problem. Many recognized that their discourse throughout the project was often at the level of abstraction, but the design and development work they needed to do was at the level of concretion. Again, I think the lack of clear design and experience goals contributed here, but it's also a case that makes me think about techniques like Stone Librande's one-page design documents. Looking back at the three projects my students made this year, each one of them could have been expressed in a one-page design and hung on the wall.

It was a trip working with these students for the past three semesters. They should all be proud of their accomplishments: making games is hard! Some of them really impressed me by learning new things, working through social and technical challenges, and devoting themselves to solving hard problems. I love teaching this kind of class, but it will also be good for me to take a little break. I look forward to what a sabbatical can do for my perspective on some of these ideas.

Thursday, May 1, 2025

Reflecting on CS222, Spring 2025 Edition

This semester's CS222 class was unlike any other. My department is once again involved in a 1+2+1 program with a Chinese university, meaning that Chinese students complete their first undergraduate year in China, their next two years in the USA, and then their final year back in China, culminating in dual degrees. It was a vibrant program many years ago, and I am glad to see that it has come back. In the Spring semester, I had all of the eligible 1+2+1 students in my section, which meant that half of my enrollment consisted of domestic students and half consisted of Chinese nationals. I used to have some international students in my classes, but I had never before had them make up half the enrollment. It meant that on most days, there were more native Chinese speakers in the classroom than native English speakers.

One of the things I realized this semester is how often I speak in idioms, using phrases whose literal translations don't have an obvious meaning. For example, I told one of the teams that they had gotten "in a pickle." Then I laughed and wrote that expression on the board, gesturing to it and pointing out how ridiculous it was. I explained that it meant that the team had gotten themselves in trouble, possibly of their own creation. 

Unfortunately, moments like this were not as powerful as I would have hoped. Turning to my class, the domestic students were smiling as they contrasted the literal and figurative meanings of the phrase, but almost all of my Chinese students were as they always are: eyes glued to their monitors or smartphones. A guest speaker's presentation allowed me to sit in the back of the room and verify my suspicion about what was going on: most of the students had their phones propped up on their laptops and were running voice translation software. They were not listening to the speaker in any significant sense but reading in Chinese what it thought the speaker was saying. Occasionally, a student would switch to the laptop to visit a site or verify a term, but mostly they were reading real-time translated transcriptions of what was spoken. Not all of them did this, but most did. Some listened and took notes. One watched Chinese television. In this way, they are not unlike domestic undergraduates.

As usual, we completed a major project lasting about 8 weeks, and it was split into three iterations. I gave lengthy feedback on each team's submission, and sometimes teams even read and responded to it. An unusual frustration from this semester was that none of the teams really nailed the final iteration. I expect 20% or fewer to get the process right the first time, then maybe 40% on the second iteration, and I usually get 60% or better addressing the fundamentals by the third iteration. I didn't have that this time: each project had something fundamentally wrong that I had already pointed out to them in a previous iteration. One conclusion from this is that I may need to rearrange or remove some of the elective content from the class in favor of more supervised in-class practice. Another consideration, though, is that I hadn't accounted for the extra labor done by international teams. I strongly encouraged—but did not require—international project teams, and most of the teams were international. This means that in addition to tackling the significant challenges of the course, which include trying to change habits and conceptual models around programming, these teams were also dealing with language and cultural differences. That work is real work, but it was outside the grading scheme. When I laid out their final grades yesterday, I decided to add a flat boost to the international teams to compensate them for undertaking the challenge. The resulting grades looked more appropriate to me for how well I knew the relatively small class. If I were to teach another class with such a high proportion of international students, I would need to revisit the question of whether an up-front incentive might be fruitful.

Course achievements were once again counted as regular assignments rather than their own category, as I did last semester. I do think this works better since it lowers the cognitive load of the course. My department currently has unprecedented growth in student organizations, and I need to bring back some of the achievements related to those. 

Counting achievements as a regular assignment has helped students navigate that part of the class, but I have yet to crack the problem that many students who ought to resubmit work don't do it. The idea is that students resubmit work until they have shown an understanding of it, and I am not sure what impedes students from doing this. It's possible that the policy is too liberal for Canvas to handle: Canvas won't tell them that something is due, because they can resubmit any past assignment once per week. I suppose I could make a separate weekly resubmission assignment, which may solve the problem that some students fail to understand what "once per week" means. It feels like catering to their LMS addiction to me, but maybe it's what they need to help them through the challenging content.

I'm still stuck on the question of whether the content of the course would work well as a portfolio. Take a principle like naming: I could require students to submit in a portfolio an evaluation of something they wrote before 222 (which I do as an assignment) along with a sample of work that demonstrates their following the principles. Such a portfolio would be a powerful testament to what they have learned, but I've struggled to figure out how to pace such a thing. When I first started using achievements in CS222, all the assignments were from a grab bag of achievements, and in many ways, it was a portfolio-style assessment. As I put more emphasis on the project, I had to put that aside. I mention it here in part because, as I understand it, the 2026-2027 academic year will see a transition from a 15-week to a 14-week semester here. That will require me to blow up the course and rebuild it since there's no way I could just trim a few things out and have the rhythm still match.

The last note I'll mention here is something from my planning document, where I keep short notes about how things went and what I want to change the following year. This year, I had several teams undertake projects whose nature did not fit the requirements of the final project. The reasons undoubtedly come from a combination of background, culture, language, and generational differences. To cut to the chase, I realized I need to make it even more explicit that the final project needs to do something computationally interesting: being a clever interactive app is not enough. Some of the projects students wanted to undertake would have been great in a course on web and mobile apps, but they were not good contexts for exploring TDD for the first time.

I will be on sabbatical in the Fall, which means I'll be stepping away from this and all my other regular courses for a semester. It will be a good chance to catch my breath, and I won't have to carve out a week over the summer to rebuild the course.

Tuesday, April 22, 2025

What we learned in CS222, Spring 2025 edition

I usually like to make the What we learned exercise the last thing in the semester before the final exam, but this year, I had to move it up a little. I had two seniors come to class today to talk about their experience, and that meant that the remainder of our meeting was just enough time to do this exercise.

In 30 minutes, the students came up with 109 items. I gave each student six stickers, and they voted on the items that were the most important to them. These five rose to the top:

  • Test-Driven Development (13 votes)
  • Clean Code (11 votes)
  • Programming intentionally (6 votes)
  • Model-View Separation (5 votes)
  • Canvas stupid (5 votes)
The students recognized that many of these top items are categories rather than particulars and so tended to attract more votes, but that's fine with me. The list is still remarkable in two ways. First, the third most important item to this population was programming intentionally. I don't remember this coming up as an outcome of the course before, but it's a fascinating sentiment. It is different from saying "We are using Clean Code" or "We are using Mob Programming." It is a statement of how we even go about making those kinds of choices, which is great. Maybe if I ever pull all my ideas together into a book of programming advice, I'll call it Programming Intentionally.

The other noteworthy thing on the list is the last one. It's the first time I remember a "joke" entry showing up as a top item. Any good class is going to have some funny items on the list, especially once they relax into the exercise of reflecting on the semester. In this case, "Canvas stupid" was my shorthand for a student's much longer comment, which reflected on my telling them that the way Canvas deals with points is stupid: it cannot deal adequately with either small or large numbers. In my particular case, I believe I was ranting to the class about how I want to normalize scores into the [0,1] range, but Canvas has hard-coded two decimal places. I even reached out to Canvas support earlier this semester to see if we could enable more somehow, and I was told it was hardcoded into their implementation.
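To make the rounding complaint concrete, here is a small sketch of the problem. This is my own illustration, not anything from Canvas itself, and the raw scores are invented:

```python
# Two students with different raw scores out of 300 points:
raw_a, raw_b = 287, 288

# Normalize into [0, 1] and keep only two decimal places, mimicking
# the hard-coded precision described above.
stored_a = round(raw_a / 300, 2)  # 0.9566... becomes 0.96
stored_b = round(raw_b / 300, 2)  # 0.96 stays 0.96

# The distinct raw scores are now indistinguishable.
print(stored_a == stored_b)  # True
```

With only two decimal places, there are just 101 representable values in [0,1], so any assessment with more than 100 points of resolution necessarily collapses distinct scores.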

Thursday, April 17, 2025

Article in Well Played special issue, "For the love of games"

I have an article in the latest issue of Well Played. It is a special issue with the theme, "For the love of games," and the editors invited articles from games professionals about a particular game that impacted our career paths. My article certainly has the best title I have written: The Thief of Fate and the Devil Whiskey. It was a delight to write, and I hope readers enjoy it.

Monday, April 14, 2025

Making Dart test helper methods show up in the structure view of Android Studio

Why such an awkward title for this post? It is the kind of search I was doing the other day when I found no hits. I'm writing this in hopes that I can save someone else the trouble I faced when the solution is, in fact, quite simple.

My Dart project contains many unit tests, but many of them use helper methods rather than calling test directly. Unfortunately, I was deep into this approach when I realized that these tests were not showing up in Android Studio's Structure view, which is otherwise a good way to navigate files.

Here is an example to illustrate the problem. Notice that the call to test is nested within the helper function, testExample.

import 'package:test/test.dart';

class Example {
  bool isEven(int value) => value % 2 == 0;
}

void main() {
  final example = Example();

  void testExample(
    String description, {
    required int value,
    required void Function(bool) verify,
  }) {
    final result = example.isEven(value);
    // Wrap the verification in a closure so it runs inside the test body.
    test(description, () => verify(result));
  }

  group('Example.isEven', () {
    testExample(
      '2 is even',
      value: 2,
      verify: (result) => expect(result, isTrue),
    );
  });
}

When this is opened in Android Studio, the Structure view looks like this:

Notice how the group is empty. 

The other day, I tried different searches to find an answer. It seemed to me that there had to be some way that unit testing libraries communicated their structure to JetBrains IDEs; it could not be the case that the JetBrains engineers were doing simple string matching in source files. Yet, I had no luck. In my confusion, I even turned to ChatGPT, which confidently told me that the only way to do it would be to refactor all of my tests into (yet more) higher-order functions so that I was calling the standard test function at the top level. I asked it for a reference, hoping to find the documentation I had unsuccessfully been searching for, and it unhelpfully pointed me toward two websites that don't address this issue. Still, I put a potential refactoring into my project plan, although with over a hundred tests and counting, and with a rather elegant current solution, this would have been both tedious and disheartening.

One of the reasons I started writing helper methods in this particular style—that is, by sending named functions as parameters—was because this is how the bloc_test package handles testing. A day or two after having tried to search for a solution to my problem, I was in a test file and noticed that all my blocTest calls did show up in the Structure view. How was that possible? Thanks to the MIT license of bloc_test, I checked the implementation and found the @isTest annotation, which comes from the meta library. A quick review of its documentation made it clear that this was exactly what I needed, so I added this little annotation to my project.

import 'package:meta/meta.dart';
import 'package:test/test.dart';

class Example {
  bool isEven(int value) => value % 2 == 0;
}

void main() {
  final example = Example();

  @isTest
  void testExample(
    String description, {
    required int value,
    required void Function(bool) verify,
  }) {
    final result = example.isEven(value);
    // Wrap the verification in a closure so it runs inside the test body.
    test(description, () => verify(result));
  }

  group('Example.isEven', () {
    testExample(
      '2 is even',
      value: 2,
      verify: (result) => expect(result, isTrue),
    );
  });
}

The result was that my Structure view looked exactly how I wanted it.

There you go: the solution in Dart is to annotate the helper method with @isTest. The documentation of that annotation makes it clear that it solves my problem, but I must not have hit the right keywords with my original search. I hope that this post helps anyone else who is caught by this issue.

Monday, April 7, 2025

Ludum Dare 57: A Weekend with Flame

[UPDATE 1 on April 7: Added paragraph about effects and tweens.]

I am still working on the Flutter-powered game that I mentioned before, although I had to put it down for about three weeks while I worked on a departmental report. This past week, I was able to get back into working on other things, and I was somewhat surprised to see Ludum Dare 57 coming up. I am glad the organizer was able to get the support he needed to run the event. On Friday, before the jam started, I spent several hours tinkering with Flame. It is a Flutter-powered game engine that I have known about for some time but, until Friday, had never tinkered with. I pieced together a minimal arcade-style interaction following my usual inspiration: Every Extend. It was enough of an exploration that I figured I could try using Flame for Ludum Dare, even though I knew it would be slow going.

At 9PM Friday, the theme was announced: depths. I sat with a paper and pencil and started doing some doodling. As I drew out some screens for a rather silly concept for a fishing game, into my mind came the Bonzo Dog Band's "Straight From My Heart." The juxtaposition of these two tickled my fancy, and I decided to go with it.

My hope was to spend most of Saturday on the game and then be done with it. Alas, it was a lot slower going than I had anticipated: I ended up spending all Saturday and much of Sunday on the project, and I still didn't get to some of the juice that I had thought of as essential for the experience. Of course, if the goal was to make the game as well as I could, I would have used Godot Engine. My goal, instead, was to learn as much Flame as I could to finish a project within a 48-hour window, and that, I did.

You can check out Fish Face on its Ludum Dare page or go directly to the web-playable version. The source code is on GitHub. The title of the game is a bit on the nose, and I had originally wanted the entire experience to be more surreal. I would have then given it a name like It's a baseball cap on top of an umbrella, but maybe I can save that for my next surrealist project.

Before I get into the technical implementation, let me say a little about the music, which may be my favorite part. I used to do more songwriting, but these days, I only ever eke something out during a game jam. A student pointed me toward some different SoundFont files, and I had downloaded a few last week to tinker with. That tinkering came during Ludum Dare 57, when I adapted the chord progression, transposed, from "Straight From My Heart" into a doo-wop ballad. The arrangement used the Arachno SoundFont throughout. Curiously, in my head, this was the actual instrumentation of the Bonzos as well, and it was only after going back to the recording that I realized it's primarily guitar, bass, and drums, with just a little sax and then, later, Hammond organ. It was intentional, though, that my arrangement could have been played by the Bonzos. I hope Neil Innes doesn't mind the rhythmic eighth notes the piano got stuck with. In any case, I was really happy to arrange something with a diminished chord, an augmented chord, and a modulation.

As for Flame, I will share my experience as some observations about "pros and cons." Keep in mind that this is my first experience with it. Much of the friction undeniably comes from my not yet thinking about problems the way the Flame architects intend. My intuition at this point is to approach everything the way Godot Engine would handle it, and while there are similarities, Flame is not the same.

Benefits of Flame

Flame uses the composition of components to build a scene tree in a manner similar to Godot Engine's nodes. This made it easy for me to design a state machine for the main gameplay following the classic State design pattern. I made an abstract State class that was itself a component, and I made my GameWorld have only one of these instances at a time. By mounting the current state as a child component of the GameWorld, its methods were automatically called by the system without my having to delegate from GameWorld. For example, the individual state's onLoad and update methods were called by virtue of its being in the component hierarchy.
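To make the idea concrete, here is a minimal sketch of that arrangement. It assumes Flame's standard Component API (add, removeFromParent, update); GameWorld is from my project, but PlayState and the method names are illustrative:

```dart
import 'package:flame/components.dart';

// Each gameplay state is itself a component, per the State pattern.
abstract class State extends Component {}

class PlayState extends State {
  @override
  void update(double dt) {
    // Per-frame logic for this state only; Flame calls this for us
    // because the state is mounted in the component tree.
  }
}

class GameWorld extends Component {
  State? _current;

  void transitionTo(State next) {
    _current?.removeFromParent(); // unmount the old state
    _current = next;
    add(next); // mounting triggers the state's onLoad and update
  }
}
```

The payoff is that GameWorld never delegates lifecycle calls by hand; the engine's traversal of the component tree does it.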

One particularly useful type of component is Effect, which can be used to add all sorts of useful transformations. Progression through the transformation is handled by an EffectController, so doing something common like an easing function is done with a controller that applies an ease-in curve. Effects provided an easy way to get simple animations in place. Running effects in parallel was a simple matter of adding multiple effects, and running them in series was made easy with SequenceEffect. I especially liked RemoveEffect as well, which comes at the end of my animations that toss face parts off of the screen. It wasn't until after the jam that I realized that Flame's effects fill the same niche as Godot's Tweens: you can stick an arbitrary effect on an object and then forget about it or, if needed, get a notification when complete. Comparing the two shows an interesting difference: Tweens can operate on any property because GDScript is interpreted, while Effects have the benefit of static typing and clear factory constructor names.
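As a hedged sketch of what this looks like in code—assuming Flame's MoveEffect, RemoveEffect, SequenceEffect, and EffectController APIs; the tossOffScreen helper and the specific numbers are hypothetical—the toss-a-face-part animation is roughly:

```dart
import 'package:flame/components.dart';
import 'package:flame/effects.dart';
import 'package:flutter/animation.dart' show Curves;

void tossOffScreen(PositionComponent part) {
  // Effects are themselves components: add them to the target and forget.
  part.add(SequenceEffect([
    MoveEffect.by(
      Vector2(0, 600), // fling the face part down and off the screen
      EffectController(duration: 0.4, curve: Curves.easeIn),
    ),
    RemoveEffect(), // then remove the component from the tree
  ]));
}
```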

Individual components can be augmented through mixins. Dart is the first language I have used significantly that incorporates mixins, and I feel like they are not yet in my arsenal as a software designer. Flame gave me a good excuse to see them in action. For example, my CatchState is where the player must either keep or toss a face part. By giving it the KeyboardEvents mixin, this state now can implement onKeyEvent. I like how explicit this is and how it prevents the superclass from having more methods than necessary. I need to think more about whether this is better or worse than an approach like Godot's, where any node can process input.

Mixins are also used for dependency injection via HasGameRef. This mixin makes the corresponding class have access to a reference to the game, which prevents one from either having a global variable (with the corresponding spaghetti) or passing an explicit reference around all the time (with the corresponding noise). It makes me wonder if there are places in my Flutter game project where my code would be made simpler by using mixins of my own devising, and this inspires me to dig more into the literature on best uses of mixins.
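A sketch of the pattern, under the assumption of Flame's HasGameRef mixin (newer Flame versions rename it HasGameReference); FishFaceGame and ScoreLabel are illustrative names, not code from my jam entry:

```dart
import 'package:flame/components.dart';
import 'package:flame/game.dart';

class FishFaceGame extends FlameGame {
  int score = 0;
}

class ScoreLabel extends TextComponent with HasGameRef<FishFaceGame> {
  @override
  void update(double dt) {
    // gameRef gives typed access to the game instance without a global
    // variable or threading a reference through every constructor.
    text = 'Score: ${gameRef.score}';
  }
}
```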

As I have mentioned before, refactoring is comfortable and seamless in a statically-typed language with a powerful IDE. Easy renaming is so important that it should be a baseline expectation of any environment. Anything else introduces friction into the already difficult problem of evolving good designs. Similarly, Dart's library-based privacy is convenient for rapid development and code evolution. I can quickly add new classes to one file, and I can keep these private to that file. Classes can then be pulled out into other files as needed.

Flame allows one to use Flutter for UI elements, which should be a greater strength given the power of Flutter and the frustrations of UI work, but this strength was not demonstrated well in Fish Face. I came into the project with a backwards understanding, thinking of Flutter UIs as being added to a Flame game, but I think now it's the other way around: you add a Flame game to a Flutter app. Honestly, I am still having trouble figuring out what this means in terms of design trade-offs and best practices. There was a very helpful chap on the Flame Discord who pointed me in the right direction, but I still feel like I don't have a good understanding of how, where, and when to bring these two worlds together. I revisit this in my commentary about Flutter below.

One of the advantages of being in Dart is that you have all of pub.dev at your disposal. Adding or removing dependencies is a breeze. 

Building for the Web was mostly seamless. I had one case where my code worked locally and failed in production, and I am not sure what the root cause was. I got around it by changing my implementation of randomized selection from a list. I should probably see if I can reproduce the situation in a smaller example and report the bug. Otherwise, though, it was very easy to keep the game running locally as I worked on it, in traditional Flutter Web fashion, as long as I wasn't changing assets or dependencies.

Challenges of Flame

Parts of the reference documentation are helpful, such as the list of effects, but there are other parts that seem to assume that the reader is already approaching the problem the right way. This is a bit hard to quantify, but I think what I was looking for is something more like the Flutter Cookbook: not a tutorial for beginners and not reference material for regulars, but something in between, that deals with common problems and their idiomatic solutions. In part, I fear that the Flame developers are hamstrung by the excellent documentation and examples provided by the Dart and Flutter teams. There is no way for a small project to keep up.

My vision for Fish Face was that there would be a magenta background with random shapes falling down the screen. This is the job of a particle system, but I could not make any headway with Flame's particle system; the best I got was a single particle. The documentation declares how robust their system of particle composition is, but I found the examples lacking for simple use cases. I spent a little time online looking for other tutorials, but I was running out of steam and this was low-priority juice. It was particularly frustrating because I knew how very easy it would have been to use Godot's particle system, to add it and interactively tweak it until it was good enough. This was a case where Flutter's code-based approach was much slower than turning a few knobs in a UI. Fortunately, Flutter is getting a widget preview system akin to Godot's "Run Current Scene" feature. I am heartily looking forward to how this will benefit my Flutter workflow. Thanks to the Discord denizen who pointed me toward this upcoming feature. (Incidentally, reading about Flame's particle system reminds me of the time I tried to learn the very basics of Niagara. I think there's a similar kind of shift in perspective that's required here. It's where I become tempted to do a deep dive and create a tutorial video to help others caught between worlds.)

When I sketched in the rough arrows of Fish Face's interface, my intention was that I could easily recolor them in-engine for animated effects. In Godot Engine, I would use a shader to do this, and I assumed Flame would make this as easy. Alas, the only mention of shaders in the Flame documentation says that they are coming later, once Flutter supports shaders for the Web. I don't understand why Godot Engine has shaders that run fine on the Web but Flutter and Flame don't, but it's probably because the Web is kind of an awful platform. I played with Flame's tint feature as a way to simulate what I wanted, but it was tinting not just my background, but the black outlines as well; that is, it tinted the whole image, not selectively the way I intended. I ended up just copying and pasting the images and manually coloring them in Gimp, then importing them as new assets, like a barbarian.

I enjoy programming in Dart and Flutter, but I still don't have a good handle on Flutter's animation systems. I have built a few demos but never really developed something that helps me internalize how to use implicit animations for juice. I have a sense that there's untapped potential there and that my ignorance is the impediment. Indeed, one of the things I'm hoping to tackle this summer is to make my Flutter-powered, TTRPG-inspired game more delightful by incorporating some UI animations. I realized too late into Fish Face that this exploration would be taking me in the wrong direction. In retrospect, I could have built the whole game just in Flutter, without Flame, and learned more of what I wanted to learn. There's a mismatch between the game design and what I was hoping to get out of the experience, but it's possible that I could only know this after having made this mistake.

Wrapping Up

It was a fun weekend with Flame, and I feel like I have a much better understanding now of when I would pick it up again—and it would not be for Fish Face 2: Umbrella Hat. Part of me wants to return to my Every Extend tech demo since that would be a much better use case for it. The Flame Jam is coming up later this month, but it's not a good time for me to take on another side project due to the end of semester and family obligations. Spending the weekend with Dart makes me eager to make some more progress with my Flutter-powered side project even though I still haven't played with the latest Godot Engine release. 

Wednesday, February 26, 2025

An Android Studio Live Template to simplify defining loggers in Flutter

This morning, I came across a discussion on the Flutter Forum about logging. I recently added rudimentary logging support to a side project, and I was interested to hear the perspectives of people with a lot more Flutter experience. Using Android Studio's Live Templates to reduce boilerplate sounded like a good idea. In my case, for every class where I want to log something, I want something like this:

final _logger = Logger('$MyWidget');

The dollar sign there is important because it uses string interpolation to insert the name of the class. If I rename the class, the refactoring tools in Android Studio will also replace this reference.

I don't remember the last time I made a Live Template for a JetBrains IDE, or if I ever have, so I decided I should take a note here.

I created my Live Template in the "Dart" category. This is not necessary, but it is sensible. I called it "logger" and it looks like this:

final _logger = Logger('$$$CLASSNAME$');

Under "Edit Variables," I configured CLASSNAME to use the expression dartClassName(). Checking "Skip if defined" means that keyboard focus skips over the variable after expanding the template, which is a nice feature. I set the context to Dart/Other.

Using double-dollar-signs to escape a dollar sign is a little strange, but once I got past that, it was easy to set up. One unexpected pitfall is that if you erase a variable from the template and then retype it, you have to remap it to an expression: the IDE doesn't "remember" that the variable had a previous definition while working on the template body.

Tuesday, February 11, 2025

Notes from "No Estimates"

I recently finished reading Vasco Duarte's No Estimates after hearing about it on Dave Farley's Continuous Delivery YouTube channel and Allen Holub's #NoEstimates talk. I had been curious about the #NoEstimates movement for some time, reading an article here and there, but this was my first real attempt to understand it. The book itself is clear and direct, interleaving traditional content with an ongoing fictional narrative that motivates and reinforces the ideas. I found many connections to my research and teaching interests. In this blog post, I will share a few findings from my notes and reflections.

Estimates

One of the foundational principles of the book is fairly simple but not something I had considered before: an estimate communicates the peak of a probability distribution. For example, if I estimate a task to take two hours, I am saying that the most likely case is two hours; it could take as little as zero or negligible time, but there is also a small probability that it drags on much longer, with a tail stretching toward forever. The cumulative probability is the area under the curve, and because the distribution is skewed to the right, the probability of being late is much higher than the probability of being early.
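A quick simulation illustrates the point. This is only my own sketch, not from the book, and it assumes a right-skewed (lognormal) duration distribution whose mode sits exactly at the two-hour estimate:

```dart
import 'dart:math';

void main() {
  const sigma = 0.5; // spread of the assumed lognormal
  final mu = log(2) + sigma * sigma; // mode of a lognormal is exp(mu - sigma^2)
  final rng = Random(42);
  const n = 100000;
  var late = 0;
  for (var i = 0; i < n; i++) {
    // Box-Muller transform: a standard normal sample, then exponentiate.
    final u1 = 1 - rng.nextDouble(); // stay in (0, 1] to avoid log(0)
    final z = sqrt(-2 * log(u1)) * cos(2 * pi * rng.nextDouble());
    if (exp(mu + sigma * z) > 2) late++;
  }
  // For these parameters, roughly 69% of simulated tasks exceed the
  // two-hour estimate: being late is far likelier than being early.
  print('P(late) ≈ ${late / n}');
}
```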

Duarte classifies estimation as waste in the Lean sense: doing more of it will not make the product better from a user's point of view. He clarifies that "no estimates" isn't a goal but a vision: estimates won't be axiomatically removed but minimized. The discussion of waste got me thinking about QA testing in games, a point I will return to below.

Managing time, scope, cost, and quality

I regularly talk to my students about how time, quality, and scope are the three levers we can control in project management. Since our work is constrained by the semester's schedule, I point out that we cannot shift time; that is, we cannot simply add more time to the end of the semester to get our projects done. I also argue that quality is non-negotiable: the point of undergraduate education is to learn how to work well, so sacrificing that is against the telos of the endeavor. Therefore, the only manipulable lever is scope. This perspective seems to help students understand why we focus on user story analysis, prioritizing the features that add value to the users. 

I wish I could remember where I encountered that heuristic since it is distinct from a similar concept that dominates Web searches: the project management triangle, which is also known as the triple-constraint model or the iron triangle. This model explains how the constraints of scope, cost, and time are connected such that cutting one without changing the others will result in a loss of quality. Duarte uses this model in his book to draw a distinction between value-driven and scope-driven projects. Traditional management approaches are scope-driven, where the scope is fixed and so cost and time are unbounded. Value-driven projects instead fix time and cost, leaving scope flexible, which leads to the approach of delivering the most value first. This is a standard agile perspective, but I previously didn't have the nomenclature of "value-driven" and "scope-driven," perhaps in part because in my academic environment I rely on that alternative model described above.

Reducing variability in throughput

In Chapter 3, Duarte provides suggestions for techniques to reduce the variability in throughput for a development team. I have used many of them before in mentoring student teams. These include using stable, cross-functional teams; having clearly defined priorities; not passing defects down the line; standardizing and automating when possible; freezing scope within iterations; and protecting the team from outside interruptions. He also suggests reducing dependencies so that people can work on one thing at a time. This got me thinking about how often my teams end up with coupled user stories, such that completing one requires work on another. Creating independent user stories comes up more than once in the book, and it's something I can watch for opportunities to practice and teach.

Duarte points out that good requirements must allow measuring progress early and often. They must also be flexible enough to determine which aspects of a system need to be implemented now and which can be built later, after the system is better understood. 

This leads to Duarte's conclusion that the only real measurement of progress in software development is Running Tested Stories. Anything else is ambiguous or unreliable. Teams can be managed toward consistent throughput by ensuring that there are no large stories (no larger than half an iteration), several independent stories can be completed in an iteration, and the distribution of story size stays about the same throughout.

The book references a 2003 article by Bill Wake about the "INVEST" acronym, which I had not seen before. Wake describes how user stories need to be Independent, Negotiable, Valuable, Estimable, Small, and Testable. "Negotiable" here means that they deal with the essence and not the details: they are not contracts about technical details. Wake's definition of "Small" is between half a day's effort and a day's effort. Duarte adapts "Estimable" to be Essential, which is sensible given his specialization. He includes the term blink estimation, which he attributes to Ángel Medinilla and was new to me. The idea is that one makes a snap judgement about whether a story fits within two weeks or not, and that this blink estimation is usually all that is needed. Regardless of which expansion I use, INVEST may be a helpful heuristic to give to teams who are breaking down a big problem such as a game design into smaller, valuable pieces.

Planning the details just in time

I started using Scrum with multidisciplinary undergraduate game development teams many years ago, and it has been a valuable practice. I was usually the Product Owner, responsible for articulating the work as user stories and prioritizing the backlog. Teams pulled stories from the Product Backlog to the Sprint Backlog during our planning meetings, as per traditional Scrum. When my teams found that a one-dimensional Product Backlog makes it hard to see the big picture, we adopted Story Maps, which ameliorated the problems. Although we tracked each Sprint's progress using burndown charts, I never bothered to compute velocity. Teams tended to get a good sense of how much they could do in two weeks by around week ten, and since I was in charge of the backlog, I could cut scope to fit into the time remaining.

My preference for agility caused some friction then when trying to apply Richard Lemarchand's Playful Production Process with my last two cohorts of game development students. Although Lemarchand calls for concentric development, he has relatively little to say about how to implement that. More importantly, his approach for each phase of production is to start by enumerating all the work that is to be done, estimating how long it will take, and then moving toward that goal. A careful reader will recognize this as scope-driven management, and a cultural observer will note that the games industry is beset by death marches and crunch. 

Duarte's alternative is rooted in agile principles: plan the details of the imminent iteration, getting them into user stories that can be completed in a day or two, and let the future work remain coarse-grained epic stories. He suggests not planning more than about two months' worth of work due to how much will be learned about the system in that time. 

This caused some stress for me since it was quite counter to one of my ongoing research projects. I have been thinking about how to combine some of Lemarchand's ideas with some ideas I took from Allen Holub's #NoEstimates talk. One of Holub's primary arguments is that we can simplify our planning, and get equivalent results, by counting each user story as a single unit of work. I have been investigating the differences between tracking work items as single units versus tracking estimated hours remaining. For example, consider these two perspectives from the end of a team's alpha phase of production.

These are two perspectives of the same period of time, an Alpha phase that lasted about three months. The first shows the number of stories in the backlog and the second shows the total estimated number of hours. The top chart shows how the team cut a significant number of features around a third of the way through Alpha. For the next third, they added stories at about the same rate as they completed them, demonstrating how they were working to reshape the project based on the initial overestimate. We set up a nifty toolchain for tracking these data in realtime using a combination of Hacknplan and Google Sheets. I even gave a workshop about this at GDEX a few months ago. But the whole thing hinges on having those planning data at the end of August for a milestone that's coming up in early December.

Duarte suggests a radically different model. Break down the problem so that the stories for the current sprint are independent and small (taking no more than half a sprint to complete). Track how many of those the team can get done in an iteration. Do that for a few iterations, and you have a good sense of how much work the team can accomplish in future iterations, which lets you control scope. More specifically, you can measure a team's User Story Velocity and its Feature Velocity, where "Feature" here is elsewhere called an epic story or an activity. 
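The bookkeeping behind that forecast is almost trivial: count the stories completed in each past iteration, average them, and divide the remaining count by that velocity. This is my own illustration of the idea, not code from the book:

```dart
// Forecast how many iterations remain, counting each story as one unit.
int iterationsRemaining(List<int> doneLastIterations, int storiesLeft) {
  final total = doneLastIterations.reduce((a, b) => a + b);
  final velocity = total / doneLastIterations.length; // stories per iteration
  return (storiesLeft / velocity).ceil();
}

void main() {
  // Three iterations of history (4, 6, and 5 stories) and 23 stories left:
  // velocity is 5.0, so about 5 iterations remain.
  print(iterationsRemaining([4, 6, 5], 23));
}
```

The same function works at the coarser grain of Feature Velocity by feeding it counts of completed epics instead of stories.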

I like the sound of that. It was clear from watching student teams try to estimate an entire Alpha phase that they knew it was shoddy. Worse, for some of them, it planted a seed in their mind that they already knew enough about their projects to plan the whole phase when, in fact, they had yet to find the fun. Switching to the #NoEstimates approach would require me to supplement or replace Lemarchand's recommendations, including new ways of using project management tools.

The medium is the message

Incorporating #NoEstimates would also mean rethinking the relationship between the developers and the artists. When I was mentoring single-semester projects with a small art team, there was seldom any trouble: artists moved fluidly from concept art, sketches, and low-fidelity assets toward production-quality assets as the semester moved on. When artists struggled to match the iterative flow of the programmers, we adopted swimlanes as recommended by Clinton Keith.

It wasn't clear to me how to scale this up to teams where half may be artists. I reached out to Duarte himself, and he was kind enough to talk with me about my questions. We had a fruitful discussion, and he helped me see something that I didn't understand before: there is a whole category of practices that, fundamentally, are symptoms of a failure to regularly integrate. Swimlanes are one example, but there are countless more, as overt as separate physical locations and as mundane as job titles and org charts. If we consider that the running tested story is the only way to measure progress, then anything that does not support that is potentially distracting from it. It is the kind of observation that would make Marshall McLuhan smile: the presence of a swimlane says more about the team than anything in the swimlane itself.

Using social complexity to determine tactics

Buying the book also grants access to a keynote presentation that Duarte gave some years ago. It's an excellent talk and a good complement to the book. One particular element jumped off the screen and into my notebook, and that is Duarte's matrix for dealing with user stories. It deals with the problem of Social Complexity, which can be summarized as "the number of people in the organization you have to talk to about it." Here is a quick reproduction from his talk:


This captures something I have tried to express to many teams but failed to capture so clearly. It relates to the four conclusions of his presentation:
  1. Predict progress with #NoEstimates
  2. Break things down by value, not effort
  3. Agree on meaning with social and technical complexity, reducing risk
  4. Use RIDICULOUSLY SHORT timeboxes
He points out that if you can only do one of these, do the fourth one, since it is the essential practice from which the others derive.

Closing thoughts

I taught the first two cohorts of the game production sequence, but I am stepping away for the third one. There are a few reasons behind this, but primarily it's so that my new colleague has an opportunity to try his hand at it. I expect to be back in the saddle with the fourth cohort, who will start in Spring 2026. Writing up these notes took much longer than I expected, especially as I began to reflect on the substantial differences between what I have done in the past, what I did following Lemarchand, and what I might like to do in the future. For now, I need to put down this line of inquiry, but formalizing these notes gives me a point of entry when I need to refresh myself on these topics.

In the meantime, if you have thoughts, feedback, stories, or reflections, please feel free to share them in the comments.