
Tuesday, August 26, 2025

Notes from Tynan Sylvester's "Designing Games"

I just finished reading Tynan Sylvester's Designing Games. The book was published in 2013, but Sylvester is probably best known as the creator of Rimworld, which came out a few years later and is not mentioned in the book. I have read many game design books, but I still learned a lot from this one. This inspired me to share some of my notes here.

Sylvester's core conceit is that games are artificial systems for generating experiences. Early in the book, he presents a model that undergirds the rest of the text: a game provides mechanics wrapped in fiction; play produces events with fictional meaning; those events trigger emotions; and the emotions accumulate into an experience. I am a collector of definitions for "game," and I think this may be the strongest formalized definition I have encountered. I appreciate that it highlights what makes games different from other media, which is a theme that also comes up frequently in the text.

An early chapter discusses emotions and their role in game design. He uses the term "human values" to refer to anything that is important to humans and that can move through multiple states, such as life/death, victory/defeat, and friend/stranger/enemy. I am not keen on his terminology here, since it sounds like it concerns morals or ethics, but I like the hook for thinking about design. It reminds me of the binary opposites that are essential to imaginative education. Sylvester provides a catalog of emotional triggers that can be used here, each of which is given appropriate exposition in the text: learning, character arcs, challenge, social interaction, acquisition, music, spectacle, beauty, environment, newfangled technology, primal threats, and sexual signals.

I have long been familiar with the idea that players cannot truly answer what caused different emotions, but that they will construct a narrative to explain the phenomenon to themselves. I was not previously aware of the technical explanation known as misattribution of arousal and the psychological studies around it. Sylvester focuses on the "bridge" studies, and a little searching online finds that this is a robust finding. A related psychological concept he draws upon is the two-factor theory of emotion, which explains that what we call "emotion" is a combination of physiological arousal and a cognitive label. This explains how the same physiological stress response could show up in Tetris and in Doom, but the latter's fiction is what leads us to call it "fear."

I would have preferred that Sylvester give more precise definitions around his use of "elegance" and "emergence" since he does not adequately address Kate Compton's 10,000 bowls of oatmeal problem. It's one of a few chapters where I wondered how Sylvester might approach a second edition. No Man's Sky came out a few years after the book was published, and Sylvester's own work on procedural storytelling undoubtedly gave him new insights into these ideas.

Sylvester uses apophenia to describe how humans find patterns in noise, but I wonder if he uses the term a bit too broadly. Apophenia describes phenomena like "lucky streaks" in gambling. Seeing a face in the clouds strikes me as different because our visual cortex is wired specifically to find faces. Similarly, anthropomorphizing a non-human character seems to appeal to our narrative sense rather than to pattern-matching. I wanted to track this observation from my notebook because it's an area where I think precision matters, but I realize I am not completely confident in my own ability to describe the edges of these concepts.

The need for frequent playtesting also recurs throughout the book. I am glad that it does since it is such fundamental advice. In his discussion of balance, he points out that what we should be gathering from playtesting is experiences and not suggestions. Again, the essence of this advice is the same in any game design text, but Sylvester presents it with admirable clarity and succinctness, especially given that he is working within a specific and explicit definition for "experiences." He points out that each playtest is a story, but you need to playtest until you can see past the stories into the systems. This gives me an interesting heuristic that I plan to share with my game production students.

His chapter on multiplayer games includes a section on game theory. I am no expert in this area, and I appreciated his layman's coverage of Nash equilibria. He points out that rock-paper-scissors is the only elegant symmetric game without pure Nash equilibria, and that matching pennies is the only elegant asymmetric game without them. All other (game-theoretic) games add more entries to the decision matrix without fundamentally changing the structure and are therefore inelegant. I don't usually turn to game theory in my designs, but I enjoyed this section for its clarity and perspective.
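The claim that rock-paper-scissors has no pure Nash equilibrium is easy to check mechanically. Here is a minimal Python sketch of my own (not from the book) that scans the zero-sum payoff matrix for a cell where neither player gains by deviating unilaterally:

```python
# Zero-sum payoff matrix for the row player in rock-paper-scissors.
# Entry payoff[i][j] is the row player's result: 1 = win, 0 = tie, -1 = loss.
payoff = [
    [0, -1, 1],   # rock    vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

def pure_nash_equilibria(payoff):
    """Return all (row, col) cells where neither player can improve
    by a unilateral deviation (the column player's payoff is the negation)."""
    n = len(payoff)
    equilibria = []
    for i in range(n):
        for j in range(n):
            row_best = all(payoff[i][j] >= payoff[k][j] for k in range(n))
            col_best = all(-payoff[i][j] >= -payoff[i][k] for k in range(n))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

print(pure_nash_equilibria(payoff))  # → [] : no pure Nash equilibrium
```

Every cell in the matrix has a profitable deviation for one player or the other, so the search comes up empty; the only equilibrium is the mixed strategy of playing each throw one third of the time.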

I was a little surprised to see him adopt the term Yomi from David Sirlin to describe how players "read" each other—how they predict, deceive, and outwit each other in multiplayer games. I used to follow Sirlin more closely, and I recently uncovered my copy of Flash Duel. It was fun to see his name show up, and I see he's working on a second version of the eponymous Yomi.

I am guilty of colloquially referring to dopamine as a pleasure-causing chemical, but Sylvester is more careful in the book to distinguish between motivation and pleasure. Dopamine causes motivation, not pleasure. Remove dopamine from rats, and they will do nothing, not even eat, even though they can still enjoy sugar syrup that is fed to them. It is an important distinction and one I should be careful not to blur since this biochemical understanding allows us to talk more clearly about how games can motivate us to play past the point of enjoyment.

The third and final part of the book is devoted to process. I expected a recommended series of steps to create a game, as offered by other texts. Instead, Sylvester opens the section by pointing out that many of our terms around process and management are borrowed from other disciplines: director, preproduction/production/postproduction. Here, he applies the same perspective to the process-design problem that he applies to the rest of game design: one needs to recognize assumptions and evaluate them, realistically but not cynically. Although he is right, he also later uses terms like "production" as if we all know what they mean, and I found myself wishing he were as prescriptive here as he was in the section on emotions and experience.

His advice on process comes down to the same observation made by the signatories of the Manifesto for Agile Software Development: work in tight iterations with good feedback loops. This appeals to my sense of lean production, that we should use only as much management as necessary. When it comes to sequencing, his advice is to figure out the dependencies among needs and then work upward from there. Here, I would augment his advice with some observations from story maps a la Jeff Patton, but I don't disagree with him. Sylvester is explicit about keeping the game at as low a fidelity as possible throughout (that is, grayboxing) so that time is not wasted re-creating assets. He mentions that working with an artist or level designer is important, but he is mum about what those other folks might be doing while this grayboxing is going on. I would have liked to know his thoughts about this, given that I mentor fixed-staff student teams.

The chapter on authority is beautiful, and I plan to use it as an assigned reading. Although he does not use these terms, what he describes is an approach to creative management rooted in natural law and subsidiarity. In particular, he talks about how a team member has natural authority over their work, and how it is a mistake to arrogate that. (Note: I learned the word "arrogate" from reading this book even though I've been battling arrogators in ISS Vanguard for some time without looking it up.) This chapter dovetails nicely with the next, which is on motivation. It is clear that he believes the only way a team can succeed is in an environment of trust and respect. Indeed, he goes so far as to say that one can only succeed by "getting people you can trust and then trust them." I used to recruit undergraduates for special community-engaged game development projects, but now, any student can elect into the game development track that puts them into the game production sequence; I am not sure what it implies for my teaching if Sylvester is right. 

Speaking of teaching, I am reminded of the essential challenges presented by the need to grade students in creative projects. Grades are an external reward, and we know that external rewards can kill intrinsic motivation. What then is a professor to do? 

One of the final points in the book is that the best way to motivate a creative person is to ensure that they have constant, small, visible progress. This is the progress principle. It turns my mind again toward considering physical task boards to replace the convenient digital ones my teams used the last two years.

I thoroughly enjoyed the book, and I am grateful for how much I learned from it. I will certainly return to my notes once I start pulling together my plans for the next game production sequence, assuming I get assigned to teach a section or two. The book inspires me to draw more from my own learning to build a process that I believe in, which will be easier after having stepped away from team-teaching for now. I am not sure how much of the book would resonate with beginning designers, who, in my experience, need more structured explanation and exercises to get them into the right mindset. I think it's valuable reading for someone who has already moved beyond the amateur steps, though, once one has come to tacitly understand iteration and the challenges of communication and motivation.

Wednesday, May 7, 2025

Reading "Fifty Quick Ideas to Improve Your User Stories"

I recently finished Fifty Quick Ideas to Improve Your User Stories by Gojko Adzic and David Evans. The book was strongly recommended by Dave Farley on his Modern Software Engineering channel. I have been using user stories for many years as part of my teaching and practice with agile software development, and I hoped that the book might give me some new ideas and perspectives. I read it with a particular interest in helping students manage game development projects.

The book itself is laid out simply: every two-page spread presents one of their recommendations. Each has a narrative, a summary of benefits, and then practical recommendations for getting started. From the title, I expected the book to be short and punchy like The Pocket Universal Principles of Design, but it contains a lot more detail. The authors are clear that the book is not an introduction to user stories; they assume the reader is already familiar with user stories and the jargon of agile software development. This is good for a reader like me, except that I had earlier lent the book to a student who was just learning the fundamentals. I would use a different resource for that purpose if I could do it again.

Here are suggestions from the book that struck me as potentially helpful to my teaching and mentoring. I have included the page numbers for ease of future reference.

Describe a behavior change (p14)

This is the second time recently that I have come across Bill Wake's INVEST mnemonic for user stories—that stories should be Independent, Negotiable, Valuable, Estimable, Small, and Testable. The other place was Vasco Duarte's No Estimates, which I wrote about earlier this year.

The idea behind this recommendation is to quantify change. For new capabilities, "start to..." or "stop..." are simple quantifiers. I feel like there's something useful in here for introducing students to user stories.

Approach stories as survivable experiments (p18)

This tip is all about the perspectives of what stories allow. I often see students mistake stories for traditional requirements, probably because traditional requirements look more like school. Framing the stories as experiments may help students see that this is more creative and exploratory.

Put a "Best Before..." date on stories (p24)

I wish I had thought of this one. I have mentored student teams where they have a story in the backlog that only makes sense to do by a certain date or milestone. It's an easy one to remember, and it falls into a common pattern in the book: make the stories your own.

Aside: It reminds me of the advice about managing my curriculum vita that I received from my colleague Mellisa Holtzman many years ago. She said that your vita is your own, and that you should use it to tell a story rather than slavishly follow a template. She was right, both philosophically and rhetorically. My university recently moved to using a one-size-fits-none digital solution, and I was disappointed that there has been no discussion about the impoverished epistemology of such a move. 

Set deadlines for addressing major risks (p28)

This is similar to the previous one, and it references it as a technique: one can use learning stories that have "Best Before..." dates.

The authors distinguish between learning stories and earning stories, referencing an old presentation by Alistair Cockburn. This is a nice distinction as well that I am sure I can use.

Use hierarchical backlogs (p30)

Having spent the last year mentoring three teams with flat backlogs, I miss story maps, and I look forward to going back to them. Story maps come up explicitly on page 36 under the recommendation Use User Story Maps, which is just after the recommendation on page 34 to Use Impact Maps. Gojko Adzic also wrote a book on impact mapping. I have read most of it and find it intriguing, and I can see how it could be useful in a business environment. It didn't strike me as immediately helpful for my needs teaching game design and development or the basics of agile software development.

Set out global concerns at the start of a milestone (p40)

The authors acknowledge that user stories cannot and should not capture everything that a team has to do. Writing emails or reading applicant vitae are examples of crucial work that is not directly related to a user story. 

Cross-cutting concerns such as security, performance, and usability should still be discussed, but they should manifest as something like a checklist that accompanies the whole process—not embedded or wedged into a user story.

Use low-tech for story conversations (p54)

The recommendation for whiteboards and sticky notes comes as no surprise to me. What did surprise me was that they caution against using a room with a large central table. Such a configuration makes people sit and opine rather than converse, and what comes out of the conversation (what we often call the "user stories" themselves) are just markers of that conversation.

Diverge and merge (p58)

When user story analysis is being done by a large team, they recommend breaking the team into smaller groups. Each group comes up with examples of user stories, then the groups compare their results. They suggest 10-15 minutes of divergence. 

In last year's game production studio, one team was much larger than the others, and I recommended that they break down story analysis by feature domain. I wanted each group to have representatives from Computer Science and from Animation. It worked reasonably well, but it was slow, and there were still a lot of holes in the analysis: cross-cutting ideas were lost. (Also, most of the students didn't know anything about user stories, and some harbored significant misunderstandings.) Next time, I will keep this advice in mind and have the groups work on the same context rather than different ones. I regularly have students push back on this kind of activity as redundant, but that is because they have not yet experienced the productivity drop that comes from realizing later that the requirements were wrong.

Measure alignment using feedback exercises (p62)

This one jumped out to me because it sounds like it came right from a book about active learning techniques for the classroom. 

During user story analysis, rather than ask if anyone has questions, give the discussants a sample edge case and ask them an open-ended question such as, "What happens next?" Each writes their response, and then the team compares them. This shows whether there is shared understanding or not.

Reading this gets me fired up to use it in all of my teaching from now on. That's awkward since I won't be teaching again until Spring 2026. That gives me all the more reason for this blog post.

Split learning from earning (p90)

As mentioned above, this separates two kinds of stories. Learning stories still need a concrete goal that is valuable to stakeholders. An interesting corollary to this advice is that every story should be either learning or earning; if not, it is not really a user story at all.

Don't push everything into stories (p98)

I foreshadowed this one earlier as well. They advise strongly against tracking non-story work with user stories. An example they give is insightful: if such work is tracked in the backlog, and the stakeholders sort the backlog, then they will deprioritize work that does not generate value. The non-story work won't get addressed, then, until it is a crisis, at which point it becomes the highest priority, so there was no value in having it in the backlog.

Avoid using numeric story sizes (p102)

The two-page spread summarizes some of the key arguments behind the No Estimates movement. Here, the authors recommend that teams using T-shirt sizes stick to at most three different sizes. Better, they prefer Goldilocks sizing: a story is either too big, too small, or just right. Their conclusion is similar to Duarte's: just count stories.

Estimate capacity based on analysis time (p106)

If we know that planning for an iteration takes 5%-10% of the iteration time, then we can timebox the actual planning and then only work on the things that could be covered in that planning meeting. That is, use the planning meeting's duration as a heuristic for the scope of an iteration. This is clever, and the authors acknowledge that the moderator needs to prevent the meeting from getting too far into the weeds.
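As a worked example of that heuristic (the specific numbers below are mine, not the authors'), a quick sketch:

```python
def planning_timebox_hours(iteration_days, hours_per_day=8.0, share=0.05):
    """Timebox the planning meeting at a fixed share (5%-10%)
    of the iteration's working time."""
    return iteration_days * hours_per_day * share

# A two-week sprint (10 working days) at the 5% end gives a 4-hour meeting;
# whatever stories the team can fully analyze in those hours bound the
# sprint's scope.
print(planning_timebox_hours(10))              # → 4.0
print(planning_timebox_hours(10, share=0.10))  # → 8.0
```

The point is that the meeting's fixed duration, not an up-front estimate, becomes the limiting factor on how much work enters the iteration.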

Split UX improvements from consistency work

They recommend having a "mini-team" explore and prototype UX changes. The output of this mini-team's work is a set of user stories for the whole team, not exhaustive design specifications. The work of this exploratory team is on the learning side of the learning/earning divide. The mini-team can slipstream back into the main team to help with implementation.

This interested me considering how often I see teams struggle with questions of gameplay and engagement. "Will this be more fun?" is a good design question that can be approached through prototyping. Unfortunately, my teams often want to do this work as if the answer is a fait accompli. I wonder if this recommendation, together with the learning/earning split, will help me frame better for them what it means to answer the design question with a prototype first. The outcome would be the stories for the whole team to actually engineer a solution. I suspect I will encounter much of the same resistance I mentioned previously, where teams assume working on the "product" is more efficient than working in prototypes because they have never seen a product fail under bad design or bad engineering. I think this is why I prefer working with students who have tried hard things and seen how mistakes have consequences beyond a scolding or a bad grade.

Check outcomes with real users (p116)

The authors proclaim this as potentially the most important thing a team does. The important corollary here is that this requires one to write user stories the way one approaches TDD, or even how one ought to design learning experiences: focus from the beginning on an assessment plan.

Tuesday, February 11, 2025

Notes from "No Estimates"

I recently finished reading Vasco Duarte's No Estimates after hearing about it on Dave Farley's Continuous Delivery YouTube channel and Allen Holub's #NoEstimates talk. I had been curious about the #NoEstimates movement for some time, reading an article here and there, but this was my first real attempt to understand it. The book itself is clear and direct, interleaving traditional content with an ongoing fictional narrative that motivates and reinforces the ideas. I found many connections to my research and teaching interests. In this blog post, I will share a few findings from my notes and reflections.

Estimates

One of the foundational principles of the book is fairly simple but not something I had considered before: an estimate communicates the peak of a probability distribution. For example, if I estimate a task to take two hours, I am saying that the most likely duration is two hours; it could take as little as negligible time, but there is also a long, low-probability tail of durations stretching far beyond two hours. The cumulative probability is the area under the curve, and because the tail lies on the late side, the probability of being late is much higher than the probability of being early.
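This can be illustrated with a quick simulation. The sketch below is my own (the lognormal shape and its parameters are assumptions for illustration, not from Duarte's book); it treats the "estimate" as the mode of a right-skewed duration distribution and measures how often the actual duration exceeds it:

```python
import math
import random

# Illustrative right-skewed task-duration model (parameters invented).
mu, sigma = 1.0, 0.6
mode = math.exp(mu - sigma ** 2)  # the peak: what we would quote as "the estimate"

random.seed(42)
samples = [random.lognormvariate(mu, sigma) for _ in range(100_000)]
p_late = sum(t > mode for t in samples) / len(samples)

print(f"estimate (mode of the distribution): {mode:.2f}")
print(f"P(actual duration > estimate):       {p_late:.2f}")  # well above 0.5
```

With these parameters, roughly seven out of ten tasks run past their "most likely" duration, which is exactly the asymmetry Duarte is pointing at.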

Duarte classifies estimation as waste in the Lean sense: doing more of it will not make the product better from a user's point of view. He clarifies that "no estimates" isn't a goal but a vision: estimates won't be axiomatically removed, but minimized. The discussion of waste got me thinking about QA testing in games, a point I will return to below.

Managing time, scope, cost, and quality

I regularly talk to my students about how cost, quality, and scope are the three levers we can control in project management. Since our work is constrained by the semester's schedule, I point out that we cannot shift cost; that is, we cannot simply add more time to the end of the semester to be able to get our projects done. I also argue that quality is non-negotiable: the point of undergraduate education is to learn how to work well, so sacrificing that is against the telos of the endeavor. Therefore, the only manipulable lever is scope. This perspective seems to help students understand why we focus on user story analysis, prioritizing based on features that add value to the users. 

I wish I could remember where I encountered that heuristic since it is distinct from a similar concept that dominates Web searches: the project management triangle, which is also known as the triple-constraint model or the iron triangle. This model explains how the constraints of scope, cost, and time are connected such that cutting one without changing the others will result in a loss of quality. Duarte uses this model in his book to draw a distinction between value-driven and scope-driven projects. Traditional management approaches are scope-driven: the scope is fixed, and so cost and time are unbounded. Value-driven projects instead fix time and cost, leaving scope flexible, which leads to the approach of delivering the most value first. This is a standard agile perspective, but I previously didn't have the nomenclature of "value-driven" and "scope-driven," perhaps in part because in my academic environment I rely on that alternative model described above.

Reducing variability in throughput

In Chapter 3, Duarte provides techniques to reduce the variability in a development team's throughput. I have used many of them before in mentoring student teams. These include using stable, cross-functional teams; having clearly defined priorities; not passing defects down the line; standardizing and automating when possible; freezing scope within iterations; and protecting the team from outside interruptions. He also suggests reducing dependencies so that people can work on one thing at a time. This got me thinking about how often my teams end up with coupled user stories, such that completing one requires work on another. Creating independent user stories comes up more than once in the book, and it's something I can watch for opportunities to practice and teach.

Duarte points out that good requirements must allow measuring progress early and often. They must also be flexible enough to determine which aspects of a system need to be implemented now and which can be built later, after the system is better understood. 

This leads to Duarte's conclusion that the only real measurement of progress in software development is Running Tested Stories. Anything else is ambiguous or unreliable. Teams can be managed toward consistent throughput by ensuring that there are no large stories (no larger than half an iteration), several independent stories can be completed in an iteration, and the distribution of story size stays about the same throughout.

The book references a 2003 article by Bill Wake about the "INVEST" acronym, which I had not seen before. Wake describes how user stories need to be Independent, Negotiable, Valuable, Estimable, Small, and Testable. "Negotiable" here means that they deal with the essence and not the details: they are not contracts about technical details. Wake's definition of "Small" is between half a day's effort and a day's effort. Duarte adapts "Estimable" to be Essential, which is sensible given his specialization. He includes the term blink estimation, which he attributes to Angel Medinilla and which was new to me. The idea is that one makes a snap judgement about whether a story fits within two weeks or not, and that this blink estimation is usually all that is needed. Regardless of which expansion I use, INVEST may be a helpful heuristic to give to teams who are breaking down a big problem such as a game design into smaller, valuable pieces.

Planning the details just in time

I started using Scrum with multidisciplinary undergraduate game development teams many years ago, and it has been a valuable practice. I was usually the Product Owner, responsible for articulating the work as user stories and prioritizing the backlog. Teams pulled stories from the Product Backlog to the Sprint Backlog during our planning meetings, as per traditional Scrum. When my teams found that a one-dimensional Product Backlog makes it hard to see the big picture, we adopted Story Maps, which ameliorated the problems. Although we tracked each Sprint's progress using burndown charts, I never bothered to compute velocity. Teams tended to get a good sense of how much they could do in two weeks by around week ten, and since I was in charge of the backlog, I could cut scope to fit into the time remaining.

My preference for agility caused some friction then when trying to apply Richard Lemarchand's Playful Production Process with my last two cohorts of game development students. Although Lemarchand calls for concentric development, he has relatively little to say about how to implement that. More importantly, his approach for each phase of production is to start by enumerating all the work that is to be done, estimating how long it will take, and then moving toward that goal. A careful reader will recognize this as scope-driven management, and a cultural observer will note that the games industry is beset by death marches and crunch. 

Duarte's alternative is rooted in agile principles: plan the details of the imminent iteration, getting them into user stories that can be completed in a day or two, and let the future work remain coarse-grained epic stories. He suggests not planning more than about two months' worth of work due to how much will be learned about the system in that time. 

This caused some stress for me since it was quite counter to one of my ongoing research projects. I have been thinking about how to combine some of Lemarchand's ideas with some ideas I took from Allen Holub's #NoEstimates talk. One of Holub's primary arguments is that we can simplify our planning, and get equivalent results, by counting each user story as a single unit of work. I have been investigating the differences between tracking work items as single units versus tracking estimated hours remaining. For example, consider these two perspectives from the end of a team's alpha phase of production.

These are two perspectives of the same period of time, an Alpha phase that lasted about three months. The first shows the number of stories in the backlog, and the second shows the total estimated number of hours. The top chart shows how the team cut a significant number of features around a third of the way through Alpha. For the next third, they added stories at about the same rate as they completed them, demonstrating how they were working to reshape the project based on the initial overestimate. We set up a nifty toolchain for tracking these data in real time using a combination of Hacknplan and Google Sheets. I even gave a workshop about this at GDEX a few months ago. But the whole thing hinges on having those planning data at the end of August for a milestone that's coming up in early December.

Duarte suggests a radically different model. Break down the problem so that the stories for the current sprint are independent and small (taking no more than half a sprint to complete). Track how many of those the team can get done in an iteration. Do that for a few iterations, and you have a good sense of how much work the team can accomplish in future iterations, which lets you control scope. More specifically, you can measure a team's User Story Velocity and its Feature Velocity, where "Feature" here is elsewhere called an epic story or an activity. 
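A minimal sketch of what that bookkeeping might look like, with invented data (note that "features touched per iteration" is my simplification; Duarte's Feature Velocity properly counts completed epics):

```python
# Stories completed in each iteration, each tagged with the epic ("feature")
# it belongs to. All data here are invented for illustration.
iterations = [
    ["login", "login", "inventory"],
    ["inventory", "inventory", "combat", "combat"],
    ["combat", "combat", "login"],
]

# User Story Velocity: completed stories per iteration, no size estimates.
story_velocity = sum(len(done) for done in iterations) / len(iterations)

# Rough proxy for Feature Velocity: distinct features advanced per iteration.
feature_velocity = sum(len(set(done)) for done in iterations) / len(iterations)

print(f"user-story velocity: {story_velocity:.1f} stories per iteration")
print(f"feature velocity:    {feature_velocity:.1f} features touched per iteration")
```

After a few iterations, those two averages are the only forecasting inputs needed: multiply by the iterations remaining and you have a defensible ceiling on scope.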

I like the sound of that. It was clear from watching student teams try to estimate an entire Alpha phase that they knew it was shoddy. Worse, for some of them, it planted a seed in their mind that they already knew enough about their projects to plan the whole phase when, in fact, they had yet to find the fun. Switching to the #NoEstimates approach would require me to supplement or replace Lemarchand's recommendations, including new ways of using project management tools.

The medium is the message

Incorporating #NoEstimates would also mean rethinking the relationship between the developers and the artists. When I was mentoring single-semester projects with a small art team, there was seldom any trouble: artists moved fluidly from concept art, sketches, and low-fidelity assets toward production-quality assets as the semester moved on. When artists struggled to match the iterative flow of the programmers, we adopted swimlanes as recommended by Clinton Keith.

It wasn't clear to me how to scale this up to teams where half may be artists. I reached out to Duarte himself, and he was kind enough to talk with me about my questions. We had a fruitful discussion, and he helped me see something that I didn't understand before: there is a whole category of practices that, fundamentally, are symptoms of a failure to regularly integrate. Swimlanes are one example, but there are countless more, as overt as separate physical locations and as mundane as job titles and org charts. If we consider that the running tested story is the only way to measure progress, then anything that does not support that is potentially distracting from it. It is the kind of observation that would make Marshall McLuhan smile: the presence of a swimlane says more about the team than anything in the swimlane itself.

Using social complexity to determine tactics

Buying the book also grants access to a keynote presentation that Duarte gave some years ago. It's an excellent talk and a good complement to the book. One particular element jumped off the screen and into my notebook, and that is Duarte's matrix for dealing with user stories. It deals with the problem of Social Complexity, which can be summarized as "the number of people in the organization you have to talk to about it." Here is a quick reproduction from his talk:


This captures something I have tried to express to many teams but failed to capture so clearly. It relates to the four conclusions of his presentation:
  1. Predict progress with #NoEstimates
  2. Break things down by value, not effort
  3. Agree on meaning with social and technical complexity, reducing risk
  4. Use RIDICULOUSLY SHORT timeboxes
He points out that if you can only do one of these, do the fourth one, since it is the essential practice from which the others derive.

Closing thoughts

I taught the first two cohorts of the game production sequence, but I am stepping away for the third one. There are a few reasons behind this, but primarily it's so that my new colleague has an opportunity to try his hand at it. I expect to be back in the saddle with the fourth cohort, who will start in Spring 2026. Writing up these notes took much longer than I expected, especially as I began to reflect on the substantial differences between what I have done in the past, what I did following Lemarchand, and what I might like to do in the future. For now, I need to put down this line of inquiry, but formalizing these notes gives me a point of entry when I need to refresh myself on these topics.

In the meantime, if you have thoughts, feedback, stories, or reflections, please feel free to share them in the comments.