Tuesday, August 3, 2010

Student evaluation plan for game programming

Continuing the series of posts on planning for Fall's game programming class, I would like to share my student evaluation plan. Thanks to all of my friends and colleagues who have taken the time to discuss these issues with me.

By way of background, I used both self- and peer-evaluation in my HCI class in Spring 2010, where students worked in teams on Android applications. I have always been suspicious of both kinds of evaluations (they are neither reliable nor valid, statistically speaking), but it seemed like the best way to get reasonable feedback. While I believe that most of the evaluations were honest, I found the lies to be much more interesting. Of course, the cynical mind leaps immediately to bloated self-evaluations, and there were some of those. Simple tricks like taking the median of the set of self- and peer-evaluations can address such issues, assuming most people are honest. By the end of the semester, some students had an axe to grind and gave brutally honest feedback. As an observer, I think that much of this anger was justified. However, it was the inflated peer-evaluations that I found more interesting. Without breaking teacher-student confidentiality, I will say that these led to some fantastic discussions where the reason for "inflated" evaluations was not a desire for reciprocation but rather sympathy, maybe even pity, for those who had made mistakes.
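
The median trick mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `robust_score` and the sample scores are hypothetical, not data from the HCI class, and the scale is arbitrary.

```python
from statistics import median


def robust_score(self_score, peer_scores):
    """Combine one self-evaluation with the peer evaluations via the median.

    A single inflated (or deflated) score cannot move the result far,
    as long as most of the evaluations are honest.
    """
    return median([self_score, *peer_scores])


# Hypothetical example: an inflated self-evaluation of 3
# alongside four peer evaluations.
self_score = 3
peer_scores = [1, 1, 2, 1]

combined = robust_score(self_score, peer_scores)
average = (self_score + sum(peer_scores)) / (1 + len(peer_scores))

print(combined)  # median of [1, 1, 1, 2, 3]
print(average)   # the mean is pulled upward by the inflated self-score
```

The median ignores the outlier where a plain average would not, which is why it only works if most of the evaluations are honest.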

The point relevant to this post is, as Dr. House would say, people lie. Specifically, students lie to teachers. We have set up a cultural system in which it is inevitable, in which cheating is rewarded with material gain and, as The Spirit of Christmas put it so eloquently, "the rewards of virtue are largely spiritual, but all the better for it."

However, there is a force more powerful than material gain, a force that can strengthen leaders or destroy futures: peer pressure.

Without further ado, then, my experimental evaluation method is rubric-based, using the rubric shown below.
| Category | 3 | 2 | 1 | 0 |
| --- | --- | --- | --- | --- |
| Code Quality | Uses pair programming or all commits are signed by a team member. | Usually uses pair programming or most commits are signed by a team member. | Rarely follows pair programming or few commits are signed by a team member. | Has not contributed code. |
| Commitment to Meetings | Attends and is prompt and attentive for all meetings. | Attends most meetings but is either late with significant frequency or is not focused during meetings. | Attends most meetings but is routinely late or unfocused. | Regularly misses meetings. |
| Facilitation (ScrumMasters only) | Facilitated the success of the team. | Mostly facilitated the success of the team with some problems such as failure to remove impediments or unnecessarily prioritizing owned tasks before team tasks. | Regularly failed to facilitate the team's success. | Did not adequately serve the team. |

The numbers are based on my general grading rubric. In a nutshell, 3=A, 2=C, 1=D, 0=F.

As part of the Sprint Retrospective, students will grade themselves according to this rubric. Then, in a meeting with the team, they will announce what values they assigned to themselves. While students may lie to me in a moment of weakness, I think it's less likely that non-sociopaths would do this in front of their team of peers. Teams can reorganize between sprints, and so I will collect all of the self-evaluations in a spreadsheet that is shared with the other teams.

I do not have any middle management in the course design: there is no one between me (Scrum Product Owner) and the students (Scrum Developers / Team Members) that externally evaluates them. In the event that there are students who are not carrying their weight on the project, I have set up a mechanism for firing them from the studio. Here's how I put it in the draft of the course description:

"If I receive reliable reports that a student is not contributing adequately, I will schedule an individual meeting with the student to discuss the matter. Failure to attend this meeting or resolve the problems will result in the student's being fired from the studio: the student fails and may no longer contribute to the project."

I am hopeful that the combination of rubrics and public accountability will contribute to students' sense of ownership over the project. I am putting a great deal of trust in these students, and I want the evaluation plan to reflect that. As I intend to introduce the course, we're not just playing pretend: we are a game studio, and we have work to do.

As always, I welcome feedback and discussion.


  1. Firing a student? That seems interesting, but hard to visualize in a classroom environment. What will that particular student do if he is fired from the class?

  2. Since the class is a development studio experience, it will work just like in any other development studio: the student will no longer be welcome to participate in any aspect of the project. It may sound harsh at first, but it is the only way to make it a legitimate studio environment, to enforce accountability.

    Of course, I expect 99% of problems to be solved through consultation. Keep in mind that, as part of Scrum, students have to commit to tasks at the beginning of each Sprint and log their progress. It will be quite clear if someone's problem is related to time management (which, in my experience, is where a lot of student problems come from), and so it will be straightforward to sit with the student and look at how to schedule productive hours more wisely.