Saturday, April 27, 2019

Two missing specifications in HCI

I finished grading my students' final projects for the Spring 2019 HCI class (CS445, used to be CS345). Before the start of the semester, I wrote about how I would try specifications grading in the course. After the afternoon of grading, I realize that I missed two important specifications. I will share them here so that I have a better chance of remembering when planning Fall's class, since I've been assigned to teach the course again.

I should have had a specification requiring all non-trivial processing to be done off of the event thread. This is, of course, a requisite for any kind of multi-threaded UI programming. I specifically chose a data source that would require handling slow load times and long processing times so that my students could practice this technique. I developed a sample project in the first half of the semester based around this common practice, and I explained to them why it was important. However, I neglected to have a specification about it. Three hours before the final project was due, I had a student ask for some last-minute troubleshooting. He said that he added a spinner while some images loaded, but it wasn't showing up. Of course, it wasn't showing up because he was loading the image on the event thread. I showed him (again) the example from earlier in the semester and explained (again) why this pattern was necessary. From their final technical presentations, it was clear that he was the only person in the class of roughly twenty students who understood this crucial point. I believe this is an instance of the old standard motto: if it's important, make it worth points. I simply missed it in my specifications.
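To make the pattern concrete, here is a minimal sketch of it. This is not the sample project from class, and it uses a toy event loop rather than a real UI toolkit. Everything in it is a hypothetical stand-in: `ToyUi`, `run_later`, `on_click_load`, `show_image`, `load_image`, and the file name `cat.png`. The idea is that the click handler shows the spinner and hands the slow load to a worker thread, and the worker only posts the result back to the event thread, which stays free to process other events in the meantime.

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def load_image(name):
    """Hypothetical stand-in for a slow image or data load."""
    time.sleep(0.2)
    return f"<pixels of {name}>"


class ToyUi:
    """A toy single-threaded event loop; real toolkits provide their own."""

    def __init__(self):
        self.events = queue.Queue()  # callbacks waiting for the event thread
        self.log = []                # what the "screen" showed, in order
        self.event_thread_id = None
        self.workers = ThreadPoolExecutor(max_workers=2)

    def run_later(self, callback):
        """Schedule a callback on the event thread; safe from any thread."""
        self.events.put(callback)

    def on_click_load(self, name):
        # Runs on the event thread: update the UI, then hand off the slow work.
        self.log.append("spinner shown")
        future = self.workers.submit(load_image, name)
        # The done-callback may run on a worker thread, so it touches no UI
        # state; it only posts the result back to the event thread.
        future.add_done_callback(
            lambda f: self.run_later(lambda: self.show_image(f.result()))
        )

    def show_image(self, image):
        # UI state is only ever changed on the event thread.
        assert threading.get_ident() == self.event_thread_id
        self.log.append(f"spinner hidden, showing {image}")
        self.run_later(None)  # sentinel: stop the toy loop

    def run(self):
        self.event_thread_id = threading.get_ident()
        while True:
            callback = self.events.get()
            if callback is None:
                break
            callback()
        self.workers.shutdown()


def demo():
    ui = ToyUi()
    ui.run_later(lambda: ui.on_click_load("cat.png"))
    # A second event, handled while the load is still in progress.
    ui.run_later(lambda: ui.log.append("repaint"))
    ui.run()
    return ui.log


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

If `on_click_load` called `load_image` directly, the loop could not handle the "repaint" event until the load finished. That is the same reason a spinner never appears when an image is loaded on the event thread.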

The other specification deals with acceptance testing. There are two relevant specifications in the evaluation plan, one at B-level and one at A-level. Specification B.5.R says that the final report "describes the methods by which the solution was evaluated," and A.2.R says that "The documented solution evaluation includes both quantitative and qualitative elements that explicitly align with this semester's readings." The B-level specification is designed to be broad: you can earn a B on the project by doing any kind of acceptance testing. The A-level specification is designed to be more focused: do a mixed methods evaluation based on a theory we studied this semester. None of the five teams explicitly aligned their evaluations with the semester's readings. This didn't stop two of the groups from marking that specification as complete in their respective checklists, which casts serious doubt on the implicit claim that they had conducted the self-assessment the checklist is supposed to record. (Perhaps, then, I need to add more rigor to the self-assessment itself, requiring them to link each claim to the artifacts that support it.)

The problem with the acceptance testing actually goes much deeper than dishonest claims of completion. Among those who conducted any kind of acceptance testing, there was no evidence of their having learned anything from the assigned readings and exercises relating to Steve Krug's and Jakob Nielsen's theories. Instead, they followed ad hoc approaches that were poorly designed and yielded unreliable results. They did actually use quantitative and qualitative approaches, in keeping with A.2.R, but they did not do these well. For example, many groups asked questions like "What did you think of the application?" and then reported "3/6 users say they liked it." I pointed out in my feedback that 50% of users claiming they liked it is different from 50% of users liking it. More importantly, "liking" the application was not one of our design goals: we were designing systems to be used. Yet only one of the groups conducted a task-based evaluation, where the user was asked to accomplish a goal using their system. Task-based evaluation is what I expected, and task-based evaluation is what I wanted. However, I wanted the students to realize that this kind of evaluation was the right choice, so I left the specification open to other options. The other options were demonstrably worse. Hence, in the future, and particularly in this introductory HCI course, I should just require them to follow the best practice rather than give them the choice to shoot themselves in the proverbial feet.

I have to wonder if the students would have spontaneously met these criteria if they had taken notes during our discussions.
