Friday, February 15, 2019

Counting plays with a specific player on Board Game Geek

I started logging all my board game plays on BoardGameGeek in January 2016. In writing my end-of-year game reports (2016, 2017, 2018), I have wondered how many games I played with specific players during the year, but this proved to be a difficult number to find. The unofficial BoardGameGeek app that I use tells me how many games I have played with players in total, but this cannot be filtered by date. (Incidentally, I just discovered, when attempting to link to the app, that it is currently missing from the Google Play Store but that the author is working on getting it updated.) I asked a few friends over the years, but nobody knew of a way to get the data I wanted back out of BoardGameGeek. However, I am working on a scholarly piece for which this is important information, and so I decided it was worth some effort to figure it out.

BoardGameGeek provides an XML API for programmatic access to its content, including play data. Let's say for the sake of example that there is a user named "sample", and I want to get all their plays between January 1, 2016 and December 31, 2018. This query will provide a good start:

 https://boardgamegeek.com/xmlapi2/plays?username=sample&mindate=2016-01-01&maxdate=2018-12-31  

The results are limited to 100 per page, but there is no way to ask how many entries exist. Hence, to get all the data, one has to go page by page until the page is empty. Accessing a particular page of data is a simple matter of adding a page=N parameter to the end of the query.

I'm not interested in all plays, though; I am only interested in plays with a particular user. In my case, I am looking for matches by a given name and not a BoardGameGeek username, since this is how players are most easily added through my app. If I wanted to get all the plays with, say, Norm, than I need to dig into the XML and only count those where there is a "player" element with the value "Norm". For this, I can use XPath via xmllint.

After refreshing myself on some Bash fundamentals and finding this beautiful, idiolectic way of creating a do loop, I ended up with this script, which counts plays between BoardGameGeek fake user "sample" and his imaginary friend "Norm" between January 1, 2016 and December 31, 2018.
 #!/bin/bash  
 COUNT=0  
 PAGE=1  
 while  
   echo "Processing page $PAGE"  
   xml=`curl -s "https://boardgamegeek.com/xmlapi2/plays?username=sample&mindate=2016-01-01&maxdate=2018-12-31&page=$PAGE"`  
   plays=`xmllint --xpath 'count(/plays/play)' - <<< "$xml"`  
   pagecount=`xmllint --xpath 'count(/plays/play/players/player[@name="Norm"])' - <<< "$xml"`  
   COUNT=$(($COUNT + $pagecount))  
   PAGE=$(($PAGE + 1))  
   echo "Page $PAGE result is $pagecount out of $plays, so total is $COUNT"  
   [ "$plays" -gt 0 ]  
 do  
   :  
 done  

I learned a few interesting things writing this script and sharing it with friends. Script-wizard Ben Dean helped me revise the script so that it would not generate temporary files on the filesystem, and in pursuing this goal, I learned about bash herestrings. I had also not previously scripted with XPath via xmllint, only within larger applications. This script got me what I needed to know: I logged 937 plays with my eldest son in the three full years I have been keeping track, but not counting the plays since January 1 of this year.

Hopefully this script will be useful to you. If nothing else, it will be useful to me next time I need to ask this question!