Friday, February 15, 2019

Counting plays with a specific player on Board Game Geek

I started logging all my board game plays on BoardGameGeek in January 2016. In writing my end-of-year game reports (2016, 2017, 2018), I have wondered how many games I played with specific players during the year, but this proved to be a difficult number to find. The unofficial BoardGameGeek app that I use tells me how many games I have played with players in total, but this cannot be filtered by date. (Incidentally, I just discovered, when attempting to link to the app, that it is currently missing from the Google Play Store but that the author is working on getting it updated.) I asked a few friends over the years, but nobody knew of a way to get the data I wanted back out of BoardGameGeek. However, I am working on a scholarly piece for which this is important information, and so I decided it was worth some effort to figure it out.

BoardGameGeek provides an XML API for programmatic access to its content, including play data. Let's say for the sake of example that there is a user named "sample", and I want to get all their plays between January 1, 2016 and December 31, 2018. This query will provide a good start:

 https://boardgamegeek.com/xmlapi2/plays?username=sample&mindate=2016-01-01&maxdate=2018-12-31  

The results are limited to 100 per page, but there is no way to ask how many entries exist. Hence, to get all the data, one has to go page by page until the page is empty. Accessing a particular page of data is a simple matter of adding a page=N parameter to the end of the query.

I'm not interested in all plays, though; I am only interested in plays with a particular user. In my case, I am looking for matches by a given name and not a BoardGameGeek username, since this is how players are most easily added through my app. If I wanted to get all the plays with, say, Norm, than I need to dig into the XML and only count those where there is a "player" element with the value "Norm". For this, I can use XPath via xmllint.

After refreshing myself on some Bash fundamentals and finding this beautiful, idiolectic way of creating a do loop, I ended up with this script, which counts plays between BoardGameGeek fake user "sample" and his imaginary friend "Norm" between January 1, 2016 and December 31, 2018.
 #!/bin/bash  
 COUNT=0  
 PAGE=1  
 while  
   echo "Processing page $PAGE"  
   xml=`curl -s "https://boardgamegeek.com/xmlapi2/plays?username=sample&mindate=2016-01-01&maxdate=2018-12-31&page=$PAGE"`  
   plays=`xmllint --xpath 'count(/plays/play)' - <<< "$xml"`  
   pagecount=`xmllint --xpath 'count(/plays/play/players/player[@name="Norm"])' - <<< "$xml"`  
   COUNT=$(($COUNT + $pagecount))  
   PAGE=$(($PAGE + 1))  
   echo "Page $PAGE result is $pagecount out of $plays, so total is $COUNT"  
   [ "$plays" -gt 0 ]  
 do  
   :  
 done  

I learned a few interesting things writing this script and sharing it with friends. Script-wizard Ben Dean helped me revise the script so that it would not generate temporary files on the filesystem, and in pursuing this goal, I learned about bash herestrings. I had also not previously scripted with XPath via xmllint, only within larger applications. This script got me what I needed to know: I logged 937 plays with my eldest son in the three full years I have been keeping track, but not counting the plays since January 1 of this year.

Hopefully this script will be useful to you. If nothing else, it will be useful to me next time I need to ask this question!

5 comments:

  1. You say that you needed to redirect stderr. How come?

    I was able to get this to work on my Mac with no redirection and a pipe:

    play=`curl "https://boardgamegeek.com/xmlapi2/plays?username=sample&mindate=2016-01-01&maxdate=2018-12-31&page=$PAGE" | xmllint --xpath 'count(/plays/play)' -`

    For xmllint to read from stdin, you need a '-' in place of a file name.

    ReplyDelete
    Replies
    1. Sorry, I wrote the post across two writing sessions. When I sat down to write this, I thought I had needed to do a 2>&1 redirection in the final script, but clearly I did not---since it's not in the script!

      The stream redirection was related to the pipe I was having trouble with, but you're right that the one you show works (although I have to add -s to prevent a progress dashboard from being rendered by curl).

      The original script I posted made use of a temporary file, but as I mention in the revised post, my friend Ben helped me see that I could just use a variable instead. I like this approach better, since I was unhappy with creating a temporary file without first ensuring there wasn't something on the filesystem with that name. The herestrings are a new one on me, but they seemed to be necessary to pass the xml data to xmllint.

      Delete
    2. For what it's worth, zsh has a nice substitution feature for those cases where our tools accept only files and refuse to read from stdin. The =(tool arg1 ...) captures the output of running tool in a temporary file and then expands to the temporary file's name.

      I use this often to produce diffs for files that I don't actually have stored anywhere. Like if I've got a remote version that I want to compare to:

      vimdiff localfile =(ssh host "cat remotefile")

      Delete
    3. Neat! I have no real experience with zsh.

      Delete