Are photographs becoming easier to classify?
One of the most common (but generally unfounded!) critiques of citizen science is that data collected by citizens may not be as reliable as data collected by professional scientists. We are currently wrapping up an analysis of image classification accuracy, and the evidence suggests that the crowd-based species consensus is generally very accurate. Keep up the great work!
Our comparison is based on Snapshot Wisconsin Season 1, and there are a few reasons we might expect classification accuracy to differ in subsequent seasons. First, the resolution of Season 1 images often left something to be desired. We have since updated the camera firmware, and images now look much cleaner. Second, Season 1 was the first opportunity for many users to take part in the project; it seems reasonable that participants would get better at classifying as they become more familiar with the workflow and with Wisconsin’s fauna.
One of the most useful indicators of whether the Zooniverse consensus classification is accurate is the level of agreement between users: if all users select the same species, that consensus is more reliable than if only 40% of users agree. What we see below is that there is complete user agreement on the majority of photographs (density in this case refers to probability density, and the height of the peak reflects the proportion of photos):
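For readers curious how an agreement score like this could be computed, here is a minimal sketch. The data layout and function name are hypothetical, not the project's actual pipeline: each photo is just a list of the species labels users submitted, and agreement is the fraction matching the most common label.

```python
from collections import Counter

def agreement(classifications):
    """Return (consensus species, fraction of users who chose it)
    for one photograph's list of user classifications."""
    counts = Counter(classifications)
    species, top = counts.most_common(1)[0]
    return species, top / len(classifications)

# Hypothetical photo: 4 of 5 users said "deer"
print(agreement(["deer", "deer", "deer", "deer", "wolf"]))
# -> ('deer', 0.8)
```

A photo where every user picks the same species scores 1.0; the 40%-agreement case mentioned above would score 0.4.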
User agreement seems higher for Season 2 than Seasons 1 and 3, but this is likely driven by “nothing-here” photographs, which are retired differently than photos with animals in them. As we see below, there were more “nothing-here” classifications in Season 2 than Season 1, and far fewer “nothing-here” classifications in Season 3:
So, if we control for “nothing-here” images, will we find an increase in classification agreement across seasons? Let’s take a look at a few species of interest (blue = Season 1, red = Season 2, black = Season 3):
If agreement among users (and, by extension, classification accuracy) is increasing over time, we would expect the peaks to shift rightwards. Visually, it looks like this is the case: strongly for raccoons and squirrels, moderately for deer, bears, bobcats, and porcupines, and not so much for wolves. This may reflect inherent differences in how difficult species are to identify; deer are usually easier to ID than wolves.
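The "control for nothing-here images" step described above can be sketched as follows. Everything here is illustrative: the record layout, the label "nothing here", and the toy values are assumptions, not the project's real data.

```python
from collections import Counter

# Hypothetical records: (season, photo_id, species chosen by one user)
records = [
    (1, "p1", "deer"), (1, "p1", "deer"), (1, "p1", "wolf"),
    (2, "p2", "nothing here"), (2, "p2", "nothing here"),
    (2, "p3", "raccoon"), (2, "p3", "raccoon"),
]

def per_season_agreement(records):
    """Mean per-photo agreement by season, skipping photos whose
    consensus label is 'nothing here'."""
    by_photo = {}
    for season, photo, species in records:
        by_photo.setdefault((season, photo), []).append(species)
    season_scores = {}
    for (season, _), labels in by_photo.items():
        consensus, top = Counter(labels).most_common(1)[0]
        if consensus == "nothing here":
            continue  # exclude blank images from the comparison
        season_scores.setdefault(season, []).append(top / len(labels))
    return {s: sum(v) / len(v) for s, v in season_scores.items()}

print(per_season_agreement(records))
```

Filtering on the *consensus* label (rather than on individual votes) keeps photos where a minority of users mistakenly selected "nothing here" but most saw an animal.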
But what do you think? Which species do you find most difficult? Do you feel as though classification is becoming easier, and if so, why?