Are photographs becoming easier to classify?

One of the most common (but generally unfounded!) critiques of citizen science is that data collected by citizens may not be as reliable as data collected by professional scientists. We are currently wrapping up an analysis of image classification accuracy, and the evidence suggests that the crowd-based species consensus is generally very accurate. Keep up the great work!

Our comparison is based on Snapshot Wisconsin Season 1, and there are a few reasons we might expect classification accuracy to differ in subsequent seasons. First, the resolution of Season 1 images often left something to be desired; we have since updated the camera firmware, and images now look cleaner. Second, Season 1 was the first opportunity for many users to take part in the project, and it seems reasonable that participants get better at classifying as they become more familiar with the workflow and Wisconsin’s fauna.

One of the most useful indicators of whether the Zooniverse consensus classification is accurate is the level of agreement between users: if all users select the same species, that consensus is more reliable than if only 40% of users agree. What we see below is that there is complete user agreement on the majority of photographs (“density” here means probability density; the area under each curve over a range of agreement values reflects the proportion of photos falling in that range):

[Figure: probability density of user agreement on the consensus species, by season]
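For readers who like to tinker, a plot along these lines can be sketched in a few lines of Python. This is a toy illustration, not our actual pipeline, and the file and column names (classifications.csv, photo_id, season, species_choice) are invented for the example:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical input: one row per individual classification, with columns
# photo_id, season, and species_choice (these names are made up; they are
# not the real Snapshot Wisconsin export fields).
df = pd.read_csv("classifications.csv")

# Agreement = the share of a photo's classifiers who picked the plurality species.
agreement = (
    df.groupby(["season", "photo_id"])["species_choice"]
      .agg(lambda s: s.value_counts(normalize=True).max())
      .rename("agreement")
      .reset_index()
)

# One density curve per season, echoing the plot above.
for season, sub in agreement.groupby("season"):
    sub["agreement"].plot.density(label=f"Season {season}")  # requires scipy
plt.xlabel("Proportion of users agreeing on the consensus species")
plt.ylabel("Probability density")
plt.legend()
plt.show()
```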

User agreement seems higher in Season 2 than in Seasons 1 and 3, but this is likely driven by “nothing-here” photographs, which are retired differently from photos with animals in them. As we see below, there were more “nothing-here” classifications in Season 2 than in Season 1, and far fewer in Season 3:

[Figure: proportion of “nothing-here” classifications, by season]
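Tallying those proportions is simple once you have a photo-level table. Again, a rough sketch with invented names (consensus_labels.csv, season, consensus_label):

```python
import pandas as pd

# Hypothetical photo-level table: one row per photo, with columns
# season and consensus_label.
consensus = pd.read_csv("consensus_labels.csv")

# Proportion of each season's photos whose consensus was "nothing-here".
share_nothing = (
    consensus["consensus_label"].eq("nothing-here")
    .groupby(consensus["season"])
    .mean()
)
print(share_nothing)
```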

So, if we control for “nothing-here” images, will we find an increase in classification agreement across seasons? Let’s take a look at a few species of interest (blue = Season 1, red = Season 2, black = Season 3):

[Figure: user agreement densities for selected species, by season]

If agreement among users (and, by extension, classification accuracy) is increasing over time, we would expect the peaks to shift rightward. Visually, this appears to be the case: strongly for raccoons and squirrels, moderately for deer, bears, bobcats, and porcupines, and hardly at all for wolves. This may reflect inherent differences in how hard species are to identify: deer are usually easier to ID than wolves.
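One simple way to check this impression numerically is to drop the “nothing-here” photos and compare a summary statistic, such as the median agreement, per species across seasons. A sketch, once more with invented file and column names:

```python
import pandas as pd

# Hypothetical photo-level table: season, consensus_label, and the
# agreement score (0-1) computed as in the earlier sketch.
photo = pd.read_csv("photo_agreement.csv")

# Exclude "nothing-here" photos, then compare median agreement per species.
animals = photo[photo["consensus_label"] != "nothing-here"]
medians = (
    animals.groupby(["consensus_label", "season"])["agreement"]
           .median()
           .unstack("season")
)
# Rows are species, columns are seasons; medians drifting toward 1.0
# across columns would support the rightward shift seen in the plots.
print(medians)
```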

But what do you think? Which species do you find most difficult? Do you feel as though classification is becoming easier, and if so, why?

5 responses to “Are photographs becoming easier to classify?”

  1. nlgoodman says:

    I don’t understand the y-axis on the plots – probability density % isn’t a calculation I’m familiar with. I get that ideally you want consensus = 1 and the higher the peak, the better, but what value of “density” is acceptable for your purposes?


  2. John Clare says:

    Hi, and sorry for the slow response: probability density is a way of describing the likelihood of different possible outcomes for a continuous random variable. The area between the x-axis and the curve integrates to 1 (i.e., some level of user agreement between 0% and 100% must occur); the height itself doesn’t have a direct interpretation. Instead, the integral over a range of outcome values gives the probability that a random outcome falls within that interval. In an ideal world, all of the probability density would fall in the agreement interval between, say, 0.9 and 1.0 (i.e., all consensus classifications would be agreed upon by more than 90% of the viewers).
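    To make that concrete, here’s a toy sketch in Python (the agreement scores below are simulated, not Snapshot Wisconsin data): the fitted curve’s height isn’t a probability, but its integral over [0.9, 1.0] is.

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde

    # Simulated stand-ins for per-photo agreement scores.
    rng = np.random.default_rng(0)
    scores = rng.beta(8, 2, size=1000)

    # The curve height has no direct meaning, but the integral of the
    # density over an interval is a probability.
    kde = gaussian_kde(scores)
    prob = kde.integrate_box_1d(0.9, 1.0)
    print(f"P(0.9 <= agreement <= 1.0) = {prob:.2f}")
    ```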


  3. Mitch56 says:

    Most definitely yes. When I first started doing Wisconsin Wildlife Watch it was with single images, and the metadata was meaningless to us classifiers. Now the image sets are series of 3 images that can be stepped through automatically or manually, which is so much better. The metadata is detailed and important during classification. In fact, a couple of the cameras had been relocated without updating the information, so there are going to be some incorrect classifications from this past season. For instance, I classified a snowshoe hare that I knew was correct, but I changed my mind because the metadata had put it way too far south and it was still in brown coat in the wintertime. Another one I found was a fisher in Sheboygan County; again, I knew better, but I classified it as a mink. Later it was discovered to actually be in St. Croix, I believe. Also, we had zoom capability but it stopped working. It was so awesome to use my mouse wheel to zoom in and out. You might want to see about getting that back before the new season starts.


  4. snapshotbob says:

    Your classification system needs another selection, and it should be different than “no animal.” I have a shot showing a pair of ground-level eyes reflecting the light of the flash. There is clearly an animal, yet there is no way to identify it. “Unable to ID” should be a choice.

