Some academic has written a paper suggesting global warming (AGW) skeptics are moon-landing conspiracy theorists, and this has sparked a bit of a controversy. One of the many complaints about it is that it recruited subjects online through a survey posted on blogs, and therefore is completely unrepresentative of skeptics in the community. I’m going to examine this a bit here, in the context of the problem of studying online communities. I’m going to do this through everyone’s favourite context: gaming.
Suppose that you’re a lefty tree-hugging academic who wants to do a study of attitudes towards women in role-playing games. You want to find female gamers and you want their voice to be representative of all gamers in the community. There is basically only one robust way to do this: a simple random sample of the community. Since this is impossible, we usually use something that can be forced to approximate it through statistical tricks and a bit of hand-waving: the cluster-sampled face-to-face survey or the randomly sampled phone survey. These can be extremely resource-intensive, and a typical poll in Australia will involve 800-1500 people; all the polling goodness for Australia can be found here. So let’s suppose some well-funded researcher can pay Roy Morgan or Newspoll to tack a few questions about gaming onto the end of a poll (companies do this all the time). They will then get to ask the question they want to ask of about 1000 randomly selected Australians of all ages over 16 and both sexes. This means they will identify about 20 role-players, one of whom will be a woman. They could design a special poll that they commission separately, which oversamples 20-40 year olds, which will get them about 50 role-players (3 women); but this yields diminishing returns because for statistical reasons the weighting that gets applied to an over-sampled poll reduces its accuracy. In either case the sample of gamers will be “representative” of the population, but the precision will be so poor that we will be able to say something like “0-100% of women who game think that the gaming industry is sexist.” The only way to up the accuracy is to recruit enough role-players that we get about 30 women; that is, about 600 role-players in a randomly-selected sample of size 30,000. At this size we can do prevalence estimates but no regression comparisons of males and females – for that we need probably another 30 women, or a sample size of about 60,000.
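For anyone who wants to check the arithmetic above, here is a rough sketch in Python. The two rates (20 role-players per 1000 respondents, 1 woman per 20 role-players) are just the guesses used in this post, not measured figures. The Wilson score interval is my own choice of a standard way to show how wide the uncertainty is with a handful of respondents; a real polling firm would adjust for weighting and design effects. The "half of them say sexist" split in the loop is purely hypothetical.

```python
import math
from fractions import Fraction

# Rough rates taken from the text: assumptions, not survey results.
GAMER_RATE = Fraction(20, 1000)   # role-players per respondent
FEMALE_SHARE = Fraction(1, 20)    # women per role-player


def required_sample(target_women):
    """Return (role-players needed, total respondents needed)
    to expect `target_women` female role-players in a random sample."""
    gamers = math.ceil(target_women / FEMALE_SHARE)
    total = math.ceil(gamers / GAMER_RATE)
    return gamers, total


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.
    Ignores survey weighting, so it flatters the real precision."""
    if n == 0:
        return 0.0, 1.0
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    for women in (1, 30, 60):
        gamers, total = required_sample(women)
        # Hypothetical split: about half the women answer "sexist".
        lo, hi = wilson_interval(women // 2, women)
        print(f"{women:>2} women: {gamers} role-players, "
              f"{total} respondents, 95% CI {lo:.0%} to {hi:.0%}")
```

With one woman in the sample the interval covers most of the range from 0 to 100%, which is the point being made above; it only narrows to something usable once there are a few dozen women.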
We need to sample a quantifiable proportion of the Australian population to find out that, yes, female gamers think the industry is sometimes sexist.
That was worth it, wasn’t it? Why hasn’t someone funded this research? Governments have really crappy priorities these days – they’ll fund some guy in WA to do an internet survey of mere <i>climate skeptics</i> but they won’t fork out the cash for a decent survey of Aussie role-players! Maybe we need to get smarter with our grant applications … so instead we notice that gamers gather in clubs and shops, and realize that we could get a reasonably representative sample of gamers by recruiting subjects there. A couple of weekends and some hard yards later, and we’ve recruited our 600 gamers. Of course, our sample is no longer strictly representative of either the gaming community or the general community, because some gamers don’t play at shops (the <i>Vampire</i> crowd are hanging around the graveyard, and the cool kids are doing it at heavy metal gigs or at their local coven). Also, we’ve given up the ability to estimate population prevalences, because we don’t know how many gamers we missed in our study. But if we know something about our topic, and we work hard to recruit, and we also put up adverts in the right places and do a bit of snowball sampling (get them to invite their friends), we may be in with a chance of covering enough of the community, and getting a diverse enough range of gamers, that we’ve got something that, if not completely representative, is at least robust to criticism. The only reason to do this is that it’s much cheaper, but this is a common problem in modern research: Michael Mann may be up to his ears in NASA-funded cocaine, dancing girls and Cadillacs, but the rest of us are just struggling to recruit 100 sweaty-palmed nerds to fill in a two-page survey.
This is pretty much the standard way that one recruits “hard-to-reach” groups: role-players, street-based sex workers, injecting drug users, hamster-fetishists, AGW skeptics … sex work is legal (or decriminalized/licensed) in Australia now but good luck trying to recruit a nationally-representative sample of sex workers over the phone. No, you have to do the hard yards, slogging through brothels and asking if you could interview the pretty girl at the back in the cherry boots … your sample will never be nationally representative if you do this, but it will be representative of <i>something</i>, and if you target your survey selection well and do the right work, you can make your findings valid in some sense.
We could extend this basic principle to online gaming, though online gamers have a registration system and a defined world they operate in, so here it would actually be possible to run simple random samples of gamers and get quite a good response. To do this we would need the cooperation of the gaming community’s custodians: the companies that run the games. With their help it would be easy to distribute a survey to the gamers who use their servers, getting a large and robust sample. We wouldn’t be able to get prevalence estimates because to do that you need to randomly sample the whole community, not just the registered players, and ask them about their gaming habits; but it would be enough to examine relationships between gamers and their opinions of stuff.
The same thing applies to other online communities, and perhaps more so because online communities can be very fragmented compared to other communities: they can be international, for starters, dislocated geographically and never meeting in person. These communities can have very strong shared bonds, such as the people who comment at acrackedmoon’s website, but know nothing about each other’s physical lives. And they may be bound together by very strong political ties but completely socially unconnected. Surveying these people physically is almost impossible. We see this at both warmist and skeptic websites: the online AGW-debating world is a world that has no physical analog, and can’t be sampled through physical means. The only way to sample it with any accuracy is to sample it online.
However, there is a problem here. To justify using an online survey to recruit online skeptics/warmists for research, we need to prove that the online community of skeptics/warmists is different to the rest of the community. That is, if I select 1000 ordinary Australians and get their opinions of global warming, I should expect that their responses will be different to the online community of skeptics/warmists – presumably, less inflammatory and less extreme. If I can be confident that the online community is special, contained within itself, rare and not necessarily representative of the community as a whole, then I can be fairly confident that I need to recruit them using special convenience sampling methods – but I can also be fairly confident that existing research on the issues at play cannot be applied to them, which I think is what Lewandowsky did with his assumption about pre-existing factor structures.
I think that in order to understand the modern skeptic/warmist debate we need to recruit these people online. But the only reason to do this is that these groups are different, which means that we can’t apply existing cognitive models to them, we need to make new ones from an exploratory perspective. Lewandowsky seems not to have done the latter, but he tried to do the former. Sadly, the skeptic blogs didn’t accommodate him in his efforts, and anyone who has done research with hard-to-reach groups knows that you need your research to be supported by trusted peers before you can implement it successfully. As a result, the survey was only conducted on warmist sites, and I would challenge any skeptic reading this to toddle over to Deltoid, look at the comments from the skeptics posting there, and then tell me with a straight face that you want those skeptics speaking on your behalf. If you want your voice heard in research, you need to take part in research. Otherwise the weirdos at Catallaxy will do it for you, and before you know it this guy will be telling Stephan Lewandowsky that AGW is a myth because the Aztecs faked the moon landing.
fn1: Incidentally, this points us at the inherent efficiency of women’s studies as a discipline. One junior academic sitting in her office could have told you that from looking at back issues of <i>Dragon</i> magazine. And she could have taught her first year students where their clitoris is before lunch. Much cheaper!
fn2: Actually I’m not really convinced that there are 600 gamers in Australia.
fn3: which I would <i>never</i> do with a 5-point scale, because all Australians would just say “3,” “don’t really care”.