Education


The media this week are exploding with news that a company called Cambridge Analytica used shadily-obtained Facebook data to influence the US elections. The data was harvested by some other shady company using an app that legally exploited Facebook’s privacy rules at the time, and then handed over to Cambridge Analytica, who then used the data to micro-target adverts over Facebook during the election, mostly aimed at getting Trump elected. The news is still growing, and it appears that Cambridge Analytica was up to a bunch of other shady stuff too – swinging elections in developing countries through fraud and honey-traps, getting Facebook data from other sources and possibly colluding illegally with the Trump campaign against campaign funding laws – and it certainly looks like a lot of trouble is deservedly coming their way.

In response to this a lot of people have been discussing Facebook itself as if it is responsible for this problem, is itself a shady operator, or somehow represents a new and unique problem in the relationship between citizens, the media and politics. Elon Musk has deleted his company’s Facebook accounts, there is a #deleteFacebook campaign running around, and lots of people are suggesting that the Facebook model of social networking is fundamentally bad (see e.g. this Vox article about how Facebook is simply a bad idea).

I think a lot of this reaction against Facebook is misguided, does not see the real problem, and falls into the standard mistake of thinking a new technology must necessarily come with new and unique threats. I think it misses the real problem underlying Cambridge Analytica’s use of Facebook data to micro-target ads during the election and to manipulate public opinion: the people reading the ads.

We use Facebook precisely because of the unique benefits of its social and sharing model. We want to see our friends’ lives and opinions shared amongst ourselves, we want to be able to share along things we like or approve of, and we want to be able to engage with what our friends are thinking and saying. Some people using Facebook may do so as I do, carefully curating content providers we allow on our feed to ensure they aren’t offensive or upsetting, and avoiding allowing any political opinions we disagree with; others may use it for the opposite purpose, to engage with our friends’ opinions, see how they are thinking, and openly debate and disagree about a wide range of topics in a social forum. Many of us treat it as an aggregator for cat videos and cute viral shit; some of us only use it to keep track of friends. But in all cases the ability of the platform to share and engage is why we use it. It’s the one thing that separates it from traditional mass consumption media. This is its revolutionary aspect.

But what we engage with on Facebook is still media. If your friend shares a Fox and Friends video of John Bolton claiming that Hillary Clinton is actually a lizard person, when you watch that video you are engaging with it just as if you were engaging with Fox and Friends itself. The fact that it’s on Facebook instead of TV doesn’t suddenly absolve you of the responsibility and the ability to identify that John Bolton is full of shit. If Cambridge Analytica micro-target you with an ad that features John Bolton claiming that Hillary Clinton is a lizard person, that means Cambridge Analytica have evidence that you are susceptible to that line of reasoning, but the fundamental problem here remains that you are susceptible to that line of reasoning. Their ad doesn’t become extra brain-washy because it was on Facebook. Yes, it’s possible that your friend shared it and we all know that people trust their friends’ judgment. But if your friends think that shit is reasonable, and you still trust your friends’ judgment, then you and your friends have a problem. That’s not Facebook’s problem, it’s yours.

This problem existed before Facebook, and it exists now outside of Facebook. Something like 40% of American adults think that Fox News is a reliable and trustworthy source of news, and many of those people think that anything outside of Fox News is lying and untrustworthy “liberal media”. The US President apparently spends a lot of his “executive time” watching Fox and Friends and live tweeting his rage spasms. No one forces him to watch Fox and Friends, he has a remote control and fingers, he could choose to watch the BBC. It’s not Facebook’s fault, or even Fox News’s fault, that the president is a dimwit who believes anything John Bolton says.

This is a much bigger problem than Facebook, and it’s a problem in the American electorate and population. Sure, we could all be more media savvy, we could all benefit from better understanding how Facebook abuses privacy settings, shares our data for profit, and enables micro-targeting. But once that media gets to you it’s still media and you still have a responsibility to see if it’s true or not, to assess it against other independent sources of media, to engage intellectually with it in a way that ensures you don’t just believe any old junk. If you trust your friends’ views on vaccinations or organic food or Seth Rich’s death more than you trust a doctor or a police prosecutor then you have a problem. Sure, Facebook might improve the reach of people wanting to take advantage of that problem, but let’s not overdo it here: In the 1990s you would have been at a bbq party or a bar, nodding along as your friend told you that vaccines cause autism and believing every word of it. The problem then was you, and the problem now is you. In fact it is much easier now for you to not be the problem. Back in the 1990s at that bbq you couldn’t have surreptitiously whipped out your iPhone and googled “Andrew Wakefield” and discovered that he’s a fraud who has been struck off by the GMC. Now you can, and if you choose not to because you think everything your paranoid conspiracy theorist friend says is true, the problem is you. If you’re watching some bullshit Cambridge Analytica ad about how Hillary Clinton killed Seth Rich, you’re on the internet, so you have the ability to cross-reference that information and find out what the truth might actually be. If you didn’t do that, you’re lazy or you already believe it or you don’t care or you’re deeply stupid. It’s not Facebook’s fault, or Cambridge Analytica’s fault. It’s yours.

Facebook offers shady operatives like Robert Mercer the ability to micro-target their conspiracy theories and lies, and deeper and more effective reach of their lies through efficient use of advertising money and the multiplicative effect of the social network feature. It also gives them a little bit of a trust boost because people believe their friends are trustworthy. But in the end the people consuming the media this shady group produce are still people with an education, judgment, a sense of identity and a perspective on the world. They are still able to look at junk like this and decide that it is in fact junk. If you sat through the 2016 election campaign thinking that this con-artist oligarch was going to drain the swamp, the problem is you. If you thought that Clinton’s email practices were the worst security issue in the election, the problem is you. If you honestly believed The Young Turks or Jacobin mag when they told you Clinton was more militarist than Trump, the problem is you. If you believed Glenn Greenwald when he told you the real threat to American security was Clinton’s surveillance and security policies, the problem is you. If you believed that Trump cared more about working people than Hillary Clinton, then the problem is you. This stuff was all obvious and objectively checkable and easy to read, and you didn’t bother. The problem is not that Facebook was used by a shady right wing mob to manipulate your opinions into thinking Clinton was going to start world war 3 and hand everyone’s money to the bankers. The problem is that when this utter bullshit landed in your feed, you believed it.

Of course the problem doesn’t stop with the consumers of media but with the creators. Chris Cillizza is a journalist who hounded Clinton about her emails and her security issues before the election, and to this day continues to hound her, and he worked for reputable media organizations who thought his single-minded obsession with Clinton was responsible journalism. The NY Times was all over the email issues, and plenty of NY Times columnists like Maureen Dowd were sure Trump was less militarist than Clinton. Fox carefully curated their news feed to ensure the pussy-grabbing scandal was never covered, so more Americans knew about the emails than the pussy-grabbing. Obviously if no one is creating content about how terrible Trump is then we on Facebook are not able to share it with each other. But again the problem here is not Facebook – it’s the American media. Just this week we learn that the Atlantic, a supposedly centrist publication, is hiring Kevin D Williamson – a man who believes women who get abortions should be hanged – to provide “balance” to its opinion section. This isn’t Facebook’s fault. The utter failure of the US media to hold their government even vaguely accountable for its actions over the past 30 years, or to inquire with any depth or intelligence into the utter corruption of the Republican party, is not Facebook’s fault or ours, it’s theirs. But it is our job as citizens to look elsewhere, to try to understand the flaws in the reporting, to deploy our education to the benefit of ourselves and the civic society of which we are a part. That’s not Facebook’s job, it’s ours. Voting is a responsibility as well as a right, and when you prepare to vote you have the responsibility to understand the information available about the people you are going to vote for. 
If you decide that you would rather believe Clinton killed Seth Rich to cover up a paedophile scandal, rather than reading the Democratic Party platform and realizing that strategic voting for Clinton will benefit you and your class, then the problem is you. You live in a free society with free speech, and you chose to believe bullshit without checking it.

Deleting Facebook won’t solve the bigger problem, which is that many people in America are not able to tell lies from truth. The problem is not Facebook, it’s you.

 

Mushroom man on the spit!


I just finished reading episode 1 of this entertaining and weird manga, called Dungeon meshi in Japanese, by Ryoko Kui. It’s the tale of a group of adventurers – Raios the fighter, Kilchack the halfling thief, and Marshille the elven wizard – who are exploring a dungeon that is rumoured to lead to a golden kingdom that will become the domain of whichever group of adventurers kill the evil wizard who has taken it over. The story starts with them having to flee a battle with a dragon, which swallows Raios’s little sister whole. She manages to teleport the rest of the party out of the dungeon in an act of self sacrifice, and they decide that they should go back in and save her from the dragon. They could wait and resurrect her from its poo, but they decide they would rather go in, kill it and cut her out of its belly (dragon digestion is very slow). No answers are forthcoming to the question of why she can’t just teleport herself out as well, or how she will survive in a dragon’s belly, but I’m sure the reasons are clear.

Anyway, because they left all their gear and loot behind when they fled, they would need to sell their armour and weapons and downgrade in order to make enough money to buy supplies for the return trip. Also they don’t have time to go back to town and get more stuff. So they decide to go straight back into the dungeon and live on a subsistence diet of whatever they can gather and kill in the dungeon. This is particularly appealing to Raios, who has always secretly wanted to eat the creatures he kills (when he tells them this, Marshille and Kilchack decide that he’s a psychopath, but they ain’t seen nothing yet …) Off they go!

They soon run into a dwarf called Senshi who has spent 10 years exploring the dungeon and learning to cook its monsters. Raios has a book of recipes but Senshi tells him that’s all bullshit, and teaches them to cook as they go. Senshi has always wanted to eat a dragon, so he offers to join them and help in their quest. Thus begins the long process of returning to the deepest levels of the dungeon, one meal at a time …

The food chain, in the dungeon


This manga is basically a story about a series of meals, with some lip service to killing the monsters that go in the meals. It starts with a brief description of the ecology of dungeons, which sets out a nice piece of Gygaxian naturalism, along with the food pyramid suitably reimagined for mythical beasts, and gives us a tiny bit of background about the dungeon crawling industry, which is so systematized as to be almost industrial in its scope. Once we have this basic background we’re off on a mission to eat everything we can get our hands on: Mushroom men, giant scorpions, giant bats, basilisk meat and eggs, green slimes (which make excellent jerky apparently), mandrake, carnivorous plants and ultimately a kind of golem made of armour. In the process they make some discoveries about the nature of the beasts – for example, Marshille discovers that you can use giant bats to dig up mandrake and that a mandrake tastes differently depending on whether you get it to scream or not, and the golem is actually armour that has been animated by a strange colony of mollusc-like organisms that are excellent when grilled in the helmet or stir-fried with medicinal herbs.

Giant scorpion and mushroom man hot pot


Plus, we get recipes, which are detailed and carefully thought-out and also slightly alarming. For example, for the mushroom man and giant scorpion hot pot (pictured above) we get to see the team slicing open the body of a mushroom man, which is kind of horrific. The final meal of this issue, the walking armour, is particularly disturbing, since the crew basically sit around in a room plying mollusc flesh out of the pieces of an empty suit of armour, then grill them, except the head parts, which they cook by simply sticking the entire helmet on the bbq and waiting for them to fall out as they roast. It’s made clear that the armour is operated by an interlocking network of separate mollusc-things that have some kind of group sentience, but then once they manage to drag some out of the armour they slip them into a bowl of water and declare happily “they drowned!” Really it’s just like eating a big sentient shellfish. i.e. completely disgusting, in a disturbingly fascinating way.

Each recipe also comes with a disquisition on its nutritional benefits (and the importance of a balanced diet), along with a spider diagram showing the relative magnitude and balance of different ingredients (in the bottom right of the picture above, for example). In some cases special preparation is required – the green slime needs to be dried for several weeks, but fortunately Senshi has a special portable net for this task, and a green slime he prepared earlier which the crew can sample. In other cases, such as the basilisk, medicinal herbs of various kinds need to be included with the meal, which sadly makes it impossible for the reader to make their own roast basilisk, lacking as we do the necessary ingredients to neutralize the poison in the basilisk after we catch it. There are also tips on how to catch the ingredients – the basilisk has two heads for example but only one brain, so you can confuse it if you attack both heads at once – and some amusing biological details too. For example, it is well known that chimaera made from more than two animals are not good to eat because they don’t have a main component of their structure, while chimera of just two animals – like the basilisk – will adopt the taste and general properties of whatever their main animal is (in this case, a bird)[1].

In addition to the rather, shall we say, functional approach to non-human creatures, the story also has some quite cynical comments on the adventuring business. During the encounter with the carnivorous plant, for example, they find a half-digested body. They feel they should return this body to the surface, but just like climbing Everest, they don’t want to go back up till they reach their goal, so instead they leave it in the path for a returning group to deal with. Realizing this might cause someone to trip, they arrange to hang it from a tree by a rope in what is, essentially, a mock execution, and then they go to sleep underneath it (Marshille, unsurprisingly, has bad dreams). To counter this cynicism Marshille acts in part as the conscience of the group, spinning on her head in rage at one point when they suggest eating something, and refusing outright to eat humanoids, but she is usually overruled and then forced to admit that yes actually this meal is quite delicious. Marshille seems to be the stand-in for the reader, since she generally expresses the disgust that the reader is likely (I hope!) to feel, and also gets things explained to her obviously for our benefit (this comes across as very man-splainy, since it’s the male fighter telling her how the world really is, but since she spends most of her time responding in apoplectic rage, it’s bearable).

Beyond its cynical but loving commentary on the world of dungeon crawling, its fine recipes and detailed exposition of dungeon ecology, this book is also a careful retelling of a staple of Japanese television entertainment – the cooking variety show. Anyone who has spent more than about a minute in Japan will have noticed that Japanese television is heavily dominated by variety shows about food, and a common format is for a group of stars and starlets to go to a remote town and sample its local delicacies. Usually this happens in rural Japan, though it can also often be seen in overseas settings, and it always involves a brief description of what is special about how the food is prepared and the ingredients obtained, and then a scene where everyone eats it and says “delicious”, and if there is a starlet involved she will be the one asking the questions while an older person (usually male) explains things to her. So this manga is an almost perfect recreation of that format, except with adventurers instead of starlets and magical creatures instead of standard ingredients. Also, the food shows usually don’t go beyond saying oishii over and over, but in the book we get more detailed expressions of the nature of the food, its texture and taste, which is just great when you’re talking about a humanoid mushroom.

Part RPG dungeon crawl, part variety show, part ecological textbook, this manga is a simple, pleasant read with an engaging story and two entertaining characters (the dwarf and the elf). It’s a really good example of the special properties of manga as a story-telling medium, since the entire idea and its execution would be almost impossible in short story or novel form, but is really well-suited to words with pictures. The pictures give it a more visceral feeling than if you were simply reading a short story about a dungeon cooking show, but the manga format gives I think more detail to the food and science descriptions than you would get in a TV drama. It’s a great balance, and an entertaining read. From a non-native Japanese perspective, it has the flaw that the kanji don’t have furigana (the hiragana writing by the side of the kanji which makes them easy to read), so it takes a while for a non-expert reader to get through, but it doesn’t have the heavy use of slang language and transliteration of rough pronunciation that you see in comics like One Piece, which makes them almost unreadable to non-experts. In general the grammar is simple and straightforward, though sometimes Senshi’s speaking style is overly complex and he uses weird words. In some manga, and especially in novels, the sentences are long and complex and very hard to read for slow readers, but here the sentences are short and straightforward, and the language is mostly standard Japanese. I found I could read in ten page blocks without too much difficulty, using a kanji lookup tool on my phone (I use an app called KanjiLookup that enables me to write them with my finger, which I’m not very good at but a lot better at now I have read this whole manga). 
After about 10 pages I get sick of constantly referencing the app and put the book down, but it’s not so challenging that I gave up entirely, probably because of the simple language and the short sentences and the very clear link between what is being said and what is being depicted. So as a study exercise I recommend it. As a cookbook or a moral guide, not so much …

 

 


fn1: I’m pretty sure the “basilisk” in this story was actually a cockatrice.

Hot on the heels of a (probably wrong) paper on ivory poaching that I criticized a few days ago, Vox reports on a paper that claims schools that give away condoms have higher teen pregnancy rates. Ooh look, a counter-intuitive finding! Economists love that stuff, right? This is a bit unfortunate for Vox since the same author has multiple articles from 2014 about rapidly falling birth rates that are easily explained by the fact that teenagers are really good at using contraceptives. So which Vox is correct, 2014 Teens-are-pregnancy-bulletproof Vox that cites national pregnancy and abortion stats, or 2016 give-em-condoms-and-they-breed-like-rabbits Vox that relies on a non-peer-reviewed article by economists at NBER? Let’s investigate this new paper …

The paper can be obtained here. Basically the authors have found data on school districts that did or didn’t introduce free condom programs between 1989 and 1993, and linked this with county-level information on teen birth rates over the same period. They then used a regression model to identify whether counties with a school district that introduced condom programs had different teen pregnancy outcomes to those that didn’t. They used secondary data, and obtained the data on condom distribution programs from other journal articles, but because population information is not available for school districts they used some workarounds to make the condom program data work with the county population data. They modeled everything using ordinary least squares (OLS) regression. The major problems with this article are:

  • They modeled the log of the birth rate using OLS rather than directly modeling the birth rate using Poisson regression
  • Their tests based on ratios of teen to adult births obscure trends
  • They didn’t use a difference in difference model

I’m going to go through these three problems of the model, and explain why I think it doesn’t present the evidence they claim. But first I want to just make a few points about some frustrating weaknesses in this article that make me think these NBER articles really need to be peer-reviewed before they’re published.

A few petty complaints about this article

My first complaint is that the authors refer to “fighting AIDS” and “AIDS/HIV”. This indicates a general lack of familiarity with the topic: in HIV research we always refer to the general epidemic as the HIV/AIDS epidemic (so we “fight HIV/AIDS”) and we only refer to AIDS specifically when we are referring to that specific stage of progression of the disease. This isn’t just idle political correctness: patterns of HIV and AIDS differ widely depending on the quality of notification and the use of treatment (which delays progress to AIDS), and you can’t talk about AIDS by itself because the relationship of AIDS and HIV prevalence depends highly on the nature of the health system in which the disease occurs. The way the authors describe the HIV epidemic and responses to it suggests a lack of familiarity with the literature on HIV/AIDS.

This sloppiness continues in their description of the statistical methods. They introduce their model as follows:

Condom model

But on page 10 they say that the thetas represent “county and year dummies” and that the T_c represents “county-specific trends”. These are not dummies. A “dummy” is a variable, not a parameter, and “dummies” for these effects should be represented by an X multiplied by a theta. In fact the theta and T_c are parameters, and in any kind of rational description of a statistical model this model is written wrong. It should be written with something like θ_c X_c, where X_c is the dummy[1].

This kind of sloppiness really offends me about the way economists describe their models. This is a simple OLS regression of the relationship between the log of birth rate and some covariates. In epidemiology we wouldn’t even write the equation, we would just list the covariates on the right hand side. If anyone cares about the equation, it’s always the same and it’s in any first year textbook. You don’t make yourself look smart by writing out a first year sociology equation and then getting it wrong. Just say what you did!

So, with that bit of venting out of the way, let’s move on to the real problems with the article.

Another model without Poisson regression

The absolute gold standard correct method for modeling birth rates is a Poisson regression. In this type of equation we model counts of births directly, and incorporate the population as an offset. This is a special case of a generalized linear model, and it has a special property that OLS regression does not have: the variance of the response is directly related to the magnitude of the response. This is important because it means that the uncertainty associated with counties with small numbers of births is not affected by the counties with large numbers of births – this doesn’t happen with OLS regression. Another important aspect of Poisson regression is that it allows us to incorporate data points with zero births – zero rates are possible.

In contrast the authors chose to use an OLS regression of the log of the birth rate. This means that there is a single common variance across all the observations, regardless of their actual number of births, which is inconsistent with the behavior of actual events. It also means that any counties with zero births are dropped from the model, since they have no log value. It also means that there is a direct linear relationship between the covariates on the right hand side of the model and the outcome, whereas in the Poisson regression model this relationship is logarithmic. That’s very important for modulating the magnitude of effects.

The model is, in fact, completely inappropriate to the problem. It will give the wrong results wherever there are rare events, like teenage births, or wherever there are big differences in scale in the data – like, say, between US counties.
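To make the contrast concrete, here is a minimal sketch in Python using invented county data (every number below is hypothetical, and so are the function names). It assumes the simplest possible case: a single 0/1 program covariate. In that case the Poisson maximum likelihood estimate with a log(population) offset has a closed form (total births over total population in each group), so no fitting library is needed. The log-rate OLS slope is just the difference in mean log rates, with zero-birth counties dropped. This is not the paper's model, only an illustration of how the two approaches weight small counties differently.

```python
import math

# Hypothetical county data: (teen births, teen population, has_program).
# All numbers are invented for illustration.
counties = [
    (200, 10000, 0),  # large control county
    (0, 50, 0),       # tiny control county with zero births
    (1, 100, 0),      # small control county
    (210, 10000, 1),  # large program county
    (3, 100, 1),      # small program county
    (2, 100, 1),      # small program county
]


def poisson_log_rate_ratio(data):
    """Poisson MLE for a single 0/1 covariate with a log(population) offset.

    For this saturated model the fitted rate in each group is simply
    total births / total population, so no iterative fitting is needed.
    """
    births = {0: 0, 1: 0}
    pop = {0: 0, 1: 0}
    for y, n, g in data:
        births[g] += y
        pop[g] += n
    return math.log((births[1] / pop[1]) / (births[0] / pop[0]))


def ols_log_rate_difference(data):
    """OLS of log(rate) on a single 0/1 covariate.

    The slope is the difference in mean log rate between the groups.
    Counties with zero births have no log rate and must be dropped.
    """
    logs = {0: [], 1: []}
    dropped = 0
    for y, n, g in data:
        if y == 0:
            dropped += 1
            continue
        logs[g].append(math.log(y / n))
    slope = sum(logs[1]) / len(logs[1]) - sum(logs[0]) / len(logs[0])
    return slope, dropped


poisson_est = poisson_log_rate_ratio(counties)
ols_est, n_dropped = ols_log_rate_difference(counties)
print(f"Poisson rate ratio: {math.exp(poisson_est):.2f}")
print(f"log-OLS rate ratio: {math.exp(ols_est):.2f} ({n_dropped} county dropped)")
```

On this made-up data the log-OLS estimate is much larger than the Poisson one, because every county gets equal weight regardless of how few births it has, and the zero-birth county vanishes entirely.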

Obscuring trends with a strange transformation

I mentioned above that the article also uses the ratio of teen to adult births (in age groups 20-24) to explore the effect of condom use. Figure 1 shows the chart they used to depict this.

Figure 1: The weird condom diagram


 

Note that the time axis is in years before and after implementation of the program. This is a highly deceptive figure, because the schools introduced condom programs over 4 years, from 1989 to 1993. This means that year 0 for one school district is 1989, while for another it is 1992. If teen births are increasing over this period, or adult births are decreasing, then the numbers at year 0 will be rates from four different years merged together. This figure is the mean, so it means that four years’ worth of data are being averaged in a graph that only covers ten years’ worth of data. That step at year 0 should actually occur across four different points in time, within a specific time trend of its own, and can’t be simplified into this one diagram.

Note that the authors only show this chart for the schools that introduced a condom program. Why not put a similar line, perhaps in a different color, for school districts that didn’t? I suspect this is because the graph would contradict the findings of the model – because either the graph is misrepresentative of the true data, or the model is wrong, or both.

This graph also makes clear another problem with this research: the authors obviously don’t know how to handle the natural experiment they’re conducting, since they don’t know how to represent the diverse start points of the intervention, or the control group.
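The problem with aligning everything at "year 0" can be sketched with a toy example. The Python below is entirely hypothetical: it assumes four districts starting programs in 1989 through 1992, and a made-up jump in the teen/adult birth ratio that hits every county in 1991 regardless of programs. It then averages the ratio in event time, the way the figure does.

```python
# Hypothetical setup: four districts start programs in different years, and
# every county (program or not) sees the same jump in the teen/adult birth
# ratio in 1991. Nothing here depends on the program itself.
start_years = [1989, 1990, 1991, 1992]
SHOCK_YEAR = 1991


def ratio_in(calendar_year):
    """Made-up birth ratio: 1.0 before the common shock, 1.2 after."""
    return 1.2 if calendar_year >= SHOCK_YEAR else 1.0


def calendar_years_at(event_time):
    """Calendar years that get merged into one event-time point."""
    return [start + event_time for start in start_years]


def event_time_mean(event_time):
    """Average the ratio across districts at a given year relative to start."""
    values = [ratio_in(year) for year in calendar_years_at(event_time)]
    return sum(values) / len(values)


for t in range(-2, 3):
    print(t, calendar_years_at(t), round(event_time_mean(t), 2))
```

In this toy setup the event-time average climbs steadily through year 0 even though the program does nothing at all, which is why a plot in calendar time, with a control line, would be more informative.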

Lack of a difference in difference model

The authors include a term for the effect of introducing condom distribution programs, but they don’t investigate whether there was a common effect across condom distribution and non-condom distribution regions. It’s entirely possible that school districts without condom distribution programs also saw an increase in teen pregnancies. 1989 is when MTV came out, after all, and all America went sex crazy. It’s also the year of Like a Prayer, and Prince’s song Cream was introduced in 1991. Big things were happening in teen sexuality in this period, and it’s possible these big things were way bigger than the effect of government programs.

Statistics is equal to any challenge, though[2]. We have a statistical technique for handling the effect of Miss Calendar grooving on a wire fence. A difference-in-difference model would enable us to identify whether there was a common effect during the intervention period, and the additional effect of condom promotion programs during this period. Difference-in-difference models are trivial to fit and interpret, although they involve an interaction term that is annoying for beginners, and they make a huge difference to the interpretation of policy interventions – usually in the direction of deciding the intervention made no difference. Unfortunately the authors didn’t do this, so we see that there was a step change in the intervention group, but we don’t see if there might have been a similar step change in the control group. This effect is exacerbated by having county-specific time trends, since it better enables the model to adapt to the step in the control group through adaptively changing these county-specific trends. This means we don’t know from the model if the effect in the intervention group was really confined to the intervention group, and how big it really was.
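For the simplest two-group, two-period case, the difference-in-difference idea can be sketched in a few lines of Python. The counts below are invented, and the cell labels are my own. With pooled counts and a log(population) offset, the interaction coefficient of a saturated Poisson model is just the log of the ratio of the two rate ratios, so it can be computed by hand. A real analysis with staggered start dates would need a properly fitted model; this only shows what the interaction term is measuring.

```python
import math

# Hypothetical pooled counts: (teen births, teen population) per cell.
cells = {
    ("control", "pre"): (400, 20000),
    ("control", "post"): (480, 20000),
    ("program", "pre"): (300, 15000),
    ("program", "post"): (375, 15000),
}


def rate(group, period):
    """Births per head of population in one group/period cell."""
    births, population = cells[(group, period)]
    return births / population


def log_rate_ratio(group):
    """Log of the post/pre rate ratio within one group."""
    return math.log(rate(group, "post") / rate(group, "pre"))


# Naive estimate: the step change in program counties only.
naive = log_rate_ratio("program")

# Difference-in-difference: subtract the step seen in control counties.
# In a saturated Poisson model with a log(population) offset this equals
# the coefficient on the program x period interaction term.
did = log_rate_ratio("program") - log_rate_ratio("control")

print(f"naive rate ratio: {math.exp(naive):.3f}")
print(f"DiD ratio of rate ratios: {math.exp(did):.3f}")
```

Here the program counties show a 25% rise on their own, but most of it is shared with the control counties, so the effect attributable to the program is far smaller.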

The correct model

The correct model for this problem is a Poisson regression modeling teen births directly with population as an offset, to properly capture the way rates change. It would be a difference-in-difference model that enables the effect of the condom programs to be extracted from any general upward or downward steps happening at that time. In this model, figure 1 would be replaced by a spaghetti plot of all the counties, or mean curves for intervention and control not rescaled to ensure that the intervention happens at year 0 for all intervention counties, which is misleading. Without doing this, we simply have no evidence that the condom distribution programs did what the authors claimed. The ideal model would also have a further term identifying whether a condom program did or didn’t include counselling, to ensure that the authors have evidence for their claim that the programs with counselling worked better than those without.

I’m partial to the view expressed that counselling is necessary to make condom programs work, but Vox themselves have presented conflicting evidence that teenagers are perfectly capable of using condoms. Given this, explicitly investigating this would have provided useful policy insights. Instead the authors have piled speculation on top of a weak and poorly-designed statistical model. The result is a controversial finding that they support only through very poor statistical modeling.

The correct model wouldn’t have been hard to implement – it’s a standard part of R, Stata, SPSS and SAS, so it’s unlikely the authors couldn’t have done it. It seems to me that this poor model (and the previous one) are indicative of a poor level of statistics and research design teaching in economics, and a lack of respect for the full diversity of statistical models available to the modern researcher. Indeed, I have a Stata textbook on econometrics that is entirely OLS regression – it doesn’t mention generalized linear models, even though these are a strong point of Stata. I think this indicates a fundamental weakness in economics and econometrics, and leads me to this simple bit of advice about models of health and social behavior prepared by economists: they’re probably wrong, and you shouldn’t trust them.

I hope I’m wrong, and Vox don’t keep vexing me with “explainers” about research that is clearly wrong. I don’t hold out much hope …


fn1: for those digging this far, or who often stumble across this horrible term in papers they read, a “dummy” is just a variable that is either 0 or 1, where 1 corresponds to the event of interest and 0 to not the event of interest. In epidemiology we would just say “we included sex in the model”. In economics they say “we included a dummy for sex.” This is just unnecessary jargon.

fn2: Except the challenge to be fun.

Not militaristic at all ...

I am not a big fan of baseball, and I didn’t enjoy my high school days overmuch. Combining these two seems like a recipe for a bullying and unpleasant experience, and definitely not something I would have any interest in.

The Koshien, however, changed my mind about high school baseball. The Koshien (甲子園) is an annual high school baseball contest that takes place across all of Japan, and comes to its glorious, bittersweet climax during the hottest months of the year – this week, in fact, in mid-August. High school baseball teams compete to become prefectural champions, and champions from each prefecture – two from Tokyo – then converge on Kobe in August for the finals. The finals are a knockout, with four matches played every day to whittle the teams down from 48 to 32, then through knockout rounds to the final, which happens to be tomorrow. Each match is 1.5 to 2 hours long and is played under the punishing August sun, in extremely harsh conditions[1]: temperatures above 32C (often over 35 this year!) and very high humidity. Today, for example, was 32C with 82% humidity and much, much more pleasant than last week when the quarter finals were being decided. The teams have to play continuously too: the semi final was today and the final is tomorrow, which means that the pitchers in the final will have been playing every second day now for a week or more in this heat.

When I first saw the Koshien a few years ago I dismissed it without watching it. Baseball in Japan is renowned for its bullying atmosphere, which verges on militaristic at times, and the idea of making schoolboys of 16-18 years of age play a contest in the middle of the day in this heat is a classic representation of just how callous and brutal its culture is. But this year one of my students revealed to me her passion for it, showed me the website and sang the praises of its passion and energy. Since I had a week off for the summer break I thought I’d check it out – and I was hooked immediately. It’s amazing.

It isn’t just the contest itself that is great – in fact that’s barely part of it at all. Rather, the culture and the style and excitement of the entire series give it a feeling that ordinary baseball just can’t get. Similar to cricket at its best, it has its own sound and pace, and the crowd are as much a part of the event as the teams. Every team brings a huge contingent of supporters, wearing school colours and usually including a school band and cheerleaders, who make a constant racket throughout the game. This highlight reel is a good example of the sound of the game – the school song (or a supporter’s chant) playing in the background, drums, pipes, cheering, and the flash of pom-poms as the cheerleaders go wild on a home run. At the end of the reel you can just hear the announcer in a classic, high-pitched voice introducing the next batter, with the honorific “kun” at the end to remind everyone that these heroes of ours are actually just high school kids. During the match the commentators prowl the stands interviewing fans, and showing the world what ingenious support methods the schools have thought up; they read support messages from school children and adults around the country, and every day they have a different pro-baseballer on to help with the commentating. This year the commentators have identified a man they call “Rugger san” (Mr. Rugger) who sits in the same place directly behind the batter in the front row, and is so named because he wears a rugby shirt every day – he has been there the entire two week period. It’s a serious, extravagant two week festival of sport, very similar to the Ashes or Sumo in the strength of its associated support culture, its deep connection with a season, and its importance to ordinary sports fans. But in this case it has its own bittersweet feel, because these are boys near the end of high school, who are going to get one – maybe two, for the younger ones – shots at glory, then graduate and move on with their lives and leave this fleeting moment of fame and joy behind them forever.

And this is where the Koshien really makes its mark, because it captures something about the strange and furious passion with which Japanese people look back on their high school days. From the west looking in we are often led to believe that Japanese high school is a terrible place, strictly regimented, hierarchical, full of bullying, where the creativity is drained out of little humans ready to turn them into drones for Japan’s massive corporate machine. But Japanese people see it very differently – to them High School is a period of freedom, openness, and passion, this sunny couple of years of freedom before they hit the regimentation of the outer world. High School is where a lot of Japanese people experience first love, and it is also the time when they form deep bonds of friendship that will last them through many years, even though they will likely move away from home for university and work, and only see those old high school friends once a year. This disparity between the western view of Japanese school and the local view is really striking – Japanese people I speak to are very often deeply nostalgic for their high school days, which they describe to me as a time of freedom and happiness. This is especially noticeable when you mention the Koshien to anyone who is old enough to have begun forgetting their high school days: they will become instantly, powerfully nostalgic, and it’s clear that the word conjures up sounds and scenes that remind them instantly of everything they left behind when they left school. On the weekend I mentioned that I had watched the Koshien to my hairdresser, and even though he was a rugby player at school[2], not a baseball player, he immediately became misty-eyed, singing the praises of the event and its special meaning in the same way as my student.

This passion I think also explains the special role of high school in anime. From the outside looking in there appears to be a strong strain of schoolgirl fetishism, but there’s much more to it than that – anime and manga are also packed with stories about male high school sports clubs, which to me seem like they must be singularly boring tales, and also love stories about high school students. TV shows and manga that feature these high school groups and love affairs and dramas are actually appealing not to some weird fetish for children, but to a strong, nostalgic streak in adults. High school is also the setting in which first love occurs in Japan, and at least historically may have been the only time when Japanese people were truly free to form partnerships out of love rather than convenience and good sense. This is why so much of anime and manga incorporates this setting, and this is why the schoolgirl’s uniform and the schoolboy’s baseball kit are so powerfully evocative in this medium. Watching the Koshien helps to make sense of the power of high school in Japanese popular culture. The Koshien packs all those years of yearning for the change to come, of waiting for something to happen, that sense that you are someone special who is ready to bud and explode into the world, into two weeks of intense emotion and self expression, all while sharing that deep bond with your peers that only late adolescents can genuinely and uncynically revel in.

And so, it can even make baseball interesting. Truly, Japanese high school students have magical powers! The final is tomorrow at 1pm Japan time, and I think it can be viewed live on the Asahi TV website. It’s the 100th anniversary of the Koshien, and the final contest is between Kanagawa and Miyagi prefectures. Tune in, and enjoy the unrestrained passions of high school once more!

fn1: People who haven’t spent time in Japan in August tend to poo-poo reports of just how oppressive the heat is, but once one has spent a day here in that season, and wilted under the intensity of the heat, one readily adapts one’s view. Australians really aren’t used to the humidity, so for example although I grew up in a town where daytime temperatures are routinely 8C hotter than Japan in summer, without airconditioning, I find Tokyo in summer far worse. It’s not just the urban heat island effect, which in Tokyo is extreme: basically it’s as if a huge mass of hot air rolled in off the ocean at the end of July, squatted down and decided to stay. There is very little wind, night time temperatures do not drop below 25 or 26 C, and usually there are very few clouds, but it is still so hot that everyone sweats just sitting still. It’s exhausting at 32C, but when it hits 35C it’s potentially dangerous …

fn2: In Japan hairdressing is a macho job and male hairdressers are rough, macho figures, so this makes perfect sense.

British elections primarily interest me from a watching-the-train-continue-to-crash perspective, because I don’t think the UK has much to teach the rest of the world on how to run a social democracy well. The electoral system is completely broken; their Tories are the very picture-perfect image of the born-to-rule upper class who don’t care, their Labour party is weak and achieved its only long run in modern politics by electing a vampire; their only “functioning” industry is banking, and by extension the only economic plan either party has is to keep bankers rich and use the taxes to buy off everyone else; and their media are rotten. However, there are two aspects of British elections that interest me from a policy perspective: what they are going to do about the NHS, and what they are going to do about their terrible education system.

Before the election I was going to write about both of these, but got lazy. My first post was intended to be about the perils for Labour of “weaponising” the NHS (which I think they obviously have done), but the election outcome kind of made my point for me on that regard. However my second post was going to be about Labour’s education policy, which seemed to be the most sensible thing anyone had presented in the entire election period and thus, of course, the only thing that got no coverage. Sadly, that election policy is now going to be dead for at least five years, which leaves the Tories free to pursue their ideologically-driven and intellectually bankrupt, evidence-free Free Schools Policy.

The Labour education policy included two interesting and positive moves, and one very realistic and sensible principle. The first, and in my opinion biggest, move was a plan to make mathematics education compulsory to 18 years. As someone with a strong bias towards maths education, and someone who thinks that mathematics ability is more about education than talent, this plan really appealed to me as a way to turn around Britain’s woeful mathematics performance. The policy received support from an Oxford mathematics professor, Marcus du Sautoy, who is kind of famous, and also from the head of Britain’s National Numeracy charity, who said

We really need to challenge negative attitudes that assume that maths is a ‘can do’ or ‘can’t do’ subject. It is not. Everyone can – with effort and persistence – learn the maths they need for everyday life and work

Which is something I very strongly agree with, but something which apparently many British children are struggling to realize, with the result that Britain consistently underperforms its OECD peers in mathematics. It’s really sad to me that the country that did more than any other to advance statistics and mathematics has decided to abandon the census, and basically given away all its mathematical advantages to the USA and Europe, and Hunt’s policy seems like it would have been a first step to undoing this problem. I guess it’s just as well 16 year olds can’t vote though, because that policy alone would be enough to have the entire age cohort rushing to vote Tory …

The second policy, perhaps much less comprehensible outside of the UK, was a plan to abolish GCSEs and introduce a 10-year reform of education. This would break the long-standing division of British schools into technical and academic grades, recognizing that education in the 21st century isn’t just about getting a job and that a formal education until 19 is valuable to everyone in the modern world, not just those planning on going on to further education. This kind of reform finally breaks down an old-fashioned idea derived from Britain’s class structure, and is essential to getting rid of that structure. Of course it’s not enough, but it’s a start. Furthermore, Tristram Hunt, the education spokesperson, made clear that they would not set forth on these reforms straight away, but would aim to enact them over two parliaments, giving teachers a break from the constant annoying reorganizations they are forced through every five years and building a coherent, long-term strategy for the system. This kind of long-term thinking is rare in any policy area from modern politicians, and when I read it before the election I was very surprised and hopeful that Britain might finally be making a positive step out of its education doldrums, and maybe even towards sensible policy.

Sadly, though, the election was dominated by Labour talking about the NHS and the Tories wailing about blue-skinned picts invading the mainland, and rational policy-making didn’t get a look in. So I guess now Britain gets the Tory bootheel it asked for. With a Tory majority you can bet that sensible education for the masses will not be part of the policy mix … I wonder if Tristram Hunt even kept his seat?

Another perfect moment in British colonial development

I am in London for a week doing some research with small area analysis, and on the weekend had a brief opportunity to actually see the city. As is traditional by now on my annual trips to London, I visited the World Wildlife Photography Exhibition (which was a bit weak this year, I thought), and having a bit of time to kill wandered up the road to the Science Museum. Here I stumbled on a small and interesting exhibition entitled Churchill’s Scientists, about the people that worked with Winston Churchill before, during and after the war on various projects, and Churchill’s powerful influence on British science.

This year will see the 70th anniversary of the end of the war, and you would think by now that popular culture of the victorious countries would finally have got to the point where it is able to handle a more nuanced analysis of the politics of that time than mere hagiography. It’s clear that the allied powers were uncomfortable about some of their actions during the war: the careful elision of Arthur “Bomber” Harris and his fliers from peacetime awards is an example of British squeamishness about the morality of the bomber war, but this squeamishness doesn’t seem to have manifested itself in any kind of clear critical reevaluation of the behavior of the allies at war, at least in popular culture. This silence is starting to be broken by, for example, Antony Beevor’s uncomfortable discussion of rape in Berlin, or his discussion of the treatment of collaborator women in Normandy; but it is generally absent from public discussion. Churchill’s Scientists is, sadly, another example of this careful and deliberate overlooking of the flaws of wartime leaders and their politics when presented in popular culture.

The exhibition itself is small and interesting, walking us through various aspects of the scientific endeavours of the pre-war and post-war eras. It describes the scientists who worked with Churchill, their relationship with him and the public service, and how science was conducted during the war. Churchill was very close friends with a statistician who advised him on all aspects of war endeavours, and also was very supportive of operational research, which was basically an attempt to revise wartime strategy on the basis of evidence. The achievements of these scientists given their technological limitations are quite amazing: drawing graphs by hand on graph paper to attempt to explain every aspect of the statistics and epidemiology of rationing, conducting experiments on themselves to understand the effects of low-calorie diets, and feverishly working to improve tactics and technologies that were valuable to the war. The post-war efforts were also very interesting: there is a life-size installation showing the original model of myoglobin, which was studied using x-ray crystallography and then built by hand using cane rods and beads to create the three-dimensional structure. There is a telling quote about how scientists became used to asking not “how much will it cost” but “how quickly can we get it done and what do we need?” There are also some interesting examples of how the wartime expectations of scientists translated into peacetime success: they had contacts in the ministries from their wartime work, they were used to having funds and knew how to raise money, and they had access to hugely increased resources as the ministries dumped wartime surplus in universities and research institutes. In the 1950s this translated into rapid advances in medicine, genetics, nuclear power and astronomy, all of which are documented in the exhibition.

There are, however, some political aspects that are overlooked. Currently in the UK there is an ongoing debate about whether to stop conducting the Census because it costs too much, and it is clear that since the war there has been a shift in funding priorities and a move away from the idea that science should be funded at any cost. I would have been interested to find out how this happened: did Churchill change his attitude towards funding for science or was this a post-Churchill trend? Was Churchill the last of the Great Investors? What did subsequent conservative party leaders make of his legacy and how do they talk about it? Why is it that the country that invented radar, that perfected antibiotic production, and that contributed more than any other to modern geographical statistics and demography, can no longer “afford” the census? Was the war a high point and an aberration in the history of British science funding? Did its successes distort the post-war scientific landscape and expectations? None of this is really described in the exhibition, which limits itself to Churchill’s positive legacy, and doesn’t seem to want to explore how it was undone. There is also a bit of attention paid to female scientists in Churchill’s war efforts, including women who developed X-ray crystallography and did important nutritional epidemiology research. We also know that much of the computational work done in the war and immediately after was done by women, who were slowly squeezed out of the industry after the war. I would have been interested in some description of what happened to all those female scientists and ancillary staff after the war – were they forced out of science the way women were forced out of factory work, or did Churchill’s support for women in science during the war permanently change the landscape for women in science? It seems clear that Watson and Crick’s work – initially sparked by x-ray images of the DNA that are shown in this exhibition – must have been built on the work of crystallography’s pioneers, who were women. But where did those women end up when the war effort wound down?

The other aspect of this exhibition that is sadly missing is a discussion of Churchill and his scientists’ darker sides. We are introduced to the exhibition through Churchill’s love of flying; the website for the exhibition quotes him talking about new technologies in aerial bombing; and the exhibition itself talks about his support for a British nuclear weapon. But nowhere in the exhibition is his enthusiasm for terror bombing discussed, nor the unsavoury way in which he developed this enthusiasm, running terror bombing campaigns against Iraqi tribespeople in the 1920s. Arthur Harris is only presented once in the exhibition, dismissing a biologist who proposed a campaign of tactical bombing of railway junctions (he “wasted his time studying the sexual proclivities of apes,” was the dismissal); but nowhere is the corollary of this position – Harris’s lust for destroying cities – mentioned, or the extensive scientific work that went into developing the best techniques for burning civilians alive. In the year that western governments will demand Japan apologize for its wartime atrocities (again!), one would think they could at least mention in an exhibition on wartime science the extensive research that went into perfecting the practice of burning Japanese civilians alive.

In case one thinks this might have been just an oversight on the part of the curators, later we see a more direct example of this careful elision, when the exhibit focuses on Britain’s post-war nuclear weapons program. Again, we have been presented with Churchill’s direct interest in blowing stuff up; here we are shown video of a nuclear test, and discussion of the research that scientists were able to do on the environmental and physical effects of the bombs. The exhibition doesn’t mention that many of these tests, conducted in Maralinga in Australia, were conducted on land that Aborigines had been expelled from and were unable to return to. It also doesn’t mention the contamination of Aboriginal customary lands, any possible harmful health effects for Aborigines living in the area, and the controversies of the Maralinga inquiries and subsequent compensation for soldiers and workers. Not even a one sentence reference.

Given that we know Churchill was a deeply racist man who supported colonialism and had no interest in the rights of non-white British, it seems hardly surprising that he might have had a slightly cavalier attitude towards ethics in research and military tactics where it was directed against Iraqi tribesmen or Australian Aborigines. It seems like 70 years after the end of the war it might be possible to start talking about this stuff honestly outside of academia, and to publicly reevaluate the legacy of men like Churchill, and many of his senior scientists, in the light of everything we know now, rather than simply portraying all their efforts through only the lens of wartime heroism. Churchill was undoubtedly a great man and a powerful leader, and the world owes him a debt of gratitude. He was also a racist and a colonialist, and some of the decisions he made before, during and after the war may not have been either right or the best decisions for the time. It also appears that despite his greatness, the legacy of his interest in science and education was soon undone, and the reasons for this are important for us to consider now. What does it say about Britain that 70 years after the end of the war it is still not possible to honestly assess Churchill’s wartime efforts but only to extol his great contribution to science; yet 70 years later his contributions to science have been so far wound back that the government is considering abolishing the Census? Does such hagiography benefit Britain, or British science? I would suggest not.

This year is the 70th anniversary of the end of the war, and we are going to see a lot more public discussion of the actions and contributions of the great people of that time. I fear that this discussion is going to be very shallow, and sadly empty of any attempt to critically reassess the contributions of the people involved, and how they shaped our post-war culture. This exhibition is a good example of how the war will be presented this year: stripped of moral context, all uncomfortable truths banished from discussion, and all long-term ramifications for post-war politics and culture carefully sanitized to ensure that no difficult questions are asked, or answered. Perhaps we aren’t doomed to repeat history, but I think this year at the very least we are going to be bored stiff by it.

Every year I have to teach a basic statistics class to new Master’s students, but every year I find my students come from very diverse mathematical and science backgrounds without necessarily any understanding of the fundamentals necessary to grasp a classical statistics course. I have one year to polish these students up to a level where they can complete a fairly demanding research thesis in their second year, and I also have to get them understanding the fundamental principles of statistics so that when they move on from my department they don’t embarrass themselves or others. Of course, I started teaching statistics to these students from the framework in which I learnt it, but I soon realized that the concepts just weren’t sticking. Not just because the students didn’t understand some of the maths, but because translating ideas from mathematical notation into solid concepts is tough for people who know the maths but don’t have a really strong background in it. It’s like learning something in a second language – you can’t think about the language and grasp the concepts at the same time. But a lot of statistics ends up being done on computers, and in practice people don’t need to know the maths as much as they need to have a good grasp of the concepts.

In addition, I noticed that a lot of what I was teaching based on my classical experience of learning stats in the 1990s was basically deadweight, and some of this deadweight was tough to grasp. So I started thinking about changing the way that I taught the principles, to try and move away from unnecessary mathematics, to remove some of the historical details that crowd a basic stats course long after their expiry date, and to try to find new, practical ways to teach some of the core principles. Because when I sit back and think about the core principles of statistics, there are really only two parts that are tough, and it is those two parts that are, I think, most commonly taught in a clunky and old-fashioned way – but they’re also crucial components to the whole edifice of basic statistics, and I think the alternative to teaching them is often seen as jumping straight to computers, which is in many ways worse. So here I want to outline my revisions to teaching statistics, and the principles behind them.

In a nutshell, I have decided to teach distribution theory by starting with a practical class based on dice; and I have also completely ditched the use of standard tables of distributions. I’m in the midst of thinking what else in the classic statistics curriculum is unnecessary or needs to be radically re-taught.

Teaching distribution theory with dice

This year I trialed a class on distribution theory that I taught using 10-sided dice. My distribution theory class is 3 hours long, so I spent 1.5 hours on a practical with dice, and then I introduced the mathematics of the distributions, as an addendum really to playing with the dice. For the dice class, I divided the students into pairs and gave them 10d10 each. I also handed out an excel spreadsheet that was pre-designed to enable them to generate probability distributions from counts of values they rolled – they could have written this themselves at the beginning of class but this always slows class down, and I don’t want the students wasting time on or getting confused by something which is at this stage just a tool, so I prepared the basic spreadsheet for them.

The practical was then divided into these stages:

  1. Generate a uniform distribution: Choose ONE ten-sided die and roll it multiple times (I suggested 30 times), counting the number of times each number was rolled and entering them in the spreadsheet. Graph the resulting probabilities. Is the die loaded? [In fact most students had loaded dice – one group managed to roll almost entirely 1s and 4s (I think) and another group rolled very few high numbers]. Then show them the theoretical distribution on the board. This distribution is so simple that students immediately understand it. This is the key to linking a practical sense of numbers in with the principles of distribution functions; we will build up to more complex distributions, and we will also be leveraging this question about whether the die is loaded for a bit of a Bayesian chat.
  2. Generate a Bernoulli distribution: Ask one person from each group to pick a number between 6 and 10; this is the threshold on their d10 for a success. Make sure they use the same d10 that they just built a uniform distribution from. Again, get them to roll about 30 times, and generate the resulting distribution. This distribution is so trivial that the students will be wondering what the point is, but it gets you to a very simple couple of questions that bear on the nature of statistical tests. After about 30 rolls the proportion of successes will be pretty close to the “true” proportion – unless their die is loaded. So I asked the students what they thought the probability of success should be, and they all immediately calculated it as the sum of probabilities in the prior uniform distribution. I asked them what the theoretical probability should be, and again they could easily answer this trivial question – and then I asked them to suggest ways to test whether their die differed from the theoretical probability. This is all preparatory to talking about cumulative distribution functions, probability mass and (later) methods for statistical testing. Often at this stage in my class some students don’t even really know why we would do a statistical test, and by posing these questions I present a natural example of a test you might want to do. I also gave a brief explanation of Bayesian statistics here (in a very heuristic way), explaining the relationship between the Bernoulli distribution and the prior Uniform distribution they had rolled, and pointing out how their knowledge about that prior affected their judgment of the true distribution of the Bernoulli. This is all so intuitive with the dice in your hand that it’s impossible to be confused by the theory. Whereas if I had started from Bayes’ theorem and the formula for a Bernoulli distribution the students would be in great pain, even though the maths of each of these ideas is not complex in and of itself.
  3. Generate a sum of uniform distributions: roll 2d10 30 times. Plot the resulting distribution. Of course this distribution is already halfway to being normal (it looks normal), and although we haven’t introduced the maths of the normal distribution everyone knows it from popular culture, so when you say it looks a bit like the bell curve they immediately get it. You can also ask students the probability of a 2, and explain the probability of e.g. a 10; this helps everyone to see in a very practical way just how distorted the probability distribution becomes from just adding two uniform distributions together (I have been doing stats for 20 years and I still think this is a really cool kind of magic!) They can see it in their distribution plot, and they can calculate the probabilities easily from just thinking about the dice. By now everyone is thinking about distributions in a natural and intuitive way, and we haven’t come up with a single actual formula yet.
  4. Generate a binomial distribution: roll all 10 dice and count the successes, following the success rule in 2. Again, this is an example of the Central Limit Theorem, but the probability calculations for the extreme values are even more potent examples of how adding together random variables makes them behave very differently. They’re also now building a real distribution, and can get a real sense of how probability distributions work to describe the real probability of particular events.
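
For anyone who wants to check the dice against a computer, steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration, not part of the class: the threshold of 7, the random seed and all the function names are my own assumptions, and the simulated rolls stand in for the students' real dice.

```python
import random
from collections import Counter
from fractions import Fraction
from math import comb


def roll_d10(rng):
    """One roll of a fair ten-sided die with faces 1..10."""
    return rng.randint(1, 10)


def exact_success_probability(threshold):
    """P(d10 >= threshold): faces threshold..10 count as a success."""
    return Fraction(11 - threshold, 10)


def exact_sum_2d10():
    """Exact distribution of the sum of two d10, by counting all 100 outcomes."""
    counts = Counter(a + b for a in range(1, 11) for b in range(1, 11))
    return {total: Fraction(n, 100) for total, n in sorted(counts.items())}


def exact_binomial(n, p):
    """Exact binomial probabilities for 0..n successes out of n dice."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


rng = random.Random(2024)   # assumed seed, only so the run is repeatable
threshold = 7               # assumed choice; the class picks anything from 6 to 10
p = exact_success_probability(threshold)

# Step 2: 30 rolls of one d10, success if the roll meets the threshold.
bernoulli_rolls = [roll_d10(rng) >= threshold for _ in range(30)]

# Step 3: 30 rolls of 2d10, summed.
sum_rolls = [roll_d10(rng) + roll_d10(rng) for _ in range(30)]

# Step 4: 30 rolls of all 10 dice, counting successes each time.
binomial_rolls = [
    sum(roll_d10(rng) >= threshold for _ in range(10)) for _ in range(30)
]

sum_pmf = exact_sum_2d10()
binom_pmf = exact_binomial(10, p)

print("observed success proportion:", sum(bernoulli_rolls) / 30, "exact:", float(p))
print("P(2d10 = 2) =", sum_pmf[2], "and P(2d10 = 10) =", sum_pmf[10])
print("observed 2d10 sums:", sorted(Counter(sum_rolls).items()))
print("P(no successes on 10 dice) =", float(binom_pmf[0]))
```

The exact probabilities are kept as fractions so that they match what the students work out by counting faces: one way out of 100 to roll a 2 on 2d10, but nine ways to roll a 10.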

Finally, I have the students build cumulative distribution functions, and relate the calculation of probabilities in the cumulative distribution function for the uniform distribution to the calculation they performed in step 2. Having done all of this they are very comfortable with the concept and application of distribution functions. For the second 1.5 hours I then introduced the equations for these distributions, then introduced the normal distribution and plotted it, and talked about its properties. Where they would previously have been looking at equations that are quite daunting for people without much mathematical background, now they were looking at equations they were already familiar with. Knowing the shape and method of forming these distributions, they can focus on the only important point, which is the relationship between x and the probability that comes out of the function.
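
The link between the cumulative distribution function of the d10 and the success probability from step 2 can be written out as a small sketch. The names here (`pmf`, `cdf`, `success_probability`) are just illustrative choices of mine.

```python
from fractions import Fraction
from itertools import accumulate

FACES = range(1, 11)

# Uniform probability mass function: each face of a fair d10 has probability 1/10.
pmf = {face: Fraction(1, 10) for face in FACES}

# Cumulative distribution function: F(x) = P(roll <= x), a running sum of the pmf.
cdf = dict(zip(FACES, accumulate(pmf[face] for face in FACES)))


def success_probability(threshold):
    """P(roll >= threshold) = 1 - F(threshold - 1), taking F(0) = 0."""
    below = cdf.get(threshold - 1, Fraction(0))
    return 1 - below


for threshold in range(6, 11):
    print("threshold", threshold, "-> P(success) =", success_probability(threshold))
```

This is the same sum the students do in their heads in step 2, just read off the cumulative distribution instead of added up face by face.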

Ditching tables

The next step of distribution theory in a traditional stats class is then the tedious task of learning how to calculate cut points of distributions from tables. Having been through the dice exercise the students already have an intuitive feel for cut points and for cumulative distribution functions, so I don’t bother showing them tables. Instead, I give them an Excel spreadsheet that contains the functions to do these calculations, and we work through some examples together. I then explain why we used to have to use tables, but don’t anymore. I explain that the properties of the normal distribution (stability under shifts of location and scale) were useful back in the day when we only had one table to work from, but they’re not anymore. In the past I have noticed that this transformation of the normal distribution really kills a lot of students; it’s really hard for non-mathematicians to think about. But it’s really not important anymore to have to learn about tables. I have new textbooks which still have tables in the back. Why? When was the last time you used a statistical table?
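
I use Excel in class, but the same idea works in any software. As a rough sketch in Python (using the standard library's `statistics.NormalDist`; the mean of 170 and standard deviation of 8 are made-up numbers for illustration), cut points can be read straight off the distribution you care about, with no table and no transformation:

```python
from statistics import NormalDist

# The one distribution the old printed tables covered: N(0, 1).
standard = NormalDist(mu=0, sigma=1)
upper_cut = standard.inv_cdf(0.975)          # about 1.96

# A hypothetical measurement, in its own units (assumed values).
heights = NormalDist(mu=170, sigma=8)

# Cut points computed directly on the original scale.
low = heights.inv_cdf(0.025)
high = heights.inv_cdf(0.975)

# The table-era route: look up the N(0, 1) cut point, then shift and rescale.
high_via_table = 170 + 8 * upper_cut

# A tail probability, again without standardizing first.
prob_above_180 = 1 - heights.cdf(180)

print("N(0,1) upper 2.5% cut point:", round(upper_cut, 3))
print("direct cut points:", round(low, 2), round(high, 2))
print("same upper cut point via the table route:", round(high_via_table, 2))
print("P(height > 180):", round(prob_above_180, 4))
```

Both routes give the same number; the difference is only that the direct one never asks the student to think about shifting and rescaling.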

Putting history in its proper place

This shift to teaching cut-points of distributions first practically and then using Excel is part of a move to dump some of the parts of statistics that are largely of historical value only. A lot of classical statistics was invented for a period when experiments were hard to do and very expensive, but they just aren’t as important anymore, or they have been superseded. For example no one uses correlation as a measure of the relationship between two variables anymore – we just use regression, because it’s much more flexible and by associating the relationship between two variables directly with the line through their scatter plot you force students to think about the possibility that a linear model is inadequate. So why bother teaching correlation in this context at all? I teach correlation as a stepping stone to understanding the challenges of longitudinal modeling, and so that students can understand the concept of non-independent observations – not because correlation is a useful tool in its own right – but I think a lot of courses teach it as if it still had the importance it had when it was first used back in the day. I think we could probably even – as a whole community – rejig the way we write basic statistical tests (such as the Z test) so that they don’t rely on calculating a standardized test statistic – there’s no reason modern statistical software needs to calculate a test statistic standardized to N(0,1), but the need to standardize adds a layer of complexity to understanding the theory of testing. Could we rejig our statistical practice so that this standardization process is recognized as a throwback to a time when we only had tables of cut-off points for N(0,1)?
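
To make the point about standardization concrete, here is a minimal sketch of a two-sided Z test for a mean done both ways. The function names and the example numbers (a sample mean of 172.1 from 50 observations, tested against 170 with a known standard deviation of 8) are assumptions of mine, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_standardized(sample_mean, mu0, sigma, n):
    """Classical route: standardize to N(0, 1), then look up the tail area."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def z_test_direct(sample_mean, mu0, sigma, n):
    """Direct route: use the sampling distribution of the mean in its own units."""
    sampling_dist = NormalDist(mu=mu0, sigma=sigma / sqrt(n))
    distance = abs(sample_mean - mu0)
    # Two-sided p-value: twice the area at least this far below mu0.
    return 2 * sampling_dist.cdf(mu0 - distance)


p_standardized = z_test_standardized(172.1, 170, 8, 50)
p_direct = z_test_direct(172.1, 170, 8, 50)

print("p-value via standardized statistic:", round(p_standardized, 4))
print("p-value via direct calculation:   ", round(p_direct, 4))
```

The two p-values agree up to floating-point rounding, which is the whole point: the standardized statistic is a convenience for looking things up in an N(0,1) table, not something the test itself requires.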

Should we forget about the T-test and non-parametric tests?

Following on from these questions, I wonder about the T-test and non-parametric tests. If you are working in epidemiology it is highly unlikely that in the modern era you will only have 30 or 40 observations. You won’t get into the Lancet without doing a major multi-country study with tens or hundreds of thousands of observations. In this case, the difference between a T-test and a Z-test for a mean is going to be … irrelevant. Should we consider teaching T-tests as a historical oddity, something that you only really need to care about in a few rare fields of modern science? Every other field of physical science makes approximations all the time, but for some reason in statistics we insist on carefully distinguishing between Z- and T-tests, instead of saying “the assumptions of the Z-test don’t work in small samples (that you shouldn’t be relying on anyway).” I know this is not theoretically correct, but with students outside of the physical/engineering sciences, it just adds extra confusion. I compromise on this by explaining the test in full as a basic test, but then pointing out how irrelevant the difference is in the modern world of massive samples.

I think the same might apply to non-parametric tests – we just don’t use them, and the theory of non-parametric statistics is so much richer and more profound than one would ever realize from studying the Wilcoxon Rank-Sum test. Should we bother with tests that are under-powered, and that get many students mired in confusion over when the Central Limit Theorem holds, and what test to use in what setting? Especially in epidemiology, where we will almost always be working with binary outcomes?

My students seemed to enjoy and benefit from the dice class. I certainly find they grasp the issue of critical points in the distributions more easily if they work from Excel than from tables, and I think it helps them to get a sense of what’s important if we teach some other aspects of the topic as being accidents of history rather than essential parts of theory. Are there other things that we can change? Are there other ways we can make this very beautiful, profound topic interesting and accessible to people with limited mathematical background and even more limited mathematical patience? I think there are, and we should strive to find them.

 
