Today, doing a little task in R, I had cause to look up the following “warning” that appeared after running a script:

Warning message:
In readLines(file) : incomplete final line found

I couldn’t figure out what this warning meant, because the script ran fine, so I did a web search and came across this perfect example of why working with R really sucks: the help files are completely useless, the warning messages are cryptic and meaningless, the inbuilt editor is broken, there is no standardization of externally-developed editors, and the people who provide help online are some of the rudest people you will ever meet in computer science. This simple warning shows it all at once. I’ve complained about the dangers of R’s cryptic and meaningless warning messages before, but this example should really serve to show how they also cry wolf in a really unhelpful way.

The linked page is a message board of some kind (I think a reproduction of the “official” R boards on another site) where a person called Xiaobo.Gu has posted a request for help in decoding the above warning message. The request is polite enough though not voluminous, asking “Can you help with this?” but the first response (from someone with 7328 posts on this board!) consists entirely of the following:

Help with what? You got a warning. And it had information that should
tell you how to edit the file if the warning bothers you.

What is the point of a reply this rude and dismissive? This person actually took the time to reply to a post, in order simply to say “I won’t help you.” On a message board explicitly intended to help resolve problems with R. In addition to being rude it’s arrogant: there is no information about how to edit the file, just a pointer to the final line. We will shortly see the cause of the warning, and it should be clear that no one in their right mind would consider the warning to have provided “information” of any form.

The next reply admonishes the original poster for failing to follow the posting rules (though doesn’t say how they were breached – so is essentially another contentless reply!) and then includes a little sneering aside about the way Windows encodes ASCII text that makes me think the developers of R have an elitist refusal to engage with Windows’s flaws. It then reveals that the warning is harmless and only appears in R version 2.14.0 (unpatched).

Why bother putting such a warning into a program? Whose idea was it to put a harmless warning in a single version of R, and why and how can a warning be a warning and also be harmless? Either something risky is going on, or it’s not. If it’s not, don’t waste my time with red text.

Finally another person comes along to sneeringly answer the question and provide actual information:

A warning message such as this could not be clearer.
It means that the last line of the file does not end with a <newline> sequence ==> the final line of the file is incomplete.

In an editor go to the end of that line and press <Enter> or <Return>
And save.

Alternatively configure your editor to always terminate the last line of a file with a <newline> sequence.

This is a sparkling gem of passive-aggressive “help.” I can see a simple way in which the warning could be “clearer”: it could say “you did not press enter or return.” Then it would be clearer. As it is, there is no information about what is missing in the final line: it just says it is “incomplete.” How can anyone claim that a warning such as this could not be clearer?
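For the record, here is a minimal sketch in R of what the warning is actually complaining about. It assumes nothing more than a throwaway temp file: the file and its one-line contents are made up for illustration, and the exact wording of the warning varies a little between R versions. The script writes a line with no trailing newline, reads it back while catching the warning, then appends the newline and reads it again.

```r
# Hypothetical demo file; the name and contents are made up for illustration.
path <- tempfile(fileext = ".R")

# Write a single line WITHOUT a trailing newline, as some editors do.
cat("x <- 1", file = path)

# readLines() still returns the line, but signals a warning while doing so.
# withCallingHandlers() records the message and lets readLines() finish.
caught <- NULL
lines <- withCallingHandlers(
  readLines(path),
  warning = function(w) {
    caught <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

# Append the missing newline; reading the file again no longer warns.
cat("\n", file = path, append = TRUE)
lines_fixed <- readLines(path)

# The data read is the same either way; only the warning differs.
print(caught)
print(identical(lines, lines_fixed))
```

If the file cannot be edited, `readLines(path, warn = FALSE)` reads it without raising the warning at all.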

But then, just to top it off, this commenter has suggested that the poster configure their editor to “always terminate the last line of a file with a <newline> sequence.” This might seem to be reasonable advice, except that I get this warning in every script I write and I am using the built-in editor! This means that some muppet at CRAN shipped a version of R with an editor configured to write scripts in such a way that they would trigger a warning. By default. Then, the very first patch they released got rid of the warning. wtf!? Is this what passes for quality control at CRAN?

This is why wherever possible I use Stata for my work. I need software I can trust to produce the same results every time I run it, that isn’t going to waste my time with meaningless warnings and threats in glaring red, that isn’t configured to do things wrong by default, and that performs all calculations correctly. In order to trust that my stats software will perform all calculations correctly, I really need to know that the designers have some degree of basic quality control. When I see stuff like this – simple programmatic failings in things like the default settings of the script editor – I find it really hard to believe that the correct attention has been paid to, say, the way that the program performs adaptive Gaussian quadrature.

I also expect that the people who design this stuff will be polite when answering questions. I don’t need some passive-aggressive guy on the internet telling me off for failing to understand an extremely vague warning message that is only troubling me because CRAN doesn’t have adequate quality control. The replies on that thread should have been polite requests for more information followed by an apology and a promise to fix this problem – or, if these people aren’t directly involved in CRAN (and we know one of them is … one of R’s designers is on that thread) then a suggestion about how to alert the developers to the problem. Sneering and bullying – no thanks. I don’t get that when I contact MathWorks for help with MATLAB, no matter how stupid my request.

This is why when I teach my students about stats packages I tell them a) you can’t trust R and b) it has a nasty community. I teach them its value for automation and experimental stats, and warn them away from using it for anything that has to be published in serious journals.

I think R is just another example of how dangerous it is to run your business on open source software, though I’m sure there are times when it’s safe. And I think it would be fascinating to see a detailed textual analysis comparing the message boards of an open source community (Linux, R, LaTeX) with those of a proprietary product like Stata, because in my experience there’s a world of difference between the two communities. Why that difference exists would not only be a fascinating anthropological study, but would no doubt be of relevance to the scientific study of neckbeard behavior, because I have a strong suspicion that neckbeards are the dominant species in the open source world. Will an anthropologist somewhere take on the task?

 
