Saturday, June 08, 2013

Academic publishing and data

Two pieces in today's SMH on economics raise some interesting questions about how research should be done in the 21st century - on which I think I've written before.

The first is another story on the Reinhart and Rogoff paper that made the prediction that once a country's debt to GDP ratio hit 90% it would fall off a cliff. That turned out to be a conclusion reached only because of an error in a spreadsheet.

Part of the stoush is about how hard it was for people to test the original paper - no one could replicate the result. There is a very simple 21st century solution: ALL the data and calculations in support of a paper should be available on the web.

But it also calls into question the validity of the whole "peer reviewed research" model. The paper was peer reviewed - which didn't prevent the error. Our good friend Thomas Kuhn and his paradigms would easily explain that - the "small government" conclusion of the paper would have matched the dominant thinking of the relevant journal.

Before the "real sciences" crow, they are not immune to the same errors, be they fraud like William McBride's or a simple stuff-up.

But the other question that emerges is about the availability of the data itself. The other article reports on this weekend's Triple J Hottest 100 of the last twenty years and cites two researchers who have "analysed voting trends from every Hottest 100." This got me excited because I have tried a couple of times to get the core voting data set from the ABC, but they won't release it.

It turns out that the researchers don't seem to have had it either. An earlier article reveals this was only an econometric study on presence, and possibly rank, in the Hottest 100 - not votes. The conclusions of that study are hardly surprising given the research about influence effects mentioned in another blogpost about the Hottest 100.

The votes are an interesting topic because they would provide a good dataset to test the long-tail thesis, which is often used to argue that in the online world we have no reason to fear media concentration. If the thesis holds, the votes have a power-law distribution over the whole set of songs; if not, the votes suddenly decline faster at some point. Where that point is is of more than just passing interest.
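For what it's worth, here is a rough sketch in Python of how that check might go if the vote counts were ever released. A power law is a straight line when log(votes) is plotted against log(rank), so the sketch compares the fitted slope over the top-ranked songs with the slope over the rest. A much steeper tail slope would suggest the votes fall away faster than the long-tail thesis predicts. The function names, the split point and the data are all my own inventions for illustration (the real votes are not public), and a least-squares fit on log-log data is only a crude diagnostic, not a formal statistical test.

```python
import math


def loglog_slope(ranks, votes):
    """Least-squares slope of log(votes) against log(rank)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(v) for v in votes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def head_and_tail_slopes(votes, split):
    """Return (head_slope, tail_slope) either side of rank `split`.

    `votes` must be positive and sorted from most to fewest votes.
    `split` is a hypothetical rank at which we suspect the break.
    """
    ranks = list(range(1, len(votes) + 1))
    head = loglog_slope(ranks[:split], votes[:split])
    tail = loglog_slope(ranks[split:], votes[split:])
    return head, tail


# Made-up data, NOT real Hottest 100 votes.
# `pure` follows a power law all the way down the ranking.
pure = [10000 / r for r in range(1, 201)]
# `cutoff` follows the same law to rank 100, then decays exponentially.
cutoff = [
    v if r <= 100 else v * math.exp(-(r - 100) / 20)
    for r, v in enumerate(pure, start=1)
]

print("pure power law:", head_and_tail_slopes(pure, 100))
print("with a cutoff: ", head_and_tail_slopes(cutoff, 100))
```

With the pure power law the two slopes come out essentially the same; with the cutoff the tail slope is far steeper than the head slope. Finding where the break actually sits would mean trying a range of split points, which is exactly why the full vote data matters.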
