This is one of many stories reporting a recent study that crunched music data from 17,000 Billboard hits. I read the study a few weeks ago and thought it was interesting and worthwhile, though I have a few reservations. It’s now broken out into the world of hot takes and comment boxes, and is getting monstered by music critics.
This is not exactly a surprising response. The stories are taking data scientists’ interpretation of their own data (often a bit of a minefield), retrofitting it clumsily to decades-long arguments about pop value and history (computers prove the 80s DID suck!), and then luring said scientists into offering dumb generalisations about that stuff.
But let’s try and present the work in a better light and see what it’s actually doing.
The point of any big data work like this is to find underlying patterns in data which individual observers wouldn’t necessarily spot. Data scientists will sometimes take that claim a step further and hint that these patterns are a deeper and more valid truth than human analysis can muster. That ain’t necessarily so. Spotting patterns is fairly trivial - working out what those patterns mean and whether they’re relevant is the hard work. Like a lot of data types, these researchers have been lured into drawing bogus oppositions between their quantitative work (hard facts) and qualitative work (mere anecdote) - in this case the qual work is rock writing. But big data is valuable because it meshes with qual work - suggesting possibilities, routes of interpretation, which the critic can then follow, and perhaps dismiss.
So what’s going on in this study?
The first thing to say is that the study isn’t really looking at the history of pop. It’s looking at the history of the Billboard Hot 100 chart - that’s a limit on its dataset, and a very important one. The history it’s telling is a history of crossover - at what point have new musical ideas reached a level of sales and radio acceptance that gets them into the Billboard charts?
Immediately that’s strike one against a headline finding - that the most dramatic change in US chart history came with the sudden inrush of hip-hop in 1991. Asking an ‘anecdotal’ rock critic would quickly have turned up an explanation: the shift in chart accounting that came with the introduction of Nielsen’s SoundScan system. The dramatic change is only dramatic because an artificial barrier on hip-hop sales was removed. A potentially juicy finding becomes an obvious one.
Let’s go back, though, and talk about what the study did. It took chunks of 17,000 songs and analysed them on a few musicological metrics - “chords, rhythms, and tonal elements” as the LA Times story vaguely puts it. It then ran a cluster analysis on those songs to come up with a dozen statistically distinct styles, whose rise and fall it tracked. Large shifts in the balance of these styles - whether sudden or more gradual - are the evidence for the overall theory the paper puts forward: that chart music diversity remains mostly steady but is punctuated by revolutions.
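To make that clustering step concrete, here’s a toy sketch of the kind of analysis involved - k-means over per-song feature vectors. Everything in it (the features, the songs, the two fake “styles”) is invented for illustration; the paper’s actual pipeline is more elaborate than this.

```python
# Toy k-means: group songs into "styles" by their musical features.
# All data here is fake - each song is just a 2-D feature vector.
import math
import random

def kmeans(points, k, iters=50, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assign each song to its nearest style centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[i].append(p)
        # Move each centroid to the mean of the songs assigned to it.
        for j, c in enumerate(clusters):
            if c:
                centroids[j] = tuple(sum(xs) / len(c) for xs in zip(*c))
    return centroids, clusters

# Two obviously separated fake "styles" in feature space.
rng = random.Random(1)
songs = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
songs += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(50)]
centroids, clusters = kmeans(songs, k=2)
print(sorted(len(c) for c in clusters))  # cluster sizes; with data this well separated, close to [50, 50]
```

The rise and fall of a style, in these terms, is just the changing share of each year’s chart entries landing in each cluster.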
But what surged in 1991 isn’t even “hip-hop”, it’s “Style 2”, which can be tentatively identified with hip-hop via its Last.fm tags. Several of the styles, though, SHARE tags, and all include multiple tags, which makes it very hard to work out what any of them might actually be.
This is the most frustrating part of the whole study, incidentally. Musical genres are social as well as musical constructs - so getting under the skin to find levels of purely musical commonality is a very interesting idea. But there’s nothing to grip onto here - the researchers didn’t even list a few songs which had high centricity with each cluster (i.e. fitted better into that style than into any other). We have really nothing much to go on when trying to imagine what these 12 macro-genres actually are.
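For what it’s worth, such a list would have been cheap to produce. Given each song’s feature vector and the style centroids, the best exemplars of a style are the songs closest to its centroid and furthest from the runner-up style. A toy sketch - the song names, vectors, and centroids below are all invented:

```python
# Sketch of the missing "exemplar" list: rank songs by how much better
# they fit their own style than the next-best one. Data is invented.
import math

def exemplars(songs, centroids, style, n=3):
    """Return up to n song ids that fit `style` best (highest margin)."""
    ranked = []
    for name, vec in songs.items():
        dists = [math.dist(vec, c) for c in centroids]
        if min(range(len(centroids)), key=dists.__getitem__) == style:
            # Margin: how much closer the song is to its own style's
            # centroid than to the runner-up style's centroid.
            own = dists[style]
            runner_up = min(d for i, d in enumerate(dists) if i != style)
            ranked.append((runner_up - own, name))
    return [name for _, name in sorted(ranked, reverse=True)[:n]]

centroids = [(0.0, 0.0), (5.0, 5.0)]
songs = {"song_a": (0.2, 0.1), "song_b": (2.6, 2.6), "song_c": (4.9, 5.2)}
print(exemplars(songs, centroids, style=1))  # → ['song_c', 'song_b']
```

A handful of named songs per cluster would have let readers sanity-check the macro-genres at a glance.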
So the revolutionary jumps in chart music are frustrating to pin down. The middle one - in 1982 - can be tentatively identified with New Wave. It’s the most gradual and smallest of the three, which makes sense - but like 1991 and SoundScan, there’s probably a technological explanation: this is the era of MTV, which exerted a large influence on sales (and presumably playlists) and helped consolidate shifts in music that had been building for a while.
And the third jump is in 1964 - the British Invasion. Again, this sounds obvious - too obvious to be worth studying, you might think. But the authors say something interesting. When they look at their musical metrics as a whole, they see a big shift. But they don’t see the Beatles and Stones causing the shift - the musical trends their songs represented were on the rise anyhow.
This is a genuinely interesting data finding, which (unlike the ’91 and ’82 bumps) doesn’t rely on distribution and measurement changes. Does it pass the qualitative sense check?
Yes and no: it is good supporting evidence for the theory that the early 60s in pop were a time of continuity, not total rupture. Pre-1964 US pop was critically dismissed for a long time - the standard Boomer-era narrative was that the Beatles came and saved music from blandness. Critics have been chipping away at this story for years, and the study supports their point musically: something like the British Invasion was the inevitable result of where trends in chart pop were heading anyway; the Beatles and Stones just nudged things along.
Except something DID happen - here’s where the quant interpretation needs the qual one to make sense. The Beatles were loved for far more than just the musical metrics the study uses: image, haircut, peer pressure, hotness, wit, potentially even singing voice - these simply don’t feature in the study. But they’re a crucial part of pop.
So rather than rejecting the data because it doesn’t tally with received wisdom, we can use it to refine that wisdom. What happened in 1964, we can hypothesise, was a pop revolution rather than a musical one. Bands appeared that intuited which way the musical wind was blowing - probably because, get this, they’d spent years studying it as covers bands - and then used it to land big shifts in image, attitude, irreverence, etc.
To me, that sounds like a more nuanced storyline than flying the “genius” flag one more time or smugly pointing at the data and saying “case closed”. But it’s a storyline that needs qualitative, interpretative intervention - i.e. a critical approach - to get to.
We’re going to see a lot more of these studies quantifying culture. They aren’t going away in a hurry. What can we conclude from this example about how to get the most out of them?
1. Follow the data source - in this case, the Billboard Charts are not “pop” or even a random sample of same. That doesn’t mean it’s not a story worth telling, but beware.
2. Trust the data, not the interpretation - there’s no reason to believe any of the data here is wrong, but the interpretations are mostly weak. If you’re an expert in the topic, you can and should do a better job. Analysis can always be discarded and data kept, even if it’s analysis by the people who mined the data.
3. Try and get hold of the data yourself - sometimes it’s published. Often it’s not, but I’d love a look at those genre clusters, for instance.
4. For critics, don’t be afraid of data - its job is to spot things you can’t. It’s an augmentation, not a replacement.
5. Obvious is fine - if crunching the data is bringing up obvious things, that’s fine, it shows it’s working. Resist the temptation to report them as a revelation just because you have the numbers, though. Especially if you’ve missed really obvious explanations (e.g. SoundScan).
6. Mixing qual and quant gets you the best story. Should be obvious, often isn’t - combining the numbers in data with the interpretive power of experience is how you get to more interesting places.
7. Finally, if you’re scientists writing one of these papers - get a critic or two in to peer review and sense-check it! It really won’t hurt.