What a fossil revolution reveals about the history of ‘big data’

David Sepkoski in Aeon:

In 1981, when I was nine years old, my father took me to see Raiders of the Lost Ark. Although I had to squint my eyes during some of the scary scenes, I loved it – in particular because I was fairly sure that Harrison Ford’s character was based on my dad. My father was a palaeontologist at the University of Chicago, and I’d gone on several field trips with him to the Rocky Mountains, where he seemed to transform into a rock-hammer-wielding superhero.

That illusion was shattered some years later when I figured out what he actually did: far from spending his time climbing dangerous cliffs and digging up dinosaurs, Jack Sepkoski spent most of his career in front of a computer, building what would become the first comprehensive database on the fossil record of life. The analysis that he and his colleagues performed revealed new understandings of phenomena such as diversification and extinction, and changed the way that palaeontologists work. But he was about as different from Indiana Jones as you can get. The intertwining tales of my father and his discipline contain lessons for the current era of algorithmic analysis and artificial intelligence (AI), and point to the value-laden way in which we ‘see’ data.

My dad was part of a group of innovators in palaeontology who identified as ‘palaeobiologists’ – meaning that they approached their science not as a branch of geology, but rather as the study of the biology and evolution of past life. Since Charles Darwin’s time, palaeontology – especially the study of the marine invertebrates that make up most of the record – involved descriptive tasks such as classifying or correlating fossils with layers of the Earth (known as stratigraphy). Some invertebrate palaeontologists studied evolution, too, but often these studies were regarded by evolutionary biologists and geneticists as little more than ‘stamp collecting’.

The use of computers to analyse large data sets changed this image – particularly because it allowed palaeontologists such as my dad, and his colleague David Raup at the University of Chicago, to expose patterns in the history of life that emerged only on very long timescales. One of their signature contributions was the discovery that life has experienced major, catastrophic mass extinctions at least five times in the Earth’s history (this is why many people now refer to the current biodiversity crisis as the ‘sixth extinction’).

More here.