What the LHC Computing Grid Can Teach the Internet

Mark Anderson in Scientific American:

Before the year is out, the LHC is projected to begin pumping out a tsunami of raw data equivalent to one DVD (five gigabytes) every five seconds. Its annual output of 15 petabytes (15 million gigabytes) will soon dwarf that of any other scientific experiment in history.
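As a quick sanity check of those figures, here is a minimal back-of-envelope sketch (assuming, as the article does, a 5-gigabyte DVD, and taking a petabyte as a million gigabytes):

```python
# Back-of-envelope check of the data rates quoted above (illustrative only).
# Assumptions: "one DVD" = 5 GB, as stated in the article; 1 PB = 10**6 GB.

DVD_GB = 5            # gigabytes per DVD, per the article
SECONDS_PER_DVD = 5   # one DVD's worth of data every five seconds
ANNUAL_PB = 15        # petabytes per year, per the article

rate_gb_per_s = DVD_GB / SECONDS_PER_DVD        # ~1 GB/s of raw data
annual_gb = ANNUAL_PB * 10**6                   # 15 million gigabytes
implied_seconds = annual_gb / rate_gb_per_s     # seconds of data-taking implied
implied_days = implied_seconds / 86_400

print(f"Sustained rate: {rate_gb_per_s:.1f} GB/s")
print(f"Implied data-taking: {implied_seconds:.1e} s (~{implied_days:.0f} days/year)")
```

Under those assumptions the quoted numbers hang together: a sustained rate of about one gigabyte per second, kept up for roughly 170 days of data-taking a year, gives the 15-petabyte annual total.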

The challenge is making that data accessible to a scientist anywhere in the world with the execution of a few commands on her laptop. The solution is a global computer network called the LHC Computing Grid, and with any luck, it may be giving us a glimpse of the Internet of the future.

Once the LHC reaches full capacity sometime next year, it will be churning out snapshots of particle collisions by the hundreds every second, captured in four subterranean detectors standing from one and a half to eight stories tall. It is the grid’s job to find the extremely rare events—a bit of missing energy here, a pattern of particles there—that could solve lingering mysteries such as the origin of mass or the nature of dark matter.

A generation earlier, research fellow Tim Berners-Lee of the European Organization for Nuclear Research (CERN) set out to create a global “pool of information” to meet a similar challenge. Then, as now, hundreds of collaborators across the planet were all trying to stay on top of rapidly evolving data from CERN experiments. Berners-Lee’s solution became the World Wide Web.

But the fire hose of data that is the LHC requires special treatment. “If I look at the LHC and what it’s doing for the future,” said David Bader, executive director of high-performance computing at the Georgia Institute of Technology, “the one thing that the Web hasn’t been able to do is manage a phenomenal wealth of data.”