Matthew Hutson in The New Yorker:
Deep learning, the artificial-intelligence technology that powers voice assistants, autonomous cars, and Go champions, relies on complicated “neural network” software arranged in layers. A deep-learning system can live on a single computer, but the biggest ones are spread over thousands of machines wired together into “clusters,” which sometimes live at large data centers, like those operated by Google. In a big cluster, as many as forty-eight pizza-box-size servers slide into a rack as tall as a person; these racks stand in rows, filling buildings the size of warehouses. The neural networks in such systems can tackle daunting problems, but they also face clear challenges. A network spread across a cluster is like a brain that’s been scattered around a room and wired together. Electrons move fast, but, even so, cross-chip communication is slow, and uses extravagant amounts of energy.
Eric Vishria, a general partner at Benchmark, a venture-capital firm in San Francisco, first came to understand this problem in the spring of 2016, while listening to a presentation from a new computer-chip company called Cerebras Systems. Benchmark is known for having made early investments in companies such as Twitter, Uber, and eBay—that is, in software, not hardware. The firm looks at about two hundred startup pitches a year, and invests in maybe one. “We’re in this kissing-a-thousand-frogs kind of game,” Vishria told me. As the presentation started, he had already decided to toss the frog back. “I’m, like, Why did I agree to this? We’re not gonna do a hardware investment,” he recalled thinking. “This is so dumb.”
More here.