Risky Giant Steps Can Solve Optimization Problems Faster

Allison Parshall in Quanta:

Optimization problems can be tricky, but they make the world work better. These kinds of questions, which strive for the best way of doing something, are absolutely everywhere. Your phone’s GPS calculates the shortest route to your destination. Travel websites search for the cheapest combination of flights that matches your itinerary. And machine learning applications, which learn by analyzing patterns in data, try to present the most accurate and humanlike answers to any given question.

For simple optimization problems, finding the best solution is just a matter of arithmetic. But the real-world questions that interest mathematicians and scientists are rarely simple. In 1847, the French mathematician Augustin-Louis Cauchy was working on a suitably complicated example — astronomical calculations — when he pioneered a common method of optimization now known as gradient descent. Most machine learning programs today rely heavily on the technique, and other fields also use it to analyze data and solve engineering problems.

Mathematicians have been perfecting gradient descent for over 150 years, but last month, a study proved that a basic assumption about the technique may be wrong. “There were just several times where I was surprised, [like] my intuition is broken,” said Ben Grimmer, an applied mathematician at Johns Hopkins University and the study’s sole author. His counterintuitive results showed that gradient descent can work nearly three times faster if it breaks a long-accepted rule for how to find the best answer for a given question.

More here.