by Hari Balasubramanian
The Shortest Path
How does Google Maps figure out the best route between two addresses? The exact algorithm is known only to Google, but probably some variation of what is called the shortest path problem has to be solved [1]. Here is the simplified version. Suppose we have a network of nodes (cities, towns, landmarks etc.) connected by links (roads), and we know the time it takes to travel a particular link. Then what is the shortest path from a starting node A to a destination node D?
In the instance above, there are 4 nodes. The rectangles provide the link travel times. The B-C link takes 2 time units to travel; the A-D link takes 5; the C-D link takes 1; and so on. The five possible routes from A to D are: A-D; A-B-D; A-C-D; A-B-C-D; and A-C-B-D. The easily spotted shortest path is A-C-D, with a total length of 3. But what if a network has hundreds of nodes and links? It would be impossible to visually identify the shortest path. We would need an efficient algorithm. By that I mean an algorithm whose execution time on a computer stays reasonable even when the problem size – the number of nodes or links in the network – gets bigger.
In 1959, Edsger Djikstra published just such an algorithm. Djikstra's Algorithm doesn't simply look for all possible routes between the start and destination nodes and then choose the shortest. That kind of brute-force approach wouldn't work, given how dramatically the number of possible routes increases even with a slight increase in network size. Instead, Djikstra's Algorithm progressively explores the network in a simple yet intelligent way. It begins with the start node A, looks at all its immediate neighbors, then moves on to the closest neighbor, and from there updates travel times to all as yet unvisited nodes if new and shorter routes are discovered. I am fudging important details here, but this basic procedure of moving from a node to its nearest neighbor and updating travel times is repeated deeper and deeper in the network until the shortest path to the destination is confirmed. Wikipedia has a good animation illustrating this.
How fast does the algorithm run? Let's say there are V nodes. Then, in the worst case, Djikstra's Algorithm will take in the order of V x V steps to compute the optimal path. An algorithm like this that grows polynomially with the problem size is something we will call efficient (of course lower order polynomials, such as the square function, are preferable; V raised to the power 50 wouldn't be helpful at all). So a 10-node problem might take around 100 steps; a 1000-node problem will take 1000000 steps. This increase is something a modern day computer can easily handle. The algorithm might do much better in most instances, but the worst case is commonly used as a conservative measure of efficiency. There are faster variations of Djikstra's Algorithm, but for simplicity we'll stick to the original.
