Some thoughts on the science of queueing

by Hari Balasubramanian

In Spring 2003, as a first-year doctoral student at Arizona State University, I took a class on queuing theory. This is the science that focuses on reducing delays, wherever they may be experienced: at a traffic signal; over the phone while waiting to speak to a representative (music and ads playing in the background); or, more critically, in a virtual queue of hundreds of patients, each waiting for an organ transplant.

My class required each student to do a hands-on project in a real setting. I chose mine to be the nearest supermarket, a busy metropolitan Safeway store. Broadly speaking, the two big pieces in any queueing study are: (1) how quickly people/requests arrive, and (2) how quickly they are serviced. The interplay of these determines the probability of delays. So for my project at Safeway, I decided to focus primarily on collecting data on customer arrivals and checkout times.

I talked to the store manager, Scott, about my plan. He was a tall, blond man, dressed formally, gentle but with a clear sense of authority. He was immediately worried and didn't get what I was up to. I tried to convince him by suggesting that my data might be useful. Scott was (quite rightly) skeptical, but I got some sort of an affirmation from him. You can't ask customers any questions, he said sternly, and I promised him that.

I began with arrivals. One afternoon in April, I took a stopwatch, sat next to the bicycle stand outside the store entrance – perhaps seeming like a homeless person – and recorded the arrival times of customers from 5 to 6 pm. I would mark the exact time a person/group entered the store (I considered a group to be a single arrival if all members came together in the same car). With such data, you can estimate the average number of arrivals in a particular time period – say a minute. Equivalently, you can measure the average time between any two successive arrivals. So if 4 people arrive every minute on average, then the time between any two successive arrivals is expected to be 15 seconds.

Averages don't tell the complete story, though. What one needs is a sense of the distribution of arrivals. Suppose the first 11 arrivals happened at seconds 0, 2, 10, 12, 14, 18, 22, 24, 26, 28, 32 (i.e., the first person came at 0, the second at 2, the third at 10…). Using these we can calculate the time between successive arrivals: 2-0, 10-2, 12-10, 14-12…and so on. In 6 out of these 10 observations, the time between successive arrivals is 2 seconds; in 3 of the 10, it is 4 seconds; and in 1 case it is 8 seconds. This is what I mean by a “distribution”; and it is the distribution that gives us a sense of how variable arrivals are, something an average value (3.2 seconds in the above case) cannot reveal.
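For readers who want to reproduce this arithmetic, here is a minimal Python sketch. The arrival times are the ones from the example above; the variable names are made up for illustration.

```python
from collections import Counter

# Arrival times in seconds, taken from the example above.
arrivals = [0, 2, 10, 12, 14, 18, 22, 24, 26, 28, 32]

# Time between each pair of successive arrivals.
gaps = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]

# The "distribution": how many times each gap length occurs.
gap_counts = Counter(gaps)

# The average gap, and the equivalent arrival rate per minute.
mean_gap = sum(gaps) / len(gaps)
arrivals_per_minute = 60 / mean_gap

print("gaps:", gaps)
print("counts by gap length:", sorted(gap_counts.items()))
print("mean gap (s):", mean_gap)
print("arrivals per minute:", arrivals_per_minute)
```

The counts show six gaps of 2 seconds, three of 4 seconds and one of 8 seconds, which is exactly the information the single average of 3.2 seconds hides.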

When I plotted the distribution for the 222 observations I'd collected between 5 and 6 pm at the store, I obtained the graph below. The horizontal axis, divided into 5-second intervals, indicates the time between successive arrivals; and the height of the vertical bars indicates the frequency, i.e. how many of the observations fell in a particular interval. For example, 47 of the 222 observations fell in the 0-5 seconds range; 48 of them fell in the 6-10 seconds range; 34 of them fell in the 11-15 seconds range, and so on. There were 5 observations in which no one came to the store for a minute or more. The average time between successive arrivals was around 16 seconds; this translates to 3.75 customer arrivals per minute on average. But with the greatest frequency falling between 0-5 and 6-10 seconds, notice how incorrect one would be if one simply planned based on the average!

[Figure: distribution of the time between successive arrivals]

The data I'd collected was only for one hour of a day (a single sample), but the general shape of the distribution – bars that, for the most part, decrease in height from left to right in an inverted bow shape – is not accidental. This was my first hands-on confirmation of what is known as the Exponential Distribution. In simple terms, it suggests that short intervals between successive arrivals have the greatest probability; the middle part of the curve, where the mean is typically located, has the next greatest probability; and once in a while, a long time elapses between successive arrivals. The terms short and long, of course, are meant in a relative sense and depend on the context.
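As a rough check of this claim, one can compare the counts reported above with what an Exponential Distribution with a mean of 16 seconds would predict. The Python sketch below is only illustrative: it assumes the bins have edges at exactly 5, 10 and 15 seconds, it uses the rounded average of 16 seconds, and the function name expected_count is invented for the example. It is not a formal goodness-of-fit test.

```python
import math

mean_gap = 16.0   # seconds; the rounded average reported above
n_obs = 222       # number of observations reported above


def expected_count(lo, hi):
    """Expected number of gaps in [lo, hi) seconds under an Exponential
    distribution with mean mean_gap, given n_obs observations."""
    return n_obs * (math.exp(-lo / mean_gap) - math.exp(-hi / mean_gap))


# Observed counts quoted in the text. The exact bin edges are an assumption.
observed = {
    (0, 5): 47,
    (5, 10): 48,
    (10, 15): 34,
    (60, math.inf): 5,   # "a minute or more"
}

for (lo, hi), seen in observed.items():
    print(f"{lo:>3} to {hi:<4} observed {seen:3d}   expected {expected_count(lo, hi):5.1f}")
```

Under these assumptions the model predicts roughly 60, 44, 32 and 5 observations for the four bins, against 47, 48, 34 and 5 observed. The fit is close in the tail and looser in the first bin, which is not surprising for a single hour of hand-collected data.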

It may be useful to contrast the Exponential with the well known bell curve or normal distribution. Unlike the Exponential, the highest frequencies in the normal distribution are in the immediate vicinity of the mean and decrease symmetrically on either side.

The big discovery for me — I knew very little about probability theory at the time — was that even when events happen in truly random fashion, the time between successive events should have a consistent shape. If I collected similar data on ten different days, the shape of the distribution in each case would more or less remain the same. This meant that, paradoxically, some types of randomness could be described accurately by the determinism of mathematical equations.


One interesting application of the Exponential can be traced back to the applied work of Danish mathematician, Agner Krarup Erlang. In the first decade of the 20th century, Erlang worked for the Copenhagen Telephone Company. Arriving phone calls at the time were handled by human telephone operators. One of the many questions facing the company was how many operators should be there to handle a given volume of calls. The time between successive calls followed the Exponential Distribution (see footnote [1]). In 1909, Erlang published a key paper on these results, “The theory of probability and telephone conversations”. The title sounds today like a parody of some sort but in fact initiated the use of probability and queuing theory in telecommunication networks.
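The link between the Exponential and the Poisson mentioned in footnote [1] can be checked with a small simulation. The Python sketch below is illustrative only: the rate of 3.75 arrivals per minute is borrowed from the supermarket data rather than from Erlang's telephone traffic, and all names in it are invented for the example. It draws Exponential gaps, counts how many arrivals land in each minute, and compares those counts with the Poisson formula.

```python
import math
import random
from collections import Counter

rate_per_min = 3.75      # assumed average arrivals per minute
n_minutes = 50_000       # length of the simulated period
rng = random.Random(1)   # fixed seed so the run is repeatable

# Generate arrival times by adding up Exponential gaps, and count
# how many arrivals fall inside each one-minute window.
arrivals_in_minute = Counter()
t = rng.expovariate(rate_per_min)
while t < n_minutes:
    arrivals_in_minute[int(t)] += 1
    t += rng.expovariate(rate_per_min)

per_minute = [arrivals_in_minute.get(m, 0) for m in range(n_minutes)]
mean = sum(per_minute) / n_minutes
variance = sum((c - mean) ** 2 for c in per_minute) / n_minutes
frequency = Counter(per_minute)


def poisson_pmf(k):
    """Poisson probability of exactly k arrivals in one minute."""
    return math.exp(-rate_per_min) * rate_per_min ** k / math.factorial(k)


# For a Poisson distribution the mean and the variance are equal.
print(f"mean {mean:.3f}   variance {variance:.3f}")
for k in range(8):
    print(f"{k} arrivals: simulated {frequency[k] / n_minutes:.4f}   Poisson {poisson_pmf(k):.4f}")
```

The simulated fractions should track the Poisson column closely, and the mean and variance of the per-minute counts should both come out near 3.75.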

Strikingly, the Exponential recurs time and again in everyday situations: time between customer arrivals to a car wash; time between hits to a website; time between emergency calls, and many other settings. But it is observed only in “walk-in” type situations and not when services are pre-scheduled. The “walk-in” condition is important because it is really another way of saying that the inherent randomness of arrivals can unfold on its own, without any planning. If services were pre-scheduled (the first at 9:00, the second at 9:30, the third at 10:00 and so on), then the time between two successive arrivals is not truly random anymore. Another important condition for the Exponential to manifest is that each person who arrives does so independently of others.

Both these conditions hold extremely well for patient arrivals to emergency rooms in hospitals. Obviously, emergencies are not pre-scheduled; and patients come for their own independent reasons. One person may have had a stroke, another an accident, yet another some chest pain; these individual conditions are completely independent of each other. The exception is when a disaster affects an entire town, city or region – a tornado, a contagious disease, a fire. In these cases, when people rush to the emergency room it is due to the same cause and not independent causes. The Boston Marathon bombing last April also falls in this category. The bombing itself was completely unanticipated, but in the hours immediately after, Boston area hospitals were packed with those hurt and critically injured – all due to the same reason (see footnote [2]).

What about differences between “busy” and “sparse” periods of the day? An emergency room or a 24-hour grocery store is unlikely to be crowded at 2-3 am, but is at its busiest at, say, 5-6 pm. It turns out that the average time between successive arrivals changes, but surprisingly enough the shape of the distribution does not. In early morning hours, the time between successive arrivals is simply longer (fewer arrivals) than it is in busy hours. Recently, as part of a research project, we plotted curves for time between successive patient arrivals in each hour of the day, for a large emergency department in Massachusetts. Remarkably, the curve for every single hour followed the shape of the Exponential; only the averages differed.
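One way to see what "same shape, different average" means is to simulate a busy hour and a quiet hour and measure each gap relative to that hour's own average. The Python sketch below is illustrative only: the averages of 16 seconds and 4 minutes are hypothetical, not figures from the emergency department study, and the function name is invented for the example.

```python
import random


def fraction_below_mean(mean_gap, n=100_000, seed=0):
    """Simulate n Exponential gaps with the given mean (in seconds) and
    return the fraction of gaps that are shorter than the sample mean."""
    rng = random.Random(seed)
    gaps = [rng.expovariate(1.0 / mean_gap) for _ in range(n)]
    sample_mean = sum(gaps) / n
    return sum(g < sample_mean for g in gaps) / n


busy = fraction_below_mean(16.0, seed=1)     # hypothetical busy hour
quiet = fraction_below_mean(240.0, seed=2)   # hypothetical quiet hour

# For any Exponential, about 63% of gaps (1 - 1/e) are shorter than the mean.
print(round(busy, 3), round(quiet, 3))
```

Both fractions come out near 0.63 even though the averages differ by a factor of fifteen: the curve is stretched along the time axis, but its shape is unchanged.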


I had arrivals figured out for my project. The missing piece, which I still needed to collect data on, was the distribution of checkout time, which determines how quickly a customer can be serviced. This proved tricky, however. I would wait at the front end of an aisle, and observe a particular cashier as unobtrusively as I could. One cashier caught me looking intently in his direction and narrowed his eyes. I was less than halfway through my data collection when I felt a tap on my shoulder. I turned to find the tall, imposing figure of Scott, the manager.

“How are you doing?” he asked.

I explained that I had spoken to him about the project.

“But you can't just show up unannounced! I don't think we agreed on that. You have to tell me the time you are coming. My cashier called and said that someone suspicious was observing him.”

Scott seemed upset, but was outwardly calm and polite. His concern was valid. Employees can feel threatened if they are being observed in a time study without their being aware of it. There might have been a deeper worry too: it was barely a year and a half since 9/11, and here was a brown-skinned man with a stopwatch positioned suspiciously in an aisle, taking notes.

[Figure: distribution of checkout (service) times]

I could have used more data, but things being the way they were, I decided to stop. I probably had enough for a decent grade. The one interesting point is that the checkout time distribution for regular (non-express) counters shown above (average around 100 seconds for 61 observations) does not seem to resemble any known distribution, let alone the Exponential.


There's a lot more to discuss on queueing, but not enough space. Over the last 100 years or so, queuing theorists (Erlang counts as one of the earliest) have derived mathematical expressions for such quantities as waiting time and queue lengths; these have made their way into textbooks and Excel spreadsheets. But there's also a stream of research that focuses on the psychology of waiting. The mathematics of queuing has its place, but how the wait feels to an individual is just as important.

While observing the checkout counters at the store, I noticed that about 50% of the customers tended to pick up something from the racks to the side while waiting. These racks, which consisted of an assortment of candies, tabloids and other odds and ends, were accessible to the first 2-3 customers in the queue. In today's so-called health stores and co-ops, the candies have been replaced by organic dark chocolate bars, and tabloids by equally flashy magazines on yoga and meditation. The end effect is the same: even if nothing is purchased from these racks, customers stay engrossed with what is on display and this potentially alleviates the inconvenience of waiting. In other words, occupied time is better than unoccupied time.

The most ingenious example I've heard of this comes in the context of managing hotel elevator waits. Apparently “floor-to-ceiling mirrors adjacent to elevators in high-rise hotels allow those who are waiting to fix their ties, comb their hair, and even perhaps coyly flirt via the mirror with others who are likewise waiting…those hotels that invested in such mirrors received far fewer complaints about elevator delays than competitors who did not.”

This excerpt is from a 1987 paper by MIT professor Richard Larson, titled: “Perspectives on queues: Social justice and the psychology of queuing” [3]. Larson has long worked in this area, and it was during a keynote by him in 2005 in San Francisco, two years after my class project, that I first heard of the elevator wait example. I remember being struck by it, and also by his claim that academics have been carried away by the mathematics of queueing when there might be simpler, more human ways of understanding what influences the perception of a wait. To mention just a few: a wait that is unfair because someone cut the line feels much worse; a wait whose duration is uncertain feels longer than a wait whose duration is known beforehand (time until next arrival displayed on monitors in subway stations); a wait that is unexplained feels longer than a wait for which a reason has been provided (waiting inside a plane on the runway for an hour without knowing whether the delay is due to weather or a mechanical problem). See [4] for an excellent summary.

Whatever the reasons, is there a perspective an individual can embrace when faced with a wait? The concluding section of Larson's paper describes the following exchange. In a Nov 17, 1984 column of the Boston Globe, someone with the pseudonym 'Thoughtful' complained: “I think the worst thing in the world is waiting.” One of the six letter responses to 'Thoughtful', also published in the Boston Globe, is quoted below [3]. The sentiment expressed won't apply to all situations and temperaments; and some might complain it has too much of the “stay positive” message that has been going around recently. Nevertheless, here it is:

“Dear Thoughtful:

I used to feel as you did about waiting. It was awful. I was so impatient. Now it is different because I am different. I use the time spent waiting to my advantage. Here are a few of the things I do while waiting: I think about good things, projects I would like to do some time; I plan out the details in my mind. I pray instead of stewing because I have to wait. I read. (I usually keep a book or pamphlet with me.) I knit if it is going to be a long wait. I made seven afghans last year while I was waiting in hospitals. A side benefit was that I made a lot of nice acquaintances because people stopped to talk to me about what I was making.

To sum it up, I kind of make the time I wait work for me, and I keep it simple. A positive attitude and an openness to adventure also helps you expect something good to happen to you. You would be surprised at what you can see and learn and do while you wait!

Here's hoping you, too, can turn it around!”

Footnotes and References:

1. Erlang's result showed that randomly arriving phone traffic followed the Poisson distribution. The Poisson distribution characterizes the number of events (arrivals) in a particular time period. I could have equivalently based my essay on the Poisson; it wouldn't have mattered because the Poisson and the Exponential are inextricably linked. If the time between events follows the Exponential, then the number of events observed in a particular time interval will be Poisson distributed, and vice versa.

2. An article in the New England Journal of Medicine conjectures that Boston was able to cope well on the afternoon of the bombing for a variety of reasons. It helped that Boston is a hospital-dense city – more so than other cities. Further, the bombing happened at 2:50 pm, at the cusp of a shift change; so staff who would have left stayed on and new staff of the next shift joined them. Also, because the marathon is always held on a state holiday, there weren't the same number of elective cases scheduled in the operating rooms, making it easier to accommodate the victims of the bombing. The article is available free at:

3. Larson, Richard C. “OR Forum—Perspectives on Queues: Social Justice and the Psychology of Queueing.” Operations Research 35.6 (1987): 895-905.

4. David Maister, 1985, The psychology of waiting lines: