A friend posed the following puzzle:![]()
Suppose one is driving through a town with N motels or gas stations or whatever. N is known. One sees the prices posted. One is not allowed to turn around. The puzzle is this: what's the optimum strategy for finding the best price for a motel or gas? There is an optimum strategy.You might want to think about the problem before going on.I'll expand a few points:
- I said nothing about the gas station prices being distributed in some particular way. Assume they are random. (An actual town might have variations wherein, for example, the last gas station in town advertises itself as such and charges more. For the ideal form of this problem, ignore such real-world issues.)
- The driver, as I said, cannot turn around. It's a one way street. All of the gas stations are on the road.
- "Not getting gas" is not an outcome. If one has not stopped at the first N - 1 gas stations, then stopping at the Nth is possible.
- If one stops at the first (like marrying the first girl one dates), the outcome is not likely to be too good. And if one waits to the Nth, the same is true.
- The optimum lies in between--neither the first nor the last, for N in general. Of course, when N = 1 the solution is trivial. And there's a precise algorithm for maximizing the outcome (getting the best price, all things considered).
Whilst in the process of corresponding with the poser of the puzzle about his notion of "optimum strategy" or "maximizing the outcome", I wrote a program to compare gas-buying strategies according to my notion of optimum--the strategy with the best
expected price-- the price the strategy picks, averaged over possible worlds.SPOILER: Here is where the puzzle poser's answer is given. His answer, a strategy I'll call "hold_out", goes through an "information gathering" phase, looking at N / e prices (about 38% of the prices), and then holds out for a price that is as good as or better than the best seen in the "gathering". If no such price is seen, it takes the last price.
I whomped up a strategy I called "linear_slide", which goes through the same information gathering phase, then slides its buying threshold starting at the best price seen, and moving so that the threshold reaches the average price seen (the running average, so it's really a wiggly slide) by the time it reaches the second-to-the-last price. (If the slide starts at the second-to-the last, use the average as the threshold.) As with hold_out, if you get to the end, take the last price.
These I subjected to trials where the prices are set like this:
for( i = 0; i < n; i++ ) {
prices[i] = (double) rand() / RAND_MAX;
}
...a block-shaped distribution, but the puzzler said the price
distribution doesn't matter.
Calculemus! (Let us calculate.)
Here are some results. The "price" column shows the average price a strategy picked through all the runs. "got min" shows how often it picked the smallest price in the set of n. "got better" shows how often it picked a better price than its competitor did.
n = 3, 10000000 runs (< = minimize, > = maximize, * = winner)
< price > got min > got better
* 4.165912e-01 * 5.000318e-01 * 0.000000e+00 hold_out
* 4.165912e-01 * 5.000318e-01 * 0.000000e+00 linear_slide
n = 4, 7500000 runs (< = minimize, > = maximize, * = winner)
< price > got min > got better
3.748170e-01 * 4.585467e-01 3.140600e-02 hold_out
* 3.644970e-01 4.376440e-01 * 5.192840e-02 linear_slide
n = 6, 5000000 runs (< = minimize, > = maximize, * = winner)
< price > got min > got better
3.331782e-01 * 4.278488e-01 7.340940e-02 hold_out
* 3.006933e-01 3.843516e-01 * 1.302930e-01 linear_slide
n = 10, 3000000 runs (< = minimize, > = maximize, * = winner)
< price > got min > got better
2.749750e-01 * 3.986953e-01 1.293170e-01 hold_out
* 2.187899e-01 3.191723e-01 * 1.997420e-01 linear_slide
n = 30, 1000000 runs (< = minimize, > = maximize, * = winner)
< price > got min > got better
2.246291e-01 * 3.787530e-01 2.595500e-01 hold_out
* 1.104394e-01 2.076570e-01 * 3.266040e-01 linear_slide
n = 100, 300000 runs (< = minimize, > = maximize, * = winner)
< price > got min > got better
1.975867e-01 * 3.719933e-01 * 3.794967e-01 hold_out
* 5.739023e-02 1.224833e-01 3.484133e-01 linear_slide
n = 1000, 30000 runs (< = minimize, > = maximize, * = winner)
< price > got min > got better
1.864774e-01 * 3.665667e-01 * 5.302000e-01 hold_out
* 1.759512e-02 4.113333e-02 3.641667e-01 linear_slide
n = 10000, 3000 runs (< = minimize, > = maximize, * = winner)
< price > got min > got better
1.847001e-01 * 3.760000e-01 * 5.960000e-01 hold_out
* 5.481258e-03 1.733333e-02 3.663333e-01 linear_slide
When n = 3, the two strategies behave identically.
From then on, hold_out always gets the minimum price more often than linear_slide, and as n approaches infinity, it seems to settle on getting the minimum price about 37% of the time, while linear_slide gets it less and less often.
However, starting when n = 4, linear_slide's average price is better than hold_out's. As n approaches infinity, linear_slide's average approaches 0, while hold_out's average approaches .19, because 38% of the time hold_out takes the last price (a random price that averages .5).
The last column is interesting. At first, linear_slide gets a better price than hold_out most of the time, but starting around n = 100, hold_out beats linear_slide more often. (The two numbers in the "got better" column add up to the frequency that the two strategies get different prices.)
To summarize, whenever the strategies act different:
Check out the simulator source code and the text file of the output.
--Steve