Probability matching (PM) is a widely observed phenomenon in which subjects match the probability of their choices to the probability of reward in a stochastic context. For instance, suppose one has to choose between two sources of reward: one (A) that gives a reward on 70% of occasions, and the other (B) on 30%. The rational, utility-maximizing strategy is to always choose A. The matching strategy consists in choosing A on 70% of occasions and B on 30% of occasions. While the former leads to a reward 7 times out of 10, the latter will be rewarding only 5.8 times out of 10 [(0.7 × 0.7) + (0.3 × 0.3) = 0.58]. Clearly, the maximizing strategy outperforms the matching strategy.
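The arithmetic above can be checked with a short sketch. The function name `expected_reward` and its parameters are illustrative assumptions, not part of any cited model; the code only assumes that each source pays off independently with a fixed probability on every trial.

```python
def expected_reward(p_choose_a: float, p_reward_a: float, p_reward_b: float) -> float:
    """Expected reward per trial when source A is chosen with probability p_choose_a.

    Assumes (for illustration) that each source rewards independently
    with a fixed probability on every trial.
    """
    return p_choose_a * p_reward_a + (1.0 - p_choose_a) * p_reward_b


P_REWARD_A, P_REWARD_B = 0.7, 0.3

# Maximizing: always choose the richer source A.
maximizing = expected_reward(1.0, P_REWARD_A, P_REWARD_B)

# Matching: choose A as often as A pays off (70% of trials).
matching = expected_reward(P_REWARD_A, P_REWARD_A, P_REWARD_B)

print(f"maximizing: {maximizing:.2f} per trial")
print(f"matching:   {matching:.2f} per trial")
```

Under these assumptions the maximizing strategy yields 0.70 per trial and the matching strategy 0.58, which is the 7 versus 5.8 rewards out of 10 discussed in the text.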
The maximizing strategy, however, is rarely found in the biological world. From bees to birds to humans, most animals match probabilities (Erev & Barron, 2005; C.R. Gallistel, 1990; Greggers & Menzel, 1993; Anil K. Seth, 2001; Vulkan, 2000). In typical experiments with humans, subjects are asked to predict which light will flash (left or right, for instance) and receive a monetary reward for every correct answer. Rats have to forage for food in a T-maze, pigeons press levers that deliver food pellets of different sizes with different probabilities, and bees forage on artificial flowers with different sucrose delivery rates. In all cases, the problem amounts to efficiently maximizing reward from various sources, and the most common solution is PM. (There are variations, but PM reliably predicts subjects’ behavior.) Different probability distributions, rewards or context variations do not alter the results. Hence it is a particularly robust phenomenon, and a clear example of a discrepancy between standards of rationality and agents’ behavior. Three different perspectives could then be adopted: 1) subjects are irrational, 2) subjects are boundedly rational and hence cannot avoid such mistakes, or 3) subjects are in fact ecologically rational and hence PM is not irrational.
According to the first one, mostly held in traditional normative economics and decision theory (e.g., Savage, 1954), this behavior is blatantly irrational. Rational agents rank possible actions according to the product of the probability and utility of the consequences of actions, and they choose those that maximize subjective expected utility. In opting for the matching strategy, subjects violate the axioms of decision theory, and hence their behavior cannot be rationalized. In other words, their preferences cannot be construed as maximizing a utility function: it is “an experimental situation which is essentially of an economic nature in the sense of seeking to achieve a maximum of expected reward, and yet the individual does not in fact, at any point, even in a limit, reach the optimal behavior” (Arrow, 1958, p. 14).
Another perspective, found in the “heuristics and biases” tradition (Kahneman et al., 1982; Kahneman & Tversky, 1979), also considers PM irrational but suggests why this particular pattern is so common. The boundedly rational mind cannot always compute subjective expected utilities, and instead relies on simplifying tricks: heuristics. One heuristic that may explain human shortcomings in this case is representativeness: judging the likelihood of an outcome by the degree to which it is representative of a series. This is how the phenomenon known as the gambler’s fallacy (the belief that an event is more likely to occur because it has not happened for a period of time) may be explained: “there were five heads in a row; there cannot be another one!” This heuristic may also explain why subjects match probabilities: if the 70% source was rewarding in the last round, it seems more likely that trying the 30% source a little would help to maximize reward. Hence PM is irrational, but this irrationality is excusable, albeit without any particular significance.
The third perspective, which could be named either “ecological rationality” or “evolutionary psychology” (Barkow et al., 1992; Cosmides & Tooby, 1996; G. Gigerenzer, 2000; Gerd Gigerenzer et al., 1999), argues instead that humans and animals are not really irrational, but adapted to certain ecological conditions whose absence explains apparent irrationality. Ecologically rational heuristics are not erroneous processes, but mechanisms tailored to fit both the structure of the environment and the mind: they are fast, frugal and smart. PM can be rational in some contexts and irrational in others: when animals are foraging and competing with conspecifics for resources, PM is the optimal strategy, as illustrated by Gigerenzer & Fiedler:
(…) if one considers a natural environment in which animals are not as socially isolated as in a T-maze and in which they compete with one another for food, the situation looks different. Assume that there are a large number of rats and two patches, left and right, with an 80:20 distribution of food. If all animals maximized on an individual level, then they all would end up in the left part, and the few deviating from this rule might have an advantage. Under appropriate conditions, one can show that probability matching is the more rational strategy in a socially competitive environment (G. Gigerenzer & Fiedler, forthcoming)
This pattern of behavior and spatial distribution corresponds to the Ideal Free Distribution (IFD) model used in behavioral ecology (Weber, 1998). Derived from optimal foraging theory (Stephens & Krebs, 1986), the IFD predicts that the distribution of individuals between food patches will match the distribution of resources, a pattern observed on many occasions in animals and humans (Grand, 1997; Harper, 1982; Lamb & Ollason, 1993; Madden et al., 2002; Sokolowski et al., 1999).
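The IFD prediction can be illustrated with a minimal sketch. It is not the model of any of the cited authors: it assumes, for illustration only, that the food arriving in a patch is shared equally among the foragers present, that all foragers are equally competitive, and that moving between patches is free. The function name `ideal_free_distribution` and its parameters are hypothetical.

```python
def ideal_free_distribution(n_foragers, food_rates, max_steps=100_000):
    """Let foragers switch patches one at a time until no one can gain by moving.

    Assumptions (illustrative only): food in a patch is shared equally
    among the foragers present, and moving costs nothing.
    Returns the number of foragers in each patch.
    """
    n_patches = len(food_rates)
    # Everyone starts in the first patch, as if all had "maximized".
    counts = [n_foragers] + [0] * (n_patches - 1)

    for _ in range(max_steps):
        moved = False
        for i in range(n_patches):
            if counts[i] == 0:
                continue
            current_intake = food_rates[i] / counts[i]
            for j in range(n_patches):
                if j == i:
                    continue
                # Intake the forager would get after joining patch j.
                intake_after_move = food_rates[j] / (counts[j] + 1)
                if intake_after_move > current_intake:
                    counts[i] -= 1
                    counts[j] += 1
                    moved = True
                    break
            if moved:
                break
        if not moved:
            break  # no forager can improve its intake: a stable distribution
    return counts


print(ideal_free_distribution(100, [80, 20]))
```

With 100 foragers and the 80:20 food distribution of the rat example, the stable distribution under these assumptions is 80 foragers in the rich patch and 20 in the poor one: the group as a whole matches the resource distribution, and every forager obtains the same intake.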
There are of course discrepancies between the model and the observed behavior, but foraging groups tend to approximate the IFD. This supports the claim that PM is a rational heuristic only in a socially competitive environment: it could also be construed as a mixed-strategy Nash equilibrium in a multiplayer repeated game (Glimcher, 2003, p. 295) or as an evolutionarily stable strategy, that is, a strategy that cannot be invaded by a competing strategy in a population that adopts it (C. R. Gallistel, 2005). Seth’s simulations (in press) showed that a simple behavioral rule may account for both individual and collective matching behavior.