March: Explore, Exploit
March, James G. 'Exploration and Exploitation in Organizational Learning'. Organization Science, vol. 2, no. 1, February 1991.
Adaptive systems that engage in exploration to the exclusion of exploitation are likely to find that they suffer the costs of experimentation without gaining many of its benefits. They exhibit too many undeveloped new ideas and too little distinct competence.
Conversely, systems that engage in exploitation to the exclusion of exploration are likely to find themselves trapped in suboptimal stable equilibria.
The implicit choices [between exploration and exploitation] are buried in many features of organizational forms and customs, for example, in organizational procedures for accumulating and reducing slack, in search rules and practices, in the ways in which targets are set and changed, and in incentive systems.
Understanding the choices and improving the balance between exploration and exploitation are complicated by the fact that returns from the two options vary not only with respect to their expected values, but also with respect to their variability, their timing, and their distribution within and beyond the organization.
The certainty, speed, proximity, and clarity of feedback ties exploitation to its consequences more quickly and more precisely than is the case with exploration. ... The effects extend, through network externalities, to others with whom the learning organization interacts. Reason inhibits foolishness; learning and imitation inhibit experimentation. This is not an accident but is a consequence of the temporal and spatial proximity of the effects of exploitation, as well as their precision and interconnectedness. ... Positive local feedback produces strong path dependence and can lead to suboptimal equilibria. It is quite possible for competence in an inferior activity to become great enough to exclude superior activities with which an organization has little experience. Since long-run intelligence depends on sustaining a reasonable level of exploration, these tendencies to increase exploitation and reduce exploration make adaptive processes potentially self-destructive.
Organizations store knowledge in their procedures, norms, rules, and forms. They accumulate such knowledge over time, learning from their members. At the same time, individuals in an organization are socialized to organizational beliefs. ... Organizational knowledge and faiths are diffused to individuals through various forms of instruction, indoctrination, and exemplification. An organization socializes recruits to the languages, beliefs, and practices that comprise the organizational code. Simultaneously, the organizational code is adapting to individual beliefs.
By far the highest equilibrium knowledge occurs when the code learns rapidly from individuals whose socialization to the code is slow.
... it was shown that slower learning allows for greater exploration of possible alternatives and greater balance in the development of specialized competences. ... Slow learning on the part of individuals maintains diversity longer, thereby providing the exploration that allows the knowledge found in the organizational code to improve.
... if the [individual's rate of learning] is high, moderate amounts of turnover improve the organizational code. Rapid socialization of individuals into the procedures and beliefs of an organization tends to reduce exploration. A modest level of turnover, by introducing less socialized people, increases exploration, and thereby improves aggregate knowledge. The level of knowledge reflected by the organizational code is increased, as is the average individual knowledge of those individuals who have been in the organization for some time. Note that this effect does not come from the superior knowledge of the average new recruit. Recruits are, on average, less knowledgeable than the individuals they replace. The gains come from their diversity.
Contributions to improving the code (and subsequently individual knowledge) come from the occasional newcomers who deviate from the code in a favorable way. Old-timers, on average, know more, but what they know is redundant with knowledge already reflected in the code. They are less likely to contribute new knowledge on the margin. Novices know less on average, but what they know is less redundant with the code and occasionally better, thus more likely to contribute to improving the code.
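The mutual-learning dynamics quoted above (a code that learns from individuals, individuals who are socialized to the code, and turnover that brings in unsocialized recruits) can be explored with a small simulation. What follows is a minimal sketch loosely modeled on the simulation in March's paper, not a reproduction of it. The parameter names (`m` belief dimensions, `n` individuals, `p1` socialization rate, `p2` code learning rate, `p3` turnover rate), the encoding of beliefs as -1, 0, or +1, and the simplified majority rule for updating the code are all assumptions made for illustration.

```python
import random


def knowledge(beliefs, reality):
    """Fraction of dimensions on which beliefs match reality (0 means no belief)."""
    return sum(b == r for b, r in zip(beliefs, reality)) / len(reality)


def simulate(m=30, n=50, p1=0.5, p2=0.5, p3=0.0, periods=100, seed=0):
    """Return (code knowledge, mean individual knowledge) after `periods` steps.

    Assumed parameters (illustrative, not taken from the excerpts):
      p1: chance an individual adopts the code's belief on a dimension
      p2: per-vote chance the code adopts the belief of its better members
      p3: chance per period that an individual is replaced by a random recruit
    """
    rng = random.Random(seed)
    reality = [rng.choice((-1, 1)) for _ in range(m)]
    code = [0] * m  # the code starts with no beliefs
    people = [[rng.choice((-1, 0, 1)) for _ in range(m)] for _ in range(n)]

    for _ in range(periods):
        # Turnover: recruits arrive with random, unsocialized beliefs.
        for i in range(n):
            if rng.random() < p3:
                people[i] = [rng.choice((-1, 0, 1)) for _ in range(m)]

        # Snapshot the start-of-period state so both updates use the same inputs.
        old_code = list(code)
        code_score = knowledge(old_code, reality)
        superior = [list(p) for p in people if knowledge(p, reality) > code_score]

        # Socialization: individuals drift toward the code's non-zero beliefs.
        for person in people:
            for d in range(m):
                if old_code[d] != 0 and person[d] != old_code[d]:
                    if rng.random() < p1:
                        person[d] = old_code[d]

        # Code learning: the code moves toward the net majority belief of the
        # individuals who currently know more than it does (a simplification).
        for d in range(m):
            votes = sum(p[d] for p in superior)
            if votes == 0:
                continue
            majority = 1 if votes > 0 else -1
            if old_code[d] != majority:
                if rng.random() < 1 - (1 - p2) ** abs(votes):
                    code[d] = majority

    mean_individual = sum(knowledge(p, reality) for p in people) / n
    return knowledge(code, reality), mean_individual


if __name__ == "__main__":
    for p1 in (0.1, 0.5, 0.9):
        for p3 in (0.0, 0.1):
            code_k, indiv_k = simulate(p1=p1, p3=p3)
            print(f"p1={p1} p3={p3} code={code_k:.2f} individuals={indiv_k:.2f}")
```

A single run is noisy, so any comparison between fast and slow socialization, or between zero and modest turnover, should average over many seeds before being read as support for or against the quoted claims.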
... understanding the world may be complicated by turbulence in the world. Exogenous environmental change makes adaptation essential, but it also makes learning from experience difficult.
... mutual learning has a dramatic long-run degenerate property under conditions of exogenous turbulence. As the beliefs of individuals and the code converge, the possibilities for improvement in either decline. Once a knowledge equilibrium is achieved, it is sustained indefinitely. The beliefs reflected in the code and those held by all individuals remain identical and unchanging, regardless of changes in reality. Even before equilibrium is achieved, the capabilities for change fall below the rate of change in the environment. As a result, after an initial period of increasing accuracy, the knowledge of the code and individuals is systematically degraded through changes in reality.
Replacing departing individuals with recruits closer to the current organizational code would significantly reduce the efficiency of turnover as a source of exploration.
... where there is turbulence, there is considerable individual advantage to having tenure in an organization that has turnover.
Where returns to one competitor are not strictly determined by that competitor's own performance but depend on the relative standings of the competitors, returns to changes in knowledge depend not only on the magnitude of changes in the expected value but also on *changes in variability* and on the number of competitors.
As the number of competitors increases, the contribution of the variance to competitive advantage increases...
... if learning increases both the mean and the variance of a normal performance distribution, it will improve the competitive advantage in a competition for primacy. The model also suggests that increases in variance may compensate for decreases in the mean; decreases in the variance may nullify gains from increases in the mean. These variance effects are particularly significant when the number of competitors is large.
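The variance argument above can be checked numerically. The sketch below estimates, by Monte Carlo, the chance that one focal competitor finishes first against a field of rivals. It assumes that every rival draws performance from a standard normal distribution and that the focal competitor draws from a normal distribution with its own mean and standard deviation. The function name `p_first` and its parameters are hypothetical and chosen only for this illustration.

```python
import random


def p_first(mean, sd, n_rivals, trials=5000, seed=0):
    """Estimate the probability that the focal competitor beats every rival.

    Assumptions (illustrative): rivals are independent N(0, 1) draws, the
    focal competitor is an independent N(mean, sd) draw, and n_rivals >= 1.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        focal = rng.gauss(mean, sd)
        best_rival = max(rng.gauss(0.0, 1.0) for _ in range(n_rivals))
        if focal > best_rival:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # Compare a higher-mean, lower-variance profile with a lower-mean,
    # higher-variance one as the field of rivals grows.
    for n_rivals in (1, 5, 50):
        reliable = p_first(mean=0.5, sd=0.5, n_rivals=n_rivals)
        risky = p_first(mean=-0.5, sd=2.0, n_rivals=n_rivals)
        print(f"rivals={n_rivals} reliable={reliable:.3f} risky={risky:.3f}")
```

With a single rival the mean dominates, while against a large field the estimate is driven mostly by the right-hand tail, which is the pattern the passage describes.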
Learning processes do not necessarily lead to increases in both average performance and variation, however. Increased knowledge seems often to reduce the variability of performance rather than to increase it. Knowledge makes performance more reliable. As work is standardized, as techniques are learned, variability, both in the time required to accomplish tasks and in the quality of task performance, is reduced. Insofar as that increase [in] reliability comes from a reduction in the left-hand tail, the likelihood of finishing last in a competition among many is reduced without changing the likelihood of finishing first. However, if knowledge has the effect of reducing the right-hand tail of the distribution, it may easily decrease the chance of being best among several competitors even though it also increases average performance. The question is whether you can do exceptionally well, as opposed to better than average, without leaving the confines of conventional action.
... multiple, independent projects may have an advantage over a single, coordinated effort. The average result from independent projects is likely to be lower than that realized from a coordinated one, but their right-hand side variability can compensate for the reduced mean in a competition for primacy. The argument can be extended more generally to the effects of close collaboration or cooperative information exchange. Organizations that develop effective instruments of coordination and communication probably can be expected to do better (on average) than those that are more loosely coupled, and they also probably can be expected to become more reliable, less likely to deviate significantly from the mean of their performance distributions. The price of reliability, however, is a smaller chance of primacy among competitors.
Michael Polanyi, commenting on one of his contributions to physics, observed (Polanyi 1963, p. 1013) that "I would never have conceived my theory, let alone have made a great effort to verify it, if I had been more familiar with major developments in physics that were taking place. Moreover, my initial ignorance of the powerful, false objections that were raised against my ideas protected those ideas from being nipped in the bud."