UNDERWEIGHTING OF BASE-RATE INFORMATION REFLECTS IMPORTANT
DIFFICULTIES PEOPLE HAVE WITH PROBABILISTIC INFERENCE
psycoloquy.94.5.03.base-rate.7.hamm Sunday 16 January 1994
ISSN 1055-0143 (16 paragraphs, 34 references, 341 lines)
PSYCOLOQUY is sponsored by the American Psychological Association (APA)
Copyright 1994 Robert M. Hamm
Commentary on Koehler on Base-Rate
Robert M. Hamm
Clinical Decision Making Program
Department of Family Medicine
University of Oklahoma Health Sciences Center
Oklahoma City OK 73190 USA
rob-hamm@uokhsc.edu
ABSTRACT: I argue that people do a poor job integrating informative base rates
into their decision processes. This is shown by the results of two sorts of study. First,
in probabilistic inference word problems, people's interpretations of conditional
probabilities are confused. Second, in studies where subjects receive a series of pieces
of information and update their probabilities after each, their probability updating is
inaccurate, reflecting several error-producing processes, including overweighting of most
recent information, which is usually not the base-rate information. We should not ask how
much this matters without considering that experts who make consequential decisions based
on their hypotheses about the state of the world usually follow rule-like scripts, rather
than explicitly revise probabilities.
I. INTRODUCTION
1. Koehler (1993) correctly argues that people do not completely "neglect"
base rate: when asked to judge the probability that an event occurred in a particular
situation, given information about the base rate of the event along with fallible
information pertaining to whether the event occurred, people make some use of the base
rate. Other studies showing some use of base rate that were not mentioned by Koehler
include Ofir (1988) and Hamm (1987).
2. Demonstrating that "base-rate neglect" has been oversold, however, does
not prove that people accurately integrate informative base rates into their decision
processes. In this commentary I argue that people are indeed inaccurate when asked to
revise hypothesis probabilities on the basis of evidence (Sections II and III). This is
true when experts make real decisions, as well as when novices make hypothetical
decisions. How do people reason in realistic situations that demand probability revision?
In Section IV I consider the implications for optimal decision making of the theory that
people follow "mental scripts."
II. WHAT ARE PEOPLE DOING WHEN THEY NEGLECT BASE RATE?
3. Koehler shows that "base-rate neglect" is not an empirically correct
description of what people do when given probabilistic inference word problems because
their responses are affected by the base rate. I would add that "neglect" is not
the actual psychological process. The term implies a flaw in an attentional process, so
that insufficient weight is given to base-rate information. However, the largest part of
the error in these word problems is due to people's lack of understanding of the meaning
of the conditional probabilities in the problem.
4. The problems offer a piece of evidence (e.g., in the Blue/Green Cab problem, a
witness reported that the cab involved in the night accident was blue: evidence e =
"blue"), a base rate (only 15% of the cabs in the city are blue: prior p(blue) =
.15), and a measure of the fallibility or reliability of the evidence (the witness, in
similar conditions, was right 80% of the time: p(e/h) = p("blue"/blue) =
p("green"/green) = .80). Then the question is asked ("What is the
probability that the cab involved in the accident was blue, p(h/e) or
p(blue/"blue")?"). A subject who does not distinguish the conditional probabilities
p(e/h) [the reliability or fallibility of the evidence] and p(h/e) [the desired answer]
may offer p(e/h) as the response. Indeed, in Bar-Hillel's (1980) histogram of responses,
the value for p(e/h), .80, was the most frequent answer. This was observed again with
several word problems by Hamm (1987, 1989).
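
The gap between the two conditional probabilities in the Blue/Green Cab problem can be made concrete with a short calculation. The following is a minimal sketch, assuming the figures given in paragraph 4 (base rate .15, witness accuracy .80 for both colors); the function and variable names are illustrative only.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """Bayes' Theorem for a binary hypothesis: returns p(h/e)."""
    # p(e) = p(e/h) p(h) + p(e/not h) p(not h)
    p_e = hit_rate * prior + false_alarm_rate * (1 - prior)
    return hit_rate * prior / p_e

prior_blue = 0.15            # base rate: p(blue)
p_say_blue_if_blue = 0.80    # reliability: p("blue"/blue)
p_say_blue_if_green = 0.20   # error rate: 1 - p("green"/green)

p_blue_given_report = posterior(prior_blue, p_say_blue_if_blue, p_say_blue_if_green)
print(round(p_blue_given_report, 2))  # 0.41
```

The Bayesian answer p(blue/"blue") is about .41, roughly half of the .80 that a subject gives when p(e/h) is offered in place of p(h/e).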
5. People's difficulty interpreting these conditional probabilities has been offered as
an explanation for their errors on probabilistic inference word problems by Dawes (1986)
and Dawes, Mirels, Gold, and Donahue (1993). Hamm and Miller (1988) showed, with several
word problems, including the Blue/Green Cab problem, that the pattern of responses
differed little whether the text of the problem offered p(e/h) or p(h/e) as the
information regarding the fallibility of the evidence. Analysis of subjects' verbal
protocols showed little association between the specific conditional probability concept
written in the word problem and the concept used in their thinking (Hamm and Miller,
1988).
6. Eddy (1982) noted that this same confusion occurs in the professional writings of
medical doctors. A more recent example in the medical research literature is an error by
Bernstein, Rudolph, Pinto, Viner, and Zuckerman (1990), who reversed the sensitivity
p(Test/Disease) and the positive predictive value p(Disease/Test) in interpreting their
own data table, putting misleading values into the literature for subsequent
meta-analyses. Penney (1992) too has demonstrated the inability of students in statistics
classes to interpret the conditional probabilities in the 2 by 2 table relating evidence
and hypothesis.
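
The distinction between sensitivity and positive predictive value can be read directly off a 2 by 2 table. The sketch below uses hypothetical counts chosen only for illustration; they are not taken from Bernstein et al. (1990) or any other cited study.

```python
# Hypothetical 2 by 2 table (illustrative counts, not from any cited study).
#                 Test positive   Test negative
# Disease               90              10
# No disease           180             720
true_pos, false_neg = 90, 10
false_pos, true_neg = 180, 720

# Sensitivity, p(Test/Disease): condition on the disease row.
sensitivity = true_pos / (true_pos + false_neg)

# Positive predictive value, p(Disease/Test): condition on the positive-test column.
ppv = true_pos / (true_pos + false_pos)

print(f"sensitivity = {sensitivity:.2f}, PPV = {ppv:.2f}")
```

With these assumed counts the sensitivity is .90 while the positive predictive value is only about .33, so reporting one in place of the other would be a substantial error.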
7. A demonstration that people confuse these conditional probabilities does not fully
explain how people think about probabilistic inference. It is not simply that they apply Bayes'
Theorem correctly with the sole exception that they mistake the one conditional
probability for the other (Pollatsek, Well, Konold, Hardiman, and Cobb, 1987; Hamm, 1987;
Hamm, 1993).
III. RESULTS FROM THE PROBABILITY UPDATING RESEARCH
8. Further evidence on what people do when given fallible evidence pertinent to a
hypothesis comes from those studies on probability updating in which a sequence of
information is given and the subject revises p(h) after each piece. Early work in this
paradigm, reviewed by Edwards (1968), most often showed conservatism (overweighting of
base rate) as Ayton (1993) noted.
9. For the base-rate neglect question, the important finding from these studies (see
also Hogarth and Einhorn, 1992, and Robinson and Hastie, 1985) is that the order in which
people get the information makes a difference. Although normatively the order of presentation
should not affect the final probability, subjects usually put greater weight on the most
recently received information (Adelman, Tolcott, and Bresnick, 1993, with military
intelligence experts dealing with realistic military intelligence problems; Tubbs, Gaeth,
Levin, and Van Osdol, 1993, with college students on everyday problems such as
troubleshooting a stereo; Chapman, Bergus, Gjerde, and Elstein, 1993, with medical doctors
on a realistic diagnosis problem). In more ambiguous situations the first impression had a
lasting effect (Tolcott, Marvin, and Lehner, 1989).
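
The normative point, that order should not matter, can be checked numerically. This is a minimal sketch, assuming that the pieces of evidence are conditionally independent given the hypothesis and that each can be summarized by a likelihood ratio; the three likelihood ratios are hypothetical values chosen for illustration.

```python
from itertools import permutations

def update(p, likelihood_ratio):
    """One Bayesian revision step in odds form: posterior odds = prior odds * LR."""
    odds = p / (1 - p) * likelihood_ratio
    return odds / (1 + odds)

prior = 0.15
likelihood_ratios = [4.0, 0.5, 3.0]  # hypothetical evidence strengths

finals = []
for order in permutations(likelihood_ratios):
    p = prior
    for lr in order:
        p = update(p, lr)
    finals.append(p)

# Every presentation order ends at the same probability (up to rounding error).
spread = max(finals) - min(finals)
print(spread < 1e-12)
```

Because the posterior odds are just the prior odds multiplied by all of the likelihood ratios, any order effect observed in subjects' estimates is a departure from the Bayesian standard.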
10. Because every probability adjustment involves balancing prior probability or base
rate with the implications of the new evidence, any inappropriate use of the most recent
information implies, indirectly, that the base rate has been inappropriately used too.
Hamm (1987) included the base-rate information in the sequence and found direct evidence
that it had more influence upon subjects' final probability estimates when it was
presented last. In sum, the results from this second paradigm show that people have a more
fundamental problem with probabilistic inference than mere neglect of base rate or
confusion of conditional probabilities.
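
One way to see how a recency effect changes the influence of the base rate is a deliberately simplified averaging rule. The sketch below is an assumption made purely for illustration: it is not the belief-adjustment model of Hogarth and Einhorn (1992), nor a model fitted to the data of Hamm (1987), and the weight of .6 on the newest item is arbitrary.

```python
def recency_weighted_estimate(items, weight_on_newest=0.6):
    """Hypothetical averaging rule: each new item pulls the running estimate
    toward itself with a fixed weight, so later items count for more."""
    estimate = items[0]
    for value in items[1:]:
        estimate = (1 - weight_on_newest) * estimate + weight_on_newest * value
    return estimate

base_rate, witness = 0.15, 0.80  # Blue/Green Cab figures from paragraph 4

base_rate_first = recency_weighted_estimate([base_rate, witness])
base_rate_last = recency_weighted_estimate([witness, base_rate])

print(round(base_rate_first, 2), round(base_rate_last, 2))
```

Under this assumed rule the final estimate lies closer to the base rate when the base rate is presented last, which is the qualitative pattern described above; a Bayesian reviser would give the same answer in both orders.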
IV. EXPERTS FOLLOW SCRIPTS: IMPLICATIONS FOR PROBABILITY REVISION
11. Does it matter that people cannot accurately revise numerical probabilities
(Christensen-Szalanski, 1986)? The deeper study of what people actually do, as called for
by Koehler, can provide perspective. What do doctors do, for example, when ideally they
should be forming hypotheses and revising hypothesis probabilities as they gather
evidence?
12. It is not that they do a numerical integration more complex than Bayes' Theorem to
revise probabilities (Gregson, 1993), as Hamm's (1987) explorations show. Doctors thinking
aloud about cases don't even speak explicitly of probabilities (Kuipers, Moskowitz, and
Kassirer, 1988), though when they are induced to do so it improves their decisions (Pozen,
D'Agostino, Selker, Sytkowski, and Hood, 1984; Carter, Butler, Rogers, and Holloway,
1993).
13. Nor do doctors rely exclusively on learning probabilities from experience, like
rats learning the contingencies on a lever (Spellman, 1993). While some of their knowledge
is based on this kind of experience (Christensen-Szalanski and Beach, 1982; Christensen-
Szalanski and Bushyhead, 1981), doctors have to know what to do with both the common
diagnoses (8 out of 10) and the rare ones (1 in 10,000). Though in some situations, where
people experience an event repeatedly, they can implicitly learn a base rate, in other
situations, where people do not experience an event repeatedly but rather learn about it
abstractly, they may also be able to take account of a base rate; but if they cannot,
the consequences may be important.
14. How, then, do doctors usually handle diagnostic problems? Experts generally
organize their extensive knowledge into mental scripts (Schmidt, Norman, and Boshuizen,
1990), complex rules that function with the speed of recognition to provide responses for
familiar and unfamiliar situations. Explicit calculation of Bayesian probabilities is not
a strength of this type of rule (cf. Hamm, 1993). Instead, experts' accuracy may be a
function of the recognition processes, which can bring ideas to mind optimally (Anderson
and Milson, 1989). Or accuracy may be due to well-tuned judgment processes governing
response choice (Chapter 8 of Abernathy and Hamm, 1994).
15. If doctors' scripts are used accurately, producing results similar to those that
wise use of Bayes' Theorem would produce, this is due not only to
experience but also to reflection and to others' criticism (Chapter 11 of Abernathy and
Hamm, 1994). Any form of argument can be applied toward justifying a change in a script,
including arguments based on probabilistic analysis.
16. For example, when the screening tests for HIV first came out, Meyer and Pauker
(1987) warned against ignoring the base rate, i.e., against assuming that someone with no
risk factors has AIDS if their screen is positive for AIDS. Guided by such explicit
discussion of the probabilities, and by individual cases of people devastated by false
positive HIV screens, doctors' shared scripts were adjusted until now they don't recommend
that patients be screened unless there are risk factors. The "1993 script"
produces behavior that is, for the most part, consistent with a Bayesian analysis.
Individual doctors using the script need neither think about probabilities nor understand
the Bayesian principles. They just think of the rules, or of cases in which the script is
implicit (Riesbeck and Schank, 1989). Note, of course, that this scenario depends on there
being someone who understands the probabilistic principles and can shape the script that
everyone else will use.
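
The Bayesian analysis behind such a screening script can be sketched as follows. All numbers here are hypothetical and chosen only for illustration; they are not the figures used by Meyer and Pauker (1987), and the names are likewise illustrative.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """p(infected / positive screen) by Bayes' Theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Hypothetical test characteristics (assumed for illustration).
sensitivity, specificity = 0.99, 0.995

# Hypothetical base rates: no risk factors versus known risk factors.
ppv_low_risk = positive_predictive_value(0.0001, sensitivity, specificity)
ppv_high_risk = positive_predictive_value(0.05, sensitivity, specificity)

print(f"low risk: {ppv_low_risk:.3f}, high risk: {ppv_high_risk:.3f}")
```

With these assumed values a positive screen in a person without risk factors implies infection with probability of only about .02, against roughly .91 for a person from the higher-risk group. A rule of the form "screen only when risk factors are present" encodes that difference without requiring the individual doctor to compute it.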
REFERENCES
Abernathy, C.M., and Hamm, R.M. (in press, 1994). Surgical Intuition. Philadelphia, PA:
Hanley and Belfus.
Adelman, L., Tolcott, M.A., and Bresnick, T.A. (1993). Examining the Effect of
Information Order on Expert Judgment. Organizational Behavior and Human Decision
Processes, 56, 348-369.
Anderson, J.R., and Milson, R. (1989). Human Memory: an Adaptive Perspective.
Psychological Review, 96, 703-719.
Ayton, P. (1993). Base Rate Neglect: an Inside View of Judgment? Commentary on Koehler
on Base-rate. PSYCOLOQUY 4(63) base-rate.5.ayton.
Bar-Hillel, M. (1980). The Base-rate Fallacy in Probability Judgments. Acta
Psychologica, 44, 211-233.
Bernstein, L.H., Rudolph, R.A., Pinto, M.M., Viner, N., and Zuckerman, H. (1990).
Medically Significant Concentrations of Prostate-specific Antigen in Serum Assessed.
Clinical Chemistry, 36, 515-518.
Carter, B.L., Butler, C.D., Rogers, J.C., and Holloway, R.L. (1993). Evaluation of
Physician Decision Making With the Use of Prior Probabilities and a Decision-analysis
Model. Archives of Family Medicine, 2, 529-534.
Chapman, G.B., Bergus, G.R., Gjerde, C., and Elstein, A.S. (1993). Sources of Error in
Reasoning about a Clinical Case: Clinicians as Intuitive Statisticians (Meeting Abstract).
Medical Decision Making, 13, 382.
Christensen-Szalanski, J.J.J. (1986). Improving the Practical Utility of Judgment
Research. In B. Brehmer, H. Jungermann, P. Lourens, and G. Sevon (Eds.), New Directions in
Research on Decision Making (pp. 383-410). North Holland: Elsevier Science Publishers B.V.
Christensen-Szalanski, J.J.J., and Beach, L.R. (1982). Experience and the Base-rate
Fallacy. Organizational Behavior and Human Performance, 29, 270-278.
Christensen-Szalanski, J.J.J., and Bushyhead, J.B. (1981). Physicians' Use of
Probabilistic Information in a Real Clinical Setting. Journal of Experimental Psychology:
Human Perception and Performance, 7, 928-935.
Dawes, R.M. (1986). Representative Thinking in Clinical Judgment. Clinical Psychology
Review, 6, 425-441.
Dawes, R.M., Mirels, H.L., Gold, E., and Donahue, E. (1993). Equating Inverse
Probabilities in Implicit Personality Judgments. Psychological Science, 4, 396-400.
Eddy, D.M. (1982). Probabilistic Reasoning in Clinical Medicine: Problems and
Opportunities. In D. Kahneman, P. Slovic, and A. Tversky (Eds.), Judgment Under
Uncertainty: Heuristics and Biases (pp. 249-267). Cambridge: Cambridge University Press.
Edwards, W. (1968). Conservatism in Human Information Processing. In B. Kleinmuntz
(Ed.), Formal Representation of Human Judgment (pp. 17-52). New York: Wiley.
Gregson, R.A.M. (1993). Which Bayesian Theorem Could Be Compared With Real Behavior?
Commentary on Koehler on Base-rate. PSYCOLOQUY 4(50) base-rate.2.gregson.
Hamm, R.M. (1987). Diagnostic Inference: People's Use of Information in Incomplete
Bayesian Word Problems. (Publication No. 87-11). Boulder, CO: Institute of Cognitive
Science, University of Colorado.
Hamm, R.M. (1989). People Misinterpret Conditional Probabilities: Final Report of
Project Using Protocol Analysis and Process Tracing Techniques to Investigate
Probabilistic Inference (Publication No. 89-4). Boulder, CO: Institute of Cognitive
Science, University of Colorado.
Hamm, R.M. (1993). Explanations for Common Responses to the Blue/Green Cab
Probabilistic Inference Word Problem. Psychological Reports, 72, 219-242.
Hamm, R.M., and Miller, M.A. (1988). Interpretation of Conditional Probabilities in
Probabilistic Inference Word Problems. (Publication No. 88-15). Boulder, CO: Institute of
Cognitive Science, University of Colorado.
Hogarth, R.M., and Einhorn, H.J. (1992). Order Effects in Belief Updating: The
Belief-adjustment Model. Cognitive Psychology, 24, 1-55.
Koehler, J.J. (1993). The Base Rate Fallacy Myth. PSYCOLOQUY 4(49) base-rate.1.koehler.
Kuipers, B., Moskowitz, A.J., and Kassirer, J.P. (1988). Critical Decisions Under
Uncertainty: Representation and Structure. Cognitive Science, 12, 177-210.
Meyer, K.B., and Pauker, S.G. (1987). Screening for HIV: Can We Afford the False
Positive Rate? New England Journal of Medicine, 317, 238-241.
Ofir, C. (1988). Pseudodiagnosticity in Judgment Under Uncertainty. Organizational
Behavior and Human Decision Processes, 42, 343-363.
Penney, C.G. (1992). Why Can't My Students Understand Conditional Probability? Paper
presented at annual meetings of Psychonomics Society, St. Louis.
Pollatsek, A., Well, A.D., Konold, C., Hardiman, P., and Cobb, G. (1987). Understanding
Conditional Probabilities. Organizational Behavior and Human Decision Processes, 40,
255-269.
Pozen, M.W., D'Agostino, R.B., Selker, H.P., Sytkowski, P.A., and Hood, W.B., Jr.
(1984). A Predictive Instrument to Improve Coronary-care-unit Admission Practices in Acute
Ischemic Heart Disease: A Prospective Multicenter Clinical Trial. New England Journal of
Medicine, 310, 1273-1278.
Riesbeck, C.K., and Schank, R.C. (1989). Inside Case-based Reasoning. Hillsdale, NJ:
Lawrence Erlbaum Associates, Publishers.
Robinson, L.B., and Hastie, R. (1985). Revision of Beliefs When a Hypothesis Is
Eliminated From Consideration. Journal of Experimental Psychology: Human Perception and
Performance, 11, 443-456.
Schmidt, H.G., Norman, G.R., and Boshuizen, H.P.A. (1990). A Cognitive Perspective on
Medical Expertise: Theory and Implications. Academic Medicine, 65, 611-621.
Spellman, B.A. (1993). Implicit Learning of Base Rates: Commentary on Koehler on
Base-rate. PSYCOLOQUY 4(61) base-rate.4.spellman.
Tolcott, M.A., Marvin, F.F., and Lehner, P.E. (1989). Expert Decision Making in
Evolving Situations. IEEE Transactions on Systems, Man, and Cybernetics, 19, 606-615.
Tubbs, R.M., Gaeth, G.J., Levin, I.P., and Van Osdol, L.A. (1993). Order Effects in
Belief Updating with Consistent and Inconsistent Evidence. Journal of Behavioral Decision
Making, 6, 257-269.