Etiquette



DP Etiquette

First rule: Don't be a jackass.

Other rules: Do not attack or insult people you disagree with. Engage with facts, logic and beliefs. Out of respect for others, please provide some sources for the facts and truths you rely on if you are asked for that. If emotion is getting out of hand, get it back in hand. To limit dehumanizing people, don't call people or whole groups of people disrespectful names, e.g., stupid, dumb or liar. Insulting people is counterproductive to rational discussion. Insult makes people angry and defensive. All points of view are welcome, right, center, left and elsewhere. Just disagree, but don't be belligerent or reject inconvenient facts, truths or defensible reasoning.

Wednesday, January 13, 2016

Rationalizing political policy-making

Politics is at least as irrational as it is rational — probably more irrational than not. That fact is supported by solid evidence. Simply listening to politicians on both sides makes it clear that the two endlessly warring sides see very different realities and facts in almost every issue they deal with. The two sides apply very different kinds of logic or common sense to what they think they see and they routinely wind up supporting policies that are mutually exclusive.

One side or both can be more right than wrong about any given issue. However, it’s very hard to imagine both sides being mostly right about disputed issues, but easy to see that they can both be more wrong than right, assuming there is an objective (not personal or subjective) measure of right and wrong (there isn’t).

A lot like religion
That’s just simple logic. Given the lack of definitions for even basic concepts, e.g., the public interest or what is constitutional and what isn’t, partisan liberal vs. conservative disputes can be seen as akin to religious disputes. People have debated for millennia about which God is the “real” God or what the real God’s words really mean. In religious disputes, there is no evidence and no agreed definition of the terms of debate. That makes religious disagreements unresolvable and pointless unless the combatants just happen to decide to agree on something.

Political disagreements are a lot like that. They are usually based on (i) little or no evidence and (ii) subjective personal perceptions of reality and personal ideology or morals. That makes most political disputes unresolvable and pointless. Americans have been bickering for centuries about what the Founding Fathers would have wanted or done about most everything. Those disputes will continue for centuries.

Evidence from academic research of innate, unconscious human irrationality about politics is overwhelming. Humans see and think about the world and issues through a lens of personal ideology or morals and unconscious biases. Unfortunately, personal lenses are powerful fact and logic distorters. When experts are carefully scrutinized and evaluated, their ability to foresee future events is poor, about the same as random guessing. Some people are exceptions and have real talent, but for the most part expert predictions of future events and policy outcomes are useless.

An easy solution . . . .
Happily, there is a simple way to inject more objectivity and rationality into politics. It amounts to consciously gathering and analyzing data to test policy choices to see how well they do once implemented. That has been suggested from time to time in various contexts, e.g., as an experiment in states’ rights or as policy-making modeled on the randomized controlled trials that are used in medicine to test the safety and efficacy of a new drug or clinical treatment protocol. It really is that simple: just collect data, analyze it, and use comparison groups and/or policy variants when it is feasible to do so. Politics can be made more rational if there is a will to do it.
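
As a rough illustration of what “collect data and use comparison groups” can look like in practice, here is a minimal sketch in Python. The policy, the group sizes and the outcome counts are all invented for illustration; the two-proportion z-test is one standard way to compare a treated group with a control group, not something prescribed by the sources cited here.

```python
import math

def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two outcome proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_a, p_b, z, p_value

# Hypothetical pilot: a job-training policy rolled out to a random half of applicants.
# All counts below are invented for illustration only.
treated_employed, treated_n = 230, 500
control_employed, control_n = 190, 500

p_t, p_c, z, p = two_proportion_ztest(treated_employed, treated_n,
                                      control_employed, control_n)
print(f"treated rate {p_t:.1%}, control rate {p_c:.1%}, z = {z:.2f}, p = {p:.3f}")
```

The point is only that the comparison is mechanical once the data exist; the hard part, as the next section argues, is political will, not statistics.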

. . . that cannot be implemented
Sadly, the easy solution is impossible to implement under America’s deranged two-party system and its corrupt, incompetent brand of partisan politics. In a recent article advocating a randomized controlled trial (RCT) approach to policy-making, The Economist articulated the implementation problem:
“The electoral cycle is one reason politicians shun RCTs. Rigorous evaluation of a new policy often takes years; reformers want results before the next election. Most politicians are already convinced of the wisdom of their plans and see little point in spending time and money to be proved right. Sometimes they may not care whether a policy works, as long as they are seen to be doing something.”
Evidence from social science research is clear that politicians and experts who are convinced of their own wisdom are far more likely to be wrong than right most of the time, if not always. Finding a solution to that little self-delusion conundrum is a necessary prelude to implementing the obvious, simple solution.

Friday, January 1, 2016

Superforecasting: Book review



Book Review
Superforecasting: The Art & Science of Prediction

What most accurately describes the essence of intelligent, objective, public service-oriented politics? Is it primarily an honest competition among the dominant ideologies of our times, a self-interested quest for influence and power, or a combination of the two? Does it boil down to understanding the biological functioning of the human mind and how it sees and thinks about the world? Or is it something else entirely?

Turns out, it isn’t even close. Superforecasting comes down squarely on the side of getting the biology right. Everything else is a distant second.

Superforecasting: The Art & Science of Prediction, written by Philip E. Tetlock and Dan Gardner (Crown Publishers, September 2015), describes Tetlock’s ongoing research into what factors, if any, contribute to a person’s ability to predict the future. In Superforecasting, Tetlock asks how well average but intellectually engaged people can do compared to experts, including professional national security analysts with access to classified information. What Tetlock and his team found was that the interplay between dominant, unconscious, distortion-prone intuitive human cognitive processes (“System 1” or the “elephant” as described before) and less-influential but conscious, rational processes (“System 2” or the “rider”) was a key factor in how well people predicted future events.

Tetlock observes that a “defining feature of intuitive judgment is its insensitivity to the quality of the evidence on which the judgment is based. It has to be that way. System 1 can only do its job of delivering strong conclusions at lightning speed if it never pauses to wonder whether the evidence at hand is flawed or inadequate, or if there is better evidence elsewhere. . . . . we are creative confabulators hardwired to invent stories that impose coherence on the world.”

It turns out that with minimal training and the right mindset, some people, “superforecasters”, routinely trounce the experts. In a 4-year study, the “Good Judgment Project”, funded by the U.S. intelligence community’s Intelligence Advanced Research Projects Activity (IARPA), about 2,800 volunteers made over a million predictions on topics that ranged from potential conflicts between countries to currency fluctuations. Those predictions had to be, and were, precise enough to be analyzed and scored.

About 1% of the 2,800 volunteers turned out to be superforecasters who beat national security analysts by about 30% at the end of the first year. One even beat commodities futures markets by 40%. The superforecaster volunteers did whatever they could to get information, but they nonetheless beat professional analysts who were backed by computers and programmers, spies, spy satellites, drones, informants, databases, newspapers, books and whatever else that lots of money can buy. As Tetlock put it, “. . . . these superforecasters are amateurs forecasting global events in their spare time with whatever information they can dig up. Yet they somehow managed to set the performance bar high enough that even the professionals have struggled to get over it, let alone clear it with enough room to justify their offices, salaries and pensions.”

What makes them so good?
The top 1-2% of volunteers were carefully assessed for personal traits. In general, superforecasters tended to be people who were eclectic about collecting information and open minded in their world view. They were also able to step outside of themselves and look at problems from an “outside view.” To do that they searched out and aggregated other perspectives, which goes counter to the human tendency to seek out only information that confirms what we already know or want to believe. That tendency is an unconscious bias called confirmation bias. The open minded trait also tended to reduce unconscious System 1 distortion of problems and potential outcomes by other unconscious cognitive biases such as the powerful but very subtle and hard to detect “what you see is all there is” bias, hindsight bias and scope insensitivity, i.e., not giving proper weight to the scope of a problem.

Superforecasters tended to break complex questions down into component parts so that relevant factors could be considered separately, which also tends to reduce unconscious bias-induced fact and logic distortions. In general, superforecaster susceptibility to unconscious biases was significantly lower than for other participants. That appeared to be due mostly to their capacity to use conscious System 2 thinking to recognize and then reduce unconscious System 1 biases. Most superforecasters shared 15 traits including (i) cautiousness based on an innate knowledge that little or nothing was certain, (ii) being reflective, i.e., introspective and self-critical, (iii) being comfortable with numbers and probabilities, and (iv) being pragmatic and not wedded to any particular agenda or ideology. Unlike political ideologues, they were pragmatic and did not try to “squeeze complex problems into the preferred cause-effect templates [or treat] what did not fit as irrelevant distractions.”

What the best forecasters knew about a topic and their political ideology was far less important than how they thought about problems, gathered information and then updated thinking and changed their minds based on new information. The best engaged in an endless process of information and perspective gathering, weighing information relevance and questioning and updating their own judgments when it made sense. It was work that required effort and discipline. Political ideological rigor was detrimental, not helpful.

Regarding common superforecaster traits, Tetlock observed that “a brilliant puzzle solver may have the raw material for forecasting, but if he also doesn’t have an appetite for questioning basic, emotionally-charged beliefs he will often be at a disadvantage relative to a less intelligent person who has a greater capacity for self-critical thinking.” Superforecasters have a real capacity for self-critical thinking. Political, economic and religious ideology is mostly beside the point.

Why this is important
The topic of predicting the future might seem to some to have little relevance and/or importance to politics and political policy. That belief is wrong. Tetlock cites an example that makes the situation crystal clear. In an interview in 2014 with General Michael Flynn, head of the Defense Intelligence Agency, DoD’s 17,000 employee equivalent to the CIA, Gen. Flynn said “I think we’re in a period of prolonged societal conflict that is pretty unprecedented.” A quick Google search of the phrase “global conflict trends” and some reading was all it took to prove that belief was wrong.

Why did Gen. Flynn, a high-ranking, intelligent and highly accomplished intelligence analyst, make such an important, easily-avoided mistake? The answer lies in System 1 and its powerful but unconscious “what you see is all there is” (WYSIATI) bias. He succumbed to his incorrect belief because he spent 3-4 hours every day reading intelligence reports filled with mostly bad news. In Gen. Flynn’s world, that was all there was. In Flynn’s unconscious mind, his knowledge had to be correct, and he therefore didn’t bother to check his basic assumption. Most superforecasters would not have made that mistake. They train themselves to relentlessly pursue information from multiple sources and would have found what Google had to say about the situation.

Tetlock asserts that partisan pundits opining on all sorts of things routinely fall prey to the WYSIATI bias for the same reason. They frequently don’t check their assumptions against reality and/or will knowingly lie to advance their agendas. Simply put, partisan pundits are frequently wrong because of their ideological rigidity and the intellectual sloppiness it engenders. 

Limits and criticisms of forecasting
In Superforecasting, Tetlock points out that predicting the future has limits. Although Tetlock is not explicit about this, forecasting most questions for time frames more than about 18-36 months in the future appears to become increasingly less accurate and fade into randomness. That makes sense, given complexity and the number of factors that can affect outcomes. Politics and the flow of human events are simply too complicated for long-term forecasting to ever be feasible. What is not known is the ultimate time range where the human capacity to predict fades into the noise of randomness. More research is needed.

A criticism of Tetlock’s approach argues that humans simply cannot foresee things and events that are so unusual that they are not even considered possible until the event or thing is actually seen or happens. Such things and events, called Black Swans, are also believed to dictate major turning points and therefore even trying to predict the future is futile. Tetlock rebuts that criticism, arguing that, (i) there is no research to prove or disprove that hypothesis and (ii) clustered small relevant questions can collectively point to a Black Swan or something close to it. The criticism does not yet amount to a fatal flaw - more research is needed.

Another criticism argues that superforecasters operating in a specified time frame, 1-year periods in this case, are flukes and they cannot defy psychological gravity for long. Instead, the criticism argues that superforecasters will simply revert to the mean and settle back to the ground the rest of us stand on. In other words, they would become more or less like everyone else with essentially no ability to predict future events.

The Good Judgment Project did allow testing of that criticism. The result was the opposite of what the criticism predicted. Although some faded, many of the people identified as superforecasters at the end of year 1 actually got better in years 2 and 3 of the 4-year experiment. Apparently, those people not only learned to limit the capacity of their unconscious System 1 (the elephant) to distort fact and logic, but they also consciously maintained that skill and improved how the conscious, rational System 2 (the rider) counteracted the fact- and logic-distorting lenses of unconscious System 1 biases. Although the mental effort needed to be objective was high, most superforecasters could nonetheless defy psychological gravity, at least over a period of several years.

The intuitive-subjective politics problem
On the one hand, Tetlock sees a big upside for “evidence-based policy”: “It could be huge - an “evidence-based forecasting” revolution similar to the “evidence-based medicine” revolution, with consequences every bit as significant.” On the other hand, he recognizes the obstacle that intuitive or subjective (System 1 biased), status quo two-party partisan politics faces: “But hopelessly vague language is still so common, particularly in the media, that we rarely notice how vacuous it is. It just slips by. . . . . If forecasting can be co-opted to advance their [narrow partisan or tribe] interests, it will be. . . . . Sadly, in noisy public arenas, strident voices dominate debates, and they have zero interest in adversarial collaboration.”

The rational-objective politics theoretical solution
For evidence-based policy, Tetlock sees the Holy Grail of his research as “. . . . using forecasting tournaments to depolarize unnecessarily polarized policy debates and make us collectively smarter.” He asserts that consumers of forecasting need to “stop being gulled by pundits with good stories and start asking pundits how their past predictions fared - and reject answers that consist of nothing but anecdotes and credentials. And forecasters will realize . . . . that these higher expectations will ultimately benefit them, because it is only with the clear feedback that comes with rigorous testing that they can improve their foresight.”

What Tetlock is trying to do for policy will be uncomfortable for most adherents of standard, narrow ideologies. That’s the problem with letting unbiased fact and logic roam free - they will go wherever they want without much regard for people’s personal ideologies or morals. For readers who follow Dissident Politics (“DP”) and its focus on “objective politics”, or ideology based on unbiased fact and unbiased logic in service to an “objectively” defined public interest, this may sound like someone has plagiarized someone else. It should. DP’s cognitive science-based ideology draws heavily on the work of social scientists including Dr. Tetlock, Daniel Kahneman, George Lakoff and Richard Thaler. Both Tetlock and DP advocate change by focusing policy and politics on understanding human biology and unspun reality, not on political ideology or undue attention to special interest demands.

Tetlock focuses on evidence-based policy, while DP’s focus is on evidence-based or “objective” politics. Those things differ somewhat, but not much. In essence, Tetlock is trying to coax pundits and policy makers into objectivity based on human cognitive science and higher competence by asking the public and forecast consumers to demand better from the forecasters they rely on to form opinions and world views. DP is trying to coax the public into objectivity by adopting a new, “objective” political ideology or set of morals based on human cognitive science. The hope is that over time both average people and forecasters will see the merits of objectivity. If widely accepted, either approach will eventually get society to about the same place. More than one path can lead to the same destination, which is politics based as much on the biology of System 2 cognition as the circumstances of American politics will allow.

One way to see it is as an effort to elevate System 2’s capacity to enlighten over System 1’s awesome power to hide and distort fact and logic. Based on Tetlock’s research, optimal policy making, and by extension optimal politics, does not boil down to being more conservative, liberal, capitalist, socialist or Christian. Instead, it is a matter of finding an optimum balance in the distribution of mental influence between the heavily biased intuition-subjectivity of unconscious System 1 and the less-biased reason-objectivity of conscious System 2, aided by statistics or an algorithm when a good one is available.

That optimum balance won’t lead to perfect policy or politics. But, the result will be significantly better in the long run than what the various, usually irrational, intuitive or subjective mind sets deliver now.

The rational-objective politics practical problem
For the attentive, the big problem has already jumped out as obvious. Tetlock concedes the point: “. . . . nothing may change. . . . . things may go either way.” Whether the future will be a “stagnant status quo” or change “will be decided by the people whom political scientists call the “attentive public.” I’m modestly optimistic.” Not being a forecaster but an experiment instead, DP does not know the answer. Tetlock and DP both face the same problem - how to foster the spread and acceptance of an idea or ideology among members of a species that tends to be resistant to change and biased against questioning innate morals and beliefs.

In essence, what Tetlock and DP both seek is to replace blind faith in personal political morals or ideology, and undue influence by narrow special interests, with a new ideology grounded in an understanding of human cognitive biology and respect for unbiased facts and unbiased logic. The goal of a biology-based political ideology or set of morals is to better serve the public interest. Those narrow interests include special interests and individuals who see the world through the distorting lenses of standard subjective "narrow" ideologies such as American liberalism, conservatism, socialism, capitalism and/or Christianity.

There is some reason for optimism that citizens who adopt such objective political morals or values can come to have significant influence in American politics. Tetlock points to one observer, an engineer, with the following observation: "'I think it's going to get stranger and stranger' for people to listen to the advice of experts whose views are informed only by their subjective judgment." Only time will tell if any optimism is warranted. Working toward more objectivity in politics is an experiment whose outcome cannot yet be predicted, at least not by DP. Maybe one of Tetlock's superforecasters would be up to that job.

Thursday, December 31, 2015

Assessing personal risk from terrorism

IVN published a Dissident Politics article on the very real difficulty of rationally assessing personal risks from terrorism (and other threats). The personal risk of death from a terrorist attack on any given American in any given year is very low, about 1 in 20 million. Despite that low risk, over 50% of Americans who planned to travel recently canceled, changed or delayed their travel plans.

The reason is that the unconscious human mind, which controls reactions to fear, does not use statistics to assess risk, so we unconsciously but grossly overestimate risk. About thirty percent of Americans believe that they personally will be attacked by a terrorist within the next 12 months, which amounts to a perceived chance of roughly 100% (1 in 1), not 1 in 20 million. Based on the statistics, that wildly incorrect belief in the likelihood of personal attack in the next year is about 20 million times too high. However, it feels perfectly reasonable, not too high, by the "logic" of false but persuasive, unconscious human 'psycho-logic', not by real, statistics-based, unbiased logic.
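
To make the arithmetic explicit, here is a minimal sketch using the figures quoted above and the 1 in 10,000 "significant threat" threshold defined in footnote 1; the only numbers in it are the ones this article already uses.

```python
# Figures from the article: roughly 1 in 20 million actual annual risk,
# versus a perceived chance of roughly 100% (1 in 1).
actual_annual_risk = 1 / 20_000_000
perceived_risk = 1.0

# How far off the perception is.
overestimate_factor = perceived_risk / actual_annual_risk
print(f"Perception overstates the statistical risk by a factor of {overestimate_factor:,.0f}")

# Footnote 1's "significant threat" threshold: more than 1 in 10,000 per year.
significant_threat_threshold = 1 / 10_000
print("Significant threat?", actual_annual_risk > significant_threat_threshold)  # False
```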

In view of the data, not anyone's opinion, it is objectively irrational for anyone to change their travel plans unless a specific, credible threat exists. Despite that fact (not opinion), many people nonetheless change their behavior even though no significant threat exists.[1]

The bottom line is that to think and act rationally or objectively about risk, the rational mind has to impose statistics on our unconscious thinking when they are relevant. Most people simply don't do that. The press-media and politicians foster irrational emotions such as this kind of unfounded fear and, indirectly, that fosters the irrational thinking and actions that flow from such fear. Under those circumstances, it is no wonder that Americans overreact - they are being deceived into misunderstanding by a self-serving two-party system, including the press-media industry, that benefits more from public misunderstanding than from understanding.

The article is here.

Footnote:
1. A "significant threat" is defined as a threat that has more than a 1 in 10,000 chance of actually happening in the time frame and under the conditions in which the threat is perceived, e.g., within a 1-year period for threat of personal terrorist attack. That is a Dissident Politics definition. There is no widely accepted definition for a "significant threat", so that is how Dissident Politics (DP) defines it to make what DP argues make any sense at all. No doubt, some or many others will define it as zero per year, less than a 1 in 10,000/year or something lower. Given the reality of American society, that makes little objective sense.

The lack of definitions for nearly all terms in politics is why most political debate and discourse is mostly meaningless and intellectually useless. The great selling point of such empty debate is that undefined terms reinforce the beliefs of people who want to believe what they want to believe instead of what is real. Like it or not, most people easily and unconsciously distort reality, including actual threats of personal risk, because that is just how the human mind evolved.

Monday, December 28, 2015

Superforecasting: Book comments and quotes




Philip E. Tetlock and Dan Gardner 
Superforecasting: The Art and Science of Prediction
Crown Books, September 2015
Quotes and comments on the book

Chapter 1: An optimistic skeptic
p. 3: regarding expert opinions, there is usually no accurate measurement of how good they are, there are “just endless opinions - and opinions on opinions. And that is business as usual.”; the media routinely delivers, or corporations routinely pay for, opinions that may be accurate, worthless or in between and everyone makes decisions on that basis

p. 5: talking head talent is skill in telling a compelling story, which is sufficient for success; their track record is largely irrelevant to that success - most of them are about as good as random guessing; predictions are time-sensitive - 1-year predictions tend to beat guessing more than 5- or 10-year projections

p. 8-10: there are limits on what is predictable → in nonlinear systems, e.g., weather patterns, a small initial condition change can lead to huge effects (chaos theory); we cannot see very far into the future

p. 13-14: predictability and unpredictability coexist; a false dichotomy is saying the weather is unpredictable - it is usually relatively predictable 1-3 days out, but at days 4-7 accuracy usually declines to near-random; weather forecasters are slowly getting better because they are in an endless forecast-measure-revise loop; prediction consumers, e.g., governments, businesses and regular people, don’t demand evidence of accuracy, so it isn’t available, and that means no revision, which means no improvement

p. 15: Bill Gates’s observation: surprisingly often a clear goal isn’t specified, so it is impossible to drive progress toward the goal; that is true in forecasting; some forecasts are meant to (1) entertain, (2) advance a political agenda, or (3) reassure the audience their beliefs are correct and the future will unfold as expected (this kind is popular with political partisans)

p. 16: the lack of rigor in forecasting is a huge opportunity; to seize it (i) set the goal of accuracy and (ii) measure success and failure

p. 18: the Good Judgment Project found two things, (1) foresight is real and some people have it and (2) it isn’t strictly a talent from birth - (i) it boils down to how people think, gather information and update beliefs and (ii) it can be learned and improved

p. 21: a 1954 book analyzing 20 studies showed that algorithms based on objective indicators were better predictors than well-informed experts; more than 200 later studies have confirmed that, and the conclusion is simple - if you have a well-validated statistical algorithm, use it

p. 22: machines may never be able to beat talented humans, so dismissing human judgment as just subjective goes too far; maybe the best that can be done will come from human-machine teams, e.g., Garry Kasparov and Deep Blue together against a machine or a human

p. 23: quoting David Ferrucci, IBM Watson’s chief engineer, who is optimistic: ““I think it’s going to get stranger and stranger” for people to listen to the advice of experts whose views are informed only by their subjective judgment.”; Tetlock: “. . . . we will need to blend computer-based forecasting and subjective judgment in the future. So it’s time to get serious about both.”

Chapter 2: Illusions of knowledge
p. 25: regarding a medical diagnosis error: “We have all been too quick to make up our minds and too slow to change them. And if we don’t examine how we make these mistakes, we will keep making them. This stagnation can go on for years. Or a lifetime. It can even last centuries, as the long and wretched history of medicine illustrates.”

p. 30: “It was the absence of doubt - and scientific rigor - that made medicine unscientific and caused it to stagnate for so long.”; it was an illusion of knowledge - if the patient died, he was too sick to be saved, but if he got better, the treatment worked - there was no controlled data to support those beliefs; physicians resisted the idea of randomized, controlled trials as proposed in 1921 for decades because they knew their subjective judgments revealed the truth

p. 35: on Kahneman’s fast System 1: “A defining feature of intuitive judgment is its insensitivity to the quality of the evidence on which the judgment is based. It has to be that way. System 1 can only do its job of delivering strong conclusions at lightning speed if it never pauses to wonder whether the evidence at hand is flawed or inadequate, or if there is better evidence elsewhere.” - context - instantly running away from a Paleolithic shadow that might be a lion; Kahneman calls these tacit assumptions WYSIATI; System 1 judgments take less than 1 sec. - there’s no time to think about things; regarding coherence: “. . . . we are creative confabulators hardwired to invent stories that impose coherence on the world.”

p. 38-39: confirmation bias: (i) seeking evidence to support the 1st plausible explanation, (ii) rarely seeking contradictory evidence and (iii) being a motivated skeptic in the face of contrary evidence and finding even weak or no reasons to denigrate contradictory evidence or reject it entirely, e.g., a doctor’s belief that a quack medical treatment works for all but the incurable is taken as proof that it works for everyone except the incurably ill; that arrogant, self-deluded mind set kept medicine in the dark ages for millennia and people suffered accordingly

p. 40: attribute substitution, availability heuristic or bait and switch: one question may be difficult or unanswerable w/o more info, so the unconscious System 1 substitutes another, easier, question and the easy question’s answer is treated as the answer to the hard question, even when it is wrong; CLIMATE CHANGE EXAMPLE: people who cannot figure out climate change on their own substitute what most climate scientists believe for their own belief - it can be wrong

p. 41-42: “The instant we wake up and look past the tip of our nose, sights and sounds flow into the brain and System 1 is engaged. This system is subjective, unique to each of us.”; cognition is a matter of blending inputs from System 1 and 2 - in some people, System 1 has more dominance than in others; it is a false dichotomy to see it as System 1 or System 2 operating alone; pattern recognition: System 1 alone can make very good or bad snap judgments and the person may not know why - bad snap judgment or false positive = seeing the Virgin Mary in burnt toast (therefore, slowing down to double check intuitions is a good idea)

p. 44: tip of the nose perspective is why doctors did not doubt their own beliefs for thousands of years

Chapter 3: Keeping Score
p. 48: it is not unusual that a forecast that may seem dead right or wrong really cannot be “conclusively judged right or wrong”; details of a forecast may be absent and the forecast can’t be scored, e.g., no time frames, geographic locations, reference points, definition of success or failure, definition of terms, a specified probability of events (e.g., 68% chance of X) or lack thereof or many comparison forecasts to assess the predictability of what is being forecasted; p. 53: “. . . . vague verbiage is more the rule than the exception.”; p. 55: security experts asked what the term “serious possibility” meant in a 1951 National Intelligence Estimate → one said it meant 80 to 20 (4 times more likely than not), another said it meant 20 to 80 and others said it was in between those two extremes

p. 50-52: national security experts had views split along liberal and conservative lines about the Soviet Union and future relations; they were all wrong and Gorbachev came to power and de-escalated nuclear and war tensions; after the fact, all the experts claimed they could see it coming all along; “But the train of history hit a curve, and as Karl Marx once quipped, the intellectuals fall off.”; the experts were smart and well-informed, but they were just misled by System 1’s subjectivity (tip of the nose perspective)

p.58-59: the U.S. intelligence community resisted putting definitions and specified probabilities in their forecasts until finally, 10 years after the WMD fiasco with Saddam Hussein, the case for precision was so overwhelming that they changed; “But hopelessly vague language is still so common, particularly in the media, that we rarely notice how vacuous it is. It just slips by.

p. 60-62: calibration: perfect calibration = X% chance of an event when past forecasts have always been “there is a X% chance” of the event, e.g., rainfall; calibration requires many forecasts for the assessment and is thus impractical for rare events, e.g., presidential elections; underconfidence = prediction is X% chance, but reality is a larger X+Y% chance; overconfidence = prediction is X% chance, but reality is a smaller X-Y% chance

p. 62-66: the two facets of good judgment are captured by calibration and resolution; resolution: high resolution occurs when predictions of low < ~ 20% or high > ~80% probability events are accurately predicted; accurately predicting rare events gets more weight than accurately predicting more common events; a low Brier score is best, 0.0 is perfect, 0.5 is random guessing and 2.0 is getting all or none, or yes or no, predictions wrong 100% of the time; however a score of 0.2 in one circumstance, e.g., weather prediction in Phoenix, AZ looks bad, while a score of 0.2 in Springfield MO is great because the weather there is far less predictable than in Phoenix; apples-to-apples comparisons are necessary, but it is very hard to find that kind of data - it usually doesn’t exist
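
A minimal sketch of the Brier scoring described in the note above, using the two-category form Tetlock uses (0.0 is perfect, 0.5 is perpetual 50/50 guessing, 2.0 is always being confidently wrong); the example forecasts and outcomes are invented:

```python
def brier_score(forecasts, outcomes):
    """Mean Brier score over binary forecasts.

    forecasts: probabilities assigned to the event happening (0.0-1.0)
    outcomes:  1 if the event happened, 0 if it did not
    Uses the two-category form, so scores run from 0.0 (perfect) to 2.0 (worst).
    """
    scores = []
    for p, o in zip(forecasts, outcomes):
        # Squared error summed over both categories: "happens" and "does not happen".
        scores.append((p - o) ** 2 + ((1 - p) - (1 - o)) ** 2)
    return sum(scores) / len(scores)

# Invented example: three forecasters judged on the same four events.
outcomes = [1, 0, 0, 1]
print(brier_score([0.9, 0.2, 0.1, 0.8], outcomes))  # well calibrated -> 0.05
print(brier_score([0.5, 0.5, 0.5, 0.5], outcomes))  # perpetual 50/50 guessing -> 0.5
print(brier_score([0.0, 1.0, 1.0, 0.0], outcomes))  # always confidently wrong -> 2.0
```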

p. 68: In EPJ (Tetlock’s earlier Expert Political Judgment study), the bottom line was that some experts were marginally better than random guessing - the common characteristic was how they thought, not their ideology, Ph.D. or not, or access to classified information; the typical expert was about as good as random guessing and their thinking was ideological; “They sought to squeeze complex problems into the preferred cause-effect templates and treated what did not fit as irrelevant distractions. Allergic to wishy-washy answers, they kept pushing their analyses to the limit (and then some), using terms like “furthermore” and “moreover” when piling up reasons why they were right and others were wrong. As a result, they were confident to declare things “impossible” or “certain.” Committed to their conclusions, they were reluctant to change their minds even when their predictions clearly failed. They would tell us, “Just wait.””

p. 69: “The other group consisted of more pragmatic experts who drew on many analytical tools, with the choice of tool hinging on the particular problem they faced. . . . . They talked about possibilities and probabilities, not certainties.

p. 69: “The fox knows many things but the hedgehog knows one big thing. . . . . Foxes beat hedgehogs on both calibration and resolution. Foxes had real foresight. Hedgehogs didn’t. . . . . How did hedgehogs manage to do slightly worse than random guessing?”; hedgehog example is CNBC’s Larry Kudlow and his supply side economics Big Idea in the face of the 2007 recession
p. 70-72: on Kudlow: “Think of that Big Idea as a pair of glasses that the hedgehog never takes off. . . . And, they aren’t ordinary glasses. They are green-tinted glasses . . . . Everywhere you look, you see green, whether it’s there or not. . . . . So the hedgehog’s one Big Idea doesn’t improve his foresight. It distorts it.”; more information helps increase hedgehog confidence, not accuracy; “Not that being wrong hurt Kudlow’s career. In January 2009, with the American economy in a crisis worse than any since the Great Depression, Kudlow’s new show, The Kudlow Report, premiered on CNBC. That too is consistent with the EPJ data, which revealed an inverse correlation between fame and accuracy: the more famous an expert was, the less accurate he was.”; “As anyone who has done media training knows, the first rule is keep it simple, stupid. . . . . People tend to find uncertainty disturbing and “maybe” underscores uncertainty with a bright red crayon. . . . . The simplicity and confidence of the hedgehog impairs foresight, but it calms nerves - which is good for the careers of hedgehogs. . . . Foxes don’t fare so well in the media. . . . This aggregation of many perspectives is bad TV.

p. 73: an individual who does a one-off accurate guess is different from people who do it consistently; consistency is based on aggregation, which is the recognition that useful info is widely dispersed and each bit needs a separate weighting for importance and relevance

p 74: on information aggregation: “Aggregating the judgments of people who know nothing produces a lot of nothing.”; the bigger the collective pool of accurate information, the better the prediction or assessment; Foxes aggregate, but Hedgehogs don’t

p. 76-77: aggregation: looking at a problem from one perspective, e.g., pure logic can lead to an incorrect answer; multiple perspectives are needed; using both logic and psycho-logic (psychology or human cognition) helps; some people are lazy and don’t think, some apply logic to some degree and then stop, while others pursue logic to its final conclusion → aggregate all of those inputs to arrive at the best answer; “Foxes aggregate perspectives.

p. 77-78: on human cognition - we don’t aggregate perspectives naturally: “The tip-of-your-nose perspective insists that it sees reality objectively and correctly, so there is no need to consult other perspectives.”

p. 79-80: on perspective aggregation: “Stepping outside ourselves and really getting a different view of reality is a struggle. But Foxes are likelier to give it a try.”; people’s temperaments fall along a spectrum from the rare pure Foxes to the rare pure Hedgehogs; “And our thinking habits are not immutable. Sometimes they evolve without our awareness of the change. But we can also, with effort, choose to shift gears from one mode to another.”

Chapter 4: Superforecasters
p. 84-85: the U.S. intelligence community (IC) is, like every huge bureaucracy (about 100,000 people, about $50 billion budget), very change-resistant - they saw and acknowledged their colossal failure to predict the Iranian revolution, but did little or nothing to address their dismal capacity to predict situations and future events; the WMD-Saddam Hussein disaster 22 years later finally inflicted a big enough shock to get the IC to seriously introspect

p 88 (book review comment?): my IARPA work isn’t as exotic as DARPA, but it can be just as important: that understates the case → it is more important

p. 89: humans “will never be able to forecast turning points in the lives of individuals or nations several years into the future - and heroic searches for superforecasters won’t change that.”; the approach: “Quit pretending you know things you don’t and start running experiments.

p. 90-93: the shocker: although the detailed result is classified, Good Judgment Project (GJP) volunteers who passed screening and used simple algorithms but without access to classified information beat government intelligence analysts with access to classified information; one contestant (a retired computer programmer) had a Brier score of 0.22, 5th best among 2,800 GJP participants, and then in a later competition among the best forecasters, his score improved to 0.14, top among the initial group of 2,800 → he beat the commodities futures markets by 40% and the “wisdom of the crowd” control group by 60%

p. 94-95: the best forecasters got things right at 300 days out more than regular forecasters looking out 100 days and that improved over the 4-year GJP experiment: “. . . . these superforecasters are amateurs forecasting global events in their spare time with whatever information they can dig up. Yet they somehow managed to set the performance bar high enough that even the professionals have struggled to get over it, let alone clear it with enough room to justify their offices, salaries and pensions.

p. 96 (book review comment?): “And yet, IARPA did just that: it put the intelligence community’s mission ahead of the people inside the intelligence community - at least ahead of those insiders who didn’t want to rock the bureaucratic boat.

p. 97-98: “But it’s easy to misinterpret randomness. We don’t have an intuitive feel for it. Randomness is invisible from the tip-of-your-nose perspective. We can see it only if we step outside of ourselves.”; people can be easily tricked into believing that they can predict entirely random outcomes, e.g., guessing coin tosses; “. . . . delusions of this sort are routine. Watch business news on television, where talking heads are often introduced with a reference to one of their forecasting references . . . . And yet many people take these hollow claims seriously.”

p. 99: “Most things in life involve skill and luck, in varying proportions.

p. 99-101: regression to the mean cannot be overlooked and is a necessary tool for testing the role of luck in performance → regression is slow for activities dominated by skill, e.g., forecasting, and fast for activities dominated by chance/randomness, e.g., coin tossing

p. 102-103: the key question is how did superforecasters hold up across the years? → in years 2 and 3, superforecasters were the opposite of regressing and they got better; sometimes causal connections are nonlinear and thus not predictable and some of that had to be present among the variables that affected what the forecasters were facing → there should be some regression unless an offsetting process is increasing forecasters’ performance; there is some regression - about 30% of superforecasters fall out of the top 2% each year but 70% stay in - individual year-to-year correlation is about 0.65, which is pretty high, i.e., about 1 in 3 → Q: Why are these people so good?

Chapter 5: Supersmart?
p. 114: Fermi-izing questions, breaking a question into relevant parts, allows better guesses, e.g., how many piano tuners are there in Chicago → guess total pop, total # pianos, time to do one piano, hours/year a tuner works → that technique usually helps increase accuracy a lot, even when none of the numbers are known; Fermi-izing tends to defuse the unconscious System 1’s tendency to bait & switch the question; EXAMPLE: would testing of Arafat’s body 6 years after his death reveal the presence of Polonium, which is allegedly what killed him? → Q1 - can you even detect Po 6 years later? Q2: if Po is still detectable, how could it have happened, e.g., Israel, Palestinian enemies before or after his death → for this question the outside view, what % of exhumed bodies are found to be poisoned, is hard to (i) identify and (ii) find the answer to, but identifying it is most important, i.e., it’s not certain (< 100%, say 80%), but there has to be more than trivial evidence otherwise authorities would not allow his body to be exhumed (> 20%) → use the 20-80% halfway point of 50% as the outside view, then adjust the probability up or down based on research and the inside or intuitive System 1 view
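
A minimal sketch of Fermi-izing the piano-tuner question from the note above; every input below is a rough, assumed guess chosen only to show the decomposition, not a verified figure:

```python
# Fermi estimate: how many piano tuners are there in Chicago?
# Every input is a rough, assumed guess - the point is the decomposition.
population = 2_700_000          # people in Chicago (rough)
people_per_household = 2.5
households = population / people_per_household
share_of_households_with_piano = 0.05
pianos = households * share_of_households_with_piano  # ignores schools and venues for simplicity

tunings_per_piano_per_year = 1
hours_per_tuning = 2            # including travel
tuner_hours_per_year = 40 * 50  # a full-time work year

tuner_hours_needed = pianos * tunings_per_piano_per_year * hours_per_tuning
tuners = tuner_hours_needed / tuner_hours_per_year
print(f"~{tuners:.0f} piano tuners")  # on the order of 50-60
```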

p. 118: superforecasters look at questions 1st from Kahneman’s “outside view”, i.e., the statistical or historical base rate or norm (the anchor), and then 2nd use the inside view to adjust probabilities up or down → System 1 generally goes straight to the comfortable but often wrong inside view and ignores the outside view; “will there be a Vietnam-China border clash in the next year?” starts with the 1st (outside) view that asks how many clashes there have been over time, e.g., once every 5 years, and then merges in the 2nd view of current Vietnam-China politics to adjust the baseline probability up or down
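
That outside-view-first habit can be sketched as a tiny two-step calculation. The base rate and the adjustment below are invented numbers (loosely echoing the once-every-5-years example in the note), used only to illustrate anchoring on a historical rate and then nudging it with the inside view:

```python
# Step 1 (outside view): anchor on a historical base rate.
# Assumed for illustration: a border clash roughly once every 5 years
# implies a base rate of about 20% in any given year.
clashes_per_year = 1 / 5
outside_view = clashes_per_year            # 0.20 anchor

# Step 2 (inside view): adjust the anchor up or down for current specifics.
# Assumed for illustration: current tensions judged somewhat above normal.
inside_view_adjustment = +0.05

forecast = max(0.0, min(1.0, outside_view + inside_view_adjustment))
print(f"forecast probability of a clash in the next year: {forecast:.0%}")  # 25%
```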

p. 120: the outside view has to come first; “And it’s astonishingly easy to settle on a bad anchor.”; good anchors are easier to find from the outside view than from the inside

p. 123-124: some superforecasters kept explaining in the GJP online forum how they approached problems, what their thinking was and asking for criticisms, i.e., they were looking for other perspectives; simply asking if a judgment is wrong tends to lead to improvement in the first judgment; “The sophisticated forecaster knows about confirmation bias and will seek out evidence that cuts both ways.

p. 126: “A brilliant puzzle solver may have the raw material for forecasting, but if he also doesn’t have an appetite for questioning basic, emotionally-charged beliefs he will often be at a disadvantage relative to a less intelligent person who has a greater capacity for self-critical thinking.

p. 127: “For superforecasters, beliefs are hypotheses to be tested, not treasures to be guarded.”

Chapter 6: Superquants?
p. 128-129: most superforecasters are good at math, but mostly they rely on subjective judgment; one super said this: “It’s all, you know, balancing, finding relevant information and deciding how relevant is this really?”; it’s not math skill that counts most - it’s nuanced subjective judgment

p. 138-140: we crave certainty and that’s why Hedgehogs and their confident yes or no answers on TV are far more popular and comforting than Foxes with their discomforting “on the one hand . . . but on the other” style; people equate confidence with competence; “This sort of thinking goes a long way to explaining why so many people have a poor grasp of probability. . . . The deeply counterintuitive nature of statistics explains why even very sophisticated people often make elementary mistakes.” A forecast of a 70% chance of X happening means that there is a 30% chance it won’t - that fact is lost on most people → most people translate an 80% chance of X to mean X will happen and that just ain’t so; only when probabilities are closer to even, maybe about 65:35 to 35:65 (p. 144), does the translation for most people become “maybe” X will happen, which is the intuitively uncomfortable translation of uncertainty associated with most everything

p. 143: superforecasters tend to be probabilistic thinkers, e.g., Treasury secy Robert Rubin; epistemic uncertainty describes something unknown but theoretically knowable, while aleatory uncertainty is both unknown and unknowable

p. 145-146: superforecasters who used more granularity, e.g., a 20, 21 or 22% chance of X, tended to be more accurate than those who used 5% increments, and they in turn tended to be more accurate than those who used 10% increments, e.g., 20%, 30% or 40%; when estimates were rounded to the nearest 5% or 10%, the granular best superforecasters fell into line with all the rest, i.e., there was real precision in those more granular 1% increment predictions
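
The rounding check described in that note can be reproduced on any set of scored forecasts. Here is a minimal sketch; the forecasts and outcomes are invented (and deliberately constructed so the finer-grained numbers are the better-calibrated ones) purely to show the procedure of rescoring after rounding:

```python
def brier(forecasts, outcomes):
    """Two-category Brier score: 0.0 is perfect, 0.5 is 50/50 guessing, 2.0 is worst."""
    return sum(2 * (p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

def rounded(forecasts, increment):
    """Round each probability to the nearest increment, e.g., 0.10 for 10% steps."""
    return [round(p / increment) * increment for p in forecasts]

# Invented, deliberately well-calibrated granular forecasts and their outcomes.
forecasts = [0.94, 0.06, 0.66, 0.34, 0.84, 0.16]
outcomes  = [1,    0,    0,    1,    1,    0]

print(round(brier(forecasts, outcomes), 3))                 # ~0.31 with 1% granularity
print(round(brier(rounded(forecasts, 0.10), outcomes), 3))  # ~0.36 after rounding to 10%
# A worse score after rounding means the extra granularity carried real information.
```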

p. 148-149: “Science doesn’t tackle “why” questions about the purpose of life. It sticks to “how” questions that focus on causation and probabilities.”; “Thus, probabilistic thinking and divine-order thinking are in tension. Like oil and water, chance and fate do not mix. And to the extent we allow our thoughts to move in the direction of fate, we undermine our ability to think probabilistically. Most people tend to prefer fate.

p. 150: the sheer improbability of something that does happen, e.g., you meet and marry your spouse, is often attributed to fate or God’s will, not to the understanding that sooner or later many/most people get married to someone at some point in their lives; the following psycho-logic is “incoherent”, i.e., not logic: (1) the chance of meeting the love of my life was tiny, (2) it happened anyway, (3) therefore it was meant to be and (4) therefore, the probability it would happen was 100%

p. 152: scoring for tendency to accept or reject fate and accept probabilities instead, average Americans are mixed or about 50:50, undergrads somewhat more biased toward probabilities and superforecasters are the most grounded in probabilities, while rejecting fate as an explanation; the more inclined a forecaster is to believe things are destined or fate, the less accurate their forecasts were, while probability-oriented forecasters tended to have the highest accuracy → the correlation was significant

Chapter 7: Supernewsjunkies?
p. 154-155: based on news flowing in, superforecasters tended to update their predictions, and that tended to improve accuracy; it isn’t just a matter of following the news and mechanically updating the forecast from new input - their initial forecasts were 50% more accurate than those of regular forecasters

p. 160: belief perseverance = people “rationalizing like crazy to avoid acknowledging new information that upsets their settled beliefs.” → extreme obstinacy, e.g., the fact that something someone predicted didn’t happen is taken as evidence that it will happen

p. 161-163: on underreacting to new information: “Social psychologists have long known that getting people to publicly commit to a belief is a great way to freeze it in place, making it resistant to change. The stronger the commitment, the greater the resistance.”; perceptions are a matter of our “identity”; “. . . . people’s  views on gun control often correlate with their views on climate change, even though the two issues have no logical connection to each other. Psycho-logic trumps logic.”; “. . . . superforecasters may have a surprising advantage: they’re not experts or professionals, so they have little ego invested in each forecast.”; consider “career CIA analysts or acclaimed pundits with their reputations on the line.

p. 164: on overreacting to new information: dilution effect = irrelevant or noise information can and often does change perceptions of probability, and that leads to mistakes; frequent forecast updates in small “units of doubt” (small increments) seem to minimize both overreacting and underreacting; balancing new information with the information that drove the original or earlier forecasts captures the value of all the information

p. 170: Bayes’ theorem: new/updated belief/forecast = prior belief x diagnostic value of the new information; most superforecasters intuitively understand Bayes’ theorem, but can’t write the equation down, nor do they actually use it; instead they use the concept and weigh updates based on the value of new information
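
The note’s formula is essentially Bayes’ rule in odds form (posterior odds = prior odds x likelihood ratio, where the likelihood ratio is the “diagnostic value” of the news). A minimal sketch, with an invented prior and invented likelihoods chosen only to show the update step:

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Update a probability estimate with one piece of evidence via Bayes' rule.

    prior: current probability that the forecasted event will happen
    p_evidence_if_true / p_evidence_if_false: how likely this news would be
    if the event were / were not going to happen (their ratio is the
    'diagnostic value' of the new information).
    """
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_evidence_if_true / p_evidence_if_false
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Invented example: a 30% prior, then a news item judged twice as likely
# to appear if the event is coming than if it is not.
prior = 0.30
posterior = bayes_update(prior, p_evidence_if_true=0.6, p_evidence_if_false=0.3)
print(f"{prior:.0%} -> {posterior:.0%}")  # 30% -> ~46%
```

As the note says, superforecasters rarely write this down; the point is the habit of weighing each update by how diagnostic the new information actually is.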

Chapter 8: Perpetual Beta
p. 174-175: two basic mindsets - the growth mindset is that you can learn and grow through hard work; the fixed mindset holds that you have what you were born with and that innate talents can be revealed but not created or developed, e.g., fixed mindsetters say things like, e.g., “I’m bad at math”, and it becomes a self-fulfilling prophecy; fixed mindset children given harder puzzles give up and lose interest, while growth mindset kids loved the challenge because for them, learning was a priority

p. 178: consistently inconsistent - John Maynard Keynes engaged in an endless cycle of try, fail, analyze, adjust, try again; he ended up wealthy from his investing, despite massive losses in the Great Depression and other personal blunders; skills improve with practice

p. 181-183: prompt feedback is necessary for improvement, but it is usually lacking - experience alone doesn’t compensate - experienced police gain confidence that they are good at spotting liars, but it isn’t true because they don’t improve with time; most forecasters get little or no feedback because (1) their language is ambiguous and their forecasts are thus not precise enough to evaluate - self-delusion is a real concern and (2) there is a long time lag between the forecast and the feedback on success or failure - with time a person forgets the details of their own forecasts and hindsight bias distorts memory, which makes it worse; vague language is elastic and people read into it what they want; hindsight bias = knowing the outcome of an event distorts our perception of what we thought we knew before the outcome; experts succumb to it all the time, e.g., predictions of a loss of the communist power monopoly in the Soviet Union before it disintegrated in 1991 vs. after it happened → recall was 31% higher than their original estimate

p. 190: “Superforecasters are perpetual beta.” - they have the growth mind set

p. 191: list of superforecaster traits

Chapter 9: Superteams
p. 201: success can lead to mental habits that undermine the mental habits that led to success in the first place; on the other hand, properly functioning teams can generate dragonfly eye perspectives, which can improve forecasting

p. 208-209: givers on teams are not chumps - they tend to make the whole team perform better; it is complex and it will take time to work out the psychology of groups - replicating this won’t be easy in the real world; “diversity trumps ability” may be true due to the different perspectives a team can generate or, maybe it’s a false dichotomy and a shrewd mix of ability and diversity is the key to optimum performance

Chapter 10: The Leader’s Dilemma
p. 229-230: Tetlock uses the Wehrmacht as an example of how leadership and judgment can be effectively combined, even though it served an evil end → the points being that (i) even evil can operate intelligently and creatively so therefore don’t underestimate your opponent and (ii) seeing something as evil and wanting to learn from it presents no logical contradiction but only a psycho-logical tension that superforecasters overcome because they will learn from anyone or anything that has information or lessons of value

Chapter 11: Are They Really So Super?
p. 232-233: in a 2014 interview Gen. Michael Flynn, head of DIA (DoD’s equivalent of the CIA; 17,000 employees), said “I think we’re in a period of prolonged societal conflict that is pretty unprecedented.” but googling the phrase “global conflict trends” says otherwise; Flynn, like Peggy Noonan and her partisan reading of political events, suffered from the mother of all cognitive illusions, WYSIATI → every day for three hours, Flynn saw nothing but reports of conflicts and bad news; what is important is the fact that Flynn, a highly accomplished and intelligent operative, fell for the most obvious illusion there is → even when we know something is a System 1 cognitive illusion, we sometimes cannot shut it off and see unbiased reality, e.g., the Müller-Lyer optical illusion (two equal lines, one with arrow ends pointing out and one with ends pointing in - the in-pointing arrow line always looks longer, even when you know it isn’t)

p. 234-237: “. . . . dedicated people can inoculate themselves to some degree against certain cognitive illusions.”; scope insensitivity is a major illusion of particular importance to forecasters - it is another bait & switch bias or illusion where a hard question is unconsciously substituted with a simpler question, e.g., the average amount groups of people would be willing to pay to avoid 2,000, 20,000 or 200,000 birds drowning in oil ponds was the same for each group, $80 → the problem’s scope recedes into the background so much that it becomes irrelevant; the scope insensitivity bias or illusion (Tetlock seems to use the terms interchangeably) is directly relevant to geopolitical problems; surprisingly, superforecasters were less influenced by scope insensitivity than average forecasters - scope sensitivity wasn’t perfect, but it was good (better than Kahneman guessed it would be); Tetlock’s guess → superforecasters were skilled and persistent in making System 2 corrections of System 1 judgments, e.g., by stepping into the outside view, which dampens System 1 bias and/or ingrains the technique to the point that it is “second nature” for System 1

p. 237-238: CRITICISM: how long can superforecasters defy psychological gravity?; maybe a long time - one developed software designed to correct System 1 bias in favor of the like-minded and that helped lighten the heavy cognitive load of forecasting; Nassim Taleb’s Black Swan criticism of all of this is that (i) rare events, and only rare events, change the course of history and (ii) there just aren’t enough occurrences to judge calibration because so few events are both rare and impactful on history; maybe superforecasters can spot a Black Swan and maybe they can’t - the GJP wasn’t designed to ask that question

p. 240-241, 244: REBUTTAL OF CRITICISM: the flow of history flows from both Black Swan events and from incremental changes; if only Black Swans counted, the GJP would be useful only for short-term projections and with limited impact on the flow of events over long time frames; and, if time frames are drawn out to encompass a Black Swan, e.g., the one-day storming of the Bastille on July 14, 1789 vs. that day plus the ensuing 10 years of the French revolution, then such events are not so unpredictable - what’s the definition of a Black Swan?; other than the obvious, e.g., there will be conflicts, predictions 10 years out are impossible because the system is nonlinear

p. 245: “Knowing what we don’t know is better than thinking we know what we don’t.”; “Kahneman and other pioneers of modern psychology have revealed that our minds crave certainty and when they don’t find it, they impose it.”; referring to experts’ revisionist response to the unpredicted rise of Gorbachev: “In forecasting, hindsight bias is the cardinal sin.” - hindsight bias not only makes past surprises seem less surprising, it also fosters belief that the future is more predictable than it is

Chapter 12: What’s Next?
p. 251: “On the one hand, the hindsight-tainted analyses that dominate commentary on major events are a dead end. . . . . On the other hand, our expectations of the future are derived from our mental models of how the world works, and every event is an opportunity to learn and improve those models.”; the problem is that “effective learning from experience can’t happen without clear feedback, and you can’t have clear feedback unless your forecasts are unambiguous and scoreable.”

p. 252: “Vague expressions about indefinite futures are not helpful. Fuzzy thinking can never be proven wrong. . . . . Forecast, measure, revise: it is the surest path to seeing better.” - if people see that, serious change will begin; “Consumers of forecasting will stop being gulled by pundits with good stories and start asking pundits how their past predictions fared - and reject answers that consist of nothing but anecdotes and credentials. And forecasters will realize . . . . that these higher expectations will ultimately benefit them, because it is only with the clear feedback that comes with rigorous testing that they can improve their foresight.

p. 252-253: “It could be huge - an “evidence-based forecasting” revolution similar to the “evidence-based medicine” revolution, with consequences every bit as significant.

p. 253: nothing is certain: “Or nothing may change. . . . . things may go either way.”; whether the future will be the “stagnant status quo” or change “will be decided by the people whom political scientists call the “attentive public. I’m modestly optimistic.

p. 254-256: one can argue that the only goal of forecasts is to be accurate but in practice, there are multiple goals - in politics the key question is - Who does what to whom? - people lie because self and tribe matter and in the mind of a partisan (Dick Morris predicting a Romney landslide victory just before he lost is the example Tetlock used - maybe he lied about lying) lying to defend self or tribe is justified because partisans want to be the who doing whatever to the whom; “If forecasting can be co-opted to advance their interests, it will be.” - but on the other hand, the medical community resisted efforts to make medicine scientific but over time persistence and effort paid off - entrenched interests simply have to be overcome

p. 257: “Evidence-based policy is a movement modeled on evidence-based medicine, with the goal of subjecting government policies to rigorous analysis so that legislators will actually know - not merely think they know - whether policies do what they are supposed to do.”; “. . . . there is plenty of evidence that rigorous analysis has made a real difference in government policy.”; analogies exist in philanthropy (Gates Foundation) and sports - evidence is used to feed success and curtail failure

p. 262-263: “What matters is the big question, but the big question can’t be scored.”, so ask a bunch of relevant small questions - it’s like pointillism painting - each dot means little but thousands of dots create a picture; clusters of little questions will be tested to see if that technique can shed light on big questions

p. 264-265: elements of good judgment include foresight and moral judgment, which can’t be run through an algorithm; asking the right questions may not be the province of superforecasters - Hedgehogs often seem to come up with the right questions - the two mindsets needed for excellence may be different

p.266: the Holy Grail of my research: “. . . . using forecasting tournaments to depolarize unnecessarily polarized policy debates and make us collectively smarter.”

p. 269: adversarial but constructive collaboration requires good faith; “Sadly, in noisy public arenas, strident voices dominate debates, and they have zero interest in adversarial collaboration. . . . But there are less voluble and more reasonable voices. . . . . let them design clear tests of their beliefs. . . . . When the results run against their beliefs, some will try to rationalize away the facts, but they will pay a reputational price. . . . . All we have to do is get serious about keeping score.

Invitation to participate at the GJP website: www.goodjudgement.com.