Computational Journalism 2017 Week 6: Drawing Conclusions From Data

Frontiers of Computational Journalism
Columbia Journalism School
Week 6: Drawing Conclusions from Data
October 20, 2017

This class
Randomness and Randomization Testing
$%@*! P-Values
Bayesian inference
Causal Models
Analysis of Competing Hypotheses
Randomness
Margin of Error
Which one is random?
One star per box less random
Two principles of randomness
1. Random data has patterns in it way more often than you think.
2. This problem gets much more extreme when you have less data.
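Both principles can be checked with a quick simulation. The sketch below is illustrative only: the choice of a fair six-sided die, the roll counts (60 and 6000), and the function names are assumptions, not part of the slides. It measures how far the most lopsided face strays from its expected count.

```python
import random

def max_relative_deviation(n_rolls, rng):
    """Roll a fair die n_rolls times; return the largest relative gap
    between any face's count and the expected count n_rolls / 6."""
    counts = [0] * 6
    for _ in range(n_rolls):
        counts[rng.randrange(6)] += 1
    expected = n_rolls / 6
    return max(abs(c - expected) for c in counts) / expected

def average_deviation(n_rolls, n_trials=200, seed=0):
    """Average the worst-face deviation over many simulated experiments."""
    rng = random.Random(seed)
    total = sum(max_relative_deviation(n_rolls, rng) for _ in range(n_trials))
    return total / n_trials

small = average_deviation(60)     # little data: faces look very uneven
large = average_deviation(6000)   # more data: counts settle near 1/6 each
print(f"60 rolls:   worst face is typically off by {small:.0%}")
print(f"6000 rolls: worst face is typically off by {large:.0%}")
```

The die is fair in both cases. Only the amount of data changes, yet the small experiments routinely look "loaded".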
Is this die loaded?
Are these two dice loaded?
Two dice: non-uniform distribution
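The non-uniform distribution for two dice can be derived by counting. A minimal sketch, assuming two fair six-sided dice and looking at their total (the function name is made up for illustration):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution():
    """Exact probability of each total when two fair six-sided dice are rolled."""
    totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(count, 36) for total, count in sorted(totals.items())}

dist = two_dice_distribution()
for total, prob in dist.items():
    # One '#' per way of rolling this total, out of 36 equally likely outcomes.
    print(f"{total:2d}  {str(prob):>5}  {'#' * int(prob * 36)}")
```

A total of 7 can be rolled six ways and a total of 2 only one way, so an uneven histogram of totals is what fair dice are expected to produce.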
Statistics Without Numbers
Is something causing cancer?
Cancer rate per county. Darker = greater incidence of cancer.

From Graphical Inference for Infovis, Wickham et al.
Which of these is real data?
Global temperature record
How likely is it that the temperature won't increase over next decade?
From The Signal and the Noise, Nate Silver
It is conceivable that the 14 elderly people who are reported to have
died soon after receiving the vaccination died of other causes.
Government officials in charge of the program claim that it is all a
coincidence, and point out that old people drop dead every day. The
American people have even become familiar with a new statistic:
Among every 100,000 people 65 to 75 years old, there will be nine or
ten deaths in every 24-hour period under most normal circumstances.
Even using the official statistic, it is disconcerting that three elderly
people in one clinic in Pittsburgh, all vaccinated within the same hour,
should die within a few hours thereafter. This tragedy could occur by
chance, but the fact remains that it is extremely improbable that such
a group of deaths should take place in such a peculiar cluster by pure
coincidence.
- New York Times editorial, 14 October 1976

Assuming that about 40 percent of elderly Americans were
vaccinated within the first 11 days of the program, then about 9 million
people aged 65 and older would have received the vaccine in early
October 1976. Assuming that there were 5,000 clinics nationwide, this
would have been 164 vaccinations per clinic per day. A person aged
65 or older has about a 1-in-7,000 chance of dying on any particular
day; the odds of at least three such people dying on the same day
from among a group of 164 patients are indeed very long, about
480,000 to one against. However, under our assumptions, there were
55,000 opportunities for this extremely improbable event to occur:
5,000 clinics, multiplied by 11 days. The odds of this coincidence
occurring somewhere in America, therefore, were much shorter: only
about 8 to 1.
- Nate Silver, The Signal and the Noise, Ch. 7 footnote 20
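Silver's arithmetic can be reproduced with a binomial calculation. The sketch below uses only the figures quoted above (a 1-in-7,000 daily death rate, 164 patients per clinic per day, 5,000 clinics, 11 days). Treating deaths as independent events is an assumption of the sketch, and the variable names are illustrative.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial probability of k or more events among n independent trials."""
    return 1 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

p_death = 1 / 7000          # daily death probability, age 65 and older
patients = 164              # vaccinations per clinic per day
opportunities = 5000 * 11   # clinics multiplied by days

# Chance of three or more deaths at one particular clinic on one particular day.
p_one_clinic_day = prob_at_least(3, patients, p_death)

# Chance that this happens at least once across every clinic-day.
p_somewhere = 1 - (1 - p_one_clinic_day) ** opportunities

odds_one = (1 - p_one_clinic_day) / p_one_clinic_day
odds_somewhere = (1 - p_somewhere) / p_somewhere
print(f"one clinic, one day: about {odds_one:,.0f} to 1 against")
print(f"somewhere in the program: about {odds_somewhere:.1f} to 1 against")
```

The event is extremely unlikely at any one clinic on any one day, but with 55,000 chances to occur it stops being surprising.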

The Howland Will Trial
Randomization to detect insider trading
Looking at executives' trading in the week before their companies
made news, the Journal found that one of every 33 who dipped in and
out posted average returns of more than 20% (or avoided 20%
downturns) in the following week. By contrast, only one in 117 executives
who traded in an annual pattern did that well.
$%@*! P-Values
P-value
Pr(data at least as extreme as your data | null hypothesis)

What's it good for? What's it bad for?
From A dirty dozen: twelve p-value misconceptions, S. Goodman
Is one classroom better than another?
T-test for two groups with different variance. Expected to have a
T-distribution under the null hypothesis of equal scores.
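A t-test for two groups with different variances is usually called Welch's t-test. The sketch below computes its statistic and the Welch-Satterthwaite degrees of freedom. The class scores are made-up numbers, and turning t into a p-value (which needs the t-distribution's CDF) is left out.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard error of each group's mean (sample variance / n).
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

class_a = [1, 2, 3, 4, 5]   # hypothetical scores, not from the slides
class_b = [2, 3, 4, 5, 6]
t, df = welch_t(class_a, class_b)
print(f"t = {t:.2f}, df = {df:.1f}")   # t = -1.00, df = 8.0
```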
Reasons for possible differences
Things that depend on which classroom a student is in
Things that don't depend on which classroom they're in

Break the relationship
[Figure: observed difference between classes]
14% of all resamples have a class difference > observed, so p = 0.14
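"Breaking the relationship" means shuffling students between classrooms so that class membership can no longer matter, then seeing how often chance alone produces a gap as large as the real one. A minimal sketch follows. The function name is invented, and counting ties (>=) and testing one direction only are choices made here, not taken from the slides.

```python
import random

def permutation_p_value(class_a, class_b, n_resamples=10_000, seed=0):
    """Fraction of label shuffles whose mean difference is at least the observed one."""
    rng = random.Random(seed)
    n_a = len(class_a)
    observed = sum(class_a) / n_a - sum(class_b) / len(class_b)
    pooled = list(class_a) + list(class_b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)                      # break the link between score and class
        fake_a, fake_b = pooled[:n_a], pooled[n_a:]
        diff = sum(fake_a) / len(fake_a) - sum(fake_b) / len(fake_b)
        if diff >= observed:
            hits += 1
    return hits / n_resamples

# Hypothetical test scores, for illustration only.
scores_a = [78, 85, 91, 70, 88, 95, 82]
scores_b = [72, 80, 85, 69, 77, 90, 74]
print(f"p = {permutation_p_value(scores_a, scores_b):.3f}")
```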

New samples from the data
Computing the sampling distribution
Bootstrapping: resample with replacement. This gives an excellent
approximation of the sampling distribution, even if non-normal.
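A bootstrap can be sketched in a few lines: draw new samples of the same size from the data, with replacement, and recompute the statistic each time. The data, the choice of the mean as the statistic, and the percentile interval are all assumptions of this example.

```python
import random

def bootstrap_means(data, n_resamples=5000, seed=0):
    """Sorted means of resamples drawn from data with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return sorted(sum(rng.choices(data, k=n)) / n for _ in range(n_resamples))

def percentile_interval(sorted_values, level=0.95):
    """Read an interval straight off the sorted bootstrap distribution."""
    tail = (1 - level) / 2
    lo = sorted_values[int(tail * len(sorted_values))]
    hi = sorted_values[int((1 - tail) * len(sorted_values)) - 1]
    return lo, hi

# Hypothetical, deliberately skewed measurements.
data = [2, 3, 3, 4, 4, 5, 5, 6, 9, 21]
means = bootstrap_means(data)
lo, hi = percentile_interval(means)
print(f"sample mean {sum(data) / len(data):.1f}, 95% interval ({lo:.1f}, {hi:.1f})")
```

No normality assumption enters anywhere. The shape of the interval comes from the data itself.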
A dirty dozen: twelve p-value misconceptions, S. Goodman
Bayesian inference
A more complete theory
Compare probability of multiple alternatives.

Did the stoplight reduce accidents?
[Figure: panels 1-9, each with axis ticks 0-8: Simulated without stoplight]
[Figure: panels 1-9, each with axis ticks 0-8: Simulated with a 50% effective stoplight]
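Panels like these can be imitated by simulating accident counts under each hypothesis. The slides do not say how their panels were generated, so everything specific here is an assumption: Poisson-distributed counts, a base rate of 4 accidents per period, and the function names.

```python
import math
import random

def poisson(rate, rng):
    """Draw one Poisson count using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, running = 0, rng.random()
    while running > threshold:
        count += 1
        running *= rng.random()
    return count

def simulate_panels(base_rate, effectiveness, n_panels=9, n_periods=9, seed=0):
    """Simulate n_panels alternative histories of accident counts.

    effectiveness=0.0 means the stoplight does nothing;
    effectiveness=0.5 means it halves the accident rate.
    """
    rng = random.Random(seed)
    rate = base_rate * (1 - effectiveness)
    return [[poisson(rate, rng) for _ in range(n_periods)] for _ in range(n_panels)]

without_light = simulate_panels(base_rate=4.0, effectiveness=0.0)
with_light = simulate_panels(base_rate=4.0, effectiveness=0.5, seed=1)
for label, panels in (("without", without_light), ("with 50%", with_light)):
    print(label, panels[0])
```

Comparing the real record against many simulated histories of each kind shows which hypothesis produces data that looks like what was observed.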
Bayes learns from evidence
Pr(H|E) = Pr(E|H) Pr(H) / Pr(E)
or
Pr(H|E) = Pr(E|H)/Pr(E) * Pr(H)
Posterior, Pr(H|E): How likely is H given evidence E?
Likelihood, Pr(E|H): Probability of seeing E if H is true
Base rate, Pr(E): How commonly do we see E at all?
Prior, Pr(H): How likely was H to begin with?
Probability distribution over hypotheses
Is the NYPD targeting mosques for stop-and-frisk?
H0: Never    H1: Once or twice    H2: Routinely
*Tricky: you have to imagine a hypothesis before you can assign it a probability.
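Updating a distribution over hypotheses is Bayes' rule applied to each hypothesis, followed by normalization. In the sketch below every number is invented for illustration: the priors, and the probability of some piece of evidence E under each hypothesis, are not estimates of anything real.

```python
def update(prior, likelihood):
    """Apply Bayes' rule across mutually exclusive, exhaustive hypotheses."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())   # Pr(E), by the law of total probability
    return {h: value / evidence for h, value in unnormalized.items()}

# Made-up numbers: prior belief in each hypothesis, and Pr(E | hypothesis).
prior = {"H0 never": 0.5, "H1 once or twice": 0.3, "H2 routinely": 0.2}
likelihood = {"H0 never": 0.01, "H1 once or twice": 0.2, "H2 routinely": 0.6}

posterior = update(prior, likelihood)
for hypothesis, prob in posterior.items():
    print(f"{hypothesis:18s} {prior[hypothesis]:.2f} -> {prob:.3f}")
```

A hypothesis missing from the dictionary can never receive any probability, which is the point of the footnote above.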
Parameter Estimation
Computing probability for a continuum of hypotheses
Pr(θ|E) = Pr(E|θ)/Pr(E) * Pr(θ)
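With a continuum of hypotheses, the same rule can be approximated on a grid of parameter values. The sketch assumes a setting the slides do not specify: an unknown rate between 0 and 1, a flat prior, and evidence consisting of 7 successes in 20 trials.

```python
def grid_posterior(successes, trials, grid_size=1001):
    """Posterior over a rate theta in [0, 1], evaluated on an even grid."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size   # flat prior: every rate equally plausible beforehand
    likelihood = [t**successes * (1 - t) ** (trials - successes) for t in thetas]
    unnormalized = [lk * pr for lk, pr in zip(likelihood, prior)]
    total = sum(unnormalized)   # plays the role of Pr(E)
    return thetas, [u / total for u in unnormalized]

thetas, post = grid_posterior(successes=7, trials=20)
posterior_mean = sum(t * w for t, w in zip(thetas, post))
most_likely = thetas[post.index(max(post))]
print(f"posterior mean {posterior_mean:.3f}, peak at {most_likely:.2f}")
```

The result is a full curve of belief over the parameter, not a single point estimate.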

Strength of Evidence
Can we find a p-value equivalent?
There is the Bayes factor, the likelihood ratio Pr(E|H1)/Pr(E|H2):
Pr(H1|E)/Pr(H2|E)
= [Pr(E|H1)Pr(H1)/Pr(E)] / [Pr(E|H2)Pr(H2)/Pr(E)]
= Pr(E|H1)/Pr(E|H2) * Pr(H1)/Pr(H2)
Bayes Factor
OK, but what's a significant Bayes factor?
From Bayes Factors, Kass and Raftery
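The posterior-odds identity can be checked numerically. The example is hypothetical: 14 successes in 20 trials, with H1 saying the success rate is 0.7 and H2 saying it is 0.5.

```python
from math import comb

def binomial_likelihood(k, n, p):
    """Pr(k successes in n trials | success rate p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def bayes_factor(k, n, p1, p2):
    """Pr(E|H1) / Pr(E|H2) for two point hypotheses about the rate."""
    return binomial_likelihood(k, n, p1) / binomial_likelihood(k, n, p2)

bf = bayes_factor(k=14, n=20, p1=0.7, p2=0.5)
prior_odds = 1.0                    # assume H1 and H2 start out equally likely
posterior_odds = bf * prior_odds    # posterior odds = Bayes factor * prior odds
print(f"Bayes factor {bf:.2f}, posterior odds {posterior_odds:.2f} to 1")
```

The Bayes factor depends only on the evidence. The prior odds enter separately, so two readers with different priors can still agree on how strong the evidence is.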

The Garden of Forking Paths
I Fooled Millions Into Thinking Chocolate Helps Weight Loss. Here's How.
John Bohannon
Science Isn't Broken, FiveThirtyEight
Statistical significance is usually asking the wrong question.
Does the model reproduce the data?
Testing for Racial Discrimination in Police Searches of Motor Vehicles, Simoiu et al.
Causal Models
Does chocolate make you smarter?
Occupational Group                         Smoking  Mortality
Farmers, foresters, and fishermen               77         84
Miners and quarrymen                           137        116
Gas, coke and chemical makers                  117        123
Glass and ceramics makers                       94        128
Furnace, forge, foundry, and rolling mill      116        155
Electrical and electronics workers             102        101
Engineering and allied trades                  111        118
Woodworkers                                     93        113
Leather workers                                 88        104
Textile workers                                102         88
Clothing workers                                91        104
Food, drink, and tobacco workers               104        129
Paper and printing workers                     107         86
Makers of other products                       112         96

Does marriage make women safer?
How correlation happens
X → Y: X causes Y
X ← Y: Y causes X
X ← Z → Y: Z causes X and Y
X ← ? → Y: hidden variable causes X and Y
X   Y: random chance!
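The "random chance" case is easy to underestimate. The sketch below generates pairs of completely unrelated series and reports the strongest correlation found among them. The series lengths (10 and 1000) and the number of pairs (200) are arbitrary choices for illustration.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def strongest_chance_correlation(n_points, n_pairs=200, seed=0):
    """Largest |r| among n_pairs pairs of independent random series."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_pairs):
        xs = [rng.gauss(0, 1) for _ in range(n_points)]
        ys = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson_r(xs, ys)))
    return best

few = strongest_chance_correlation(10)
many = strongest_chance_correlation(1000)
print(f"10 points per series:   strongest |r| = {few:.2f}")
print(f"1000 points per series: strongest |r| = {many:.2f}")
```

Search enough short series and an impressive correlation will turn up even though nothing causes anything.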
Guns and firearm homicides?
X → Y: if you have a gun, you're going to use it
X ← Y: if it's a dangerous neighborhood, you'll buy a gun
X   Y: the correlation is due to chance

Beauty and responses
X → Y: telling a woman she's beautiful makes her respond less
X ← Z → Y: if a woman is beautiful,
1) she'll respond less
2) people will tell her that
Beauty is a "confounding variable." The correlation is real, but you've misunderstood the causal structure.
What an experiment is:
intervene in a network of causes
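Intervening means setting X yourself, which cuts every arrow that normally points into X. The simulation below is a toy model built for illustration, not taken from the slides: a hidden Z drives both X and Y, and X has no effect on Y at all.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def correlation_xy(n, intervene, seed=0):
    """Correlation of X and Y in a world where Z causes both and X does nothing."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                # hidden common cause
        if intervene:
            x = rng.gauss(0, 1)            # experimenter assigns X at random, ignoring Z
        else:
            x = z + rng.gauss(0, 1)        # observational world: Z drives X
        y = z + rng.gauss(0, 1)            # Y depends on Z only, never on X
        xs.append(x)
        ys.append(y)
    return pearson_r(xs, ys)

r_observed = correlation_xy(5000, intervene=False)
r_experiment = correlation_xy(5000, intervene=True)
print(f"observational r = {r_observed:.2f}, experimental r = {r_experiment:.2f}")
```

The observational correlation is real, but it vanishes under intervention. That is the evidence that X was never the cause.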
Does Facebook news feed cause people to share links?
Analysis of Competing Hypotheses
Cognitive biases
Availability heuristic: we use examples that come to mind, instead of statistics.
Preference for earlier information: what we learn first has a much greater effect on our judgment.
Memory formation: whatever seems important at the time is what gets remembered.
Confirmation bias: we seek out and give greater importance to information that confirms our expectations.
Confirmation bias
Comes in many forms.
...unconsciously filtering information that doesn't fit expectations.
...not looking for contrary information.
...not imagining the alternatives.

Method of competing hypotheses
Start with multiple hypotheses H0, H1, ... HN
(Remember, if you can't imagine it, you can't conclude it!)
Go looking for information that gives you the best ability to discriminate between hypotheses.
Evidence which supports Hi is much less useful than evidence which supports Hi much more than Hj, if the goal is to choose a hypothesis.
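One way to make "ability to discriminate" concrete is the likelihood ratio: how much more probable a piece of evidence is under one hypothesis than under another. The evidence items and probabilities below are invented, and scoring by the absolute log of the ratio is one reasonable choice rather than a prescribed method.

```python
from math import log

def diagnosticity(pr_e_given, h_i, h_j):
    """Absolute log likelihood ratio; 0 means the evidence cannot separate h_i from h_j."""
    return abs(log(pr_e_given[h_i] / pr_e_given[h_j]))

# Made-up Pr(evidence | hypothesis) values for three pieces of evidence.
evidence = {
    "consistent with everything": {"H0": 0.90, "H1": 0.90},
    "mildly favors H1": {"H0": 0.40, "H1": 0.60},
    "rarely seen unless H1": {"H0": 0.02, "H1": 0.50},
}

ranked = sorted(
    evidence,
    key=lambda name: diagnosticity(evidence[name], "H1", "H0"),
    reverse=True,
)
for name in ranked:
    print(f"{diagnosticity(evidence[name], 'H1', 'H0'):.2f}  {name}")
```

The first item "supports" H1 very strongly in the everyday sense, since H1 predicts it 90% of the time, yet it scores zero because H0 predicts it just as well.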
In practice: Triangulation
A good conclusion is one which is supported by multiple lines of evidence
from multiple methods.
Philosophy ought to imitate the successful sciences in its methods, so far as
to proceed only from tangible premises which can be subjected to careful
scrutiny, and to trust rather to the multitude and variety of its arguments
than to the conclusiveness of any one. Its reasoning should not form a
chain which is no stronger than its weakest link, but a cable whose fibers
may be ever so slender, provided they are sufficiently numerous and
intimately connected.
- Charles Sanders Peirce

A difficult example
NYPD performs ~600,000 street stop and frisks per year.
What sorts of conclusions could we draw from this data? How?
Stop and Frisk Causation
Suppose you take the address of every mosque in NYC,
and discover that there are 15% more stop-and-frisks within
100m of mosques than the overall average.
Can we conclude that the police are targeting Muslims?
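A first step is to ask whether a 15% excess could arise by chance, using the same randomization idea as before: move the sites to random places and recount. The sketch runs on synthetic planar coordinates in metres, and all data and names are hypothetical. Placing sites uniformly at random is a crude null model that ignores population density, foot traffic and where police are deployed, which are exactly the confounders a real analysis would have to address.

```python
import random

def stops_near(stops, sites, radius):
    """Count stops lying within radius of at least one site."""
    r2 = radius**2
    return sum(
        any((sx - x) ** 2 + (sy - y) ** 2 <= r2 for x, y in sites)
        for sx, sy in stops
    )

def placement_p_value(stops, sites, radius, bounds, n_resamples=200, seed=0):
    """Share of random site placements with at least as many nearby stops as observed."""
    rng = random.Random(seed)
    (x0, y0), (x1, y1) = bounds
    observed = stops_near(stops, sites, radius)
    hits = 0
    for _ in range(n_resamples):
        fake_sites = [(rng.uniform(x0, x1), rng.uniform(y0, y1)) for _ in sites]
        if stops_near(stops, fake_sites, radius) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)   # add-one correction avoids reporting p = 0

# Synthetic city: a 10 km square with scattered stops plus a planted cluster at each site.
demo_rng = random.Random(1)
city = ((0.0, 0.0), (10_000.0, 10_000.0))
sites = [(2000.0, 2000.0), (5000.0, 7000.0), (8000.0, 3000.0)]
stops = [(demo_rng.uniform(0, 10_000), demo_rng.uniform(0, 10_000)) for _ in range(400)]
stops += [(x + 10.0, y) for x, y in sites for _ in range(20)]

p = placement_p_value(stops, sites, radius=100.0, bounds=city)
print(f"p = {p:.3f}")
```

Even a tiny p-value here would only rule out chance. It says nothing about why stops cluster near the sites, so it cannot by itself support a claim of targeting.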
