
Welcome back. This is week six, and this is module five of five.

In this module, we're going to talk about statistics. Now, obviously, you could spend a whole Ph.D. or more on this topic. The point of this module, the point I want to get across to you, is about p-values. I'll explain what those are. The take-home message is that p-values are not an indicator of quality research. Let's explore what that means.

So first, let's start with some basic jargon, some basic understanding. We can imagine a target population, a group we wish to study. Sometimes epidemiologists call this the source population. It might be persons living in America in the year 2013. It might be schoolchildren in a school at some given time. It might be persons living in South Africa. Whatever it is, the target population is what we want to learn about.

The target population has some characteristics. Those characteristics could, in theory, be measured if we measured everybody in the target population, did a census if you will. That kind of measure is called a parameter. That is, it's a summary measure from the whole target population.

But it's very rare that we can measure everyone in a whole target population, so instead we take a sample. The best samples, of course, are representative of the target population, and the best way to get a representative sample is random sampling. In any case, we draw a sample from the target population, and the summary measure we take from that sample is called a statistic.

So here's the big distinction: a statistic comes from a sample, and a parameter comes from the whole target population.

The key idea of inference, more technically statistical inference, is the idea of taking repeated samples from the target population. Importantly, this is not usually done. Instead, we imagine it. We imagine taking repeated samples from the same target population. This is statistical theory.

So here we can see the target population, and we take sample number one. We take another independent sample from the target population; call it two. Yet another, three. And so on and so forth. Each of these samples, either real or hypothetical, has a summary measure, maybe an average health measure. This is a statistic. In statistics we write the letter y for the outcome variable; the subscript one says it came from the first sample, and the bar above it, that horizontal bar, represents an average, sometimes called the mean. So here we have four means, four averages, from four independent samples.

The idea is to do this over and over again, frankly, infinitely many times. Then if we took those averages and graphed them on a piece of graph paper, what we'd see is the well-known bell-shaped curve, sometimes called a normal distribution. At the center would be the mean, the average of all the averages. There's symmetry here; that happens a lot, but not always. That is, the shape is the same on both sides. You could draw this distribution on a piece of paper, fold it onto itself, and it would match. That's one way to think about symmetry.

In any case, this is a very special distribution, because it's the distribution of the sample statistics, of the averages from the repeated samples. This distribution has some variability, some spread. That variability of the sample statistics is called the standard error. Technically, the standard error is the standard deviation of the sampling distribution. That's a lot of jargon. But all you need to know is that there is some variation in averages if we keep taking averages from the target

population, one sample after another.

Now suppose we compare two groups: one exposed to a virus, the other not; one gets a treatment program, the other does not. We compare the average in the treated group with the average in the control or comparison group, and they're either the same or different. That's really what we're doing in epidemiology. This group has a lot of social capital, this group doesn't: how does their health differ? This group got exposed to the virus, this group didn't: what's the prevalence of a cold, or some sort of symptom?

If the average difference between these two groups is 0, we say there's no difference. On the other hand, if the average difference is above or below 0, it doesn't really matter which, then we say the groups differ. But the rub in statistics, the important point, is that there's variation around each of those summary measures, those averages. The first group could have an average of five, the second group an average of seven, but that five and that seven each have some variability. So we use some statistical calculations to discern, to figure out, whether 5 minus 7 is actually a difference, or whether it's due to chance alone. Chance alone is the key idea.

To do this, we often calculate a statistic called the p-value. There's some math behind it, and you can Google it if you want; I'm not going to go into that here. The point I want to make is that the p-value has no bearing on the quality of the study. If we calculate a p-value associated with some statistical test, some average difference, and we see that the p-value is less than some threshold we set, usually .05, so it's small, then we can say, hey, we can reject the null hypothesis. All that means is we say that there is some difference.

So often if you read the literature, or sometimes the newspaper or some news website, they'll say a study showed this is different from that, p equals .04. That could be true, but it doesn't mean the study is quality. And it's study quality that I want to convey to you.

So what does this mean? Well, we have some t statistic; that's just jargon. In the numerator here is our difference, the 5 versus 7, which is 2, or negative 2 if you want. And underneath, in the denominator, we have the standard error, which I described earlier, in this case the standard error of those differences. This becomes a number and gets compared to some known statistical distribution. Fair enough. Then we see whether that difference lands out in this red region; in fact it can go to either side, but in this demonstration, just one direction. If the test statistic is way out here, we say, hey, the null hypothesis can be rejected; the p-value is small.

All true, but what this doesn't tell us is whether there are any other biases: selection bias, confounding, all the other things that epidemiologic methodologists and people from other disciplines worry about. Did we have self-selection, or is this a really well done randomized trial with no biases, or a really well done cohort or observational study with no biases? If there are no biases, the p-value is meaningful. If there are, the p-value can be misleading. That's the point.

So a p-value of, say, 0.04 tells us that if the null were true, an effect or an association at least as large as the one observed, the 5 versus 7, a value of two, would occur four out of 100 times. Here we're getting a measure of chance. The p-value, importantly, is not the probability that the alternative hypothesis is true.
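The repeated-sampling thought experiment can also be sketched in a few lines of code. This is only an illustration with made-up numbers, not data from the lecture: we invent a population, treat its mean as the parameter, draw many samples, and check that the spread of the sample means (the standard error) is close to the sigma-over-root-n that statistical theory predicts.

```python
import random
import statistics

# Hypothetical target population of 10,000 health scores.
# All numbers here are invented purely for illustration.
random.seed(0)
population = [random.gauss(70, 10) for _ in range(10_000)]

# The parameter: a summary of the WHOLE target population (a census).
parameter = statistics.mean(population)

# The thought experiment: draw many independent samples and record
# each sample's mean -- each one of those means is a statistic.
n = 50
sample_means = [statistics.mean(random.sample(population, n))
                for _ in range(2_000)]

# The standard error is the standard deviation of this sampling
# distribution; theory says it should be close to sigma / sqrt(n).
standard_error = statistics.stdev(sample_means)
theoretical = statistics.stdev(population) / n ** 0.5

print(f"parameter (population mean): {parameter:.2f}")
print(f"mean of the sample means:    {statistics.mean(sample_means):.2f}")
print(f"empirical standard error:    {standard_error:.3f}")
print(f"theory, sigma/sqrt(n):       {theoretical:.3f}")
```

With these invented numbers the empirical and theoretical standard errors land close together, which is the whole point of the standard-error jargon: the spread of the statistics across imagined repeated samples is something we can compute from a single sample.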
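The t statistic just described, the group difference divided by its standard error, can likewise be sketched. Again, both groups and all numbers are invented to echo the lecture's 5-versus-7 example, and because the samples here are large, the two-sided p-value is approximated with the normal distribution rather than an exact t distribution.

```python
import random
import statistics

# Two invented groups: a treated group averaging about 7 and a
# control group averaging about 5 (the lecture's 5-versus-7 example).
random.seed(1)
treated = [random.gauss(7, 2) for _ in range(100)]
control = [random.gauss(5, 2) for _ in range(100)]

# Numerator: the average difference between the two groups.
diff = statistics.mean(treated) - statistics.mean(control)

# Denominator: the standard error of that difference.
se_diff = (statistics.variance(treated) / len(treated)
           + statistics.variance(control) / len(control)) ** 0.5

t = diff / se_diff

# With n = 100 per group the t distribution is close to normal,
# so we approximate a two-sided p-value with the normal CDF.
p_value = 2 * (1 - statistics.NormalDist().cdf(abs(t)))

print(f"difference: {diff:.2f}")
print(f"standard error of difference: {se_diff:.2f}")
print(f"t statistic: {t:.2f}, approximate p-value: {p_value:.4g}")
```

Notice that a tiny p-value here says nothing about whether the invented "treatment" was assigned without bias; that is exactly the lecture's point.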

So this is more jargon. Again, I want to emphasize: the p-value does not indicate the quality of the study. Importantly, for over 50 years, methodologists have clearly rejected the naive, if you will, use of p-values, p-values without appreciating the other aspects of the study. But the practice goes on today, which is why I'm spending this module to remind you, or perhaps teach you for the first time: if you see a study that says this difference has a small p-value, that does not mean it's true. It also doesn't mean it's a quality or unbiased estimate. We talked about that earlier in this lecture.

Some cautions, some take-home messages. P-values don't convey the strength of a relationship. It could be 5 minus 7, or 5 minus 700; there's just some difference. A p-value does not imply or connote how big that difference is. We don't want to compare p-values from tests across different studies; a p-value is within a study. There are some exceptions, but for now we'll say, don't go across studies. Furthermore, we don't want to compare p-values within the same study, this is the effect for race, this is the effect for SES, this is the effect for gender; that is a very complicated analysis, and it takes some caution. If you're an expert, of course; if you're new to this, I urge caution.

The important point I'm going to come back to again is that p-values don't tell us that the effect is identified, this idea we talked about in the last lecture, nor that it is not confounded, and in epidemiology that is the key point. So this means that identification, confounding bias, and potential alternative explanations, these things, far more than statistics with a p-value, are the most important parts of judging a social epidemiologic study.
