Tuesday, December 31, 2013

When is absence of evidence = evidence of absence?

You often hear the old mantra "absence of evidence is not evidence of absence," but I think that this is an oversimplification. The truth is, sometimes it is, and sometimes it isn't. It depends on the experiment, and on how well you understand the implications of its results.

There are times when you can make an excellent case that something is absent because there is no evidence of it. If you are in a small, well-lit room, you probably don't need to look under many things to convince any reasonable person that there is no tiger there. In general, you need two things:
  1. You need a solid argument for what sort of evidence you would expect if what you are looking for is there, and how that would be different from the null hypothesis (not there). 
  2. You need a set of data that tends to falsify the hypothesis, thus advancing the null hypothesis.
We're not talking about "proof" here. I would just as soon we not talk about proof much at all, unless the topic is logic or mathematics. Rather, we are simply raising or lowering the odds of the null hypothesis, i.e. that the claim in question is not true.



Consider the "tiger in a small room" hypothesis. We can right away come up with some predictions based upon the hypothesized presence of a tiger:
  • Tigers are large. If the light is on, we should see the tiger.
  • Tigers are wild animals and probably have a bit of a smell to them we could detect.
  • Tigers are dangerous when cornered or even when they're not. We might very well feel a tiger in manifestly unpleasant ways, or at least find the mangled remains of some other poor thing that has felt the tiger.
  • We would probably hear the tiger breathing, if not roaring.
You can probably think of a few other things that would constitute evidence of tiger presence. In each case, you reason from what you know about tigers to the kinds of evidence you would expect.

For the null hypothesis of no tiger, the probability of any of these observations is much reduced. If they were all completely absent, even after an inspection of the closet, you would be fully justified in saying "no tiger here!"

The reason this is easy is our assumption of a small, well-lit room. In a dark and very large cavern, your probability of detecting the tiger by sight, smell, or sound in time to avoid becoming his meal is much reduced, and you will have to depend on how likely you think it is that a tiger might be in there in the first place - the so-called "prior odds." So, absence of tiger evidence may or may not be a surprise, but in this case, it is not much in the way of evidence of absence.

Another way to look at this is the likelihood ratio. This is simply the ratio of the probability of the evidence at hand given the hypothesis to the probability of the SAME evidence, given the null hypothesis. The odds that a hypothesis is true ("there really is a tiger in here") go up and down in proportion to the likelihood ratio. So, if in the small, well-lit room, there is no sight, sound, smell or touch of a tiger, then the likelihood ratio is low (much less than 1), and the odds that there is a tiger in that room are decreased markedly. In this case, absence of evidence really is evidence of absence.
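
For those who like symbols, this is the odds form of Bayes' rule. The notation is my own choice for this post: H is the hypothesis ("there is a tiger"), H_0 is the null hypothesis, and E is the evidence at hand (which may be "nothing seen, heard, or smelled").

```latex
\underbrace{\frac{P(H \mid E)}{P(H_0 \mid E)}}_{\text{posterior odds}}
  \;=\;
\underbrace{\frac{P(E \mid H)}{P(E \mid H_0)}}_{\text{likelihood ratio}}
  \;\times\;
\underbrace{\frac{P(H)}{P(H_0)}}_{\text{prior odds}}
```

A likelihood ratio below 1 shrinks the odds of H, a ratio above 1 grows them, and a ratio of exactly 1 leaves them where they started.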

In the large, dark cavern, however, the tiger may well be in there even if there is no evidence of it available to you. In this case, the likelihood ratio is close to 1, and so the odds of a tiger in the cavern are not much reduced by the lack of evidence. Absence of evidence is, at best, very weak evidence of absence.
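
To make the contrast between the room and the cavern concrete, here is a minimal Python sketch of the odds update. Every number in it is made up for illustration: I am assuming even prior odds, a 1-in-1000 chance of noticing nothing if a tiger shares your small room, and a 90% chance of noticing nothing if a tiger is somewhere in the cavern. The function names are my own as well.

```python
def update_odds(prior_odds: float,
                p_evidence_if_true: float,
                p_evidence_if_null: float) -> float:
    """Multiply the prior odds by the likelihood ratio to get posterior odds."""
    likelihood_ratio = p_evidence_if_true / p_evidence_if_null
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: float) -> float:
    """Convert odds in favor of a hypothesis into a probability."""
    return odds / (1.0 + odds)


# Illustrative assumptions only, not measurements.
PRIOR_ODDS = 1.0               # even odds of a tiger before looking
P_NOTHING_IF_NO_TIGER = 1.0    # with no tiger, "nothing noticed" is certain

# Small, well-lit room: a tiger would almost never go unnoticed.
small_room_odds = update_odds(PRIOR_ODDS, 0.001, P_NOTHING_IF_NO_TIGER)

# Huge, dark cavern: a tiger would usually go unnoticed.
cavern_odds = update_odds(PRIOR_ODDS, 0.9, P_NOTHING_IF_NO_TIGER)

print("small room:", odds_to_probability(small_room_odds))
print("cavern:    ", odds_to_probability(cavern_odds))
```

With these assumed inputs the same observation, "nothing noticed," crushes the odds in the room and barely moves them in the cavern. Change the prior odds and both answers shift, but the room still moves them a thousandfold and the cavern hardly at all.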

To sum up, it's all about the likelihood ratio. If the ratio is low (well below 1), then absence of evidence is evidence of absence. If it is close to 1, it isn't. If for some reason you can't honestly estimate this ratio, then you can't tell me what the data we have are evidence for (or against). We like to do experiments or observations where the likelihood ratio is as far from 1 as possible, but this can take some patience.

Now, what about this prior odds stuff? Where do the prior odds come from? When little or no evidence has been evaluated, then the truth is they are pretty arbitrary, and largely depend on personal biases. I can't emphasize enough that prior odds don't form the basis of much of an argument one way or another. We have to start making meaningful predictions and collecting evidence, and then the odds can start moving up or down as we learn more.

We're never really done with this process, but sooner or later we may reach a point where the odds of a hypothesis given the evidence available are so low (or so high), that it's time to move on.  At that point we have a belief - we lie down and sleep as if there is no tiger to worry about. Beliefs are not absolute, though, and there can always be a seed of doubt, and that's how we like it.

What does this have to do with SETI and aliens and stuff? I would argue that instead of looking for a tiger in a small room, we are more likely in the huge cavern, hypothesizing the presence of a small, stealthy black cat. All we have at present to search the cavern is a small candle. OK, let's not beat this analogy to death, but you get the idea: the likelihood ratio is nearly 1, so whatever you think the prior odds are, the search so far barely budges them.

Let me use a more straightforward metaphor with a simple search space. Let us say you have 1000 boxes, and your hypothesis is that there is a valuable diamond in one of the boxes. You can open a box, look into it, and reliably determine whether there is a diamond in that box or not. Let us say you have opened two boxes, and found no diamond. The probability of this result if there is no diamond at all (the null hypothesis) is 1. The probability of this result if there is one diamond is 998/1000 = 0.998, so the likelihood ratio is 0.998 - the 2 trials are very weak evidence that there is no diamond. On the other hand, if you opened 500 boxes and found no diamond, the likelihood ratio is now only 0.5 (if there could be more than 1 diamond, I get about 0.37, though that figure depends on how the extra diamonds are assumed to be distributed). That cuts the odds of a diamond in half, so it is real evidence that there is no diamond, but not persuasive. The more of the search space you cover, the more the likelihood ratio goes down.
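
The single-diamond arithmetic is easy to check. Here is a small Python sketch under the assumptions stated above: exactly one diamond if the hypothesis is true, every box equally likely to hold it, and a look inside a box that never misses. The function name is my own invention.

```python
from fractions import Fraction


def diamond_likelihood_ratio(total_boxes: int, opened_empty: int) -> Fraction:
    """Likelihood ratio after opening `opened_empty` boxes and finding nothing.

    Numerator:   P(no diamond seen | one diamond hidden uniformly at random)
                 = (boxes still closed) / (all boxes).
    Denominator: P(no diamond seen | no diamond anywhere) = 1.
    """
    if not 0 <= opened_empty <= total_boxes:
        raise ValueError("opened_empty must be between 0 and total_boxes")
    p_if_diamond = Fraction(total_boxes - opened_empty, total_boxes)
    p_if_no_diamond = Fraction(1)
    return p_if_diamond / p_if_no_diamond


for opened in (2, 500, 990, 1000):
    ratio = diamond_likelihood_ratio(1000, opened)
    print(opened, "boxes opened -> likelihood ratio", float(ratio))
```

The ratio falls in a straight line as boxes are opened, and only hits zero when the last box turns up empty. Multiply it by whatever prior odds you started with to get your odds of a diamond now.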

SETI researchers argue (persuasively, I think) that they haven't opened very many of the boxes yet. Perhaps one day, after a much more exhaustive tour of the search space, the likelihood ratio will be well below 1, and we can start meaningfully wondering about the eerie silence. Until then, the search continues and no one should be too discouraged.

Dream of the Open Channel is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at http://disownedsky.blogspot.com.