Wednesday, April 23, 2014

Rosmarus et Naupegus


The Walrus and the Carpenter is another poem from Lewis Carroll's Through the Looking-Glass. It is the semi-nonsensical story of a walrus, a carpenter and some oysters. The poem is one of my favorites, and thus an obvious candidate to be translated into Latin. A literal de-translation is given below the parallel translation.

Rosmarus et Naupegus

          A Luviso Carollo
          Latinatum a Nadavo Cravito

Sol nitebat in mare,
Nitens summa cum vire
Fluctus conans facere
Blandos et clarosque
Visu mirabilissime
In media nocte

Maeste luna nitebat,
Putabat nam soli
Haud opus esse adesse
Post finem quin diei.
Dicit "venisse, irruere
Improba sunt mihi,"

Mare umet udissimum,
Litus aridissimum
Nimbus nullus videatur
Abest nam videndum:
Super volat avis nulla—
Abest nam volandum.

Rosmarus et Naupegus
Juxtim ambulabant;
Moles harenae videre
Quam misere flebant
"Si modo hoc depurgatum,
Quam bonum." dicebant

"Si servae septem scopis quot
Averrent quot menses,
"Num possent" dicit Rosmarus
"Purgarene putes?"
Naupegus dicit "Dubito,"
Flens lacrimas tristes.

"O Ostreae, sequamini"
Obsecrat Rosmarus
“Loquamur salsam per actam
atque ambulemus:
Sed summum quattuor sumamus,
Manum cuique demus.”

Ostrea maxima spicit,
Nulla verba dicit:
Ostrea maxima nictat,
Grave caput quatit—
Linquere ostrearium
Videlicet nolit.

Minores quattuor festinant,
Muneri studentes:
Aliclae tersae, frontes lautae,
Calcei elegantes—
Sane mirabilissime,
Non pedes habentes.

Ostreae quattuor sequuntur,
Et quattuor aliae.
Veniunt iam et gregatim,
Et crescenti multae.
Per undas spumas saliunt,
Petunt portum harenae.

Rosmarus et Naupegus
Stadium gradiuntur
Tunc commode in scopulo
Demisso nituntur
Ordineque ostreulae
Nunc opperiuntur.

"Adest tempus," dicit Rosmarus
"Semonis de multis:
Naves, calcei, cera sigilli,
Et reges et caulis
Ac porci num alarentur,
Causa fervoris maris."

“Mora sodes,” testae clamant
“Antequam loquaris;
Exanimatae et pingues
Sunt namque e nobis!”
“Festina lente!” Naupegus
Dicit valde gratis.

“Massae panis” dicit Rosmarus
“Nobis opus maximum”
“Enimvero optimi sunt
Piper et acetum—
Ostreulae, si paratae,
Inchoamus comesum.”

Ostreae “Atqui non nostrum!”
Clamant livescentes.
“Factum atrum sit nobis
Post gratias tales!”
Rosmarus dicit “Bella nox,
Miramini species?”

“Quam comes sunt quod venistis,
Et estis quam belli!”
Tacebat Naupegus nisi
“Seca frustum mihi:
Opto ut minus surdus sis—
Iam bis te poposci!”

“Turpe eas dolo capere
Est” dicit Rosmarus
“Post tam longe eduximus
Tam cursim egimus!”
Tacebat Naupegus nisi
“Butyrum crassius!”

“Vos lacrimo” ait Rosmarus
“Vostrum me miseret.”
Singultibus et lacrimis
Maximas diribet
Mucinium ante oculos
Fundentes praetendet.

“Ostreae,” dicit Naupegus,
“Bene cucurrerunt!
Debemus domum redire?”
At voces nullae reddunt—
Et haud mirabilissime,
Quod quamque ederunt.

The Walrus and the Carpenter

          By Lewis Carroll


The sun was shining on the sea,
Shining with all his might:
He did his very best to make
The billows smooth and bright--
And this was odd, because it was
The middle of the night.

The moon was shining sulkily,
Because she thought the sun
Had got no business to be there
After the day was done--
"It's very rude of him," she said,
"To come and spoil the fun!"

The sea was wet as wet could be,
The sands were dry as dry.
You could not see a cloud, because
No cloud was in the sky:
No birds were flying overhead--
There were no birds to fly.

The Walrus and the Carpenter
Were walking close at hand;
They wept like anything to see
Such quantities of sand:
"If this were only cleared away,"
They said, "it would be grand!"

"If seven maids with seven mops
Swept it for half a year,
Do you suppose," the Walrus said,
"That they could get it clear?"
"I doubt it," said the Carpenter,
And shed a bitter tear.

"O Oysters, come and walk with us!"
The Walrus did beseech.
"A pleasant walk, a pleasant talk,
Along the briny beach:
We cannot do with more than four,
To give a hand to each."

The eldest Oyster looked at him,
But never a word he said:
The eldest Oyster winked his eye,
And shook his heavy head--
Meaning to say he did not choose
To leave the oyster-bed.

But four young Oysters hurried up,
All eager for the treat:
Their coats were brushed, their faces washed,
Their shoes were clean and neat--
And this was odd, because, you know,
They hadn't any feet.

Four other Oysters followed them,
And yet another four;
And thick and fast they came at last,
And more, and more, and more--
All hopping through the frothy waves,
And scrambling to the shore.

The Walrus and the Carpenter
Walked on a mile or so,
And then they rested on a rock
Conveniently low:
And all the little Oysters stood
And waited in a row.

"The time has come," the Walrus said,
"To talk of many things:
Of shoes--and ships--and sealing-wax--
Of cabbages--and kings--
And why the sea is boiling hot--
And whether pigs have wings."

"But wait a bit," the Oysters cried,
"Before we have our chat;
For some of us are out of breath,
And all of us are fat!"
"No hurry!" said the Carpenter.
They thanked him much for that.

"A loaf of bread," the Walrus said,
"Is what we chiefly need:
Pepper and vinegar besides
Are very good indeed--
Now if you're ready, Oysters dear,
We can begin to feed."

"But not on us!" the Oysters cried,
Turning a little blue.
"After such kindness, that would be
A dismal thing to do!"
"The night is fine," the Walrus said.
"Do you admire the view?

"It was so kind of you to come!
And you are very nice!"
The Carpenter said nothing but
"Cut us another slice:
I wish you were not quite so deaf--
I've had to ask you twice!"

"It seems a shame," the Walrus said,
"To play them such a trick,
After we've brought them out so far,
And made them trot so quick!"
The Carpenter said nothing but
"The butter's spread too thick!"

"I weep for you," the Walrus said:
"I deeply sympathize."
With sobs and tears he sorted out
Those of the largest size,
Holding his pocket-handkerchief
Before his streaming eyes.

"O Oysters," said the Carpenter,
"You've had a pleasant run!
Shall we be trotting home again?"
But answer came there none--
And this was scarcely odd, because
They'd eaten every one.


The Walrus and the Carpenter (De-translated)

The sun was shining on the sea
shining with [his] greatest might.
Trying to make the billows
Both smooth and bright
[A sight] most astonishing to see
in the middle of the night

The moon was shining gloomily
For she thought, for the sun
it was not [his] business to be present
Indeed, after the end of the day.
She says “To have come and to intrude
are rude [acts] to me.”

The very wet sea is wet
And the shore is most dry
No cloud might be seen
For anything to be seen is absent
No bird flies above—
For anything to be flown is absent.

The Walrus and the [ship-wright] Carpenter
Were walking together
To see such masses of sand,
how wretchedly they wept.
“If only this might be cleaned away
How good [it would be]” they said.

“If seven maids with as many brooms
were to sweep for as many months,
don’t you think” says the Walrus
“they couldn’t clean [it]?”
The Carpenter says “I doubt [it]”,
Weeping sad tears.

“O oysters, let you follow [us]”
Entreats the Walrus
“Let us talk along the briny beach
and also let us walk:
But we might only select at most four,
that we might give a hand to each.”

The oldest oyster looks
[and] says no words:
The oldest oyster winks,
[and] gravely shakes [her] head—
To leave the oyster-bed
evidently, she would be unwilling.

Four younger [oysters] hurry,
eager for [the] gift
[Their] child’s-cloaks [were] wiped, [their] faces washed
[And their] shoes were handsome—
Certainly most astonishingly,
[as they] had no feet.

Four [more] oysters follow,
and four others.
Now they come even in flocks
and to an increasing[ly] many [oysters].
Through waves [and] foam they leap
Seeking the refuge of the sand.

The Walrus and the Carpenter
walk for a furlong
Then lean upon a rock
conveniently low.
And in a line, the little oysters
now wait.

“The time has arrived” says the Walrus
“of speaking concerning many things:
Ships, shoes, wax of a seal,
both kings and cabbage
And whether pigs be winged,
and the cause of the boiling of the sea.”

“A pause, please” the shellfish cry
“Before you might speak;
For exhausted and fat [ones]
are there among us!”
“Hasten slowly!” says the Carpenter
To the greatly thankful [oysters].

“A lump of bread,” the Walrus says
“is our greatest need:
Certainly, pepper and vinegar
are also very good—
Little oysters, if [you are] ready,
we begin the eating.”

The oysters cry “But not of us!”
becoming blue.
“[That] would be a terrible deed,
after so great kindnesses!”
The Walrus says “The night [is] pretty,
Do you admire the sights?”

“How kind you are that you came
And you are so charming!”
The Carpenter was silent except [for saying]
“Cut a scrap for me:
I wish that you might be less deaf—
I have already asked you twice!”

“A disgraceful [thing] it is
to catch them with a trick” says the Walrus
“After we have led them out so far
and drove them so swiftly.”
The Carpenter was silent except [for saying]
“The butter is too thick!”

“I weep for you” says the Walrus
“I feel sorry for you.”
With sobbing and with tears
he sorted out the biggest ones.
He extended a handkerchief
before his pouring eyes.

“Oysters,” says the Carpenter
“You have run well!
Ought we return home?”
But no voices return—
And hardly is it most astonishing
because they had eaten every one.

Monday, April 14, 2014

Probability

      Probability is a concept that is intuitively fairly easy to understand, yet one for which it is difficult to give a comprehensive, universally acceptable interpretation. In general, probabilities are assigned to events or propositions and give a way of quantitatively answering such questions as “How certain are we that X will occur/is true?”. Probabilities range from 0, (almost*) certain not to happen/be true, to 1, (almost*) certain to happen/be true.

There is division as to whether probabilities are objective facts or merely subjective. Some say the probability of an event is a measure of the propensity of a certain situation to yield a certain outcome, while others say that the probability of an event is the relative frequency of that event in the limit of a large number of relevantly identical cases, or trials. Those who take probability to be subjective offer, for instance, the conception that the probability of an event can be defined as “the price (in arbitrary units) at which you would buy or sell a bet that paid 1 unit if the event occurred and 0 if it did not occur”.

One way to circumvent all of these disputes is to leave the interpretation of probability somewhat vague and give probability a thorough mathematical basis. This can readily be done. We will deal with the probability of some event which may occur as a result of some experiment in individual trials. A probabilistic model needs two things:

  • The sample space \(\Omega\): the set of all possible outcomes of the experiment.
  • The probability law \(P\). This is a function that takes a subset of the sample space and returns a real number. To qualify as a proper probability law, it must satisfy three conditions:
    1. Non-negativity: For any subset \(A\) of \(\Omega\), \(P(A) \geq 0\).
    2. Countable Additivity: Let \(A_{1}, A_{2}, ...\) be a countable sequence of mutually disjoint sets (that is, no element in one set is in any other set), each a subset of \(\Omega\). Then \(P(A_{1} \cup A_{2} \cup ...)=P(A_{1})+P(A_{2})+...\).
    3. Normalization: The probability that some outcome in the sample space occurs is unity, that is, \(P(\Omega)=1\).
If the model satisfies these conditions, it is at least admissible, though typically we have other considerations that help us choose a model, such as simplicity. These conditions imply that the empty set, that is, the set containing no elements, has probability zero.
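To make this concrete, here is a minimal sketch in Python (my own illustration, not part of the post's formalism) of such a model for one roll of a fair six-sided die, together with checks of the three conditions; the helper name prob is just a placeholder.

```python
# A minimal probabilistic model: one roll of a fair six-sided die.
# The sample space omega is the set of outcomes; the probability law gives
# each event (subset of omega) the fraction of outcomes it contains.
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability law for a fair die: P(A) = |A| / |Omega|."""
    return len(event & omega) / len(omega)

# Non-negativity and normalization.
assert prob(set()) == 0 and prob(omega) == 1

# Additivity on disjoint events, e.g. {1, 2} and {5}.
assert prob({1, 2} | {5}) == prob({1, 2}) + prob({5})
```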

Very typical in probability theory is the use of set-theoretic or logical-operator notation. While notation varies, the fundamental concepts remain consistent. When we want the probability that events \(A\) and \(B\) will both happen (e.g. a die lands on an even number and on a number above three), we ask for the probability of their conjunction, represented as \(P(A \cap B)\) or \(P(A \& B)\) or \(P(A \wedge B)\). When we want the probability that at least one of the events \(A\) and \(B\) will happen (e.g. a die lands on an even number or on a number above three, or both), we ask for the probability of their disjunction, represented as \(P(A \cup B)\) or \(P(A \vee B)\). When we want the probability that some event will not happen (e.g. a die does not land on an even number), we ask for the probability of the complement of the event, represented \(P(\sim A)\) or \(P(\bar{A})\) or \(P(A^{c})\) or \(P(\neg A)\). The empty set is symbolized as \(\varnothing\) and is the set with no elements. Thus, taking the union of \(\varnothing\) with any other set gives the latter set, and taking the intersection yields the empty set. In addition, we can say \(\varnothing=\sim \Omega \) and \(\sim \varnothing= \Omega \). Lastly, a partition of a set \(C\) is a countable sequence of sets such that no two sets in the partition share an element (the sets are mutually exclusive) and every element of \(C\) is in some set of the sequence (the collection is collectively exhaustive).
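As a quick illustration of this notation (my own, using Python's built-in set type), take one roll of a die, with A the event "lands on an even number" and B the event "lands above three":

```python
omega = {1, 2, 3, 4, 5, 6}   # sample space: one roll of a die
A = {2, 4, 6}                # the die lands on an even number
B = {4, 5, 6}                # the die lands on a number above three

print(A & B)                 # conjunction, A intersect B  -> {4, 6}
print(A | B)                 # disjunction, A union B      -> {2, 4, 5, 6}
print(omega - A)             # complement of A             -> {1, 3, 5}
print(A & (omega - A))       # A intersect its complement  -> set(), the empty set

# A partition of omega: mutually exclusive and collectively exhaustive.
partition = [{1, 2}, {3, 4}, {5, 6}]
assert all(p.isdisjoint(q) for p in partition for q in partition if p is not q)
assert set().union(*partition) == omega
```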

Also important in the study of probability is the concept of conditional probability. This is the measure of probability based on some information: assuming that something is the case, what is the chance that some event will occur? For instance, we could ask what the chance is that a die landed on six, given that it landed on an even number. While a more thorough discussion of conditional probability can be found elsewhere, we will here merely give the formula. \(P(A|B)\), the probability that \(A\) will occur given \(B\) (read “the probability of \(A\) given \(B\)” or “the probability of \(A\) on \(B\)”), is given by the expression \[ P(A|B)=\frac{P(A \cap B)}{P(B)}\] whenever \(P(B) \neq 0\). Sometimes it is possible to assign a meaningful value to \(P(A|B)\) when \(P(B)=0\). For instance, suppose we ask “what is the probability that a homogeneous, spherical marble, when rolled, will land on point A, given that it landed either on point A or point B?” The answer then seems clearly to be 0.5. A good interpretation of conditioning is as a change of sample space: when we condition on \(B\), we are changing the sample space from \(\Omega\) to \(B\). We find that all the axioms and theorems are consistent with this view. We can also mention here the notion of independence. Two events \(A\) and \(B\) are independent iff \(P(A \cap B)=P(A) P(B)\). This implies that \(P(A|B)=P(A)\) and \(P(B|A)=P(B)\) (whenever these conditionals are defined). This means that, given the one, we gain no information about the other: it remains just as probable.
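Here is a short sketch of the conditional-probability formula and the independence test (again my own illustration, with the same die; prob and cond are made-up helper names):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(A) for a fair die, as an exact fraction."""
    return Fraction(len(event & omega), len(omega))

def cond(a, b):
    """P(A|B) = P(A intersect B) / P(B), defined only when P(B) > 0."""
    return prob(a & b) / prob(b)

six, even, above_three = {6}, {2, 4, 6}, {4, 5, 6}

print(cond(six, even))     # 1/3: the chance of a six, given an even number
# Independence means P(A intersect B) = P(A) P(B).
print(prob(even & above_three) == prob(even) * prob(above_three))    # False
print(prob(even & {1, 2, 3, 4}) == prob(even) * prob({1, 2, 3, 4}))  # True
```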

While the probability of events is relatively easy to understand, the probability of propositions is not as easy, as propositions can have only two values: true and false. How is it that we can say “The probability that you are female is 51%” when you are either definitely male or definitely female? This is where the notion of epistemic probability comes into play. Epistemic probability has to do with how likely something seems to us, or some other (rational) person, given some set of background information. For instance, in some murder, given that we see Joe’s fingerprints on the murder weapon, we deem it likely that Joe committed the murder. Though it is very difficult to give a good account, a rough way to quantify it would be in the following sense:
\(X\) has epistemic probability \(p\) given background information \(B\) (i.e. \(P(X|B)=p\)) iff the following is true: supposing we encountered this set of information \(B\) in many scenarios, we would expect \(X\) to be true in fraction \(p\) of those scenarios.
Again, this may not be a perfect analysis, but it does give a rough way to understand it. However, we must note that epistemic probability is of a significantly inferior sort to, say, experimental probability (observing that \(X\) happens in fraction \(p\) of experimental cases), or even a good theoretical probability (theory predicts that a homogeneous cube of material will, when haphazardly tossed, land on any given face with equal probability). There is a principle called the principle of indifference that says one should assign equal epistemic probabilities to two events or propositions when we have no justification to prefer one to the other. That may be a good principle as far as epistemic probability goes, but it is very deeply restricted by background information (clearly: lacking any background information to prefer one possibility to another, we are to assign them equal probabilities), and at least somewhat subjective. It is thus greatly limited by what we know: in fact, what we think is a possibility, based on our background information, may not be a possibility at all (it could be what is called an epistemic possibility). Thus, while epistemic probability may be the best we can do, given our background information, it may not be very good at all.

Statistical probability is of the epistemic sort: suppose that fraction \(p\) of population S has property X. We then come across a member M of S. Suppose we have no way to tell immediately whether M has property X, but we know M comes from S. We therefore say that M has property X with (epistemic) probability \(p\). This is a statistical probability: based on facts about the population, we deduce a probability as regards a given individual, even though, if we had more information, we could say that M had X with probability either zero or one. This is to be contrasted with what we might call stochastic probability. If we have a perfect coin and flip it fairly, then before we do so there is no information anywhere, even possibly, as to what its outcome will be. We don't know what will happen when we flip it, not because we aren't privy to some information, but because there is no information to be had. This will be the case with any genuinely indeterministic event. We might illustrate the difference between statistical and stochastic probabilities as the difference between a coin that has been flipped but is hidden from view and a coin yet to be flipped, respectively. Most physicists believe many quantum processes are genuinely stochastic, and some philosophers believe free will is also stochastic in some sense: "You will probably choose X" does not mean that, based on what I know now, there is a fairly high epistemic probability that you will choose X but that, if I knew more (e.g. that you choose X most of the time in certain circumstances), I could predict with certainty whether you will choose X or not; instead, it means that you are more disposed to choose X.



We will here give a few theorems of probability theory. We will try to present them such that their derivation is clear, but if not, then any introductory text on probability theory can give a more thorough exposition. \(A\) and \(B\) are some subsets of \(\Omega\):
\[P(\Omega)=1;\;\;\;\ P(\varnothing)=0\] \[P(A \cap \Omega)=P(A);\;\;\;\ P(A \cap \varnothing)=P(\varnothing)=0\] \[P(A \cup \Omega)=P(\Omega)=1;\;\;\;\ P(A \cup \varnothing)=P(A)\] \[0 \leq P(A) \leq 1\] \[0 \leq P(A|B) \leq 1\] \[P(A \cup \sim A)=P(A)+P(\sim A)=P(\Omega)=1\] \[P(A \cup B)+P(A \cap B)=P(A)+P(B)\] \[P(A \cap B)\leq \min(P(A),P(B))\] \[P(A\cup B) \leq P(A)+P(B)\] \[P(A \cup B) \geq \max(P(A),P(B))\] \[P(A \cap B)+P(A \cap \sim B)=P(A)\] \[P(A \cap B)=P(A)P(B|A)\] \[P(A|B)=\frac{P(B|A)P(A)}{P(B)}\] \[\frac{P(A|B)}{P(A)}=\frac{P(B|A)}{P(B)}\]
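These identities are easy to spot-check numerically. The following sketch (mine) verifies a few of them for randomly generated events in the die's sample space:

```python
import random

omega = set(range(1, 7))
prob = lambda e: len(e & omega) / len(omega)
comp = lambda a: omega - a                      # complement, ~A
cond = lambda a, b: prob(a & b) / prob(b)       # P(A|B), for P(B) > 0

random.seed(0)
for _ in range(1000):
    A = {x for x in omega if random.random() < 0.5}
    B = {x for x in omega if random.random() < 0.5}
    assert abs(prob(A | B) + prob(A & B) - (prob(A) + prob(B))) < 1e-12
    assert prob(A & B) <= min(prob(A), prob(B)) + 1e-12
    assert prob(A | B) >= max(prob(A), prob(B)) - 1e-12
    assert abs(prob(A & B) + prob(A & comp(B)) - prob(A)) < 1e-12
    if prob(A) > 0 and prob(B) > 0:
        # Bayes' form: P(A|B) = P(B|A) P(A) / P(B)
        assert abs(cond(A, B) - cond(B, A) * prob(A) / prob(B)) < 1e-12
```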


Let \(B_{1},B_{2},…\) be a partition of \(C\), then: \[P(A \cap B_{1})+P(A \cap B_{2})+…=P(A \cap C) \] \[P(A \cap C)=P(A|B_{1})P(B_{1})+ P(A|B_{2})P(B_{2})+…\] \[P(C|A)=P(B_{1}|A)+ P(B_{2}|A)+…\] Particularly, \[1=P(B|A)+P(\sim B|A)\]
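For instance (an illustration of mine, reusing the die model), take A = "lands even", C = {1, 2, 3, 4}, and the partition {1, 2}, {3, 4} of C:

```python
omega = {1, 2, 3, 4, 5, 6}
prob = lambda e: len(e & omega) / len(omega)

A = {2, 4, 6}                      # the die lands on an even number
C = {1, 2, 3, 4}
partition_of_C = [{1, 2}, {3, 4}]

# P(A intersect C) equals the sum of P(A intersect B_i) over the partition of C.
print(prob(A & C), sum(prob(A & B) for B in partition_of_C))   # 1/3 both times
```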

De Morgan’s Laws


\[P(\sim A \cap \sim B)+P(A \cup B)=1\]\[P(\sim A \cup \sim B)+P(A \cap B)=1\]


Bayes’ theorem


Let \(H_{1}, H_{2}, …\) be a partition of \(\Omega\). Then: \[P(H_{m}|E)=\frac{P(H_{m}) P(E|H_{m})}{P(H_{1})P(E|H_{1})+ P(H_{2})P(E|H_{2})+…}\] This is typically applied to choosing a hypothesis to explain a certain fact or a certain set of evidence. \(P(E|H_{m})\) is the (epistemic) probability that we would get evidence \(E\) supposing hypothesis \(H_{m}\) is true, and \(P(H_{m}|E)\) is the (epistemic) probability that hypothesis \(H_{m}\) is true, given evidence \(E\). Thus, hypothesis \(H_{m}\) becomes more likely on evidence \(E\) the more probable it is without the evidence, the more likely the evidence would be on that hypothesis, the less likely the evidence would be on alternate hypotheses, and the less likely the alternate hypotheses are without the evidence.
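As a numerical sketch (my own, with invented numbers), suppose three mutually exclusive, exhaustive hypotheses with given priors and likelihoods for some piece of evidence E:

```python
# Hypothetical priors P(H_m) and likelihoods P(E|H_m); the numbers are made up.
priors      = [0.70, 0.25, 0.05]
likelihoods = [0.10, 0.40, 0.90]

# Denominator of Bayes' theorem: the total probability of the evidence, P(E).
p_evidence = sum(p * l for p, l in zip(priors, likelihoods))

# Posteriors P(H_m|E).
posteriors = [p * l / p_evidence for p, l in zip(priors, likelihoods)]
print(posteriors)   # about [0.326, 0.465, 0.209]; note that they sum to one
```

Note how the second hypothesis, despite a smaller prior than the first, ends up the most probable, because the evidence is considerably more likely on it.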



Probability of a Union


We can here give a useful formula for determining the probability of the union of events, which we can deduce from De Morgan's laws: suppose we want to find the probability of the union of some events, \(Q=P(A_{1} \cup A_{2} \cup …)\).
We take the product \(Q'=1-\prod_{n}(1-A_{n})\)
We then replace every occurrence of \(A_{m_{1}}A_{m_{2}}…\) with \(P(A_{m_{1}} \cap A_{m_{2}}…) \). For instance, to find \(P(A \cup B \cup C) \), we take \(1-(1-A)(1-B)(1-C)=A+B+C-AB-AC-BC+ABC\)
We then make replacements as described to get \[P(A \cup B \cup C)=P(A)+P(B)+P(C)-P(A \cap B) -P(A \cap C) -P(B \cap C)+ P(A \cap B \cap C) \] If the events are all independent, we can simplify the formula to: \[Q=1-\prod \nolimits_{n}(1-P(A_{n}))\] If \(P(A_{m})=p\) for all m, we can further simplify: \[Q=1-(1-p)^{N}\] Where \(N\) is the number of events. For p fairly small, we can approximate this as \[Q \approx 1-e^{-pN}\] And from this we can rearrange to get \[N \approx \frac{-\ln(1-Q)}{p}\] This gives the number of independent trials necessary to get a probability Q of at least one success, if the probability of success in each trial is p.
As an application, we can ask: what is the probability that an event will happen on a given day if it has a 50% probability of happening at some point in a year (treating the days as independent trials)? In this case, we want to solve for \(p\) given \(Q=0.5\) and \(N=365\). We find that \(p \approx \frac{-\ln(1-Q)}{N}=0.19\%\).
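Here is a brief sketch (mine) of these formulas, reproducing the 0.19% figure; it treats the 365 days as independent trials.

```python
import math

def prob_at_least_one(p, n):
    """Chance of at least one success in n independent trials of probability p."""
    return 1 - (1 - p) ** n

def trials_needed(q, p):
    """Approximate number of trials giving probability q of at least one success."""
    return -math.log(1 - q) / p

# Worked example: a 50% chance over a 365-day year -> per-day probability.
p_daily = -math.log(1 - 0.5) / 365
print(p_daily)                           # about 0.0019, i.e. roughly 0.19%
print(prob_at_least_one(p_daily, 365))   # about 0.5, as a consistency check
print(trials_needed(0.5, p_daily))       # about 365, recovering N
```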

Using this, we can also show that improbable events are likely in a collection of many trials. Suppose we have \(N\) trials, in each of which X happens with probability \(p\). The probability that X never happens is then \((1-p)^{N} \approx e^{-pN}\). We thus see that, as \(N\) increases, the probability of no X occurring tends to zero; in fact, it tends to zero exponentially. Thus, given enough trials we would expect to see the individually improbable: long strings of all heads while flipping a coin, the same person winning the lottery multiple times, someone having two unrelated rare diseases, etc. Coincidences will always crop up given enough opportunities. These coincidences combined with confirmation bias--remembering the hits and forgetting the misses--result in muddled thinking. A coincidence happens and it is interpreted as a sign from on high, even though the hundreds of other times no coincidence happened are ignored. It is important to remember that coincidences are basically inevitable in large enough samples: if something has a one in a billion chance of happening to any given person on any given day, we can expect it will happen seven times per day worldwide.
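The arithmetic in the last sentence can be checked directly (a sketch of mine; the one-in-a-billion chance is from the text, and a world population of roughly seven billion is assumed):

```python
import math

p = 1e-9                      # one-in-a-billion chance per person per day
population = 7_000_000_000    # roughly the world population

# Expected number of occurrences per day worldwide.
print(p * population)         # 7.0

# Chance that it happens to nobody at all on a given day: (1-p)^N is about e^(-pN).
print((1 - p) ** population, math.exp(-p * population))   # both about 0.0009
```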



Implication and Conditional Probability


We can also prove the following interesting theorem. Note that “if \(A\), then \(B\)” or “\(A\rightarrow B\)” is logically equivalent to “\( \sim A \) or \(B\)” or “\( \sim A \cup B\)”. Thus \(P(A \rightarrow B)=P(\sim A \cup B)\). We then have
\(1.\;\; 1 \geq P(A)\)
\(2.\;\; P(\sim B|A) \geq P(\sim B|A)P(A)=P(\sim B \cap A)\)
\(3.\;\; 1-P(\sim B|A) \leq 1-P(A \cap \sim B)\)
\(4.\;\; P(B|A) \leq P(\sim A \cup B)=P(A \rightarrow B)\)
That is, the probability of “if \(A\) then \(B\)” is not less than that of “\(B\), given \(A\)”.
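A quick numerical check of this inequality (my own sketch), again over random events in a die's sample space:

```python
import random

omega = set(range(1, 7))
prob = lambda e: len(e & omega) / len(omega)

random.seed(1)
for _ in range(1000):
    A = {x for x in omega if random.random() < 0.5}
    B = {x for x in omega if random.random() < 0.5}
    if prob(A) == 0:
        continue                           # P(B|A) is undefined when P(A) = 0
    p_if_then = prob((omega - A) | B)      # P(A -> B) = P(~A union B)
    p_given   = prob(A & B) / prob(A)      # P(B|A)
    assert p_if_then >= p_given - 1e-12
```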

We can also note that, as \(P(A)=P(A|B)P(B)+P(A|\sim B)(1-P(B))\), then \[\min(P(A|B),P(A|\sim B)) \leq P(A) \leq \max(P(A|B),P(A|\sim B))\]


Conditional Changes in Probability and How They Relate to Evidence


We can demonstrate the following:
Suppose \(P(A|B)>P(A)\). Then \(P(A \cap B)>P(A)P(B)\) and \(P(B|A)>P(B)\). In fact, all three are equivalent. In that case:
\( 1.\;\; P(A \cap B)> P(A)P(B) \)
\( 2.\;\; P(A) - P(A \cap B)< P(A)- P(A)P(B)=P(A)(1-P(B)) \)
\( 3.\;\; P(A \cap \sim B) < P(A)P(\sim B) \)
\( 4.\;\; P(A | \sim B) < P(A) \)
The reverse case, in which \(P(A|B)<P(A)\) and hence \(P(A|\sim B)>P(A)\), can be proved in the same way.
In English: "if A is more probable on B, B is more probable on A" and "if A is more probable on B, A is less probable on not-B".
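A concrete check (with numbers of my own choosing): with a die, let A be "lands above three" and B be "lands on an even number".

```python
omega = set(range(1, 7))
prob = lambda e: len(e & omega) / len(omega)
cond = lambda a, b: prob(a & b) / prob(b)

A = {4, 5, 6}        # the die lands above three
B = {2, 4, 6}        # the die lands on an even number

print(cond(A, B) > prob(A))          # True: A is more probable on B (2/3 > 1/2)
print(cond(B, A) > prob(B))          # True: so B is more probable on A (2/3 > 1/2)
print(cond(A, omega - B) < prob(A))  # True: and A is less probable on ~B (1/3 < 1/2)
```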

An important consequence of this theorem concerns discerning what counts as evidence. In a loose sense, we can say that \(A\) provides evidence for \(B\) if \(P(B|A)>P(B)\). We thus see that a necessary and sufficient condition for \(A\) providing evidence for \(B\) is that \(\sim A\) provides evidence against \(B\). Thus, if we do some experiment to test a claim, we must be willing to accept failure as evidence against the claim if we would be willing to accept success as evidence for the claim, and vice versa. We must be willing to accept the possibility of weakening the claim if we are willing to accept the possibility of strengthening it by some test. It is often said that "absence of evidence is not evidence of absence", but this needs some qualification. Suppose we want to test the claim that there is life on Mars. We then do some test, like looking at a sample of Martian soil under a microscope, and it comes up negative: is that evidence against life on Mars? Certainly, albeit very weak evidence. If we had found microbes in that sample, we would certainly have said that was evidence for life on Mars; therefore we must necessarily admit that the lack of microbes is evidence against life on Mars. It may only reduce the (epistemic) probability that there is life on Mars by something like a millionth of a percent, but if we do a million tests, that amounts to about a whole percent. If we do a hundred million tests, that amounts to over 60%.
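The arithmetic of the Mars example can be sketched as follows (my own illustration; treating each negative test as independently shaving a millionth of a percent off the probability is a deliberate simplification):

```python
factor_per_test = 1 - 1e-8    # each negative test removes a millionth of a percent

for n_tests in (1_000_000, 100_000_000):
    remaining = factor_per_test ** n_tests
    print(n_tests, 1 - remaining)   # fraction of the original probability removed

# One million tests remove about 1%; one hundred million remove about 63%.
```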

In short, absence of evidence does count as evidence of absence in any and every instance where a presence of evidence would count as evidence of presence.



"Extraordinary Claims Require Extraordinary Evidence"


This phrase is declared nearly as often as it is denounced. However, it is clearly not specific enough to be definitively evaluated. One way of interpreting it is to say "Initially improbable hypotheses require improbable evidence to make them probable". This formulation is relatively easy to demonstrate as being true, since \(P(H \cap E) \leq P(H)\): \[ P(E)=\frac{P(H \cap E)}{P(H|E)} \leq \frac{P(H)}{P(H|E)} \] For example, if \(P(H)=1 \%\) and \(P(H|E)=75 \%\) then \(P(E) \leq 1.33 \%\).
If \(P(H|E) \geq 0.5\), then \(P(E) \leq 2 P(H)\). Thus, it is clear that the evidence required to make an initially improbable hypothesis probable must be comparably improbable.
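In code, the bound is a one-liner (a trivial sketch of mine):

```python
def max_evidence_prob(p_h, p_h_given_e):
    """Upper bound on P(E), from P(E) = P(H and E)/P(H|E) <= P(H)/P(H|E)."""
    return p_h / p_h_given_e

print(max_evidence_prob(0.01, 0.75))   # 0.0133..., the example above
print(max_evidence_prob(0.01, 0.50))   # 0.02: P(E) <= 2 P(H) when P(H|E) >= 0.5
```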




Inscrutable Probabilities, Meta-probabilities and Widening Epistemic Probability


Sometimes we cannot estimate a probability, either at all or to an adequate degree. We call such probabilities inscrutable. For all we know, these probabilities could have any value. We can use the concept of inscrutable probabilities to improve the descriptive accuracy of our epistemic probability judgements. For instance, suppose we have a die, and we are \(90\%\) sure that it is fair. We then want to find the probability that a six will be rolled. We make use of the formula: \[P(A)=P(A|B)P(B)+P(A|\sim B)P(\sim B)\] In this case, A is the event "a six is rolled" and B is the event "the die is fair". Here, \(P(A|\sim B)\) is inscrutable: given that the die is not fair, we cannot predict what the outcome will be. However, we do know that this probability, like all probabilities, is between zero and one. Thus: \[P(A|B)P(B) \le P(A) \le P(A|B)P(B)+P(\sim B)\] In this case, we find \[P(6|\text{fair})P(\text{fair}) \le P(6) \le P(6|\text{fair})P(\text{fair})+P(\sim \text{fair})\] \[\frac{1}{6} \cdot 0.9 \le P(6) \le \frac{1}{6} \cdot 0.9 + 0.1\] \[0.15 \le P(6) \le 0.25\] Here we may introduce the concept of meta-probabilities. These take the form of the probability that something is true about a probability; for instance, \(P(P(X) \ge \alpha)\) is the probability that the probability of X is not less than \(\alpha\). Returning to our example, suppose we are only \(80\%\) confident that the probability that the die is fair is \(90\%\). Applying the above formula: \[P(\text{fair}|P(\text{fair})=0.9)P(P(\text{fair})=0.9) \le P(\text{fair}) \le P(\text{fair}|P(\text{fair})=0.9)P(P(\text{fair})=0.9)+P(P(\text{fair}) \neq 0.9)\] \[0.9 \cdot 0.8 \le P(\text{fair}) \le 0.9 \cdot 0.8+0.2\] \[0.72 \le P(\text{fair}) \le 0.92\] This then implies \(0.08 \le P(\sim \text{fair}) \le 0.28\).
Returning to our former equation, we then have: \[P(6|\text{fair})P(\text{fair}) \le P(6) \le P(6|\text{fair})P(\text{fair})+P(\sim \text{fair})\] \[\frac{1}{6} \cdot 0.72 \le P(6) \le \frac{1}{6} \cdot 0.92 + 0.28\] \[0.12 \le P(6) \le 0.433...\] We thus see that adding in our meta-probabilistic uncertainty in our estimate for \(P(\text{fair})\) has further widened our uncertainty in the likelihood of rolling a six. This highlights the importance of both accounting for and minimizing any potential sources of uncertainty. We must factor in our confidence in a model in assessing the results it predicts to be likely or unlikely, if we are to use that model to form our epistemic probabilities of the predicted results.
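These bounds can be wrapped in a small helper (my own sketch; it reproduces the post's conservative calculation, which applies the lower and upper estimates of P(fair) separately rather than jointly):

```python
def bounds_with_inscrutable(p_a_given_b, p_b_lo, p_b_hi):
    """Bounds on P(A) when P(A|~B) is inscrutable (anywhere in [0, 1]):
    P(A|B) P(B) <= P(A) <= P(A|B) P(B) + P(~B)."""
    return p_a_given_b * p_b_lo, p_a_given_b * p_b_hi + (1 - p_b_lo)

# P(6|fair) = 1/6 with P(fair) taken to be exactly 0.9.
print(bounds_with_inscrutable(1/6, 0.9, 0.9))     # (0.15, 0.25)

# With the meta-probabilistic step, only 0.72 <= P(fair) <= 0.92 is known.
print(bounds_with_inscrutable(1/6, 0.72, 0.92))   # (0.12, 0.433...)
```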



*       \(P(X)=0\) does not mean that X cannot happen and never will. If you roll a marble, the chance of it landing on any given point is zero (ideally), and yet it will land on some point. What \(P(X)=0\) means is specific: it means that the measure of the space in which X holds is zero relative to the measure of \(\Omega\). There may still be a possibility that X happens; it is just that the region in which X happens is of zero "area" compared to \(\Omega\) (e.g. it is a point, and \(\Omega\) is a line segment). If \(P(X)=0\) we say that X will almost surely not happen, as opposed to \(\varnothing\), which will surely not happen.