10 Probability & Likelihood – Inverse Problems in Geophysics

10.1 Probability density function

\[\int p(d)\dd d=1\]

10.2 Definitions

Probability

describes how likely a certain event is to occur

Given a known model or system, what are the chances of a specific outcome?

Likelihood

describes the probability of the observed data under the assumption that a model (hypothesis) is true

Given an observed outcome, what is the chance that our model or system parameters are correct?

Conditional probability

\(P(A|B)\) Probability of A under the assumption that B occurs

10.3 Variance

\[ \sigma^2 = \int (d-\expval{d})^2 p(d) \dd{d} \]

\(\sigma\) is a measure of the width of the distribution

estimated from \(N\) samples using the sample mean; \(\sigma_{est}\) is the sample standard deviation

\[ \sigma^2_{est}=\frac{1}{N-1} \sum\limits_{i=1}^N (d_i - \expval{d})^2 \qq{with} \expval{d}=\frac{1}{N}\sum\limits_{i=1}^{N} d_i \]
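The estimator above can be checked numerically. The following is a minimal sketch in plain Python; the function names and the measurement values are illustrative assumptions, not part of the lecture material.

```python
import statistics


def sample_mean(d):
    """Arithmetic mean <d> = (1/N) * sum(d_i)."""
    return sum(d) / len(d)


def sample_variance(d):
    """Variance estimate with the 1/(N-1) normalization."""
    mean = sample_mean(d)
    return sum((di - mean) ** 2 for di in d) / (len(d) - 1)


# Hypothetical repeated measurements of a single quantity
data = [9.8, 10.1, 10.3, 9.9, 10.4]

mean_est = sample_mean(data)
var_est = sample_variance(data)
std_est = var_est ** 0.5

# The standard library uses the same 1/(N-1) convention
var_ref = statistics.variance(data)
```

Here `var_est` and `var_ref` agree up to rounding, because `statistics.variance` also divides by \(N-1\).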

10.4 Data correlation

independent: \(p(\vb d) = p(d_1) p(d_2)\ldots p(d_N)\)

10.5 Covariance

(measure of correlation between data)

\[ \mbox{cov}(d_1, d_2) = \int\int (d_1-\expval{d_1})(d_2-\expval{d_2}) p(d_1, d_2) \dd{d_1}\dd{d_2} \]

\[ \expval{d_i} = \int\ldots\int d_i p(\vb d) \dd{d_1}\ldots \dd{d_N} \]

10.6 Covariance propagation

Linear problem \(\vb m = \vb M \vb d + \vb n\) with a constant vector \(\vb n\), e.g., \(\vb m = \vb G^\dagger\vb d\) (\(\vb n = \vb 0\))

Mean value \(\expval{\vb m}=\vb M \expval{\vb d} + \vb n\) and covariance

\[\mbox{cov}(\vb m) = \vb M \mbox{cov}(\vb d) \vb M^T\]

Least-squares: \(\vb M=(\vb G^T \vb G)^{-1} \vb G^T\), uncorrelated data: \(\mbox{cov}(\vb d)=\sigma_d^2 \vb I\)

\[ \Rightarrow \mbox{cov}(\vb m) = (\vb G^T \vb G)^{-1} \vb G^T \sigma_d^2 \vb I ((\vb G^T \vb G)^{-1} \vb G^T)^T = \sigma_d^2 (\vb G^T \vb G)^{-1} \]
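As an illustration of the result \(\mbox{cov}(\vb m) = \sigma_d^2 (\vb G^T \vb G)^{-1}\), consider a straight-line fit \(d_i = m_1 + m_2 x_i\). The sketch below is an assumed example (the positions `x`, the noise level and the function name are hypothetical). It uses the analytic inverse of the \(2\times 2\) matrix \(\vb G^T \vb G\), so no linear-algebra library is needed.

```python
def line_fit_covariance(x, sigma_d):
    """Return cov(m) = sigma_d^2 (G^T G)^{-1} for the model d_i = m1 + m2 * x_i.

    The rows of G are [1, x_i], so G^T G = [[N, sum(x)], [sum(x), sum(x^2)]].
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    det = n * sxx - sx * sx  # determinant of G^T G
    # Analytic inverse of the symmetric 2x2 matrix
    inv = [[sxx / det, -sx / det],
           [-sx / det, n / det]]
    return [[sigma_d ** 2 * value for value in row] for row in inv]


# Hypothetical measurement positions and data standard deviation
x = [0.0, 1.0, 2.0, 3.0]
cov_m = line_fit_covariance(x, sigma_d=0.1)

std_intercept = cov_m[0][0] ** 0.5  # uncertainty of m1
std_slope = cov_m[1][1] ** 0.5      # uncertainty of m2
```

The negative off-diagonal element shows that intercept and slope errors are correlated even though the data errors are not.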

10.7 A priori knowledge

10.8 Bayes’ theorem

Conditional probability

\(p(a|b) = p(a, b) / p(b)\)

\[p(\vb m|\vb d)p(\vb d) = p(\vb d|\vb m)p(\vb m)\]

\[p(\vb m|\vb d) = \frac{p(\vb d|\vb m)p(\vb m)}{p(\vb d)}\]

posterior distribution \(\propto\) likelihood \(\times\) prior distribution
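For a single model parameter, Bayes' theorem can be evaluated directly on a grid. The sketch below assumes a Gaussian likelihood for one datum \(d = m + \text{noise}\) and a Gaussian prior; all numbers and function names are illustrative assumptions. The evidence \(p(d)\) is approximated by a Riemann sum over the grid.

```python
import math


def gaussian(x, mean, std):
    """Normal probability density evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def grid_posterior(d_obs, sigma_d, prior_mean, prior_std, m_grid):
    """Posterior p(m|d) = p(d|m) p(m) / p(d) on an equidistant grid."""
    dm = m_grid[1] - m_grid[0]
    unnormalized = [gaussian(d_obs, m, sigma_d) * gaussian(m, prior_mean, prior_std)
                    for m in m_grid]
    evidence = sum(unnormalized) * dm  # p(d), the normalizing constant
    return [u / evidence for u in unnormalized]


# Hypothetical setup: prior N(0, 2^2), one observation d = 1.5 with sigma_d = 1
m_grid = [-10 + 0.01 * i for i in range(2001)]
posterior = grid_posterior(1.5, 1.0, 0.0, 2.0, m_grid)

dm = m_grid[1] - m_grid[0]
posterior_mean = sum(m * p for m, p in zip(m_grid, posterior)) * dm
```

In this Gaussian case the posterior mean is known analytically, \((d/\sigma_d^2 + m_0/\sigma_0^2)/(1/\sigma_d^2 + 1/\sigma_0^2) = 1.2\), which the grid result reproduces.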

10.9 Example: COVID test

\(P(T)\): probability of positive test (\(T\))
\(P(I)\): probability that a person is ill (1%) (\(I\))
\(P(T|I)\): (conditional) probability that the test detects the illness (90%)

assume 1000 patients (1%=10 ill, 99%=990 healthy)

false tests (10%) \(\Rightarrow\) 1 false negative (9 correct positives), 99 false positives, i.e., \(9 + 99 = 108\) positive tests

\[ P(I|T) = \frac{P(T|I)P(I)}{P(T)} = \frac{0.9 \cdot 0.01}{108/1000}=8.3\% \]

10.10 Example: COVID test (2)

\(P(T)\): probability of positive test (\(T\))
\(P(I)\): probability that a person is ill (1%) (\(I\))
\(P(T|I)\): (conditional) probability that the test detects the illness (99%)

assume 10000 patients (1%=100 ill, 99%=9900 healthy)

false tests (1%) \(\Rightarrow\) 1 false negative (99 correct positives), 99 false positives, i.e., \(99 + 99 = 198\) positive tests

\[ P(I|T) = \frac{P(T|I)P(I)}{P(T)} = \frac{0.99 \cdot 0.01}{198/10000}=50\% \]
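Both test examples follow the same pattern and can be reproduced with a few lines of Python. This is a minimal sketch; the function and parameter names are illustrative. As in the examples, it assumes the false-positive rate equals the stated fraction of false tests.

```python
def posterior_ill_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(I|T) from Bayes' theorem.

    P(T) is expanded as P(T|I) P(I) + P(T|not I) P(not I).
    """
    p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
    return sensitivity * prevalence / p_positive


# First example: 90% detection rate, 10% false positives, 1% ill
p_first = posterior_ill_given_positive(0.01, 0.90, 0.10)

# Second example: 99% detection rate, 1% false positives, 1% ill
p_second = posterior_ill_given_positive(0.01, 0.99, 0.01)
```

`p_first` is about 0.083 and `p_second` is 0.5, matching the 8.3% and 50% obtained from counting patients.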

10.11 FIFA example

Tony hears Arthur, who is watching football, cheer. How probable is it that a goal was scored?

Assumption: In 2% of the time (segments covering a cheer) there is a goal.
Assumption: If Arthur's team scores a goal, he cheers with a probability of 90%.
Assumption: Without a goal (98% of the time), he cheers for other reasons with a probability of 1%.

\[ P(G|C)=\frac{P(C|G) P(G)}{P(C|G)P(G)+P(C|\neg G)P(\neg G)} = \frac{0.9 \cdot 0.02}{0.9 \cdot 0.02+0.01\cdot 0.98}=64.7\% \]
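The same number can be approximated by simulating many time segments and counting how often a cheer coincides with a goal. The sketch below is an illustrative assumption (function name, number of segments and seed are arbitrary) and compares the simulated frequency with the exact value from Bayes' theorem.

```python
import random


def simulate_goal_given_cheer(n_segments, p_goal=0.02, p_cheer_goal=0.9,
                              p_cheer_no_goal=0.01, seed=0):
    """Estimate P(G|C) as (cheers with goal) / (all cheers)."""
    rng = random.Random(seed)
    cheers = 0
    cheers_with_goal = 0
    for _ in range(n_segments):
        goal = rng.random() < p_goal
        p_cheer = p_cheer_goal if goal else p_cheer_no_goal
        if rng.random() < p_cheer:
            cheers += 1
            if goal:
                cheers_with_goal += 1
    return cheers_with_goal / cheers


estimate = simulate_goal_given_cheer(200_000)

# Exact value with P(C) = P(C|G) P(G) + P(C|not G) P(not G)
exact = 0.9 * 0.02 / (0.9 * 0.02 + 0.01 * 0.98)
```

The estimate fluctuates around `exact` and approaches it as the number of simulated segments grows.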

10.12 Bayes' theorem: simple example

joint probability & conditional probabilities (Menke, 2015)

10.13 A priori (Menke, 2012)

A: a priori pdf \(p_a(\vb m, \vb d)\), B: conditional pdf \(p_g(\vb m, \vb d)\), C: product \(p_t(\vb m, \vb d)=p_a(\vb m, \vb d) p_g(\vb m, \vb d)\), white: theory

10.14 A priori and likelihood

10.15 Bayes view in nonlinear problems

10.16 Highly nonlinear problems

10.17 Monte Carlo methods

Monte Carlo search: randomly draw solutions from the grid
accept a solution only if it is better than the old one
Markov chain Monte Carlo (MCMC)
Metropolis-Hastings (Metropolis et al., 1953; Hastings, 1970)
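To make the Markov chain idea concrete, here is a minimal random-walk Metropolis sketch for a single parameter. It is an assumed illustration, not the lecture's implementation: the proposal is a symmetric Gaussian step, so the Hastings correction factor equals one, and the target is a hypothetical Gaussian posterior given through its logarithm.

```python
import math
import random


def metropolis(log_post, m0, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional log-posterior."""
    rng = random.Random(seed)
    m = m0
    lp = log_post(m)
    chain = []
    for _ in range(n_samples):
        m_new = m + rng.gauss(0.0, step)  # symmetric proposal
        lp_new = log_post(m_new)
        # Accept with probability min(1, p(m_new) / p(m))
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            m, lp = m_new, lp_new
        chain.append(m)  # a rejected proposal repeats the current model
    return chain


# Hypothetical target: Gaussian posterior with mean 1.2 and standard deviation 0.9
def log_post(m):
    return -0.5 * ((m - 1.2) / 0.9) ** 2


chain = metropolis(log_post, m0=0.0, step=1.0, n_samples=20000)
samples = chain[1000:]  # discard burn-in
mean_est = sum(samples) / len(samples)
std_est = (sum((s - mean_est) ** 2 for s in samples) / (len(samples) - 1)) ** 0.5
```

Unlike the pure "accept only if better" rule, worse models are sometimes accepted, so the chain samples the whole distribution instead of converging to a single best model.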

10.18 Simulated Annealing

Test parameter

\[ t = e^{-(\Phi(\vb m)-\Phi(\vb m^p))/T} \]
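A minimal sketch of how such a test parameter is commonly used, assuming \(\vb m^p\) denotes the previous model, \(\vb m\) the proposed one, and the usual Metropolis rule: an improvement is always accepted, a deterioration is accepted with probability \(t\). The objective function, the geometric cooling schedule and all parameter values are hypothetical choices for illustration.

```python
import math
import random


def accept(phi_new, phi_old, temperature, rng):
    """Metropolis rule based on t = exp(-(phi_new - phi_old) / T)."""
    if phi_new <= phi_old:
        return True
    t = math.exp(-(phi_new - phi_old) / temperature)
    return rng.random() < t


def simulated_annealing(phi, m0, step, t_start, cooling, n_iter, seed=0):
    """Minimize phi(m) for a scalar m with geometric cooling T <- cooling * T."""
    rng = random.Random(seed)
    m, phi_m = m0, phi(m0)
    best_m, best_phi = m, phi_m
    temperature = t_start
    for _ in range(n_iter):
        m_new = m + rng.gauss(0.0, step)  # random perturbation of the model
        phi_new = phi(m_new)
        if accept(phi_new, phi_m, temperature, rng):
            m, phi_m = m_new, phi_new
            if phi_m < best_phi:
                best_m, best_phi = m, phi_m
        temperature *= cooling  # slowly lower the temperature
    return best_m, best_phi


# Hypothetical objective with several local minima (global minimum at m = 2)
def phi(m):
    return 0.1 * (m - 2.0) ** 2 + 1.0 - math.cos(3.0 * (m - 2.0))


best_m, best_phi = simulated_annealing(phi, m0=-3.0, step=0.5,
                                       t_start=1.0, cooling=0.995, n_iter=2000)
```

At high \(T\), \(t\) is close to one and almost every step is accepted; as \(T\) decreases, uphill steps become rare. There is no guarantee of reaching the global minimum, and the result depends on the cooling schedule.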

10.19 Alternatives to grid search

Monte Carlo search

draw random samples and accept them if the error is improved

undirected search (Newton's method is directed)
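The Monte Carlo search described here can be sketched in a few lines: models are drawn uniformly within bounds and kept only if they reduce the misfit. The misfit function, the bounds and the number of draws below are hypothetical and serve only as an illustration.

```python
import random


def monte_carlo_search(phi, lower, upper, n_draws, seed=0):
    """Undirected search: draw random models, keep one only if phi improves."""
    rng = random.Random(seed)
    best_m = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    best_phi = phi(best_m)
    for _ in range(n_draws - 1):
        m = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        phi_m = phi(m)
        if phi_m < best_phi:  # accept only improvements
            best_m, best_phi = m, phi_m
    return best_m, best_phi


# Hypothetical two-parameter misfit with its minimum at (1, -2)
def misfit(m):
    return (m[0] - 1.0) ** 2 + (m[1] + 2.0) ** 2


best_m, best_phi = monte_carlo_search(misfit, lower=[-5.0, -5.0],
                                      upper=[5.0, 5.0], n_draws=5000)
```

Each draw is independent of the current best model, which is what makes the search undirected; the number of draws needed grows quickly with the number of parameters.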

Simulated annealing

decrease temperature controlling particle movements:

high \(T\): undirected, low \(T\): search in vicinity of current model


10.20 Monte Carlo vs. Simulated Annealing