\section{Finding $\sigma_i$}

Throughout the preceding sections, the uncertainties in the supplied target values $f_i$ have been denoted $\sigma_i$ (see Section~\ref{sec:bayes_notation}). The user has the option of supplying these in the source datafile, in which case the analysis of the previous sections is complete: both best-estimate parameter values and their uncertainties can be calculated. The user may also, however, leave the uncertainties in $f_i$ unstated, in which case, as described in Section~\ref{sec:bayes_notation}, we assume all of the data values to have a common uncertainty $\sigma_\mathrm{data}$, which is an unknown.

In this case, where $\sigma_i = \sigma_\mathrm{data}\;\forall\; i$, the best-fitting parameter values are independent of $\sigma_\mathrm{data}$, but the same is not true of the uncertainties in these values, since the terms of the Hessian matrix do depend upon $\sigma_\mathrm{data}$. We must therefore undertake a further calculation to find the most probable value of $\sigma_\mathrm{data}$, given the data. This is achieved by maximising $\mathrm{P}\left( \sigma_\mathrm{data} | \left\{ \mathbf{x}_i, f_i \right\} \right)$. Returning once again to Bayes' Theorem, we can write:

\begin{equation}
\mathrm{P}\left( \sigma_\mathrm{data} | \left\{ \mathbf{x}_i, f_i \right\} \right) = \frac{ \mathrm{P}\left( \left\{ f_i \right\} | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\} \right) \, \mathrm{P}\left( \sigma_\mathrm{data} | \left\{ \mathbf{x}_i \right\} \right) }{ \mathrm{P}\left( \left\{ f_i \right\} | \left\{ \mathbf{x}_i \right\} \right) }
\end{equation}

As before, we neglect the denominator, which has no effect upon the maximisation problem, and assume a uniform prior $\mathrm{P}\left( \sigma_\mathrm{data} | \left\{ \mathbf{x}_i \right\} \right)$. This reduces the problem to the maximisation of $\mathrm{P}\left( \left\{ f_i \right\} | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\} \right)$, which we may write as a probability distribution marginalised over $\mathbf{u}$:

\begin{eqnarray}
\label{eqa:p_f_given_sigma}
\mathrm{P}\left( \left\{ f_i \right\} | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\} \right) = \idotsint_{-\infty}^{\infty} & \mathrm{P}\left( \left\{ f_i \right\} | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\}, \mathbf{u} \right) \times & \\
& \mathrm{P}\left( \mathbf{u} | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\} \right) \, \mathrm{d}^{n_\mathrm{u}}\mathbf{u} & \nonumber
\end{eqnarray}

Assuming a uniform prior for $\mathbf{u}$, we may neglect the latter term in the integrand, but even with this assumption, the integral is not generally tractable, as $\mathrm{P}\left( \left\{ f_i \right\} | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\}, \mathbf{u} \right)$ may well be multimodal in form. However, if we neglect such possibilities, and assume this probability distribution to be approximately Gaussian \textit{globally}, we can make use of the standard result for an $n_\mathrm{u}$-dimensional Gaussian integral, valid for any negative-definite matrix $\mathbf{A}$:

\begin{equation}
\idotsint_{-\infty}^{\infty} \exp\left( \frac{1}{2}\mathbf{u}^\mathrm{T} \mathbf{A} \mathbf{u} \right) \, \mathrm{d}^{n_\mathrm{u}}\mathbf{u} = \frac{ (2\pi)^{n_\mathrm{u}/2} }{ \sqrt{\mathrm{Det}\left(-\mathbf{A}\right)} }
\end{equation}
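This identity is easily checked numerically. The sketch below (using an arbitrarily chosen negative-definite $2\times2$ matrix $\mathbf{A}$; all values are purely illustrative) compares a brute-force quadrature of the left-hand side against the closed-form right-hand side:

```python
import numpy as np

# Arbitrary illustrative negative-definite 2x2 matrix A.
A = np.array([[-2.0, 0.5],
              [0.5, -1.0]])
n_u = A.shape[0]

# Right-hand side of the identity: (2*pi)^(n_u/2) / sqrt(Det(-A)).
analytic = (2 * np.pi) ** (n_u / 2) / np.sqrt(np.linalg.det(-A))

# Left-hand side, by brute-force quadrature on a grid wide enough
# that the integrand is negligible at the edges.
u = np.linspace(-10.0, 10.0, 801)
du = u[1] - u[0]
U1, U2 = np.meshgrid(u, u, indexing="ij")
integrand = np.exp(0.5 * (A[0, 0] * U1**2
                          + 2.0 * A[0, 1] * U1 * U2
                          + A[1, 1] * U2**2))
numeric = integrand.sum() * du * du

print(analytic, numeric)  # the two values agree closely
```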

\noindent We may thus approximate Equation~(\ref{eqa:p_f_given_sigma}) as:

\begin{eqnarray}
\mathrm{P}\left( \left\{ f_i \right\} | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\} \right) & \approx & \mathrm{P}\left( \left\{ f_i \right\} | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\}, \mathbf{u}^0 \right) \times \\
& & \mathrm{P}\left( \mathbf{u}^0 | \sigma_\mathrm{data}, \left\{ \mathbf{x}_i \right\} \right) \frac{ (2\pi)^{n_\mathrm{u}/2} }{ \sqrt{\mathrm{Det}\left(-\mathbf{A}\right)} } \nonumber
\end{eqnarray}
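In the special case of a model linear in $\mathbf{u}$, the log-probability is exactly quadratic and the approximation above becomes exact. A minimal sketch (assuming the simplest such model, $f(\mathbf{x}; u) = u$ with a single parameter, and wholly invented data values) compares the approximation against direct numerical marginalisation:

```python
import numpy as np

# Invented data with an assumed common uncertainty sigma_data.
f = np.array([1.2, 0.8, 1.1, 0.9, 1.3])
sigma = 0.4
n_d = len(f)

def likelihood(u):
    """P({f_i} | sigma_data, u) for the one-parameter model f(x; u) = u."""
    return np.prod(np.exp(-(f - u) ** 2 / (2 * sigma**2))
                   / (sigma * np.sqrt(2 * np.pi)), axis=-1)

# Direct marginalisation over u by 1-D quadrature (uniform prior).
u_grid = np.linspace(-4.0, 6.0, 10001)
du = u_grid[1] - u_grid[0]
exact = likelihood(u_grid[:, None]).sum() * du

# Gaussian approximation: evaluate at the mode u0 and multiply by
# (2*pi)^(1/2) / sqrt(Det(-A)), where A = -n_d / sigma^2 is the
# second derivative of the log-probability with respect to u.
u0 = f.mean()
approx = likelihood(u0) * np.sqrt(2 * np.pi) / np.sqrt(n_d / sigma**2)

print(exact, approx)  # identical up to quadrature error
```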

As in Section~\ref{sec:bayes_pdf}, it is numerically easier to maximise this quantity via its logarithm, which we denote $L_2$, and can write as:

\begin{eqnarray}
L_2 & = & \sum_{i=0}^{n_\mathrm{d}-1} \left( \frac{ -\left[ f_i - f_{\mathbf{u}^0}(\mathbf{x}_i) \right]^2 }{ 2\sigma_\mathrm{data}^2 } - \log_e \left( \sqrt{2\pi}\,\sigma_\mathrm{data} \right) \right) + \\
& & \nonumber \log_e \left( \frac{ (2\pi)^{n_\mathrm{u}/2} }{ \sqrt{\mathrm{Det}\left(-\mathbf{A}\right)} } \right)
\end{eqnarray}
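As a concrete, hypothetical illustration (a straight-line model with invented data; none of these values come from the text), the sketch below maximises $L_2$ over trial values of $\sigma_\mathrm{data}$. For a model linear in $\mathbf{u}$ the Hessian term takes the closed form noted in the comments, and the position of the maximum can be cross-checked against $\sqrt{S/(n_\mathrm{d}-n_\mathrm{u})}$, where $S$ is the sum of squared residuals:

```python
import numpy as np

# Invented straight-line data supplied without uncertainties.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
f = np.array([0.1, 1.3, 1.9, 3.2, 3.8, 5.1])
n_d, n_u = len(f), 2

# Least-squares fit of f = u0 + u1*x; the best-fit parameter vector
# does not depend on sigma_data, so it is computed once, up front.
X = np.column_stack([np.ones_like(x), x])
u_best, *_ = np.linalg.lstsq(X, f, rcond=None)
S = np.sum((f - X @ u_best) ** 2)  # sum of squared residuals

def L2(sigma):
    """For a linear model, A = -X^T X / sigma^2, and therefore
    Det(-A) = Det(X^T X) / sigma^(2 * n_u)."""
    det_minus_A = np.linalg.det(X.T @ X) / sigma ** (2 * n_u)
    return (-S / (2 * sigma**2)
            - n_d * np.log(np.sqrt(2 * np.pi) * sigma)
            + np.log((2 * np.pi) ** (n_u / 2) / np.sqrt(det_minus_A)))

# Maximise numerically over a grid of trial values of sigma_data.
sigmas = np.linspace(0.01, 2.0, 20000)
sigma_best = sigmas[np.argmax([L2(s) for s in sigmas])]

# Cross-check against the closed form for the linear case.
print(sigma_best, np.sqrt(S / (n_d - n_u)))
```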

This quantity is maximised numerically, a process simplified by the fact that $\mathbf{u}^0$ is independent of $\sigma_\mathrm{data}$.

\include{changelog}
