# Monte Carlo Integration

Let \(f\) be a function of \(\mathbb{R}^n\) to \(\mathbb{R}\) and \(p\) a
probability density on \(\mathbb{R}^n\). We consider the problem of
numerically evaluating

\[
\begin{equation}
\mathbb{E}[f] = \int_{\mathbb{R}^n} f(x) p(x) dx.
\end{equation}\]

Using i.i.d. \(X_i \sim p\), Monte-Carlo integration gives the following estimation of \(\mathbb{E}_p[f]\):

\[
\begin{equation}
Z_N = \frac{1}{N} \sum_{i=1}^N f(X_i).
\end{equation}\]

Let \(\sigma^2 = \mathbb{E}_p[f^2] - \mathbb{E}_p[f]^2\). \(Z_N\) is a random variable and one has:

\[
\begin{equation}
\left\{
\begin{aligned}
&\mathbb{E}[Z_N] = \mathbb{E}_p[f] \\
&Var(Z_N) = \sigma^2 / N
\end{aligned}
\right.
\end{equation}\]

The quality of a Monte Carlo estimation is determined by its variance. Therefore, the number of required samples \(N\) for a good estimation increases with \(\sigma^2\).

\medskip
Deterministic integral evaluation methods are based on the properties of \(f\) and their costs are strongly dependent on the dimension. However the Monte-Carlo integration can be reformulated in a dimension independent way. Let \(\pi\) the pushforward measure on \(\mathbb{R}\) of \(p\) by \(f\). We recall the definition of the pushforward measure, for any measurable subset \(A\):

\[
\begin{equation}
\pi(A) = \int_{\mathbb{R}^n} 1_{A}(f(x)) p(x) dx
\end{equation}\]

Let \(Y_i = f(X_i)\). Then \(Y_i \sim \pi\) and are i.i.d. . Therefore we can rewrite the approximation as

\[
\begin{equation*}
Z_N = \frac{1}{N} \sum_{i=1}^N Y_i
\end{equation*}\]

which is the Monte-Carlo methods applied to \(\int_{\mathbb{R}} \pi(x)dx\) with \(p = \pi\), \(n = 1\) and \(f : x \rightarrow x\).

Hence Monte-Carlo integration convergence speed depends only on the pushfowrad measure \(\pi\), making it suited for high dimensional problems. This is well illustrated by the following example: let \(B\) be a measurable subset of \(A=[0,1]^n\) with volume \(V\), let \(f = 1_{B}\) and let

\[
I = \int_{[0,1]^n} f(x)dx\]

Here, \(p = \lambda\) the Lebesgue measure on \([0,1]^n\), therefore the variables \(X_i\) are independent and uniformly distributed in \([0,1]^n\). Therefore the variables \(Y_i=f(X_i)\) are independent Bernoulli random variables of parameter \(V\). Thus the Monte Carlo method is exactly the same *for any* \(n\) and depends only on \(V\).

### Python example