1. Integral of Measurable Functions

In what follows, we suppose that all functions are measurable and defined on the measure space $(X, A, μ)$.

First we define the integral with respect to some measure $μ$.

Definition Let $ϕ = \sum_{i = 1}^{n} α_{i} 1_{A_{i}}$ be a simple function; then

\begin{matrix} (1) & \int ϕ d μ = \sum_{i = 1}^{n} α_{i} μ (A_{i}) \end{matrix}

is called the integral of $ϕ$ with respect to the measure $μ$ .

Recall that a function is simple if it is measurable and takes only a finite number of values, such as a step function defined on $[- 1, 1]$.

The integral of $ϕ$ with respect to $μ$ is finite if and only if the measure of the set where $ϕ (x) \neq 0$ is finite. In other words, the measure of the support of $ϕ$ must be finite.

This definition is quite intuitive. For example, if we want to know how expensive a bag of fruit is, we count the number of pieces of each kind of fruit (the measure of the set), multiply it by the price of that kind (the value of the function on this set), and sum over all kinds of fruit.
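The bag-of-fruit picture can be written out as a minimal sketch. The prices and counts below are made-up numbers, and `integrate_simple` is a hypothetical helper name; the only thing taken from the text is the formula "sum of value times measure".

```python
from fractions import Fraction

def integrate_simple(terms):
    """Integral of a simple function given as (value, measure of set) pairs."""
    return sum(value * measure for value, measure in terms)

# Hypothetical bag of fruit: (price per piece, number of pieces).
bag = [
    (Fraction(1, 2), 4),   # 4 apples at 0.50 each
    (Fraction(3, 10), 6),  # 6 bananas at 0.30 each
    (Fraction(2), 1),      # 1 mango at 2.00
]
total = integrate_simple(bag)
print(total)
```

Here the "measure" is simply the counting measure on the pieces of fruit, and the price list plays the role of the simple function $ϕ$.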

The usual properties of integrals hold for the integral with respect to the measure $μ$ as well, for example linearity,

\begin{matrix} (2) & \int (ϕ_{1} + ϕ_{2}) d μ = \int (ϕ_{1}) d μ + \int (ϕ_{2}) d μ, \end{matrix}

I am too lazy to list all of them here.

Now we need to generalize this concept to measurable functions that are not simple, say $f (x)$. Starting from the simplest case, for now we will assume $f (x) \geq 0$.

Definition Let $f$ be a positive measurable function; then the integral of $f$ with respect to the measure $μ$ is

\begin{matrix} (3) & \int f d μ = \sup \int ϕ d μ, \end{matrix}

where the supremum is taken over all positive simple functions $ϕ$ such that $ϕ \leq f$ everywhere. This means we are approximating the function $f$ from below. $f$ is said to be $μ$-integrable if its integral is finite.

$f$ is integrable only if the set

\begin{matrix} (4) & \{x ∣ f (x) = \infty\} \end{matrix}

is $μ$ -negligible. Note that this condition is necessary but not sufficient.

Next we give the monotone convergence theorem without proof.

Theorem Monotone convergence theorem. Let $(f_{n})$ be a sequence of positive measurable functions such that

\begin{matrix} (5) & f_{n} (x) \leq f_{n + 1} (x) \quad \forall n \in N, \ \forall x \in X \end{matrix}

and

\begin{matrix} (6) & \forall x \in X, \quad \lim_{n \to \infty} f_{n} (x) = f (x) . \end{matrix}

Then

\begin{matrix} (7) & \lim_{n \to \infty} \int f_{n} d μ = \int f d μ . \end{matrix}

This theorem is also called the Beppo Levi theorem.

The integral of a positive function is zero if and only if the function is zero almost everywhere.

What about functions that are not positive? The solution is rather intuitive: We separate positive and negative parts and treat them both as positive functions. Given a real-valued function $f$ , define

\begin{matrix} (8) & f^{+} (x) = \begin{cases} f (x) & f (x) \geq 0 \\ 0 & f (x) < 0 \end{cases} \end{matrix}

and

\begin{matrix} (9) & f^{-} (x) = \begin{cases} - f (x) & f (x) < 0 \\ 0 & f (x) \geq 0 \end{cases} \end{matrix}

where both $f^{\pm}$ are positive functions. $f$ is said to be $μ$ -integrable if both $f^{+}$ and $f^{-}$ are integrable, and the integral of $f$ is

\begin{matrix} (10) & \int f d μ = \int f^{+} d μ - \int f^{-} d μ . \end{matrix}

Quite intuitive, isn’t it? For complex-valued functions, we just need to treat the real and imaginary parts as real-valued functions; everything then follows from the real-valued case.

The interesting thing is that, as far as the integral is concerned, we can neglect sets of measure zero. Traditionally, any function has an operation called evaluation which is defined pointwise, e.g. given a real-valued function $f (x)$, we can evaluate it at the point $x = x_{0}$, which gives us a real number. However, a point has measure zero, so, again, as far as the integral is concerned, evaluation is not needed. You could perfectly well come up with a function which can’t be evaluated at some point, for instance a function that is not defined at $x = x_{0}$, and the function could still be integrable. To say a function is $μ$-integrable just means that $f^{+}$ and $f^{-}$ are both $μ$-integrable.

Two functions $f, g$ have the same integral if they are equal almost everywhere, and equality almost everywhere is an equivalence relation.

The sum of an absolutely convergent series is nothing but an integral with respect to the counting measure. Recall that given a set $S$ and $A \subseteq S$, the cardinality of $A$ is the number of elements in $A$, sometimes denoted by $Num (A)$. Then $Num$, regarded as a measure, is called the counting measure.

Theorem The set of all $μ$-integrable functions forms a vector space, denoted by $L_{μ}^{1}$.

The superscript $1$ denotes the power of the function: $L_{μ}^{2}$ would mean that the squares of the functions are integrable. This theorem is easy to verify, since sums, differences and scalar multiples of elements of $L_{μ}^{1}$ are still in $L_{μ}^{1}$.

Theorem If $f$ is $μ$ -integrable, so is $| f |$ , and we have

\begin{matrix} (11) & | \int f d μ | \leq \int | f | d μ . \end{matrix}

Recall that for $f$ to be $μ$-integrable, by definition the integrals of both $f^{+}$ and $f^{-}$ are required to be finite.

A measurable function is $μ$ -integrable if and only if $| f |$ is $μ$ -integrable.

If a function is Riemann integrable on $[a, b]$ then it is also Lebesgue integrable on $[a, b]$ , and the two results agree. Indeed, if $f$ is Riemann integrable then we can always find two limiting step functions sandwiching $f$ from above and below,

\begin{matrix} (12) & g_{ϵ} \leq f \leq h_{ϵ} \end{matrix}

where $g_{ϵ}, h_{ϵ}$ are step functions and

\begin{matrix} (13) & \int (h_{ϵ} - g_{ϵ}) d μ < ϵ . \end{matrix}

Step functions are always Lebesgue integrable, plus the function is bounded, thus $f$ is also Lebesgue integrable. To summarize we have

\begin{matrix} (14) & \text{Riemann integrable} ⟹ \text{Lebesgue integrable} . \end{matrix}

The converse, however, is not necessarily true. A famous example is the Dirichlet function, defined on $[0, 1]$ by

\begin{matrix} (15) & f (x) = \begin{cases} 0 & \text{if } x \text{ is rational}, \\ 1 & \text{if } x \text{ is irrational} . \end{cases} \end{matrix}

$f = 1$ almost everywhere, since the Lebesgue measure of the rational numbers in $[0, 1]$ is $0$. Recall that $f = 1$ almost everywhere means that the set on which $f \neq 1$ has measure zero. This is true here since $f = 0$ only on the rationals, the rational numbers are countable, each point has measure zero, and a countable sum of zeros is still zero. Hence $f$ is Lebesgue integrable, with integral $1$. However, it is impossible to find two step functions that sandwich the Dirichlet function arbitrarily closely, thus it is not Riemann integrable.

The class of Riemann integrable functions is quite restricted. Of course a continuous, bounded function on a finite interval $[a, b]$ is Riemann integrable, but being Riemann integrable actually requires less: any bounded function on $[a, b]$ which is continuous almost everywhere is Riemann integrable. For example, the Heaviside step function $θ (x)$ defined on $[- 2, 2]$ is continuous except at $x = 0$, so it is continuous almost everywhere, and thus Riemann integrable.

E.g. The function

\begin{matrix} (16) & f : x \mapsto \frac{\sin x}{x} \end{matrix}

is not integrable with respect to the Lebesgue measure on $R^{+}$ since $| f |$ is not integrable on $R^{+}$ .

We know that the improper Riemann integral of $\frac{\sin x}{x}$ on $R^{+}$ is conditionally convergent; no such notion exists in the Lebesgue theory of integration.

2. Lebesgue’s Dominated Convergence Theorem

According to my mathematical friends, the dominated convergence theorem is one of the most important results of Lebesgue’s integration theory. It tells us how to deal with the limit of a sequence of functions under the integral sign, and it might be trickier than some physicists might have thought.

To understand Lebesgue’s theorem we need the following lemma.

Fatou’s lemma. Let $(f_{n})$ be a sequence of functions which are nonnegative and measurable on $(X, A, μ)$ . Then

\begin{matrix} (17) & \int \liminf_{n \to \infty} f_{n} d μ \leq \liminf_{n \to \infty} \int f_{n} d μ . \end{matrix}

The equal sign is not surprising at all; what usually surprises people is the less-than sign. Fatou’s lemma tells us that if you first take the pointwise limit of a sequence of functions and then integrate the limit, what you get might be less than what you get by integrating each function in the sequence and taking the limit afterwards.

To get an intuitive feeling for the lemma, consider an example where the strict inequality holds. Consider a sequence of functions $(f_{n}), n \in N$ defined on $R$ with the Borel $σ$-algebra,

\begin{matrix} (18) & f_{n} (x) = \begin{cases} \frac{1}{n} & x \in (0, n), \\ 0 & \text{otherwise} . \end{cases} \end{matrix}

This sequence converges uniformly to the zero function on $R$, thus

\begin{matrix} (19) & \int \liminf_{n \to \infty} f_{n} d μ = \int 0 d μ = 0 . \end{matrix}

On the other hand, the integral of $f_{n}$ on $R$ is always $1$, thus

\begin{matrix} (20) & \liminf_{n \to \infty} \int f_{n} d μ = \liminf_{n \to \infty} \int_{0}^{n} \frac{1}{n} d μ = \liminf_{n \to \infty} 1 = 1, \end{matrix}

and $0 < 1$ .
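The two sides of this example can be tabulated exactly. The following is only an illustrative sketch: the helper names `integral_f_n` and `sup_f_n` are made up, and the integral is computed from the elementary "height times width" formula rather than by any numerical quadrature.

```python
from fractions import Fraction

def integral_f_n(n):
    """Exact integral of f_n = (1/n) * 1_(0, n): height 1/n times width n."""
    return Fraction(1, n) * n

def sup_f_n(n):
    """Supremum of f_n over the real line, which is its height 1/n."""
    return Fraction(1, n)

ns = [1, 10, 100, 1000]
integrals = [integral_f_n(n) for n in ns]  # stays equal to 1 for every n
heights = [sup_f_n(n) for n in ns]         # tends to 0: uniform convergence
print(integrals)
print(heights)
```

The heights shrink to zero while the integrals do not move, which is exactly the gap between the two sides of Fatou’s lemma.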

The integral of a function depends not only on its pointwise values, but also on its support. Taking the pointwise limit first can change the result because the limit can be qualitatively different from every member of the sequence: $\frac{1}{n}$ is qualitatively different from $0$, since there is a number which multiplied by $\frac{1}{n}$ gives $1$ (namely the length $n$ of the support), but there is no such number for $0$. By taking the limit under the integral sign we may lose the information about the support of the functions, while taking the limit of the integrals does not. That’s why the two results can differ.

The condition that the $f_{n}$ be non-negative is also important; without it the $\leq$ sign need not hold.

To prove the lemma, define

\begin{matrix} (21) & g_{n} := \inf_{k \geq n} f_{k}, \end{matrix}

then $g_{n}$ is 1) measurable since the $f_{n}$’s are measurable, 2) non-decreasing in $n$ and 3) $g_{n} \leq f_{n}$ by construction. Properties 2) and 3) hold pointwise. We have

\begin{matrix} (22) & \lim_{n \to \infty} \int g_{n} d μ = \int \lim_{n \to \infty} g_{n} d μ = \int \liminf_{n \to \infty} f_{n} d μ, \end{matrix}

where we have interchanged the order of $\lim_{n \to \infty}$ and $\int d μ$. This is allowed by the monotone convergence theorem, since $(g_{n})$ is a non-decreasing sequence of non-negative measurable functions. Since $g_{n} \leq f_{n}$, we have

\begin{matrix} (23) & \int g_{n} d μ \leq \int f_{n} d μ ⟹ \lim_{n \to \infty} \int g_{n} d μ \leq \liminf_{n \to \infty} \int f_{n} d μ . \end{matrix}

The lemma follows.

Theorem Lebesgue’s dominated convergence theorem. Let $(X, A, μ)$ be a measure space, let $(f_{n})$ be a sequence of measurable functions from $X$ to $R$ or $C$ that converges to $f$ almost everywhere. If there exists a $μ$-integrable function $g$ such that, for all positive integers $n$, $| f_{n} | \leq g$, then $f$ is integrable and

\begin{matrix} (24) & \lim_{n \to \infty} \int f_{n} d μ = \int f d μ . \end{matrix}

Note the position of the limit: it is in front of the integral, not under it, so the information about the support of the functions is already included in the integral.

$f$ is measurable, being the almost-everywhere limit of measurable functions, and

\begin{matrix} (25) & | f_{n} | \leq g ⟹ | f | \leq g \text{ almost everywhere} . \end{matrix}

Hence $f$ is integrable. Since $| f_{n} - f | \leq 2 g$, the sequence of functions

\begin{matrix} (26) & 2 g - | f_{n} - f | \end{matrix}

is non-negative. Then Fatou’s lemma applies and yields

\begin{matrix} (27) & \int \liminf_{n \to \infty} (2 g - | f_{n} - f |) d μ \leq \liminf_{n \to \infty} \int (2 g - | f_{n} - f |) d μ . \end{matrix}

Since $f_{n} \to f$ almost everywhere, the integrand on the left-hand side converges to $2 g$. On the right-hand side we use $\liminf (- a_{n}) = - \limsup a_{n}$, so

\begin{matrix} (28) & 2 \int g d μ \leq 2 \int g d μ - \limsup_{n \to \infty} \int | f_{n} - f | d μ . \end{matrix}

Since $\int g d μ < \infty$, we can subtract it on both sides,

\begin{matrix} (29) & \limsup_{n \to \infty} \int | f_{n} - f | d μ \leq 0, \end{matrix}

however the integral of a non-negative function must be non-negative, thus

\begin{matrix} (30) & \lim_{n \to \infty} \int | f_{n} - f | d μ = 0 ⟹ \lim_{n \to \infty} \int f_{n} d μ = \int f d μ, \end{matrix}

where the last step uses $| \int f_{n} d μ - \int f d μ | \leq \int | f_{n} - f | d μ$.

The $n$-independent function $g$ is said to dominate the sequence $(f_{n})$. Roughly speaking it provides some kind of an upper bound so that the sequence of functions behaves nicely under the limiting procedure.

There do exist sequences of functions that cannot be dominated by an integrable function, such as our old friend $f_{n} : x \mapsto \frac{1}{n} 1_{[0, n]}$. To see that $(f_{n})$ has no integrable dominating function $g (x)$, notice that the support of $f_{n}$ grows without bound as $n \to \infty$: any dominating $g$ must satisfy $g (x) \geq \sup_{n} f_{n} (x) = \frac{1}{k}$ for $x \in (k - 1, k]$, and $\sum_{k} \frac{1}{k}$ diverges. Of course you can find a function such that $| f_{n} | \leq g$ for all $n$, such as $g = 1$ on $R$, but then $g$ wouldn’t be integrable. Thus the dominated convergence theorem doesn’t apply here. The theorem does apply to the sequence $f_{n} : x \mapsto x^{n}$ defined on $[0, 1]$, since it is dominated by $1_{[0, 1]}$. The takeaway is that the support of a function matters!

In Riemann’s integral theory there is something called the improper integral, such as an integral over an infinite interval or of an unbounded integrand. Improper integrals generalize Riemann’s integral theory. There is no such thing in Lebesgue’s integral theory.

Example Consider the sequence of functions defined on $[0, 1]$ ,

\begin{matrix} (31) & f_{n} : x \mapsto \frac{n^{3 / 2} x}{1 + n^{2} x^{2}} \end{matrix}

which tends to zero pointwise as $n \to \infty$. However, the convergence is not uniform since no matter how large $n$ is, at the small value $x = \frac{1}{n}$ we have $f_{n} = \frac{\sqrt{n}}{2}$. Thus it is not clear whether the limit of its Riemann integrals is zero or not,

\begin{matrix} (32) & \lim_{n \to \infty} \int_{0}^{1} f_{n} d x \overset{?}{=} 0. \end{matrix}

Here is where the dominated convergence theorem comes to our aid. The function $\frac{1}{\sqrt{x}}$ is integrable on $[0, 1]$ and dominates every $f_{n}$, thus

\begin{matrix} (33) & \lim_{n \to \infty} \int_{0}^{1} f_{n} d x = \int_{0}^{1} \lim_{n \to \infty} f_{n} d x = 0. \end{matrix}
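As a sanity check of this example, here is a small sketch. It relies on the closed form $\int_{0}^{1} f_{n} d x = \frac{\ln (1 + n^{2})}{2 \sqrt{n}}$, which follows from the substitution $u = 1 + n^{2} x^{2}$ and is not stated in the text above; the grid used for the domination check is an arbitrary choice, so that check is only evidence, not a proof.

```python
import math

def f_n(n, x):
    """The integrand n^(3/2) x / (1 + n^2 x^2)."""
    return n ** 1.5 * x / (1.0 + (n * x) ** 2)

def integral_f_n(n):
    """Closed form of the integral of f_n over [0, 1] (via u = 1 + n^2 x^2)."""
    return math.log(1.0 + n * n) / (2.0 * math.sqrt(n))

# Check the domination f_n(x) <= 1/sqrt(x) on a grid of sample points.
grid = [k / 1000 for k in range(1, 1001)]
dominated = all(
    f_n(n, x) <= 1.0 / math.sqrt(x)
    for n in (1, 10, 100, 1000)
    for x in grid
)

# The integrals eventually decay to zero, although slowly.
values = [integral_f_n(n) for n in (1, 100, 10_000, 1_000_000)]
print(dominated, values)
```

Note that the integrals are not monotone in $n$ at first; only the limit is guaranteed by the theorem.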

The following results are important to study a function defined by an integral. Since summation is just a discrete form of integral, the results apply equally to functions defined by a series.

Theorem Let $(X, A, μ)$ be a measure space and $(Y, d)$ be a metric space; if $f$ is a function defined on $X \times Y$ such that

for all $y \in Y$ , $x \mapsto f (x, y)$ as a function of $x$ is measurable,
for all $x \in X$ , $y \mapsto f (x, y)$ as a function of $y$ is continuous at $y_{0}$ ,
there exists an integrable function $g$ on $X$ such that for all $x \in X$ and $y \in Y$ , $| f (x, y) | \leq g (x)$ .

Then, for all $y \in Y$, $x \mapsto f (x, y)$ is integrable, and the function defined by the integral

\begin{matrix} (34) & F : y \mapsto \int f (x, y) d μ (x) \end{matrix}

is continuous at $y = y_{0}$ .

The proof can be found in textbooks; it of course uses Lebesgue’s dominated convergence theorem.

There is a similar theorem for differentiation.

Theorem Let $(X, A, μ)$ be a measure space and $I$ an open interval of $R$. If $f$ is a function defined on $X \times I$ that satisfies

$x \mapsto f (x, y)$ is integrable for all $y \in I$ ,
$y \mapsto f (x, y)$ is differentiable on $I$ for almost every $x$ ,
there exists an integrable function $g$ on $X$ such that $| \frac{\partial}{\partial y} f (x, y) | \leq g (x)$ for all $y \in I$ and almost every $x$ ,

then for all $y \in I$, the function

\begin{matrix} (35) & x \mapsto \frac{\partial}{\partial y} f (x, y) \end{matrix}

is integrable, and the function

\begin{matrix} (36) & F : y \mapsto \int f (x, y) d μ (x) \end{matrix}

is differentiable, with $F^{'} (y) = \int \frac{\partial}{\partial y} f (x, y) d μ (x)$.

As an example, let’s look at the Bessel function of the first kind, defined by

\begin{matrix} (37) & J_{n} (x) = \frac{1}{π} \int_{0}^{π} \cos (n θ - x \sin θ) d θ \end{matrix}

Since $x \mapsto \cos (n θ - x \sin θ)$ is continuous with absolute value $\leq 1$ (so it is dominated by the constant $1$, which is integrable on $[0, π]$), $J_{n}$ is continuous. The integrand is also differentiable in $x$, with derivative bounded by $1$ in absolute value, hence $J_{n}$ is also differentiable. It is actually infinitely differentiable, and it is a solution of the equation

\begin{matrix} (38) & y^{″} + \frac{1}{x} y^{'} + (1 - \frac{n^{2}}{x^{2}}) y = 0. \end{matrix}

Within the framework of Lebesgue’s integral theory, differentiation under the integral sign is not always permitted. Consider the function $F$ defined by

\begin{matrix} (39) & F (x) = \int_{- \infty}^{\infty} \frac{e^{i x y}}{1 + y^{2}} d y, \end{matrix}

the integrand is bounded in absolute value by $\frac{1}{1 + y^{2}}$, which is integrable on $R$. By the same argument as before we see that $F (x)$ is continuous. If we differentiate under the integral sign, we obtain

\begin{matrix} (40) & F^{'} (x) \overset{?}{=} \int_{- \infty}^{\infty} \frac{i y e^{i x y}}{1 + y^{2}} d y, \end{matrix}

whose integrand has absolute value

\begin{matrix} (41) & \frac{| y |}{1 + y^{2}} \end{matrix}

which is not integrable on $R$, thus the theorem we introduced before doesn’t apply, and $F^{'} (x)$ cannot be expressed by the above integral. In fact the original integral can be calculated using the residue theorem, yielding

\begin{matrix} (42) & F (x) = π e^{- | x |} \end{matrix}

and the absolute value is where the problem arises. This function is not differentiable at $x = 0$ but is infinitely differentiable away from the origin.

3. Fubini’s Theorem

We often meet double integrals, that is, integrals of measurable functions defined on some product space $X \times Y$ where both $X$ and $Y$ are measure spaces, $(X, A, μ)$ and $(Y, B, ν)$. The product $σ$-algebra is denoted $A \otimes B$ and the product measure $μ \otimes ν$.

Let $E$ be a subset of $X \times Y$ , the sets

\begin{matrix} (43) & E_{x} := \{y ∣ (x, y) \in E\} \quad \text{and} \quad E^{y} := \{x ∣ (x, y) \in E\} \end{matrix}

are called the $x$-section and $y$-section of $E$. The name comes from the projection $π_{x} : (x, y) \mapsto x$: if we regard $E \overset{π_{x}}{\to} X$ as a fiber bundle, then $E_{x}$ is just the fiber of $π_{x}$ over $x$. Similarly for $E^{y}$.

Without proof, we claim that measurable sets have measurable sections. This should be intuitive since the section is a subset of the product space, and the $σ$ -algebra of the product space still applies to the subset. I am not sure why in $E_{x}$ , $x$ appears in the subscript while in $E^{y}$ , $y$ appears in the superscript.

To put the above claim in mathematical language, we have

Theorem Let $(X \times Y, A \otimes B)$ be the product measurable space. If $E$ belongs to $A \otimes B$, then

\begin{matrix} (44) & \text{for all } x \in X, y \in Y, \text{ we have } E^{y} \in A \text{ and } E_{x} \in B . \end{matrix}

As a corollary, given a measurable function $f$ on $X \times Y$, for all $x \in X$ the function $f_{x} : y \mapsto f (x, y)$ is measurable, as it is nothing but the $x$-section of $f$.

We distinguish two closely related concepts, namely finite measure and $σ$-finite measure; the extra $σ$- makes all the difference.

Given a measure space $(X, A, μ)$,

the measure is called finite if the measure of the entire $X$ is finite, and a subset $A \subset X$ is of finite measure if $μ (A) < \infty$. This is quite self-explanatory. On the other hand,
the measure is called $σ$-finite if $X$ is a countable (possibly infinite) union of measurable subsets, each of which has finite measure. A subset of $X$ is said to have $σ$-finite measure if it is a countable union of measurable sets with finite measure.

$σ$-finiteness is a weaker condition than finiteness. For example, the Lebesgue measure on $R$ is NOT finite since $μ (R) = \infty$; however it is $σ$-finite, since we can split $R$ into countably many intervals, each with finite measure.

The next theorem deals with the order of double integrals, and it makes use of the concept of $σ$ -finiteness.

Theorem Let $(X, A, μ)$ and $(Y, B, ν)$ be two measure spaces, the measures $μ$ and $ν$ being $σ$-finite. If $E$ belongs to the $σ$-algebra $A \otimes B$, then the functions $x \mapsto ν (E_{x})$ and $y \mapsto μ (E^{y})$ are measurable and

\begin{matrix} (45) & \int ν (E_{x}) d μ (x) = \int μ (E^{y}) d ν (y) . \end{matrix}

This theorem is important but the proof is not (for physicists at least), so we will skip it; interested readers can refer to textbooks on measure theory.

Fubini’s theorem. Let $(X, A, μ)$ and $(Y, B, ν)$ be two measure spaces, the measures $μ, ν$ being $σ$-finite. Let $f$ be a $μ \otimes ν$-integrable function defined on $X \times Y$. The function

\begin{matrix} (46) & x \mapsto \int f (x, y) d ν (y) \end{matrix}

as a function of $x$ is $μ$ -integrable. Namely, we can first perform the integral over $y$ , the remaining function of $x$ is $μ$ -integrable. Similarly, we can perform the integral over $x$ first and the remaining function of $y$ is still integrable, namely the function

\begin{matrix} (47) & y \mapsto \int f (x, y) d μ (x) \end{matrix}

is $ν$ -integrable. The order of integral is exchangeable,

\begin{aligned} (48) & \int f d (μ \otimes ν) & = \int [\int f (x, y) d ν (y)] d μ (x) \\ (49) & & = \int [\int f (x, y) d μ (x)] d ν (y) . \end{aligned}

The proof is also skipped here. We only mention that it uses the definition of the Lebesgue integral, namely approximating the integral of a function from below by the supremum of the integrals of simple functions, as well as the monotone convergence theorem.

Since the two measures $μ, ν$ play a symmetric role, we can also write

\begin{matrix} (50) & \int f d μ \otimes ν = \int f d μ (x) d ν (y) \end{matrix}

If the function is $μ \otimes ν$ -integrable, Fubini’s theorem is often used to justify interchanging the order of the integrals.

To apply Fubini’s theorem, it is essential to verify that the function $f$ is $μ \otimes ν$-integrable, that is, $f$ has to be measurable and $| f |$ has to be integrable. For measurable $f$, the latter holds as soon as one of the following three conditions is satisfied,

$\int | f | d μ \otimes ν < \infty,$
$\int d μ (x) \int d ν (y) | f (x, y) | < \infty,$
$\int d ν (y) \int d μ (x) | f (x, y) | < \infty .$

Fubini’s theorem for positive functions is also known as the Fubini-Tonelli theorem.

Since summation is just integration with respect to the counting measure, we can replace one or both of the integral signs by summations. Then Fubini’s theorem states that we can interchange the order of summation (or of summation and integration) in the case of absolutely convergent double series (or sums of integrals, etc.). A counterexample is

\begin{matrix} (51) & \int_{- \infty}^{\infty} e^{- x^{2} - λ x^{4}} d x = \int_{- \infty}^{\infty} e^{- x^{2}} e^{- λ x^{4}} d x = \int_{- \infty}^{\infty} e^{- x^{2}} \sum_{n = 0}^{\infty} \frac{(- λ)^{n}}{n!} x^{4 n} d x, \end{matrix}

where the radius of convergence of the series is $\infty$. However if we interchange the order of integration and summation, namely move the summation in front of the integral sign, we have

\begin{matrix} (52) & \sum_{n = 0}^{\infty} \frac{(- λ)^{n}}{n!} \int_{- \infty}^{\infty} e^{- x^{2}} x^{4 n} d x \end{matrix}

where the integral can be performed with the help of the Gamma function, $\int_{- \infty}^{\infty} e^{- x^{2}} x^{4 n} d x = Γ (2 n + \frac{1}{2})$. The result is a power series in $λ$ that diverges for every $λ \neq 0$, so the function $(x, n) \mapsto \frac{(- λ)^{n}}{n!} e^{- x^{2}} x^{4 n}$ is not integrable, and Fubini’s theorem does not apply to this situation.
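To see the divergence concretely, one can tabulate the size of the terms of the interchanged series. This sketch assumes the standard identity $\int_{- \infty}^{\infty} e^{- x^{2}} x^{4 n} d x = Γ (2 n + \frac{1}{2})$; the value of `lam` is an arbitrary illustrative choice, and `term` is a hypothetical helper name.

```python
import math

def term(n, lam):
    """Absolute value of the n-th term: lam^n * Gamma(2n + 1/2) / n!."""
    return lam ** n * math.gamma(2 * n + 0.5) / math.factorial(n)

lam = 0.1  # hypothetical small coupling, chosen only for illustration
terms = [term(n, lam) for n in range(0, 31, 5)]
print(terms)
```

Even for a small $λ$ the terms first shrink and then grow without bound, which is the typical behaviour of an asymptotic (non-convergent) series.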

When applicable, Fubini’s theorem can be used to integrate a function defined by another integral. However, it is possible that interchanging the order of integration gives two different results. For example, consider the function

\begin{matrix} (53) & f : (x, y) \mapsto \frac{x^{2} - y^{2}}{(x^{2} + y^{2})^{2}} \end{matrix}

defined on $[0, 1] \times [0, 1] ∖ \{(0, 0)\}$, where $(0, 0)$ is the origin where the function blows up, and $A ∖ B$ means $A$ minus $B$. Since $(0, 0)$ is a point and points have measure zero, it is not necessary to define $f$ for $(x, y) = (0, 0)$. Since

\begin{matrix} (54) & \frac{d}{d x} \frac{- x}{x^{2} + y^{2}} = \frac{x^{2} - y^{2}}{(x^{2} + y^{2})^{2}}, \end{matrix}

we have

\begin{matrix} (55) & \int_{0}^{1} d y \int_{0}^{1} d x f (x, y) = - \frac{π}{4}, \end{matrix}

however if we interchange the order of integral, we have

\begin{matrix} (56) & \int_{0}^{1} d x \int_{0}^{1} d y f (x, y) = \frac{π}{4}, \end{matrix}

note the extra minus sign. Hence $f$ is not integrable. As a matter of fact we have

\begin{matrix} (57) & \int_{0}^{1} d y \int_{0}^{1} d x | \frac{x^{2} - y^{2}}{(x^{2} + y^{2})^{2}} | = \infty . \end{matrix}
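The two iterated integrals can be checked numerically. In this sketch the inner integrals are taken in closed form from the antiderivative in (54), together with the antisymmetry $f (x, y) = - f (y, x)$, and only the outer integral is approximated by a midpoint rule; the helper names and the number of subintervals are arbitrary choices.

```python
import math

def inner_dx(y):
    """Integral of f over x in [0, 1] at fixed y > 0, from the antiderivative (54)."""
    return -1.0 / (1.0 + y * y)

def inner_dy(x):
    """Integral of f over y in [0, 1] at fixed x > 0, using f(x, y) = -f(y, x)."""
    return 1.0 / (1.0 + x * x)

def midpoint(g, a, b, m=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / m
    return h * sum(g(a + (k + 0.5) * h) for k in range(m))

dy_dx = midpoint(inner_dx, 0.0, 1.0)  # integrate over x first, then over y
dx_dy = midpoint(inner_dy, 0.0, 1.0)  # integrate over y first, then over x
print(dy_dx, dx_dy)
```

The two numbers agree in magnitude and differ in sign, matching (55) and (56).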

If interchanging the order of the repeated integrals gives identical results, that doesn’t prove that Fubini’s theorem applies, for the integral of the absolute value of the function might still diverge. Consider, for example, the function $f$ defined on $R_{+}^{2}$ by $f : (x, y) \mapsto \sin (x^{2} + y^{2})$. Expanding $\sin (x^{2} + y^{2})$ and taking into account the Fresnel integral

\begin{matrix} (58) & \int_{0}^{\infty} e^{i x^{2}} d x = \sqrt{2 π} (1 + i) / 4 \end{matrix}

we have

\begin{matrix} (59) & \int_{0}^{\infty} d y \int_{0}^{\infty} d x \sin (x^{2} + y^{2}) = \int_{0}^{\infty} d x \int_{0}^{\infty} d y \sin (x^{2} + y^{2}) = \frac{π}{4} . \end{matrix}

However, $\sin (x^{2} + y^{2})$ is not integrable on $R_{+}^{2}$ since the integral of its absolute value is divergent,

\begin{matrix} (60) & \int_{0}^{\infty} d y \int_{0}^{\infty} d x | \sin (x^{2} + y^{2}) | = \infty, \end{matrix}

this result is readily obtained using the change of variables $x = ρ \cos θ, y = ρ \sin θ$.

Note that for Fubini’s theorem to be valid, the measures $μ, ν$ have to be $σ$-finite. The most popular example of a non-$σ$-finite measure may be the counting measure on $R$. The counting measure gives the cardinality (number of elements) of a set, and the set of all real numbers is uncountable. By contrast, the set of rational numbers $Q$ is countable.

Consider a function $f$ defined on $[0, 1] \times [0, 1]$ by

\begin{matrix} (61) & f (x, y) = \begin{cases} 1, & x = y, \\ 0, & \text{otherwise} . \end{cases} \end{matrix}

Here comes the interesting part: on the $y$-axis we adopt the counting measure $ν (y)$ instead of the familiar Lebesgue measure, while on the $x$-axis we still use the Lebesgue measure $μ (x)$. Recalling that $ν (\text{point}) = 1$, for every $x$ we have

\begin{matrix} (62) & \int_{0}^{1} f (x, y) d ν (y) = 1, \end{matrix}

if we integrate with respect to $y$ first then to $x$ , we have

\begin{matrix} (63) & \int_{0}^{1} d μ (x) \int_{0}^{1} d ν (y) f (x, y) = \int_{0}^{1} d μ (x) 1 = 1. \end{matrix}

But if we integrate with respect to $x$ first, we get

\begin{matrix} (64) & \int_{0}^{1} d ν (y) \int_{0}^{1} d μ (x) f (x, y) = \int_{0}^{1} d ν (y) 0 = 0 . \end{matrix}

Evidently Fubini’s theorem doesn’t apply here, because the counting measure on $[0, 1]$ is not $σ$-finite.

The convolution of two real-valued integrable (with respect to Lebesgue measure) functions $f, g$ is defined as

\begin{matrix} (65) & f * g (x) = \int_{- \infty}^{\infty} f (x - y) g (y) d y, \end{matrix}

which is also integrable, since

\begin{matrix} (66) & \int | f * g | d m \leq \int | f | d m \int | g | d m, \end{matrix}

where $d m$ denotes the Lebesgue measure used for the integral.

Next we give an example where the interchange of integral and summation makes sense. Consider the integral

\begin{matrix} (67) & \int_{0}^{\infty} \frac{\sin x}{e^{x} - 1} d x \end{matrix}

which is convergent. We may write

\begin{aligned} (68) & \int_{0}^{\infty} \frac{\sin x}{e^{x} - 1} d x & = \int_{0}^{\infty} \frac{\sin x}{e^{x} (1 - e^{- x})} d x \\ (69) & = \int_{0}^{\infty} d x \sum_{k = 1}^{\infty} e^{- k x} \sin x \\ (70) & = \sum_{k = 1}^{\infty} \int_{0}^{\infty} d x e^{- k x} \sin x, \end{aligned}

where we have used

\begin{matrix} (71) & e^{- x} (1 - e^{- x})^{- 1} = \sum_{n = 0}^{\infty} e^{- (n + 1) x} . \end{matrix}

The interchange of the signs $\sum$ and $\int$ is justified, since Fubini’s theorem applies to the function $(x, k) \mapsto e^{- k x} \sin x$, which is due to the fact that, using $| \sin x | \leq x$,

\begin{matrix} (72) & \sum_{k = 1}^{\infty} \int_{0}^{\infty} | e^{- k x} \sin x | d x \leq \sum_{k = 1}^{\infty} \frac{1}{k^{2}} < \infty ; \end{matrix}

the exponential suppression makes the integrand Lebesgue integrable. At the end of the day we get

\begin{matrix} (73) & \int_{0}^{\infty} \frac{\sin x}{e^{x} - 1} d x = \sum_{k = 1}^{\infty} \frac{1}{1 + k^{2}} . \end{matrix}

The right-hand side of the above expression can be calculated by the residue theorem. Finally, we have

\begin{matrix} (74) & \int_{0}^{\infty} \frac{\sin x}{e^{x} - 1} d x = \frac{1}{2} (π \coth π - 1) . \end{matrix}
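The closed form can be compared with a partial sum of the series in (73). This is only a numerical sketch: the cutoff of one million terms is an arbitrary choice, and the neglected tail is of order one over the cutoff.

```python
import math

def partial_sum(n_terms):
    """Partial sum of 1 / (1 + k^2) for k = 1 .. n_terms."""
    return sum(1.0 / (1.0 + k * k) for k in range(1, n_terms + 1))

# coth(pi) written as 1 / tanh(pi), since the math module has no coth.
closed_form = 0.5 * (math.pi / math.tanh(math.pi) - 1.0)
approx = partial_sum(1_000_000)
print(approx, closed_form)
```

The partial sum approaches the closed form from below, as expected for a series of positive terms.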

As we can see, Lebesgue’s theory is a kind of generalization of Riemann’s theory in terms of the concept of measure. Lebesgue’s theorem on Riemann integrability can be stated as follows,

Theorem A function defined on a bounded interval $[a, b]$ is Riemann integrable if and only if it is bounded and continuous almost everywhere.

3.1. Notes

Given a map $f$ from a set $X$ into a set $Y$, the preimage of a $σ$-algebra in $Y$ is a $σ$-algebra in $X$, but the converse is not necessarily true. However, if $A$ is a $σ$-algebra in $X$, the set of all the subsets of $Y$ whose preimage is in $A$ forms a $σ$-algebra, called the induced $σ$-algebra.

Let $(a_{n})$ be a sequence in $R$ and for any $k \in N$, define

\begin{matrix} (75) & b_{k} = \inf (a_{k}, a_{k + 1}, \dots) = \inf \{a_{n} ∣ n \geq k\} \end{matrix}

and

\begin{matrix} (76) & B_{k} = \sup (a_{k}, a_{k + 1}, \dots) = \sup \{a_{n} ∣ n \geq k\} \end{matrix}

then

\begin{matrix} (77) & \liminf_{n \to \infty} a_{n} = \sup (b_{1}, b_{2}, \dots) \end{matrix}

and

\begin{matrix} (78) & \limsup_{n \to \infty} a_{n} = \inf (B_{1}, B_{2}, \dots) . \end{matrix}
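These definitions can be tried on a concrete sequence. The sketch below uses the made-up example $a_{n} = (- 1)^{n} (1 + \frac{1}{n})$, whose lower and upper limits are $- 1$ and $1$. Since a program can only look at finitely many terms, the tails are truncated at a hypothetical level `N`, so the printed numbers only approximate the true limits.

```python
from fractions import Fraction

def a(n):
    """Hypothetical example sequence a_n = (-1)^n (1 + 1/n), n >= 1."""
    return (-1) ** n * (1 + Fraction(1, n))

N = 200  # truncation level; the true definitions use the infinite tail
seq = [a(n) for n in range(1, N + 1)]

# Tail infima b_k and tail suprema B_k over the truncated tails a_k, ..., a_N.
# Each tail keeps at least two terms so that both signs are represented.
b = [min(seq[k:]) for k in range(N - 1)]
B = [max(seq[k:]) for k in range(N - 1)]

# b_k is non-decreasing and B_k is non-increasing, so sup b_k and inf B_k
# are attained at the last index of the truncated lists.
print(float(max(b)), float(min(B)))
```

Increasing `N` pushes the two printed values closer to $- 1$ and $1$.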

lim inf is called the lower limit of the sequence $(a_{n})$, and lim sup is called the upper limit.

As the last part of the note, let’s briefly summarize the integration theory for functions defined on a finite interval $[a, b]$.

By a partition $π$ of a closed interval $[a, b]$ we mean the finite set

\begin{matrix} (79) & π = \{a = x_{0}, x_{1}, \dots, x_{m - 1}, x_{m} = b\} \end{matrix}

where $x_{0} < x_{1} < \dots < x_{m}$. In other words, a partition is a way to divide an interval into finitely many subintervals without neglecting anything.

The norm of the partition is

\begin{matrix} (80) & δ (π) = \max_{1 \leq k \leq m} (x_{k} - x_{k - 1}) . \end{matrix}

Given a continuous function $f$ on $[a, b]$, the Cauchy sum associated to $f$ and $π$ is defined as

\begin{matrix} (81) & S_{π} (f) := \sum_{k = 1}^{m} f (x_{k - 1}) (x_{k} - x_{k - 1}) . \end{matrix}

$f$ being continuous, it can be shown that as long as the norm of the partition is sufficiently small, the difference between the Riemann integral and the Cauchy sum is arbitrarily small. To be more specific, if $(π_{n})$ is a sequence of partitions of $[a, b]$ such that $\lim_{n \to \infty} δ (π_{n}) = 0$, then $(S_{π_{n}} (f))$ is a Cauchy sequence, that is, a sequence whose elements become arbitrarily close to each other as the sequence progresses, and therefore it converges. Its limit is called the Cauchy integral of $f$ on $[a, b]$, denoted by

\begin{matrix} (82) & \int_{a}^{b} f (x) d x . \end{matrix}
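The Cauchy sum (81) is easy to compute directly. The following sketch uses uniform partitions and the test function $\cos$ on $[0, π / 2]$, whose integral is $1$; both are illustrative choices, and the helper names are made up.

```python
import math

def cauchy_sum(f, partition):
    """Cauchy sum: f at the left endpoint times the length of each subinterval."""
    return sum(f(x0) * (x1 - x0) for x0, x1 in zip(partition, partition[1:]))

def uniform_partition(a, b, m):
    """Partition of [a, b] into m subintervals of equal length."""
    return [a + (b - a) * k / m for k in range(m + 1)]

# Hypothetical test function: cos on [0, pi/2], whose integral is 1.
sums = [
    cauchy_sum(math.cos, uniform_partition(0.0, math.pi / 2, m))
    for m in (10, 100, 1000)
]
print(sums)
```

As the norm of the partition shrinks, the sums settle down towards the value of the integral, in line with the statement that $(S_{π_{n}} (f))$ is a Cauchy sequence.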

The class of Cauchy integrable functions is much larger than the class of continuous functions.

Any finite linear combination of characteristic functions of open bounded intervals of $R$ is called a step function, and the step functions on $R$ form a vector space. Given a step function $s (x)$ , the mapping

\begin{matrix} (83) & s \mapsto ‖ s ‖ = \sup_{x \in R} | s (x) | \end{matrix}

is called the norm of uniform convergence. If a sequence of step functions converges uniformly to a function $f$, this function is said to be regulated. Regulated functions have Cauchy integrals.

A regulated function can have at most countably many points of discontinuity.

A function is Cauchy integrable if and only if it is regulated. Riemann integrability applies to a slightly larger class of functions. Recall that a function $f$ defined on $[a, b]$ is said to be Riemann integrable if, for all positive $ϵ$, there exist two step functions $g_{ϵ}, h_{ϵ}$ such that

\begin{matrix} (84) & g_{ϵ} \leq f \leq h_{ϵ} \end{matrix}

and

\begin{matrix} (85) & | \int h_{ϵ} d x - \int g_{ϵ} d x | < ϵ . \end{matrix}

If a function is bounded and Cauchy integrable, then it is also Riemann integrable.

Basic Measure Theory Part II
