Posted by: Shuanglin Shao | September 18, 2020

Measure and Integration theory, Lecture 7

Let f be a real valued measurable function. Let

f^+=\max \{f, 0\}\ge 0; f^-= -\min\{f,0\}\ge 0.

These are the positive and negative parts of f. An easy observation is that f= f^+-f^- and |f| = f^++f^-.
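A quick numerical sanity check of these two identities, on a hypothetical sample function (any real valued function would do):

```python
# Check f = f^+ - f^- and |f| = f^+ + f^- at sample points of the
# (hypothetical) function f(x) = x^3 - x, which changes sign on [-2, 2].

def f(x):
    return x**3 - x

for i in range(-20, 21):
    x = i / 10
    fx = f(x)
    f_plus = max(fx, 0.0)       # f^+ = max(f, 0)
    f_minus = -min(fx, 0.0)     # f^- = -min(f, 0)
    assert abs((f_plus - f_minus) - fx) < 1e-12      # f = f^+ - f^-
    assert abs((f_plus + f_minus) - abs(fx)) < 1e-12  # |f| = f^+ + f^-
```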

Definition. Let f be a real valued measurable function. If \int f^+ or \int f^- is finite, we define \int f = \int f^+ - \int f^-.

We say f is integrable if both terms are finite.

Proposition 2.21. The set of integrable real valued functions on X is a real vector space, and the integral is a linear functional on it.

Proof. We need to prove two claims.

(1). \int \alpha f =\alpha \int f for any \alpha \in \mathbb{R}. This is easy: we distinguish the three cases \alpha>0, \alpha=0, \alpha<0.

(2). \int (f+g) = \int f + \int g. The key is to observe that, if h = f+g,

then h^+-h^- = f^+-f^- +g^+-g^-. We rearrange it to obtain

h^++f^-+g^- = h^- +f^++g^+. Taking integrals of both sides and using additivity of the integral for nonnegative functions yields (2).

Definition. The complex valued function f is integrable if Re f and Im f are both integrable.

Remark. It is easy to see that the space of complex valued integrable functions is a complex vector space and the integral is a linear functional over it. It is denoted by L^1.

Remark. We will regard L^1 as a set of equivalence classes of a.e. defined integrable functions on X, where f and g are considered to be equivalent if and only if f=g a.e.

Proposition 2.22. If f\in L^1, then |\int f | \le \int |f|.

Proof. When f is real valued, the proof is easy. For complex valued f, there exists \theta such that

|\int f| = e^{i\theta} \int f = \int e^{i\theta} f

by linearity of integrals. Since the left-hand side is real, it further equals

\int Re( e^{i\theta} f ) \le \int | Re( e^{i\theta} f ) | \le \int |f|.

Proposition 2.24. (The dominated convergence theorem.) Let \{f_n\} be a sequence in L^1 such that

(a). f_n\to f, a.e.,

(b). There exists g\in L^1 such that |f_n|\le g a.e. for all n. Then

f\in L^1 and \int f = \lim_{n\to \infty} \int f_n.

Proof. Without loss of generality, we assume that f is real valued.

Step 1. The claim that f\in L^1 is easy: |f|\le g a.e. and g\in L^1.

Step 2. By Fatou's lemma,

\int (g-f) = \int \liminf_{n\to \infty} (g-f_n) \le \liminf_{n\to\infty} \left( \int g -\int f_n \right) = \int g -\limsup_{n\to\infty} \int f_n.

This yields,

\limsup \int f_n \le \int f.

The same process can be applied to \int (g+f) to show

\int f \le\liminf \int f_n.

Remark. Examples showing that the domination hypothesis cannot be dropped; in each case f_n\to f=0 pointwise a.e. while \int f_n =1 for every n. (1). f_n = n1_{[0,\frac 1n]} and f=0.

(2). f_n= 1_{[n,n+1]} and f=0.

(3). f_n = \frac {1}{n} 1_{[0,n]} and f=0.
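Each of these integrals can be computed in closed form; the following sketch (with the integrals evaluated by hand rather than by quadrature) confirms that \int f_n = 1 for every n, while the pointwise limit f=0 has integral 0:

```python
# The three classical failures of "limit of integrals = integral of limit"
# without a dominating function; each integral below is exact.

def integral_example1(n):      # f_n = n * 1_{[0, 1/n]}  ("escape upward")
    return n * (1.0 / n - 0.0)

def integral_example2(n):      # f_n = 1_{[n, n+1]}      ("escape to infinity")
    return 1.0 * ((n + 1) - n)

def integral_example3(n):      # f_n = (1/n) * 1_{[0, n]} ("spread out")
    return (1.0 / n) * (n - 0.0)

for n in [1, 10, 1000]:
    assert integral_example1(n) == 1.0
    assert integral_example2(n) == 1.0
    assert integral_example3(n) == 1.0
```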

Theorem 2.26. If f\in L^1 and \epsilon>0, there is an integrable simple function \phi=\sum a_j 1_{E_j} such that \int |f-\phi|<\epsilon. If \mu is the Lebesgue-Stieltjes measure on \mathbb{R}, the sets E_j in the definition of \phi can be taken to be finite unions of open intervals; moreover there is a continuous function g that vanishes outside a bounded interval such that \int |f-g|<\epsilon.

Proof. Step 1. For an integrable function f, there exists a sequence of simple functions \phi_n such that

0\le |\phi_1| \le |\phi_2| \le \cdots \le |f|

and \phi_n\to f pointwise. Since |f-\phi_n|\le 2|f|\in L^1, the DCT gives

\int |f-\phi_n| \to 0.

Step 2. Let \phi=\sum a_j 1_{E_j} be a simple function from Step 1 with \int |f-\phi|<\epsilon; then \int |\phi|\le \int |f|+\epsilon. That is to say,

\sum_{a_j\neq 0} |a_j| \mu (E_j)<\infty.

Thus \mu(E_j)<\infty for each j with a_j\neq 0. For each such j, by Proposition 1.20, there exists a set A_j that is a finite union of open intervals such that \mu(A_j\Delta E_j)<c_j\epsilon for some small constant c_j. By taking c_j small, we can take the sets in the definition of \phi to be finite unions of open intervals.

Step 3. We can approximate each 1_{E_j}, where E_j is an open interval of finite length, by a continuous function f_j with

\int | 1_{E_j} - f_j| \le d_j \epsilon

for some constant d_j. By taking d_j small, we can replace \phi by a continuous function g that vanishes outside a bounded interval such that \int |f-g| <\epsilon.

Theorem 2.27. Suppose that f: X\times [a,b]\to \mathbb{C}, -\infty<a<b<\infty and that f(\cdot, t):\, X\to \mathbb{C} is integrable for each t\in [a,b]. Let F(t) =\int_X f(x,t) d\mu(x).

(a). Suppose that there exists g\in L^1(\mu) such that |f(x,t) |\le g(x) for all x,t. If \lim_{t\to t_0} f(x,t) =f(x,t_0) for every x, then

\lim_{t \to t_0} F(t) =F(t_0). In particular, if f(x, \cdot) is continuous for each x, then F is continuous.

(b). Suppose that \frac {\partial f}{\partial t} exists and there is a g\in L^1(\mu) such that |\frac {\partial f}{\partial t} (x,t)|\le g(x) for all x,t. Then F is differentiable and

F'(t) = \int \frac {\partial f}{\partial t} (x,t) d\mu(x).

Proof. (a). We prove the claim in (a) by the sequential characterization of continuity. Let t_n\to t_0. The sequence of measurable functions \{f(x,t_n)\} is uniformly bounded by an L^1 function and converges to f(x,t_0) pointwise. Therefore, by the dominated convergence theorem,

\lim_{n\to \infty} \int f(x,t_n)d\mu(x) = \int f(x,t_0) d\mu(x), i.e., \lim_{n\to \infty} F(t_n)= F(t_0).

(b). The proof is similar. We need to prove that, for each t\in [a,b] and any sequence h_n\to 0,

\lim_{n\to \infty} \frac {F(t+h_n)-F(t)}{h_n} = \int \frac {\partial f}{\partial t}(x,t)\, d\mu(x).

By applying the mean value theorem to f in the t variable, we see that the sequence \{ \frac {f(x,t+h_n) -f(x,t)}{h_n}\} is uniformly bounded by the L^1 function g; moreover it converges to \frac {\partial f} { \partial t}(x,t) pointwise. Again by the dominated convergence theorem,

\lim_{n\to \infty} \frac {F(t+h_n)-F(t)}{h_n} = \int \frac {\partial f} { \partial t}(x,t ) d\mu(x); hence F is differentiable with F'(t) = \int \frac {\partial f} { \partial t}(x,t ) d\mu(x).
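A numerical illustration of part (b), for the hypothetical choice f(x,t) = e^{-t x^2} on X=[0,1] with Lebesgue measure; here |\partial f/\partial t| = x^2 e^{-t x^2} \le 1 on [0,1], so the theorem applies. The midpoint-rule integrator and all parameter values below are illustrative assumptions:

```python
from math import exp

def riemann(h, a, b, n=20000):
    # midpoint-rule approximation of the integral of h over [a, b]
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

def F(t):
    # F(t) = \int_0^1 exp(-t x^2) dx
    return riemann(lambda x: exp(-t * x * x), 0.0, 1.0)

t, eps = 1.0, 1e-5
diff_quotient = (F(t + eps) - F(t - eps)) / (2 * eps)
integral_of_dfdt = riemann(lambda x: -x * x * exp(-t * x * x), 0.0, 1.0)
# the difference quotient of F matches the integral of df/dt
assert abs(diff_quotient - integral_of_dfdt) < 1e-6
```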

We next discuss the relation between Riemann integrable functions and Lebesgue measurable functions.

Theorem. Let f be a bounded real-valued function on [a,b]. If f is Riemann integrable, then it is Lebesgue integrable. Moreover the two integrals are equal.

Remark. Without loss of generality, we assume that f is a nonnegative function. As in Theorem 2.10, one would like a sequence of simple functions \{\phi_n\} that satisfies

\phi_1 (x) \le \phi_2(x) \le \cdots \le f

and \phi_n\to f uniformly on [a,b]. However, the construction in Theorem 2.10 presupposes the measurability of f, which is exactly what we need to prove. We should seek to prove the Lebesgue measurability of f in another way.

Proof. Let P= \{a= t_0<t_1<\cdots <t_n=b\} be a partition of [a,b]. Let

G_P= \sum M_j 1_{[t_{j-1}, t_j]}, g_P = \sum m_j1_{[t_{j-1}, t_j]},

where M_j, m_j are the supremum and infimum of f on [t_{j-1}, t_j]. Since f is Riemann integrable, we can choose a sequence of partitions \{P_n\} whose mesh sizes \max_j (t_j-t_{j-1}) \to 0 and

\lim_{n\to \infty} \int_{[a,b]} G_{P_n} = \lim_{n\to \infty} \int_{[a,b]} g_{P_n} = \int_{[a,b]} f (x)dx.

This is understood in the sense that the two sequences \sum_j M_j (t_j-t_{j-1}) and \sum_j m_j (t_j-t_{j-1}) have the same limit.

This implies

\lim_{n\to \infty} \int_{[a,b]} (G_{P_n}-g_{P_n})dx =0.

On the other hand, we may assume that each P_{n+1} refines P_n; then g_{P_n} is increasing, G_{P_n} is decreasing, and the pointwise limits satisfy g:= \lim g_{P_n} \le f \le G:= \lim G_{P_n}.

Therefore (the following integrals are understood in the Lebesgue sense),

\int_{[a,b]} (G-g) dx= \lim_{n\to \infty} \int_{[a,b]} (G_{P_n} - g_{P_n}) dx

=\lim_{n\to \infty} \left( \sum M_j (t_j-t_{j-1})-\sum m_j (t_j-t_{j-1})\right) =0.

Since G\ge g pointwise and \int (G-g)dx=0, we get G=g a.e., which yields G=g=f a.e. Since G is Lebesgue measurable (being a pointwise limit of simple functions) and G=f a.e., f is Lebesgue measurable by completeness of the Lebesgue measure.

That the two integrals are equal is immediate.
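The Darboux sums in the proof can be computed explicitly for a hypothetical example, f(x)=x^2 on [0,1] with uniform partitions; since f is increasing, M_j = f(t_j) and m_j = f(t_{j-1}):

```python
# Upper and lower Darboux sums G_P, g_P for f(x) = x**2 on [0, 1] with the
# uniform partition t_j = j/n; both converge to 1/3 as the mesh 1/n -> 0.

def darboux_sums(n):
    ts = [j / n for j in range(n + 1)]
    upper = sum(ts[j] ** 2 * (ts[j] - ts[j - 1]) for j in range(1, n + 1))
    lower = sum(ts[j - 1] ** 2 * (ts[j] - ts[j - 1]) for j in range(1, n + 1))
    return upper, lower

for n in [10, 100, 1000]:
    upper, lower = darboux_sums(n)
    assert lower <= 1.0 / 3.0 <= upper
    # sup - inf telescopes: the gap is exactly (f(1) - f(0)) / n = 1/n
    assert abs((upper - lower) - 1.0 / n) < 1e-9
```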

Posted by: Shuanglin Shao | September 13, 2020

Measure and Integration Theory, Lecture 6

Definition. Let (X, \mathcal{M}, \mu) be a measure space. Let

L^+= the space of all measurable functions from X to [0,\infty]. Let \phi\in L^+ be simple, \phi= \sum_{j=1}^n a_j 1_{E_j}. We define the integral of \phi with respect to \mu by

\int \phi d\mu = \sum_{j=1}^n a_j \mu(E_j).

Here we use the convention 0\cdot \infty =0.
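A minimal sketch of this definition in code, representing a simple function by the pairs (a_j, \mu(E_j)); the representation and the sample values are illustrative assumptions:

```python
from math import inf

# Integral of a simple function phi = sum_j a_j 1_{E_j} with respect to mu,
# given the pairs (a_j, mu(E_j)), using the convention 0 * infinity = 0.

def integrate_simple(pairs):
    total = 0.0
    for a, mu_E in pairs:
        if a == 0:          # convention: 0 * infinity = 0
            continue
        total += a * mu_E
    return total

# phi = 2*1_{E1} + 5*1_{E2} + 0*1_{E3} with mu(E1)=3, mu(E2)=0.5, mu(E3)=inf
assert integrate_simple([(2, 3), (5, 0.5), (0, inf)]) == 8.5
```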

Definition. If A \in \mathcal{M}, we define

\int_A \phi d\mu = \int \phi 1_A d\mu.

Remark. The definition of \int_A makes sense because

\phi1_A = \sum a_j 1_{E_j \cap A}.

Proposition. Let \phi and \psi be simple functions in L^+.

(a). If c\ge 0, \int c\phi = c\int \phi.

(b). \int \phi + \psi = \int \phi + \int \psi.

(c). If \phi \le \psi, then \int \phi \le \int \psi.

(d). The map A \mapsto \int_A \phi d\mu is a measure on \mathcal{M}.

Proof. (a) is easy.

(b). We write \phi = \sum a_j 1_{E_j}, \, \psi= \sum b_k 1_{F_k}, where \{E_j\} are disjoint sets in X and \cup E_j =X; so are \{F_k\}. The observation is that \phi+\psi takes values a_j+b_k on E_j \cap F_k. Then

\int \phi+\psi = \sum_{j,k} (a_j+b_k) \mu (E_j\cap F_k) =\sum_j a_j \mu(E_j)+\sum_k b_k \mu(F_k).

The latter sum is equal to \int \phi+\int \psi. Here we have used that

\sum_k \mu(E_j \cap F_k) = \mu(E_j).

(c) follows from (b) by observing that \psi = \phi+ (\psi-\phi).

The proof of (d) is standard.

Definition. For f\in L^+, define

\int fd\mu = \sup\{\int \phi d\mu:\, 0\le \phi \le f, \phi \text{  simple functions} \}.

Remark. (1). This definition coincides with the old definition when f is simple.

(2). If f\le g, f,g\in L^+, \int f \le \int g.

(3). For c\in [0,\infty), \int cf = c\int f. Indeed, for c=0 it is true. For c>0 and 0\le \phi \le cf, we have 0\le \frac 1c \phi \le f. Then

\int f \ge \int \frac 1c \phi =\frac 1c \int \phi. The latter equality holds because \phi is simple. Taking the supremum over such \phi, we prove

c\int f \ge \int cf.

By the same analysis, \int cf \ge c\int f.

Thus \int cf = c\int f.

Theorem 2.14. (The monotone convergence theorem. ) If \{f_n\} is a sequence in L^+ such that f_j \le f_{j+1} for all j, and f= \lim_{n\to \infty} f_n, then

\int f = \lim_{n\to \infty} \int f_n.

Proof. Since f_n\le f_{n+1}\le f, the sequence \int f_n is increasing, so \lim_{n\to \infty} \int f_n exists (possibly infinite) and is at most \int f.

For the reverse inequality, fix \epsilon\in (0,1) and a simple function \phi with 0\le \phi \le f. Define

E_n= \{x:\, f_n(x)\ge (1-\epsilon) \phi(x)\}. Since f_n increases to f and \phi\le f, the sets E_n increase with \cup_{n=1}^\infty E_n =X. By part (d) of the Proposition above, A\mapsto \int_A \phi d\mu is a measure, hence continuous from below; therefore

\int f_n \ge \int_{E_n} f_n \ge (1-\epsilon) \int_{E_n} \phi \to (1-\epsilon) \int \phi.

Hence \lim_{n\to \infty} \int f_n \ge (1-\epsilon) \int \phi; letting \epsilon \to 0 and taking the supremum over \phi yields \lim_{n\to \infty} \int f_n \ge \int f.

This proves the claim.
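A numerical illustration of the theorem with the hypothetical truncations f_n = \min(f, n) of f(x) = 1/\sqrt{x} on (0,1], whose integrals can be computed in closed form:

```python
# f_n = min(1/sqrt(x), n) increases pointwise to f(x) = 1/sqrt(x) on (0, 1].
# Exactly: f_n equals n on (0, 1/n^2] and 1/sqrt(x) on (1/n^2, 1], so
# \int_0^1 f_n dx = n * (1/n^2) + (2 - 2/n) = 2 - 1/n, increasing to 2.

def integral_of_truncation(n):
    return n * (1.0 / n**2) + (2.0 - 2.0 / n)

prev = 0.0
for n in range(1, 100):
    cur = integral_of_truncation(n)
    assert cur >= prev          # the integrals increase with n
    prev = cur
# and the limit is \int_0^1 dx / sqrt(x) = 2
assert abs(integral_of_truncation(10**6) - 2.0) < 1e-5
```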

Remark. (1). Positivity cannot be dropped. Example:

f_n= \begin{cases} 1, & x\in [-1,0),\\ -\frac 1n, & x\in [0,n],\\ 0, & \text{otherwise}. \end{cases}

Then f_n\to f=\begin{cases} 1, & x\in [-1,0), \\ 0, & x\in [0,\infty). \end{cases} It is clear that \int f_n =0 for all n but \int f =1.

(2). Monotonicity cannot be dropped. Example:

f_n(x) = 1_{[n,n+1]}. It converges to f=0. It is clear that

\int f_n =1 but \int f =0.

Theorem 2.15. If \{f_n\} is a finite or infinite sequence in L^+ and f=\sum f_n, then

\int f =\sum \int f_n.

The proof considers the partial sums of \sum_n f_n and applies the monotone convergence theorem.

Theorem 2.16. If f\in L^+, then

\int f =0 \Leftrightarrow f=0, a.e.

The direction ``\Rightarrow" considers the sets

A_n =\{f\ge \frac 1n\}, n\in \mathbb{N}.

If \mu(A_n)>0 for some n, then because f\ge \frac 1n 1_{A_n},

\int f \ge \frac 1n \mu(A_n)>0, a contradiction.

This shows that \mu(A_n)=0 for every n; since \{f>0\}=\cup_n A_n, f=0 a.e.

Corollary 2.17. If \{f_n\}\subset L^+, f\in L^+, and f_n increasingly converges to f for a.e. x, then

\int f = \lim_{n\to \infty} \int f_n.

Theorem 2.18. (Fatou’s lemma.) If f_n is any sequence in L^+, then

\int \liminf_{n\to \infty} f_n \le \liminf_{n\to \infty} \int f_n.

The proof relies on the definition that

\liminf_{n\to \infty} f_n(x) = \lim_{n\to \infty } \inf \{f_k(x):\, k\ge n\}.

Proof. By the monotone convergence theorem,

LHS = \lim_{n\to \infty} \int \inf_{k\ge n} f_k \le \lim_{n\to\infty} \inf \{ \int f_k:\, k\ge n\} = \liminf_{n\to \infty} \int f_n = RHS.
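Strict inequality in Fatou's lemma can be seen in a small computation, using the hypothetical alternating sequence f_n = 1_{[0,1)} for even n and f_n = 1_{[1,2)} for odd n; here \int \liminf f_n = 0 while \liminf \int f_n = 1:

```python
def f(n, x):
    # f_n = 1_{[0,1)} for even n, 1_{[1,2)} for odd n
    if n % 2 == 0:
        return 1.0 if 0.0 <= x < 1.0 else 0.0
    return 1.0 if 1.0 <= x < 2.0 else 0.0

# since the sequence alternates, liminf_n f_n(x) = min over the two parities
xs = [i / 100 for i in range(200)]          # grid on [0, 2)
assert all(min(f(0, x), f(1, x)) == 0.0 for x in xs)   # liminf f_n = 0

def integral(n, step=1e-3):
    # Riemann sum of f_n over [0, 2); each f_n is an indicator of a
    # unit-length interval, so every integral is 1
    return sum(f(n, i * step) for i in range(2000)) * step

assert all(abs(integral(n) - 1.0) < 1e-9 for n in range(4))
```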

Proposition 2.20. If f\in L^+ and \int f<\infty, then \{x:\, f(x)=\infty\} is a null set and \{x:\, f(x)>0\} is \sigma-finite.

The proof uses similar idea as Theorem 2.16.

Posted by: Shuanglin Shao | September 9, 2020

Measure and Integration theory, Lecture 5

In the first chapter, we were introduced to the concept of a measure and to the construction of measures via the Carath\'eodory theorem, and we have seen one application of this. In this chapter, we start the integration theory.

Definition. Let (X, \mathcal{M}), (Y, \mathcal{N}) be two measurable spaces. Let f: X\to Y be a map between X and Y. We say f is (\mathcal{M}, \mathcal{N})-measurable if f^{-1}(E)\in \mathcal{M} for all E\in \mathcal{N}.

Remark. It is clear that if \mathcal{N} is generated by \mathcal{E}, then f: X\to Y is (\mathcal{M}, \mathcal{N}) measurable if and only if f^{-1}(E) \in \mathcal{M} for all E\in \mathcal{E}. The reason is that the inverse image operation f^{-1} commutes with complements and unions of sets. Recall that the Borel \sigma algebra \mathcal{B}_\mathbb{R} is generated by the open sets of \mathbb{R}. If X, Y are metric spaces, every continuous map is measurable.

Definition. Let f: \mathbb{R} \to \mathbb{R} or \mathbb{C} be a real valued or complex valued function. The function f is called Lebesgue measurable if f is (\mathcal{L}, \mathcal{B}_\mathbb{C}) or (\mathcal{L}, \mathcal{B}_\mathbb{R}) measurable; it is called Borel measurable if it is (\mathcal{B}_\mathbb{R}, \mathcal{B}_\mathbb{C}) or (\mathcal{B}_\mathbb{R}, \mathcal{B}_\mathbb{R}) measurable.

Remark. Not every Lebesgue measurable set is a Borel set, i.e., \mathcal{L}\setminus \mathcal{B}_\mathbb{R} \neq \emptyset. See Exercise 9 in this chapter.

Definition. Let X be a set. Let \{ (Y_\alpha, \mathcal{N}_\alpha)\}_{\alpha \in A} be a family of measurable spaces. The function f_\alpha: X\to Y_\alpha is a map between X and Y_\alpha. The \sigma algebra generated by \{f_\alpha\} is the unique smallest \sigma algebra on X with respect to which the f_\alpha‘s are all measurable.

Remark. One can prove that this \sigma algebra is the same as the \sigma algebra generated by the sets \{f^{-1}_\alpha(E_\alpha):\, E_\alpha \in \mathcal{N}_\alpha\}_{ \alpha \in A}.

Proposition 2.4. Let X be a set. Let \{ (Y_\alpha, \mathcal{N}_\alpha)\}_{\alpha \in A} be a family of measurable spaces. Let Y= \Pi_\alpha Y_\alpha and \mathcal{N}= \otimes_{\alpha\in \mathcal{A}}\mathcal{N}_\alpha. Let \pi_\alpha:\, Y\to Y_\alpha be the coordinate maps. Then f:\, X \to Y is (\mathcal{M}, \mathcal{N})- measurable if and only if f_\alpha = \pi_\alpha \circ f is (\mathcal{M},\mathcal{N}_\alpha) measurable for all \alpha.

Proof. ``\Leftarrow".

f^{-1}(\pi^{-1}_\alpha (E_\alpha))=(\pi_\alpha \circ f)^{-1}(E_\alpha) .

``\Rightarrow". Easy by compositions.

Corollary 2.5. A function f:\, X\to \mathbb{C} is measurable if and only if Re (f), Im(f) are measurable.

The point of the proof is to realize that \mathcal{B}_\mathbb{C} = \mathcal{B}_{\mathbb{R}^2}=\mathcal{B}_\mathbb{R} \otimes \mathcal{B}_\mathbb{R} and Re(f) = \pi_1\circ f where \pi_1 is the projection \mathbb{C} \to \mathbb{R}; likewise for Im(f).

Proposition 2.6. If f, g:\, X\to \mathbb{C} are measurable functions, then so are f+g and fg.

Proof. (1). The map X\to \mathbb{C}^2 defined by x\to (f(x),g(x)) is measurable by Proposition 2.4.

(2). The mappings \mathbb{C}\times \mathbb{C}\to \mathbb{C} defined by (x,y)\to x+y and (x,y)\to xy are continuous. Composing the map in (1) with these continuous maps shows that f+g and fg are measurable.

Let \bar{\mathbb{R}} = \mathbb{R}\cup \{-\infty, \infty\}. We define Borel sets in \bar{\mathbb{R}} by

\mathcal{B}_{\bar{\mathbb{R}}} = \{E\subset \bar{\mathbb{R}}:\, E\cap \mathbb{R} \in \mathcal{B}_\mathbb{R}\}.

It is easy to verify that \mathcal{B}_{\bar{\mathbb{R}}} is the \sigma algebra generated by (a,\infty], \, a\in \mathbb{R}. Indeed, we denote the latter \sigma algebra by \sigma.

(1). \sigma \subset \mathcal{B}_{\bar{\mathbb{R}}} is obvious.

(2). Secondly we define a set

\mathcal{A}=\{E\subset \bar{\mathbb{R}}:\, E\cap \mathbb{R} \in \sigma \}.

It is easy to see that \mathcal{A} contains (a,b], a,b\in \mathbb{R}, because it can be written as the difference of two sets of the form (c,\infty] with c\in \mathbb{R}. Moreover, \mathcal{A} is a \sigma algebra, so it contains \mathcal{B}_{\mathbb{R}}. Now let E\in \mathcal{B}_{\bar{\mathbb{R}}}. Write E= (E\cap \mathbb{R})\cup (E\cap \{-\infty,\infty\}). The first set is in \mathcal{B}_{\mathbb{R}} \subset \sigma; for the second, \{\infty \} =\cap_{n\ge 1} (n,\infty] \in \sigma and \{-\infty\} =\cap_{n\ge 1} (-n,\infty]^c \in \sigma. So \mathcal{B}_{\bar{\mathbb{R}}} \subset \sigma.

In general, one cannot expect two operations, such as taking limits and integration or differentiation, to be interchangeable. Measurability, however, behaves well: it is preserved under the pointwise limit operations below.

Proposition 2.7. If \{f_j\} is a sequence of \bar{\mathbb{R}}-valued measurable functions on (X,\mathcal{M}), then the functions

g_1(x)=\sup_j f_j(x), \quad g_2(x)=\inf_j f_j(x),

g_3(x) = \limsup_{j\to\infty} f_j(x), \quad g_4(x) = \liminf_{j\to\infty} f_j(x)

are all measurable. If f(x) =\lim_{j\to\infty} f_j(x) exists for every x\in X, then f is measurable.

Proof. For any a\in \mathbb{R},

\{x:\, g_1(x)>a\} =\cup_{j\ge 1}\{x:\, f_j(x)>a\}. Thus g_1 is measurable; the argument for g_2 is similar.

For the \limsup, one writes

\limsup_{j\to\infty} f_j(x) = \lim_{N\to \infty } \sup_{k\ge N} f_k(x) = \inf_{N\ge 1} \sup_{k\ge N} f_k(x). Firstly, \sup_{k\ge N} f_k is measurable; secondly, the \inf of countably many measurable functions is measurable.

For the last claim, if f(x) =\lim_{j\to\infty} f_j(x) exists for every x, then f= \limsup_{j\to\infty} f_j. Since \limsup f_j is measurable, f is measurable.

Corollary 2.8. If f, g: \, X\to \bar{\mathbb{R}} are measurable, then so are \max (f,g) and \min (f,g).


Proof. (1). We write

\max(f(x),g(x)) = \frac {( f(x)+g(x))+ |f(x)-g(x)|}{2}, \quad \min(f(x),g(x)) = \frac {( f(x)+g(x))- |f(x)-g(x)|}{2}.

(2). The function |f| is a composition of f and x\to |x|. The latter is a continuous function. The composition of a measurable function and a continuous function is measurable.

Corollary 2.9. If \{f_j\} is a sequence of complex-valued measurable functions and f(x)= \lim_{j\to \infty} f_j(x) exists for all x, then f is measurable.

Proof. f(x) = Re f(x)+i Im f(x).

Definition. Let (X, \mathcal{M}) be a measurable space. If E\subset X, define the characteristic function (or the indicator function) of E,

1_E = \begin{cases} 1, & \text{ if } x\in E; \\ 0, & \text{ if } x\notin E. \end{cases}

Remark. For E\subset X, 1_E is measurable if and only if E\in \mathcal{M}.

Definition. A simple function on X is a finite combination, with complex coefficients, of characteristic functions of sets in \mathcal{M}.

Theorem 2.10. Let (X,\mathcal{M}) be a measurable space.

(a). If f:\, X\to [0,\infty] is measurable, there is a sequence \{\phi_n\} of simple functions such that 0 \le \phi_1 \le \phi_2 \le \cdots \le f, \phi_n\to f pointwise, and \phi_n \to f uniformly on any set on which f is bounded.

(b). If f:\, X\to \mathbb{C} is measurable, there is a sequence \{\phi_n\} of simple functions such that 0\le |\phi_1|\le |\phi_2|\le \cdots \le |f|, \phi_n\to f pointwise, and \phi_n\to f uniformly on any set on which f is bounded.

Proof. The proof is divided into 3 steps.

Step 1. We truncate the function f at height 2^{n-1}, n\ge 1, and subdivide the range [0,2^{n-1}] dyadically with step 2^{-n}. Then we approximate f from below by simple functions taking dyadic values. More precisely,

(1). For n=1, the height is 2^{n-1} =1. Let

E_1^1= \{0\le f(x)<\frac  12\}, E_1^2= \{\frac 12 \le f(x)<1\}, and F_1= \{f(x)\ge 1\}. Then we define

\phi_1(x) = 0\cdot 1_{E_1^1}+ \frac 12 1_{E_1^2}+ 1\cdot 1_{F_1}. It is clear that 0 \le \phi_1 \le f.

For n=2, the height is 2^{n-1}=2. Let

E_2^1= \{0\le f(x)<\frac 14\}, E_2^2= \{\frac 14 \le f(x)<\frac 12\},

and E_2^{j+1}= \{ \frac {j}{4}\le f(x)< \frac {j+1}{4}\}, for 2\le j \le 7, and F_2= \{f(x) \ge 2 \}.

Then we define

\phi_2 = \sum_{j=0}^7 \frac {j}{ 4 }1_{E_2^{j+1}}+21_{F_2}.

Because \phi_2 is obtained by bisecting each dyadic interval of the subdivision of [0,2], we have \phi_1\le \phi_2. It is clear that \phi_2\le f.

For general n, define, for 0\le j\le 2^{2n-1}-1,

E_n^{j+1}= \{\frac {j}{2^{n}} \le f(x)<\frac {j+1}{2^{n} }\} and F_n=\{f(x)\ge 2^{n-1}\}.

Then we define

\phi_n =\sum_{j=0}^{2^{2n-1}-1} \frac {j}{2^n}1_{E_n^{j+1}}+2^{n-1} 1_{F_n}.

Then it is clear that

\phi_{n-1}\le \phi_n \le f.

Step 2. Since the sequence \{\phi_n(x)\} is increasing and bounded by f(x), \lim_{n\to \infty} \phi_n(x) exists and \lim_{n\to \infty} \phi_n(x)\le f(x). Next we show that \lim_{n\to\infty} \phi_n(x) =f(x). If f(x)=\infty, then \phi_n(x)=2^{n-1}\to \infty. If f(x)<\infty, given \epsilon>0, choose N so that 2^{-N}<\epsilon and f(x)<2^{N-1}; for n\ge N, x belongs to E_n^k for some k, and there the distance between \phi_n(x) and f(x) is less than 2^{-n}. Hence

f(x)- \epsilon<\phi_n(x) \le f(x).

This proves that \lim_{n\to \infty} \phi_n(x) =f(x). The same estimate gives uniform convergence on any set where f is bounded.

Step 3. The complex function f= Re f+ i Im f, and Re f, Im f can be decomposed into positive and negative parts. By applying the previous results, the claim for complex-valued functions is proven.
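The dyadic approximants \phi_n of Step 1 can be implemented directly; the following sketch (with the hypothetical sample function f(x)=x^2) checks monotonicity in n, \phi_n \le f, and the 2^{-n} error bound of Step 2:

```python
from math import floor

def phi(n, y):
    # phi_n takes the value j/2^n on {j/2^n <= f < (j+1)/2^n} when
    # f(x) < 2^(n-1), and the value 2^(n-1) on {f >= 2^(n-1)}
    cutoff = 2.0 ** (n - 1)
    if y >= cutoff:
        return cutoff
    return floor(y * 2**n) / 2**n

f = lambda x: x * x
for i in range(151):                       # sample points in [0, 3]
    x = i / 50
    vals = [phi(n, f(x)) for n in range(1, 12)]
    assert all(a <= b for a, b in zip(vals, vals[1:]))   # phi_n increasing
    assert all(v <= f(x) + 1e-9 for v in vals)           # phi_n <= f
    # once f(x) < 2^(n-1), the error is at most 2^(-n)
    assert f(x) - phi(11, f(x)) <= 2 ** (-11) + 1e-12
```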

Proposition 2.11. The following propositions are valid if and only if the measure \mu is complete.

(a). If f is measurable, and f=g, \mu a.e., then g is measurable.

(b). If f_n is measurable for n\in N, and f_n\to f, \mu a.e., then f is measurable.

Proof. Here we prove (a) only.

``\Rightarrow". Take f=1_N for a null set N and g=1_F with F\subset N. Since f=g a.e., (a) forces every such F to be measurable, which is exactly completeness.

``\Leftarrow". Assume that \mu is complete and f=g a.e. Then there exists a null set N such that \mu(N)=0 and f=g on N^c. For any a,

\{g>a\} = \{x\in N^c:\, f>a\}\cup \{x\in N:\, g>a\}

=\left( \{f>a\}\cap N^c \right)\cup \{x\in N:\, g>a\}.

The first set is measurable because \mathcal{M} is a \sigma algebra; the second set is measurable because it is a subset of the null set N and \mu is complete. Therefore g is measurable.

Posted by: Shuanglin Shao | September 4, 2020

Measure and Integration theory, Lecture 4

In the last lecture, we have seen how to construct measures from an outer measure, even from very elementary premeasures. The key ingredient is the Carath\'eodory theorem. Usually one is given a premeasure on an algebra, which is typically formed by taking the collection of finite disjoint unions of elementary sets; then one extends this ``measure" to an outer measure \mu^*, which acts on all subsets of the underlying space X. Then one restricts it to the subsets that satisfy the Carath\'eodory condition, which are called \mu^*-measurable sets. This collection of sets forms a \sigma algebra. This idea is due to Carath\'eodory.

In this lecture, we construct Borel measures on \mathbb{R}.

Definition. Sets of the form (a, b], (a,\infty), \emptyset, where -\infty\le a<b<\infty, are called h-intervals.

Proposition. The collection \mathcal{A} of finite disjoint unions of h-intervals is an algebra.

The observation is the following.

(1). The intersection of 2 h-intervals is an h-interval.

(2). The complement of an h-interval is an h-interval or the disjoint union of 2 h-intervals.

The \sigma algebra generated by \mathcal{A} is \mathcal{B}_\mathbb{R}.

(1). Since (a,b] can be written as a countable intersection of open intervals, and (a,\infty), \emptyset are open, the \sigma algebra generated by \mathcal{A} is contained in \mathcal{B}_\mathbb{R}.

(2). Any open interval can be written as a countable union of intervals (a,b]. Then \mathcal{B}_\mathbb{R} is contained in the \sigma algebra generated by \mathcal{A}.

Proposition 1.15. Let F:\mathbb{R}\to \mathbb{R} be increasing and right continuous. For disjoint h-intervals (a_j, b_j], j=1,2, \cdots, n, let

\mu_0(\cup_{j=1}^n(a_j, b_j]) = \sum_{j=1}^n \left( F(b_j)-F(a_j)\right)

and let \mu_0(\emptyset ) =0. Then \mu_0 is a premeasure on the algebra \mathcal{A}.

Proof. Step 1. We first prove that \mu_0 is well defined. If \cup_{j=1}^n (a_j, b_j] can also be represented as another union \cup_{k=1}^m (c_k,d_k], we need to prove that

\sum_{j=1}^n \left( F(b_j)-F(a_j)\right) = \sum_{k=1}^m \left( F(d_k)-F(c_k)\right), \, (*)

By gluing neighboring h-intervals, we may assume that each (a_j, b_j] is maximal in the sense that we cannot adjoin to it any other h-interval in the union to form a larger h-interval. Then one can prove that n=m and that each (a_j,b_j] is one of the intervals (c_k,d_k]. Then (*) follows.

Step 2. We prove that \mu_0 is a premeasure on \mathcal{A}. Namely if \{I_j\}_{j=1}^\infty \in \mathcal{A} and \cup_{j=1}^\infty I_j \in \mathcal{A}, then

\mu_0(\cup_{j=1}^\infty I_j) =\sum_{j=1}^\infty \mu_0(I_j), \, (**)

The observation is that if \cup_{j=1}^\infty I_j \in \mathcal{A}, then it can be written as a finite disjoint union of h-intervals. That is to say, the I_j can be partitioned into finitely many groups, each labelled by a single h-interval; here we use that connected subsets of \mathbb{R} are intervals. Then we just need to prove the claim (**) under the hypothesis that the infinite union is one single h-interval. Without loss of generality, we assume that \cup_{j=1}^\infty I_j = (a,b] with -\infty <a<b<\infty. All I_j's are of the form (a_j,b_j], and it is not hard to deduce that, after relabelling,

b_1 =b, b_j= a_{j-1}, j\ge 2,

and a_j decreasingly converges to a. Then

\mu_0((a,b])= F(b)-F(a) = F(b)-\lim_{n\to \infty} F(a_n) = \lim_{n\to\infty} \mu_0(\cup_{j=1}^n (a_j,b_j]) =\sum_{j=1}^\infty \mu_0((a_j,b_j]).

This finishes the proof.
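A small sketch of \mu_0 for a hypothetical increasing, right-continuous choice F(x)=\arctan x, checking well-definedness on a finite union and the telescoping behind countable additivity:

```python
from math import atan

F = atan   # increasing and continuous, hence right continuous

def mu0(intervals):
    # mu_0 of a finite disjoint union of h-intervals (a, b],
    # given as a list of pairs (a, b)
    return sum(F(b) - F(a) for a, b in intervals)

# well-definedness: (0, 3] written as one interval or as three pieces
assert abs(mu0([(0, 3)]) - mu0([(0, 1), (1, 2), (2, 3)])) < 1e-12

# countable additivity on (0, 1] = union of (1/(j+1), 1/j]: the partial
# sums telescope to F(1) - F(1/n) -> F(1) - F(0) by continuity
partial = sum(F(1.0 / j) - F(1.0 / (j + 1)) for j in range(1, 10000))
assert abs(partial - (F(1.0) - F(0.0))) < 1e-3
```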

Theorem 1.16. If F: \mathbb{R}\to \mathbb{R} is any increasing, right continuous function, there is a unique Borel measure \mu_F on \mathbb{R} such that

\mu_F((a,b]) = F(b)-F(a)

for all a,b. If G is another such function, \mu_F= \mu_G if and only if F-G is a constant.

Conversely, if \mu is a Borel measure on \mathbb{R} that is finite on all bounded Borel sets and we define F(x) = \mu((0,x]) if x>0, F(x) = -\mu((x,0]) if x<0, and F(x) = 0 if x=0. Then F is increasing and right continuous, and \mu=\mu_F.

Proof. Step 1. The premeasure \mu_0 is \sigma-finite: \mathbb{R} = \cup_{j=-\infty}^\infty (j, j+1] and \mu_0((j, j+1]) = F(j+1)-F(j) is a real number. By Theorem 1.14, the restriction \mu|_\mathcal{M} of the induced outer measure is the unique Borel measure extending \mu_0; the \sigma-finiteness gives the uniqueness.

Step 2. The function F is well-defined. It is obvious that F is increasing. To prove right continuity, we let a\in \mathbb{R} and a_n decreasingly converges to a. We need to prove that \lim_{n\to \infty} F(a_n) = F(a). Without loss of generality, we assume that a>0. We observe that

(0, a_1]=(0,a]\cup(a,a_1]=(0,a]\cup \left(  \cup_{j=1}^{\infty}(a_{j+1}, a_j] \right).

Thus F(a_1)-F(a) = \mu((a,a_1]) =\sum_{j=1}^\infty \mu_0((a_{j+1},a_j])= \sum_{j\ge 1} (F(a_j)-F(a_{j+1})).

Therefore F(a) = \lim_{j\to \infty} F(a_j). From Step 1, it is obvious that \mu=\mu_F.

Definition. Let F: \mathbb{R}\to \mathbb{R} be increasing and right continuous. Define \bar\mu to be the completion of \mu_F, the Lebesgue-Stieltjes measure. We still denote it by \mu. We denote by \mathcal{M}_\mu the domain of \mu.

Next we discuss some of the regularity properties of the measure \mu. Namely, a measurable set can be well approximated by open sets from above, and by compact sets from below.

We recall that for E\in \mathcal{M}_\mu,

\mu(E) = \inf\{\sum_{j=1}^\infty [F(b_j) -F(a_j)]=\sum_{j=1}^\infty \mu((a_j,b_j]):\, E\subset \cup_{j=1}^\infty (a_j, b_j]\}.

Lemma 1.17.

\mu(E) = \inf\{\sum_{j=1}^\infty \mu((a_j,b_j)):\, E\subset \cup_{j=1}^\infty (a_j, b_j)\}.

Proof. Let \nu(E) = \inf\{\sum_{j=1}^\infty \mu((a_j,b_j)):\, E\subset \cup_{j=1}^\infty (a_j, b_j)\}.

Step 1. For any \epsilon>0, there exists (a_j, b_j] such that E \subset \cup_{j=1}^\infty (a_j,b_j] and

\sum_{j=1}^\infty \mu((a_j,b_j]) \le \mu(E) +\epsilon.

For each j, there exists \delta_j>0 such that

\mu((a_j,b_j+\delta_j)) \le \mu((a_j,b_j])+\frac {\epsilon}{2^j}

by the right continuity of F. Thus for any \epsilon>0,

\nu(E) \le \mu(E) +2\epsilon; hence \nu(E)\le \mu(E).

Step 2. For any \epsilon>0, there exists (a_j,b_j) such that E\subset \cup_{j=1}^\infty (a_j,b_j) and

\sum_{j=1}^\infty \mu((a_j,b_j]) \le \nu(E)+\epsilon.

We write each open interval (a_j,b_j) as a disjoint union of (c_{jk}, c_{j(k+1)}], where c_{j1} = a_j, c_{jk}\to b_j as k\to \infty. Thus

\mu((a_j,b_j)) = \sum_{k\ge 1} \mu((c_{jk}, c_{j(k+1)}]).

Since E\subset \cup_{j=1}^\infty \cup_{k=1}^\infty (c_{jk}, c_{j(k+1)}], we have

\sum_{j=1}^\infty \sum_{k=1}^\infty  \mu((c_{jk}, c_{j(k+1)}] )\le \nu(E)+\epsilon.

So \mu(E) \le \nu(E)+\epsilon; hence \mu(E)\le \nu(E).

This proves that \mu(E) = \nu(E).

The previous lemma says that we can replace the h-intervals in the definition of the outer measure by open intervals.

Theorem 1.18. For any E\in \mathcal{M}_\mu,

\mu(E) =\inf\{\mu(U):\, E\subset U, U \text{ is open}\}

\,\,\,= \sup\{\mu(K):\, K\subset E, K \text{ is compact}\}.

Proof. Step 1. The approximation by open sets follows from Lemma 1.17.

Step 2. We need to prove that for any E\in \mathcal{M}_\mu,

\mu(E) \le \sup \{\mu(K):\, K\subset E, K \text{ is compact}\}.

(a). Suppose \mu(X)<\infty. For any \epsilon>0, there exists an open set V such that E^c \subset V and

\mu(V\setminus E^c) <\epsilon,

that is to say, \mu(E\setminus V^c) <\epsilon. The set V^c\subset E is closed. If V^c is bounded, then it is compact by Heine-Borel and we take K= V^c. If V^c is not bounded, we write

\mu(V^c) = \lim_{R\to \infty} \mu(V^c \cap \bar{B}(0, R))

and take K= V^c \cap \bar{B}(0, R) for R large.

(b). In general, X=\cup_{j=1}^\infty X_j with X_j increasing and \mu(X_j)<\infty. If \mu(E)<\infty, then for j large,

\mu(E) \le \mu(E\cap X_j) +\epsilon \le \mu(K_j)+2\epsilon,

where K_j is some compact set and K_j\subset E\cap X_j.

If \mu(E)=\infty, then \lim_{j\to \infty} \mu(E\cap X_j)=\infty, so for any N>0 there exists a compact K\subset E with \mu(K)>N; hence \sup\{\mu(K): \, K\subset E, K \text{ compact}\} =\infty.

Theorem 1.19. If E\subset \mathbb{R}, the following are equivalent.

(a). E\in \mathcal{M}_\mu.

(b). E= V\setminus N_1, where V is a G_\delta set and \mu(N_1) =0.

(c). E= H\cup N_2, where H is an F_\sigma set and \mu(N_2) =0.

Proposition 1.20. If E\in \mathcal{M}_\mu and \mu(E)<\infty, then for any \epsilon>0, there is a set A that is a finite union of open intervals such that \mu(E\Delta A)<\epsilon.

This follows from Lemma 1.17 and Theorem 1.18. Here E\Delta A = (E \setminus A )\cup (A\setminus E).

Definition. Let F(x): = x. The completion of \mu_F is the Lebesgue measure, denoted by m. The domain of m is called the class of Lebesgue measurable sets; we denote it by \mathcal{L}. The restriction of m to \mathcal{B}_\mathbb{R} is also called the Lebesgue measure.

The Cantor set. C is obtained from [0,1] by removing the open middle third (\frac 13, \frac 23), then removing the open middle thirds (\frac 19,\frac 29), (\frac 79, \frac 89) of the two remaining intervals, and so on.

Proposition 1.22. Let C be the Cantor set.

(a). C is compact, nowhere dense, and totally disconnected (i.e., the only connected subsets of C are single points.) Moreover, C has no isolated points. (Content of Math 765/766).

(b). m(C)=0.

(c). \text{card}(C) = \mathfrak{c}.

Proof. We only prove (c). Each x\in [0,1] has a base-3 (ternary) expansion, x=\sum_{j=1}^\infty a_j 3^{-j}, where a_j\in \{0,1,2\}. Such a representation need not be unique. Indeed, there are 2 representations of \frac 13:

\frac 13= \frac 13 \quad (a_1=1, \ a_j=0 \text{ for } j\ge 2)

and \frac {1}{3} = \sum_{j\ge 2}\frac {2}{3^j}.

In order for every point of [0,1] to have a unique representation, at each such ambiguous point we choose the expansion that avoids the digit 1 whenever one exists (for \frac 13, the second representation). We apply this choice at all the endpoints of the ternary subdivisions of [0,1]. Under this choice, the Cantor set is

\{x\in [0,1]:\, x= \sum_{j=1}^\infty \frac {a_j}{3^j},\, a_j\in \{0, 2\}\}.

Then we define a map f from C to [0,1]: for x=\sum_{j=1}^\infty \frac {a_j}{3^j}, a_j=0, 2,

f(x)=  \sum_{j=1}^\infty \frac {a_j/2}{2^j}= \sum_{j=1}^\infty \frac {a_j}{2^{j+1}}, a_j=0, 2.

One can prove that f is well defined and surjective onto [0,1] (it is not injective; for example f(1/3)=f(2/3)=1/2). Hence \text{card}(C)\ge \text{card}([0,1]) =\mathfrak{c}; since C\subset [0,1], \text{card}(C) = \mathfrak{c}.
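The digit map f can be sketched on finite truncations of ternary expansions (the list-of-digits representation is an illustrative assumption); in particular it exhibits the identification of the endpoint pair 1/3 and 2/3:

```python
def cantor_map(digits):
    # digits: the ternary digits a_j in {0, 2} of a point x = sum a_j 3^{-j}
    # of the Cantor set; returns the binary number with digits a_j / 2
    assert all(d in (0, 2) for d in digits)
    return sum((d // 2) * 2.0 ** (-(j + 1)) for j, d in enumerate(digits))

# 2/3 = 0.2000..._3 maps to 0.1000..._2 = 1/2
assert cantor_map([2]) == 0.5
# 1/3 = 0.0222..._3 maps to 0.0111..._2, which also has value 1/2 in the limit
approx = cantor_map([0] + [2] * 50)
assert abs(approx - 0.5) < 1e-12
```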

Construction of a non-measurable set. See the first section in Chapter 1 of Folland’s book.

Posted by: Shuanglin Shao | August 29, 2020

Measure and Integration theory, Lecture 3

In this section, the main theorem is the Carath\’eodory theorem. It enables us to construct measures from outer measures. We introduce the concept of outer measure.

Definition. X\neq \emptyset. The function \mu^*:\, \mathcal{P}(X) \to [0,\infty] is called an outer measure if

(a). \mu^*(\emptyset) =0.

(b). If A\subset B, then \mu^*(A) \le \mu^*(B).

(c). \mu^*(\cup_{n=1}^\infty A_n) \le \sum_{n=1}^\infty \mu^*(A_n).

Remark. (1). (c) does not imply (b): subadditivity only gives an upper bound for \mu^*(\cup_n A_n), with a \le sign rather than an = sign, so it does not compare a set with a subset.

(2). We note that the domain of \mu^* is the power set of X, i.e., the collection of all the subsets of X.

The next proposition tells us how to build up an outer measure from some elementary sets.

Proposition 1.10. Let \mathcal{E} \subset \mathcal{P}(X) and \rho:\, \mathcal{E} \to [0,\infty] be such that \emptyset \in \mathcal{E},  X\in \mathcal{E} and \rho(\emptyset) =0.

For any A\subset X, define

\mu^*(A) = \inf \{ \sum_{j=1}^\infty \rho (E_j):\, E_j\in \mathcal{E} \text{ and } A \subset \cup_{j=1}^\infty E_j \}. Then \mu^* is an outer measure.

Proof. We need to verify the three points in the definition of an outer measure.

(a). Since \emptyset\in \mathcal{E} and \rho(\emptyset)=0, covering \emptyset by E_j=\emptyset for all j gives \mu^*(\emptyset)=0.

(b). If A\subset B, then any cover of B is also a cover of A, so the infimum defining \mu^*(A) is taken over a larger collection of sums; hence \mu^*(A)\le \mu^*(B).

(c). To prove (c), we give each A_j an \epsilon/2^j of room. More precisely, given \epsilon >0, for each A_j there exists a cover \{E_n^j\}_{n=1}^\infty \subset \mathcal{E} such that

(1). A_j\subset \cup_{n=1}^\infty E_n^j .

(2). \sum_{n=1}^\infty \rho(E_n^j) \le \mu^*(A_j)+\epsilon/2^j.

Since \cup_{j=1}^\infty \cup_{n=1}^\infty E_n^j covers \cup_{j=1}^\infty A_j, we see that

\mu^*(\cup_{j=1}^\infty A_j )\le \sum_{j=1}^\infty \sum_{n=1}^\infty \rho(E_n^j) \le \sum_{j=1}^\infty \mu^*(A_j)+\epsilon.

Since \epsilon is arbitrary, we see that

\mu^*(\cup_{j=1}^\infty A_j) \le \sum_{j=1}^\infty \mu^*(A_j).
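On a finite set, countable covers reduce to finite ones, so the \mu^* of Proposition 1.10 can be computed by brute force. The following toy sketch (my own illustration, not from the text; the family \mathcal{E} and the weights \rho are made up) checks properties (a)-(c) on a three-point set:

```python
from itertools import chain, combinations

# Toy illustration of Proposition 1.10 on a finite set, where it suffices
# to consider finite covers by members of E; rho is stored as a dict.
X = frozenset({1, 2, 3})
E = {frozenset(): 0, frozenset({1}): 1, frozenset({2, 3}): 1, X: 5}

def outer(A):
    """mu*(A) = inf, over covers of A by members of E, of the total rho."""
    best = float("inf")
    members = list(E)
    for r in range(len(members) + 1):
        for cover in combinations(members, r):
            if A <= frozenset(chain.from_iterable(cover)):
                best = min(best, sum(E[S] for S in cover))
    return best

assert outer(frozenset()) == 0                       # (a)
assert outer(frozenset({1})) <= outer(X)             # (b) monotonicity
# (c) subadditivity: mu*({1,2}) <= mu*({1}) + mu*({2,3})
assert outer(frozenset({1, 2})) <= outer(frozenset({1})) + outer(frozenset({2, 3}))
```

Note that outer(X) = 2 < 5 = \rho(X): the infimum can beat the weight assigned to a set by covering it more cheaply.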

Definition. Given an outer measure \mu^* on X, a set A\subset X is called \mu^*-measurable if

\mu^*(E) = \mu^*(E\cap A) + \mu^*(E\cap A^c)

for all E\subset X.

Theorem 1.11. If \mu^* is an outer measure on X, the collection \mathcal{M} of \mu^*-measurable sets is a \sigma algebra, and the restriction of \mu^* to \mathcal{M} is a complete measure.

Proof. The proof is divided into 3 Steps.

Step 1. We prove that \mathcal{M} is a \sigma algebra.

(a). \emptyset, X \in \mathcal{M}.

(b). If A \in \mathcal{M}, then A^c \in \mathcal{M} because (A^c)^c=A.

(c). If A_n\in \mathcal{M} for all n, we need to prove that \cup_{n=1}^\infty A_n \in \mathcal{M}. We may assume that the A_n are disjoint. If \mu^*(E)=\infty there is nothing to prove, so we may take \mu^*(E)<\infty below. We first look at the case of two sets A_1, A_2. We need to prove that

\mu^*((A_1\cup A_2) \cap E) + \mu^*(A_1^c \cap A_2^c \cap E) \le \mu^*(E).

Applying the \mu^*-measurability of A_1 with the test set E\cap A_2^c, the left hand side equals

\mu^*((A_1\cup A_2) \cap E ) + \mu^*(E\cap A_2^c) -\mu^*(A_1 \cap A_2^c \cap E) .

Because A_1 and A_2 are disjoint, A_1 \subset A_2^c; applying the \mu^*-measurability of A_2 with the test set E, the above equals

\mu^* ( (A_1\cup A_2) \cap E) + \mu^*(E) -\mu^*(E\cap A_2) - \mu^*(A_1\cap E) \le \mu^*(E),

where the last step uses the subadditivity \mu^*((A_1\cup A_2)\cap E)\le \mu^*(E\cap A_1)+\mu^*(E\cap A_2).

In general, let B_N = \cup_{j=1}^N A_j and B = \cup_{j\ge 1} A_j. By induction on the two-set case, each B_N is measurable, and, applying the \mu^*-measurability of A_N with the test set E\cap B_N,

\mu^*(E\cap B_N) = \mu^*(E\cap A_N) + \mu^*(E\cap B_{N-1}) = \cdots = \sum_{j=1}^N \mu^*(E\cap A_j).

Hence, for every E and every N, using E\cap B_N^c \supset E\cap B^c and monotonicity,

\mu^*(E) = \mu^*(E\cap B_N) + \mu^*(E\cap B_N^c) \ge \sum_{j=1}^N \mu^*(E\cap A_j) + \mu^*(E\cap B^c).

Letting N\to \infty and then using countable subadditivity,

\mu^*(E) \ge \sum_{j=1}^\infty \mu^*(E\cap A_j) + \mu^*(E\cap B^c) \ge \mu^*(E\cap B) + \mu^*(E\cap B^c) \ge \mu^*(E).

Thus \mathcal{M} is a \sigma algebra.

Step 2. The outer measure \mu^*:\, \mathcal{M}\to [0,\infty] is a measure.

(a). Firstly \mu^*(\emptyset) =0;

(b). Secondly we need to prove that

\mu^*(\cup_{n=1}^\infty A_n)\ge \sum_{n=1}^\infty \mu^*(A_n), \, (*).

Indeed, for n,

\mu^*(\cup_{j=1}^\infty A_j)\ge \mu^*(\cup_{j=1}^n A_j) = \mu^*(A_1) + \mu^*(\cup_{j=2}^n A_j) = \cdots = \sum_{j=1}^n \mu^*(A_j).

For the first equal sign above, we take E = \cup_{j=1}^n A_j, apply the \mu^*-measurability of A_1, and iterate. The inequality (*) follows by letting n\to \infty. Combined with the countable subadditivity of \mu^* (property (c) of outer measures), (*) proves countable additivity.

Step 3. Let A \in \mathcal{M} and \mu^*(A )=0. We prove that for any F\subset A, F is \mu^*-measurable. For any E,

\mu^*(F\cap E) + \mu^* (F^c \cap E) \le \mu^*(A \cap E) + \mu^*(F^c \cap E)

\le \mu^*(A) + \mu^*(F^c \cap E) = \mu^*(F^c \cap E ) \le \mu^* (E).

Thus F is \mu^* measurable.
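The Carathéodory condition can be tested exhaustively on small examples. The sketch below (my own illustration, not from the text) uses the outer measure \mu^*(A)=\min(\mathrm{card}(A),2) on a three-point set; only the trivial sets pass, and they do form a \sigma algebra, consistent with Theorem 1.11:

```python
from itertools import combinations

# Toy check of the Caratheodory condition for the outer measure
# mu*(A) = min(|A|, 2) on X = {1, 2, 3} (one verifies directly that this
# is monotone and subadditive, hence an outer measure).
X = frozenset({1, 2, 3})

def subsets(S):
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def mu(A):
    return min(len(A), 2)

def measurable(A):
    # A is mu*-measurable iff mu*(E) = mu*(E & A) + mu*(E - A) for all E.
    return all(mu(E) == mu(E & A) + mu(E - A) for E in subsets(X))

M = [A for A in subsets(X) if measurable(A)]
# Only the trivial sets pass, and they form a (trivial) sigma algebra.
assert set(M) == {frozenset(), X}
assert all(X - A in M for A in M)   # closed under complements
```

For instance A=\{1\} fails with the test set E=X: \mu^*(X)=2 but \mu^*(X\cap A)+\mu^*(X\cap A^c)=1+2=3.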

Next we discuss how to start from elementary sets and a primitive notion of size and build up a measure by means of Carathéodory's theorem. This is an abstraction process: we work formally with sets and premeasures. The example to keep in mind is the collection of rectangles in \mathbb{R}^2 together with the usual area; this will be used to define the Lebesgue measure. We will see a first instance of this construction in the next section, the construction of the Lebesgue measure on the real line.

Definition. If \mathcal{A}\subset \mathcal{P}(X) is an algebra, a function \mu_0:\, \mathcal{A} \to [0,\infty] will be called a premeasure if

(a). \mu_0(\emptyset) =0;

(b). If \{A_j\}_{j=1}^\infty is a sequence of disjoint sets in \mathcal{A} such that \cup_{j=1}^\infty A_j \in \mathcal{A}, then

\mu_0 (\cup_{j=1}^\infty A_j ) =\sum_{j=1}^\infty \mu_0 (A_j).

Remark. If \mathcal{A} is a \sigma algebra, \mu_0 is a measure.

If \mu_0 is a premeasure on an algebra \mathcal{A} \subset \mathcal{P}(X), it induces an outer measure on X by Proposition 1.10:

\mu^* (E) = \inf \{ \sum_{j=1}^\infty \mu_0 (A_j) :\, A_j\in \mathcal{A}, E\subset \cup_{j=1}^\infty A_j\},\, (**).

Proposition 1.13. If \mu_0 is a premeasure on \mathcal{A} and \mu^* is defined by (**), then

(a). \mu^*|_\mathcal{A} = \mu_0.

(b). Every set in \mathcal{A} is \mu^* measurable.

Proof. (a). If E\in \mathcal{A}, then the right hand side of (**) is \le \mu_0(E), as E is a cover of itself. On the other hand, for any \epsilon>0, there exists \{A_j\}_{j\ge 1}\subset \mathcal{A} with E\subset \cup_{j\ge 1} A_j such that

\mu^*(E) + \epsilon\ge \sum_{j\ge 1} \mu_0(A_j) .

Since E, A_j \in \mathcal{A} and \mathcal{A} is an algebra, monotonicity of \mu_0 (a consequence of finite additivity) gives

\mu_0(A_j)\ge \mu_0(A_j\cap E) .

Furthermore E = (\cup_{j\ge 1} A_j )\cap E = \cup_{j\ge 1} ( A_j \cap E )\in \mathcal{A}, so by the countable subadditivity of premeasures (which follows from countable additivity after disjointifying within \mathcal{A}),

\mu_0(E) = \mu_0 ( \cup_{j\ge 1} (A_j \cap E) ) \le \sum_{j\ge 1} \mu_0(A_j\cap E)\le \sum_{j\ge 1} \mu_0(A_j) \le \mu^*(E)+\epsilon.

Letting \epsilon \to 0 proves the reverse inequality.

(b). We need to prove that every set in \mathcal{A} is \mu^*-measurable. That is to say, we need to prove that for any A\in \mathcal{A} and any E\subset X,

\mu^*(A\cap E ) + \mu^* (A^c \cap E ) \le \mu^*(E),

the reverse inequality being subadditivity.

Given \epsilon>0, there exists A_j\in \mathcal{A} so that E\subset \cup_{j\ge 1} A_j and \mu^*(E)+\epsilon\ge \sum_{j\ge 1} \mu_0(A_j).

We perform the following

\mu^*(A\cap E ) + \mu^* (A^c \cap E ) \le \mu^*(A\cap \cup_{j\ge 1} A_j ) + \mu^* (A^c \cap \cup_{j\ge 1} A_j )

by monotonicity, since E\subset \cup_{j\ge 1} A_j. By the countable subadditivity of \mu^* and part (a), this is

\le \sum_j \mu^*(A\cap A_j)+ \sum_j \mu^*(A^c \cap A_j) = \sum_j \left(\mu_0(A\cap A_j) + \mu_0(A^c\cap A_j) \right).

Since \mu_0 is finitely additive on \mathcal{A}, \mu_0(A\cap A_j)+\mu_0(A^c\cap A_j)=\mu_0(A_j), so the last sum equals

\sum_j \mu_0(A_j) \le \mu^*(E)+\epsilon.

Letting \epsilon\to 0 finishes the proof.

Theorem 1.14. Let \mathcal{A} be an algebra, \mu_0 a premeasure on \mathcal{A}, and \mathcal{M} the \sigma algebra generated by \mathcal{A}. Then

(a). There exists a measure \mu on \mathcal{M} whose restriction to \mathcal{A} is \mu_0, namely, we can take \mu= \mu^*|_\mathcal{M} where \mu^* is given by (**).

(b). If \nu is another measure on \mathcal{M} that extends \mu_0, then \nu(E) \le \mu(E) for all E \in \mathcal{M}, with equality when \mu(E)<\infty.

(c). If \mu_0 is \sigma finite, then \mu is the unique extension of \mu_0 to a measure on \mathcal{M}.

Proof. Step 1. The \sigma algebra \mathcal{M} is the smallest \sigma algebra that contains \mathcal{A}, and thus is contained in the \sigma algebra of all \mu^* measurable sets. Taking \mu= \mu^*|_\mathcal{M} proves the existence, where \mu^* is given by (**).

Step 2. Let \nu be a measure on \mathcal{M} that extends \mu_0. For all A\in \mathcal{A},

\nu(A) = \mu(A) = \mu_0(A).

For all E\in \mathcal{M} and \epsilon>0, there exists \{A_j\}_{j\ge 1}\subset \mathcal{A} such that E\subset \cup_{j\ge 1} A_j and

\mu^*(E)+ \epsilon\ge \sum_{j=1}^\infty \mu_0(A_j) = \sum_{j=1}^\infty \nu(A_j)\ge \nu(E) .

Since \epsilon is arbitrary, this proves \mu(E)\ge \nu(E). On the other hand, if \mu(E)<\infty, we need to prove that \mu(E)\le \nu(E). Given \epsilon>0, by the definition (**), there exists \{A_j\}_{j\ge 1} \subset \mathcal{A} such that, with A := \cup_{j\ge 1} A_j \supset E,

\mu(E) +\epsilon\ge \sum_{j\ge 1} \mu(A_j) \ge \mu(A)=\mu(E)+\mu(A\setminus E).

Since \mu(E)<\infty, we see that \mu(A \setminus E) <\epsilon . So

\mu(E) \le \mu(A) = \nu(A)= \nu(E)+ \nu(A\setminus E) \le \nu(E) + \mu(A\setminus E) \le \nu(E)+\epsilon. Here \mu(A)=\nu(A) because A is a countable union of sets in \mathcal{A}, \mu and \nu agree on \mathcal{A}, and both measures are continuous from below. Since \epsilon is arbitrary, we see that \mu(E) \le \nu(E).

Step 3. Let \mu_0 be \sigma finite. We prove that \mu is the unique extension of \mu_0 to a measure on \mathcal{M}. Let \nu be another measure on \mathcal{M}.

\nu|_\mathcal{A}= \mu|_\mathcal{A}= \mu_0. For any E\in \mathcal{M}, \nu(E)\le \mu(E), with \nu(E) = \mu(E) if \mu(E)<\infty. Since \mu_0 is \sigma finite, there exists a disjoint union X= \cup_{j\ge 1} X_j such that X_j\in \mathcal{A} and \mu_0(X_j) <\infty for each j. Then E = E\cap \cup_{j\ge 1} X_j = \cup_{j\ge 1} (E\cap X_j), a disjoint union. Therefore

\nu(E) =\sum_{j\ge 1} \nu(E\cap X_j) = \sum_{j\ge 1} \mu(E\cap X_j) = \mu(E),

where the middle equality holds because \mu(E\cap X_j) \le \mu(X_j) = \mu_0(X_j)< \infty. Thus \nu = \mu on \mathcal{M}.

Posted by: Shuanglin Shao | August 21, 2020

Math 810, Measure theory and integration, Lecture 2

In this lecture, we study how to define measures given a \sigma algebra on X, and then study their properties. A measure is a set function, namely, a function acting on sets; here the sets belong to a given \sigma algebra. For instance, the “length” on \mathbb{R} enables us to measure open or closed intervals on \mathbb{R}.

Let X be a set equipped with a \sigma algebra \mathcal{M}. A measure on \mathcal{M} is a function \mu:\, \mathcal{M}\to [0,\infty] such that

(1). \mu(\emptyset) =0;

(2). If \{E_j\}_{j=1}^\infty is a sequence of disjoint sets in \mathcal{M}, then

\mu(\cup_{j=1}^\infty E_j) = \sum_{j=1}^\infty \mu(E_j) .

Here property (2) is called countable additivity. We call (X, \mathcal{M}, \mu) a measure space, and the sets in \mathcal{M} measurable sets.

Remark 1. Countable additivity implies finite additivity. For instance, given disjoint sets E_1, \cdots, E_n in \mathcal{M}, set E_{i} =\emptyset for all i \ge n+1. Then by (2) in the definition, we see that

\mu(\cup_{j=1}^n E_j)=\mu(\cup_{j=1}^\infty E_j) = \sum_{j=1}^\infty \mu(E_j) = \sum_{j=1}^n \mu(E_j)+ \sum_{j=n+1}^\infty 0

=  \sum_{j=1}^n \mu(E_j).

Remark 2. Countable additivity is essential in defining measures; it is what makes a satisfactory integration theory possible. For instance, to measure the area of the unit disk in the plane, we divide it and write it as a union of countably many small rectangles inside the unit disk. Then the area can be approximated by sums of areas of rectangles.

Definitions. Let (X, \mathcal{M}, \mu) be a measure space.

(1). If \mu(X)<\infty, then \mu is called finite.

(2). If X= \cup_{j=1}^\infty E_j where E_j\in \mathcal{M} and \mu(E_j)<\infty for all j, then \mu is called \sigma-finite.

(3). If for each E\in \mathcal{M} with \mu(E)=\infty, there exists F\in \mathcal{M} with F\subset E with 0<\mu(F)<\infty, then \mu is called semifinite.


Examples. (1). Let X be any nonempty set, \mathcal{M} = \mathcal{P}(X) , and f:\, X\to [0,\infty] , f(x) =1 for all x\in X. The counting measure is \mu(E)= \sum_{x\in E} f(x) for E\in\mathcal{M} (that is, \mu(E) is the number of points of E). We claim that \mu is a measure.

Proof. We need to verify the two conditions in the definition.

(a). \mu(\emptyset) =0.

(b). Let E_j be a disjoint sequence of sets in \mathcal{M}, we need to show that \mu(\cup_{j=1}^\infty E_j) = \sum_{j=1}^\infty \mu(E_j) . We recall that

\sum_{x\in \cup_{j=1}^\infty E_j} f(x) = \sup\{\mu(F): F\subset \cup_{j=1}^\infty E_j \text{ and } F \text{ is finite}\}.

Note that this definition via a supremum also covers the case where the sum ranges over an uncountable set. For any finite F\subset \cup_{j=1}^\infty E_j,

\mu(F) = \sum_{x\in F} f(x)\le \sum_{j=1}^\infty \mu(E_j).

This is because F is finite and the E_j are disjoint, so F meets only finitely many of the E_j. Taking the supremum over F, we see that

\sum_{x\in \cup_{j=1}^\infty E_j} f(x) \le \sum_{j=1}^\infty \mu(E_j).

On the other hand, we show that for any N,

\sum_{j=1}^N \mu(E_j) \le \sum_{x\in \cup_{j=1}^\infty E_j} f(x) . Letting N\to \infty then gives the reverse inequality. Indeed, for any finite F_i\subset E_i, i=1,\dots,N,

\sum_{i=1}^N \sum_{x\in F_i } f(x)  \le \sum_{x\in \cup_{j=1}^\infty E_j } f(x).

Taking supremum in each F_i, we see that the claim is proved.
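A minimal sketch of the counting measure (my own illustration, not from the text; the sets here are finite, so the supremum definition reduces to counting points):

```python
# Counting-measure sketch: f is identically 1, so mu(E) = |E|.
def mu(E):
    return len(E)

E = [{1, 2}, {3}, {4, 5, 6}]                  # pairwise disjoint sets
union = set().union(*E)
# Countable additivity, in its finite incarnation: |E1 u E2 u E3| = 2+1+3.
assert mu(union) == sum(mu(Ej) for Ej in E)
assert mu(set()) == 0
```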

(2). Let X be any nonempty set, \mathcal{M} = \mathcal{P}(X) , and f:\, X\to [0,\infty] , f(x_0) =1 for some x_0\in X and f(x)=0 for all x\neq x_0. We define the Dirac measure at x_0, \mu(E)= \sum_{x\in E} f(x) for any E\in\mathcal{M}. This proof is easy.

Proof. We need to verify the two conditions in the definition.

(a). \mu(\emptyset) =0 is obvious.

(b). Let E_j be a disjoint sequence of sets in \mathcal{M}, we need to show that \mu(\cup_{j=1}^\infty E_j) = \sum_{j=1}^\infty \mu(E_j) . For the left hand side, if x_0\in \cup_{j=1}^\infty E_j, then it is equal to 1; otherwise it is 0. If x_0\in \cup_{j=1}^\infty E_j, then x_0 is in only one of them because the sets E_j are disjoint. This implies that the right hand side is 1. The same analysis applies to the case where x_0 \notin  \cup_{j=1}^\infty E_j.

Next we summarize the basic properties of a measure.

Theorem 1.8. Let (X,\mathcal{M}, \mu) be a measure space.

(a). (Monotonicity) If E, F \in \mathcal{M} and E\subset F, then \mu(E) \le \mu(F).

(b). (Subadditivity) If \{E_j\}_{j=1}^\infty \subset \mathcal{M}, then \mu(\cup_{j=1}^\infty E_j )\le \sum_{j=1}^\infty \mu(E_j).

(c). (Continuity from below) If \{E_j\}_{j=1}^\infty \subset \mathcal{M} and E_1\subset E_2\subset \cdots, then \mu(\cup_{j=1}^\infty E_j )= \lim_{n\to \infty} \mu(E_n) .

(d). (Continuity from above) If \{E_j\}_{j=1}^\infty \subset \mathcal{M}, E_1\supset E_2\supset \cdots and \mu(E_1)<\infty, then \mu(\cap_{j=1}^\infty E_j )= \lim_{n\to \infty} \mu(E_n) .

Proof. (a). We write F= E \cup (F\setminus E), a disjoint union. Since E,F\in \mathcal{M}, F\setminus E = F\cap E^c \in \mathcal{M}. Then by finite additivity of measures,

\mu(F) = \mu(E)+ \mu(F\setminus E) \ge \mu(E), since \mu(F\setminus E)\ge 0. This proves (a).

(b). We need to create disjoint sets. Write F_1=E_1 and F_k= E_k\setminus (\cup_{j=1}^{k-1} E_j) for k\ge 2; then the F_j are disjoint and \cup_{j=1}^\infty E_j = \cup_{j=1}^\infty F_j. By the countable additivity of measures, we see that

\mu(\cup_{j=1}^\infty E_j ) = \mu(\cup_{j=1}^\infty F_j) = \sum_{j=1}^\infty  \mu(F_j)  \le \sum_{j=1}^\infty \mu(E_j).

The last inequality follows because F_j is a subset of E_j.
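The disjointification trick in (b) is concrete enough to code. A sketch (my own, not from the text):

```python
# Disjointification F_1 = E_1, F_k = E_k \ (E_1 u ... u E_{k-1}),
# as used in the proof of (b).
def disjointify(sets):
    seen, out = set(), []
    for E in sets:
        out.append(set(E) - seen)   # F_k = E_k minus everything before it
        seen |= set(E)
    return out

E = [{1, 2, 3}, {2, 3, 4}, {4, 5}]
F = disjointify(E)
# The F_k are pairwise disjoint, F_k <= E_k, and the unions agree.
assert all(F[i].isdisjoint(F[j]) for i in range(3) for j in range(i + 1, 3))
assert set().union(*F) == set().union(*E)
assert all(F[k] <= E[k] for k in range(3))
```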

(c). We will apply the same trick as above. We write

E_k = E_1 \cup (E_2\setminus E_1) \cup \cdots \cup(E_k\setminus E_{k-1}).

Then we see that \mu(E_k) = \sum_{j=1}^k \mu(E_j\setminus E_{j-1}), where E_0=\emptyset. By the countable additivity of measures,

\mu(\cup_{j=1}^\infty E_j ) = \mu(\cup_{j=1}^\infty (E_j\setminus E_{j-1})) =\sum_{j\ge 1} \mu(E_j\setminus E_{j-1}) =\lim_{k\to \infty} \mu(E_k). This proves (c). For (d), apply (c) to the increasing sets E_1\setminus E_n and subtract the result from \mu(E_1); this is where the hypothesis \mu(E_1)<\infty is used.

Remark. The condition \mu(E_1)<\infty in (d) cannot be dropped. For example, let E_n= [n,\infty) \subset \mathbb{R} and let $\mu$ be the Lebesgue measure. Then \cap_{n=1}^\infty E_n=\emptyset, so \mu(\cap_{n=1}^\infty E_n ) =0, while \mu(E_n) =\infty for each n.

Definition. Let (X, \mathcal{M}, \mu) be a measure space. A set E\in \mathcal{M} such that \mu(E)=0 is called a null set.

Definition. A measure whose domain contains all subsets of null sets is called complete.

Theorem 1.9. Suppose that (X, \mathcal{M}, \mu) is a measure space. Let \mathcal{N}= \{N\in \mathcal{M}:\, \mu(N) =0\} and \overline{\mathcal{M}} = \{E\cup F:\, E\in \mathcal{M} \text{ and } F\subset N \text{ for some } N\in \mathcal{N}\}. Then \overline{\mathcal{M}} is a \sigma algebra, and there is a unique extension \bar{\mu} of \mu to a complete measure on \overline{\mathcal{M}}.

Proof. The proof is divided into 4 parts.

(a). \emptyset, X \in \overline{\mathcal{M}}. We show that \overline{\mathcal{M}} is closed under complements. Let (E\cup F)\in \overline{\mathcal{M}} with E \in \mathcal{M} and F\subset N for some null set N \in \mathcal{M}. Then

(E\cup F)^c = E^c\cap F^c = E^c \cap [N^c \cup (F^c \cap N)]= [E^c \cap N^c] \cup [E^c \cap F^c \cap N].

This is in \overline{\mathcal{M}}.

Next we prove \overline{\mathcal{M}} is closed under countable unions. Indeed, \cup_{n=1}^\infty (E_n\cup F_n) = (\cup_{n=1}^\infty E_n) \cup (\cup_{n=1}^\infty F_n) \in \overline{\mathcal{M}}, where F_n \subset N_n for null sets N_n: the first union is in \mathcal{M}, and the second is contained in \cup_{n=1}^\infty N_n, which is null by subadditivity. This proves that \overline{\mathcal{M}} is a \sigma algebra.

(b). Define the extension \bar\mu:\, \overline{\mathcal{M}} \to [0,\infty] by

\bar\mu(E\cup F) = \mu(E).

This is well defined: if E_1\cup F_1 = E_2\cup F_2 with F_i\subset N_i\in \mathcal{N}, then E_1\subset E_2\cup N_2, so \mu(E_1)\le \mu(E_2)+\mu(N_2) = \mu(E_2), and symmetrically \mu(E_2)\le \mu(E_1). It is then easy to verify that \bar\mu is a measure.

(c). The measure \bar\mu is complete. We need to show that if E\in \overline{\mathcal{M}} with \bar\mu(E)=0, then every F\subset E belongs to \overline{\mathcal{M}}. Write

E= E_0\cup F_0 \subset E_0 \cup N_0, where \mu(E_0) = \bar\mu(E) = 0, F_0\subset N_0, and \mu(N_0) =0. Then E_0\cup N_0 is a null set in \mathcal{M}, and F = \emptyset \cup F with F\subset E_0\cup N_0, so F\in \overline{\mathcal{M}}.

(d). We establish the uniqueness of \bar\mu. Suppose there is another extension measure \nu on \overline{\mathcal{M}} such that \nu(E)=\mu(E) for E\in \mathcal{M}. We need to show that for each E\cup F\in \overline{\mathcal{M}}, with E\in \mathcal{M} and F\subset N for some \mu-null set N,

\nu(E\cup F) = \bar\mu(E\cup F).

Noting that E \subset E\cup F \subset E\cup N, we have

\mu(E)=\nu(E) \le \nu(E\cup F) \le \nu (E\cup N) = \mu(E\cup N) \le \mu(E) + \mu(N) = \mu(E).

Therefore \nu(E\cup F) = \mu(E) = \bar{\mu} (E\cup F). Thus \nu = \bar\mu.
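The completion construction can be carried out by brute force on a toy measure space. In the sketch below (my own illustration; the space and the measure are made up), \mathcal{M} has the nontrivial null set \{2,3\}, and the completion \overline{\mathcal{M}} picks up all of its subsets:

```python
from itertools import combinations

# Toy completion: X = {1,2,3}, M = {0, {1}, {2,3}, X},
# mu({1}) = 1 and mu({2,3}) = 0, stored as a dict.
X = frozenset({1, 2, 3})
M = {frozenset(), frozenset({1}), frozenset({2, 3}), X}
mu = {frozenset(): 0, frozenset({1}): 1, frozenset({2, 3}): 0, X: 1}

nulls = {N for N in M if mu[N] == 0}
subsets_of_nulls = {frozenset(c) for N in nulls
                    for r in range(len(N) + 1) for c in combinations(N, r)}
# Mbar = {E u F : E in M, F a subset of a null set}, barmu(E u F) = mu(E).
Mbar = {E | F for E in M for F in subsets_of_nulls}
barmu = {E | F: mu[E] for E in M for F in subsets_of_nulls}

# Every subset of {2,3} is now measurable, with measure zero.
assert frozenset({2}) in Mbar and barmu[frozenset({2})] == 0
assert barmu[frozenset({1, 2})] == 1
assert len(Mbar) == 8      # here the completion is the full power set
```

Repeated keys in the barmu dictionary are harmless precisely because \bar\mu is well defined: every decomposition of the same set yields the same value.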

Posted by: Shuanglin Shao | August 15, 2020

Math 810: measure theory and integration. Lecture 1

This semester we will study measure theory and integration theory, namely Chapters 1, 2, 3, and 6 in Folland’s real analysis book. The development of measure theory has a long history, starting with Weierstrass’s construction of a nowhere differentiable but continuous function. The milestone is Lebesgue’s theory of measure and integration, which generalizes the classical Riemann integral. Put simply, measure theory allows us to assign numerical values to rather general sets, and to make sense of certain intuitive concepts such as the area or volume of two or three dimensional objects. Measure theory is a deep subject in analysis, and the course has a reputation as one of the more notorious in mathematics; differential geometry is perhaps another.

To give an overview of the course: Folland’s Chapter 1 concerns the foundations of measure theory, the sigma algebra and the introduction of measures. It also shows how to construct measures from primitive sets at hand, namely via outer measures of simple sets and Carathéodory’s theorem. The chapter then returns to our most familiar setting, the real line: the construction of the Lebesgue measure. Folland’s Chapter 2 studies measurable functions and their integration. The subject of analysis is largely to make sense of various decompositions and summations. For instance, Riemann integration estimates the area enclosed by a function y=f(x) on [a,b] by rectangles, decomposing the domain [a,b] into intervals. In order to include functions such as the Dirichlet function, Lebesgue developed his theory, and the key point is revolutionary: we decompose the range of a function instead. This is reflected in the theorem that any measurable function can be approximated by simple functions, i.e., linear combinations of characteristic functions of measurable sets. On this basis, the integration theory of nonnegative functions is developed (via suprema over simple functions), and then extended to the full definition of the integral of complex valued functions. A natural question arises: how to exchange limits and integration? The dominated convergence theorem is one answer; it concerns what is probably the strongest mode of convergence, convergence in the L^1 norm. Continuing in this spirit, several modes of convergence are discussed, such as pointwise convergence, uniform convergence, and convergence in measure. At the end of the chapter, the product theory of measures and integration is discussed, culminating in Fubini’s theorem. The third chapter discusses complex measures and the Radon-Nikodym theorem. Chapter 6 is about function space theory, which provides a platform for modern analysis, because various problems in analysis and PDE are formulated in this setting.
A difficult concept is the duality between Lebesgue spaces. For instance, the dual space of L^p, 1<p<\infty, is L^{p'}, where \frac 1p+\frac {1}{p'}=1; the way in is to study bounded linear functionals on L^p. We then discuss a flower of real analysis, the interpolation theory of L^p spaces, namely the Riesz-Thorin interpolation theorem and the Marcinkiewicz interpolation theorem.

In my point of view, a theme of measure theory is to study simple sets or simple functions first, and then gradually move on to more general functions, often realized through extremization problems; for instance, this is how measures are defined via the Carathéodory theorem. In current research in modern harmonic analysis, a central conjecture is to quantify the decay of certain measures singular to the Lebesgue measure, which are often associated with geometric objects with curvature, such as spheres or paraboloids. This may be taught in another graduate course, Math 890, Fourier analysis.

Now we return to the first lecture. We will cover the second section of Chapter 1 in Folland’s book. We start with the definition of a \sigma algebra. Let X be a nonempty set. An algebra of sets is a nonempty collection \mathcal{A} of subsets of X that is closed under finite unions and complements. That is to say,

(1). If E_1, \cdots, E_n\in \mathcal{A}, then \cup_{j=1}^n E_j \in \mathcal{A};

(2). If E\in \mathcal{A}, then E^c\in \mathcal{A}.

A \sigma algebra is an algebra that is closed under countable unions. That is to say,

(1). If E_n\in \mathcal{A} for n=1,2,\cdots, then \cup_{j=1}^\infty E_j \in \mathcal{A};

(2). If E\in \mathcal{A}, then E^c\in \mathcal{A}.

It is easy to see that a \sigma algebra is also closed under countable intersections, by De Morgan’s laws.


Examples. (1). If X is any set, \mathcal{P}(X), the set of all the subsets of X, is a \sigma algebra.

(2). \{\emptyset, X\} is a \sigma algebra.

(3). If X is uncountable, then \mathcal{A} =\{E\subset X: E \text{ is countable or } E^c \text{ is countable}\} is a \sigma algebra, called the \sigma algebra of countable or co-countable sets. To prove it, we verify the two points of the definition. (1). Given \{E_n\}_{n=1}^\infty \subset \mathcal{A}: if all the E_n are countable, then \cup_{n=1}^\infty E_n is countable. If some E_{n_0} is uncountable, then because E_{n_0}\in \mathcal{A}, E_{n_0}^c is countable; hence \bigl(\cup_n E_n\bigr)^c = \cap_n E_n^c is a subset of E_{n_0}^c and so is countable. (2). Given E\in \mathcal{A}, then E is countable or E^c is countable. In the first case, (E^c)^c =E is countable, so E^c\in \mathcal{A}; the second case is handled similarly.
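On a finite set every algebra is automatically a \sigma algebra, so the closure axioms can be verified by brute force. A toy checker (my own illustration, not from the text):

```python
from itertools import combinations

# Check the algebra axioms (X present, closed under complements and
# pairwise unions) for a collection A of subsets of a finite set X.
X = frozenset({1, 2, 3, 4})

def is_algebra(A, X):
    A = set(A)
    return (X in A
            and all(X - E in A for E in A)                   # complements
            and all(E | F in A for E in A for F in A))       # unions

trivial = {frozenset(), X}
powerset = {frozenset(c) for r in range(5) for c in combinations(X, r)}
assert is_algebra(trivial, X)
assert is_algebra(powerset, X)
# {0, {1}, X} fails: the complement {2,3,4} of {1} is missing.
assert not is_algebra({frozenset(), frozenset({1}), X}, X)
```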

Definition. If \mathcal{E} is a subset of \mathcal{P}(X), there is a unique smallest \sigma algebra \mathcal{M}(\mathcal{E}) containing \mathcal{E}, namely the intersection of all \sigma algebras containing \mathcal{E}. This is called the \sigma algebra generated by \mathcal{E}.

Lemma 1.1. If \mathcal{E}\subset \mathcal{M}(\mathcal{F}), then \mathcal{M}(\mathcal{E})\subset \mathcal{M}(\mathcal{F}).

This follows from the definition, because \mathcal{M}(\mathcal{F}) is a \sigma algebra containing \mathcal{E}. Note that in the subsequent notes, the labelling of lemmas and theorems is the same as in Folland’s book, for ease of reference.

Definition. If X is any metric space, the \sigma algebra generated by the family of open sets in X or generated by the closed sets in X is called the Borel \sigma algebra on $X$ and is denoted by \mathcal{B}_X.

It is very hard to describe explicitly what the Borel \sigma algebra looks like. One may get some feeling by trying intersections, unions, and complements of sets, then intersections of unions, unions of intersections, and so on. But note that the definition of a \sigma algebra generated by a collection of sets is an abstract one.

The Borel \sigma algebra on \mathbb{R} is fundamental in our study. By definition, it is generated by open sets on \mathbb{R}. It can be generated in a number of different ways.

Lemma 1.2. \mathcal{B}(\mathbb{R}) is generated by one of the following.

(a). The open intervals \mathcal{E}_1=\{(a,b):\, a<b\},

(b). The closed intervals, \mathcal{E}_2=\{[a,b]:\, a<b\},

(c). The half-open intervals, \mathcal{E}_3=\{(a,b]:\, a<b\}, or \mathcal{E}_4=\{[a,b):\, a<b\},

(d).The open rays, \mathcal{E}_5=\{(a,\infty):\, a\in \mathbb{R}\} or \mathcal{E}_6=\{(-\infty, a):\, a\in \mathbb{R}\},

(e). The closed rays, \mathcal{E}_7=\{[a,\infty):\, a\in \mathbb{R}\} or \mathcal{E}_8=\{(-\infty, a]:\, a\in \mathbb{R}\}.

Proof. We prove the first claim; the remaining claims are proved similarly. Obviously \mathcal{B}(\mathbb{R}) contains the \sigma algebra generated by \mathcal{E}_1, because the sets in \mathcal{E}_1 are open. On the other hand, it is an exercise from Math 765/766 that any open set in \mathbb{R} can be written as a countable union of open intervals (a,b) with a <b. Hence the \sigma algebra generated by \mathcal{E}_1 contains the family of open sets in \mathbb{R}, and therefore contains \mathcal{B}(\mathbb{R}).

Now we come to another abstract concept, the product \sigma algebra on X= \Pi_{\alpha\in \mathcal{A}} X_\alpha, where the X_\alpha are nonempty sets. This is needed to define measures on product spaces; for instance, to define the Lebesgue measure on \mathbb{R}^n given that we know how it works on \mathbb{R}. We recall that X= \{f: \mathcal{A}\to \cup_{\alpha\in \mathcal{A}} X_\alpha: \, f(\alpha) \in X_\alpha, \forall \alpha \in \mathcal{A}\}. Let \pi_\alpha: \, X\to X_\alpha be the coordinate maps, that is, \pi_\alpha (f) = f(\alpha) for all f\in X. This is easy to visualize in the usual Euclidean spaces \mathbb{R}^n.

Definition. If \mathcal{M}_\alpha is a \sigma algebra on X_\alpha, then the product \sigma algebra on X= \Pi_{\alpha\in \mathcal{A}} X_\alpha is the \sigma algebra generated by

\{ \pi_\alpha^{-1}(E_\alpha):\, E_\alpha \in \mathcal{M}_\alpha, \alpha \in \mathcal{A}\}.

We denote this by \otimes_{\alpha\in \mathcal{A}} \mathcal{M}_\alpha.

Proposition 1.3. If \mathcal{A} is countable, then \otimes_{\alpha\in \mathcal{A}} \mathcal{M}_\alpha is the \sigma algebra generated by \{\Pi_{ \alpha\in \mathcal{A}}  E_\alpha:\, E_\alpha \in \mathcal{M}_\alpha\}.

Proof. It suffices to show that the two generating sets can be expressed in terms of each other via at most countable set operations.

(1). For any \alpha, \pi_\alpha^{-1}(E_\alpha ) = \Pi_{\beta \in \mathcal{A}} E_\beta, where E_\beta = E_\alpha when \beta =\alpha, and E_\beta = X_\beta for \beta \neq \alpha.

(2). On the other hand, \Pi_{\alpha\in \mathcal{A}} E_\alpha = \cap_{\alpha\in \mathcal{A}} \pi_\alpha^{-1} (E_\alpha), an at most countable intersection because \mathcal{A} is countable.
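For finitely many finite factors, both generating families in Proposition 1.3 can be fed to a brute-force generator of \sigma algebras (closing under complements and unions until stable). The sketch below (my own illustration; all names are mine) checks that the coordinate preimages and the rectangles generate the same \sigma algebra on X_1\times X_2 with X_1=X_2=\{0,1\}:

```python
# Brute-force generation of the sigma algebra generated by a family of
# subsets of a finite set X: close under complements and unions until stable.
def generate(X, family):
    A = {frozenset(), frozenset(X)} | {frozenset(E) for E in family}
    while True:
        new = {frozenset(X) - E for E in A} | {E | F for E in A for F in A}
        if new <= A:
            return A
        A |= new

# X = X1 x X2 with X1 = X2 = {0, 1}.
X = frozenset((a, b) for a in (0, 1) for b in (0, 1))
pre1 = frozenset(x for x in X if x[0] == 0)     # pi_1^{-1}({0})
pre2 = frozenset(x for x in X if x[1] == 0)     # pi_2^{-1}({0})
# The four "rectangles" E1 x E2 built from {0},{1} in each factor.
rect = {p & q for p in (pre1, X - pre1) for q in (pre2, X - pre2)}

# Preimages of coordinate sets and rectangles generate the same sigma algebra.
assert generate(X, [pre1, pre2]) == generate(X, rect)
```

Here both families generate the full power set of the four-point product, as expected from (1) and (2) above.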

The next proposition says that the product \sigma algebra can be generated by even smaller sets, namely, the generating sets for each \sigma algebra, comparing to its definition.

Proposition 1.4. Suppose that \mathcal{M}_\alpha is generated by \mathcal{E}_\alpha, for each \alpha\in \mathcal{A}. Then \otimes_{\alpha\in \mathcal{A}} \mathcal{M}_\alpha is generated by

\mathcal{F}_1= \{ \pi_\alpha^{-1}(E_\alpha):\, E_\alpha\in \mathcal{E}_\alpha, \alpha\in \mathcal{A}\}.

If \mathcal{A} is countable and X_\alpha\in \mathcal{E}_\alpha for all \alpha, then \otimes_{\alpha\in \mathcal{A}} \mathcal{M}_\alpha is generated by

\mathcal{F}_2= \{\Pi_{\alpha\in \mathcal{A}}E_\alpha:\, E_\alpha\in \mathcal{E}_\alpha\}.

It is quite natural to expect that the product \sigma algebra is generated by the sets \pi_\alpha^{-1}(E_\alpha) for E_\alpha \in \mathcal{E}_\alpha, because each \mathcal{M}_\alpha is generated by \mathcal{E}_\alpha; but we do not know explicitly how it is generated. For more discussion on this topic, see 1.2 in Section 1.6 in this chapter. The proof of this proposition is somewhat counterintuitive and abstract.

Proof. Firstly, \mathcal{M}(\mathcal{F}_1) \subset \otimes_{\alpha\in \mathcal{A}} \mathcal{M}_\alpha, because \mathcal{F}_1 is a subset of the generating set of the product \sigma algebra. On the other hand, for each fixed \alpha we consider

\{E \subset X_\alpha: \, \pi_\alpha^{-1} (E)\in \mathcal{M}(\mathcal{F}_1)\}.

It is not hard to show that this is a \sigma algebra, because taking inverse images under \pi_\alpha commutes with complements and unions of sets. It contains \mathcal{E}_\alpha and hence \mathcal{M}_\alpha. Rephrased: given E\in \mathcal{M}_\alpha, \pi_\alpha^{-1} (E) \in \mathcal{M}(\mathcal{F}_1). That is to say, the generating set of the product \sigma algebra belongs to \mathcal{M}(\mathcal{F}_1). Hence \otimes_{\alpha\in \mathcal{A}} \mathcal{M}_\alpha \subset \mathcal{M}(\mathcal{F}_1). This completes the proof.

Definition. An elementary family is a collection \mathcal{E} of subsets of X such that

(1). \emptyset \in \mathcal{E},

(2). If E, F\in \mathcal{E}, then E\cap F \in \mathcal{E},

(3). If E\in \mathcal{E}, then E^c is a finite union of members of \mathcal{E}.

Example: Let \mathcal{E} be the collection of rectangles in \mathbb{R}^2 with sides parallel to the coordinate axes, including the infinite rectangles. Then \mathcal{E} is an elementary family.

Proposition 1.7. If \mathcal{E} is an elementary family, the collection \mathcal{A} of finite disjoint unions of members of \mathcal{E} is an algebra.

The proof of this fact is left as an exercise.
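For a one-dimensional analogue of the example, conditions (2) and (3) of an elementary family can be checked directly for half-open intervals. A sketch (my own illustration, not from the text; an interval (a,b]\subset (0,1] is encoded as the pair (a, b)):

```python
# Elementary-family axioms for half-open intervals (a, b] inside X = (0, 1].
def intersect(I, J):
    """Intersection of two half-open intervals; None encodes the empty set."""
    a, b = max(I[0], J[0]), min(I[1], J[1])
    return (a, b) if a < b else None

def complement(I):
    """The complement of (a, b] in (0, 1]: a union of at most two members."""
    a, b = I
    parts = []
    if a > 0:
        parts.append((0, a))
    if b < 1:
        parts.append((b, 1))
    return parts

assert intersect((0, 0.5), (0.25, 1)) == (0.25, 0.5)      # closed under intersections
assert complement((0.25, 0.5)) == [(0, 0.25), (0.5, 1)]   # finite union of members
```

By Proposition 1.7, finite disjoint unions of such intervals then form an algebra on (0, 1].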

Posted by: Shuanglin Shao | June 14, 2010

On localization of the Schrodinger maximal function

I have been interested in the almost everywhere pointwise convergence problem for the Schrödinger solution e^{it\Delta}f as time goes to zero since I was a graduate student. The question was raised by Carleson, and remains open in high dimensions. It is closely related to the boundedness of the Schrödinger maximal operator (local or global), which in turn is closely related to interesting oscillatory integrals in harmonic analysis and Strichartz estimates in PDE. Recently I understood some of the problem and wrote down a note to reconstruct the two dimensional proof. The results contained in this note are not new; in the note I explore them from a slightly different perspective.

Posted by: Shuanglin Shao | December 6, 2009

Kato-smoothing effect

“Kato smoothing” is important in dispersive PDE. “Smoothing”, as it stands, sounds like a very good word. But why is it good, and how will it be used? These remain vague questions to me.

Last week, I began to report on a paper by Alazard-Burq-Zuily, “On the water waves questions with surface tension”, which contains a “local smoothing” result: roughly speaking, solutions to the 2-D water wave problem with surface tension in C^0_tH^s_x gain \frac 14 of a derivative, landing in L^2_tH^{s+\frac 14}_x, at the price of localizing in space and averaging in time. This result was first proven by Christianson-Hur-Staffilani.

A question came to me: why 1/4? Where can I quickly see it? Soon I found out that it is determined by the structure of the water wave equation. Oversimplifying the major result of the paper on the derivation of the water wave equations by means of paralinearization, the system can be written as a dispersive equation of the type
\partial_t u+iD^{3/2} u=0, \,D:=|\nabla|, \, (*)

(I have dropped a lot of terms, for instance by keeping only a linear equation with no flow terms in (*); this is because I am only looking at the main terms, which I think reflect the dispersion.) (*) suggests the dispersion relation \tau=|\xi|^{3/2}. If one looks for
\int \int_{|x|\le 1} |D^\alpha u|^2 dxdt \lesssim \|u_0\|_{L^2}^2, the maximum value of \alpha turns out to be 1/4. (1/4 was proved in 2D; does this suggest that it is the case in all dimensions?)
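One quick way to guess the exponent is a standard wave-packet heuristic, sketched here in my own words (it is not taken from the paper):

```latex
\[
\partial_t u + i D^a u = 0 \quad\Longrightarrow\quad \widehat{u}(\xi, t) = e^{-it|\xi|^a}\, \widehat{u_0}(\xi).
\]
% A wave packet at frequency |\xi| \sim N has group speed
\[
\big|\nabla_\xi |\xi|^a\big| \sim N^{a-1},
\]
% so it spends time about N^{1-a} inside the unit ball. Hence
\[
\int\!\!\int_{|x|\le 1} |D^\alpha u|^2\, dx\, dt \sim N^{2\alpha} \cdot N^{1-a}\, \|u_0\|_{L^2}^2,
\]
% which stays bounded uniformly in N exactly when \alpha \le (a-1)/2:
% a = 2 gives 1/2 (Schrodinger), a = 3/2 gives 1/4 (water waves).
```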

To understand this necessary condition on \alpha, I would like to draw an analogy with the “1/2-local smoothing” for the Schrodinger equations: for any \epsilon >0,

\int\int \langle x \rangle^{-1-\epsilon} |D^{1/2} e^{it\Delta} f|^2 dx dt \le C \|f\|_2^2.

Why 1/2 there? And how was it proved? Clarifying these matters provides a model (to me) for why 1/4 for water waves sounds reasonable.

Let us first motivate “local smoothing” for Schrödinger. It is well known that the solution e^{it\Delta} f to the free Schrödinger equation

i\partial_t u+\Delta u=0, u(0,x)=f(x)

obeys the conservation law of mass, i.e.,

\| e^{it\Delta} f \|_{L^{\infty}_tL^2_x} =\|f\|_{L^2_x}. \,(1)
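Conservation law (1) can be seen numerically: on a periodic grid the propagator acts as the unimodular Fourier multiplier e^{-it|\xi|^2}, hence is unitary on L^2. A minimal sketch (the grid size, time, and initial datum are arbitrary choices of mine, and the torus stands in for \mathbb{R}^d):

```python
import numpy as np

def schrodinger_evolve(f, t, L=2 * np.pi):
    """Apply e^{it Delta} to samples of f on a periodic grid of length L."""
    n = f.size
    xi = 2 * np.pi * np.fft.fftfreq(n, d=L / n)   # discrete frequencies
    # Unimodular multiplier exp(-i t |xi|^2): unitary on l^2, so mass is conserved.
    return np.fft.ifft(np.exp(-1j * t * xi**2) * np.fft.fft(f))

n = 256
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
f = np.exp(-10 * (x - np.pi) ** 2) * np.exp(40j * x)  # modulated bump

u = schrodinger_evolve(f, t=0.3)
# The discrete L^2 norms agree to machine precision.
print(abs(np.linalg.norm(u) - np.linalg.norm(f)) < 1e-10)
```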

Note that we cannot add any D^\alpha with \alpha>0 to the left hand side of (1), due to the Galilean transform (or simply by creating a bump at high frequency). One may ask: is this no-gain-of-derivative due to our asking for too much in time by requiring the L^\infty_t norm? For instance, on a time interval [0,1], the L^\infty norm is stronger than any L^q norm with 1\le q<\infty. So is the following true:

\|D^\alpha e^{it\Delta} f\|_{L^q_tL^2_x}\le C\|f\|_2,\,(2)

for some \alpha>0 and 1\le q<\infty? Unfortunately (2) does not hold for any \alpha>0, for the same reason as above (one can take the same examples).
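To make the obstruction concrete, here is the standard high-frequency bump computation, sketched in one dimension for completeness:

```latex
% Take \widehat{f_N} = \mathbf{1}_{[N, N+1]}, so \|f_N\|_2 = 1.
% Since the multiplier e^{-it|\xi|^2} is unimodular, Plancherel gives, for every t,
\[
\|D^\alpha e^{it\Delta} f_N\|_{L^2_x}^2 = \int_N^{N+1} |\xi|^{2\alpha}\, d\xi \sim N^{2\alpha}.
\]
% Hence \|D^\alpha e^{it\Delta} f_N\|_{L^q_t([0,1]) L^2_x} \sim N^\alpha \to \infty
% as N \to \infty for every q, so no \alpha > 0 is possible in (2).
```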

Is there any hope that some variant of (2) holds true? Heuristically, solutions to a dispersive equation at high frequency travel much faster than at low frequency; after waiting a long time, only the low frequencies are left behind near the spatial origin, and they do not hurt positive derivatives. So if we ask for an estimate that is local in space, is it possible? The answer turns out to be yes. We have the following estimate:

\int \int_{|x|\le 1} |D^{1/2} e^{it\Delta}f|^2 dxdt \lesssim \|f\|^2_2.\, (3)

This is referred to as the “Kato smoothing estimate” for Schrödinger in the literature. Moreover, 1/2 is the most one can expect, due to the obstruction (counterexample) mentioned above. (3) leads to a more general estimate: for any \epsilon>0,

\int \int_{\mathbb{R}^d} (1+|x|)^{-1-\epsilon} |D^{1/2} e^{it\Delta}f|^2 dxdt \lesssim \|f\|^2_2.\, (4)

Deducing (4) from (3) is easier: partition \{|x|\ge 1\} into dyadic shells, rescale, and use the information that \epsilon>0.
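For completeness, here is my reconstruction of that standard argument:

```latex
% Rescaling (3): with g(x) = R^{d/2} f(Rx) (so \|g\|_2 = \|f\|_2) one has
% e^{it\Delta} g(x) = R^{d/2} (e^{iR^2 t\Delta} f)(Rx), and a change of variables in (3) yields
\[
\int\!\!\int_{|x|\le R} |D^{1/2} e^{it\Delta} f|^2\, dx\, dt \lesssim R\, \|f\|_2^2, \qquad R \ge 1.
\]
% Splitting \{|x|\ge 1\} = \bigcup_{k\ge 0} \{2^k \le |x| < 2^{k+1}\} and using
% (1+|x|)^{-1-\epsilon} \sim 2^{-k(1+\epsilon)} on the k-th shell,
\[
\int\!\!\int (1+|x|)^{-1-\epsilon} |D^{1/2} e^{it\Delta} f|^2\, dx\, dt
\lesssim \sum_{k\ge 0} 2^{-k(1+\epsilon)} \cdot 2^{k+1}\, \|f\|_2^2
\lesssim_\epsilon \|f\|_2^2.
\]
```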

We focus on proving (3). It will follow from a Plancherel argument.

D^{1/2} e^{it\Delta} f(x)=\int_{\xi^\prime, \tau} e^{ix^\prime\cdot \xi^\prime+it\tau}\int_{\xi_d} e^{ix_d\xi_d} \delta (\tau-|\xi|^2)|\xi|^{1/2} \widehat{f}(\xi)d\xi_d d\xi'd\tau,

where \delta is the Dirac mass, and \xi=(\xi^\prime,\xi_d). We set

F(\xi^\prime,\tau)=\int_{\xi_d} e^{ix_d\xi_d} \delta (\tau-|\xi|^2) |\xi|^{1/2}\widehat{f}(\xi)d\xi_d.

To prove (3), it suffices to prove

\int_{|x_d|\le 1} \int_{\mathbb{R}^d } |\widehat{F}(x^\prime,t)|^2 dx^\prime dt dx_d \le C\|f\|_2^2, \, (5)

where x=(x^\prime,x_d). Obviously by the Plancherel theorem, the left hand side of (5) is bounded by

\int_{|x_d|\le 1} \int_{(\xi^\prime, \tau)} |F|^2 d\xi^\prime d\tau dx_d.

Fixing (\xi^\prime, \tau), Cauchy-Schwarz yields

|F|^2\le \int_{\xi_d} |\widehat{f}|^2 \delta(\tau-|\xi|^2 )d\xi_d \int_{\xi_d} |\xi|\delta(\tau-|\xi|^2) d\xi_d. \, (6)

The second factor on the right hand side of (6) is bounded by an absolute constant if we restrict \xi to the set \{\xi: |\xi|\le \sqrt{d} |\xi_d|\}; it is harmless to do so if at the beginning we aim to prove (3) under this restriction, since the general estimate then follows from the triangle inequality after partitioning the frequency space into d such pieces.
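The computation behind this bound, under the restriction |\xi|\le \sqrt{d}|\xi_d|, is the standard delta-function change of variables:

```latex
% Write \delta(\tau - |\xi'|^2 - \xi_d^2); at the roots \xi_d = \pm\sqrt{\tau - |\xi'|^2}
% the derivative of the argument is |d(|\xi'|^2 + \xi_d^2)/d\xi_d| = 2|\xi_d|, so
\[
\int |\xi|\, \delta(\tau - |\xi|^2)\, d\xi_d
= \sum_{\pm} \frac{|\xi|}{2|\xi_d|}
\le \sqrt{d},
\]
% using |\xi| \le \sqrt{d}\,|\xi_d| on the restricted set. This is also why the gain of
% exactly 1/2 derivative is natural: the Jacobian factor |\xi_d|^{-1} from the delta
% function absorbs precisely one power of |\xi|, i.e. the square of |\xi|^{1/2}.
```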

So, plugging (6) into (5), interchanging the order of integration, and using |x_d|\le 1, we find that the left hand side of (5) is bounded by C\|f\|_2^2. This is exactly what we need, and it finishes the proof of (3).

I like this type of argument very much, but I do not remember where I first saw it. So I record it here for my own benefit.

Posted by: Shuanglin Shao | November 14, 2009

Reading: GWP for critical 2D dissipative SQG

Last weekend I attended a wonderful conference SCAPDE at UC Irvine, where I learned an interesting theorem from Kiselev: solutions to critical 2D dissipative quasi-geostrophic equations (SQG) are globally wellposed

\theta_t=u\cdot \nabla \theta-(-\Delta)^\alpha\theta, u=(u_1,u_2)=(-R_2\theta, R_1\theta),

where \theta:\mathbb{R}^2\to \mathbb{R} is a scalar function, R_1, R_2 are the usual Riesz transforms in \mathbb{R}^2, defined via \widehat{R_j(f)}(\xi)=\frac {i\xi_j}{|\xi|}\widehat{f}(\xi), and \alpha\geq 0.

This is his joint work with Nazarov and Volberg. I took a look at the paper in the past few days. The proof makes good use of classical tools in Fourier analysis, such as singular integrals and moduli of continuity, which are familiar topics in Stein’s book, Singular Integrals and Differentiability Properties of Functions. It is a place where one sees the power of classical Fourier analysis rather than Strichartz estimates or Littlewood-Paley decompositions.
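The velocity law u=(-R_2\theta, R_1\theta) can be illustrated numerically. A minimal spectral sketch on the periodic box (the grid size and test function are my own arbitrary choices), which also confirms that the resulting velocity is divergence free:

```python
import numpy as np

def sqg_velocity(theta):
    """Compute u = (-R_2 theta, R_1 theta) for samples on a 2*pi-periodic grid,
    using the Fourier definition R_j f = ifft((i k_j / |k|) fft(f))."""
    n = theta.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)               # integer frequencies
    k1, k2 = np.meshgrid(k, k, indexing="ij")
    mag = np.hypot(k1, k2)
    mag[0, 0] = 1.0                                # avoid 0/0; the mean mode is annihilated anyway
    th = np.fft.fft2(theta)
    r1 = np.real(np.fft.ifft2(1j * k1 / mag * th))  # R_1 theta
    r2 = np.real(np.fft.ifft2(1j * k2 / mag * th))  # R_2 theta
    return -r2, r1

n = 64
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
theta = np.sin(3 * X) * np.cos(2 * Y)              # smooth periodic scalar

u1, u2 = sqg_velocity(theta)

# Divergence in Fourier is i k1 u1_hat + i k2 u2_hat, which vanishes identically:
# i k1 (-i k2/|k|) + i k2 (i k1/|k|) = 0.
k = np.fft.fftfreq(n, d=1.0 / n)
k1, k2 = np.meshgrid(k, k, indexing="ij")
div = np.fft.ifft2(1j * k1 * np.fft.fft2(u1) + 1j * k2 * np.fft.fft2(u2))
print(np.max(np.abs(div)) < 1e-10)
```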

The key a priori estimate they obtain is

\|\nabla \theta\|_\infty\le C\|\nabla \theta_0\|_\infty \exp \exp\{C\|\theta_0\|_\infty\}, (1)

for periodic smooth initial data \theta_0. I am new to the field of fluid dynamics, but I would still like to say a few words on this estimate. The idea of the proof is to find an “upper bound”, a modulus of continuity \omega, for solutions \theta; they then show that a family of moduli of continuity is preserved under the evolution of the equation, which is strong enough to control \|\nabla \theta\|_\infty. Quoting from the paper, the idea is to show that the critical SQG possesses a stronger “nonlocal” maximum principle than L^\infty control.

Recall that a modulus of continuity \omega: [0,\infty)\to [0,\infty) is just an arbitrary increasing continuous concave function with \omega(0)=0. A function f: \mathbb{R}^n\to \mathbb{R}^m has modulus of continuity \omega if |f(x)-f(y)|\le \omega (|x-y|) for all x,y\in \mathbb{R}^n.

We choose not to report the crucial/essential part, the construction of the preserved modulus of continuity (maybe later). Instead, we assume that f has modulus of continuity \omega, which is unbounded, with \omega^\prime (0)<\infty and \lim_{\xi\to 0+}\omega^{\prime\prime}(\xi)=-\infty. Then there holds

\|\nabla f\|_\infty< \omega^\prime (0), (2).

The proof is actually very simple. The explicit form of \omega takes care of deducing (1) from (2).

Assume that \|\nabla f\|_\infty=|\nabla f(x)| for some x. We consider the point y=x+\xi e with e=\frac {\nabla f(x)}{|\nabla f(x)|}. On the one hand, we have

f(y)-f(x)\le \omega (\xi), (3)

for all \xi\geq 0. On the other hand, the left hand side of (3) is at least |\nabla f(x)|\xi -C\xi^2, where C=\frac 12 \|\nabla^2 f\|_\infty, while its right hand side can be written as \omega^\prime (0)\xi-\rho(\xi)\xi^2 with \rho(\xi)\to\infty as \xi\to 0+.


Comparing the two bounds and dividing by \xi, we get

|\nabla f(x)| \le \omega^\prime (0)-(\rho(\xi)-C)\xi

for all sufficiently small \xi>0, and it remains to choose some \xi>0 satisfying \rho(\xi)>C.
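The representation \omega(\xi) = \omega^\prime(0)\xi - \rho(\xi)\xi^2 used above follows from Taylor's theorem with integral remainder; spelling it out:

```latex
\[
\omega(\xi) = \omega'(0)\,\xi + \int_0^\xi (\xi - s)\,\omega''(s)\, ds
= \omega'(0)\,\xi - \rho(\xi)\,\xi^2,
\qquad
\rho(\xi) := -\frac{1}{\xi^2} \int_0^\xi (\xi - s)\,\omega''(s)\, ds.
\]
% Since \omega''(s) \to -\infty as s \to 0+, for any M we have \omega''(s) \le -M
% on (0, \xi) once \xi is small, whence
\[
\rho(\xi) \ge \frac{M}{\xi^2} \int_0^\xi (\xi - s)\, ds = \frac{M}{2},
\]
% so indeed \rho(\xi) \to \infty as \xi \to 0+.
```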
