Sadeep's Tech Blog

Saturday, May 31, 2014

A Tutorial Introduction to Reproducing Kernel Hilbert Spaces - Part III

In this article of the series, we provide a summary of the theory of Reproducing Kernel Hilbert Spaces (RKHS), which was originally established by Aronszajn in his 1950 paper [1].

In the following, unless otherwise stated, we use $\mathcal{X}$ to denote an arbitrary nonempty set and $\mathcal{H}$ to denote a Hilbert space of complex-valued functions defined on $\mathcal{X}$. Therefore, $\mathcal{H}$ is a subset of $\mathbb{C}^\mathcal{X}$, the set of all complex-valued functions on $\mathcal{X}$. Although the general theory is presented for more general complex-valued Hilbert spaces, one can think of real-valued Hilbert spaces for ease of understanding. The following example provides an example for a real-valued Hilbert space over a non-empty set.

Example:

Let $\mathcal{X} = [0, 1]$ and $\mathcal{H}$ be the set of real-valued, square-integrable functions on $\mathcal{X}$. That is,
$$
\mathcal{H} = \mathcal{F} = \{f | f : [0, 1] \to \mathbb{R}, \int_{0}^{1} |f(x)|^2 dx < \infty \}.
$$
As was hinted in Exercise 2 of the previous post, $\mathcal{F}$ defined above is a Hilbert space. Recall that a Hilbert space is a vector space which has an inner product (additionally, it should also be complete with respect to the norm induced by the inner product). Therefore, a Hilbert space should have three key operations: addition & scalar multiplication, inherited from vector space, and inner product. To convince the reader that $\mathcal{F}$ is indeed an Hilbert space, we now summarize these three key operations in $\mathcal{F}$. In the following, think of $\mathcal{F}$ as a vector space and $f, g \in \mathcal{F}$ as vectors (although, at the same time, they are functions as well).

1. Addition - For each $f \in \mathcal{H}$ and $g \in \mathcal{H}$, we can define an element $(f + g) \in \mathcal{H}$ as, $(f + g) : [0, 1] \to \mathbb{R} : (f + g)(x) = f(x) + g(x)$
2. Scalar multiplication - For each $f \in \mathcal{H}$ and $a \in \mathbb{R}$, the element $(af) \in \mathcal{H}$ is defined as: $(af) : [0, 1] \to \mathbb{R} : (af)(x) = a f(x)$
3. Inner product - For each $f \in \mathcal{H}$ and $g \in \mathcal{H}$, the inner product in $\mathcal{H}$ is defined as,
$$
\langle f, g \rangle_\mathcal{H} = \int_0^1 f(x)g(x) dx
$$

Kernels

Throughout this article, we use the term kernel to identify a real or complex-valued bivariate function defined on an arbitrary set. We interchangebly use terms a kernel on $\mathcal{X}$ and a kernel on $\mathcal{X} \times \mathcal{X}$ for a function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{C}$ (or $\mathbb{R}$, depending on the context). For example, if $\mathcal{X} = \mathbb{R}$, $k : \mathbb{R} \times \mathbb{R} \to \mathbb{R} : k(x, y) = xy$ is a real-valued kernel.

We now define reproducing kernels, a special type of kernel.

Definition (Reproducing Kernel)

Let $\mathcal{X}$ be a nonempty set and $\mathcal{H} \subseteq \mathbb{C}^\mathcal{X}$ be a Hilbert space of complex-valued functions defined on $\mathcal{X}$. A kernel $k : \mathcal{X} \times \mathcal{X} \to \mathbb{C}$ is called a reproducing kernel of $\mathcal{H}$ if it has the two properties:

For every $x_0 \in \mathcal{X}$, $k(y, x_0)$ as a function of $y$ belongs to $\mathcal{H}$.
The reproducing property: for each $x_0 \in \mathcal{X}$ and $f \in \mathcal{H}$, $f(x_0) = \langle f(.), k(.,x_0) \rangle_\mathcal{H}$

The first property is easy to understand: since $k(y, x)$ is a bivariate function, when we fix the second variable $x$ at $x_0$, we get a function of the first variable $y$, which goes from $\mathcal{X}$ to $\mathbb{C}$. Now, the first property requires this function to be a member of $\mathcal{H}$ (a subset of all functions from $\mathcal{X}$ to $\mathbb{C}$) for all $x_0 \in \mathcal{X}$. The second property can now be understood as follows: Pick any element $x_0$ of $\mathcal{X}$ and $f$ of $\mathcal{H}$. From the first property, $k(.,x_0)$ is in $\mathcal{H}$. Therefore, we should be able to calculate the inner product between two elements of $\mathcal{H}$, $f$ and $k(.,x_0)$. The second property requires this value (a complex number) to be equal to $f(x_0)$.

Evaluation Functional

A Hilbert space is a set endowed with vector space structure and an inner product. When we consider $\mathcal{H}$ as a conventional set, there is nothing unusual in defining functions from $\mathcal{H}$ to $\mathbb{C}$. However, since elements of $\mathcal{H}$ are also functions, this can be little confusing at times. We therefore use the term functional to refer to a function from $\mathcal{H}$ to $\mathbb{C}$, i.e., a function of a function, and reserve the term function to refer to a member of $\mathcal{H}$, i.e., a function from $\mathcal{X}$ to $\mathbb{C}$.

Pick an element $x_0$ from $\mathcal{X}$. Now, one can map each $f \in \mathcal{H}$ to the element $f(x_0) \in \mathbb{C}$. Because $f$ is a function from $\mathcal{X}$ to $\mathbb{C}$, only one $f(x_0)$ exists. Therefore, the mapping $f \mapsto f(x_0)$ is a function from $\mathcal{H}$ to $\mathbb{C}$ (or a functional in our terminology). This functional is known as the point evaluation functional at $x_0$. The formal definition is given below.

Point evaluation functional - The point evaluation functional $L_x : \mathcal{H} \to \mathbb{C}$ at a point $x \in \mathcal{X}$ is defined as $L_x(f) := f(x)$.

We are now ready to study the formal definition of an RKHS.

Definition (Reproducing Kernel Hilbert Space)

A Hilbert space $\mathcal{H}$ of complex-valued functions on a nonempty set $\mathcal{X}$ is called a reproducing kernel Hilbert space if the point evaluation functional $L_x$ is a bounded (equivalently, continuous) linear operator for all $x \in \mathcal{X}$.

The above definition should be self-explanatory at this point since we have previously discussed all related terms. Definition of a bounded linear operator was discussed in the previous post. Note that a linear operator between two normed vector spaces is bounded if and only if it is continuous.

We next state and prove the following theorem to get more insight into the concept of an RKHS.

Theorem
A Hilbert space $\mathcal{H}$ is an RKHS if and only if it has a reproducing kernel.

Proof:
First, assume that $\mathcal{H}$ is an RKHS. Denote the dual space of $\mathcal{H}$ that consists of all continuous linear functionals from $\mathcal{H}$ to $\mathbb{C}$ by $\mathcal{H}^*$. Since $\mathcal{H}$ is an RKHS, $L_x$ belongs to $\mathcal{H}^*$ for any $x \in \mathcal{X}$. Therefore, from the Riesz representation theorem, any $L_x$ has a unique representation of the form
$$
L_x(f) = \langle f, k_x \rangle_\mathcal{H}
$$
where $k_x \in \mathcal{H}$ depends only on $x$. Now, define the bivariate function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{C}$ as $k(y, x) = k_x(y)$. By its definition, $k(y, x)$ gets the first property of a reproducing kernel. It also has the reproducing property since,
$$
f(x) = L_x(f) = \langle f(.), k(.,x) \rangle_\mathcal{H}\;,
$$
for all $x \in \mathcal{X}$ and $f \in \mathcal{H}$. Therefore, $k(y, x)$ is a reproducing kernel of $\mathcal{H}$.

To prove the converse, assume the existence of a reproducing kernel $k(y, x)$ for the Hilbert space $\mathcal{H}$. Recall the definition of a bounded linear operator from the previous post. Note that $L_x$ is a function from $\mathcal{H}$ to $\mathbb{C}$, the norm in the former induced by its inner product, while the norm in $\mathbb{C}$ is simply the absolute value.

Let $f \in \mathcal{H}$, then,

$|L_x(f)| = |f(x)| = |\langle f(.), k(.,x) \rangle_\mathcal{H}|$
$\le \|f(.)\|_\mathcal{H} \|k(.,x)\|_\mathcal{H}$ (Cauchy-Schwarz inequality)
$= \|f(.)\|_\mathcal{H} \langle k(.,x), k(.,x) \rangle_\mathcal{H}^{1/2}$
$= \|f\|_\mathcal{H}k(x, x)^{1/2} \quad$ (from the second property of a reproducing kernel).

Therefore, from the definition, $L_x$ is a bounded linear operator for all $x \in \mathcal{X}$ and hence $\mathcal{H}$ is an RKHS.

For a given RKHS, the reproducing kernel is unique. We refer the reader to [1] for the proof of this uniqueness.

It is known that a reproducing kernel is characterized by the property of positive definiteness. In the next article, we shall discuss positive and negative definite kernels in detail and state their connection with reproducing kernel Hilbert spaces.

References

[1] Aronszajn, N. Theory of Reproducing Kernels. Transactions of the American Mathematical Society,(1950).

Friday, September 27, 2013

A Tutorial Introduction to Reproducing Kernel Hilbert Spaces - Part II

In the present article, we continue the tutorial on Reproducing Kernel Hilbert Spaces (RKHS). The previous part of this tutorial can be found here.

We start this article with a quick summary of the various spaces we introduced in the previous post. This summary is only supposed to give you some intuitive ideas about these different spaces. For rigorous definitions and more explanations, please refer to the previous post.

Vector space - A set endowed with two special operations: addition and scalar multiplication.
Normed vector space - A vector space with length of vectors defined. This notion of vector lengths also allows us to measure distances between vectors.
Metric space - A set (nor necessarily a vector space) where the distance between two points in the set is defined. A normed vector space is a metric space. But the converse is not true.
Banach space - A complete normed vector space. A complete space does not have missing elements (all Cauchy sequences have limits).

When we introduced normed vector spaces, we introduced an important structure on vector spaces: distance. Now, we would like to introduce another structure on vector spaces which is inner product. Note that, inner product is different from scalar product that any vector space possesses. The scalar product of a vector space operates on a scalar and a vector, yielding a vector as the output. In contrast, an inner product works on two vectors, giving a scalar as the output. When an inner product is defined on a vector space, it is known as as an inner product space. We now proceed to the formal definition.

Inner Product Space

An inner product space, $(\mathcal{V}, \langle., .\rangle)$ is a vector space $\mathcal{V}$ over a field $\mathbb{F}$ with a binary operator $\langle., .\rangle : \mathcal{V}\times \mathcal{V} \to \mathbb{F}$, known as the inner product, that satisfies the following axioms for all vectors $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathcal{V}$ and all scalars $a \in \mathbb{F}$.

Conjugate symmetry: $\langle \mathbf{x}, \mathbf{y} \rangle = \overline{\langle \mathbf{y}, \mathbf{x} \rangle}$
Linearity in the first argument: $\langle a\mathbf{x}, \mathbf{y} \rangle = a \langle \mathbf{x}, \mathbf{y} \rangle$ and $\langle \mathbf{x} + \mathbf{z}, \mathbf{y} \rangle = \langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{z}, \mathbf{y} \rangle$
Positive definiteness: $\langle \mathbf{x}, \mathbf{x} \rangle \ge 0$ where the equality holds only when $\mathbf{x} = 0$.

The field $\mathbb{F}$ is either the field or real numbers ($\mathbb{R}$) or the field of complex numbers ($\mathbb{C}$). When $\mathbb{F}$ is the real numbers, axiom 1 implies symmetry since the conjugate of a real number is itself.

Exercise 1: The dot product of the $n$-dimensional Euclidean space $\mathbb{R}^n$ is defined as $\mathbf{x}.\mathbf{y} = \sum_{i = 1}^n x_i y_i$ where $x_i, y_i$, $i = 1,\ldots,n$, are the components of vectors $\mathbf{x}$, $\mathbf{y}$, respectively. Show that $\mathbb{R}^n$, endowed with the dot product, is an inner product space.

Exercise 2: Let $\mathcal{F}$ be the set of real valued, square-integrable functions. i.e.,
\[
\mathcal{F} = \{f | f: \mathbb{R} \to \mathbb{R}, \int_{-\infty}^\infty |f(x)|^2 dx < \infty \}.
\]
It is known that $\mathcal{F}$ forms an infinite dimensional vector space over the field $\mathbb{R}$ with the usual function addition and scalar multiplication. Show that $\mathcal{F}$ is also an inner product space where the inner product $\langle ., . \rangle: \mathcal{F} \times \mathcal{F} \to \mathbb{R}$ is defined for all $f, g \in \mathcal{F}$ as:

\[
\langle f, g \rangle = \int_{-\infty}^\infty f(x)g(x) dx < \infty.
\]
Note that in Exercise 2 above, vectors of the vector space $\mathcal{F}$ are actually functions. To help visualization of functions as vectors, one can assume that all functions in the vector space are sampled at $n$ fixed points $x_1, \ldots, x_n$ equally spaced in the domain of the functions (the real line in the above case). Then, a given function $f$, can be represented (not uniquely) by an $n$-vector $[f(x_1)\;\;f(x_2)\;\;\ldots\;\;f(x_n)]$ lying in the $n$-dimensional vector space $\mathbb{R}^n$. When $n \to \infty$, these functions form an infinite dimensional vector space.

The reader is encouraged to attempt the following simple exercise now.

Exercise 3: Let $(\mathcal{V}, \langle ., . \rangle_\mathcal{V})$ be an inner product space. Show, starting from the definitions of an inner product space and a normed vector space, that $(\mathcal{V}, \| . \|_\mathcal{V})$ where $\|\mathbf{x}\|_\mathcal{V} = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle_\mathcal{V}}$ for all $\mathbf{x} \in \mathcal{V}$, is a normed vector space.

As seen from the above exercise, when we have the inner product structure on a vector space, we can also define a norm on the space with the help of the inner product. This norm, known as the $l^2$ norm, is defined by:$\|\mathbf{x}\| = \sqrt{\langle x, x \rangle}$ where $\mathbf{x}$ is a vector in the inner product space with inner product $\langle ., . \rangle$. As a specific example, when the inner product space is $\mathbb{R}^n$ with the usual dot product as the inner product, for all $\mathbf{x} \in \mathbb{R}^n$, $\|\mathbf{x}\| = \sqrt{x.x} = \sum_{i = 1}^n x_i^2$.

Although an inner product induces a norm, not every norm is induced by an inner product. For example, the $l^1$ norm, defined on $\mathbb{R}^n$ by $\|\mathbf{x}\|_{1} = \sum_{i = 1}^n |x_i|$, is not induced by an inner product.

An inner product space is also known as a pre-Hilbert space, meaning that it is one step behind a Hilbert space, which is our next subtopic.

Hilbert Space

As discussed above, an inner product space has a norm induced by its inner product. However, a generic inner product space might not be complete with respect to this norm (recall the definition of completeness from the previous post). When this condition is satisfied, i.e., when an inner product space is complete with respect to the norm induced by its inner product, it is known as a Hilbert space. An example for a Hilbert space would be $\mathbb{R}^n$ with the usual dot product.

It is easy to see that a Hilbert space with its canonical norm (the norm induced by the inner product) is also a Banach space. Although, a Hilbert space does not necessarily have to be infinite dimensional, the term Hilbert space usually implies an infinite dimensional space such as a space of functions. An example for an infinite dimensional Hilbert space would be the space of square-integrable functions introduced in Exercise 2. It is known as the $L^2$ space.

Reproducing Kernel Hilbert Space

We have now reached the main topic of this tutorial. We will start the topic by defining some of the terms we will be using. Since these definitions are quite straightforward and easy to understand, we directly present them below without further discussion.

Linear operator (Linear map) - Let $f:\mathcal{V}\to \mathcal{W}$ be a function between two vector spaces $\mathcal{V}, \mathcal{W}$ over the same field $\mathbb{F}$. $f$ is called a linear operator if the following two conditions are satisfied for all $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ and $a \in \mathbb{F}$.
1. $f(\mathbf{u} + \mathbf{v}) = f(\mathbf{u}) + f(\mathbf{v})$
2. $f(a\mathbf{u}) = af(\mathbf{u})$

Bounded linear operator - Let $f:\mathcal{V}\to \mathcal{W}$ be a linear operator between two normed vector spaces $(\mathcal{V}, \|.\|_\mathcal{V})$ and $(\mathcal{W}, \|.\|_\mathcal{W})$. $f$ is called a bounded linear operator if there exists some $M > 0$ such that for all $\mathbf{u} \in \mathcal{V}$, $\|f(\mathbf{u})\|_{\mathcal{W}} \le M\|\mathbf{u}\|_{\mathcal{V}}$.

The next article will be dedicated to discussing Reproducing Kernel Hilbert Spaces (RKHS) and their properties.

Wednesday, November 21, 2012

A Tutorial Introduction to Reproducing Kernel Hilbert Spaces - Part I

In this post we will study different spaces encountered in Linear Algebra and Functional Analysis. Our final goal is to understand what is meant by Reproducing Kernel Hilbert Spaces or RKHS which have a wide range of applications in machine learning and related fields. We begin with vector spaces, which any engineering undergraduate student is familiar with.

Vector Space

In simple terms, a vector space is a set whose elements have two characteristic properties:

1) Two elements can be added to obtain an element in the same set.

2) A given element can be multiplied by a scalar to obtain an element in the same set.

Elements of such a set are dubbed vectors. Perhaps, the simplest example of a vector space is $\mathbb{R}^3$ with elements of $\mathbb{R}$ as scalars. Addition and multiplication in this vector space can be demonstrated as follows.

\[\text{Addition:}\hspace{1cm} \begin{bmatrix} 1 \\ 2 \\ 3 \\ \end{bmatrix} + \begin{bmatrix} 4 \\ 5 \\ 6 \\ \end{bmatrix} = \begin{bmatrix} 5 \\ 7 \\ 9 \\ \end{bmatrix} \hspace{3cm} \text{Scalar Multiplication:} \hspace{1cm} 0.2 \begin{bmatrix} 9 \\ 8 \\ 7 \\ \end{bmatrix} = \begin{bmatrix} 1.8 \\ 1.6 \\ 1.4 \\ \end{bmatrix} \]

Formally, a vector space over a field $\mathbb{F}$ is a set $\mathcal{V}$ with two binary operations: addition (denoted by $+$) and scalar multiplication (denoted by $.$ or by writing two objects close to each other), that satisfy number of axioms for all $a, b \in \mathbb{F}$ and $\mathbf{u}, \mathbf{v} \in \mathcal{V}$. We omit these axioms here for the sake of brevity. Virtually in all cases $\mathbb{F}$ is either the field of real numbers ($\mathbb{R}$) or complex numbers ($\mathbb{C}$).

As you may have already noted, we use simple letters ($a, b, $ etc.) to denote elements of $\mathbb{F}$, known as scalars and boldface letters ($\mathbf{u}, \mathbf{v}$, etc.) to denote elements of $\mathcal{V}$, known as vectors. Hereafter, we may use $\mathcal{V}$ to denote a vector space without explicit reference to the field or the two operations.

Vector space introduces a very rich structure for a set. However, to perform mathematical analysis on a vector space, we need additional tools on the vector space such as a vector lengths, distance between vectors and angle measures. Note that a plain vector space lacks all these ideas. We shall first consider the notion of length of a vector in a vector space.

Normed Vector Space

Simply put, a normed vector space is a vector space whose vectors have lengths. The length or the norm of a vector $\mathbf{v}$ is denoted by $\|\mathbf{v}\|$. The norm has to satisfy several axioms as formalized by the following definition of a normed vector space.

Definition (Normed Vector Space):

A normed vector space $(\mathcal{V}, \|.\|)$, is a vector space $\mathcal{V}$ over a field $\mathbb{F}$, endowed with a norm, $\|.\| : \mathcal{V} \to \mathbb{R}$ that satisfies following axioms for all $\mathbf{v} \in \mathcal{V}$ and $a \in \mathbb{F}$.

$\|\mathbf{v}\| \ge 0$.
$\|\mathbf{v}\| = 0$ if and only if $\mathbf{v} = \mathbf{0}$.
$\|a\mathbf{v}\| = |a| \|\mathbf{v}\|$.
$\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$.

Defining a norm on a vector space enables us to measure distance between vectors in the space. Distance between two vectors $\mathbf{u}, \mathbf{v} \in \mathcal{V}$ is equal to the the length of $\mathbf{u} - \mathbf{v}$ ¹. Formally, $d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|$ where $d(., .)$ is the distance function.

In fact, the distance between points can be defined on any nonempty set which is not necessarily a vector space. A set with a distance measure between its elements is known as a metric space. We now present the formal definition of a metric space.

Metric Space

Definition (Metric Space):

A metric space $(M, d)$ is a set $M$ along with a function $d : M \times M \to \mathbb{R}$ that satisfies following axioms for all $x, y, z \in M$.

$d(x, y) \ge 0$.
$d(x, y) = 0$ if and only if $x = y$.
$d(x, y) = d(y, x)$.
$d(x, z) \le d(x,y) + d(y, z)$.

The function $d$ that satisfies above axioms is known as the metric or the distance function. The last axiom (or property when it's satisfied by a metric), known as the triangle inequality, is perhaps the most interesting property of a metric.

Every normed vector space is a metric space. But the converse is not true; we don't require a vector space structure on a set to define a valid metric on it.

Banach Space

A Banach space is a complete normed vector space. What exactly is completeness? To define completeness, we first need to understand what is meant by a Cauchy sequence.

Cauchy Sequence

A Cauchy sequence is a sequence of elements from a set where nearby elements get increasingly close to each other (see Figure 1). For the above sentence to make sense, we need a way to measure the closeness (or the distance) between two elements of the set. Hence, Cauchy sequences are defined on metric spaces. Below we give the definition of a Cauchy sequence in a normed vector space and a more general metric space. The reader can easily see that the first definition is just a special case of the second.

Definition (Cauchy Sequence):

Normed Vector Space: In a normed vector space $(\mathcal{V}, \|.\|)$ a sequence of vectors $\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3,\ldots$ is called a Cauchy sequence if for all $\varepsilon > 0$, there exists an $N \in \mathbb{Z}^+$ such that, for all $n, m > N$, $\|\mathbf{x}_n - \mathbf{x}_m\| < \varepsilon$.

Metric Space: In a metric space $(M, d)$, a sequence $x_1, x_2, x_3, \ldots$ where $x_i \in M$ for $i = 1,2,3,\ldots$ is called a Cauchy sequence if for all $\varepsilon > 0$, there exists an $N \in \mathbb{Z}^+$ such that, for all $n, m > N$, $d(x_n, x_m) < \varepsilon$.

Intuitively, a Cauchy sequence is a sequence that converges or has a limit as $n$ goes to infinity. However, for some spaces and for some Cauchy sequences, this limit might not be an element of the space on which the sequence is defined. Therefore, such a limit does not exist, and the Cauchy sequence does not converge.

Figure 1. A Cauchy sequence on $\mathbb{R}$. As $n$ grows, $x_n$ s become arbitrary close to each other and to the limit of the sequence (1 in this case).

Completeness

A metric space is called complete if every Cauchy sequence defined on it is convergent, i.e., every Cauchy sequence has a limit which is a point in the same space. It's worth noting again that we must have a metric defined on the space to talk about Cauchy sequences and hence about completeness. Therefore, the completeness is always defined with respect to a metric.

Example

An example for a complete space is $\mathbb{R}^n$ with respect to the usual distance measure (Euclidean distance). Let's consider $\mathbb{R}$, to be more specific. Euclidean distance narrows down to absolute value in this case. Every Cauchy sequence defined on $\mathbb{R}$ with respect to this metric converges to an element of $\mathbb{R}$, hence $\mathbb{R}$ is complete.

Non-Example

Consider the set of rational numbers, $\mathbb{Q}$ with absolute value as the metric. The sequence defined recursively on $\mathbb{Q}$ as $x_1 = 1$, $x_n = x_n/2 + 1/x_n$ is Cauchy but does not have a limit in $\mathbb{Q}$. In fact, if the same sequence is defined on $\mathbb{R}$, its limit is identified as $\sqrt{2}$ which is not an element of $\mathbb{Q}$.

Now, coming back to the definition of a Banach space, we understand that a Banach space is a normed vector space where all Cauchy sequences converge (or equivalently, have limits). We can think of an incomplete space as a space with holes. A complete space, on the other hand, does not have any holes. Completing a space (something we will encounter shortly), is similar to filling holes in the space by adding missing elements to it.

In the next article we shall continue the tutorial to discuss inner product spaces, Hilbert spaces and Reproducing Kernel Hilbert Spaces.

1 Subtraction $-$ in a vector space is defined as $\mathbf{u} - \mathbf{v} := \mathbf{u} + (-\mathbf{v})$ where $-\mathbf{v}$ is the additive inverse of $\mathbf{v}$^↩