Gram-Schmidt Process

Nov. 01, 2023

Gram-Schmidt process

Gram-Schmidt process (or Gram-Schmidt algorithm) is a method for orthonormalizing a set of vectors in an inner product space, most commonly the Euclidean space $\mathbb{R}^n$ equipped with the standard inner product1. The Gram-Schmidt process takes a finite, linearly independent set of vectors \(S=\{\boldsymbol{v}_1,\cdots,\boldsymbol{v}_k\}\) for $k\le n$ (NOTE WELL HERE) and generates an orthogonal set \(S'=\{\boldsymbol{u}_1,\cdots,\boldsymbol{u}_k\}\) (and hence orthonormal set \(\{\boldsymbol{e}_1,\cdots,\boldsymbol{e}_k\}\)) that spans the same $k$-dimensional subspace of $\mathbb{R}^n$ as $S$. The whole Gram-Schmidt process shows as follows.

At first, the vector projection2 of a vector $\boldsymbol{v}$ on a nonzero vector $\boldsymbol{u}$ is defined as:

\[\mathrm{proj}_\boldsymbol{u}(\boldsymbol{v})=\dfrac{\langle\boldsymbol{v},\boldsymbol{u}\rangle}{\langle\boldsymbol{u},\boldsymbol{u}\rangle}\boldsymbol{u}\label{eq1}\]

where $\langle\boldsymbol{v},\boldsymbol{u}\rangle$ is the inner product of $\boldsymbol{u}$ and $\boldsymbol{v}$. Definition $\eqref{eq1}$ means that \(\mathrm{proj}_\boldsymbol{u}(\boldsymbol{v})\) is the orthogonal projection of $\boldsymbol{v}$ onto the line spanned by $\boldsymbol{u}$. If $\boldsymbol{u}$ is the zero vector, then \(\mathrm{proj}_\boldsymbol{u}(\boldsymbol{v})\) is defined as the zero vector.

Given $k$ vectors $\boldsymbol{v}_1,\cdots,\boldsymbol{v}_k$, the Gram-Schmidt process defines the vectors $\boldsymbol{u}_1,\cdots,\boldsymbol{u}_k$ as follows:

\[\begin{split} &\boldsymbol{u}_1=\boldsymbol{v}_1\\ &\boldsymbol{u}_2=\boldsymbol{v}_2-\mathrm{proj}_{\boldsymbol{u}_1}(\boldsymbol{\boldsymbol{v}_2})\\ &\boldsymbol{u}_3=\boldsymbol{v}_3-\mathrm{proj}_{\boldsymbol{u}_1}(\boldsymbol{\boldsymbol{v}_3})-\mathrm{proj}_{\boldsymbol{u}_2}(\boldsymbol{\boldsymbol{v}_3})\\ &\boldsymbol{u}_4=\boldsymbol{v}_4-\mathrm{proj}_{\boldsymbol{u}_1}(\boldsymbol{\boldsymbol{v}_4})-\mathrm{proj}_{\boldsymbol{u}_2}(\boldsymbol{\boldsymbol{v}_4})-\mathrm{proj}_{\boldsymbol{u}_3}(\boldsymbol{\boldsymbol{v}_4})\\ &\qquad\vdots\\ &\boldsymbol{u}_k=\boldsymbol{v}_k-\sum_{i=1}^{k-1}\mathrm{proj}_{\boldsymbol{u}_i}(\boldsymbol{\boldsymbol{v}_k})\\ \end{split}\]

and

\[\begin{split} &\boldsymbol{e}_1=\dfrac{\boldsymbol{u}_1}{\vert\vert\boldsymbol{u}_1\vert\vert}\\ &\boldsymbol{e}_2=\dfrac{\boldsymbol{u}_2}{\vert\vert\boldsymbol{u}_2\vert\vert}\\ &\boldsymbol{e}_3=\dfrac{\boldsymbol{u}_3}{\vert\vert\boldsymbol{u}_3\vert\vert}\\ &\boldsymbol{e}_4=\dfrac{\boldsymbol{u}_4}{\vert\vert\boldsymbol{u}_4\vert\vert}\\ &\quad\vdots\\ &\boldsymbol{e}_k=\dfrac{\boldsymbol{u}_k}{\vert\vert\boldsymbol{u}_k\vert\vert}\\ \end{split}\]

The vector sequence \(\boldsymbol{u}_1,\cdots,\boldsymbol{u}_k\) is the required system of orthogonal vectors, and is known as Gram-Schmidt orthogonalization. The normalised vector \(\boldsymbol{e}_1,\cdots,\boldsymbol{e}_k\) is from an orthonormal set, and known as Gram-Schmidt orthonormalization.

Generally, I feel that Gram-Schmidt process is a process of finding vector rejection2.


MATLAB algorithm implementation

Wikipedia also provides a MATLAB function gramschmidt [1: Algorithm] to conduct Gram–Schmidt algorithm:

1
2
3
4
5
6
7
8
9
10
11
12
function U = gramschmidt(V)
[n,k] = size(V);
U = zeros(n,k);
U(:,1) = V(:,1)/norm(V(:,1));
for i = 2:k
    U(:,i) = V(:,i);
    for j = 1:i-1
        U(:,i) = U(:,i)-(U(:,j)'*U(:,i))*U(:,j);
    end
    U(:,i) = U(:,i)/norm(U(:,i));
end
end

It view as each column of input variable V as $\boldsymbol{v}_i$, and each column of output variable U corresponds to orthonormalized $\boldsymbol{e}_i$. In the following text, I would use this function to analyse Gram-Schmidt process.

Case $n=k$

If $k$, the number of given vector set \(S=\{\boldsymbol{v}_1,\cdots,\boldsymbol{v}_k\}\), is equal to the dimension of vector space $\mathbb{R}^n$, i.e., $k=n$, Gram-Schmidt process will find a complete orthonormal basis set. For example, for a random vector set \(S=\{\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3\}\in\mathbb{R}^3\):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
clc,clear,close all

rng("default")

n = 3;
k = 3;
A = rand(n,k);
U = gramschmidt(A);

idxCombinations = nchoosek(1:k,2);
for i = 1:height(idxCombinations)
    idx = idxCombinations(i,:);
    fprintf("<e%s,e%s>: %.4f\n", ...
        num2str(idx(1)),num2str(idx(2)),dot(U(:,idx(1)),U(:,idx(2))));
end

figure("Position",[817.67,580.33,973.33,486.67])
hold(gca,"on"),grid(gca,"on"),box(gca,"on")
view(26.499999959417345,41.611284029194252)
markerSize = 20;
colors = [
    0.2314,0.2863,0.5725;
    0.9333,0,0;
    0,0.5451,0.2706;
    0.3882,0.0941,0.4745;
    0,0.5098,0.5020];

for i = 1:width(U)
    p1 = plot3([0,A(1,i)],[0,A(2,i)],[0,A(3,i)], ...
        "LineWidth",1.5,"Color",colors(i,:),"Marker",".","MarkerSize",markerSize, ...
        "DisplayName",sprintf("Random vector, v%s",num2str(i)));
    p2 = plot3([0,U(1,i)],[0,U(2,i)],[0,U(3,i)], ...
        "LineWidth",1.5,"Color",colors(i,:),"Marker",".","MarkerSize",markerSize, ...
        "DisplayName",sprintf("Gram-Schmidt orthonormalization, e%s",num2str(i)), ...
        "LineStyle","--");
end
legend("Location","westoutside")

function U = gramschmidt(V)
[n,k] = size(V);
U = zeros(n,k);
U(:,1) = V(:,1)/norm(V(:,1));
for i = 2:k
    U(:,i) = V(:,i);
    for j = 1:i-1
        U(:,i) = U(:,i)-(U(:,j)'*U(:,i))*U(:,j);
    end
    U(:,i) = U(:,i)/norm(U(:,i));
end
end
1
2
3
4
U =
    0.6651    0.7463   -0.0256
    0.7395   -0.6631   -0.1162
    0.1037   -0.0583    0.9929

The \(S=\{\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3\}\) and \(S'=\{\boldsymbol{e}_1,\boldsymbol{e}_2,\boldsymbol{e}_3\}\) shows as follows:

image-20231101212229019

It shows the orthonormal relation between $\boldsymbol{e}_1$, $\boldsymbol{e}_2$, and $\boldsymbol{e}_3$, and we could verify this point from the perspective of the results of dot product:

1
2
3
<e1,e2>: -0.0000
<e1,e3>: -0.0000
<e2,e3>: 0.0000

Case $n>k$

If $n>k$, Gram-Schmidt process will find an incomplete orthonormal basis set in the subspace of $\mathbb{R}^n$ where \(\boldsymbol{v}_1,\cdots, \boldsymbol{v}_k\) span. For example, for $n=3$ and $k=2$:

1
2
3
4
5
6
7
rng("default")

n = 3;
k = 2;
A = rand(n,k);
U = gramschmidt(A);
...
1
2
3
4
U =
    0.6651    0.7463
    0.7395   -0.6631
    0.1037   -0.0583

image-20231101213205656

1
<e1,e2>: -0.0000

Case $n<k$

At the case of $n>k$, Gram-Schmidt process will find a complete orthonormal basis set in $\mathbb{R}^n$, and besides that, $k-n$ redundant vectors will be generated as well. For example, for $n=3$ and $k=5$:

1
2
3
4
5
6
7
rng("default")

n = 3;
k = 5;
A = rand(n,k);
U = gramschmidt(A);
...
1
2
3
4
U =
    0.6651    0.7463   -0.0256    0.6757    0.0057
    0.7395   -0.6631   -0.1162    0.7372   -0.0052
    0.1037   -0.0583    0.9929         0    1.0000

image-20231101213621375

We could find that, \(S'=\{\boldsymbol{e}_1,\boldsymbol{e}_2,\boldsymbol{e}_3\}\) is actually the complete orthonormal basis set what we want to get, and $\boldsymbol{e}_4$ and $\boldsymbol{e}_5$ are redundant:

1
2
3
4
5
6
7
8
9
10
<e1,e2>: -0.0000
<e1,e3>: -0.0000
<e1,e4>: 0.9946
<e1,e5>: 0.1036
<e2,e3>: 0.0000
<e2,e4>: 0.0155
<e2,e5>: -0.0507
<e3,e4>: -0.1029
<e3,e5>: 0.9933
<e4,e5>: 0.0000

Actually, Fig. 1 shows the results of Gram-Schmidt process for ONLY three vectors of \(S=\{\boldsymbol{v}_1,\boldsymbol{v}_2,\boldsymbol{v}_3\}\). We could verity this point:

1
2
3
4
5
6
7
8
9
rng("default")

n = 3;
k = 5;
A = rand(n,k);
k = 3;
A = A(:,1:k);
U = gramschmidt(A);
...
1
2
3
4
U =
    0.6651    0.7463   -0.0256
    0.7395   -0.6631   -0.1162
    0.1037   -0.0583    0.9929

image-20231101214330722

If we choose another three vectors, like \(S=\{\boldsymbol{v}_2,\boldsymbol{v}_3,\boldsymbol{v}_5\}\), Gram-Schmidt process will give another $S’$:

1
2
3
4
5
6
7
8
9
rng("default")

n = 3;
k = 5;
A = rand(n,k);
A = A(:,[2,3,5]);
k = 3;
U = gramschmidt(A);
...
1
2
3
4
U =
    0.8190   -0.2425    0.5200
    0.5670    0.2040   -0.7980
    0.0875    0.9485    0.3046

image-20231101214930097


References