You are on page 1of 75

MAJORIZATION AND THE SCHUR-HORN THEOREM

A Thesis
Submitted to the Faculty of Graduate Studies and Research
In Partial Fulfillment of the Requirements
for the Degree of
Master of Science
In
Mathematics
University of Regina

By
Maram Albayyadhi
Regina, Saskatchewan
January 2013


c Copyright 2013: Maram Albayyadhi
UNIVERSITY OF REGINA

FACULTY OF GRADUATE STUDIES AND RESEARCH

SUPERVISORY AND EXAMINING COMMITTEE

Maram Albayyadhi, candidate for the degree of Master of Science in Mathematics, has
presented a thesis titled, Majorization and the Schur-Horn Theorem, in an oral
examination held on December 18, 2012. The following committee members have found
the thesis acceptable in form and content, and that the candidate demonstrated
satisfactory knowledge of the subject material.

External Examiner: Dr. Daryl Hepting, Department of Computer Science

Supervisor: Dr. Martin Argerami,


Department of Mathematics and Statistics

Committee Member: Dr. Douglas Farenick,


Department of Mathematics and Statistics

Chair of Defense: Dr. Sandra Zilles, Department of Computer Science


Abstract

We study majorization in Rn and some of its properties. The concept of ma-


jorization plays an important role in matrix analysis by producing several useful re-
lationships. We find out that there is a strong relationship between majorization and
doubly stochastic matrices; this relation has been perfectly described in Birkhoff’s
Theorem. On the other hand, majorization characterizes the connection between
the eigenvalues and the diagonal elements of self adjoint matrices. This relation is
summarized in the Schur-Horn Theorem. Using this theorem, we prove versions of
Kadison’s Carpenter’s Theorem. We discuss A. Neumann’s extension of the concept
of majorization to infinite dimension to that provides a Schur-Horn Theorem in this
context. Finally, we detail the work of W. Arveson and R.V. Kadison in proving a
strict Schur-Horn Theorem for positive trace-class operators.

i
Acknowledgments

Throughout my studying, I could have not done my project without the support
of my professors whom I insist on thanking even though my words cannot adequately
express my gratitude. Dr. Martin Argerami, I would like to thank you from the
bottom of my heart for all the support and the guidance that you provided for me.
Dr. Shaun Fallat and Dr. Remus Floricel, if it were not for your classes, I would
not have learned as much about my field. I’m also honored to thank all my amazing
colleagues in the math department, especially my friend Angshuman Bhattacharya.
To my father Ibrahim Albayyadhi and my Mother Suad Bakkari, words cannot express
my love for you. Your prayers, belief in me and encouragement are the main reasons
for my success. If I kept on thanking you all of my life, I could not pay you back.
To my husband, Dr. Hadi Mufti, you always make it easier for me whenever I face
obstacles; you have always been the wind beneath my wings. Finally, I would like
to thank the one who kept wiping my tears on the hard days and saying, “Mom . . .
don’t give up” – my son Yazan.

ii
Contents

Abstract i

Acknowledgments ii

Table of Contents iii

1 Preliminaries 1
1.1 Majorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Doubly Stochastic Matrices . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Doubly Stochastic Matrices and Majorization . . . . . . . . . . . . . 6

2 The Schur-Horn Theorem in the Finite Dimensional Case 14


2.1 The Pythagorean Theorem in Finite Dimension . . . . . . . . . . . . 14
2.2 The Schur-Horn Theorem in the Finite Dimensional Case . . . . . . . 20
2.3 A Pythagorean Theorem for Finite Doubly Stochastic Matrices . . . . 24

3 The Carpenter Theorem in the Infinite Dimensional Case 27


3.1 The Subspaces K and K ⊥ both have Infinite Dimension . . . . . . . 28
3.2 One of the Subspaces K, K ⊥ has Finite Dimension . . . . . . . . . . 45
3.3 A Pythagorean Theorem for Infinite Doubly Stochastic Matrices . . 47

iii
4 A Schur-Horn Theorem in Infinite Dimensional Case 52
4.1 Majorization in Infinite Dimension . . . . . . . . . . . . . . . . . . . 52
4.2 Neumann’s Schur-Horn Theorem . . . . . . . . . . . . . . . . . . . . 55
4.3 A Strict Schur-Horn Theorem for Positive Trace-Class Operators . . . 57

5 Conclusion and Future Work 66

iv
Chapter 1

Preliminaries

In this chapter we provide some basic information about majorization and some
of it properties, which we will use later. The material in this chapter is basic and can
be found in many matrix analysis books [4, 7].

1.1 Majorization

Let x = (x1 , · · · , xn ) ∈ Rn . Let x↑ = (x↑1 , · · · , x↑n ) and x↓ = (x↓1 , · · · , x↓n ) denote


the vector x with it’s coordinates rearranged in increasing and decreasing orders
respectively. Then
x↑i = x↓n−i+1 , 1 ≤ i ≤ n.

Definition 1.1. Given x, y in Rn , we say x is majorized by y, denoted by x ≺ y,

1
if

k
X k
X
x↓i ≤ yi↓ , 1 ≤ k ≤ n (1.1)
i=1 i=1
and
n
X n
X
x↓i = yi↓ . (1.2)
i=1 i=1

Pn
Example 1.2. If xi ∈ [0, 1], and i=1 xi = 1, then we have

1 1
( , · · · . ) ≺ (x1 , · · · , xn ) ≺ (1, 0, · · · , 0).
n n

Proposition 1.3. If x ≺ y and y ≺ x, then there exists a permutation matrix P such


that y = Px.

Proof. Assume first that x, y are rearranged in decreasing order,i.e. x1 ≥ x2 ≥ · · · ≥


xn , y1 ≥ y2 · · · ≥ yn . We proceed by induction on k. For k = 1 we get x1 ≤ y1 , and
y1 ≤ x1 from the majorization equation (1.1), which implies x1 = y1 . By induction
hypothesis we assume that xi = yi , for i = 1, · · · , k and we will show that it works
up to k + 1. From ( 1.1) x1 + x2 + · · · + xk + xk+1 ≤ y1 + y2 + · · · + yk + yk+1 , and
by our assumption x1 + x2 + · · · + xk = y1 + y2 + · · · + yk , so if we cancel them we
get xk+1 ≤ yk+1 , as the roles of x and y can be reversed yk+1 ≤ xk+1 , and this implies
xk+1 = yk+1 .
In general, if we write x↓ and y ↓ for the non-increasing rearrangements, there exist
permutations P1 and P2 such that x↓ = P1 x, y ↓ = P2 y. By the first part of the proof,
x↓ = y ↓ , i.e. x = P−1
1 P2 y.

Although majorization is defined through non-increasing rearrangements, it can

2
also be done by non-decreasing ones:

Proposition 1.4. If x, y ∈ Rn , then x ≺ y if and only if

k
X k
X
x↑i ≥ yi↑ , 1 ≤ k ≤ n (1.3)
i=1 i=1
and
n
X n
X
x↑i = yi↑ (1.4)
i=1 i=1

Proof. For equation(1.4), we have ni=1 x↓i = ni=1 yi↓ , since x ≺ y, but
P P Pn ↑
i=1 xi =
Pn ↓ Pn ↑ Pn ↓ Pn ↓ Pn ↑ Pn ↑ Pn ↑
i=1 xi , so i=1 xi = i=1 xi = i=1 yi = i=1 yi , and i=1 xi = i=1 yi .

And for equation (1.3), we know that

x↑i = x↓n−i+1 , 1 ≤ i ≤ n.

So
k
X k
X
x↑i = x↓n−i+1 .
i=1 i=1

Then

k
X n
X n
X n−k
X
x↑i = x↓l = xl − x↓l
i=1 l=n−k+1 l=1 l=1
n−k
X
= tr (x) − x↓l
l=1
n−k
X
≥ tr (y) − yl↓
l=1
n
X n−k
X n
X k
X
= yl − yl↓ = yl↓ = yi↑ .
l=1 l=1 l=n−k+1 i=1

3
1.2 Doubly Stochastic Matrices

There is a deep relation between majorization and doubly stochastic matrices; we


will discuss that relation in this section.

Definition 1.5. Let B = (bij ) be a square matrix. We say B is doubly stochastic


if :

• bij ≥ 0 ∀i, j,

Pn
• i=1 bij =1 ∀j,

Pn
• j=1 bij =1 ∀i.

Proposition 1.6. The set of square doubly stochastic matrices is a convex set and it
is closed under multiplication and the adjoint operation. But it is not a group.

Proof. If t1 , · · · , tr ∈ [0, 1] with rj=1 tj = 1 and P1 , · · · , Pr are permutations, let


P

A = rj=1 tj Pj . Clearly every entry of A is non-negative. Also,


P

n
X n X
X r
Akl = tj (Pj )kl
k=1 k=1 j=1
Xr X n
= tj (Pj )kl
j=1 k=1
Xr
= tj = 1, ∀l.
j=1

Pn
A similar computation shows that l=1 Akl = 1 for all k.
If A, B are doubly stochastic matrices then AB is also doubly stochastic. Indeed,

4
the sum over the rows is

n
X n X
X n
(AB)ij = Aik Bkj
j=1 j=1 k=1
Xn n
X
= Aik Bkj
k=1 j=1
Xn
= Aik = 1.
k=1

And the same thing can be done for the columns, which shows that AB is also
doubly stochastic. Also it is clear that if A doubly stochastic, then so is its adjoint
A∗ .
The class of doubly stochastic matrices is not a group since not every doubly
stochastic matrix is invertible; for example if we take the 2 × 2 matrix, with all its
entries equal to 21 , this matrix is doubly stochastic and its determinant is zero.

Proposition 1.7. Every permutation matrix is doubly stochastic and is an extreme


point of the convex set of all doubly stochastic matrices.

Proof. A permutation matrix has exactly one entry +1 in each row and in each column
and all other entries are zero. So it is doubly stochastic.
Now let A = α1 B + α2 C, with A a permutation matrix such that α1 , α2 ∈ (0, 1),
α1 + α2 = 1, and B, C are doubly stochastic matrices. Then every entry of B, C
that corresponds to the zero element aij = 0 of A must be zero; indeed, 0 = aij =
α1 bij + α2 cij . As α1 , α2 both are non zero, and B, C are both nonnegative, we have
bij = cij = 0. Hence, nonzero entries must be all +1, A = B = C. This shows
that every permutation matrix is an extreme point of the set of doubly stochastic
matrices.

5
1.3 Doubly Stochastic Matrices and Majorization

Theorem 1.8. A matrix A ∈ Mn (R) is doubly stochastic if and only if Ax ≺ x, for


all x ∈ Rn .

Proof. For the implication (⇐), assume Ax ≺ x for all vectors x. Then

n
X n
X
Ax = x.
i=1 i=1

From the definition of majorization. If we choose x to be ej , where ej is the vector


ej = (0, · · · , 0, 1, 0, · · · , 0), 1 ≤ j ≤ n, then

       
 a11 · · · a1n  0  a1j  0
 . ..     .   
 ..    .   
  1 =  .  ≺ 1 .
.  ×
       
an1 · · · ann 0 anj 0

Then min{a1j , · · · , anj } ≥ min{1, 0} = 0, so akj ≥ 0 for all k. Also this implies,
as j was arbitrary, that the sum over each column is 1. To show the sum over the
rows is also 1 we use the vector e = nj=1 ej ; we get
P

       
Pn
 a11 · · · a1n  1  j=1 a1,j  1
 . ..  .  ..  .
 .. .   ≺  ..  .
  × . = 
.  .   
    P   
n
an1 · · · ann 1 j=1 an,j 1

Pn Pn
aij : j} ≤ 1, min{ nj=1 aij ; j} ≥ 1.
P
Then j=1 aij = 1 for all i, since max{ j=1

For the other direction (⇒), let A be doubly stochastic, and let y = Ax. To prove

6
y ≺ x, we first show that we can assume x and y have their entries in non-increasing
order; this because x = Px↓ , and y = Qy ↓ for some permutation matrices P and Q,
so

Qy ↓ = APx↓

y ↓ = Q−1 APx↓ ,

y ↓ = Bx↓ .

where Q−1 AP = B is doubly stochastic, since the permutation matrices are doubly
stochastic, and the product of doubly stochastic matrices is doubly stochastic by
Proposition 1.6.
For any k ∈ {1, · · · , n} we have

k
X k X
X n
yj = bji xi . (1.5)
j=1 j=1 i=1

Pk Pn
Let si = j=1 bji , then 0 ≤ si ≤ 1, i=1 si = k and

k
X n
X
yj = s i xi .
j=1 i=1

Then
k
X k
X n
X k
X
yj − xj = s i xi − xi . (1.6)
j=1 j=1 i=1 i=1

7
Pn
By adding and subtracting i=1 si xk from equation (1.6),

k
X k
X n
X k
X n
X n
X
yj − xj = s i xi − xi + ( si − si )xk
j=1 j=1 i=1 i=1 i=1 i=1
n
X k
X n
X n
X
= s i xi − xi + (k − si )xk , since k = si
i=1 i=1 i=1 i=1
Xk Xn k
X k
X n
X
= s i xi + s i xi − xi + xk − s i xk
i=1 i=k+1 i=1 i=1 i=1
k
X n
X k
X k
X k
X n
X
= s i xi + s i xi − xi + xk − s i xk − si x k
i=1 i=k+1 i=1 i=1 i=1 i=k+1
k
X n
X k
X
= − (1 − si )xi + (xi − xk )si + (1 − si )xk
i=1 i=k+1 i=1
k
X k
X
= (si − 1)(xi − xk ) + (xi − xk )si
i=1 i=k+1
≤ 0.

Pk Pk
So j=1 yj ≤ j=1 xj for all k. When k = n,

n
X n X
X n
yj = bji xj
j=1 j=1 i=1
X X
= ( bi1 )x1 + · · · + ( bin )xn
i i
n
X
= xj .
j=1

Definition 1.9. Let A ∈ Rn be a linear map. We say that A is a T-transform if


there exists a ∈ [0, 1] and j, k such that

Ay = (y1 , · · · , yj−1 , ayj + (1 − a)yk , yj+1 , · · · , (1 − a)yj + ayk , yk+1 , · · · , yn ),

8
i.e. A = aI + (1 − a)P, where P is the transposition (jk).

Theorem 1.10. Given x, y ∈ Rn , the following statements are equivalent:

1. x ≺ y.

2. x = T y, where T is a product of T -transforms.

3. x ∈ conv Sy = conv{Py : P is a permutation}.

4. x = Ay , where A is doubly stochastic matrix.

Proof. (1) =⇒ (2):


We want to show that if x ≺ y, then x = (Tr · · · T1 )y for some T-transforms
T1 , · · · , Tr . We will show this is true for any n by induction. Let x, y ∈ Rn . We can
assume that x, y have their coordinates in decreasing order by permuting them, and
each of these permutations is a product of T -transforms (because transpositions are
T - transforms, and permutations are products of transpositions). So when x ≺ y, we
have yn ≤ x1 ≤ y1 . If we take k ≤ n such that yk ≤ x1 ≤ yk−1 , then x1 = ty1 +(1−t)yk
for some t ∈ [0, 1], and we define T1 y = (ty1 + (1 − t)yk , y2 , · · · , yk−1 , (1 − t)y1 +
tyk , yk+1 , · · · , yn ). Let

x0 = (x2 , · · · , xn ), y 0 = (y2 , · · · , yk−1 , (1 − t)y1 + tyk , yk+1 , · · · , yn ) ∈ Rn−1 .

Note that the first coordinate of T1 y is x1 . If we take off x1 from x and T1 y, then
x = x0 and T1 y = y 0 . We will show that x0 ≺ y 0 . For m such that 2 ≤ m ≤ k − 1,

m
X m
X m
X
yj ≥ x1 ≥ xj .
j=2 j=2 j=2

9
And for k ≤ m ≤ n we have

m−1
X k−1
X m
X
yj0 = yj + [(1 − t)y1 + tyk ] + yj
j=1 j=2 j=k+1
k−1
X m
X
= yj + y1 − ty1 + tyk + yj
j=2 j=k+1
Xm
= yj − ty1 − (1 − t)yk , by adding and subtracting yk
j=1
m
X
= yj − x1
j=1
m
X m−1
X
≥ xj − x1 = x0j
j=1 j=1

The last inequality is an equality when m = n since x ≺ y and so x0 ≺ y 0 . By


induction hypothesis there exist a finite number of T-transforms T2 , · · · , Tr on Rn−1
such that x0 = (Tr , · · · , T2 )y 0 . We can regard each of them as T-transform on Rn if
we don’t touch the first coordinate of any vector. Then we will have

(Tr · · · T1 )y = (Tr · · · T2 )(x1 , y 0 ) = (x, x0 ) = x.

(2)=⇒(3):
It is clear that each T -transform is a convex combination of permutations. Now
we have to show that a product of two convex combinations of permutations is a
convex combination of permutations. For Pj Qj where each of them is a permutation;
tj sj ≥ 0, ∀j, k. We have

l
! m
! l X
m
X X X
tj Pj sk Qk = tj sk Pj Qk .
j=1 k=1 j=1 k=1

10
Where
l X
X m l
X m
X l
X
t j sk = tj sk = tj = 1.
j=1 k=1 j=1 k=1 j=1

(3) =⇒(4):
Is trivial, since we have x ∈ conv(Py), and from Proposition 1.6 we know that a
convex combination of permutation matrices is doubly stochastic.
(4)=⇒(1):
This is Theorem 1.8.

Definition 1.11. If B is a square matrix, and P is a permutation, then we call the


set { b1P(1) , b2P(2) , · · · , bnP(n) } a diagonal of B.

Each element in the diagonal of B has one entry from each column and each row.

Theorem 1.12. (The König- Frobenius Theorem)


Given a square matrix B, then every diagonal of B contains a zero element if and
only if B has an i × j submatrix with all entries zero for some i, j such that i + j > n.

Proof. This is equivalent to Hall’s Theorem.

Theorem 1.13. (Birkhoff ’s Theorem)


The set of n × n doubly stochastic matrices is a convex set whose extreme points are
the permutation matrices.

Proof. To prove this we have to show two things. First, that a convex combination of
doubly stochastic matrices is doubly stochastic; this was proven in Proposition 1.7.
Second we have to show that every extreme point is a permutation matrix, and
for this we will show that each doubly stochastic matrix is a convex combination of a
permutation matrix. This can be proved by induction on the number of nonnegative

11
entries of the matrix. When A has n positive entries, if A is doubly stochastic, then
A is a permutation matrix.
Let A be doubly stochastic. Then A has at least one diagonal with no zero entry;
indeed, let [0k×l ] be a submatrix of zeros that A might have. In such case we can find
permutation matrices Q1 , Q2 such that
 
 0 B
Q1 AQ2 =   ; 0 is a k × l submatrix with all entries zero.
C D

Q1 AQ2 is doubly stochastic which means the sum of the rows of B is 1 and the sum
of the columns of C is 1. i.e.

n−l
X
bhi = 1, ∀h = 1, · · · , k,
i=1

and
n−k
X
cih = 1, ∀h = 1, · · · , l.
i=1

Also, looking at the rows and columns that intersect D,

l
X n−l
X k
X n−k
X
chi + dhi = 1, bhi + dhi = 1.
i=1 i=1 h=1 h=1

Let
k X
X n−l l X
X n−k
k= bhi , and l = cih .
i=1 i=1 i=1 i=1

12
Then

k X
X n−l n−l X
X k
k= bji = bji
j=1 i=1 i=1 j=1
n−l n−k
!
X X
= 1− dji
i=1 j=1
= n − l − d, where d is a positive number.

Hence , k + l ≤ n. So by The König- Frobenius Theorem 1.12 at least one diagonal


of A must have all its entries positive.
Now suppose A is a doubly stochastic matrix with n + k non-zero entries. By
the previous paragraph A has a “never zero” diagonal, given by some permutation
Q. Let a be the minimum element of this diagonal. Clearly a < 1, because otherwise
A would have a 1 in each row and column, making it a permutation, with only n
non-zero entries. Let
A − aQ
B= .
1−a

Then B is doubly stochastic, and the entry in B corresponding to the location of a


is zero, so B has at most n + k − 1 non-zero entries. By induction hypothesis B is
a convex combination of permutation matrices. Since A = (1 − a)B + aQ, it is clear
that A is a convex combination of a permutation matrices too.

13
Chapter 2

The Schur-Horn Theorem in the Finite


Dimensional Case

In this chapter we study some variants of the Pythagorean Theorem. The Py-
thagorean Theorem plays an important role in describing the relation between the
three sides of the right triangle in Euclidean geometry. Among the variations of the
Pythagorean Theorem that we will consider, some are trivial while others are not.
We will find that these can be solved by using the Schur-Horn Theorem.

2.1 The Pythagorean Theorem in Finite Dimension

In the following we will represent the Pythagorean Theorem (PT) in different


dimensions, beginning with the classical variant. Also we will formulate the converse
of the Pythagorean Theorem, which we call the Carpenter Theorem (CT)[8].

Theorem 2.1. (PT-1)


If we have right triangle, such that its two sides are x,y and the angle between them
is θ = π2 , then x2 + y 2 = z 2 .

14
Although less known, the converse of (PT-1) holds. We call this the Carpenter
Theorem (CT).

Theorem 2.2. (CT-1)


If we have a triangle with sides x, y, z, such that x2 + y 2 = z 2 , then θ = π2 , i.e. we
have a right triangle.

Let {e1 , e2 } be an orthonormal basis for R2 . Then for x ∈ R2 , we can write x as


linear combination of {e1 , e2 }, and in this case we can re-write Theorem 2.1 as

Theorem 2.3. (PT-2) If x = t1 e1 + t2 e2 , and kxk = 1, then |t1 |2 + |t2 |2 = 1.

Proof. Since the norm of x is one we have

1 = kxk2 = kt1 e1 k2 + kt2 e2 k2

= |t1 |2 ke1 k2 + |t2 |2 ke2 k2

= |t1 |2 + |t2 |2 .

Note that (PT-2) is Parseval’s equality, which says that if {ej : j ∈ J} is an


orthonormal basis in H, then for every x ∈ H the following equality holds:

X
kxk2 = |hx, ej i|2 .
j∈J

In what follows, we denote by PK x the orthogonal projection of x onto the sub-


space K.

Theorem 2.4. (CT-2):


If t1 , t2 ∈ R+ , and t21 + t22 = 1, then there exists x ∈ R2 such that kxk = 1 and
kPRe1 xk = t1 , kPRe2 xk = t2 .

15
Proof. Let x = t1 e1 + t2 e2 . Then

kxk2 = kt1 e1 + t2 e2 k2

= kt1 e1 k2 + kt2 e2 k2

= t21 + t22 = 1.

As PRe1 x = hx, e1 ie1 ,

2
X
2 2
kPRe1 xk = |hx, e1 i| , where x = ti ei
i=1
X2
= |h ti ei , e1 i|2
i=1
= t21 .

And the same thing for kPRe2 xk2 = t22 .

Since kxk = ke1 k = 1, then kPRe1 xk = |hx, e1 i|2 = |he1 , xi|2 = kPRx e1 k. From this
point of view, we can rephrase the (PT-2) and (CT-2) as,

Theorem 2.5. (PT-3):


If K is a one-dimensional subspace of R2 , then kPK e1 k2 + kPK e2 k2 = 1.

Theorem 2.6. (CT-3):


If t1 , t2 ∈ R+ , and t1 + t2 = 1, then there exists one-dimensional K ⊂ R2 , such that
kPK e1 k2 = t1 , kPK e2 k2 = t2 .

Next we see that the same results hold in Rn :

Theorem 2.7. (PT-4):

16
If K is a one-dimensional subspace of Rn , and {ej }nj=1 an orthonormal basis, then
Pn 2
j=1 kPK ej k = 1.

tj ej to be a unit vector for K ⊂ Rn , then it spans K and


P
Proof. We choose x =

kPK ei k2 = khei , xixk2

= |hei , xi|2 kxk2

= |hei , xi|2 .

kPK ej k2 = |hej , xi|2 = kxk2 by Parseval’s equality.


P P
Then j j

Theorem 2.8. (CT-4):


Pn
If t1 , · · · , tn ∈ [0, 1], and j=1 tj = 1, then there exists a one-dimensional subspace
K ⊂ Rn such that kPK ej k2 = tj , j = 1, · · · , n.

Pn 1
Proof. Let x = j=1 tj ej and put K = Cx. Then
2

PK ei = hei , xix
n
X 1
= hei , tj2 ej ix
j=1
n
X 1
= tj2 hei , ej ix
j=1
1
= ti2 x.

1
So kPK ei k2 = kti2 xk2 = ti .

In the following we are going to generalize the Pythagorean Theorem in Rn , by


allowing K to have different dimensions.

17
Theorem 2.9. (PT-5):
Pn
If K is an m-dimensional subspace of Rn , then i=1 kPK ei k2 = m.

Proof. If we choose f1 , · · · , fm to be an orthonormal basis for K ⊂ Rn , then the


projection of ei onto K is
m
X
P K ei = hei , fj ifj .
j=1

So

n
X n
X m
X
2
kPK ei k = k hei , fj ifj k2
i=1 i=1 j=1
n
XX m
= |hei , fj i|2
i=1 j=1
Xm X n
= |hei , fj i|2
j=1 i=1
Xm m
X
2
= kfj k = 1 = m.
j=1 j=1

The converse of (PT-5) would be

Theorem 2.10. (CT-5):


If {ti }ni=1 ⊂ [0, 1], and ni=1 ti = m, then there exists an m-dimensional subspace K
P

of Rn , such that kPK ei k2 = ti , i = 1, · · · , n.

Suddenly, its is not so obvious how to construct K. So first we will attempt to


reformulate the theorem.
If K ⊂ Rn , PK is the orthogonal projection of Rn on K, and e1 , · · · , en is an

18
orthonormal basis with (tij ) the matrix of PK , then

kPK ej k2 = hPK ej , PK ej i

= hPK ej , ej i

= tjj ,

Pn Pn
since PK = PK2 = PK∗ . Then i=1k kPK ei k2 = i=1 ti . Which we can write as
Pn
i=1 kPK ei k2 = tr (PK ). With this in mind, we can rewrite (PT,CT-5) as:

Theorem 2.11. (PT-6):


If K is an m-dimensional subspace of Rn , then tr (PK ) = m.

Theorem 2.12. (CT-6):


Pn n
If t1 , · · · , tn ∈ [0, 1], and j=1 tj = m, then there is K ⊂ R , such that the diagonal

of PK is (t1 , . . . , tn ).

This formulation of (CT-6) makes it clear that its proof is not going to be trivial
as the previous results of (PT-CT). If we have these numbers t1 , · · · , tn ∈ [0, 1],
such that their sum is m ∈ N and we want to look for K ∈ Rn with kPK ei k2 = ti
for all i; in short, we want to form a matrix of PK with diagonal of ti , such that
PK = PK∗ = PK2 . It is not obvious that such a thing is even possible. If we try to find
n(n+1) n(n−1)
a projection in that way we will get 2
equations with 2
variables, as we see
in the next example.

Example 2.13. Take PK to be a 2 × 2 matrix such that PK = PK∗ = PK2 , and such

19
that the diagonal of PK is (t, 1 − t) for a fixed t ∈ [0, 1]. So

   
2 2
t x  2 t + x x
PK =   , PK =  .

2 2
x 1−t x x + (1 − t)

As these two should be equal, we get 2 equations t = t2 + x, 1 − t = x2 + (1 − t)2



with the single unknown x. In this particular case one can check that x = 1 − t2
gives a solution. But for a 10 × 10 matrix we will have 55 equations in 45 unknowns.
The bigger the projection, the more equations we have to deal with, and the systems
will always be over-determined. This is an issue, because such systems may have no
solution. We will soon see, however, that this problem can be solved in general and
that the Schur-Horn Theorem is the way to go.

2.2 The Schur-Horn Theorem in the Finite Dimensional Case

The Schur-Horn theorem characterizes the relation between the eigenvalues and
the diagonal elements of selfadjoint matrix by using majorization.

Theorem 2.14. (Schur 1923 [14])


If A ∈ Mn (C)sa , then diag(A) ≺ λ(A), where diag(A) is the diagonal of A and λ(A)
is the eigenvalue list of A.

Proof. Let A = U DU ∗ , where D is diagonal matrix and U is a unitary. Then the

20
diagonal of A is given by

X
akk = Dhl Ukh Ulk∗
h,l
X
= λl Ukl Ukl (2.7)
l
X
= λl |Ukl |2 .
l

Define a matrix T by Tkl = |Ukl |2 . The fact that U ∗ U = U U ∗ = I implies that T


is doubly stochastic. Equation 2.7 shows that diag(A) = T λ(A). By Theorem 1.10,
diag(A) ≺ λ(A).

Horn [6] proved in 1954 the converse of Schur’s Theorem 2.14. We offer a proof
following ideas of Kadison [8, Theorem 6]. A very similar proof appears in Arveson-
Kadison [3, Theorem 2.1], but using only results from Kadison with no acknowledge-
ment whatsoever of the well-known results in majorization theory that we outlined
in Chapter 1.
The following lemma contains Kadison’s key idea.

Lemma 2.15. Let A ∈ Mn (C) with diagonal y. Let T be a T -transform. Then there
exists a unitary U ∈ Mn (C) such that U AU ∗ has diagonal T y.

21
Proof. Let A be an n × n matrix. Define a unitary U by

i j
1
 
p p
...

 p 0 p 0 
 1 p p 
i 
- - - ξ sin θ - - - − cos θ - - - 

U=

 p 1 p 

...
0 0  .
 
 p p
1

 p p 
j - - - ξ cos θ - - - sin θ - - - 


 p p 1 
... 
 

0 p 0 p
p p 1

Where ξ ∈ C, with |ξ| = 1, and bij ξ = −bij ξ. Then a straightforward computa-


tion shows that diag(U AU ∗ ) = t . y+(1−t) . yσ , Where t = sin2 θ, σ = (i j) ∈ Sn .

Theorem 2.16. (Horn 1954 [6])


If x, y ∈ Rn , and x ≺ y, then ∃ A ∈ Mn (C)sa such that

diag(A) = x, λ(A) = y

Proof. Let x = (x1 , · · · , xn ), y = (y1 , · · · , yn ) such that x ≺ y. By Theorem 1.10

x ≺ y ⇐⇒ x = (Tr · · · T1 )y, where T1 , · · · , Tr are T − transform.

Let A1 ∈ Mn (R) with diagonal y and zeroes elsewhere. By Lemma 2.15, there exists
a unitary V1 such that A2 = V1 A1 V1∗ has diagonal T1 y. Similarly, there exists a
unitary V2 such that A3 = V2 A2 V2∗ has diagonal T2 (T1 y) = T2 T1 y. Repeating this,

22
after r steps, we will have unitaries V1 , · · · , Vr such that A = Vr · · · V1 A1 V1∗ · · · Vr∗
has diagonal Tr · · · T1 y = x. As unitary conjugation preserves the spectrum, A has
spectrum y and diagonal x.

We can rephrase Schur’s result by saying that, that for every x ∈ Rn ,

{Mx : x ≺ y} ⊇ D{U My U ∗ : U ∈ U(n)},

where Mx , My are diagonal matrices that have x, y at the diagonal. So if we conjugate


My with the unitary matrix U we still have the same vector y at the diagonal of My .
And Horn proved the other inclusion, i.e. for every x ∈ Rn ,

{Mx : x ≺ y} ⊆ D{U My U ∗ : U ∈ U(n)}.

So we can rephrase both theorems together as follows :

Theorem 2.17. (Schur-Horn Theorem)


For every x ∈ Rn ,

{Mx : x ≺ y} = D{U My U ∗ : U ∈ U(n)}.

Now we can prove (CT-6) as follows:


Pn
Proof of (CT-6). If a = (a1 , · · · , an ) ∈ [0, 1]n and j=1 aj = m, then a ≺
m times
z }| {
(1, · · · , 1, 0, · · · , 0). By the Schur-Horn Theorem, there exists a self adjoint matrix
m times
z }| {
P ∈ Mn (C) with diagonal a and eigenvalues (1, · · · , 1, 0, · · · , 0). The minimal poly-
nomial of P is f (t) = t(1 − t). So P (I − P ) = 0, i.e. P = P 2 . Thus, P is a

23
projection.

2.3 A Pythagorean Theorem for Finite Doubly Stochastic

Matrices

In this section we consider, following Kadison [8, 9], certain differences of sums of
entries of doubly stochastic matrices. We will use results here to generalize Theorem
2.18 below to the infinite dimensional case (Chapter 3).

Theorem 2.18. Let K be an m-dimensional subspace of H, and e1 , · · · , en an or-


thonormal basis for H. If a = ri=1 kPK ei k2 , b = ni=r+1 kPK ⊥ ei k2 , then
P P

a − b = m − n + r.

Proof. As PK ⊥ = I − PK , we have

kPK ⊥ ei k2 = hPK ⊥ ei , ei i

= h(I − PK )ei , ei i

= 1 − hPK ei , ei i = 1 − kPK ei k2 .

Pr Pn
So a = i=1 ai , b = i=r+1 1 − ai , where ai = kPK ei k2 , and thus (using Theorem

24
2.9)

r
X n
X
a−b = ai − ( 1 − ai )
i=1 i=r+1
r
X n
X n
X
= ai + ai − 1
i=1 i=r+1 r+1
n
X
= ai − (n − r)
i=1
= tr (PK ) − (n − r)

= m − n + r.

Definition 2.19. Let A ∈ Mm,n (R). Fix subsets K ⊂ {1, · · · , m}, L ⊂ {1, · · · , n}.
Then we can construct the block or submatrix B by taking only the rows in K and the
columns in L of A. The complement of the block B is the matrix B 0 with remaining
rows and columns of A. The sum of all entries of the block B is the weight of the
block B, and we write it as w(B).

Definition 2.20. A doubly stochastic matrix A = (aij ) is said to be Pythagorean if


there is a Hilbert space H with {ei }, {fj } orthonormal basis, such that

aij = |hei , fj i|2 , for all i, j.

The following result can be seen as a “Pythagorean Theorem” for doubly stochas-
tic matrices.

Theorem 2.21. If A is a doubly stochastic matrix and B is a block in A with p rows


and q columns, then w(B) − w(B 0 ) = p − q + n.

25
Proof.

X X
w(B) − w(B 0 ) = ajl − ajl
j∈K j∈K ⊥
l∈L l∈L0
! !
X X X X
= 1− ajl − 1− ajl
j∈K l∈L0 l∈L0 j∈k
 
X X 
= |K| − ajl − |L0 | − ajl 

j∈K j∈K
l∈L0 l∈L0
0
= |K| − |L |

= |K| − (n − |L|)

= |K| + |L| − n

We can use Theorem 2.21 to give another proof of Theorem 2.18:

Proof. Let K be an m-dimensional subspace of H, and K ⊥ its orthogonal comple-


ment. Choose {e1 , · · · , en }, {f1 , · · · , fm }, {fm+1 , · · · , fn } to be orthonormal bases
for H, K and K ⊥ . Let A = (ajk ) be the n × n doubly stochastic matrix given
by ajk = |hej , fk i|2 . Let B be the submatrix of A given by the first r rows and m
columns of A. If Pk is the projection onto K, then Pk ei = m
P
j=1 hei , fj ifj , (1−Pk )ei =
Pn 2
Pm 2
Pm Pr 2
j=m+1 hei , fj ifj and kPK ei k = j=1 |hei , fj i| = j=1 aij , so i=1 kPK ei k =
Pr Pm
i=1 j=1 aij = w(B).
Pn Pn Pn
Similarly, kPK ⊥ ei k2 = 2
j=m+1 |hei , fj i| = j=m+1 aij , so
2
i=r+1 kPK ⊥ ei k =
Pn Pn 0
i=r+1 j=m+1 aij = w(B ). By Theorem 2.21,

r
X n
X
2
kPK ei k − kPK ei k2 = w(B) − w(B 0 ) = m − n + r.
i=1 i=r+1

26
Chapter 3

The Carpenter Theorem in the Infinite


Dimensional Case

In the second chapter we dealt with the finite dimensional space Rn , and we
showed many cases of the Pythagorean Theorem. Here we will deal with infinite
dimensional Hilbert spaces and we will discuss two cases of the Carpenter Theorem.
The first case when the subspace K ⊂ H and its orthogonal complement K ⊥ have
infinite dimension, and the second case when one of the subspaces K, K ⊥ has finite
dimension [9].
We include here several definitions that we will need to refer to operators on an
infinite-dimensional Hilbert space.

Definition 3.1. For an operator A ∈ B(H) we say A is positive if A = A∗ , and the


spectrum of A consists of positive real numbers. i.e. if (Ax, x) ≥ 0, ∀x ∈ H.

Definition 3.2. Let {ei } be an orthonormal basis in B(H) and A ∈ B(H) be a

27
positive operator. Then we say A is a trace-class operator if


X
tr (A) = hAei , ei i < ∞.
i=1

If the sum is finite for one orthonormal basis, then it is finite and has the same
value for any other orthonormal basis. For an arbitrary A, we say it is trace-class if
1
(A∗ A) 2 is trace-class.

3.1 The Subspaces K and K ⊥ both have Infinite Dimension

We will start with some facts that we are going to use later.

Lemma 3.3. If P ∈ B(H) is a projection that has p1 , p2 , · · · as diagonal elements,


and σ : N → N is any bijection, then there exists a projection P0 ∈ B(H) with
pσ(1) , pσ(2) , · · · as diagonal elements.

Proof. We have P ∈ B(H) is a projection with diagonal {pi }i∈{1,2,··· } i.e.

pi = hPei , ei i,

where {ei } is an orthonormal basis. We define a unitary U by U ei −→ eσ(i) . Then we


define a projection P0 = U ∗ PU. The ith diagonal element of this projection is given

28
by

hP0 ei , ei i = hU ∗ PU ei , ei i

= hP U ei , U ei i

= hPeσ(i) , eσ(i) i

= Pσ(i) .

Lemma 3.4. Let α1 , α2 , · · · , αn , β ∈ [0, 1], such that α1 +α2 +· · ·+αn = β+m, m ∈
N. Then
m
z }| {
(α1 , α2 , · · · , αn ) ≺ (β, 1, . . . , 1, 0).

Proof. Since αj ∈ [0, 1] for all j, we have

α1 ≤ 1

α1 + α2 ≤ 1 + 1
..
.

α1 + · · · + αm ≤ m.

As α1 + α2 + · · · + αn = β + m, we have for any k ∈ {m + 1, · · · , n},

α1 + · · · + αk ≤ α1 + α2 + · · · + αn = m + β.

The following lemma is proven in [5]. While the result certainly looks obvious
—as expected from the finite-dimensional analogue— and its proof is not very hard,
it is not elementary either. We will generalize it in the proof of Theorem 3.8.

29
Lemma 3.5. Let P, Q ∈ B(H) be two orthogonal projections such that P − Q is
trace-class. Then tr (P − Q) ∈ Z.

Lemma 3.6. Let T ∈ B(H) be a trace class operator. Let R1 , R2 , · · · be pairwise


P
orthogonal finite-rank projections with k Rk = I. Then

X
tr (T ) = tr (T Rk ).
k

P P P
Proof. k (T Rk ) =T k (Rk ) = T I. This implies T = k T Rk . So, if we construct
an orthonormal basis {ei } by joining orthonormal bases corresponding to each Rk H,
we get

X
tr (T ) = hT ei , ei i
i
X
= h T Rk ei , ei i
i,k
X X
= hT Rk ei , ei i
k ei ∈ rang Rk
X
= tr (T Rk ).
k

Let K be subspace in H and consider the orthogonal projection QK onto K.


Let q1 , q2 , · · · be the diagonal of QK . If QK has finite rank (i.e. if dim K < ∞),
P P
then j qj = tr (QK ) ∈ N. But what happens if j qj = ∞ (i.e. dim K = ∞)?
Since QK is a projection, then I − QK is going to be a projection too with diagonal
P
1 − q1 , 1 − q2 , · · · . Therefore tr (I − QK ) = j (1 − qj ) will have to be an integer
if finite (this is an elementary exercise in Functional Analysis, or it can be obtained
from Effros’ Lemma 3.5). For example, if qj = 1 − j12 , then the diagonal of I − QK
2
would be j j12 = π6 , not an integer. So no projection with diagonal {1 − j12 }
P

30
exists. That is, when K is finite-dimensional or it is orthogonal complement K ⊥ is,
we obtain an obstruction to what the possible diagonals of QK are. We will address
this case in Section 3.2.
What about when K, K ⊥ are both infinite-dimensional (i.e.
P P
j qj = j (1−qj ) =
∞)? The next Theorem (3.8) will illustrate this condition and give us the complete
idea of the whole situation. The proof of Theorem 3.8 in the original paper [9] is kind
of complicated, so we tried to simplify it as much as we could.

Definition 3.7. Let {en } be an orthonormal basis. Then we define the “ matrix
units” {Emn }m,n associated to {en } as the rank-one operators

Emn x = hx, en iem , x ∈ H.

The following is the main result in this thesis.

Theorem 3.8. Given {ej }j∈N an orthonormal basis of H and {aj }j∈N ⊂ [0, 1], the
following statements are equivalent:

1. There exists an infinite dimensional subspace K ⊂ H with infinite dimensional


complement such that kPk ej k2 = aj ∀j ∈ N.

P P
2. j∈N aj = ∞ and j∈N (1 − aj ) = ∞; and either (i) or (ii) holds:
(i) a = ∞ or b = ∞;
(ii) a < ∞, b < ∞, and a − b ∈ Z,

P P
where a = aj ≤1/2 aj , b = aj >1/2 aj .

Proof. 2(i) =⇒ (1):


First we will show that when a = ∞, then there exists a projection P with diagonal

31
{aj }. Let N0 = {j : aj = 0 or aj = 1}, K = span{ej : j ∈ N0 }. Then for
/ N0 , aj ∈ (0, 1). If we find P0 on B(K ⊥ ) with diagonal {aj : j ∈
j ∈ / N0 }, then
P
P = P0 + P1 satisfies 1, where P1 = j∈N0 aj Ejj . So we will assume that aj ∈ (0, 1)
for all j.
We consider a decomposition {aj } = {a00j } {a0j }, where 0 ≤ a00j ≤ 12 , and 12 <
S
P∞ 00 P∞
a0j ≤ 1, so a, b will be j=1 aj ,
0 00 0
j=1 1 − aj . Let aj = aγ(j) , aj = aδ(j) , and

N0 = {δ(n) : n ∈ N}, N00 = {γ(n) : n ∈ N}. Let n(1) = min{n : a01 +a001 +· · ·+a00n ≥ 3};
notice that each a00j ≤ 21 , a01 < 1 so n(1) ≥ 5. Let

b1 = a001

(b2 , · · · , bn(1) ) = (a002 , · · · , a00n(1) )↓ .

Let m(1) = min{n : a01 + b1 + · · · + bn ≥ 3}; then 5 ≤ m(1) ≤ n(1). Let

m(1)−1
X
ǎ = 3 − a01 − bj
1
b0m(1)−1 = bm(1)−1 + ǎ

b0m(1) = bm(1) − ǎ.

Pm(1)−1 Pm(1)−2
As a01 + m(1)−2 bj < 3, a01 + 1 bj + b0m(1)−1 = 3, we have ǎ ≥ 0 and

0 ≤ b0m(1) < bm(1) ≤ bm(1)−1 < b0m(1)−1 ≤ 1.

Let N1 = {δ(1), γσ1 (1), · · · , γσ1 (m(1) − 1)}, where bj = a00σ1 (j) = aγσ1 (j) for certain
permutation σ1 . Let j(1) = γσ1 (m(1)), j(2) = δ(2), and {j(n)}n≥3 an increasing
enumeration of N00 \ (N1 ∪ {j(1), j(2)}). Let n(2) = min{n : a02 + b0m(1) + n3 aj(k) ≥ 3}.
P

32
Let

c1 = b0m(1)

c2 = a02

c3 = aj(3)

(c4 , · · · , cn(2) ) = (aj(4) , · · · , aj(n(2)) )↓ .

Pm Pm(2)−1
Let m(2) = min{m : j=1 cj ≥ 3}; then j=1 cj < 3, 6 ≤ m(2) ≤ n(2). Define

m(2)−1
X
b̌ = 3 − cj
j=1

c0m(2)−1 = cm(2)−1 + b̌

c0m(2) = cm(2) − b̌.

Then 0 ≤ b̌ ≤ cm(2)−1 , and 0 ≤ c0m(2) ≤ cm(2) ≤ cm(2)−1 ≤ c0m(2)−1 ≤ 1. Let N2 =


{j(1), j(2)} ∪ {n : ∃k, 3 ≤ k ≤ m(2) − 1 with ck = an }. Let k(1) be such that ak(1)
is cn(2) , k(2) = δ(3), write

N00 \ (N1 ∪ N2 ) = {k(1), k(2), · · · }

with {k(3), k(4), · · · } in ascending order.

33
Pn
Let n(3) = min{n : a03 + c0m(2) + r=3 ak(r) ≥ 3} and let

d1 = c0m(2)

d2 = a03

d3 = ak(3)

(d4 , · · · , dn(3) ) = (ak(4) , · · · , ak(n(3)) )↓ ,

Pm Pm Pm(3)−1
and let m(3) = min{n : j=1 dj ≥ 3}. Then j=1 dj ≥ 3, j=1 dj ≤ 3, and
6 ≤ m(3) ≤ n(3). Let

m(3)−1
X
č = 3 − dj ,
j=1

d0m(3)−1 = dm(3)−1 + č,

d0m(3) = dm (3) − č,

so that 0 < č ≤ dm(3) ≤ 21 , and 0 ≤ d0m(3) < dm(3) ≤ dm(3)−1 < d0m(3)−1 ≤ 1.
By repeating these processes we will build pairwise disjoint subsets N1 , N2 , · · · of
N such that N1 ∪ N2 ∪ · · · = N. We can write Nj = {pj (1), · · · , pj (m(j) − 1)}, then

m(j)−2
X
b0m(j−1) + a0j + apj (k) + b0m(j)−1 = 3. (3.8)
k=3

By Theorem 2.12 we will get a self adjoint projection Ej with diagonal

{b0m(j−1) , a0j , apj (3) , · · · , apj (m(j)−2) , b0m(j)−1 }.

34
We also have
b0m(1) + b0m(j)−1 = bm(1) + bm(j)−1 , (3.9)

and
0 ≤ b0m(j) ≤ bm(j) ≤ bm(j)−1 ≤ b0m(j)−1 . (3.10)

M
If we write P = Ej , we get
j=1

 
0
a1 
 
 a
 p1 (2) 0 0 

 
..
.
 
 
 
 

 ap1 (m(1)−2) 

 

 b0m(1)−1 

 
b0m(1)
 
 
 
 
 a02 
P = .
 



0 ap2 (2) 0 

 ... 

 
 

 b0m(2)−1 

 
b0m(2)
 
 
 
 
 
 
.. 
0 0


 .

 

The projection P has all the aj in its diagonal with the exception of the pairs

35
b0m(j)−1 , b0m(j) in place of bm(j)−1 , bm(j) . We will now construct a unitary operator
that will conjugate b0m(j)−1 , b0m(j) into bm(j)−1 , bm(j) . So let U be

 
1
 
..
0 0
 

 . 

 

 1 

 
sin θ1 − cos θ1
 
 
 
 

 cos θ1 sin θ1 

 
1
 
 
 
...
U =


0 0 ,


 

 1 

 

 sin θ2 − cos θ2 

 
 

 cos θ2 sin θ2 

 
 
 
..
 



0 0 . 

where θ1 , θ2 , · · · are to be determined. Let Q = U P U ∗ . Then every entry of Q


outside the 2 × 2 blocks agrees with P . In the 2 × 2 blocks,

   ∗
0
 sin θj − cos θj  bm(1)−1 0   sin θj − cos θj 
    =
cos θj sin θj 0 b0m(1) cos θj sin θj

36
 
0 2 0 2
bm(1)−1 sin θ + bm(1) cos θ ∗ 
 
∗ b0m(1) sin2 θ + b0m(1)−1 cos2 θ

The conditions 3.9 and 3.10 guarantee that for each j there exists tj ∈ [0, 1]
such that

bm(j) = tj b0m(j) + (1 − t)b0m(j)−1 , bm(j)−1 = (1 − tj )b0m(j) + tj b0m(j)−1 .

Choosing θj so that tj = sin θj , we get the desired diagonal for Q. This proves the
case a = ∞. When b = ∞, we can repeat the proof for the coefficients bi = 1 − ai .
That way we obtain a projection with E with diagonal 1 − aj . Then I − E is the
projection we are looking for.
2(ii) =⇒ 1:
We will show that if a < ∞, b < ∞, a − b ∈ Z, then there exists a projection P
with diagonal {aj }. By using Lemma 3.3, we can reorder the numbers {aj } ∈ [0, 1]
as needed. Again write 0 ≤ a00j ≤ 21 < a0j ≤ 1, where {a00j } {a0j } = {aj }. Let
S

a00j = aγ(j) , a0j = aδ(j) such that N00 = {γ(n) : n ∈ N}, N0 = {δ(n) : n ∈ N}. Then
00 0
P P
a = aj ∈N00 aj , b = aj ∈N0 1 − aj . Since a, b are finite, we can get finite subsets

N001 ⊂ N00 , N01 ⊂ N0 such that

X X
γ1 = a00j < 1, δ1 = 1 − a0j < γ1 .
N00 \N00
1 N0 \N01

37
We are given a − b ∈ Z, so

X X
a−b = aj + γ1 − ( (1 − aj ) + δ1 )
N00
1 N01
X X
= aj + aj + γ1 − δ1 − |N01 |
N00
1 N01
X
= aj − |N01 | + γ1 − δ1 ∈ Z.
N00 0
1 ∪N1

aj + γ1 − δ1 ∈ Z. Write N100 ∪ N1 = {k1 , · · · , kr }. Since


P
Then m = N00 0
1 ∪N1

0 ≤ γ1 − δ1 < 1, 0 ≤ aj ≤ 1, and

m = ak1 + · · · + akr + (γ1 − δ1 ),

we get from Lemma 3.4 that

m times r+1−m times


z }| { z }| {
(ak1 , · · · , akr , (δ1 − γ1 )) ≺ (1, · · · , 1, 0, · · · , 0) .

By Theorem 2.12 (CT-6), there exists a projection P0 such that

 
ak1 
...
*
 
 
P0 =  .
 

 akr 

 
* γ1 − δ1

38
By adding the element 1 to the diagonal of P0 we get another projection P1
 
 ak 1 
..
 



. * 0 


 
 akr 
 
*
 
 δ1 − γ1 0 0 
P1 =  .
 
0 1 0
 
 
 
 

 0 0 0 

0
 

 0 

 .. 
.

Now we will pay attention only to the 3 × 3 block at the middle. Let N00 \ N001 =
{l1 , l2 , · · · }, N0 \ N01 = {m1 , m2 , · · · }. We know 0 ≤ ali < 1, 0 ≤ ami < δ1 ∀i ∈ N.
Let
γ2 = γ1 − al1 , δ2 = δ1 − (1 − am1 ),

then

γ2 − δ2 = γ1 − δ1 − (al1 + am1 ) + 1 (3.11)

γ2 − δ2 + al1 + am1 = γ1 − δ1 + 1 (3.12)

From Lemma 3.4,


(al1 , am1 , γ2 − δ2 ) ≺ (γ1 − δ1 , 1, 0).

By the Schur-Horn Theorem 2.17 there exists a 3 × 3 self adjoint matrix U1 such that

39
   
al1 0 0  γ1 − δ1 0 0
    ∗
 = diag (U1  0  U1 ).
0 a 0   1 0
 m1
   
0 0 γ2 − δ2 0 0 0

So  
 ak 1 
 ... 

 0 

 
akr ∗
 
 
 
 

 al 1 

U1 P1 U1∗ = 
 
∗ am1 .
 
 
γ2 − δ2
 
 
 
 

 0 

 

 0 0 

...
 

Now we define P2 as

40
 
 ak 1 
 .. 

 . 0 

 
ak r
 
 
 
 

 al 1 

 
P2 = 
 am1 

 
γ2 − δ2
 
 
 
 

 1 

 

 0 0 

..
 
.

i.e. P2 = U1 P1 U1∗ + Er+4,r+4 . Since 0 ≤ al2 ≤ γ2 , 0 ≤ 1 − am2 ≤ δ2 , and by letting

γ3 = γ2 − al2 , δ3 = δ2 − 1 − am2 ,

then again by Lemma 3.4,

(al2 , am2 , γ3 − δ3 ) ≺ (γ2 − δ2 , 1, 0).

If we keep repeating the process we will end up with a sequence of projections {Pn } ∈
B(H) such that

diag(PN ) = ak1 , · · · , akr , al1 , · · · , aln , am1 , · · · , amn , γn+1 − δn+1 , 0, · · · .

As the unitary Uj+1 is the identity except possibly in the first r + 3j + 3 basis

41
elements,
Pi Qr+3j+4 = Pj Qr+3j+4 , i ≥ j (3.13)

Ps
where is Qs = h=1 Ehh . This shows that the sequence {Pn } converges strongly,
Indeed, for any i ≥ j, ξ ∈ H,

k(Pi − Pj )ξk = k(Pi − Pj )Iξk

= k(Pi − Pj )[(I − Qr+3j+4 ) + Qr+3j+4 ]ξk

≤ k(Pi − Pj )(I − Qr+3j+4 )ξk + k(Pi − Pj )Qr+3j+4 ξk

≤ (kPi k + kPj k)k(I − Qr+3j+4 )ξk = 2k(I − Qr+3j+4 )ξk.

When h → ∞, Qh → I strongly, so we find out that the sequence {Pn ξ} is Cauchy in


H, and so convergent. This shows {Pn } ∈ B(H) is strongly convergent to a projection
P. As Pi Qr+3j+4 = Pj Qr+3j+4 , the diagonal of P is {aj }.
(1) =⇒ (2):
The fact that K, K ⊥ are infinite-dimensional imply that
P P
j aj = j 1 − aj = ∞. So
we need to show that either 2(i) or 2(ii) holds. If 2(i) holds, we are done. Otherwise
we want to show that if there is a projection P ∈ B(H) such that diag(P ) = an , then
a − b ∈ Z, where

X X
a= ai < ∞, and b = (1 − ai ) < ∞.
ai ∈N0 ai ∈N00

The proof of this is inspired by Effros’ proof of Lemma 3.5. Let Q ∈ B(H) be the

42
P
projection Q = n∈N0 Enn . Then

X
tr (QP Q) = tr ( Enn P Enn )
n∈N0
X
= tr ( an Enn )
n∈N0
X
= an
n∈N0
= a,

and

X
tr (Q⊥ P ⊥ Q⊥ ) = tr ( Enn (I − P )Enn )
n∈N00
X
= tr ( (1 − an )Enn )
n∈N00
X
= (1 − an )
n∈N00
= b.

This implies that both QP Q, Q⊥ P ⊥ Q⊥ are trace class, since they are positive. We
now notice that P − Q⊥ is Hilbert-Schmidt, and so in particular it is compact.

43
Indeed, using that P = P 2 , (1 − P )2 = (1 − P ), we have

X
tr ((P − Q⊥ )2 ) = [(P − Q⊥ )]hh
h
XX
= |Phk − Q⊥
hk |
2
(no problem exchanging the sums
h k
XX
= |Phk − Q⊥
hk |
2
since every term is non-negative)
k h
XX X X
= |Phk |2 + (1 − Pkk )2 + | − Phk |2
k∈N0 h k∈N00 n6=k
X X
= Pkk + (1 − Pkk )
k∈N0 k∈N00
= a + b < ∞.

Since (P − Q⊥ )2 is positive and compact, then we can write (P − Q⊥ )2 =


P
k λk Rk ,
where Rk = R1 , R2 , · · · are pairwise orthogonal finite rank projections and {λi }i∈N
arranged strictly in decreasing order and converging to zero. The points {λi } are
isolated points in the spectrum of (P − Q⊥ )2 , so there exist continuous functions
f1 , f2 , · · · such that Ek = fk ((P − Q⊥ )2 ). Since P (P − Q⊥ )2 = (P − Q⊥ )2 P , Q(P −
Q⊥ )2 = (P −Q⊥ )2 Q (from a direct computation), we deduce that P Rk = Rk P, QRk =
Rk Q, ∀ k; in particular P Rk , QRk are both finite rank projections. By keeping in

44
mind that QP Q, Q⊥ P ⊥ Q⊥ are trace class and using Lemma 3.6,

a − b = tr (QP Q) − tr (Q⊥ P ⊥ Q⊥ )

= tr (QP Q − Q⊥ P ⊥ Q⊥ )
X
= tr [(QP Q − Q⊥ P ⊥ Q⊥ )Rk ]
k
X
= tr (QP QRk ) − tr (Q⊥ P ⊥ Q⊥ Rk )
k
X
= tr (QP Rk QRk ) − tr (Q⊥ P ⊥ Rk Q⊥ Rk )
k
X
= tr (P Rk QRk − P ⊥ Rk Q⊥ Rk )
k
X
= tr (P Rk QRk − (Rk − Rk QRk − P Rk + P Rk QRk ))
k
X
= tr (P Rk + QRk − Rk ).
k

P Rk , QRk , and Rk are finite rank projections and their traces are integers, so

tr (P Rk + QRk − Rk ) ∈ Z, ∀k.

As every term in the series is an integer, we conclude that they are eventually zero,
and that a − b ∈ Z.

3.2 One of the Subspaces K, K ⊥ has Finite Dimension

In this context we will show the Carpenter’s theorem for the finite dimensional
subspaces K, K ⊥ of an infinite dimensional Hilbert space, where the sum of the diago-
nal elements of the projection Pk is finite and integer. The proof comes as consequence
of Theorem 3.8.

45
Theorem 3.9. If {ej }j∈J is an orthonormal basis for an infinite-dimensional Hilbert
space H, and {tj }j∈J ⊂ [0, 1], then the following statements are equivalent:

1. There exists m-dimensional K ⊂ H, such that kPk ej k2 = tj .


P
2. j∈J tj = m.

Proof. First we will show (2) ⇒ (1):


As in Theorem 3.8, we write {tj } = {t0j } ∪ {t00j }, 0 ≤ t0j ≤ 12 , and 21 < t00j ≤ 1. As
00 00 00 00
P 2
j tj < ∞, tj → 0, so {tj } is necessarily finite, i.e. {tj } = {t1 , · · · , tk }. Let


X k
X
a= t0j , b= (1 − t00j ).
1 1

Then


X k
X
a−b = t0j − (1 − t00j )
1 1

X
= tj − k
1
= m − k ∈ Z.

Since a − b ∈ Z we apply Theorem 3.8 to get a projection PK such that the diagonal
of PK is t1 , t2 , · · · . And dimK = tr (PK ) = ∞
P P∞
j=1 hPK ej , ej i = j=1 tj = m.

For the other direction (1) ⇒ (2): Let PK be the orthogonal projection onto K.
As dim K = m, we get

X X X
m = tr (PK ) = hPK ej , ej i = kPK ej k2 = tj .
j j j

When K has finite co-dimension, we can apply Theorem 3.9 to its orthogonal

46
complement K ⊥ to obtain a projection PK with prescribed diagonal {sj } and PK ⊥ =
I − PK . So the diagonal of PK ⊥ is {1 − sj }. We get thus the following result:

Theorem 3.10. If {ej } is an orthonormal basis of H, and {tj }j∈J ⊂ [0, 1], then the
following statements are equivalent:

1. There exists K ⊂ H, of co-dimension m, such that kPk ej k2 = {tj }j∈J .

P
2. j∈J (1 − tj ) = m.

3.3 A Pythagorean Theorem for Infinite Doubly Stochastic

Matrices

In chapter 2 we defined Pythagorean matrix (2.20), then we studied finite doubly


stochastic matrices A, and we calculated the weight of some block B on the matrix A
and the weight of it complementary block. Then we proved a Pythagorean Theorem
(Theorem 2.21) for finite doubly stochastic matrices, where w(B) − w(B 0 ) ∈ Z. For
an infinite doubly stochastic matrix we can’t do that immediately because we are
dealing with infinite rows and columns. Kadison defines the weight for an infinite
doubly stochastic matrix, by taking the sum as a limit. The sets of the rows and
the columns are infinite sets, so he takes all the family of the finite subsets, that
ordered by inclusion gives us a net, and then he takes the limit over the net. By using
Theorem 3.8 we will show a Pythagorean Theorem for infinite Pythagorean matrix
when it has infinite complementary blocks, each of them with finite weight.

Definition 3.11. We say that a doubly stochastic matrix A is orthostochastic if


A = |Uij |2 , where U is unitary matrix [4].

47
Lemma 3.12. If A is a Pythagorean matrix, then it is doubly stochastic.

Proof. To show that a Pythagorean matrix is doubly stochastic we will show that it
is orthostochastic. Let ei , fj be orthonormal bases with A = |hei , fj i|2 . We know
that there exists a unitary U , with U fi = ei . So |hei , fj i|2 = |hU fi , fj i|2 = |Uji |2 .

How can an infinite doubly stochastic matrix that is Pythagorean have infinite
complementary blocks with finite weights? To make that clear we will discuss an
example. Let


 2i

if i ∈ Z−
ai =
 (1 − 2−i )
 if i ∈ Z+

P P
and let a = i∈Z− ai = 1, b = i∈Z+ 1 − ai = 1. Write Z0 = Z+ ∪ Z− . By using
Theorem 3.8, since a−b ∈ Z, there exists an infinite dimensional K ⊂ H, with infinite
dimensional orthogonal complement K ⊥ such that the diagonal of the projection PK
is {ai }i∈Z0 . When j ∈ Z+ , k(I − PK )ej k2 = 1 − kPK ej k2 = 1 − aj = 2−j . Let
{fj }j∈Z+ , {fj }j∈Z− be orthonormal bases for K, K ⊥ respectively. Let aij = |hei , fj i|2 ;
then A = (aij ) is an infinite Pythagorean matrix.

X X
aij = |hei , fj i|2 = kPK ei k2 = ai , ∀i ∈ Z0
j∈Z+ j∈Z+

and
X
aij = k(I − Pk )ei k2 = 1 − ai , ∀i,
j∈Z−

so

48
X X X
aij = ai = 1
i∈Z− j∈Z+ i∈Z−

and

X X X
aij = aj = 1.
j∈Z+ i∈Z− j∈Z+

Thus the weight of both complementary blocks is finite, and the difference is an
integer.

Theorem 3.13. If A is an infinite Pythagorean matrix, and B, B 0 are a block and


its complement in A, and both their weights are finite, then w(B) − w(B 0 ) ∈ Z.

Proof. Since both blocks have finite weight, then both B, B 0 are infinite (because the
complement of a finite block has infinite weight). Let us write A = (aij ), ∀i, j ∈ Z0 ,
where B = (aij )i,j∈Z− , and B 0 = (aij )i,j∈Z+ . As A is Pythagorean, there exist
{ei }i∈Z0 , {fj }j∈Z0 orthonormal bases for H such that |hei , fj i|2 = aij ∀i, j ∈ Z0 . Let
K ⊂ H be the subspace that is spanned by {fj }j∈Z− . Then

X X X
w(B) = aij = |hei , fj i|2
i,j∈Z− i∈Z− j∈Z−
X X
= hhei , fj ifj , ei i
i∈Z− j∈Z−
X X
= hPK ei , ei i = kPK ei k2 .
i∈Z− i∈Z−

Similarly,

X X X X X X
w(B 0 ) = aij = |hei , fj i|2 = k(I − PK )ei k2 = 1 − kPK ei k2 .
i,j∈Z+ i∈Z+ j∈Z+ i∈Z+ j∈Z+ i∈Z+

49
As both weights are finite, (i) ⇒ (ii) in Theorem 3.8 gives us w(B) − w(B 0 ) ∈ Z.

If we compare the finite and infinite dimensional versions of the Pythagorean


Theorem for doubly stochastic matrices 2.21 and 3.13, we note that in the finite-
dimensional case the result holds for any doubly stochastic matrix, while the infinite
dimensional case requires an additional hypothesis (i.e. Pythagorean). We don’t
know if Theorem 3.13 holds for arbitrary doubly stochastic infinite matrices. We
note below that not every doubly stochastic matrix is Pythagorean, starting with
dimension 3.
For example if we take any 2 × 2 doubly stochastic matrix,
 
 a 1 − a
A= 
1−a a

and then construct 2 orthonormal bases e1 , e2 , and f1 , f2 , where

√ √ √ √
f1 = a e1 + 1 − a e 2 , f2 = − 1 − a e 1 + a e 2 ,

then Aij = |hei , fj i|2 , so A is Pythagorean. What about 3 × 3 doubly stochastic


matrices? Consider  
1 1 0
1 
A= 1 0 1.
2



0 1 1

If A were Pythagorean, n there would exist orthonormal bases {e1 , e2 , e3 }, {f1 , f2 , f3 }

50
with Aij = |hei , fj i|2 . Then

0 = A31 = |he3 , f1 i|2 , so f1 = se1 + re2 ,

0 = A22 = |he2 , f2 i|2 , so f2 = pe1 + qe3 .

Also,

1
= A11 = |he1 , f1 i|2 = |s|2 , so s 6= 0
2
1
= A12 = |he1 , f2 i|2 = |p|2 , so p 6= 0.
2

But then hf1 , f2 i = sp 6= 0, contradicting the fact that {f1 , f2 , f3 } is an orthonormal


basis. So every 2 × 2 doubly stochastic matrix is Pythagorean, but bigger doubly
stochastic matrices will not necessarily be Pythagorean.

51
Chapter 4

A Schur-Horn Theorem in Infinite


Dimensional Case

In this chapter we will see a generalization for Schur-Horn Theorem in infinite


dimension by A. Neumann [11]. After that we will give an explanation for this result.
After that, we will consider W. Arveson and R. Kadison’s Schur-Horn theorem for
positive trace-class operators on an infinite dimensional Hilbert space [3].

4.1 Majorization in Infinite Dimension

To begin, first we will define majorization in infinite dimension and for that we
will use `∞ (N) ⊂ B(H) instead of using Rn , such that `∞ (N) has real entries. We
chose `∞ (N) ⊂ B(H) here because the diagonal of a self adjoint operator as a matrix
is a bounded sequence of real numbers, so we can think here about `∞ (N) as “diagonal
self adjoint matrices” inside B(H).
We defined majorization in finite dimension as follows:

52
For any two finite vectors x, y ∈ Rn , we say x is majorized by y, denoted x ≺ y, if

k
X k
X
x↓i ≤ yi↓ for k < n,
i=1 i=1
and
n
X n
X
xi = yi .
i=1 i=1

One can try to generalize this, naively, by saying that for x, y ∈ `∞ (N), x is
majorized by y if

k
X k
X
x↓i ≤ yi↓ for k < ∞,
i=1 i=1
and

X ∞
X
xi = yi .
i=1 i=1

But such “majorization” would fail to generalize many of the properties that finite-
dimensional majorization enjoys. For instance in finite dimension (Proposition 1.4),
we have that x ≺ y if and only if

k
X k
X
yi↓ ≤ x↓i ∀ k < n,
i=1 i=1
and
n
X n
X
xi = yi .
i=1 i=1

But in `∞ (N), the numbers x↓n are not defined if x = 1 − n1 , and neither are
x↑n if xn = n1 . Also, looking at Theorem 1.10, the sequences x = (1, 1, 1, · · · ) and

53
y = (2, 2, 2, · · · ) would satisfy x ≺ y, but x ∈
/ convPy (not even in the closure).
What A. Neumann does to solve this problem, instead of taking these sums

k
X k
X
yi↓ ≤ x↓i ∀ k < n,
i=1 i=1

P
he defines Uk (x) = sup{ i∈F xi : |F | = k} i.e. he takes all possible sums of
cardinality k in x, and then he takes the supremum. But this is not enough, we can
see the reason in the following example.

Example 4.1. Let x, y ∈ `∞ (N) such that

x = (1, 1, 1, 1, · · · ) and y = (1, 0, 1, 0, · · · ),

then
Uk (x) = k and Uk (y) = k.

So both vectors would majorize each other, and having double majorization one
would expect both vectors to be permutations of each other, as in Theorem 1.3. So
we would have
x ≺ y, y ≺ x ⇐⇒ x = Py,

which in this example is clear that can’t happen.


Note that if we have

k
X k
X k
X k
X
x↓i ≤ yi↓ , and yi↓ ≤ x↓i ∀ k
i=1 i=1 i=1 i=1

54
then
n
X n
X
xi = yi .
i=1 i=1

Neumann uses this idea to (implicitly) define majorization in infinite dimension as


follows [11]:

Definition 4.2. Let x, y ∈ `∞ (N), then we say x ≺ y if ∀ k ∈ N,

Uk (x) ≤ Uk (y)

and

Lk (x) ≥ Lk (y),

where
X X
Uk (x) = sup{ xi : |F | = k}, Lk (x) = inf{ xi : |F | = k}.
i∈F i∈F

So, in Example 4.1, Uk (x) = Lk (x) = k, Uk (y) = k, Lk (y) = 0; then x ≺ y is


true, but y ⊀ x.

4.2 Neumann’s Schur-Horn Theorem

We start by defining some terminology to be used later [15].

Definition 4.3. A subalgebra A ⊂ B(H) is maximal abelian if any two elements


from A commute, and A is not properly contained in any other commutative subalgebra
A0 ⊂ B(H).

Definition 4.4. A von Neumann algebra M on H is a ∗-subalgebra of B(H) such


that M = M 00 , where M 00 is the double commutant of M .

55
Definition 4.5. We say that a von Neumann M is atomic when every nonzero
projection majorizes a nonzero minimal projection.

Definition 4.6. Let A ⊂ M be subalgebra a von Neumann algebra M . Then we say


a linear map E is a conditional expectation when E : M −→ A is onto, E = E 2 ,
and kEk = 1. Moreover, we say that E is trace preserving when

tr ◦ E = tr .

In the finite dimensional case, Schur-Horn Theorem 2.17 states for every x, y ∈ Rn
we have
{Mx ∈ x : x ≺ y} = D{U My U ∗ : U ∈ U(H)}.

If we want to do the analog thing for infinite dimensional case, take x, y ∈ `∞ (N),
where we see `∞ (N) as the diagonal operators for a fixed orthonormal basis. If we
write E for the projection onto the diagonal, then we expect

{Mx : x ≺ y} = E{U My U ∗ : U ∈ U(H)}.

This equality cannot hold without norm closure: if we go back to our Example 4.1,
were x = (1, 1, 1, 1, · · · ) and y = (1, 0, 1, 0, · · · ), then x ≺ y but

/ E{U My U ∗ : U ∈ U(H)}, while Mx = I ∈ E{U My U ∗ : U ∈ U(H)}.


Mx = I ∈

So the closure seems to be necessary to satisfy the equality of Schur-Horn Theorem


in the infinite dimensional case, and this is what Neumann does to form the next
theorem ([11], Corollary 2.18, and Theorem 3.13).

56
Theorem 4.7 ([11]). For x, y ∈ l∞ (N) we have

{Mx : x ≺ y} = E{U My U ∗ : U ∈ U(H)}.

4.3 A Strict Schur-Horn Theorem for Positive Trace-Class

Operators

We will define positive trace-class operators and then we will form a Schur-Horn
Theorem for positive trace-class operator on infinite dimensional Hilbert space. We
call this theorem “strict”, because no closure is required after projecting onto the
diagonal.
We refer to the definitions of positive and trace-class operators at the beginning
of Chapter 3.

Definition 4.8. For a trace-class operator A, its one-norm is

1
kAk1 = tr ((A∗ A) 2 ).

Definition 4.9. Let A, B ∈ B(H) be two trace-class operators. We say A and B


are L1 equivalent if there exist a sequence of unitary operators {Ui }∞
i=1 such that

kA − Un BUn∗ k1 → 0, when n → ∞.

57
The trace-class operators are compact, and thus their spectrum consists of a se-
quence of eigenvalues that converge to zero; if λ is the eigenvalue list of A, then

X
kAk1 = |λj |.
j

Proposition 4.10. Let A ∈ B(H) be a positive trace-class operator with eigenvalue


list λ, Oλ be the set of all positive trace-class operators having λ as their eigenvalue
k.k1
list, and O(A) = {U AU ∗ : U ∈ U(H)} , where U (H) the set of unitary operators.
Then
Oλ = O(A).

Proof. We have to show that when two trace-class operators are L1 equivalent then
their eigenvalue lists are equal, and also the converse. So first we are going to show
that if we have A, B two equivalent trace-class operators with eigenvalues lists λ, µ,
then λ = µ. We will use the inequality [12]


X
|λn − µn | ≤ kA − Bk1
i=1


to show that there exist unitary operators Um such that kUm AUm − Bk1 → 0. Note

that Um AUm has the same eigenvalue list as A. So for a given  > 0 choose m such

AUm − Bk1 < , and so ∞
P
that kUm 1 |λn − µn | < . As  is arbitrary this implies
P∞
1 |λn − µn | = 0. Thus λ = µ for all n.

For the reverse implication we show that if two trace-class operators A, B have the same eigenvalue list, then they are L1 equivalent. Since A, B are trace-class operators they are compact, so we can write A as

A = An + Rn,   where   An = Σ_{i=1}^n λi Pi,   Rn = Σ_{i=n+1}^∞ λi Pi,

with {Pi } pairwise orthogonal rank-one projections, similarly,

B = Bn + Sn,   where   Bn = Σ_{i=1}^n λi Pi′,   Sn = Σ_{i=n+1}^∞ λi Pi′.

Note that ‖Rn‖1 → 0 and ‖Sn‖1 → 0. Since An, Bn are finite-rank operators with the same eigenvalues, there exists a unitary Un with Un An Un∗ = Bn. Then

‖B − Un A Un∗‖1 = ‖Bn + Sn − Un(An + Rn)Un∗‖1
               = ‖Bn + Sn − Un An Un∗ − Un Rn Un∗‖1
               = ‖Sn − Un Rn Un∗‖1 ≤ ‖Sn‖1 + ‖Rn‖1 → 0.

So ‖B − Un A Un∗‖1 → 0, that is, A and B are L1 equivalent.
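
The inequality from [12] invoked at the start of the proof can also be observed numerically in finite dimensions. In the sketch below (random data; positive semidefinite matrices play the role of positive trace-class operators), lam and mu are the decreasingly ordered eigenvalue lists of A and B:

import numpy as np

rng = np.random.default_rng(3)
n = 6
XA = rng.standard_normal((n, n))
XB = rng.standard_normal((n, n))
A = XA @ XA.T                                            # positive semidefinite
B = XB @ XB.T

lam = np.sort(np.linalg.eigvalsh(A))[::-1]
mu = np.sort(np.linalg.eigvalsh(B))[::-1]

one_norm_diff = np.abs(np.linalg.eigvalsh(A - B)).sum()  # ||A - B||_1, since A - B is self-adjoint
assert np.abs(lam - mu).sum() <= one_norm_diff + 1e-10   # sum |λ_n − µ_n| <= ||A − B||_1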

The next two theorems, 4.11 and 4.12, concern majorization for positive compact operators. We will use them to deduce the Schur-Horn Theorem for positive trace-class operators (Theorem 4.13).

Theorem 4.11 ([3]). Let M ⊆ B(H) be a discrete maximal abelian algebra having a conditional expectation E : B(H) → M. Let A be a positive compact operator in B(H), with eigenvalue list λ = (λ1 ≥ λ2 ≥ · · · ). Let B ∈ M be a positive compact operator with eigenvalue list µ. Then the following are equivalent:

• There exists a contraction L ∈ B(H) such that E(L∗AL) = B.

• µ1 + µ2 + · · · + µn ≤ λ1 + λ2 + · · · + λn , n = 1, 2, · · · .

Theorem 4.12 ([3]). Let A ∈ B(H) be a positive compact operator having eigenvalue list λ. Let Pn ⊂ B(H) be the set of all n-dimensional projections. Then

sup_{P∈Pn} tr(AP) = max_{P∈Pn} tr(AP) = λ1 + · · · + λn.      (4.14)

The maximum is achieved at the n-dimensional projection whose range is spanned by eigenvectors corresponding to λ1, . . . , λn.
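
Equation (4.14) can be illustrated in finite dimensions. The sketch below (arbitrary sizes, random data) compares tr(AP) over random n-dimensional projections with λ1 + · · · + λn, and checks that the projection onto the top eigenvectors attains the maximum:

import numpy as np

rng = np.random.default_rng(4)
dim, n = 8, 3
X = rng.standard_normal((dim, dim))
A = X @ X.T                                          # a positive operator

w, V = np.linalg.eigh(A)                             # eigenvalues in increasing order
top = w[::-1][:n].sum()                              # λ1 + ... + λn

P_top = V[:, -n:] @ V[:, -n:].T                      # projection onto the top n eigenvectors
assert np.isclose(np.trace(A @ P_top), top)          # the maximum is attained

for _ in range(200):                                 # random n-dimensional projections stay below
    Q, _ = np.linalg.qr(rng.standard_normal((dim, n)))
    P = Q @ Q.T
    assert np.trace(A @ P) <= top + 1e-9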

The following is Arveson-Kadison's strict Schur-Horn theorem for positive trace-class operators. Note that, as opposed to Neumann's result (Theorem 4.7), no closure is needed outside E(Oλ). Of course, Neumann's result applies to all bounded sequences, while Arveson-Kadison's requires the sequences to be in ℓ1.

Theorem 4.13. Let M ⊆ B(H) be a discrete maximal abelian von Neumann algebra with the trace preserving conditional expectation

E : B(H) → M.

Let λ ∈ ℓ1 be an eigenvalue list with positive terms only, arranged in decreasing order. Then E(Oλ) contains every positive trace-class operator B ∈ M whose eigenvalue list µ satisfies µ ≺ λ.

Proof. Fix a positive trace-class operator A with eigenvalue list λ. By Proposition 4.10, Oλ = O(A), so we have to show that E(Oλ) = {B ∈ M positive trace-class : µ ≺ λ, where µ is the eigenvalue list of B}. For

the implication (⇒) we are going to show that for a positive trace-class operator
A ∈ B(H) that has eigenvalue list λ, the eigenvalue list µ of B = E(A) will satisfy

Σ_{i=1}^n µi ≤ Σ_{i=1}^n λi,  for all n,      (4.15)

and

Σ_{i=1}^∞ µi = Σ_{i=1}^∞ λi.      (4.16)
i=1 i=1

Let {ek} be an orthonormal basis such that B is diagonal with respect to it, i.e. Bek = µk ek for k = 1, 2, . . . (since B ∈ M and M is discrete maximal abelian, we may take {ek} to be a reordering of the basis that diagonalizes M, so that the projections Pn below lie in M). Let Pn be the projection onto the span of {e1, . . . , en}. Since A is a positive trace-class operator, it is compact, so we can apply Theorem 4.12 to get

Σ_{i=1}^n µi = Σ_{k=1}^n ⟨Bek, ek⟩
            = tr(BPn)
            = tr(E(A)Pn)
            = tr(E(APn))
            = tr(APn)
            ≤ λ1 + · · · + λn,

and


Σ_{i=1}^∞ µi = Σ_{i=1}^∞ ⟨Bei, ei⟩
            = tr(B)
            = tr(E(A))
            = tr(A)
            = Σ_{i=1}^∞ λi.
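
Before turning to the converse, the computation above is easy to reproduce in a finite-dimensional model (a sketch only: E is taken to be the expectation onto the diagonal, Pn the projection onto the first n basis vectors, and the basis is reordered so that the diagonal of B = E(A) decreases):

import numpy as np

rng = np.random.default_rng(5)
dim = 7
X = rng.standard_normal((dim, dim))
A = X @ X.T                                          # positive operator, a stand-in for trace-class

E = lambda M: np.diag(np.diag(M))                    # expectation onto the diagonal masa
lam = np.sort(np.linalg.eigvalsh(A))[::-1]           # eigenvalue list λ, decreasing

order = np.argsort(np.diag(A))[::-1]                 # reorder the basis so that diag(E(A)) decreases
A = A[np.ix_(order, order)]
mu = np.diag(A)                                      # eigenvalue list µ of B = E(A)

for n in range(1, dim + 1):
    Pn = np.diag((np.arange(dim) < n).astype(float))
    assert np.isclose(np.trace(E(A) @ Pn), np.trace(A @ Pn))   # tr(E(A)Pn) = tr(APn), since Pn is diagonal
    assert mu[:n].sum() <= lam[:n].sum() + 1e-10               # partial sums of µ dominated by those of λ
assert np.isclose(mu.sum(), lam.sum())                         # equal traces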

Now we will show the other implication (⇐). Assume that the eigenvalue lists µ, λ of B, A satisfy the majorization conditions (4.15), (4.16). Then B ∈ B(H) is a compact positive operator that is diagonal with respect to some orthonormal basis {ek}, so Bek = µk ek. By conjugating A with a unitary, we can assume without loss of generality that Aek = λk ek for all k. Let P be the projection onto the closure of the range of A,

PH = span{ek : λk ≠ 0}.

Then A = AP + AP⊥ = AP. By Theorem 4.11 there exists a contraction L, ‖L‖ ≤ 1, such that E(L∗AL) = B. Replacing L by PL, we still have E(L∗AL) = B (because PAP = A), and now PL = L. Since L is a contraction, LL∗ = PLL∗P ≤ ‖LL∗‖P ≤ P, so P(I − LL∗)P = P − LL∗ ≥ 0; we will show it is equal to zero. By hypothesis, tr(A) = tr(B), so tr(L∗AL) = tr(E(L∗AL)) =

tr (B) = tr (A). Then

0 = tr(A) − tr(L∗AL)
  = tr(A^{1/2} P A^{1/2}) − tr(A^{1/2} LL∗ A^{1/2})
  = tr(A^{1/2}(P − LL∗)A^{1/2}).

As A^{1/2}(P − LL∗)A^{1/2} ≥ 0 and, by the computation above, has zero trace, we deduce that

A^{1/2}(P − LL∗)A^{1/2} = 0.

As A^{1/2}(P − LL∗)A^{1/2} = ((P − LL∗)^{1/2} A^{1/2})∗ ((P − LL∗)^{1/2} A^{1/2}), we conclude that

(P − LL∗)^{1/2} A^{1/2} = 0,

and so (P − LL∗)A^{1/2} = 0. As Aek = λk ek for all k, we have A^{1/2} ek = λk^{1/2} ek. So, for every k with λk ≠ 0,

(P − LL∗)ek = λk^{−1/2}(P − LL∗)A^{1/2} ek = 0,

while for the remaining k both Pek = 0 and LL∗ek = PLL∗Pek = 0, so (P − LL∗)ek = 0 as well. Hence P − LL∗ = 0, and therefore P = LL∗, which implies that L is a partial isometry. In


particular, L∗ L is a projection. Now let us define a unitary operator

U : H −→ PH ⊕ ker L,

by ξ → Lξ ⊕ Qξ, where Q is the orthogonal projection onto ker L. Note that PH =

LH. For any ξ ∈ PH, η ∈ ker L,

U(L∗ξ + η) = L(L∗ξ + η) ⊕ Q(L∗ξ + η)
           = LL∗ξ ⊕ (QL∗ξ + η)
           = Pξ ⊕ η
           = ξ ⊕ η,

since Lη = 0, Qη = η, QL∗ξ = 0 (because L∗ξ ⟂ ker L), LL∗ = P, and Pξ = ξ.

So U is onto. To show that U is unitary we have to show it is isometric. Indeed

kU ξk2 = kLξ ⊕ Qξk2

= kLξk2 + kQξk2

= hLξ, Lξi + hQξ, Qξi

= hL∗ Lξ, ξi + hQξ, ξi

= h(L∗ L + Q)ξ, ξi

= hξ, ξi

= kξk2 ,

so U is a surjective isometry, thus it is unitary. Then if A0 = A|PH , we have for any k

⟨(A0 ⊕ 0)Uek, Uek⟩ = ⟨(A0 ⊕ 0)(Lek ⊕ Qek), Lek ⊕ Qek⟩
                   = ⟨APLek, Lek⟩
                   = ⟨L∗PAPLek, ek⟩
                   = ⟨L∗ALek, ek⟩.

So U∗(A0 ⊕ 0)U = L∗AL (the same computation with ek, ej in place of ek, ek shows that all matrix entries agree). Then

E(U ∗ (A0 ⊕ 0)U ) = E(L∗ AL) = B.

The eigenvalue list of U∗(A0 ⊕ 0)U is the same as the eigenvalue list of A; therefore U∗(A0 ⊕ 0)U ∈ Oλ = O(A) by Proposition 4.10, and hence B = E(U∗(A0 ⊕ 0)U) ∈ E(Oλ), which completes the proof.
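
The last step of the proof, dilating the partial isometry L to a unitary ξ → Lξ ⊕ Qξ, can be imitated in finite dimensions. In the sketch below (random data; the SVD is just a convenient way to manufacture a partial isometry), Q is the projection onto ker L, and the map is isometric because L∗L + Q = I:

import numpy as np

rng = np.random.default_rng(6)
dim, rank = 6, 4
X = rng.standard_normal((dim, rank)) @ rng.standard_normal((rank, dim))   # a rank-deficient matrix
U_svd, s, Vh = np.linalg.svd(X)
r = int((s > 1e-10).sum())
L = U_svd[:, :r] @ Vh[:r, :]        # partial isometry with initial space ran(X*) and final space ran(X)

Q = np.eye(dim) - L.T @ L           # projection onto ker L
assert np.allclose(L.T @ L + Q, np.eye(dim))

for _ in range(100):
    xi = rng.standard_normal(dim)
    assert np.isclose(np.linalg.norm(L @ xi) ** 2 + np.linalg.norm(Q @ xi) ** 2,
                      np.linalg.norm(xi) ** 2)       # ||Lξ||^2 + ||Qξ||^2 = ||ξ||^2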

Chapter 5

Conclusion and Future Work

The goal of this thesis was to better understand the role of majorization in the Schur-Horn Theorem. On our way to understanding this theorem and its generalizations we studied the relation between majorization and doubly stochastic matrices, including Birkhoff's Theorem 1.13. Then we saw that majorization characterizes the relation between the diagonal of self-adjoint matrices and their eigenvalues: this is the classical Schur-Horn Theorem 2.17. We used the Schur-Horn Theorem to prove the Carpenter Theorem in both the finite-dimensional (CT-6, Theorem 2.12) and the infinite-dimensional (Theorem 3.8) cases. Our main contribution is a new proof of Theorem 3.8, which is shorter and, we think, conceptually clearer than Kadison's.
In the infinite-dimensional setting, we considered A. Neumann's extension of the notion of majorization to ℓ∞(N). With the right generalization of majorization in hand, one obtains Neumann's result, Theorem 4.7. This result opened the way for extensions in several directions. In particular, W. Arveson and R.V. Kadison obtained a Schur-Horn Theorem for positive trace-class operators, Theorem 4.13, in which no closure of the projected unitary orbit is needed. Arveson and Kadison also studied the Carpenter Theorem in II1-factors [2], but were not able to prove an analogue of their trace-class result. Their conjecture was first considered by Argerami and Massey [1], who proved a weak version (i.e., with closure à la Neumann), and it was finally settled a few months ago by M. Ravichandran [13].
Possible future work in this area would be to extend Theorem 3.8 to a Schur-Horn
Theorem, both in the atomic case (directly generalizing Theorem 3.8) and in the
II1 -case (using the techniques in Ravichandran’s recent proof [13]).

Bibliography

[1] M. Argerami and P. Massey. A Schur-Horn theorem in II1 factors. Indiana Univ.
Math. J., 56(5):2051–2059, 2007.

[2] William Arveson. Diagonals of normal operators with finite spectrum. Proc.
Natl. Acad. Sci. USA, 104(4):1152–1158 (electronic), 2007.

[3] William Arveson and Richard V. Kadison. Diagonals of self-adjoint operators.


In Operator theory, operator algebras, and applications, volume 414 of Contemp.
Math., pages 247–263. Amer. Math. Soc., Providence, RI, 2006.

[4] Rajendra Bhatia. Matrix analysis, volume 169 of Graduate Texts in Mathematics.
Springer-Verlag, New York, 1997.

[5] Edward G. Effros. Why the circle is connected: an introduction to quantized


topology. Math. Intelligencer, 11(1):27–34, 1989.

[6] Alfred Horn. Doubly stochastic matrices and the diagonal of a rotation matrix.
Amer. J. Math., 76:620–630, 1954.

[7] Roger A. Horn and Charles R. Johnson. Matrix analysis. Cambridge University
Press, Cambridge, 1990. Corrected reprint of the 1985 original.

[8] Richard V. Kadison. The Pythagorean theorem. I. The finite case. Proc. Natl.
Acad. Sci. USA, 99(7):4178–4184 (electronic), 2002.

[9] Richard V. Kadison. The Pythagorean theorem. II. The infinite discrete case.
Proc. Natl. Acad. Sci. USA, 99(8):5217–5222 (electronic), 2002.

[10] Richard V. Kadison. Non-commutative conditional expectations and their ap-


plications. In Operator algebras, quantization, and noncommutative geometry,
volume 365 of Contemp. Math., pages 143–179. Amer. Math. Soc., Providence,
RI, 2004.

[11] Andreas Neumann. An infinite-dimensional version of the Schur-Horn convexity


theorem. J. Funct. Anal., 161(2):418–451, 1999.

[12] Robert T. Powers and Derek W. Robinson. An index for continuous semigroups
of ∗-endomorphisms of B(H). J. Funct. Anal., 84(1):85–96, 1989.

[13] Mohan Ravichandran. preprint arXiv:1209.0909, 2012.

[14] I. Schur. Über eine Klasse von Mittelbildungen mit Anwendung auf die Determinantentheorie. S.-Ber. Berliner Math. Ges., 2:9–20, 1923.

[15] M. Takesaki. Theory of operator algebras. I, volume 124 of Encyclopaedia of


Mathematical Sciences. Springer-Verlag, Berlin, 2002. Reprint of the first (1979)
edition, Operator Algebras and Non-commutative Geometry, 5.

