Probability and Geometry on Groups Lecture notes for a graduate course

Gábor Pete
Technical University of Budapest
http://www.math.bme.hu/~gabor
April 9, 2013. Work in progress. Comments are welcome.

Abstract. These notes have grown (and are still growing) out of two graduate courses I gave at the University of Toronto. The main goal is to give a self-contained introduction to several interrelated topics of current research interest: the connections between 1) coarse geometric properties of Cayley graphs of infinite groups; 2) the algebraic properties of these groups; and 3) the behaviour of probabilistic processes (most importantly, random walks, harmonic functions, and percolation) on these Cayley graphs. I try to be as little abstract as possible, emphasizing examples rather than presenting theorems in their most general forms. I also try to provide guidance to recent research literature. In particular, there are presently over 150 exercises and many open problems that might be accessible to PhD students. It is also hoped that researchers working either in probability or in geometric group theory will find these notes useful for entering the other field.

Contents

Preface
1 Basic examples of random walks
  1.1 Z^d and T_d, recurrence and transience, Green's function and spectral radius
  1.2 A probability proposition: Azuma-Hoeffding
2 Free groups and presentations
  2.1 Introduction
  2.2 Digression to topology: the fundamental group and covering spaces
  2.3 The main results on free groups
  2.4 Presentations and Cayley graphs
3 The asymptotic geometry of groups
  3.1 Quasi-isometries. Ends. The fundamental observation of geometric group theory
  3.2 Gromov-hyperbolic spaces and groups
  3.3 Asymptotic cones
4 Nilpotent and solvable groups
  4.1 The basics
  4.2 Semidirect products
  4.3 The volume growth of nilpotent and solvable groups
  4.4 Expanding maps. Polynomial and intermediate volume growth
5 Isoperimetric inequalities
  5.1 Basic definitions and examples
  5.2 Amenability, invariant means, wobbling paradoxical decompositions
  5.3 Isoperimetry in Z^d
  5.4 From growth to isoperimetry in groups
6 Random walks, discrete potential theory, martingales
  6.1 Markov chains, electric networks and the discrete Laplacian
  6.2 Dirichlet energy and transience
  6.3 Martingales
7 Cheeger constant and spectral gap
  7.1 Spectral radius and the Markov operator norm
  7.2 The infinite case: the Kesten-Cheeger-Dodziuk-Mohar theorem
  7.3 The finite case: expanders and mixing
  7.4 From infinite to finite: Kazhdan groups and expanders
8 Isoperimetric inequalities and return probabilities in general
  8.1 Poincaré and Nash inequalities
  8.2 Evolving sets: the results
  8.3 Evolving sets: the proof
9 Speed, entropy, Liouville property, Poisson boundary
  9.1 Speed of random walks
  9.2 The Liouville property for harmonic functions
  9.3 Entropy, and the main equivalence theorem
  9.4 Liouville and Poisson
  9.5 The Poisson boundary and entropy. The importance of group-invariance
  9.6 Unbounded measures
10 Growth of groups, of harmonic functions and of random walks
  10.1 A proof of Gromov's theorem
  10.2 Random walks on groups are at least diffusive
11 Harmonic Dirichlet functions and Uniform Spanning Forests
  11.1 Harmonic Dirichlet functions
  11.2 Loop-erased random walk, uniform spanning trees and forests
12 Percolation theory
  12.1 Percolation on infinite groups: p_c, p_u, unimodularity, and general invariant percolations
  12.2 Percolation on finite graphs. Threshold phenomena
  12.3 Critical percolation: the plane, scaling limits, critical exponents, mean field theory
  12.4 Geometry and random walks on percolation clusters
13 Further spatial models
  13.1 Ising, Potts, and the FK random cluster models
  13.2 Bootstrap percolation and zero temperature Glauber dynamics
  13.3 Minimal Spanning Forests
  13.4 Measurable group theory and orbit equivalence
14 Local approximations to Cayley graphs
  14.1 Unimodularity and soficity
  14.2 Spectral measures and other probabilistic questions
15 Some more exotic groups
  15.1 Self-similar groups of finite automata
  15.2 Constructing monsters using hyperbolicity
  15.3 Thompson's group F
16 Quasi-isometric rigidity and embeddings
References

Preface
These notes have grown (and are still growing) out of two graduate courses I gave at the University of Toronto: Probability and Geometry on Groups in the Fall of 2009, and Percolation in the plane, on Z^d, and beyond in the Spring of 2011. I am still adding material and polishing the existing parts, so at the end I expect it to be enough for two semesters, or even more. Large portions of the first drafts were written up by the nine students who took the first course for credit: Eric Hart, Siyu Liu, Kostya Matveev, Jim McGarva, Ben Rifkind, Andrew Stewart, Kyle Thompson, Lluís Vena, and Jeremy Voltz; I am very grateful to them. That first course was completely introductory: some students had not really seen probability before this, and only a few had seen geometric group theory. Here is the course description:
Probability is one of the fastest developing areas of mathematics today, finding new connections to other branches constantly. One example is the rich interplay between large-scale geometric properties of a space and the behaviour of stochastic processes (like random walks and percolation) on the space. The obvious best source of discrete metric spaces are the Cayley graphs of finitely generated groups, especially since their large-scale geometric (and hence, probabilistic) properties reflect the algebraic properties. A famous example is the construction of expander graphs using group representations; another one is Gromov's theorem on the equivalence between a group being almost nilpotent and the polynomial volume growth of its Cayley graphs. The course will contain a large variety of interrelated topics in this area, with an emphasis on open problems.

What I had originally planned to cover turned out to be ridiculously much, so a lot had to be dropped, which is also visible in the present state of these notes. The main topics that are still missing are Gromov-hyperbolic groups and their applications to the construction of interesting groups, metric embeddings of groups in Hilbert spaces, more on the construction and applications of expander graphs, more on critical spatial processes in the plane and their scaling limits, and a more thorough study of Uniform Spanning Forests and ℓ^2-Betti numbers; I am planning to improve the notes regarding these issues soon. Besides research papers I like, my primary sources were [DrK09], [dlHar00] for geometric group theory and [LyPer10], [Per04], [Woe00] for probability. I did not use more of [HooLW06], [Lub94], [Wil09] only because of the time constraints. There are proofs or even sections that follow rather closely one of these books, but there are always differences in the details, and the devil might be in those. Also, since I was a graduate student of Yuval Peres not too long ago, several parts of these notes are strongly influenced by his lectures. In particular, Chapter 9 contains paragraphs that are almost just copied from some unpublished notes of his that I was once editing. There is one more recent book, [Gri10], whose first few chapters have considerable overlap with the more introductory parts of these notes, although I did not look at that book before having finished most of these notes. Anyway, the group theoretical point of view is missing from that book entirely.

With all these books available, what is the point in writing these notes? An obvious reason is that it is rather uncomfortable for the students to go to several different books and start reading them somewhere from their middle. Moreover, these books are usually for a bit more specialized

audience, so either nilpotent groups or martingales are not explained carefully. So, I wanted to add my favourite explanations and examples to everything, and include proofs I have not seen elsewhere in the literature. And there was a very important goal I had: presenting the material in constant conversation between the probabilistic and geometric group theoretical ideas. I hope this will help not only students, but also researchers from either field get interested and enter the other territory.

There are presently over 150 exercises, in several categories of difficulty: the ones without any stars should be doable by everyone who follows the notes, though they are often not quite trivial; * means it is a challenge for the reader; ** means that I think I would be able to do it, but it would be a challenge for me; *** means it is an open problem. Part of the grading scheme was to submit exercise solutions worth 8 points, where each exercise was worth 2^{# of stars}. There are also conjectures and questions in the notes; the difference compared to the *** exercises is that, according to my knowledge or feeling, the *** exercises have not been worked on yet thoroughly enough, so I want to encourage the reader to try and attack them. Of course, this does not necessarily mean that all conjectures are hard, nor that any of the *** exercises are doable. But, e.g., for a PhD topic, I would personally suggest starting with the *** exercises.

Besides my students and the books mentioned above, I am grateful to Alex Bloemendal, Damien Gaboriau, Gady Kozma, Russ Lyons, Péter Mester, Yuval Peres, Mark Sapir, Andreas Thom, Ádám Timár, Todor Tsankov and Bálint Virág for conversations and comments.

1 Basic examples of random walks

1.1 Z^d and T_d, recurrence and transience, Green's function and spectral radius

In simple random walk on a connected, bounded degree infinite graph G = (V, E), we take a starting vertex, and then each step of the walk is taken to one of the vertices directly adjacent to the one currently occupied, uniformly at random, independently of all previous steps. Denote the positions in this walk by the sequence X_0, X_1, ..., each X_i ∈ V(G).

Definition 1.1. A random walk on a graph is called recurrent if the starting vertex is visited infinitely often with probability one. That is,

P_o[ X_n = o infinitely often ] = P[ X_n = o infinitely often | X_0 = o ] = 1.

Otherwise, the walk is called transient.

The result that started the area of random walks on groups is the following:
Theorem 1.1 (Pólya 1920). Simple random walk on Z^d is recurrent for d = 1, 2 and transient for d ≥ 3. In fact, the so-called on-diagonal heat-kernel decay is

p_n(o, o) = P_o[ X_n = o ] ≍ C_d n^{-d/2}.

For the proof, we will need the following two lemmas. The proof of the first one is trivial from Stirling's formula; the proof of the second one will be discussed later.

Lemma 1.2. For a 1-dimensional simple random walk Y_0, Y_1, ..., for even n,

P[ Y_n = 0 ] = binom(n, n/2) 2^{-n} ≍ n^{-1/2}.


Lemma 1.3. The following estimate holds for the d-dimensional lattice:

P[ # of steps among the first n that are in the ith coordinate ∉ [ n/(2d), 3n/(2d) ] ] < C exp(-c_d n).

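As a quick numerical sanity check of Lemma 1.2 (this snippet is mine, not part of the notes, and the function name p_return is made up), the exact binomial probability can be compared with its Stirling asymptotics: for 2n steps, P[ Y_{2n} = 0 ] = binom(2n, n) 2^{-2n} ~ 1/sqrt(pi n).

```python
import math

def p_return(steps):
    """Exact P[Y_steps = 0] for simple random walk on Z (0 for odd step counts)."""
    if steps % 2:
        return 0.0
    n = steps // 2
    return math.comb(2 * n, n) / 4**n

for n in [10, 100, 1000]:
    exact = p_return(2 * n)
    asym = 1 / math.sqrt(math.pi * n)
    print(n, exact, asym, exact / asym)   # the ratio tends to 1
```

The ratio converging to 1 is exactly the n^{-1/2} decay that makes the d-dimensional return probabilities behave like n^{-d/2} in the proof below.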
Proof of Theorem 1.1. Let o = 0 ∈ Z^d denote the origin, X_n = (X_n^1, ..., X_n^d) the walk, and n = (n_1, ..., n_d) a multi-index, with |n| := n_1 + ... + n_d. Furthermore, let S^i = S^i(X_1, ..., X_n) denote the number of steps taken in the ith coordinate, i = 1, ..., d. Then

P_o[ X_n = o ] = P_o[ X_n^i = 0 ∀i ] = Σ_{|n|=n} P_o[ Y_{n_i}^i = 0 ∀i | S^i = n_i ∀i ] P[ S^i = n_i ∀i ],

which we got by using the Law of Total Probability and Bayes' Theorem, and where the sequences Y_{n_i}^i are the components of the sequence X_0, X_1, ... with the null moves deleted. That is, if in the random walk we move from one node to one of the two adjacent nodes in the ith coordinate, then Y^i changes by one correspondingly, and n_i increases by one.

Using the independence of the steps taken and Lemma 1.2, the last formula becomes

P_o[ X_n = o ] ≍ Σ_{|n|=n} (n_1 ⋯ n_d)^{-1/2} P[ S^i = n_i ∀i ]
  = ε(n) Σ_{n_i ∉ [n/(2d), 3n/(2d)]} P[ S^i = n_i ∀i ] + C_d n^{-d/2} Σ_{n_i ∈ [n/(2d), 3n/(2d)]} P[ S^i = n_i ∀i ],   (1.1)

where ε(n) ∈ [0, 1]. Now, by Lemma 1.3,

Σ_{n_i ∉ [n/(2d), 3n/(2d)]} P[ S^i = n_i ∀i ] ≤ C_d exp(-c_d n),

hence the first term in (1.1) is exponentially small, while the second term is polynomially large, so we get

P_o[ X_n = o ] ≍ C_d n^{-d/2},   (1.2)

as claimed. Now notice that

E_o[ # of visits to o ] = Σ_{n=0}^∞ p_n(o, o).

Thus, from (1.2) we get for d ≥ 3 that E_o[ # of visits to o ] < ∞, hence P_o[ # of visits to o is ∞ ] = 0, so the random walk is transient.

Also, for d = 1, 2 we get that E_o[ # of visits to o ] = ∞. However, this is not yet quite enough to claim that the random walk is recurrent. Here is one way to finish the proof; it may be somewhat too fancy, but the techniques introduced will also be used later.

Definition 1.2. For simple random walk on a graph G = (V, E), let Green's function be defined as

G(x, y | z) = Σ_{n=0}^∞ p_n(x, y) z^n,   x, y ∈ V(G) and z ∈ C.

In particular, G(x, y | 1) = E_x[ # of visits to y ]. Let us also define

U(x, y | z) = Σ_{n=1}^∞ P_x[ τ_y = n ] z^n,

where τ_y is the first positive time hitting y, so that P_x[ τ_x = 0 ] = 0.

Now, we have p_n(x, x) = Σ_{k=1}^n P_x[ τ_x = k ] p_{n-k}(x, x) for n ≥ 1, from which we get

Σ_{n≥1} p_n(x, x) z^n = Σ_{n≥1} Σ_{k=1}^n P_x[ τ_x = k ] z^k p_{n-k}(x, x) z^{n-k} = U(x, x | z) G(x, x | z),

G(x, x | z) - 1 = U(x, x | z) G(x, x | z),

G(x, x | z) = 1 / (1 - U(x, x | z)).   (1.3)

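Identity (1.3) is easy to check numerically with truncated power series for SRW on Z (a sketch of mine, not the notes': the array p holds the binomial return probabilities of Lemma 1.2, and f recovers the first-return probabilities from the same convolution).

```python
import math

N = 200
p = [0.0] * (N + 1)               # p[n] = p_n(0,0) for SRW on Z
for n in range(0, N + 1, 2):
    p[n] = math.comb(n, n // 2) / 2**n

# First-return probabilities via p_n(x,x) = sum_{k=1}^n P[tau = k] p_{n-k}(x,x):
f = [0.0] * (N + 1)
for n in range(1, N + 1):
    f[n] = p[n] - sum(f[k] * p[n - k] for k in range(1, n))

z = 0.9
G = sum(p[n] * z**n for n in range(N + 1))
U = sum(f[n] * z**n for n in range(1, N + 1))
print(G, 1 / (1 - U))             # the two numbers nearly agree, as in (1.3)
```

On Z there is also the classical closed form G(0, 0 | z) = 1/sqrt(1 - z^2), which the truncated series reproduces to high accuracy at z = 0.9.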
As we proved above, for d = 1, 2 we have E_x[ # of visits to x ] = ∞, hence G(x, x | 1) = ∞. Thus (1.3) says that U(x, x | 1) = 1, which means that P_x[ τ_x < ∞ ] = 1, which clearly implies recurrence: if we come back once almost surely, then we will come back again and again. (To be rigorous, we need the so-called Strong Markov property for simple random walk, which is intuitively obvious.)

Instead of using Green's function, here is a simpler proof of recurrence (though the real math content is probably the same). Assume P_x[ τ_x < ∞ ] = q < 1. Then P[ # of returns = k ] = q^k (1 - q) for k = 0, 1, 2, ..., a geometric random variable. Therefore, E[ # of returns ] = q/(1 - q) < ∞. Thus, the infinite expectation we had for d = 1, 2 implies recurrence.

Once we have encoded our sequence p_n(x, y) into a power series, it is probably worth looking at the radius of convergence of this G(x, y | z), denoted by rad(x, y). By the Cauchy-Hadamard criterion,

rad(x, y) = 1 / limsup_n p_n(x, y)^{1/n} ≥ 1.
A useful classical theorem is the following:

Theorem 1.4 (Pringsheim's theorem). If f(z) = Σ_n a_n z^n with a_n ≥ 0, then the radius rad(f) of convergence is the smallest positive singularity of f(z).

Exercise 1.1. Prove that for simple random walk on a connected graph, for real z > 0, we have G(x, y | z) < ∞ if and only if G(v, w | z) < ∞. Therefore, rad(x, y) is independent of x, y.

By this exercise, we can define

ρ := 1/rad(x, y) = limsup_n p_n(x, x)^{1/n},   (1.4)

which is independent of x, y, and is called the spectral radius of the simple random walk on the graph. We will see later in the course the reason for this name.

Now, an obvious fundamental question is when ρ is smaller than 1. I.e., on what graphs are the return probabilities exponentially small? We have seen that they are polynomial on Z^d. The walk has a very different behaviour on regular trees. The simplest difference is the rate of escape: in Z, or any Z^d, E[ dist(X_n, X_0) ] ≍ sqrt(n), a homework from the Central Limit Theorem.

For the k-regular tree T_k, k ≥ 3, let dist(X_n, X_0) = D_n. Then

P[ D_{n+1} = D_n + 1 | D_n ≠ 0 ] = (k-1)/k  and  P[ D_{n+1} = D_n - 1 | D_n ≠ 0 ] = 1/k,

hence E[ D_{n+1} - D_n | D_n ≠ 0 ] = (k-2)/k. On the other hand, E[ D_{n+1} - D_n | D_n = 0 ] = 1. Altogether,

E[ D_n ] ≥ ((k-2)/k) n.

So, the random walk escapes much faster on T_k than on any Z^d. This big difference will also be visible in the return probabilities.

Theorem 1.5. The spectral radius of the k-regular tree T_k is ρ(T_k) = 2 sqrt(k-1) / k.

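The drift computation above is easy to confirm by simulation (a sketch of mine, not from the notes): since only the distance from the starting point matters, it suffices to run the biased walk D_n on Z, reflected at 0, and check that E[D_n]/n approaches (k-2)/k.

```python
import random

def mean_distance(k, n, trials, seed=0):
    """Monte Carlo average of dist(X_0, X_n) on the k-regular tree, via the
    distance chain: up with prob (k-1)/k, down with prob 1/k, reflected at 0."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        d = 0
        for _ in range(n):
            if d == 0 or rng.random() < (k - 1) / k:
                d += 1
            else:
                d -= 1
        total += d
    return total / trials

k, n = 4, 2000
print(mean_distance(k, n, 500) / n)   # close to (k - 2)/k = 0.5
```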
We first give a proof using generating functions, then a completely probabilistic proof.

Proof. With the generating function of Definition 1.2, consider U(z) = U(x, y | z), where x is a neighbour of y (which we will often write as x ∼ y). By taking a step on T_k, we see that either we hit y, with probability 1/k, or we move to another neighbour of x, with probability (k-1)/k, in which case, in order to hit y, we have to first return to x and then hit y. So, by the symmetries of T_k,

U(z) = (1/k) z + ((k-1)/k) z U(z)^2,

which gives

U(z) = ( k ± sqrt( k^2 - 4(k-1) z^2 ) ) / ( 2(k-1) z ).

Note that if 0 ≤ z ≤ 1, then

U(z) = P_x[ ever reaching y if we die at each step with probability 1 - z ].

So, in particular, U(0) is 0, and we get

U(z) = ( k - sqrt( k^2 - 4(k-1) z^2 ) ) / ( 2(k-1) z ).

Then,

U(x, x | z) = z U(x, y | z) = ( k - sqrt( k^2 - 4(k-1) z^2 ) ) / ( 2(k-1) ),

and thus

G(x, x | z) = 1 / (1 - U(x, x | z)) = 2(k-1) / ( k - 2 + sqrt( k^2 - 4(k-1) z^2 ) ).   (1.5)

The smallest positive singularity is z = k / ( 2 sqrt(k-1) ), so Pringsheim's Theorem 1.4 gives ρ(T_k) = 2 sqrt(k-1) / k.
Exercise 1.2. Compute ρ(T_{k,l}), where T_{k,l} is a tree such that if v_n ∈ T_{k,l} is a vertex at distance n from the root, then deg v_n = k for n even and deg v_n = l for n odd.

The next three exercises provide a probabilistic proof of Theorem 1.5, in a more precise form. (But note that the correction factor n^{-3/2} of Exercise 1.5 can also be obtained by analyzing the singularities of the generating functions.) This probabilistic strategy might be known to a lot of people, but I do not know any reference.

Exercise 1.3. Show that for biased SRW on Z, i.e., P[ X_{n+1} = j + 1 | X_n = j ] = p = 1 - P[ X_{n+1} = j - 1 | X_n = j ],

P_0[ X_i > 0 for 0 < i < n | X_n = 0 ] ≍ 1/n,

with constants independent of p ∈ (0, 1). (Hint: first show, using the reflection principle [Dur96, Section 3.3], that for symmetric simple random walk, P_0[ X_i > 0 for all 0 < i ≤ 2n ] ≍ P_0[ X_{2n} = 0 ].)

Exercise 1.4. * For biased SRW on Z, show that there is a subexponentially growing function g(m) = exp(o(m)) such that

P_0[ #{ i : X_i = 0 for 0 < i < n } = m | X_n = 0 ] ≤ g(m) / n.

(Hint: count all possible m-element zero sets, together with a good bound on the occurrence of each.)

Exercise 1.5. Note that for SRW on the k-regular tree T_k, the distance process D_n = dist(X_0, X_n) is a biased SRW on Z reflected at 0.

(a) Using this and Exercise 1.3, prove that the return probabilities on T_k satisfy

c_1 n^{-3/2} ρ^n ≤ p_n(x, x) ≤ c_2 n^{-1/2} ρ^n,

for some constants c_i depending only on k, where ρ = ρ(T_k) is given by Theorem 1.5.

(b) Using Exercise 1.4, improve this to p_n(x, x) ≍ n^{-3/2} ρ^n, with constants depending only on k.

1.2 A probability proposition: Azuma-Hoeffding

We discuss now a result needed for the random walk estimates in the previous section.

Proposition 1.6 (Azuma-Hoeffding). Let X_1, X_2, ... be random variables satisfying the following criteria: E[ X_i ] = 0 for all i; more generally, E[ X_{i_1} ⋯ X_{i_k} ] = 0 for any k ∈ Z_+ and i_1 < i_2 < ... < i_k; and ‖X_i‖_∞ < ∞ for all i. Then, for any L > 0,

P[ X_1 + ... + X_n > L ] ≤ exp( -L^2 / ( 2 Σ_{i=1}^n ‖X_i‖_∞^2 ) ).

Proof. Define S_n := X_1 + ... + X_n. Choose any t > 0. We have

P[ S_n > L ] ≤ P[ e^{t S_n} > e^{t L} ] ≤ e^{-t L} E[ e^{t S_n} ]   (by Markov's inequality).

By the convexity of e^x, for |x| ≤ a we have e^{t x} ≤ cosh(a t) + (x/a) sinh(a t). If we set a_i := ‖X_i‖_∞, we get

E[ e^{t X_i} ] ≤ cosh(a_i t) + E[ X_i / a_i ] sinh(a_i t) = cosh(a_i t);

moreover, expanding the product Π_{i=1}^n ( cosh(a_i t) + (X_i / a_i) sinh(a_i t) ) and using that all the mixed moments E[ X_{i_1} ⋯ X_{i_k} ] vanish, we get E[ e^{t S_n} ] ≤ Π_{i=1}^n cosh(a_i t). Since

cosh(a_i t) = Σ_{k=0}^∞ (a_i t)^{2k} / (2k)! ≤ Σ_{k=0}^∞ (a_i t)^{2k} / (2^k k!) = e^{a_i^2 t^2 / 2},

we see that

P[ S_n > L ] ≤ e^{-t L} e^{t^2 Σ_{i=1}^n ‖X_i‖_∞^2 / 2}.

We optimize the bound by setting t = L / Σ_{i=1}^n ‖X_i‖_∞^2, proving the proposition.

For an i.i.d. sequence with E[ X_i ] = μ and ‖X_i - μ‖_∞ ≤ a, for any λ > μ we get

P[ S_n > λ n ] ≤ exp( -(λ - μ)^2 n / (2 a^2) ),

which immediately implies Lemma 1.3, where we have a = 1 and μ = 1/d. For an i.i.d. Bernoulli sequence, X_i ∼ Ber(p), this exponential bound is among the most basic tools in discrete probability, usually called Chernoff's inequality; see, e.g., [AloS00, Appendix]. In fact, for general i.i.d., not even necessarily bounded, sequences, for any λ > μ = E[ X_i ], the limit

lim_{n→∞} -(1/n) log P[ S_n > λ n ] = I(λ)

always exists. The function I(λ) is called the large deviation rate function (associated with the distribution of X_i). Moreover, in order to ensure I(λ) > 0, instead of boundedness it is enough that the moment generating function E[ e^{t X} ] is finite for some t > 0. The rate function can be computed in terms of the moment generating function, and, e.g., for X_i ∼ Ber(p) it is

I(λ) = λ log( λ/p ) + (1 - λ) log( (1 - λ)/(1 - p) ).   (1.6)

See, e.g., [Dur96, Section 1.9] and [Bil86, Section 1.9] for these results.

The main point of Proposition 1.6 is its generality: the uncorrelatedness conditions for the X_i's, besides i.i.d. sequences, are also fulfilled by martingale differences X_i = M_{i+1} - M_i, where {M_i}_{i=0}^∞ is a martingale sequence. See Section 6.3 for the definition and examples.
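To illustrate the rate function (1.6) concretely (a sketch of mine, not part of the notes), one can compute the exact binomial tail P[ S_n > λn ] and watch -(1/n) log P[ S_n > λn ] approach I(λ):

```python
import math

def rate(lam, p):
    """The Bernoulli large deviation rate function I(lambda) of (1.6)."""
    return lam * math.log(lam / p) + (1 - lam) * math.log((1 - lam) / (1 - p))

def exact_tail(n, p, lam):
    """P[S_n > lam*n] for S_n ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(n + 1) if j > lam * n)

p, lam = 0.5, 0.7
for n in [50, 200, 800]:
    print(n, -math.log(exact_tail(n, p, lam)) / n, rate(lam, p))
```

The empirical exponent exceeds I(λ) by a polynomial correction of order (log n)/n, matching the fact that (1.6) captures only the exponential order of decay.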


2 Free groups and presentations

2.1 Introduction
Definition 2.1 (F_k, the free group on k generators). Let a_1, a_2, ..., a_k, a_1^{-1}, a_2^{-1}, ..., a_k^{-1} be symbols. The elements of the free group generated by {a_1, ..., a_k} are the finite reduced words: remove any a_i a_i^{-1} or a_i^{-1} a_i, and group multiplication is concatenation (then reduction, if needed).

Lemma 2.1. Every word has a unique reduced form. (Proof by induction.)

Proposition 2.2. If S is a set and F_S is the free group generated by S, and Γ is any group, then for any map f : S → Γ there is a group homomorphism f̂ : F_S → Γ extending f.

Proof. Define f̂( s_{i_1}^{ε_1} ⋯ s_{i_k}^{ε_k} ) = f(s_{i_1})^{ε_1} ⋯ f(s_{i_k})^{ε_k}, then check that this is a homomorphism.

Corollary 2.3. Every group is a quotient of a free group.

Proof. Lazy solution: take S = Γ; then there is an onto map F_S → Γ. A less lazy solution is to take a generating set, Γ = ⟨S⟩. Then, by the proposition, there is a surjective map f̂ : F_S → Γ with ker(f̂) ⊲ F_S, hence F_S / ker(f̂) ≅ Γ.

Definition 2.2. Given a generating set S and a set of relations R of elements of S, a presentation of Γ is given by S mod the relations in R. This is written Γ = ⟨ S | R ⟩. More formally, Γ ≅ F_S / ⟨⟨R⟩⟩, where R ⊆ F_S and ⟨⟨R⟩⟩ is the smallest normal subgroup containing R. A group is called finitely presented if both R and S can be chosen to be finite sets.

Example: Consider the group Γ = ⟨ a_1, ..., a_d | [a_i, a_j] ∀i, j ⟩, where [a_i, a_j] = a_i a_j a_i^{-1} a_j^{-1}. We wish to show that this is isomorphic to the group Z^d. It is clear that Γ is commutative: if we have a_i a_j in a word, we can insert the commutator [a_j, a_i] = a_j a_i a_j^{-1} a_i^{-1} in front of it to reverse the order, so every word in Γ can be written in the form

v = a_1^{n_1} a_2^{n_2} ⋯ a_d^{n_d}.

Define φ : Γ → Z^d by φ(v) = (n_1, n_2, ..., n_d). It is now obvious that φ is an isomorphism.

Definition 2.3. Let Γ be a finitely generated group, ⟨S⟩ = Γ. Then the right Cayley graph G(Γ, S) is the graph with vertex set V(G) = Γ and edge set E(G) = { (g, gs) : g ∈ Γ, s ∈ S }. The left Cayley graph is defined using left multiplications by the generators. These graphs are often considered to have directed edges, labeled with the generators, and then they are sometimes called Cayley diagrams. However, if S is symmetric (s ∈ S implies s^{-1} ∈ S), then G is naturally undirected and |S|-regular (even if S has order 2 elements).
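The reduced-word arithmetic of Definition 2.1 can be sketched in a few lines (an illustration of mine, not from the notes): letters are encoded as nonzero integers, with i and -i standing for a_i and a_i^{-1}, and multiplication is concatenation followed by cancellation of adjacent inverse pairs.

```python
def reduce_word(w):
    """Greedy stack reduction: cancels every adjacent a_i a_i^{-1} pair."""
    out = []
    for x in w:
        if out and out[-1] == -x:
            out.pop()             # cancel a_i a_i^{-1} or a_i^{-1} a_i
        else:
            out.append(x)
    return out

def multiply(u, v):
    return reduce_word(u + v)     # concatenation, then reduction

def inverse(w):
    return [-x for x in reversed(w)]

w = [1, 2, -1, 1, 1]              # the word a b a^{-1} a a
assert reduce_word(w) == [1, 2, 1]            # reduces to a b a
assert multiply(w, inverse(w)) == []          # w * w^{-1} is the empty word
```

Since the reduced form is unique (Lemma 2.1), comparing reduced lists decides equality of elements of F_k.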


Example: The Z^d lattice is the Cayley graph of Z^d with the 2d generators { e_i, -e_i }_{i=1}^d.

Example: The Cayley graph of F_2 with generators { a, b, a^{-1}, b^{-1} } is a 4-regular tree, see Figure 2.1.

Let Γ be a group with right Cayley graph G = G(Γ, S). Then Γ acts on G by multiplication from the left as follows: for h ∈ Γ, an element g ∈ V(G) maps to hg, and an element (g, gs) ∈ E(G) maps to (hg, hgs). This shows that every Cayley graph is transitive.
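As a concrete instance of Definition 2.3 (a sketch of mine, not from the notes), here is the right Cayley graph of the symmetric group S_3 with the symmetric generating set of all three transpositions, built by breadth-first search from the identity; permutations are tuples p with p[i] the image of i.

```python
from collections import deque

def compose(p, q):
    """(p*q)[i] = p[q[i]]: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

gens = [(1, 0, 2), (2, 1, 0), (0, 2, 1)]   # the transpositions (01), (02), (12)
identity = (0, 1, 2)

vertices, edges = {identity}, set()
queue = deque([identity])
while queue:
    g = queue.popleft()
    for s in gens:
        h = compose(g, s)                  # right multiplication g -> gs
        edges.add(frozenset({g, h}))
        if h not in vertices:
            vertices.add(h)
            queue.append(h)

print(len(vertices), len(edges))           # 6 vertices and 9 edges
```

The result is the complete bipartite graph K_{3,3}: 6 vertices, 3-regular, with every edge joining an even permutation to an odd one.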

Figure 2.1: The Cayley graph of F_2.

Finite presentedness has important consequences for the geometry of the Cayley graphs, see Subsections 2.4 and 3.1, and the discussion around Proposition 12.5. Classical examples of finitely generated non-finitely presented groups are the lamplighter groups Z_2 ≀ Z^d, which will be defined in Section 5.1, and studied in Chapter 9 from a random walk point of view. The non-finitely presentedness of Z_2 ≀ Z is proved in Exercise 12.3.

We have seen two effects of T_k being much more spacious than Z^d on the behaviour of simple random walk: the escape speed is much larger, and the return probabilities are much smaller. It looks intuitively clear that Z and T_k should be the extremes, and that there should be a large variety of possible behaviours in between. It is indeed relatively easy to show that T_k is one extreme among 2k-regular Cayley graphs, but, from the other end, it is only a recent theorem that the expected rate of escape is at least E[ dist(X_0, X_n) ] ≥ c sqrt(n) on any group: see Section 10.2. One reason for this not being obvious is that not every infinite group contains Z as a subgroup: there are finitely generated infinite groups with a finite exponent n, i.e., the nth power of any element is the identity. Groups with such strange properties are called Tarski monsters. But even on groups with a Z subgroup, there does not seem to be an easy proof.

We will also see that constructing groups with intermediate behaviours is not always easy. One reason for this is that the only general way that we have seen so far to construct groups is via presentations, but there are famous undecidability results here: there is no general algorithm to decide whether a word can be reduced in a given presentation to the empty word, and there is no general algorithm to decide if a group given by a presentation is isomorphic to another group, even to the trivial group. So, we will need other means of constructing groups.

2.2 Digression to topology: the fundamental group and covering spaces

Several results on free groups and presentations become much simpler in a topological language. The present section discusses the necessary background.

We will need the concept of a CW-complex. The simplest way to define an n-dimensional CW-complex is to do it recursively: a 0-complex is a countable (finite or infinite) union of points, with the discrete topology. To get an n-complex, we can glue n-cells to an (n-1)-complex, i.e., we add homeomorphic images of the n-balls such that each boundary is mapped continuously onto a union of (n-1)-cells. We will always assume that our topological spaces are connected CW-complexes.

Consider a space X and a fixed point x ∈ X. Consider two loops γ : [0, 1] → X and δ : [0, 1] → X starting at x, i.e., γ(0) = γ(1) = x = δ(0) = δ(1). We say that γ and δ are homotopic, denoted by γ ∼ δ, if there is a continuous function f : [0, 1] × [0, 1] → X satisfying f(t, 0) = γ(t), f(t, 1) = δ(t), and f(0, s) = f(1, s) = x for all s ∈ [0, 1].

We are ready to define the fundamental group of X. Let π_1(X, x) be the set of equivalence classes of loops starting and ending at x. The group operation on π_1 is induced by concatenation of paths:

(γδ)(t) = γ(2t) for t ∈ [0, 1/2],  and  (γδ)(t) = δ(2t - 1) for t ∈ [1/2, 1].

While it seems from the definition that the fundamental group would depend on the point x, this is not true. To find an isomorphism between π_1(X, x) and π_1(X, y), map any loop γ starting at x to a path from y to x, concatenated with γ and the same path from x back to y.

The spaces X and Y are homotopy equivalent, denoted by X ≃ Y, if there exist continuous functions f : X → Y and g : Y → X such that f ∘ g ∼ id_Y and g ∘ f ∼ id_X. A basic result (with a simple proof that we omit) is the following:

Theorem 2.4. If X ≃ Y then π_1(X) ≅ π_1(Y).

Theorem 2.5. Consider the CW-complex with a single point and k loops from this point. Denote this CW-complex by Rose_k, a rose with k petals. Then π_1(Rose_k) ≅ F_k.

The proof of this theorem uses the Seifert-van Kampen Theorem 3.1. We do not discuss this here, but we believe the statement is intuitively obvious.

Corollary 2.6. The fundamental group of any (connected) graph is free.

Proof. For any finite connected graph with n vertices and l edges, consider a spanning tree T. Then T has n - 1 edges. Contract T to a point x. There are k = l - n + 1 edges left over, and after the contraction each begins and ends at x. Contraction of a spanning tree to a point is a homotopy equivalence, so the graph is homotopy equivalent to Rose_k. Hence, the fundamental group is free by Theorems 2.4 and 2.5.

Exercise 2.1. If Γ is a topological group then π_1(Γ) is commutative. (Recall that a group Γ is a topological group if it is also a topological space such that the functions μ : (x, y) ↦ xy and ι : x ↦ x^{-1} are continuous.)

We now introduce another basic notion of geometry:

Definition 2.4. We say that p : X̂ → X is a covering map if for every x ∈ X, there is an open neighbourhood U ∋ x such that each connected component of p^{-1}(U) is homeomorphic to U by p.

Proposition 2.7. Let γ ⊂ X be any path starting at x. Then for every x̂ ∈ p^{-1}(x) there is a unique γ̂ starting at x̂ with p(γ̂) = γ.

Sketch of proof. Because of the local homeomorphisms, there is always a unique way to continue the lifted path.

Theorem 2.8 (Monodromy Theorem). If γ and δ are homotopic with fixed endpoints x and y, then the lifts γ̂ and δ̂ starting from the same x̂ have the same endpoints and γ̂ ∼ δ̂.

Sketch of proof. The homotopies can be lifted through the local homeomorphisms.

The following results can be easily proved using Proposition 2.7 and Theorem 2.8:

Lemma 2.9. Any covering space of a graph is a graph.

Lemma 2.10. Let x ∈ X and U ∋ x a neighbourhood as in Definition 2.4. Then the number k of connected components of p^{-1}(U) is independent of x and U, and the covering is called k-fold. (In fact, the lifts of any path between x, y ∈ X give a pairing between the preimages of x and y.)

Let X be a topological space. We say that X̃ is a universal cover of X if: X̃ is a cover; X̃ is connected; and X̃ is simply connected, i.e., π_1(X̃) = 1. The existence of a universal cover is guaranteed by the following theorem:

Theorem 2.11. Every connected CW-complex X has a universal cover X̃.

Sketch of proof. Let the set of points of X̃ be the fixed-endpoint homotopy classes of paths starting from a fixed x ∈ X. The topology on X̃ is defined by thinking of a class of paths [γ] as being close to [δ] if there are representatives γ, δ such that δ is just γ concatenated with a short piece of path.

Exercise 2.2. Write down the above definition of the topology on X̃ properly, and prove that, with this topology, π_1(X̃) = 1.

The fundamental group π_1(X) acts on X̃ with continuous maps, as follows. Let γ ∈ π_1(X, x) be a loop, p the surjective map defined by the covering X̃, and x̂ ∈ p^{-1}(x) a point above x. Using Proposition 2.7 and Theorem 2.8, there exists a unique lifted curve γ̂ (not depending on the representative for γ) with starting point x̂ and some ending point x̂', which also belongs to p^{-1}(x). The action f_γ on x̂ is now defined by f_γ(x̂) = x̂'. As we mentioned above, this action does not depend on the representative for γ, and it is clear that f_γ ∘ f_δ = f_{γδ} for γ, δ ∈ π_1(X, x), so we indeed have an action of π_1(X, x) on the fibre p^{-1}(x).

We need to define the action also on any ŷ ∈ p^{-1}(y), y ∈ X. We take a path α̂ in X̃ from ŷ to x̂; then α = p(α̂) is a path from y to x, and α γ α^{-1} is a path from y to itself. Since π_1(X̃) = 1, all possible choices of α̂ are homotopic to each other, hence all resulting α and all α γ α^{-1} curves are homotopic. By Proposition 2.7 and Theorem 2.8, there is a unique lift of α γ α^{-1} starting from ŷ, and its endpoint ŷ' ∈ p^{-1}(y) does not depend on the choice of α̂. Hence we indeed get an action of π_1(X, x) on the entire X̃. This action permutes points inside each fibre, and it is easy to see that the action is free (i.e., only the identity has fixed points). If we take the quotient space of X̃ by this group action (i.e., each point of X̃ is mapped to its orbit under the group action), we obtain X, and the quotient map is exactly the covering map p:

X̃ / π_1(X, x) = X.   (2.1)

Example: The fundamental group of the torus T^2 is Z^2. We have R^2 / Z^2 = T^2, where R^2 is the usual covering made out of copies of the square [0, 1) × [0, 1).

For each subgroup H ≤ π_1(X), we can consider X_H = X̃ / H, which is still a covering space of X: there is a natural surjective map from X_H to X, by taking the π_1(X)-orbit containing any given H-orbit. On the other hand, X̃ is the universal cover of X_H, too, and we have π_1(X_H) ≅ H.

2.3 The main results on free groups

A probably unsurprising result is the following:

Theorem 2.12. F_k ≅ F_l if and only if k = l.

Exercise 2.3. Prove Theorem 2.12. (Hint 1: how many index 2 subgroups are there? Or, Hint 2: what is F_k/[F_k, F_k]?)

The next two results can also be proved with combinatorial arguments, but the topological language of the previous subsection makes things much more transparent.

Theorem 2.13 (Nielsen-Schreier). Every subgroup of a free group is free.


Theorem 2.14 (Schreier's index formula). If F_k is free and F_l ≤ F_k such that [F_k : F_l] = r < ∞, then l − 1 = (k − 1) r.

Proof of Theorem 2.13. Let the free group be F_k. By Theorem 2.5, we have F_k = π₁(Rose_k). Let G be the universal cover of Rose_k. By Lemma 2.9, it is a graph; in fact, it is the 2k-regular tree T_{2k}. Now take H ≤ F_k. As discussed at the end of the previous section, G_H = G/H is a covering of Rose_k and is covered by T_{2k}. Again by Lemma 2.9, it is a graph, and π₁(G_H) = H. By Corollary 2.6, the fundamental group of a graph is free, hence H is free, proving Theorem 2.13.


Proof of Theorem 2.14. The Euler characteristic of a graph is the difference between the number of vertices and the number of edges: χ(G) = |V(G)| − |E(G)|. Homotopies of graphs increase or decrease the vertex and the edge sets by the same amount, hence χ is a homotopy invariant of graphs. Furthermore, if G′ is an r-fold covering of G, then χ(G′) = r χ(G).

Since the graph Rose_k has one vertex and k edges, χ(Rose_k) = 1 − k. As the index of H = F_l in F_k is r, we see that T_{2k}/H is an r-fold covering of Rose_k. Thus, χ(T_{2k}/H) = r χ(Rose_k) = r(1 − k). On the other hand, T_{2k}/H is a graph with π₁(T_{2k}/H) = H = F_l. Since any graph is homotopic to a rose, and, by Theorems 2.5 and 2.12, different roses have non-isomorphic fundamental groups, π₁(Rose_l) = F_l, we get that T_{2k}/H must be homotopic to Rose_l, and thus χ(T_{2k}/H) = 1 − l. Therefore, we obtain r(1 − k) = 1 − l.

Notice that, if 2 ≤ r < ∞, then r(1 − k) ≠ 1 − k.

Exercise 2.4. Prove that F_k has F_k as a subgroup with infinite index.

Exercise 2.5.* A finitely generated group acts on a tree freely if and only if the group is free. (The action is by graph automorphisms of the tree T, and a free action means that Stab_Γ(x) = {1} for any x ∈ V(T) ∪ E(T).) Hint: separate the cases where there is an element of order 2 in Γ, and where there are no such elements (in which case there is a fixed vertex, as it turns out).

We will now show that a finitely generated free group is linear. For instance, F₂ ≤ SL₂(Z) (integer 2 × 2 matrices with determinant 1). Indeed:

Lemma 2.15 (Ping-Pong Lemma). Let Γ be a group acting on some set X. Let Γ₁, Γ₂ be subgroups of Γ, with |Γ₁| ≥ 3, and let Γ = ⟨Γ₁, Γ₂⟩. Assume that there exist non-empty sets X₁, X₂ ⊆ X with X₂ ⊄ X₁ and

    γ(X₂) ⊆ X₁ for all γ ∈ Γ₁ \ {1},    γ(X₁) ⊆ X₂ for all γ ∈ Γ₂ \ {1}.

Then Γ = Γ₁ ∗ Γ₂.

The product ∗ denotes the free product. If Γ₁ = ⟨S₁ | R₁⟩ and Γ₂ = ⟨S₂ | R₂⟩, then Γ₁ ∗ Γ₂ = ⟨S₁, S₂ | R₁, R₂⟩. In particular, F_k = Z ∗ Z ∗ ⋯ ∗ Z, k times.

Exercise 2.6. Prove the Ping-Pong Lemma.

Now let

    Γ₁ = { ( 1 2n ; 0 1 ) : n ∈ Z },    Γ₂ = { ( 1 0 ; 2n 1 ) : n ∈ Z },

acting on R² by matrix multiplication, with

    X₁ = { (x, y)ᵀ ∈ R² : |x| > |y| },    X₂ = { (x, y)ᵀ ∈ R² : |x| < |y| }.

Observe that the hypotheses of the Ping-Pong Lemma are fulfilled. Therefore, the matrices ( 1 2 ; 0 1 ) and ( 1 0 ; 2 1 ) generate a free group.

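As a quick numerical sanity check of this freeness (not a proof), one can multiply out random reduced words in the two matrices above — writing ( a b ; c d ) as [[a, b], [c, d]] — and observe that none of them gives the identity:

```python
import random

A = [[1, 2], [0, 1]]   # upper-triangular generator
B = [[1, 0], [2, 1]]   # lower-triangular generator

def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_inv(M):
    # both generators have determinant 1, so the inverse is the adjugate
    a, b = M[0]; c, d = M[1]
    return [[d, -b], [-c, a]]

letters = {'a': A, 'A': mat_inv(A), 'b': B, 'B': mat_inv(B)}

def random_reduced_word(n):
    # reduced: no letter immediately followed by its own inverse
    word = [random.choice('aAbB')]
    while len(word) < n:
        nxt = random.choice('aAbB')
        if nxt != word[-1].swapcase():
            word.append(nxt)
    return word

identity = [[1, 0], [0, 1]]
for _ in range(1000):
    w = random_reduced_word(random.randint(1, 10))
    M = identity
    for ch in w:
        M = mat_mul(M, letters[ch])
    assert M != identity, w   # in a free group, no reduced word is trivial
print("no nontrivial reduced word acted as the identity")
```

This only tests words up to length 10, of course; the Ping-Pong Lemma is what proves the statement for all words.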
Exercise 2.7. Show that the following properties of a group Γ are equivalent:
(i) For every γ ≠ 1 there is a homomorphism φ onto a finite group F such that φ(γ) ≠ 1.
(ii) The intersection of all its subgroups of finite index is trivial.
(iii) The intersection of all its normal subgroups of finite index is trivial.
Such groups are called residually finite. For instance, Zⁿ, SLₙ(Z) and GLₙ(Z) are residually finite, because of the natural (mod p) homomorphisms onto (Z/pZ)ⁿ, SLₙ(Z/pZ) and GLₙ(Z/pZ).

Exercise 2.8. Show that any subgroup of a residually finite group is also residually finite. Conclude that the free groups F_k are residually finite.

Exercise 2.9. (i) Show that if Γ is residually finite and φ: Γ → Γ is a surjective homomorphism, then it is a bijection. (Hint: assume that ker φ is non-trivial, show that the number of index k subgroups of Γ/ker φ is finite, then arrive at a contradiction. Alternatively, show that any homomorphism onto a finite group F must annihilate ker φ.)
(ii) Conclude that if k distinct elements generate the free group F_k, then they generate it freely.

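The (mod p) homomorphisms of Exercise 2.7 can be watched in action on the free subgroup of SL₂(Z) generated by the two matrices above: a nontrivial element — here the commutator of the generators — is already detected in the finite quotient SL₂(Z/3Z). A minimal sketch; the choice p = 3 is just an example:

```python
# Residual finiteness in action: a nontrivial element of the free group
# <A, B> inside SL2(Z) stays nontrivial in some finite quotient SL2(Z/pZ).
# Here we check the commutator [A, B] = A B A^-1 B^-1 modulo p = 3.

def mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(M):
    a, b = M[0]; c, d = M[1]
    return [[d, -b], [-c, a]]   # valid since det M = 1

A = [[1, 2], [0, 1]]
B = [[1, 0], [2, 1]]

comm = mul(mul(A, B), mul(inv(A), inv(B)))   # [A, B] in SL2(Z)
p = 3
comm_mod_p = [[x % p for x in row] for row in comm]

I = [[1, 0], [0, 1]]
assert comm != I          # nontrivial in SL2(Z) ...
assert comm_mod_p != I    # ... and already detected in SL2(Z/3Z)
print("[A, B] =", comm, " survives mod", p)
```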
2.4 Presentations and Cayley graphs

Fix some subset of (oriented) loops in a directed Cayley graph G(Γ, S) of a group Γ = ⟨S | R⟩, which will be called the basic loops. A geometric conjugate of a loop can be: rotate the loop, i.e., choose another starting point in the same loop; translate the loop in G by a group element. A combination of basic loops is a loop made of a sequence of loops (ℓᵢ), where the i-th is a geometric conjugate of a basic loop, and starts in a vertex contained in some ℓⱼ, j ∈ [1, i − 1]. To any loop ℓ = (g, gs₁, gs₁s₂, …, gs₁s₂⋯s_k = g), with each sᵢ ∈ S, we can associate the word w(ℓ) = s₁s₂⋯s_k. So, translated copies of the same loop have the same word associated to them.

Proposition 2.16. Any loop in G(Γ, S) is homotopy equivalent to a combination of basic loops if and only if the words given by the basic loops generate ⟨⟨R⟩⟩ as a normal subgroup. Recall that ⟨⟨R⟩⟩ denotes the normal subgroup of the relations, and Γ = F_S / ⟨⟨R⟩⟩.

Proof. Let the set of all loops in G be L, the set of basic loops B, and the set of loops produced from B by combining them and applying homotopies C. The statement of the proposition is that C = L if and only if ⟨⟨w(B)⟩⟩ = ⟨⟨R⟩⟩.

We first show that ⟨⟨w(B)⟩⟩ = ⟨⟨w(C)⟩⟩. Let us start with the direction ⊇.

What happens to the associated word when we rotate a loop? From the word s₁s₂⋯s_k ∈ w(B), rotating by one edge, say, we get s₂⋯s_k s₁. This new word is in fact a conjugate of the old word: s₁⁻¹ (s₁s₂⋯s_k) s₁, hence it is an element of ⟨⟨w(B)⟩⟩.

Next, when we combine two loops, (g, gs₁, …, gs₁⋯s_n = g) and (h, ht₁, …, ht₁⋯t_m = h), into

    (g, …, gS₁, gS₁t₁, …, gS₁T, gS₁T s_{k+1}, …, gS₁TS₂ = g),

where S₁ = s₁⋯s_k, S₂ = s_{k+1}⋯s_n and T = t₁⋯t_m, then the new associated word S₁TS₂ equals S · S₂⁻¹TS₂ with S = S₁S₂. Thus, combining loops does not take us out from ⟨⟨w(B)⟩⟩. See Figure 2.2 (a).

Finally, if two loops are homotopic to each other in G, then we can get one from the other by combining in a sequence of contractible loops, which are just contour paths of subtrees in G, i.e., themselves are combinations of trivial sᵢsᵢ⁻¹ loops for some generators sᵢ ∈ S. See Figure 2.2 (b). The effect of these geometric combinations on the corresponding word is just plugging in trivially reducible words, which again does not take us out from ⟨⟨w(B)⟩⟩.

Figure 2.2: (a) Combining two loops. (b) Homotopic loops inside Z².

So, we have proved ⟨⟨w(B)⟩⟩ ⊇ w(C). But ⊆ is now also clear: we have encountered the geometric counterpart of all the operations on the words in w(B) that together produce ⟨⟨w(B)⟩⟩.

So, the last step we need is that L = C if and only if w(L) = ⟨⟨w(C)⟩⟩. Since both L and C are closed under translations, this is clear. On the other hand, we obviously have w(L) = ⟨⟨R⟩⟩, since a word on S is a loop in G iff the word represents 1 in Γ iff the word is in the kernel ⟨⟨R⟩⟩ of the factorization Γ = F_S/⟨⟨R⟩⟩ from the free group F_S.

We now state an elegant topological definition of the Cayley graph G(Γ, S):

Definition 2.5. Take the 1-dimensional CW-complex Rose_{|S|}, and for any word in R, glue a 2-cell on it with boundary given by the word. Let X(S, R) denote the resulting 2-dimensional CW complex. Now take the universal cover X̂(S, R). This is called the Cayley complex corresponding to S and R. The 1-skeleton of X̂(S, R) is the 2|S|-regular Cayley graph G(Γ, S) of Γ = ⟨S | R⟩.

For instance, when S = {a, b} and R = {aba⁻¹b⁻¹}, we glue a single 2-cell, resulting in a torus, whose universal cover will be homeomorphic to R², with a Z² lattice as its 1-skeleton.

The proof that this definition of the Cayley graph coincides with the usual one has roughly the same math content as our previous proposition. But here is a sketch of the story told in the language of covering surfaces. It is intuitively clear, and can be proved using the Seifert–van Kampen theorem, that π₁(X(S, R)) = ⟨S | R⟩. Then, as we observed in (2.1), X̂(S, R)/Γ = X(S, R). We can now consider the right Schreier graph G(Γ, X̂, S) of the action of Γ on X̂ = X̂(S, R) with generating set S: the vertices are the vertices of the CW complex X̂ (this is just one Γ-orbit), and the edges are (x̂, x̂s), where x̂ runs through the vertices and s runs through S. Clearly, this is exactly the 1-skeleton of X̂. On the other hand, since the action is free, this Schreier graph is just the Cayley graph G(Γ, S), and we are done. So, if we factorize F_k by normal subgroups, we get all possible groups Γ, and the Schreier graph of the action by Γ on the Cayley complex will be a 2k-regular Cayley graph.

Besides the Schreier graph of a group action, there is another usual meaning of a Schreier graph, which is just a special case: if H ≤ Γ, and S is a generating set of Γ, then the set H\Γ of right cosets {Hg} supports a natural graph structure, Hg ∼ Hgs for s ∈ S. This graph is denoted by G(Γ, H, S).
Exercise 2.10.* Show that any 2k-regular graph is the Schreier graph of F_k with respect to some subgroup H. On the other hand, the 3-regular Petersen graph is not a Schreier graph.

The above results show that being finitely presented is a property with a strong topological flavour. Indeed, it is clearly equivalent to the existence of some r < ∞ such that the complex Rips_r(G(Γ, S)) is simply connected, with the following definition:

Definition 2.6. If X is a CW-complex and r > 0, then the Rips complex Rips_r(X) is given by adding all simplices (of arbitrary dimension) of diameter at most r.

3 The asymptotic geometry of groups

3.1 Quasi-isometries. Ends. The fundamental observation of geometric group theory

We now define what we mean by two metric spaces having the same geometry on the large scale. It is a weakening of bi-Lipschitz equivalence:

Definition 3.1. Suppose (X₁, d₁) and (X₂, d₂) are metric spaces. A map f: X₁ → X₂ is called a quasi-isometry if there is a C > 0 such that the following two conditions are met:

1. For all p, q ∈ X₁, we have

       d₁(p, q)/C − C ≤ d₂(f(p), f(q)) ≤ C d₁(p, q) + C.

2. For each x₂ ∈ X₂, there is some x₁ ∈ X₁ with d₂(x₂, f(x₁)) < C.

Informally speaking, the first condition means that f does not distort the metric too much (it is coarsely bi-Lipschitz), while the second states that f(X₁) is relatively dense in the target space X₂. We also say that X₂ is quasi-isometric to X₁ if there exists a quasi-isometry f: X₁ → X₂.

Exercise 3.1. Verify that being quasi-isometric is an equivalence relation.

For example, Z² with the graph metric (the ℓ¹ or taxi-cab metric) is quasi-isometric to R² with the Euclidean metric. Also, Z × Z₂ with the graph metric is quasi-isometric to Z: (n, i) → n for n ∈ Z, i ∈ {0, 1} is a non-injective quasi-isometry that is almost an isometry on the large scale, while (n, i) → 2n + i is an injective quasi-isometry with Lipschitz constant 2. Finally, and maybe most importantly to us, if Γ is a finitely generated group, its Cayley graph depends on choosing the symmetric finite generating set, but the good thing is that any two such graphs are quasi-isometric.

Exercise 3.2. Show that the regular trees T_k and T_ℓ for k, ℓ ≥ 3 are quasi-isometric to each other, by giving explicit quasi-isometries.

For a transitive connected graph G with finite degrees of vertices, we can define the volume growth function v_G(n) = |B_n(o)|, where o is some vertex of G and B_n(o) is the closed ball of radius n (in the graph metric on G) with center o. We will sometimes call two functions v₁, v₂ from N to N equivalent if there is a C > 0 such that

    v₁(r/C)/C < v₂(r) < C v₁(Cr)        (3.1)

for all r > 0.

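As a concrete illustration of the volume growth function, a breadth-first search in the Cayley graph of Z² with the standard generators recovers the exact ball sizes v(n) = 2n² + 2n + 1 — quadratic growth, consistent with Z² being quasi-isometric to Euclidean R². A small sketch:

```python
from collections import deque

def ball_size(n):
    """Number of vertices of Z^2 within graph distance n of the origin (BFS)."""
    start = (0, 0)
    seen = {start}
    frontier = deque([start])
    for _ in range(n):
        for _ in range(len(frontier)):
            x, y = frontier.popleft()
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                v = (x + dx, y + dy)
                if v not in seen:
                    seen.add(v)
                    frontier.append(v)
    return len(seen)

for n in range(6):
    # the l^1 ball formula in Z^2
    assert ball_size(n) == 2 * n * n + 2 * n + 1
print([ball_size(n) for n in range(6)])
```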
It is almost obvious that quasi-isometric transitive graphs have equivalent volume growth functions. (For this, note that for any quasi-isometry of locally finite graphs, the number of preimages of any point in the target space is bounded from above by some constant.) Another quasi-isometric invariant of groups is the property of being finitely presented, as it follows easily from our earlier remark on Rips complexes just before Definition 2.6.

Exercise 3.3. Define in a reasonable sense the space of ends of a graph as a topological space, knowing that Z has two ends, Z^d has one end for d ≥ 2, while the k-regular tree T_k has a continuum of ends. Prove that any quasi-isometry of graphs induces naturally a homeomorphism of their spaces of ends. Thus, the number of ends is a quasi-isometry invariant of the graph. By invariance under quasi-isometries, we can define the space of ends of a finitely generated group to be the space of ends of any of its Cayley graphs.

Exercise 3.4 (Hopf 1944). (a) Show that a group has two ends iff it has Z as a finite index subgroup. (b) Show that if a f.g. group has at least 3 ends, then it has continuum many.

Exercise 3.5. (a) Show that if G₁, G₂ are two infinite graphs, then the direct product graph G₁ × G₂ has one end. (b) Show that if |Γ₁| ≥ 2 and |Γ₂| ≥ 3 are two finitely generated groups, then the free product Γ₁ ∗ Γ₂ has a continuum number of ends.

An extension of the notion of free product is the amalgamated free product: if Γᵢ = ⟨Sᵢ | Rᵢ⟩, i = 1, 2, are finitely generated groups, each with a subgroup isomorphic to some H, with an embedding ιᵢ: H → Γᵢ, then

    Γ₁ ∗_H Γ₂ = Γ₁ ∗_{H, ι₁, ι₂} Γ₂ := ⟨ S₁ ∪ S₂ | R₁ ∪ R₂ ∪ {ι₁(h) ι₂(h)⁻¹ : h ∈ H} ⟩.

If both Γᵢ are finitely presented, then Γ₁ ∗_H Γ₂ is of course also finitely presentable. There is an important topological way of how such products arise:

Theorem 3.1 (Seifert–van Kampen). If X = X₁ ∪ X₂ is a decomposition of a CW-complex into connected subcomplexes with Y = X₁ ∩ X₂ connected, and y ∈ Y, then

    π₁(X, y) = π₁(X₁, y) ∗_{π₁(Y, y)} π₁(X₂, y),

with the natural embeddings of π₁(Y, y) into π₁(Xᵢ, y).

Exercise 3.6. (a) What is the Cayley graph of Z ∗_{2Z} Z (with the obvious embedding 2Z < Z), with one generator for each Z factor? Identify the group as a semidirect product Z2 ⋊ Z4, see Section 4.2. (b) Do there exist CW-complexes realizing this amalgamated free product as an application of Seifert–van Kampen?

The reason for talking about amalgamated free products here is the following theorem. See the references in [DrK09, Section 2.2] for proofs. (There is one using harmonic functions [Kap07]; I might say something about that in a later version of these notes.)

Theorem 3.2 (Stallings [Sta68], [Bergm68]). Recall from Exercise 3.4 that a group has 0, 1, 2 or a continuum of ends. The last case occurs iff the group is a free product amalgamated over a finite subgroup.

The following result is also called the fundamental observation of geometric group theory. For instance, it connects two usual definitions of geometric group theory: 1. it is the study of group theoretical properties that are invariant under quasi-isometries; 2. it is the study of groups using their actions on geometric spaces. We start with some definitions:

Definition 3.2. A metric space X is called geodesic if for all p, q ∈ X there exist a, b ∈ R and an isometry g: [a, b] → X with g(a) = p, g(b) = q. A metric space is called proper if all closed balls of finite radius are compact.

Definition 3.3. An action of a group Γ on a metric space X by isometries is called properly discontinuous if for every compact K ⊆ X, the set {g ∈ Γ : g(K) ∩ K ≠ ∅} is finite. Any group action defines an equivalence relation on X: the decomposition into orbits. The set of equivalence classes, equipped with the factor topology coming from X, is denoted by X/Γ. The action is called co-compact if X/Γ is compact.

Lemma 3.3 (Milnor–Schwarz). Let X be a proper geodesic metric space, and suppose that a group Γ acts on it by isometries properly discontinuously and co-compactly. Then Γ is finitely generated, and for any fixed x ∈ X, the map J_x: Γ → X defined by J_x(g) = g(x) is a quasi-isometry (on each Cayley graph).

Proof. Pick an arbitrary point x ∈ X, and consider the projection π: X → X/Γ. By compactness of X/Γ, there is an R < ∞ such that the Γ-translates of the closed ball B = B_R(x) cover X, or in other words, π(B) = X/Γ. (Why exactly? Since each element of Γ is invertible, its action on X is a bijective isometry, hence a homeomorphism. So, for any open U ⊆ X and g ∈ Γ, we have that g(U) and π⁻¹(π(U)) = ∪_g g(U) are open in X, and therefore π(U) is open in X/Γ. In particular, the images π(B_r(x)) of open balls are open. Since B_r ↗ X as r → ∞, we also have π(B_r(x)) ↗ X/Γ, and, by compactness, there exists an R < ∞ such that X/Γ = π(B_R(x)).)

Since the action of Γ is properly discontinuous, there are only finitely many elements sᵢ ∈ Γ \ {1} such that B ∩ sᵢ(B) ≠ ∅. Let S be the subset of Γ consisting of these elements sᵢ. Since each sᵢ is an isometry, sᵢ⁻¹ belongs to S iff sᵢ does. Let

    r := inf { d(B, g(B)) : g ∈ Γ \ (S ∪ {1}) }.

Observe that r > 0. Indeed, if we denote by B′ the closed ball with center x and radius R + 1, then for all but finitely many g ∈ Γ we will have B′ ∩ g(B′) = ∅, hence d(B, g(B)) ≥ 1; therefore the infimum above is a minimum of finitely many positive numbers, so is positive.

The claim now is that S generates Γ, and for each g ∈ Γ,

    d(x, g(x)) / (2R) ≤ ‖g‖_S ≤ d(x, g(x)) / r + 1,        (3.2)

where ‖g‖_S = d_S(1, g) is the word norm on Γ with respect to the generating set S. Since, in the right Cayley graph w.r.t. S, we have d_S(h, hg) = d_S(1, g) for any g, h ∈ Γ, this will mean that J_x(g) := g(x) is coarsely bi-Lipschitz. Furthermore, since π(B) = X/Γ, for each y ∈ X there is some g ∈ Γ with d(y, g(x)) ≤ R, hence J_x will indeed be a quasi-isometry from Γ to X.

Let g ∈ Γ, and connect x to g(x) by a geodesic γ. Let m be the unique nonnegative integer with

    (m − 1) r + R ≤ d(x, g(x)) < m r + R.

Choose points x₀ = x, x₁, …, x_{m+1} = g(x) on γ such that x₁ ∈ B and d(x_j, x_{j+1}) < r for 1 ≤ j ≤ m. Each x_j belongs to some g_j(B) for some g_j ∈ Γ (and take g_{m+1} = g and g₀ = 1). Observe that

    d(B, g_j⁻¹(g_{j+1}(B))) = d(g_j(B), g_{j+1}(B)) ≤ d(x_j, x_{j+1}) < r,

hence the balls B and g_j⁻¹(g_{j+1}(B)) intersect, and g_{j+1} = g_j s_{i(j)} for some s_{i(j)} ∈ S ∪ {1}. Therefore, g = s_{i(1)} s_{i(2)} ⋯ s_{i(m)}, and S is a generating set for Γ. Moreover,

    ‖g‖_S ≤ m ≤ (d(x, g(x)) − R)/r + 1 < d(x, g(x))/r + 1.

On the other hand, the triangle inequality implies that d(x, g(x)) ≤ 2R ‖g‖_S, because, for any y ∈ X and s ∈ S, we have B_R(y) ∩ B_R(s(y)) ≠ ∅ and hence d(y, s(y)) ≤ 2R. This finishes the proof of (3.2).

This lemma implies that if Γ₁ and Γ₂ both act nicely (cocompactly, etc.) on a nice metric space, then they are quasi-isometric. A small generalization and a converse are provided by the following characterization noted in [Gro93]:

Proposition 3.4. Two f.g. groups Γ₁ and Γ₂ are quasi-isometric iff there is a locally compact topological space X where they both act properly discontinuously and cocompactly; moreover, the two actions commute with each other.

Exercise 3.7. Prove the above proposition. (Hint: given the quasi-isometric groups, the space X can be constructed using the set of all quasi-isometries between Γ₁ and Γ₂.)

An important consequence of Lemma 3.3 is the following:

Corollary 3.5. A finite index subgroup H of a finitely generated group Γ is itself finitely generated and quasi-isometric to Γ. The same conclusions hold if H is a factor of Γ with a finite kernel.

Proof. For the case H ≤ Γ, we consider an extended Cayley graph of Γ with respect to some generating set. This is a metric space which contains not only vertices of the Cayley graph but also points on edges, so that each edge is a geodesic of length 1. H naturally acts on it by isometries, and clearly this action satisfies all properties in the statement of the Milnor–Schwarz lemma, so we can just apply it. For the case when H is a factor of Γ, the image of any generating set of Γ generates H, so it is finitely generated, and the action of Γ on the Cayley graph of H satisfies the conditions of the Milnor–Schwarz lemma.

Based on this corollary, we will usually be interested in group properties that hold up to moving to a finite index subgroup or factorgroup or group extension. In particular, we say that two groups, Γ₁ and Γ₂, are virtually or almost isomorphic, if there are finite index subgroups Hᵢ ≤ Γᵢ and finite normal subgroups Fᵢ ◁ Hᵢ such that H₁/F₁ ≅ H₂/F₂. Correspondingly, if a group is virtually isomorphic to a group that has some property P, we will say that it almost or virtually has P.

Proposition 3.4 above is the geometric analogue of the following characterization of virtual isomorphisms:

Exercise 3.8. Show that two f.g. groups, Γ₁ and Γ₂, are virtually isomorphic iff they admit commuting actions on a set X such that the factors X/Γᵢ and all the stabilizers Stᵢ(x), x ∈ X, are finite.

Gromov proposed in [Gro93] the long-term project of quasi-isometric classification of all f.g. groups. This is a huge research area, with connections to a lot of parts of mathematics, including probability theory, as we will see.

Here is one more instance of realizing a group theoretical notion via geometric properties of group actions. Recall the notion of residually finite groups from Exercise 2.7.

Exercise 3.9.* (a) Show that a group Γ is residually finite iff it has a faithful chaotic action on a Hausdorff topological space X by homeomorphisms: (i) the union of all finite orbits is dense in X; (ii) the action is topologically transitive: for any nonempty open U, V ⊆ X there is γ ∈ Γ such that γ(U) ∩ V ≠ ∅. (A rather obvious hint: start by looking at finite groups.)
(b) Construct such a chaotic action of SL(n, Z) on the torus Tⁿ.

3.2 Gromov-hyperbolic spaces and groups

3.3 Asymptotic cones

4 Nilpotent and solvable groups

4.1 The basics

We have studied the free groups, which are the largest groups from most points of view. We have also seen that it is relatively hard to produce new groups from them: their subgroups are all free, as well, while taking quotients, i.e., defining groups using presentations, has the disadvantage that we might not know what the group is that we have just defined. On the other hand, the smallest infinite group is certainly Z, and it is more than natural to start building new groups from it. Recall that a finitely generated group is Abelian if and only if it is a direct product of cyclic groups, and the number of (free) cyclic factors is called the (free) rank of the group. We understand these examples very well, so we should now go beyond commutativity. Recall that the commutator is defined by [g, h] = ghg⁻¹h⁻¹.

Definition 4.1. A group Γ is called nilpotent if the lower central series Γ₀ = Γ, Γₙ₊₁ = [Γₙ, Γ] terminates at Γₛ = {1} in finitely many steps. If s is the smallest such index, Γ is called s-step nilpotent. A group Γ is called solvable if the derived series Γ₀ = Γ, Γₙ₊₁ = [Γₙ, Γₙ] terminates at Γₛ = {1} in finitely many steps. If s is the smallest such index, Γ is called s-step solvable.

Clearly, nilpotent implies solvable. Before discussing any further properties of such groups, let us give the simplest non-Abelian nilpotent example. The 3-dimensional discrete Heisenberg group is the matrix group

        ( 1 x z )
    Γ = ( 0 1 y ) : x, y, z ∈ Z .
        ( 0 0 1 )

If we denote by X, Y, Z the matrices given by the three permutations of the entries 1, 0, 0 for x, y, z, then Γ is given by the presentation

    ⟨ X, Y, Z | [X, Z] = 1, [Y, Z] = 1, [X, Y] = Z ⟩.

Figure 4.1: The Cayley graph of the Heisenberg group with generators X, Y, Z.

Clearly, Γ is also generated by just X and Y. Note that [X^m, Y^n] = Z^{mn}. Furthermore, [Γ, Γ] = ⟨Z⟩, and the center is Z(Γ) = ⟨Z⟩, hence Γ is 2-step nilpotent.

Exercise 4.1. Show that the Heisenberg group has 4-dimensional volume growth. Whenever , we have that [ , ] is a subgroup of that is normal in : if g, h and
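Exercise 4.1 can be explored empirically: coding the Heisenberg group elements as triples (x, y, z) and running a breadth-first search with the generators X^{±1}, Y^{±1}, the ball sizes grow roughly like n⁴ (the ratio |B₂ₙ|/|Bₙ| approaches 2⁴ = 16). A numerical sketch, not a proof:

```python
from collections import deque

# Heisenberg group elements as triples (x, y, z) for the matrix
# [[1, x, z], [0, 1, y], [0, 0, 1]]; matrix multiplication gives
# (x1,y1,z1) * (x2,y2,z2) = (x1+x2, y1+y2, z1+z2+x1*y2).
def mul(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2] + a[0] * b[1])

X, Y = (1, 0, 0), (0, 1, 0)
Xi, Yi = (-1, 0, 0), (0, -1, 0)
gens = [X, Xi, Y, Yi]

def ball_sizes(n):
    """[|B_0|, |B_1|, ..., |B_n|] in the Cayley graph w.r.t. X^{±1}, Y^{±1}."""
    e = (0, 0, 0)
    seen = {e}
    frontier = deque([e])
    sizes = [1]
    for _ in range(n):
        for _ in range(len(frontier)):
            g = frontier.popleft()
            for s in gens:
                h = mul(g, s)
                if h not in seen:
                    seen.add(h)
                    frontier.append(h)
        sizes.append(len(seen))
    return sizes

sizes = ball_sizes(12)
print("|B_6| =", sizes[6], " |B_12| =", sizes[12],
      " ratio =", round(sizes[12] / sizes[6], 1), "(16 would be exactly quartic)")
```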

, then [, g ] = (g 1 g 1 ) and h1 [, g ]h = (h1 h)(h1 gh)(h1 1 h)(h1 g 1 h)

[ , ]. Therefore, both in the nilpotent and the solvable cases, by induction, n+1 n and n . Abelian factor must factor through Ab . It is not true that for any nitely generated group , the subgroup [, ] is generated by the

The factor Ab := /[, ] is called the Abelianization of , the largest Abelian factor of : any

commutators of the generators of . For instance, we will dene in Section 5.1 the lamplighter group However, if is nilpotent, then, by simple word manipulations one can show that each n is nitely = Z2 Z, a nitely generated solvable group, for which [, ] = Z Z2 is not nitely generated.

generated by iterated commutators of the generators of , as follows. In any commutator of two words on the generators of , there is an equal number of x and x1 letters for each generator x. By introducing commutators, we can move such pairs of letters towards each other until they become neighbours and annihilate each other. So, at the end, we are left with a product of iterated commutators. For instance, [a, bc] = abca1 c1 b1 = ab[c, a1 ]a1 cc1 b1 = [a, b]ba[c, a1 ]a1 b1 = [a, b]b a, [c, a1 ] [c, a1 ]aa1 b1 = [a, b] b, a, [c, a1 ] = [a, b] b, a, [c, a1 ] a, [c, a1 ] [c, a1 ] [a1 , c], b . a, [c, a1 ] bb1 [c, a1 ] [c, a1 ]1 , b

26

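Since such a rewriting is a formal identity valid in any group, it can be sanity-checked mechanically, e.g., in the symmetric group S₅ with random permutations (a small sketch; w stands for [c, a⁻¹]):

```python
import itertools, random

def compose(p, q):          # (p*q)(i) = p(q(i)), i.e., the product p∘q
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    r = [0] * len(p)
    for i, v in enumerate(p):
        r[v] = i
    return tuple(r)

def comm(g, h):             # [g, h] = g h g^-1 h^-1
    return compose(compose(g, h), compose(inverse(g), inverse(h)))

perms = list(itertools.permutations(range(5)))
for _ in range(300):
    a, b, c = (random.choice(perms) for _ in range(3))
    w = comm(c, inverse(a))                   # w = [c, a^-1]
    lhs = comm(a, compose(b, c))              # [a, bc]
    # [a,b] · [b,[a,w]] · [a,w] · w · [w^-1, b]
    rhs = compose(comm(a, b),
          compose(comm(b, comm(a, w)),
          compose(comm(a, w),
          compose(w, comm(inverse(w), b)))))
    assert lhs == rhs
print("commutator rewriting of [a, bc] verified on 300 random triples")
```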
In the resulting word, the iteration depths of the commutators depend on the two words we started with, but since Γ is s-step nilpotent, we can just delete from the word any iteration of depth larger than s. This way we get a finite set of generators for [Γ, Γ], and then for each Γₙ, as well.

Furthermore, any nilpotent group is polycyclic: there is a subgroup sequence H₀ = Γ, Hᵢ₊₁ ◁ Hᵢ, such that Hᵢ/Hᵢ₊₁ is a cyclic group, and there is some n such that Hₙ = {1}. In fact, a solvable group is polycyclic iff all of its subgroups are finitely generated (and this includes all finitely generated nilpotent groups). We do not give the proof here, only one small idea that will be needed later. If Γ is finitely generated solvable, then the factor Γ/[Γ, Γ] is also finitely generated and Abelian, hence is a finite direct product of cyclic groups. If it is infinite, then one of the factors must be Z, and hence, composing the surjection Γ → Γ/[Γ, Γ] with the projection to this Z factor, we get a surjection of Γ onto Z. So, we have the first step in the desired polycyclic sequence, H₀/H₁ ≅ Z.

Exercise 4.2. (a) Assume that for some H ◁ Γ, both H and Γ/H are finitely presented. Show that Γ is also finitely presented. (b) Show that any finitely generated almost-nilpotent group is finitely presented.

4.2

Semidirect products


We do not yet have any general procedure to build non-commutative nilpotent and solvable groups. Such a procedure is given by the semidirect product. For any group N, let Aut(N) be the group of its group-automorphisms, acting on the right, i.e., f acts by a^f = f(a), and fg acts by a^{fg} = (a^f)^g = g(f(a)). Further, let φ: H → Aut(N) be some group homomorphism. Then define the semidirect product N ⋊_φ H by

    (a, g) · (b, h) = ( a^{φ(h)} b , gh ),    a, b ∈ N and g, h ∈ H.

It is easy to check that this is indeed a group; in particular, (a, f)⁻¹ = (f⁻¹(a⁻¹), f⁻¹). We identify N with its image in Γ := N ⋊_φ H under the inclusion a → (a, id), and H with its image under the inclusion f → (1, f). Then, furthermore, (1, f)(a, id) = (a, f), so one might prefer writing Γ = H ⋉ N = {(h, a) : h ∈ H, a ∈ N}. Anyway, we have HN = Γ and H ∩ N = {1_Γ}. Conversely, whenever we have a group Γ with subgroups N ◁ Γ and H satisfying HN = Γ and H ∩ N = {1}, then H acts on N by conjugations, a^{φ(h)} := h⁻¹ah, and it is easy to check that Γ = N ⋊_φ H.

Of course, for the trivial homomorphism φ: H → {id} ≤ Aut(N) we get the direct product of N and H. The archetypical non-trivial examples are the affine group R^d ⋊ GL_d(R) = {v → vA + b : A ∈ GL_d(R), b ∈ R^d} and the group of Euclidean isometries R^n ⋊ O(n), with the product being just composition: (v → vA₁ + b₁) followed by (v → vA₂ + b₂) is v → vA₁A₂ + b₁A₂ + b₂.
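The multiplication rule is easy to implement and check mechanically. A minimal sketch for N = Z² and H = Z, where the assumed homomorphism φ sends h ∈ Z to the right action a ↦ aM^h for a sample matrix M ∈ GL₂(Z):

```python
import random

# Semidirect product Z^2 ⋊_M Z with the convention above:
# (a, g)(b, h) = (a^{φ(h)} + b, g + h), where a^{φ(h)} = a M^h.
M     = [[2, 1], [1, 1]]     # assumed sample matrix in GL_2(Z), det = 1
M_inv = [[1, -1], [-1, 2]]   # its exact integer inverse

def vec_mat(v, A):           # row vector times matrix
    return (v[0]*A[0][0] + v[1]*A[1][0], v[0]*A[0][1] + v[1]*A[1][1])

def act(v, h):
    """a^{φ(h)} = a M^h (a right action of Z on Z^2)."""
    A = M if h >= 0 else M_inv
    for _ in range(abs(h)):
        v = vec_mat(v, A)
    return v

def mul(x, y):
    (a, g), (b, h) = x, y
    ah = act(a, h)
    return ((ah[0] + b[0], ah[1] + b[1]), g + h)

def inv(x):
    (a, g) = x
    return (act((-a[0], -a[1]), -g), -g)   # matches (a,f)^{-1} = (f^{-1}(a^{-1}), f^{-1})

e = ((0, 0), 0)
rnd = lambda: ((random.randint(-3, 3), random.randint(-3, 3)), random.randint(-3, 3))
for _ in range(200):
    x, y, z = rnd(), rnd(), rnd()
    assert mul(mul(x, y), z) == mul(x, mul(y, z))   # associativity
    assert mul(x, inv(x)) == e == mul(inv(x), x)    # inverse formula
print("Z^2 ⋊_M Z: group axioms verified on random samples")
```

This same construction Z^d ⋊_M Z is exactly the one analyzed in Proposition 4.1 below.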

Similar examples are possible with Z^d instead of R^d; note that Aut(Z^d) = GL_d(Z), the integer matrices with determinant ±1. One more small example of a semidirect product is in Exercise 3.6.

A fancy way of saying that N ◁ Γ and Γ/N ≅ F is to write down the short exact sequence

    1 → N →α→ Γ →σ→ F → 1 ,

which just means that the image of each map indicated by an arrow is exactly the kernel of the next map. (When I was a grad student in Cambridge, UK, a famous number theory professor, Sir Swinnerton-Dyer, wrote down short exact sequences just to prove he was not hopelessly old-fashioned and old, he said.) If, in addition, there is an injective homomorphism β: F → Γ with σβ = id_F, then we say that the short exact sequence splits. In this case, β(F) ∩ α(N) = {1} and β(F)α(N) = Γ, hence Γ = N ⋊_φ F is a semidirect product, with φ: F → Aut(N) given by conjugation: a^{φ(f)} := β(f)⁻¹ α(a) β(f). This splitting does not always happen; for instance, the exact sequence 1 → Z₂ → Z₄ → Z₂ → 1 defined by {0, 2} ⊂ {0, 1, 2, 3} does not split, since Aut(Z₂) = {id} means that all semidirect products Z₂ ⋊ Z₂ are actually direct products, but Z₂ × Z₂ ≠ Z₄. (Or: the only Z₂ subgroup of Z₄ does not have trivial intersection with the given normal subgroup {0, 2}, hence there is no good β.) Similarly, 1 → [Γ, Γ] → Γ → Γ_ab → 1 does not always split. Nevertheless, a lot of solvable and nilpotent groups can be produced using semidirect products, as we will see in a second.

It will also be important to us that any sequence

    1 → N → Γ →σ→ Z → 1        (4.1)

does split. This is because if x ∈ σ⁻¹(1), then σ(xᵏ) = k ≠ 0 for all k ∈ Z \ {0}, while ker(σ) = N; hence for H := ⟨x⟩ ≅ Z we have H ∩ N = {1} and HN = Γ.

Exercise 4.3. (a) Show that

    (a, g)(b, h)(a, g)⁻¹(b, h)⁻¹ = ( a^{φ(hg⁻¹h⁻¹)} b^{φ(g⁻¹h⁻¹)} (a⁻¹)^{φ(g⁻¹h⁻¹)} (b⁻¹)^{φ(h⁻¹)} , ghg⁻¹h⁻¹ ),

and using this, show that if N and H are solvable, then N ⋊ H is also.
(b) Given N ⋊ ⟨g⟩, g ∈ Aut(N), show that N ⋊ ⟨gᵏ⟩ is a finite index subgroup of N ⋊ ⟨g⟩, for k ∈ Z \ {0}.

Exercise 4.4. Show that if Γ/K ≅ N with K a finite normal subgroup and N nilpotent, then Γ has a finite index nilpotent subgroup. Also, obviously, any subgroup or factor group of a nilpotent group is itself nilpotent. Therefore, an almost nilpotent group in the sense given at the end of Section 3.1 has a finite index nilpotent subgroup. (Hint: use the Z factor from the end of Section 4.1 and the splitting in (4.1).)

Now, given Γ = Z^d ⋊_M Z, M ∈ GL_d(Z), we will prove the following:

Proposition 4.1. If M has only absolute value 1 eigenvalues, then Γ is almost nilpotent. If not, then Γ has exponential volume growth.
In fact, this is true in larger generality. The proof of the following theorem will be given in the next section (with a few details omitted):

Theorem 4.2 (Milnor–Wolf [Mil68a, Wol68, Mil68b]). A finitely generated almost-solvable group is of polynomial growth if and only if it is almost-nilpotent, and is of exponential growth otherwise.

Moreover, the Bass–Guivarc'h formula (which we will not prove) states that if dᵢ := freerank(Γᵢ/Γᵢ₊₁) for the lower central series Γ₀ = Γ, Γᵢ₊₁ = [Γᵢ, Γ], then the polynomial volume growth exponent of a nilpotent Γ is

    d(Γ) = Σᵢ (i + 1) dᵢ .

For example, for the Heisenberg group we have d₀ = 2 and d₁ = 1, hence d(Γ) = 4, agreeing with Exercise 4.1.

To prove Proposition 4.1, we shall first show the following lemma:

Lemma 4.3. If M ∈ GL_d(Z) has only eigenvalues with absolute value 1, then all of them are roots of unity.

Proof. Let $\lambda_1, \dots, \lambda_d$ be the eigenvalues of $M$. Then we have
$$\mathrm{Tr}(M^k) = \sum_{i=1}^{d} \lambda_i^k \in \mathbb{Z}\,.$$
Now consider the sequence $v_k = (\lambda_1^k, \dots, \lambda_d^k) \in (S^1)^d$ for $k = 0, 1, 2, \dots$. Since $(S^1)^d$ is a compact group, there exists a convergent subsequence $\{v_{k_\ell}\}$ of $\{v_k\}$, and hence $v_{k_{\ell+1}} v_{k_\ell}^{-1} \to 1$ as $\ell \to \infty$. Let $m_\ell = k_{\ell+1} - k_\ell$; then $(\lambda_1^{m_\ell}, \dots, \lambda_d^{m_\ell}) \to 1$ and $\sum_{i=1}^{d} \lambda_i^{m_\ell} \to d$ as $\ell \to \infty$. But $\sum_{i=1}^{d} \lambda_i^{m_\ell} \in \mathbb{Z}$. This implies that there exists $m$ such that $\sum_{i=1}^{d} \lambda_i^m = d$. But $|\lambda_i^m| = 1$ for all $i$ then implies that $\lambda_i^m = 1$ for all $i$. Thus, all the eigenvalues are roots of unity.
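The argument can be sanity-checked on a concrete matrix. The sketch below (plain Python; the specific matrix is just an illustrative choice, not from the text) uses $M = \big(\begin{smallmatrix} 0 & -1 \\ 1 & 0 \end{smallmatrix}\big) \in \mathrm{GL}_2(\mathbb{Z})$, whose eigenvalues $\pm i$ lie on the unit circle, and confirms that the traces of powers form an integer, periodic sequence, and that the eigenvalues are indeed roots of unity ($M^4 = I$):

```python
# Minimal illustration of Lemma 4.3: an integer matrix with unit-circle
# eigenvalues (here +-i) has traces of powers in Z, and some power equals I.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matpow(A, k):
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(k):
        P = matmul(P, A)
    return P

M = [[0, -1],
     [1,  0]]                        # rotation by 90 degrees, det = 1
traces = [sum(matpow(M, k)[i][i] for i in range(2)) for k in range(1, 9)]
print(traces)                        # [0, -2, 0, 2, 0, -2, 0, 2]: integers, periodic
print(matpow(M, 4) == matpow(M, 0))  # True: the eigenvalues are 4th roots of unity
```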

Definition 4.2. A matrix $M$ is unipotent if all its eigenvalues are equal to 1, and quasi-unipotent if all its eigenvalues are roots of unity.

Note that a matrix $M$ is quasi-unipotent if and only if there exists $m < \infty$ such that $M^m$ is unipotent. Thus, by Exercise 4.3 part (b), for the first part of Proposition 4.1 it is enough to show that if $M$ is unipotent, then $\Gamma = \mathbb{Z}^d \rtimes_M \mathbb{Z}$ is nilpotent.

Exercise 4.5. If $M \in \mathrm{GL}_d(\mathbb{Z})$ is unipotent, then $M = I + S^{-1} N S = S^{-1}(I + N)S$ for some strictly upper-triangular integer matrix $N$.

Note that $N^d = 0$ in the exercise. Then, for any $m \in \mathbb{Z}_+$,
$$M^m = S^{-1}\Big( N^m + \tbinom{m}{1} N^{m-1} + \dots + I \Big) S\,, \qquad M^m - I = S^{-1}\Big( N^m + \tbinom{m}{1} N^{m-1} + \dots + \tbinom{m}{m-1} N \Big) S\,,$$
which implies that $(M^m - I)^d = 0$. Since the inverse of a unipotent matrix is again unipotent, the last result holds, in fact, for any $m \in \mathbb{Z}$.

What is now $\Gamma_1 = [\Gamma, \Gamma]$? By Exercise 4.3 part (a),
$$(v, M^i)(w, M^j)(v, M^i)^{-1}(w, M^j)^{-1} = \big( vM^{-i} + wM^{-i-j} - vM^{-i-j} - wM^{-j},\ I \big) = \big( vM^{-i-j}(M^j - I) + wM^{-i-j}(I - M^i),\ I \big)\,.$$
So, the matrix part has got annihilated, while, since each $M^i - I$ is nilpotent and hence lowers dimension, for any $i \in \mathbb{Z}$, the vector part has a smaller and smaller support:
$$\Gamma_1 \subseteq \text{a } (d-1)\text{-dimensional subspace}, \quad \Gamma_2 \subseteq \text{a } (d-2)\text{-dimensional subspace}, \quad \dots, \quad \Gamma_d = \{1\}\,,$$
hence $\Gamma$ is at most $d$-step nilpotent, proving the first half of Proposition 4.1.

Exercise 4.6. Show that $\mathbb{Z}^2 \rtimes_{\big(\begin{smallmatrix} 1 & 1 \\ 0 & 1 \end{smallmatrix}\big)} \mathbb{Z}$ is isomorphic to the Heisenberg group.
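The key identity $(M^m - I)^d = 0$ for a unipotent $M \in \mathrm{GL}_d(\mathbb{Z})$ can also be verified numerically; here is a quick sketch with a $3 \times 3$ unitriangular matrix (an arbitrary illustrative choice):

```python
# Check (M^m - I)^3 = 0 for a unipotent M in GL_3(Z) and several m.
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matpow(A, k):
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = matmul(P, A)
    return P

M = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]                     # all eigenvalues equal to 1

zero = [[0] * 3 for _ in range(3)]
checks = []
for m in range(1, 6):
    # A = M^m - I, which should be nilpotent of order at most d = 3
    A = [[x - int(i == j) for j, x in enumerate(row)] for i, row in enumerate(matpow(M, m))]
    checks.append(matpow(A, 3) == zero)
print(checks)                       # [True, True, True, True, True]
```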

For the second half of Proposition 4.1, the key step is the following.

Lemma 4.4. If $M \in \mathrm{GL}_d(\mathbb{Z})$ has an eigenvalue $|\lambda| > 2$, then there exists $v \in \mathbb{Z}^d$ such that for any $k \in \mathbb{N}$ and $\epsilon_i \in \{0, 1\}$, the $2^{k+1}$ vectors
$$\epsilon_0 v + \epsilon_1 vM + \dots + \epsilon_k vM^k$$
are all different.

Proof. Note that $M^T$ also has the eigenvalue $\lambda$. Let $b \in \mathbb{C}^d$ be a corresponding eigenvector, $b M^T = \lambda b$, or $M b^T = \lambda b^T$. Define the linear form $\ell(v) = v b^T$. Then we have
$$\ell(vM^k) = vM^k b^T = v \lambda^k b^T = \lambda^k \ell(v)\,.$$
Since $\ell : \mathbb{C}^d \to \mathbb{C}$ is linear and non-zero, its kernel is $(d-1)$-dimensional, hence there exists $v \in \mathbb{Z}^d$ such that $\ell(v) \neq 0$. Now suppose that
$$\sum_{i=0}^{k} \epsilon_i\, vM^i = \sum_{i=0}^{k} \epsilon'_i\, vM^i\,, \qquad \text{hence} \qquad \sum_{i=0}^{k} \epsilon_i \lambda^i \ell(v) = \sum_{i=0}^{k} \epsilon'_i \lambda^i \ell(v)\,,$$
with $\epsilon_m \neq \epsilon'_m$ for some $m \leq k$, but $\epsilon_i = \epsilon'_i$ for all $m < i \leq k$. Let $\eta_i = \epsilon_i - \epsilon'_i$. Then
$$\sum_{i=0}^{m} \eta_i \lambda^i \ell(v) = 0\,.$$
Now consider $\big| \sum_{i=0}^{m-1} \eta_i \lambda^i \big|$. On one hand, we have
$$\Big| \sum_{i=0}^{m-1} \eta_i \lambda^i \Big| = |\eta_m \lambda^m| = |\lambda|^m\,.$$
On the other hand,
$$\Big| \sum_{i=0}^{m-1} \eta_i \lambda^i \Big| \leq \sum_{i=0}^{m-1} |\lambda|^i = \frac{|\lambda|^m - 1}{|\lambda| - 1} \leq |\lambda|^m - 1\,,$$
since $|\lambda| > 2$. This is a contradiction, hence we have found the desired $v$.

Now, to finish the proof of Proposition 4.1, assume that $M$ has an eigenvalue with absolute value not 1. Since $|\det(M)| = 1$, there is an eigenvalue $|\lambda| > 1$. Then, there is an $m < \infty$ such that $M^m$ has the eigenvalue $\lambda^m$ with $|\lambda^m| > 2$. Consider the products
$$(\epsilon_k v, M^m)(\epsilon_{k-1} v, M^m) \cdots (\epsilon_0 v, M^m) = \big( \epsilon_k vM^{mk} + \epsilon_{k-1} vM^{m(k-1)} + \dots + \epsilon_0 v,\ M^{m(k+1)} \big)\,,$$
with $\epsilon_i \in \{0, 1\}$ and the vector $v \in \mathbb{Z}^d$ given by Lemma 4.4 applied to $M^m$. By that lemma, for a fixed $k$ these elements are all different. But, in the Cayley graph given by any generating set containing $(v, I)$ and $(0, M^m)$, these elements are inside the ball of radius $2(k+1)$, hence $\Gamma$ has exponential growth.

The proposition has the following generalization:

Exercise 4.7. * Let $N$ be a finitely generated almost nilpotent group. Then $\Gamma = N \rtimes_\varphi \mathbb{Z}$ is almost nilpotent or of exponential growth (and this can be easily detected from $\varphi$).
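The counting behind Lemma 4.4 can be tested on a concrete example. The choices below are illustrative assumptions, not from the text: $M = \big(\begin{smallmatrix} 2 & 1 \\ 1 & 1 \end{smallmatrix}\big)$, whose larger eigenvalue $(3 + \sqrt{5})/2$ exceeds 2, and $v = (1, 0)$, which has $\ell(v) \neq 0$. The sketch checks that all $2^{k+1}$ sums $\sum_i \epsilon_i v M^i$ are distinct:

```python
# Count the vectors eps_0 v + eps_1 vM + ... + eps_k vM^k over eps_i in {0,1}:
# Lemma 4.4 predicts 2^(k+1) distinct vectors.
def vecmat(v, M):   # row vector times 2x2 matrix
    return (v[0] * M[0][0] + v[1] * M[1][0], v[0] * M[0][1] + v[1] * M[1][1])

M = [[2, 1], [1, 1]]
v = (1, 0)
counts = []
for k in range(1, 11):
    powers = [v]                       # v, vM, vM^2, ..., vM^k
    for _ in range(k):
        powers.append(vecmat(powers[-1], M))
    sums = set()
    for eps in range(2 ** (k + 1)):    # bits of eps encode the epsilon_i's
        s = (0, 0)
        for i in range(k + 1):
            if (eps >> i) & 1:
                s = (s[0] + powers[i][0], s[1] + powers[i][1])
        sums.add(s)
    counts.append(len(sums))
print(counts == [2 ** (k + 1) for k in range(1, 11)])  # True
```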

4.3 The volume growth of nilpotent and solvable groups

As promised in the previous section, we want to go beyond semidirect products, and prove the Milnor-Wolf Theorem 4.2. The first step of this is for free: we have seen that an exact sequence (4.1) always splits, hence Exercise 4.7 can be reformulated as follows:

Proposition 4.5. Assume that $1 \to N \to \Gamma \to \mathbb{Z} \to 1$ is a short exact sequence with $N$ a finitely generated almost nilpotent group. Then, if $\Gamma$ is not almost nilpotent, it has exponential growth.

We will need two more ingredients, Propositions 4.6 and 4.7. We will prove the first one, but not the second, although that is not hard, either: the key idea is just to analyze carefully the procedure we used in Section 4.1 to prove that the subgroups $\Gamma_n$ in the lower central series are finitely generated. See [DrK09, Theorem 5.12] for the details.

Proposition 4.6. Assume we have a short exact sequence $1 \to N \to \Gamma \stackrel{\pi}{\to} \mathbb{Z} \to 1$. (i) If $\Gamma$ has sub-exponential growth, then $N$ is finitely generated and also has sub-exponential growth. (ii) Moreover, if $\Gamma$ has growth $O(R^d)$, then $N$ has growth $O(R^{d-1})$.
Definition 4.3. Let $H \leq \Gamma$, and let $S_H$ and $S_\Gamma$ be finite generating sets of $H$ and $\Gamma$, respectively, with $S_\Gamma \supseteq S_H$. Note that for the distances in the corresponding Cayley graphs, $d_H(e, h) =: \|h\|_H \geq \|h\|_\Gamma$ for $h \in H$. We say that $H$ has polynomial distortion in $\Gamma$ if there is a polynomial $p(x)$ so that $p(\|h\|_\Gamma) \geq \|h\|_H$ for all $h \in H$.

Proposition 4.7. If $\Gamma$ is a finitely generated nilpotent group, then $[\Gamma, \Gamma]$ has polynomial distortion. In fact, any subgroup has polynomial distortion, but we will not need that result.

Exercise 4.8. * Without looking into [DrK09], but using the hint given in the paragraph before Proposition 4.6, prove the last proposition.

Anti-example: Consider the solvable Baumslag-Solitar group $\mathrm{BS}(1, m) := \langle a, b \mid a^{-1} b a = b^m \rangle$. A concrete realization of this presentation is to take the additive group $H$ of the rationals of the form $x / m^y$ with $x, y \in \mathbb{Z}$, and then $\mathrm{BS} \simeq H \rtimes \mathbb{Z}$, where $t \in \mathbb{Z}$ acts on $H$ by multiplication by $m^t$. Indeed, we can take the generators $a : u \mapsto um$ and $b : u \mapsto u + 1$. One can check that $[\mathrm{BS}, \mathrm{BS}] \leq H$, hence the group is two-step solvable. Furthermore, $a^{-n} b a^n = b^{m^n}$. So
$$\big\| b^{m^n} \big\|_{\langle b \rangle} = m^n \qquad \text{but} \qquad \big\| b^{m^n} \big\|_{\mathrm{BS}} = 2n + 1\,.$$
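The relation $a^{-n} b a^n = b^{m^n}$ behind this computation can be confirmed in the concrete realization above, letting the generators act on exact rationals (a sketch for $m = 2$, reading words left to right as a right action; the helper names are ad hoc):

```python
from fractions import Fraction

m = 2
def a(u):     return u * m        # the generator a: u -> u*m
def a_inv(u): return u / m
def b(u):     return u + 1        # the generator b: u -> u+1

def act(u, gens):                 # apply the generators left to right
    for g in gens:
        u = g(u)
    return u

results = []
for n in range(1, 6):
    w = [a_inv] * n + [b] + [a] * n                 # the word a^{-n} b a^{n}
    results.append(act(Fraction(0), w) == m ** n)   # b^{m^n} sends 0 to m^n
print(results)   # [True, True, True, True, True]
```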

So $\langle b \rangle$ as a subgroup of $\mathrm{BS}(1, m)$ does not have polynomial distortion.

Theorem 4.8. Finitely generated almost nilpotent groups have polynomial growth.

Proof. The proof will use induction on the nilpotent rank. First of all, by Corollary 3.5, we can assume that $\Gamma$ is nilpotent. So, we have the short exact sequence $1 \to [\Gamma, \Gamma] \to \Gamma \to \Gamma^{\mathrm{ab}} \to 1$, where $\Gamma_1 = [\Gamma, \Gamma]$ is nilpotent of strictly smaller rank. Moreover, as we showed in Section 4.1, it is finitely generated. Take a finite generating set $S$ of it. Let the Abelian rank of $\Gamma^{\mathrm{ab}}$ be $r$ and its free-rank be $r' \leq r$. Take a generating set $e_1, \dots, e_r$ for $\Gamma^{\mathrm{ab}}$, with $r'$ free generators, and let the $\pi$-preimages be $T = \{t_1, \dots, t_r\}$. Then $\Gamma$ is generated by $S \cup T$.

Consider any word $w$ of length at most $R$ in $S \cup T$. Move all letters of $T$ to the front of $w$: since $\Gamma_1 \trianglelefteq \Gamma$, for any $g \in \Gamma_1$ and $t \in T$ there is some $g' \in \Gamma_1$ with $gt = tg'$. We get a word $w'$ representing the same group element, but of the form $w_T w_S$, where $w_T$ is a word on $T$ of length equal to the number of $T$-letters in the original $w$, while $w_S$ is a word on $S$, but its length might be much longer than the original number of $S$-letters in $w$. Now, since $\pi(w_T)$ in $\Gamma^{\mathrm{ab}}$ can be written in the form $e_1^{k_1} \cdots e_r^{k_r}$, there is some element $h \in [\Gamma, \Gamma]$ such that $w_T w_S = t_1^{k_1} \cdots t_r^{k_r} h$ in $\Gamma$. Since $\|w\|_{S \cup T} \leq R$, we also have $|k_1| + \dots + |k_r| \leq R$, and hence, by the triangle inequality, $\|h\|_{S \cup T} \leq 2R$. But, by Proposition 4.7, $\Gamma_1$ has polynomial distortion, so, if $D$ is the degree of this polynomial for the generating sets $S$ and $S \cup T$, then $\|h\|_S \leq O(R^D)$. By the induction hypothesis, $\Gamma_1$ has polynomial growth of some degree $d$, hence $\big| \{ h \in \Gamma_1 : \|h\|_S \leq O(R^D) \} \big| = O(R^{dD})$. Since the number of different possible words $t_1^{k_1} \cdots t_r^{k_r}$ is $O(R^r)$, we altogether have $|B_R^{S \cup T}| = O(R^{dD + r})$, so $\Gamma$ has polynomial growth.
Theorem 4.9. (i) Any finitely generated almost solvable group of polynomial growth is almost nilpotent. (ii) The statement remains true if only subexponential growth is assumed.

Proof. The beginning and the overall strategy of the proof are the same in the two cases. By Corollary 3.5, we may assume that $\Gamma$ is infinite and solvable. Then, there is a first index $j \geq 0$ in its derived series such that $\Gamma^{(j)} / [\Gamma^{(j)}, \Gamma^{(j)}]$ is infinite. Since $[\Gamma : \Gamma^{(j)}] < \infty$, we can further assume that $j = 0$. So, by the argument at the end of Section 4.1, there is a short exact sequence
$$1 \longrightarrow N \longrightarrow \Gamma \longrightarrow \mathbb{Z} \longrightarrow 1\,. \tag{4.2}$$
It is also clear from that argument that $[N, N] \leq [\Gamma, \Gamma]$, so $N$ is solvable. Moreover, in both cases, (i) and (ii), it is finitely generated, by Proposition 4.6. Furthermore, we know from the argument at (4.1) that this exact sequence splits.

We now prove (i) by induction on the degree of the polynomial growth of $\Gamma$. For degree 0, i.e., when $\Gamma$ is finite, the statement is trivial. Now, if $\Gamma$ has volume growth $O(R^d)$, then Proposition 4.6 says that $N$ has growth $O(R^{d-1})$, so, by induction, it is almost nilpotent, and then Proposition 4.5 says that $\Gamma$ is also almost nilpotent, and we are done.

In case (ii), we know only that $\Gamma$ has subexponential growth. By Proposition 4.6, so does $N$. So, we can iterate (4.2), with $N$ in place of $\Gamma$. If this procedure stops after finitely many steps, then the last $N$ is a finite group, which is trivially almost nilpotent, hence we can apply Proposition 4.5 iteratively to the short exact sequences we got, and find that the $\Gamma$ at the top was also almost nilpotent. However, it is not obvious that the splitting procedure (4.2) terminates. For instance, for the finitely generated solvable lamplighter group $\Gamma = \mathbb{Z} \wr \mathbb{Z}$ that we mentioned earlier, in the first step we can have $N \simeq \mathbb{Z} \wr \mathbb{Z}$ again, and then we can continue splitting off a $\mathbb{Z}$ factor ad infinitum. Of course, $\Gamma$ has exponential growth and $[\Gamma, \Gamma]$ is not finitely generated, which suggests that subexponential growth should crucially be used. This is done in the next exercise, finishing the proof.
Exercise 4.9. * Assume that $\Gamma$ is finitely generated and has infinitely many subgroups $\Gamma_1, \Gamma_2, \dots$ with the properties that for each $i \geq 1$, $\Gamma_i \simeq \mathbb{Z}$, and $\Gamma_i \cap \langle \Gamma_{i+1}, \Gamma_{i+2}, \dots \rangle = \{1\}$. Show that $\Gamma$ has exponential growth.

Proof of Proposition 4.6. Let $\Gamma = \langle f_1, f_2, \dots, f_k \rangle$. Take $\gamma \in \Gamma$ so that $\pi(\gamma) = 1 \in \mathbb{Z}$. Now, for each $f_i$ pick $s_i \in \mathbb{Z}$ so that $\pi(f_i \gamma^{-s_i}) = 0$. Let $g_i = f_i \gamma^{-s_i} \in N$. Then $\Gamma = \langle g_1, \dots, g_k, \gamma \rangle$. If we let
$$S = \big\{ \gamma_{m,i} := \gamma^m g_i \gamma^{-m} \ \big|\ m \in \mathbb{Z},\ i = 1, \dots, k \big\}\,,$$
then $\langle S \rangle = N$, since for any $f \in \ker(\pi)$,
$$f = f_{i_1} \cdots f_{i_l} = g_{i_1} \gamma^{s_{i_1}} \cdots g_{i_l} \gamma^{s_{i_l}} = (g_{i_1}) \big( \gamma^{s_{i_1}} g_{i_2} \gamma^{-s_{i_1}} \big) \big( \gamma^{s_{i_1} + s_{i_2}} g_{i_3} \gamma^{-s_{i_1} - s_{i_2}} \big) \cdots \big( \gamma^{\sum_{k=1}^{l-1} s_{i_k}} g_{i_l} \gamma^{s_{i_l}} \big)\,,$$
where the last factor is actually also a conjugated $g_{i_l}$, since $\gamma^{s_{i_l}} = \gamma^{-\sum_{k=1}^{l-1} s_{i_k}}$, because $\sum_{k=1}^{l} s_{i_k} = 0$, due to $f \in \ker(\pi)$. So indeed $f \in \langle S \rangle$.
Now consider a fixed $i$ and the collection of $2^m$ words (for $m > 0$)
$$\gamma\, g_i^{\epsilon_1}\, \gamma\, g_i^{\epsilon_2} \cdots \gamma\, g_i^{\epsilon_m}\,, \qquad \epsilon_j \in \{0, 1\}\,,$$
each of length at most $2m$ on the generators $\gamma$ and $g_i$. So, the subexponential growth of $\Gamma$ implies that there must exist some $m$ and $\epsilon \neq \epsilon'$ such that
$$\gamma\, g_i^{\epsilon_1} \cdots \gamma\, g_i^{\epsilon_m} = \gamma\, g_i^{\epsilon'_1} \cdots \gamma\, g_i^{\epsilon'_m}\,.$$
Now notice that, somewhat miraculously,
$$\gamma\, g_i^{\epsilon_1}\, \gamma\, g_i^{\epsilon_2} \cdots \gamma\, g_i^{\epsilon_m}\, \gamma^{-m} = \gamma_{1,i}^{\epsilon_1}\, \gamma_{2,i}^{\epsilon_2} \cdots \gamma_{m,i}^{\epsilon_m}\,.$$
Thus our relation becomes
$$\gamma_{1,i}^{\epsilon_1} \cdots \gamma_{m,i}^{\epsilon_m} = \gamma_{1,i}^{\epsilon'_1} \cdots \gamma_{m,i}^{\epsilon'_m}\,.$$
By $\epsilon_m \neq \epsilon'_m$ (which we may assume, after cancelling a common tail), this yields $\gamma_{m,i} \in \langle \gamma_{0,i}, \dots, \gamma_{m-1,i} \rangle$. Since $\gamma_{m+1,i} = \gamma\, \gamma_{m,i}\, \gamma^{-1}$, we also get $\gamma_{m+1,i} \in \langle \gamma_{1,i}, \dots, \gamma_{m,i} \rangle \leq \langle \gamma_{0,i}, \dots, \gamma_{m-1,i} \rangle$. We can do the same argument for $m < 0$, and so get that $\langle \gamma_{n,i} \mid n \in \mathbb{Z} \rangle$ is finitely generated; doing this for every $i$ gives that $N$ is finitely generated.

Let $Y$ be this particular finite generating set of $N$. Then $\Gamma = \langle Y \cup \{\gamma\} \rangle$. Let $B_R^Y$ be the ball of radius $R$ in $N$ with this generating set. Suppose $B_R^Y = \{h_1, \dots, h_K\}$, and consider the elements
$$\{ h_i \gamma^k \mid -R \leq k \leq R \}\,.$$
If $h_i \gamma^k = h_j \gamma^l$, then $\gamma^{l-k} = h_j^{-1} h_i \in N$, but $\pi(\gamma^{l-k}) = l - k$, so $l = k$ and $i = j$. That is, these elements are distinct, and they belong to $B_{2R}^{Y \cup \{\gamma\}}$. There are $K(2R+1)$ of them. So
$$|B_R^Y| = K \leq \frac{\big| B_{2R}^{Y \cup \{\gamma\}} \big|}{2R + 1}\,.$$
This shows both parts (i) and (ii) of the proposition.
4.4 Expanding maps. Polynomial and intermediate volume growth

Let us start with some definitions.

Definition 4.4. Let $X$ and $Y$ be metric spaces. Then $\phi : X \to Y$ is an expanding map if there exists $\lambda > 1$ such that $d_Y(\phi(x), \phi(y)) \geq \lambda\, d_X(x, y)$ for all $x, y \in X$.

Recall the definition of virtually isomorphic groups from the end of Section 3.1. In particular, a group homomorphism $\phi : \Gamma_1 \to \Gamma_2$ is an expanding virtual isomorphism if it is expanding (hence injective), and $[\Gamma_2 : \phi(\Gamma_1)] < \infty$.
Lemma 4.10 (Franks' lemma 1970). If a finitely generated group $\Gamma$ has an expanding virtual automorphism, then it has polynomial growth.

Exercise 4.10. Given two different generating sets $S_1$ and $S_2$ for $\Gamma$, prove that if $\phi$ is expanding in $G(\Gamma, S_1)$, then $\phi^k$ is expanding in $G(\Gamma, S_2)$ for some $k \geq 1$.

The standard examples of expanding virtual automorphisms are the following:

1. In $\mathbb{Z}^d$, the map $x \mapsto kx$ is an expanding virtual isomorphism, since $[\mathbb{Z}^d : k\mathbb{Z}^d] = k^d$.

2. For the Heisenberg group $H_3(\mathbb{Z})$, the map
$$\Phi_{m,n} : \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \mapsto \begin{pmatrix} 1 & mx & mnz \\ 0 & 1 & ny \\ 0 & 0 & 1 \end{pmatrix}$$
is an expanding virtual automorphism, with index $[H_3(\mathbb{Z}) : \Phi_{m,n}(H_3(\mathbb{Z}))] = m^2 n^2$.
Exercise 4.11. ** If $\Gamma = \mathbb{Z}^d \rtimes_M \mathbb{Z}$ is nilpotent, does it have an expanding virtual automorphism?

There exist nilpotent groups with no isomorphic subgroup of finite index greater than one [Bel03]. Groups with this property are called co-Hopfian. (And groups having no isomorphic factors with non-trivial kernel are called Hopfian.)

Instead of proving Franks' Lemma 4.10, let us explore a geometric analogue:

Lemma 4.11. Let $M$ be a Riemannian manifold. Assume $v(r) := \sup_{x \in M} \mathrm{vol}(B_r(x)) < \infty$. If $\phi : M \to M$ is an expanding homeomorphism with Jacobian $\mathrm{Det}(D\phi) < K$, then $M$ has polynomial volume growth.

Proof. From Definition 4.4 of expanding, we have $\phi(B_r(x)) \supseteq B_{\lambda r}(\phi(x))$, which gives
$$\mathrm{vol}\big( \phi(B_r(x)) \big) \geq \mathrm{vol}\big( B_{\lambda r}(\phi(x)) \big)\,.$$
By the bound on the Jacobian of $\phi$, we have $K v(r) \geq \mathrm{vol}(\phi(B_r(x))) \geq \mathrm{vol}(B_{\lambda r}(\phi(x)))$, and taking the supremum over $x$ on both sides gives $K v(r) \geq v(\lambda r)$. This implies polynomial growth by the following: for a given $r$, set $j = \lceil \log_\lambda r \rceil$. Then
$$v(r) \leq v(\lambda^j) \leq K v(\lambda^{j-1}) \leq \dots \leq K^j v(1)\,.$$
Since $K^j \leq K^{\log_\lambda r} \max(1, K)$ and $K^{\log_\lambda r} = r^{\log_\lambda K}$, we finally have $v(r) \leq C r^d$, where $d = \log_\lambda K$ and $C = v(1) \max(1, K)$.

Exercise 4.12. Prove the group version of Franks' Lemma. (Hint: Emulate the ideas of the proof of the topological version, but in a discrete setting, where the bounded Jacobian is analogous to the finite index of the subgroup.)
Exercise 4.13. *** Assume that $\Gamma$ is a finitely generated group that has a virtual isomorphism $\phi$ such that
$$\bigcap_{n \geq 1} \phi^n(\Gamma) = \{1\}\,.$$
(This is the case, e.g., when $\phi$ is expanding.) Does this imply that $\Gamma$ has polynomial growth?

A condition weaker than in the last exercise is the following: a group $\Gamma$ is called scale-invariant if it has a chain of subgroups $\Gamma = \Gamma_0 \geq \Gamma_1 \geq \Gamma_2 \geq \dots$ such that $\Gamma_n \simeq \Gamma$, $[\Gamma : \Gamma_n] < \infty$, and $\bigcap_n \Gamma_n = \{1\}$.

This notion was introduced by Itai Benjamini, and he had conjectured that it implies polynomial growth of $\Gamma$. However, this was disproved in [NekP09], by the following examples. The proofs use the self-similar actions of these groups on rooted trees; see Section 15.1.

Theorem 4.12. The following groups of exponential growth are scale-invariant: the lamplighter group $\mathbb{Z}_2 \wr \mathbb{Z}$, described below in Section 5.1; the affine groups $\mathbb{Z}^d \rtimes \mathrm{GL}(d, \mathbb{Z})$; the solvable Baumslag-Solitar groups $\mathrm{BS}(1, m)$, defined in Section 4.3.

On the other hand, torsion-free Gromov-hyperbolic groups are not scale-invariant, a result that can be a little bit motivated by the well-known fact that hyperbolic spaces do not have homotheties.

Benjamini's question was motivated by the renormalization method of percolation theory. In [NekP09], we formulated a more geometric version of this question, on scale-invariant tilings in transitive graphs, which is still relevant to percolation renormalization and is still open; see Question 12.26.

Going back to volume growth, here are some big theorems:

Theorem 4.13 (Tits alternative [Tit72]). A linear group (a subgroup of a matrix group) either contains $F_2$ or is virtually solvable. In particular, a linear group has either exponential or polynomial growth.

Theorem 4.14 (Grigorchuk [Gri83]). There exist finitely generated groups of intermediate growth.

Conjecture 4.15. There are no groups with superpolynomial growth but with growth of order $\exp\big( o(\sqrt{n}) \big)$.

For an introduction to groups of intermediate growth, see [GriP08]; for more details, see [dlHar00, BarGN03]. The proof that Grigorchuk's group has intermediate growth relies on the following observation:

Lemma 4.16 (Higher order Franks' lemma). If $\Gamma$ is a group with growth function $v(n)$ and there exists an expanding virtual isomorphism $\Gamma^m \to \Gamma$ for some $m \geq 2$, then
$$\exp(n^{\alpha_1}) \preceq v(n) \preceq \exp(n^{\alpha_2}) \qquad \text{for some } 0 < \alpha_1 \leq \alpha_2 < 1\,.$$

Exercise 4.14. Prove the higher order Franks' lemma. (Hint: $\Gamma^m \to \Gamma$ implies the existence of $\alpha_1$, since $v(n)^m \leq C\, v(kn)$ for all $n$ implies that $v(n)$ has superpolynomial growth. The expanding virtual isomorphism gives the existence of $\alpha_2 < 1$.)
5 Isoperimetric inequalities

5.1 Basic definitions and examples

We start with a coarse geometric definition that is even more important than volume growth.

Definition 5.1. Let $\psi$ be an increasing, positive function and let $G$ be a graph of bounded degree. Then we say that $G$ satisfies the $\psi$-isoperimetric inequality $\mathrm{IP}_\psi$ if there exists $\kappa > 0$ such that
$$|\partial_E S| \geq \kappa\, \psi(|S|)$$
for any finite subset of vertices $S \subset V(G)$, where the boundary $\partial_E S$ is defined to be the set of edges which are adjacent to a vertex in $S$ and a vertex outside of $S$. The supremum of all $\kappa$'s with this property is usually called the $\psi$-isoperimetric constant, denoted by $\iota_{\psi, E}$.

Besides the edge boundary $\partial_E$ defined above, we can also consider the outer vertex boundary $\partial_V^{\mathrm{out}} S$, the set of vertices outside of $S$ with at least one neighbour in $S$, and the inner vertex boundary $\partial_V^{\mathrm{in}} S$, the set of vertices inside $S$ with at least one neighbour outside $S$. Since $G$ has bounded degrees, any of these could have been used in the definition. Also note that the satisfaction of an isoperimetric inequality is a quasi-isometry invariant.

$\mathbb{Z}^d$ satisfies $\mathrm{IP}_{n^{1-1/d}}$, often denoted $\mathrm{IP}_d$. (This is so well-known that one might forget that it needs a proof. See Subsections 5.3 and 5.4.) If a graph satisfies $\mathrm{IP}_\infty$, i.e., a linear isoperimetric inequality with $\psi(n) = n$, then it is called nonamenable, as described in the next definition. The isoperimetric constant $\iota_{\infty, E}$ in this case is usually called the Cheeger constant.

Definition 5.2. A bounded degree graph $G$ is amenable if there exists a sequence $\{S_n\}$ of connected subsets of vertices, $S_n \subset V(G)$, such that
$$\frac{|\partial S_n|}{|S_n|} \longrightarrow 0\,.$$
Such an $\{S_n\}$ is called a Følner sequence. We say that a group is Følner amenable if any of its finitely generated Cayley graphs is. If, in addition to connectedness, the $S_n$'s also satisfy $S_n \nearrow V(G)$, i.e., $S_n \subseteq S_{n+1}$ for all $n$ and $\bigcup_n S_n = V(G)$, then the sequence $\{S_n\}$ is called a Følner exhaustion. Not every amenable graph has a Følner exhaustion, as the example after the next exercise shows.

Exercise 5.1. (a) Find the edge Cheeger constant $\iota_E$ of the infinite binary tree.
(b) Show that a bounded degree tree is amenable if and only if there is no bound on the length of hanging chains, i.e., chains of vertices with degree 2.
Consider now the following tree. Take a bi-infinite path with an origin, and, for each even $k \geq 0$, root a binary tree at every vertex whose distance from the origin is between $2^k$ and $2^{k+1}$. This tree has unbounded hanging chains and is thus amenable; however, it is clear that you cannot both have $S_n$ connected and $S_n \subseteq S_{n+1}$ in a Følner sequence, because otherwise the $S_n$'s would contain larger and larger portions of binary trees, causing them to have a large boundary.

Lemma 5.1. In Definition 5.2, one can achieve $S_n \nearrow V(G)$ in the case of amenable Cayley graphs.

Proof. Given a Følner sequence $\{S_n\}$, set
$$S_n^r := \bigcup \big\{ gB_r : g \in \Gamma,\ gB_r \subseteq S_n \big\}\,,$$
where $B_r$ is the ball in $G$ of radius $r$ centered at the origin. Without loss of generality we can assume $e \in S_n$, since $\Gamma$ acts on $G$ by graph automorphisms, and thus choosing any $g_n \in S_n$ we can consider $g_n^{-1} S_n$ instead. Now, since $e \in B_r$, we have $S_n^r \subseteq S_n$, hence $|S_n^r| \leq |S_n|$. Also,
$$|\partial S_n^r| \leq |B_r|\, |\partial S_n|\,,$$
since every boundary vertex of $S_n^r$ lies within distance $r$ of some boundary vertex of $S_n$. Thus we have
$$\frac{|\partial S_n^r|}{|S_n^r|} \leq \frac{|\partial S_n|\, |B_r|}{|S_n^r|}\,.$$
Now, for each $r \in \mathbb{N}$, choose $n_r$ such that $|\partial S_{n_r}| / |S_{n_r}| \leq \frac{1}{r\, |B_r|}$, and set $\tilde{S}_r := S_{n_r}^r$. Then $\{\tilde{S}_r\}$ is a Følner sequence, and $e \in S_n$ implies that $B_r \subseteq \tilde{S}_r$. If we now take a rapidly growing sequence $\{r(i)\}_i$ such that $\tilde{S}_{r(i)} \subseteq B_{r(i+1)}$, then $\{\tilde{S}_{r(i)}\}_i \nearrow G$ is a Følner exhaustion.

The archetypical examples of the difference between amenable and non-amenable graphs are the Euclidean versus hyperbolic lattices, e.g., tilings of the Euclidean versus hyperbolic plane. The notions "non-amenable", "hyperbolic", "negative curvature" are very much related to each other, but there are also important differences. Here is a down-to-earth exercise to practice these notions; it might not be obvious at first sight, but part (a) is a special case of part (b).
Figure 5.1: Trying to create at least 7 neighbours for each country.

Exercise 5.2. (a) Consider the standard hexagonal lattice. Show that if you are given a bound $B < \infty$, and can group the hexagons into countries, each being a connected set of at most $B$ hexagons, then it is not possible to have at least 7 neighbours for each country.
(b) In a locally finite planar graph $G$, define the combinatorial curvature at a vertex $x$ by
$$\mathrm{curv}_G(x) := 2\pi - \sum_i \frac{(L_i - 2)\,\pi}{L_i}\,,$$
where the sum runs over the faces adjacent to $x$, and $L_i$ is the number of sides of the $i$-th face. Show that if there exists some $\delta > 0$ such that the curvature is less than $-\delta$ at each vertex, then it is not possible that both $G$ and its planar dual $G^*$ are edge-amenable.
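For intuition on part (b), one can evaluate the curvature formula at a few vertex types: three hexagons or four squares around a vertex give curvature 0 (the flat Euclidean tilings), while a hypothetical vertex surrounded by three heptagons, the situation Figure 5.1 tries to force, already has curvature $-\pi/7$. A minimal sketch:

```python
import math

def curv(face_sizes):
    # curv_G(x) = 2*pi - sum over adjacent faces of (L_i - 2)*pi / L_i
    return 2 * math.pi - sum((L - 2) * math.pi / L for L in face_sizes)

hexagonal = curv([6, 6, 6])      # a vertex of the hexagonal lattice
square    = curv([4, 4, 4, 4])   # a vertex of the square lattice
heptagons = curv([7, 7, 7])      # three 7-gons around a vertex
print(hexagonal, square, heptagons)  # 0.0, 0.0, and about -0.449 (= -pi/7)
```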

Exercise 5.3. Show that a group with a continuum number of ends must be non-amenable.

We now look at an important example of a Følner amenable group with exponential growth. The lamplighter group is defined to be the wreath product $\mathbb{Z}_2 \wr \mathbb{Z} := \big( \bigoplus_{\mathbb{Z}} \mathbb{Z}_2 \big) \rtimes_\sigma \mathbb{Z}$, where the left group consists of all bi-infinite binary sequences with only finitely many nonzero terms, and $\sigma$ is the left shift automorphism on this group. Thus a general element of the lamplighter group looks like $(f, m)$, where $f : \mathbb{Z} \to \mathbb{Z}_2$ has $|\mathrm{supp}(f)| < \infty$, interpreted as a configuration of $\mathbb{Z}_2$-lamps on a $\mathbb{Z}$-street, and $m \in \mathbb{Z}$ is the position of the lamplighter or marker. Such pairs multiply according to the semidirect product rules; see Section 4.2.

Let $e_k \in \bigoplus_{\mathbb{Z}} \mathbb{Z}_2$ denote the function that has a 1 in the $k$-th place and zeroes everywhere else. The group is generated by the following three elements:
$$s := (e_0, 0)\,; \qquad R := (0, 1)\,; \qquad L := (0, -1)\,.$$
Multiplication by these generators gives $s \cdot (f, m) = (e_0, 0)(f, m) = (e_m + f, m)$, while $R \cdot (f, m) = (f, m + 1)$ and $L \cdot (f, m) = (f, m - 1)$, and so they can be interpreted as "switch", "Right" and "Left". (Unfortunately, because of the way we defined semidirect multiplication, we need to multiply from the left with these nice generators to get this nice interpretation.)

The volume of a ball in this generating set has the bound $|B_n(\mathrm{id})| \geq 2^{n/2}$, therefore the group has exponential growth. This bound is clear because in $n$ steps the lamplighter can walk past $\lfloor n/2 \rfloor$ lamps and at each one either switch it (apply $s$) or not. On the other hand, it is not hard to see that this left Cayley graph is amenable: set
$$S_n = \big\{ (f, m) : -n \leq m \leq n \ \text{and} \ \mathrm{supp}(f) \subseteq [-n, n] \big\}$$
and observe that $|S_n| = 2^{2n+1}(2n+1)$ and $|\partial_V^{\mathrm{in}} S_n| = 2^{2n+1} \cdot 2$, since the points on the boundary are exactly those with $m = n$ or $m = -n$.
5.2 Amenability, invariant means, wobbling paradoxical decompositions

The algebraic counterpart of amenability is the following.

Definition 5.3 (von Neumann 1929). A finitely generated group $\Gamma$ is von Neumann amenable if it has an invariant mean on $L^\infty(\Gamma)$, i.e., there exists a linear map $m : L^\infty(\Gamma) \to \mathbb{R}$ with the following two properties for every bounded $f : \Gamma \to \mathbb{R}$:
1. $m(f) \in [\inf(f), \sup(f)]$;
2. for all $\gamma \in \Gamma$, $m(\gamma f) = m(f)$, where $(\gamma f)(x) := f(\gamma x)$.

Proposition 5.2. A finitely generated group $\Gamma$ is von Neumann amenable if and only if there exists a finitely additive, invariant (with respect to translations by group elements) probability measure $\mu$ on all subsets of $\Gamma$.

To prove the proposition, identify $\mu(A)$ with $m(1_A)$, and approximate general bounded functions by step functions.

Example: $\mathbb{Z}$ is amenable, but the obvious candidate
$$\mu(A) = \limsup_{n} \frac{|[-n, n] \cap A|}{2n + 1}$$
is not good enough, as it fails finite additivity. But there is some sort of limit argument to find an appropriate measure, which requires the Axiom of Choice.

Exercise 5.4. (a) Prove that subgroups of amenable groups are amenable.
(b) Given a short exact sequence $1 \to A_1 \to \Gamma \to A_2 \to 1$, show that if $A_1$ and $A_2$ are amenable, then $\Gamma$ is as well.

From the amenability of $\mathbb{Z}$, this exercise gives that $\mathbb{Z}^d$ is amenable. Moreover, it can be proved that an infinite direct sum of amenable groups is also amenable, hence the lamplighter group is also amenable: it is two-step solvable, with $[\Gamma, \Gamma] \leq \bigoplus_{\mathbb{Z}} \mathbb{Z}_2$. More generally, a group is amenable if and only if all of its finitely generated subgroups are amenable, and hence the exercise implies that any solvable group is amenable.

Theorem 5.3 (Følner [Føl55]). If $\Gamma$ is a finitely generated group, it is von Neumann amenable if and only if any of its Cayley graphs is Følner amenable.

Theorem 5.4 (Kesten [Kes59]). A finitely generated group is amenable if and only if the spectral radius of any of its Cayley graphs, as defined in (1.4), equals 1.

We will sketch a proof of Følner's theorem below, and prove Kesten's theorem in a future lecture, Section 7.2.

Proposition 5.5. $F_2$ is nonamenable in the von Neumann sense.
Proof. Denote $F_2 = \langle a, b \rangle$, and let $A^+$ denote the set of reduced words in $F_2$ beginning with $a$. Let $A^-$ denote the set of words beginning with $a^{-1}$, and let $A = A^+ \cup A^-$. Define $B^+$, $B^-$, and $B$ similarly. Notice that $F_2 = A \sqcup B \sqcup \{e\}$, and that $F_2$ also equals $A^+ \sqcup aA^-$, as well as $B^+ \sqcup bB^-$. Now, suppose that we have $\mu$ as in Proposition 5.2. Certainly $\mu(\{e\}) = 0$, and so we have
$$\mu(F_2) = \mu(A) + \mu(B) + \mu(\{e\}) = \mu(A^+) + \mu(A^-) + \mu(B^+) + \mu(B^-)$$
$$= \mu(A^+) + \mu(aA^-) + \mu(B^+) + \mu(bB^-) = \mu(A^+ \sqcup aA^-) + \mu(B^+ \sqcup bB^-) = 2\mu(F_2)\,.$$
This is a contradiction, and thus no such measure exists.

Exercise 5.5. * Show that $\mathrm{SO}(3) \geq F_2$. (Use the Ping Pong Lemma, Lemma 2.15.)

This exercise and Proposition 5.5 form the basis of the Banach-Tarski paradox: the 3-dimensional solid ball can be decomposed into finitely many pieces that can be rearranged (using rigid motions of $\mathbb{R}^3$) to give two copies of the original ball (of the same size!). See [Lub94] or [Wag93] for more on this.

The following theorem was proved by Olshanskii in 1980, by Adian in 1982, by Gromov in 1987, and again by Olshanskii and Sapir in 2002, this time for finitely presented groups.

Theorem 5.6 (Olshanskii 1980). There exist nonamenable groups without $F_2$ as a subgroup.

An example of this is the Burnside group $B(m, n) = \langle g_1, \dots, g_m \mid g^n = 1 \ \forall g \rangle$ for $m \geq 2$ and odd $n \geq 665$.
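Going back to the proof of Proposition 5.5: the decomposition $F_2 = A^+ \sqcup aA^-$ that drives the contradiction is easy to confirm by enumerating reduced words in a ball (a small sketch; the letters 'a', 'A', 'b', 'B' encode $a, a^{-1}, b, b^{-1}$):

```python
# Verify F_2 = A+ disjoint-union aA- within the ball of radius 4.
def reduce(w):
    out = []
    for c in w:
        if out and out[-1] == c.swapcase():
            out.pop()              # cancel a pair x x^{-1}
        else:
            out.append(c)
    return ''.join(out)

def words(n):                      # all reduced words of length <= n
    ws = {''}
    for _ in range(n):
        ws |= {reduce(w + c) for w in ws for c in 'aAbB'}
    return ws

W = words(5)
ball4 = {w for w in W if len(w) <= 4}
Aplus = {w for w in W if w.startswith('a')}
aAminus = {reduce('a' + w) for w in W if w.startswith('A')}   # a * A^-
covered = all(w in Aplus or w in aAminus for w in ball4)
disjoint = not (Aplus & aAminus)
print(covered, disjoint)   # True True
```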

Now we sketch the proof of Følner's theorem, but first we define some notions and state an exercise we will use. I learned about this approach, using wobbling paradoxicity, from Gábor Elek; see [ElTS05] and the references there, although I am not sure that this proof has appeared anywhere before.

Definition 5.4. Let $X$ be a metric space. A map $\phi : X \to X$ is wobbling if $\sup_x d(x, \phi(x)) < \infty$. Further, if $G = (V, E)$ is a graph, then the maps $\phi, \psi : V \to V$ form a paradoxical decomposition of $G$ if they are wobbling injections such that $\phi(V) \sqcup \psi(V) = V$ is a disjoint partition.

Exercise 5.6. * A bounded degree graph is nonamenable if and only if it has a wobbling paradoxical decomposition. (Hint: state, prove and use the locally finite infinite bipartite graph version of the Hall marriage theorem [Die00, Theorem 2.1.2], called the Hall-Rado theorem.)
Sketch of proof of Theorem 5.3. For the reverse direction, if there exists a Følner sequence $\{S_n\}$, define
$$\mu_n(A) := \frac{|A \cap S_n|}{|S_n|}\,,$$
and show that some sort of limit $\mu$ exists. Now, because $\{S_n\}$ is a Følner sequence, $|S_n \,\triangle\, S_n g^{-1}| = o(|S_n|)$ for $g$ a generator. So
$$\mu_n(Ag) = \frac{|Ag \cap S_n|}{|S_n|} = \frac{\big| (A \cap S_n g^{-1})\, g \big|}{|S_n|}$$
will have the same limit as $\mu_n(A)$, giving the invariance of $\mu$.

For the forward direction, we prove the contrapositive, so assume that $G$ is nonamenable. Then, by Exercise 5.6, it has a paradoxical decomposition $\phi$ and $\psi$. Suppose that both of these maps move every vertex a distance of at most $r$. Then we can decompose
$$V = A_1 \sqcup A_2 \sqcup \dots \sqcup A_k = B_1 \sqcup B_2 \sqcup \dots \sqcup B_\ell\,,$$
where $\phi|_{A_i}$ is translation by some $g_i \in B_r$, $\psi|_{B_i}$ is translation by some $h_i \in B_r$, and $k, \ell \leq |B_r|$. If we let $C_{i,j} := A_i \cap B_j$, and if we assume for contradiction that there exists some invariant probability measure $\mu$ on $\Gamma$ as in Proposition 5.2, then, since $\mu(\phi(C_{i,j})) = \mu(C_{i,j}) = \mu(\psi(C_{i,j}))$, we have
$$\mu(V) = \mu\big( \phi(V) \sqcup \psi(V) \big) = \sum_{i,j} \Big[ \mu\big(\phi(C_{i,j})\big) + \mu\big(\psi(C_{i,j})\big) \Big] = 2\,\mu(V)\,.$$
Hence $\Gamma$ is von Neumann nonamenable.
5.3 From growth to isoperimetry in groups

The following theorem was proved by Gromov for groups, and generalized by Coulhon and Saloff-Coste in 1993 to any transitive graph.

Theorem 5.7 ([CouSC93]). Let $\Gamma$ be a finitely generated group, with a right Cayley graph $G(\Gamma, S)$. Define the inverse growth rate by
$$\rho(n) := \min \big\{ r : |B_r(o)| \geq n \big\}\,.$$
Let $K$ be any finite subset of $\Gamma$. Recall the definition $\partial_V^{\mathrm{in}} K = \{ v \in K : vS \not\subseteq K \}$. Then we have
$$\frac{|\partial_V^{\mathrm{in}} K|}{|K|} \geq \frac{1}{2\, \rho(2|K|)}\,.$$

Proof. Take any $s \in S$. It is clear that $x \in K \setminus K s^{-1}$, i.e., $x \mapsto xs$ moves $x$ out of $K$, only if $x \in \partial_V^{\mathrm{in}} K$. Thus $|K \setminus K s^{-1}| \leq |\partial_V^{\mathrm{in}} K|$. More generally, if $\|g\|_S = r$, i.e., $g$ is a product of $r$ generators, then, by iterating the above argument, we see that $|K \setminus K g^{-1}| \leq r\, |\partial_V^{\mathrm{in}} K|$.

On the other hand, let $\rho = \rho(2|K|)$, and observe that for any $x \in K$,
$$\big| \{ xg : g \in B_\rho(o) \} \setminus K \big| \geq |K| \geq \big| \{ xg : g \in B_\rho(o) \} \cap K \big|\,,$$
since the size of $\{ xg : g \in B_\rho(o) \}$ is at least $2|K|$. Therefore, if we pick $g \in B_\rho(o)$ uniformly at random, then
$$\mathbf{P}\big[ g \text{ moves } x \text{ out of } K \big] \geq 1/2 \geq \mathbf{P}\big[ g \text{ leaves } x \text{ in } K \big]\,,$$
which implies $\mathbf{E}\big[ \text{number of } x\text{'s moved out of } K \big] \geq |K|/2$. Hence, there is a $g \in B_\rho(o)$ that moves at least $|K|/2$ elements out of $K$. Combining our upper and lower bounds on the number of elements of $K$ moved out of $K$ by $g$, we have
$$\rho\, |\partial_V^{\mathrm{in}} K| \geq \|g\|_S\, |\partial_V^{\mathrm{in}} K| \geq |K|/2\,,$$
and we are done.
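Theorem 5.7 can be sanity-checked on $\mathbb{Z}^2$, where $|B_r(o)| = 2r^2 + 2r + 1$. The sketch below tests the inequality on random finite subsets (purely illustrative; a check, not a proof):

```python
import random

def ball_size(r):                 # |B_r(o)| in the Z^2 lattice
    return 2 * r * r + 2 * r + 1

def rho(n):                       # inverse growth rate: min{r : |B_r(o)| >= n}
    r = 0
    while ball_size(r) < n:
        r += 1
    return r

def inner_boundary(K):
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [v for v in K if any((v[0] + dx, v[1] + dy) not in K for dx, dy in steps)]

random.seed(1)
ok = True
for _ in range(100):
    K = {(random.randrange(-6, 7), random.randrange(-6, 7))
         for _ in range(random.randrange(1, 60))}
    ok &= len(inner_boundary(K)) / len(K) >= 1 / (2 * rho(2 * len(K)))
print(ok)  # True, as the theorem guarantees for every finite K
```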

Examples: For $\Gamma = \mathbb{Z}^d$, we have $\rho(n) \asymp n^{1/d}$, hence we get that it satisfies $\mathrm{IP}_d$. (We will see another proof strategy in the next section.) Hence $\mathbb{Z}^d$ shows that the above inequality is sharp, at least in the regime of polynomial growth. The lamplighter group shows that the inequality is also sharp for groups of exponential growth.

Exercise 5.7 (Timár). *** For any finitely generated group $\Gamma$: does there exist $C_1$ such that for all finite $A \subset \Gamma$ there is a $g \in \Gamma$ with $0 < d(A, gA) \leq C_1$? And does there exist $C_2$ such that for all finite $A, B \subset \Gamma$ there is a $g \in \Gamma$ with $0 < d(A, gB) \leq C_2$?

Exercise 5.8. Give an example of a group where $C_2 > 1$ is needed.

One reason for these exercises to appear here is the important role translations played also in Theorem 5.7. A simple application of an affirmative answer to Exercise 5.7 would be that the boundary-to-volume ratio would get worse for larger sets, since we could glue translated copies of any small set to get larger sets with worse boundary-to-volume ratio. In other words, the isoperimetric profile
$$\Phi(r) := \inf \big\{ |\partial S| / |S| : |S| \leq r \big\}$$
would be roughly decreasing. An actual result using the same idea is [LyPer10, Theorem 6.2]: for any finite set, the boundary-to-volume ratio is strictly larger than the Cheeger constant. Related isoperimetric inequalities were proved in [BabSz92] and [Zuk00].
5.4 Isoperimetry in $\mathbb{Z}^d$

On $\mathbb{Z}^d$, the following sharp result is known:

Theorem 5.8. For any finite $S \subset \mathbb{Z}^d$,
$$|\partial_E S| \geq 2d\, |S|^{1 - \frac{1}{d}}\,,$$
where $\partial_E S$ is the set of edges with one vertex in $S$ and one in $S^c$.

One proof of Theorem 5.8, using some very natural compression methods, was given by Bollobás and Leader [BolL91]. A beautiful alternative method can be seen in [LyPer10, Section 6.7]: it goes through the following theorem, which is proved using conditional entropy inequalities. See [BaliBo] for a concise treatment and a mixture of both the entropy and compression methods, with applications to combinatorial number theory.

Theorem 5.9 (Discrete Loomis-Whitney inequality). For all finite $S \subset \mathbb{Z}^d$,
$$|S|^{d-1} \leq \prod_{i=1}^{d} |P_i(S)|\,,$$
where $P_i(S)$ is the projection of $S$ onto the coordinate hyperplane perpendicular to the $i$-th coordinate direction.
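Both Theorem 5.8 and the Loomis-Whitney inequality are easy to test numerically on random subsets of $\mathbb{Z}^3$ (a quick illustrative check):

```python
import random
from math import prod

def edge_boundary(S, d):
    steps = [(i, s) for i in range(d) for s in (1, -1)]
    return sum(1 for v in S for i, s in steps
               if tuple(v[j] + (s if j == i else 0) for j in range(d)) not in S)

def projections(S, d):
    # |P_i(S)|: number of points left after deleting the i-th coordinate
    return [len({v[:i] + v[i + 1:] for v in S}) for i in range(d)]

random.seed(2)
d = 3
ok_lw = ok_ip = True
for _ in range(50):
    S = {tuple(random.randrange(4) for _ in range(d)) for _ in range(random.randrange(1, 30))}
    ok_lw &= len(S) ** (d - 1) <= prod(projections(S, d))                  # Loomis-Whitney
    ok_ip &= edge_boundary(S, d) >= 2 * d * len(S) ** (1 - 1 / d) - 1e-9   # Theorem 5.8
print(ok_lw, ok_ip)  # True True
```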

The Loomis–Whitney inequality gives

    |S|^{(d−1)/d} = ( |S|^{d−1} )^{1/d} ≤ ( ∏_{i=1}^d |P_i(S)| )^{1/d} ≤ (1/d) Σ_{i=1}^d |P_i(S)| ≤ |∂_E S| / (2d),

using the arithmetic–geometric mean inequality in the middle step, which is Theorem 5.8. Let me give my personal endorsement for the conditional entropy proof of Theorem 5.9 in [LyPer10, Section 6.7]. In [Pet08], I needed to prove an isoperimetric inequality in the wedge W_h ⊂ Z^3, the subgraph of the lattice induced by the vertices V(W_h) = {(x, y, z) : x ≥ 0 and |z| ≤ h(x)}, where h(x) is some increasing function. Namely, using the flow criterion of transience, Theorem 6.5 below, it was shown in [LyT83] that W_h is transient iff

    Σ_{j=1}^∞ 1/(j h(j)) < ∞.   (5.1)

For example, h(j) = log^r j gives transience iff r > 1. I wanted to show that this implies Thomassen's condition for transience [Tho92], which is basically an isoperimetric inequality IP_ψ with

    Σ_{k=1}^∞ ψ(k)^{−2} < ∞.   (5.2)
(The reason for this goal will be clear in Section 12.4.) In such a subgraph W_h, the Bollobás–Leader compression methods seem completely useless, but I managed to prove the result using conditional entropy inequalities for projections. What I proved was that W_h satisfies IP_ψ with ψ(v) := √(v h(√v)). As can be guessed from its peculiar form, this is not likely to be sharp, but it is good enough to deduce (5.2) from (5.1). E.g., for h(v) = v^α, we get ψ(v) = v^{(2+α)/4}, which is close to the easily conjectured isoperimetric function v^{(1+α)/(2+α)} only for α close to 0, but that is exactly the interesting regime here, hence this strange ψ(v) suffices.

Exercise 5.9. Show that the wedge W_h with h(v) = v^α, α > 0, does not satisfy IP_ψ whenever ψ(v)/v^{(1+α)/(2+α)} → ∞.
6 Random walks, discrete potential theory, martingales

Probability theory began with the study of sums of i.i.d. random variables: LLN, CLT, large deviations. One can look at this as the theory of 1-dimensional random walks. One obvious generalization is to consider random walks on graphs with more interesting geometries, or on arbitrary graphs in general. The first two sections here will introduce a basic and very useful technique for this: electric networks and discrete potential theory. Another obvious direction of generalization is to study stochastic processes that still take values in R and resemble random walks in some sense, but whose increments are not i.i.d. any more: these will be the so-called martingales, the subject of the third section. Discrete harmonic functions will connect martingales to the first two sections.

6.1 Markov chains, electric networks and the discrete Laplacian

Simple random walks on groups (for which we saw examples in Section 1.1) are special cases of reversible Markov chains, which we now define. A Markov chain is a sequence of random variables X_1, X_2, X_3, ... ∈ V such that

    P[X_{n+1} = y | X_1 = x_1, X_2 = x_2, ..., X_n = x_n] = P[X_{n+1} = y | X_n = x_n] = p(x_n, y).

That is, the behaviour of the future states is governed only by the current state. The values p(x, y) are called the transition probabilities, where it is usually assumed that Σ_{y∈V} p(x, y) = 1 for all x ∈ V. We will also use the notation p_n(x, y) = P[X_n = y | X_0 = x], just as in Section 1.1.

Given any measure μ (finite or infinite) on the state-space V, the Markov chain tells us how it evolves in time: after one step, we get the measure (μP)(y) = Σ_{x∈V} μ(x) p(x, y). A measure π is called stationary if the chain leaves it invariant: πP = π. An important basic theorem is that any finite Markov chain has a stationary distribution, which is unique if the chain is irreducible (which means that for any x, y ∈ V there is some n with p_n(x, y) > 0). These facts follow from the Perron–Frobenius theorem for the matrix of transition probabilities, which we will not state here. There is also a probabilistic proof, constructing a stationary measure using the expected number of returns to any given vertex. See, for instance, [Dur96, Section 5.4]. For an infinite state space, there could be no stationary measure, or several. We will not discuss these matters in this generality, but see the example below.

We say that a Markov chain is reversible if there exists a reversible measure, i.e., a non-negative function π(x), not identically zero, that satisfies

    π(x) p(x, y) = π(y) p(y, x)   for all x, y.

Note that this reversible measure is unique up to a global factor, since π(x)/π(y) is given by the Markov chain. Note also that reversible measures are also stationary. But not every stationary measure is reversible: it is good to keep in mind the following simple examples.

Example. Consider the n-cycle Z (mod n) with transition probabilities p(i, i+1) = 1 − p(i+1, i) = p > 1/2 for all i. It is easy to see that this is a non-reversible chain; intuitively, a movie of the evolving chain looks different from the reversed movie: it moves more in the + direction. But the uniform distribution is a stationary measure. On the other hand, the chain with the same formula for the transition probabilities on Z is already reversible. Although the uniform measure π(i) ≡ 1, i ∈ Z, is still stationary but non-reversible, the measure π(i) = (p/(1−p))^i is reversible. The above intuition with reversing the movie goes wrong now because π is not a finite measure, hence looking at a typical realization of the movie is simply meaningless. A simple real-life example of an infinite chain having some unexpected stationary measures is the following joking complaint of my former advisor Yuval Peres: "Each day is of average busyness: busier than yesterday but less busy than tomorrow."

Exercise 6.1. Show that a Markov chain (V, P) has a reversible measure if and only if for all oriented cycles x_0, x_1, ..., x_n = x_0, we have

    ∏_{i=0}^{n−1} p(x_i, x_{i+1}) = ∏_{i=0}^{n−1} p(x_{i+1}, x_i).

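The biased n-cycle example is easy to check in a few lines (a sketch; n = 5 and p = 0.7 are arbitrary choices): the uniform measure is stationary, yet the cycle criterion of Exercise 6.1 fails, so no reversible measure can exist.

```python
n, p = 5, 0.7                       # biased walk on the n-cycle
P = [[0.0] * n for _ in range(n)]
for i in range(n):
    P[i][(i + 1) % n] = p           # step in the + direction
    P[i][(i - 1) % n] = 1 - p       # step in the - direction

# the uniform measure is stationary: (pi P)(j) = 1/n for every j
pi = [1.0 / n] * n
piP = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
assert all(abs(x - 1.0 / n) < 1e-12 for x in piP)

# ...but the chain is not reversible: the products of transition probabilities
# around the oriented cycle 0 -> 1 -> ... -> 0 and around its reversal differ
forward = backward = 1.0
for i in range(n):
    forward *= P[i][(i + 1) % n]    # p^n
    backward *= P[(i + 1) % n][i]   # (1-p)^n
assert forward != backward          # Exercise 6.1 rules out reversibility
```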

Now, all Markov chains on a countable state space are in fact random walks on weighted graphs:

Definition 6.1. Consider a directed graph with weights on the directed edges: for any two vertices x and y, let c(x, y) be any non-negative number such that c(x, y) > 0 only if x and y are neighbours. Let C_x = Σ_y c(x, y), and define the weighted random walk to be the Markov chain with transition probabilities

    p(x, y) = c(x, y) / C_x.

An important special case is when c(x, y) = c(y, x) for every x and y, i.e., we have weights on the undirected edges. Such weighted graphs are usually called electric networks, and the edge weights c(e) are called conductances. The inverses r(e) = 1/c(e) ∈ (0, ∞] are called resistances. The weighted random walk associated to an electric network is always reversible: C_x is a reversible measure. On the other hand, any reversible Markov chain on a countable state space comes from an electric network, since we can define c(x, y) := π(x) p(x, y) = π(y) p(y, x) = c(y, x).

An electric network is like a discrete geometric space: we clearly have some notion of closeness coming from the neighbouring relation, where a large resistance should mean a larger distance. Indeed, we will define discrete versions of some of the notions of differential and Riemannian geometry, which will turn out to be very relevant for studying the random walk associated to the network.

Consider an electric network on the graph G = (V, E). If e is an edge with vertices e+ and e−, decompose it into two directed edges e and −e such that e runs from e− to e+, while −e runs from e+ to e−. Denote by E⃗ the set of directed edges thus formed. Take f, g : V → R. Define

    ∇f(e) = [f(e+) − f(e−)] c(e)

to be the gradient of f. In a cohomological language, ∇ is a coboundary operator, since, from functions on zero-dimensional objects (the vertices), it produces functions on one-dimensional objects (the edges). Also define the inner product

    (f, g) = (f, g)_C = Σ_{x∈V} f(x) g(x) C_x.

Take θ, η : E⃗ → R such that θ(−e) = −θ(e). Define the boundary operator

    ∂θ(x) = Σ_{e: e+ = x} θ(e) (1/C_x).

Also define another inner product,

    (θ, η) = (θ, η)_r = (1/2) Σ_{e∈E⃗} θ(e) η(e) r(e).

Some works, e.g. [LyPer10], omit the vertex conductances C_x from the inner product on V and from the boundary operator ∂, while our definition agrees with [Woe00]. This is an inessential difference, but it is good to watch out for it.


Proposition 6.1. (∇f, θ)_r = (f, ∂θ)_C, i.e., ∂ and ∇ are the adjoints of each other (hence the notation).

Proof. The right hand side is

    Σ_{x∈V} ( Σ_{e: e+ = x} θ(e) (1/C_x) ) f(x) C_x = Σ_{x∈V} Σ_{e: e+ = x} θ(e) f(x).

The left hand side is

    (1/2) Σ_{e∈E⃗} [f(e+) − f(e−)] c(e) θ(e) (1/c(e)) = (1/2) Σ_{e∈E⃗} [f(e+) − f(e−)] θ(e).

In this sum, for each x and e with e+ = x, the term f(x)θ(e) is counted twice: once as f(e+)θ(e) from the edge e itself, and once as −f((−e)−)θ(−e) from the reversed edge −e. Since θ(−e) = −θ(e), the two sums are equal.

Definition 6.2. The Markov operator P is the operator defined by

    (Pf)(x) = Σ_y p(x, y) f(y),

acting on functions f : V → R: taking the one-step average by the Markov chain. The Laplacian operator is Δ := I − P. A function f : V → R is harmonic at some x ∈ V if Δf(x) = 0; that is, if the mean value property holds: the average of the function values after one step of the chain equals the value at the starting point x. A function is harmonic if it is harmonic at every vertex.

Exercise 6.2. Show that the Markov operator P is self-adjoint with respect to the inner product (·, ·)_π, with some π : V → R_{≥0}, if and only if π is a reversible measure for the Markov chain.

Note that a direct consequence of harmonicity is the maximum principle: if f : V → R is harmonic for all x ∈ V \ U, then there are no strict local maxima in V \ U, and if there is a global maximum over V, then it must be achieved also at some point of U.

Now, how is the Laplacian related to the boundary and coboundary operators? For any f : V → R, we have

    ∂∇f(x) = Σ_{e: e+ = x} ∇f(e) (1/C_x)
           = Σ_{e: e+ = x} [f(e+) − f(e−)] c(e)/C_x
           = f(x) Σ_{e: e+ = x} c(e)/C_x − Σ_{e: e+ = x} f(e−) c(e)/C_x
           = f(x) − Σ_y f(y) c(x, y)/C_x
           = f(x) − Σ_y f(y) p(x, y)
           = f(x) − (Pf)(x) = Δf(x).

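These operators are easy to machine-check on a toy network (a sketch; the particular graph, conductances, and test functions below are arbitrary choices):

```python
import random

# conductances on a 4-cycle with a chord; c is symmetric on directed edges
c = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.0, (3, 0): 0.5, (0, 2): 1.5}
c.update({(y, x): w for (x, y), w in list(c.items())})
V = range(4)
C = {x: sum(w for (a, _), w in c.items() if a == x) for x in V}
p = {(x, y): w / C[x] for (x, y), w in c.items()}

def grad(f):
    # (grad f)(e) = [f(e+) - f(e-)] c(e), for e running from e- = x to e+ = y
    return {(x, y): (f[y] - f[x]) * c[(x, y)] for (x, y) in c}

def bdry(theta):
    # boundary operator: sum over edges pointing into x, divided by C_x
    return {x: sum(t for (_, b), t in theta.items() if b == x) / C[x] for x in V}

def ip_V(f, g):   # (f, g)_C
    return sum(f[x] * g[x] * C[x] for x in V)

def ip_E(s, t):   # (s, t)_r, with r(e) = 1/c(e)
    return 0.5 * sum(s[e] * t[e] / c[e] for e in c)

random.seed(1)
f = {x: random.random() for x in V}
half = {(x, y): random.random() for (x, y) in c if x < y}
theta = {**half, **{(y, x): -t for (x, y), t in half.items()}}  # antisymmetric

# Proposition 6.1: (grad f, theta)_r = (f, bdry theta)_C
assert abs(ip_E(grad(f), theta) - ip_V(f, bdry(theta))) < 1e-9

# bdry(grad f) = (I - P) f, the Laplacian
lap = {x: f[x] - sum(p[(x, y)] * f[y] for y in V if (x, y) in p) for x in V}
assert all(abs(bdry(grad(f))[x] - lap[x]) < 1e-9 for x in V)
```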
Definition 6.3. Let A and Z be disjoint subsets of V(G). A voltage between A and Z is a function f : V → R that is harmonic at every x ∈ V \ (A ∪ Z), while f|_A ≥ 0 and f|_Z ≤ 0. A flow from A to Z is a function θ : E⃗ → R with θ(−e) = −θ(e) such that ∂θ(x) = 0 for every x ∈ V \ (A ∪ Z), while ∂θ|_A ≥ 0 and ∂θ|_Z ≤ 0. In words, it satisfies Kirchhoff's node law: the inflow equals the outflow at every inner vertex. The strength of a flow is

    ‖θ‖ := Σ_{a∈A} ∂θ(a) C_a = −Σ_{z∈Z} ∂θ(z) C_z,

the total net amount flowing from A to Z.

A main example of a flow is the current flow θ := ∇f associated to a voltage function f between A and Z. (It is a flow precisely because of the harmonicity of f.)

Example 1. Let G be a recurrent network, and let f(x) = P_x[τ_A < τ_Z], where the τ's are the hitting times on A, Z ⊂ V(G). It is obvious that f|_A ≡ 1 and f|_Z ≡ 0, while f(x) ∈ [0, 1] for all x ∈ V, hence f|_A ≥ 0 and f|_Z ≤ 0. For x ∈ V \ (A ∪ Z), we have f(x) = Σ_y f(y) p(x, y). From this, it follows that f is harmonic on this set. Therefore, f is a voltage function from A to Z.

Example 2. A slightly more general example is the following: let U be any subset of V, and let f : U → R be any real-valued function. Then, for x ∈ V \ U, define f(x) = E[f(X_τ) | X_0 = x], where X_τ is the first vertex visited in U (we assume that τ < ∞ almost surely). Then f is harmonic on V \ U: the exact same argument as above works. If we now define U_0 = {u ∈ U : f(u) < 0} and U_1 = {u ∈ U : f(u) > 0}, then ∇f gives a flow on the graph from U_1 to U_0.

Example 3. Let Z ⊂ V be such that V \ Z is finite, and let x ∉ Z. Then

    G_Z(x, y) := E_x[ number of times the walk goes through y before reaching Z ]

is called Green's function killed at Z. It is the standard Green's function corresponding to the Markov chain killed at Z, i.e., with transition probabilities p_Z(x, y) = p(x, y) 1_{x∉Z}. (This is a subMarkovian chain, i.e., the sum of transition probabilities from certain vertices is less than 1: once in Z, the particle is killed instead of moving.) Quite similarly to the previous examples, G_Z is harmonic


in its first coordinate outside of Z ∪ {y}:

    G_Z(x, y) = Σ_{n≥1} p_Z^n(x, y)   (since x ∉ Z ∪ {y})
             = Σ_{n≥1} Σ_{x′} p_Z(x, x′) p_Z^{n−1}(x′, y)
             = Σ_{x′} p_Z(x, x′) Σ_{n≥0} p_Z^n(x′, y)
             = Σ_{x′} p_Z(x, x′) G_Z(x′, y)
             = (P G_Z(·, y))(x).

To get harmonicity in the second coordinate, we need to change G_Z a little bit. Notice that C_x G_Z(x, y) = C_y G_Z(y, x) by reversibility. Therefore, f(x) := G_Z(o, x)/C_x = G_Z(x, o)/C_o is now harmonic in x ∉ Z ∪ {o}. It is clear that f|_Z ≡ 0 and f(o) > 0, so, by the maximum principle, f ≥ 0 everywhere, with f(o) > 0 and f|_Z ≡ 0. So, f is a voltage function from o to Z.
Lemma 6.2. Given a finite network G(V, E, c), a subset U ⊂ V, and any real-valued function f on U, there exists a unique extension of f to V that is harmonic on V \ U.

Proof. Existence: The extension f(x) := E[f(X_τ) | X_0 = x] for x ∈ V \ U, given in Example 2 above, works.

Uniqueness: Suppose that f_1 and f_2 are two extensions of f. Let g = f_1 − f_2, again harmonic on V \ U. Since V is finite, the global maximum and minimum of g are attained, and by the maximum principle, they must also be attained on U, where g ≡ 0. Hence g ≡ 0 on V.

Given this lemma, we can define a certain electrical distance between disjoint nonempty subsets A and Z of V(G). Namely, consider the unique voltage function v from A to Z with v|_A ≡ α and v|_Z ≡ β, where α > β. The associated current flow i = ∇v has strength ‖i‖ > 0. It is easy to see that

    R(A ↔ Z) := (α − β)/‖i‖   (6.1)

is independent of the values α > β. It is called the effective resistance between A and Z. Its inverse C(A ↔ Z) := 1/R(A ↔ Z) is the effective conductance.

Exercise 6.3. Show that effective resistances add up when combining networks in series, while effective conductances add up when combining networks in parallel.

Exercise 6.4. (a) Show that for the voltage function f(x) of Example 3 above, the associated current flow has unit strength, hence R(o ↔ Z) = G_Z(o, o)/C_o.


(b) Using part (a), show that C(a ↔ Z) = C_a P_a[τ_Z < τ_a^+], where τ_a^+ is the first positive hitting time on a.


Exercise 6.5. Let G(V, E, c) be a transitive network (i.e., the group of graph automorphisms preserving the edge weights has a single orbit on V). Show that, for any u, v ∈ V,

    P_u[τ_v < ∞] = P_v[τ_u < ∞].

Exercise 6.6 (Green's function is the inverse of the Laplacian). Let (V, P) be a transient Markov chain with a stationary measure π and associated Laplacian Δ = I − P. Assume that the function y ↦ G(x, y)/π(y) is in ℓ²(V, π). Let f : V → R be an arbitrary function in ℓ²(V, π). Solve the equation Δu = f.

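Example 3 and Exercise 6.4 (a) can be illustrated on the path 0–1–2–3 with unit conductances, where the series law of Exercise 6.3 gives R(0 ↔ {3}) = 3 directly (a sketch; the truncated Neumann series below just sums Σ_n p_Z^n(0, 0)):

```python
# SRW on the path 0-1-2-3 with unit conductances, killed at Z = {3}
p = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}, 2: {1: 0.5, 3: 0.5}}
mu = {0: 1.0, 1: 0.0, 2: 0.0}       # distribution of the walk started at 0
G00 = 0.0
for _ in range(2000):               # G_Z(0,0) = sum_n p_Z^n(0,0)
    G00 += mu[0]
    mu = {x: sum(mu[y] * p[y].get(x, 0.0) for y in mu) for x in mu}
# Exercise 6.4 (a): R(0 <-> Z) = G_Z(0,0)/C_0, and C_0 = 1 here, while the
# series law (Exercise 6.3) gives R = r(0,1) + r(1,2) + r(2,3) = 3
assert abs(G00 - 3.0) < 1e-6
```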
6.2 Dirichlet energy and transience

In PDE theory, the Laplace equation arises as the Euler–Lagrange variational PDE for minimizing the L² norm of the gradient. The same phenomenon holds in the discrete setting.

Definition 6.4. For any f : V(G) → R, define the Dirichlet energy by

    E(f) := (∇f, ∇f)_r = (1/2) Σ_{e∈E⃗(G)} |f(e+) − f(e−)|² c(e).

With a slight abuse of notation, for any antisymmetric θ : E⃗ → R we can define E(θ) := (θ, θ)_r.

Lemma 6.3. The unique harmonic extension in Lemma 6.2 minimizes the Dirichlet energy.

Exercise 6.7. Prove this lemma. (Hint: for a quadratic function f(x) = Σ_i (x − a_i)², what is the solution of f′(x) = 0?)

Note that an antisymmetric function θ : E⃗(G) → R is the gradient ∇f of some f : V(G) → R if and only if it satisfies Kirchhoff's cycle law, i.e., it is circulation-free:

    Σ_{e∈C} θ(e) r(e) = 0   for all directed cycles C ⊂ E⃗.
Thus, Lemma 6.3 can be reformulated as follows: among all antisymmetric θ satisfying Kirchhoff's cycle law, with a given flux along the boundary (the values ∂θ(u), u ∈ U), the one that minimizes the Dirichlet energy E(θ) also satisfies Kirchhoff's node law in V \ U (i.e., it is a flow). There is a dual statement:
Lemma 6.4 (Thomson's principle). If A and Z are disjoint subsets of V(G), then among all the flows with given values ∂θ|_{A∪Z}, the current flow (i.e., the one that satisfies Kirchhoff's cycle law) has the smallest energy.


Proof. The strategy is the same as in Exercise 6.7, we just need to perturb flows instead of functions: if θ is a flow with minimal energy E(θ), and C is an oriented cycle, then consider the flow θ_C that is constant 1 on C, and a simple quadratic computation shows that E(θ + εθ_C) ≥ E(θ) can hold for all ε ∈ R only if θ satisfies the cycle law along C.

Let us compute this minimum Dirichlet energy for the unique harmonic extension (the voltage v) of v|_A ≡ α and v|_Z ≡ β, where α > β. Here

    E(∇v) = (∇v, ∇v) = (v, ∂∇v) = Σ_{a∈A} C_a α ∂∇v(a) + Σ_{z∈Z} C_z β ∂∇v(z) = α‖∇v‖ − β‖∇v‖.

So, if the voltage difference is adjusted so that we have a unit flow from A to Z, i.e., ‖∇v‖ = 1, then, by (6.1),

    E(∇v) = α − β = R(A ↔ Z).   (6.2)

Theorem 6.5 (Terry Lyons [LyT83]). A graph G is transient if and only if there exists a non-zero flow of finite energy from some vertex o ∈ V(G) to infinity (i.e., a flow with A = {o} and Z = ∅). In other words, transience is the same as positive effective conductance to infinity.

Proof. If G is transient, then Green's function G(o, x) is finite for all o, x, hence we can consider f(x) := G(o, x)/C_x, as in Example 3 after Definition 6.3. Then ∇f is a non-zero flow to infinity from o ∈ G, and, by Exercise 6.4 (a), it has unit strength. Its energy, by the above calculation, is E(∇f) = (∇f, ∇f) = (f, ∂∇f) = f(o) ‖∇f‖ = f(o) = G(o, o)/C_o < ∞, which is indeed finite.

For the other direction, assuming recurrence, take an exhaustion of G by subgraphs F_n. Recurrence of G implies, by Exercise 6.4 (b), that C(o ↔ G \ F_n) → 0 as n → ∞. So, the effective resistance blows up. By Thomson's principle (Lemma 6.4) and (6.2), this means that there are no unit strength flows from o to G \ F_n whose energies remain bounded as n → ∞. On the other hand, if there was a finite energy flow on G from o to infinity, then there would also be one with unit strength, and from that we could produce a unit strength flow of smaller energy from o to each G \ F_n, contradicting the previous sentence.

Exercise 6.8. Fill in the gaps (if there are any) in the proof of the theorem above.

Exercise 6.9. * Without consulting Lyons (Terry or Russ), find an explicit flow of finite energy on Z³. (Hint: what would a flow started from the origin do in R³? Mimic this in Z³, and hope that it will have finite energy.)

Theorem 6.6 (Kanai, 1986). If φ : G_1 → G_2 is a quasi-isometric embedding between bounded degree graphs, then there is a C < ∞ such that E_1(f ∘ φ) ≤ C E_2(f). Furthermore, if G_1 is transient, then so is G_2. Thus, if G_1 ≃_q G_2, then the G_i's are transient at the same time and E_1 ≍ E_2 (in the obvious sense that can be read off from the above inequality).

Proof. We will prove only the transience statement, the other being very similar. Suppose f_1 is a flow of finite energy on G_1, say, from a to infinity. For any edge e in G_1, we choose one of the shortest paths in G_2 going from the vertex φ(e−) to φ(e+), in an arbitrary way, and we call this path the image of e under φ. Define f_2 as follows:

    f_2(e_2) = Σ_{e_1 : e_2 ∈ φ(e_1)} ±f_1(e_1),

where the sign of f_1(e_1) depends on the orientation of the path φ(e_1) with respect to the orientation of e_2. Now f_2 is a flow from φ(a) to ∞, since the contribution of each path φ(e) to each ∂f_2(x) is zero unless x ∈ {φ(e−), φ(e+)}. Define

    κ := sup_{e∈G_1} d_2(φ(e+), φ(e−))   and   λ := sup_{e_2∈G_2} #{e_1 ∈ G_1 : e_2 ∈ φ(e_1)}.

κ is finite, since φ is a quasi-isometry. λ is finite since G_1 and G_2 are of bounded degree and φ is a quasi-isometry: if λ were infinite, then κ would also have to be infinite, since the size of a neighbourhood of radius C is bounded by a function of C and the maximum degree of the graph. The energy of f_2 is finite, since

    |f_2(e_2)|² ≤ λ Σ_{e_1 : e_2 ∈ φ(e_1)} f_1(e_1)²,   hence   Σ_{e_2∈G_2} |f_2(e_2)|² ≤ κλ Σ_{e_1∈G_1} f_1(e_1)² < ∞,

and we are done.
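Theorem 6.5 can be seen at work on the two simplest examples: on the binary tree, splitting a unit flow from the root equally at every vertex gives finite energy (so the tree is transient, a standard fact not proved in the text above), while on the half-line N any unit flow to infinity must push 1 through every edge, so no flow has finite energy (a quick numerical sketch):

```python
# binary tree: level n has 2^n edges, each carrying flow 2^{-n};
# the energy of level n is 2^n * (2^{-n})^2 = 2^{-n}, which is summable
tree_energy = sum(2 ** n * (2.0 ** -n) ** 2 for n in range(60))
assert abs(tree_energy - 2.0) < 1e-9     # total energy = sum_n 2^{-n} = 2

# half-line N: the unique unit flow to infinity has theta(e) = 1 on every
# edge, so the energy of the first n edges is exactly n, unbounded in n
for n in (10, 100, 1000):
    assert sum(1.0 ** 2 for _ in range(n)) == n
```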

6.3 Martingales

We now define one of the most fundamental notions of probability theory: martingales. We will use them in the evolving sets method of Chapter 8, in the study of bounded harmonic functions in Chapter 9, and in several other results related to random walks.

Definition 6.5. A sequence M_1, M_2, ... of R-valued random variables is called a martingale if E[M_{n+1} | M_1, ..., M_n] = M_n. More generally, given an increasing sequence of σ-algebras F_1 ⊆ F_2 ⊆ ... (called a filtration), such that M_n is measurable w.r.t. F_n, we want that E[M_{n+1} | F_n] = M_n.

Example 1: We start with a trivial example: if M_n = Σ_{k=1}^n X_k with E[X_k] = 0 for all k, with the X_k's being independent, then E[M_{n+1} | M_1, ..., M_n] = E[X_{n+1}] + M_n = M_n.

Example 2: A classical source of martingales is gambling. In a completely fair casino, if a gambler chooses, based on his history of winnings, what game to play next and with what stakes, then the


increments in his fortune will not be i.i.d., but his fortune will be a martingale: he cannot make and cannot lose money in expectation, whatever he does. In particular, for any martingale,

    E[M_k | M_0] = E[ E[M_k | M_{k−1}, M_{k−2}, ..., M_0] | M_0 ]   (by Fubini)
                = E[M_{k−1} | M_0]   (by being a martingale)
                = ··· = M_0   (by iterating),   (6.3)

and

    E[M_k] = E[M_{k−1}] = ··· = E[M_0].   (6.4)

A famous gambling example is the "double the stake until you win" strategy:

0. start with fortune M_0 = 0;
1. borrow one dollar, double or lose it with probability 1/2 each, arriving at a fortune M_1 = 2 − 1 = 1 or M_1 = 0 − 1 = −1;
2. if the first round is lost, then borrow two more dollars, double or lose it with probability 1/2 each;
3. if the second round is also lost, then borrow four more dollars, and so on, until first winning a round, and then pay back all the debts.

This will eventually happen almost surely, in a random τ-th round, and then the net payoff is 2^τ − (1 + 2 + ··· + 2^{τ−1}) = 1, so this is an almost certain way to make money in a fair game, assuming one has a friend with infinite fortune to borrow from! Now, it is easy to see that the fortune M_k (set to be constant from time τ on) is a martingale: on the event {M_{k−1} = 1}, we have M_k = 1 = M_{k−1} a.s., while

    E[M_k | M_{k−1} = 1 − 2^{k−1}] = M_{k−1} + (1/2)·2^{k−1} − (1/2)·2^{k−1} = M_{k−1}.

But then, how does E[M_k] = E[M_0] = 0 square with E[M_τ] = 1? Well, τ is a random time, so why would (6.4) apply? We will further discuss this issue a few paragraphs below, under Optional Stopping Theorems.

Example 3: G = (V, E) is a network, U ⊆ V is given, and f : V → R is harmonic on V \ U. Define

M_n = f(X_n), where X_n is a random walk, and let τ ∈ N ∪ {∞} be the time of hitting U. It may be that τ = ∞ with positive probability. On the event {τ < ∞}, we set X_{τ+i} = X_τ for all i ∈ N. Now, by the Markov property of X_n and the harmonicity of f, we have E[M_{n+1} | X_1, ..., X_n, n < τ] = f(X_n), and the same holds trivially also on the event {n ≥ τ} instead of {n < τ}, thus M_n is a martingale w.r.t. F_n = σ(X_0, ..., X_n). It is also a martingale in the more restricted sense: given the sequence (M_n)_{n=0}^∞ only,

    E[M_{n+1} | M_n, ..., M_0] = Σ_{x_0,...,x_n : f(x_i)=M_i ∀i} E[f(X_{n+1}) | X_i = x_i, i = 0, ..., n] P[X_i = x_i, i = 0, ..., n]
                              = Σ_{x_0,...,x_n : f(x_i)=M_i ∀i} E[f(X_{n+1}) | X_n = x_n] P[X_i = x_i, i = 0, ..., n]
                              = Σ_{x : f(x)=M_n} f(x) P[X_n = x] = M_n.

This averaging argument can be used in general to show that it is easier to be a martingale w.r.t. a filtration of smaller σ-algebras.

Exercise 6.10. Give an example of a random sequence (M_n)_{n=0}^∞ such that E[M_{n+1} | M_n] = M_n for all n ≥ 0, but which is not a martingale.

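The doubling strategy of Example 2 is easy to simulate, and the simulation shows both sides of the puzzle: M_τ ≡ 1, while E[M_k] = 0 for any fixed k (a quick sketch; the truncation k = 5 and the number of trials are arbitrary choices):

```python
import random

random.seed(0)
trials, k = 100_000, 5
payoff_tau = fortune_k = 0.0
for _ in range(trials):
    M, stake, rounds = 0, 1, 0
    while True:
        rounds += 1
        if random.random() < 0.5:
            M += stake          # won: stop playing, pay back the debts
        else:
            M -= stake          # lost: double the stake and go on
            stake *= 2
        if rounds == k:
            M_k = M             # fortune at the fixed time k
        if M == 1:
            break               # this is the random round tau
    if rounds < k:
        M_k = M                 # tau < k: the fortune stays 1 afterwards
    payoff_tau += M
    fortune_k += M_k

print(payoff_tau / trials)      # exactly 1: M_tau = 1 almost surely
print(fortune_k / trials)       # close to 0: E[M_k] = E[M_0] = 0
```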

Example 4: Given a filtration F_n ⊆ F_∞, possibly F_∞ = F_N for some finite N, and an integrable variable Y in F_∞, then M_n := E[Y | F_n] is a martingale:

    E[M_{n+1} | F_n] = E[ E[Y | F_{n+1}] | F_n ] = E[Y | F_n] = M_n.

In words: our best guesses about a random variable Y, as we learn more and more information about it, form a martingale. As we will see, this is a far-reaching idea.

Example 3 is a special case of Example 4, at least in the case when τ < ∞ almost surely. As we saw in Section 6.1, given f : U → R, one harmonic extension to V \ U is given by f(x) = E_x[f(X_τ)]. So, taking F_n = σ(X_0, ..., X_n) and F_∞ = σ(X_0, X_1, ...), we have that Y = f(X_τ) is F_∞-measurable, and M_n = E[Y | F_n] equals f(X_n). A generalization of this correspondence between harmonic functions and limiting values of random walk martingales, for the case when τ = ∞ is a possibility, will be Theorem 9.18.

Another instance of Example 4, very different from random walks, but typical in probabilistic combinatorics, is the edge- and vertex-exposure martingales for the Erdős–Rényi random graph model G(n, p). An instance G of this random graph is generated by having each edge of the complete graph K_n on n vertices be present with probability p and missing with probability 1 − p, independently from each other. In other words, the probability space is all subgraphs of K_n, with measure P[G = H] = p^{|E(H)|} (1 − p)^{n(n−1)/2 − |E(H)|} for any H. Now, given a variable Y on this probability space, e.g., the chromatic number χ(G), we can associate to it two martingales, as follows. Fix an ordering e_1, ..., e_m of the edges of K_n, where m = n(n−1)/2. Let G[e_1, ..., e_i] ∈ {0, 1}^{{e_1,...,e_i}} be the states of the edges e_1, ..., e_i in G, and let M_i^E(G) := E[Y | G[e_1, ..., e_i]] for i = 0, 1, ..., m. This is the edge exposure martingale associated to Y, a random sequence determined by G, with M_0^E = E[Y] and M_m^E = Y. Similarly, one can fix an ordering v_1, ..., v_n of the vertices, let G[v_1, ..., v_i] be the states of the i(i−1)/2 edges spanned by v_1, ..., v_i, and let M_i^V(G) := E[Y | G[v_1, ..., v_i]] for i = 0, 1, ..., n. This is the vertex exposure martingale.

One reason for the interest in these martingales is that the Azuma–Hoeffding inequality, Proposition 1.6, can be applied to prove the concentration of certain variables Y around their mean (even if the value of the mean is unknown). For instance, when Y = χ(G), then we have the Lipschitz property |M_{i+1}^V − M_i^V| ≤ 1 almost surely, hence, since M_n^V = χ(G), Proposition 1.6 gives

    P[ |χ(G) − Eχ(G)| > λ√n ] ≤ 2 exp(−λ²/2).

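On tiny instances one can compute the vertex exposure martingale of χ exactly and verify the martingale identities together with the Lipschitz property used here (a brute-force sketch; K_4 with p = 1/2, so that all subgraphs are equally likely, is an arbitrary small choice):

```python
from itertools import combinations, product

def chromatic(n, edges):
    # brute-force chromatic number: smallest k with a proper k-colouring
    for k in range(1, n + 1):
        if any(all(col[u] != col[v] for u, v in edges)
               for col in product(range(k), repeat=n)):
            return k

n = 4
E = list(combinations(range(n), 2))              # the 6 edges of K_4
graphs = [frozenset(e for i, e in enumerate(E) if m >> i & 1)
          for m in range(1 << len(E))]           # all 64 subgraphs (p = 1/2)
chi = {g: chromatic(n, g) for g in graphs}

def M(i, g):
    # M_i^V = E[ chi | states of the edges spanned by v_0, ..., v_{i-1} ]
    inner = frozenset(e for e in E if max(e) < i)
    match = [h for h in graphs if h & inner == g & inner]
    return sum(chi[h] for h in match) / len(match)

for g in graphs:
    seq = [M(i, g) for i in range(n + 1)]
    assert seq[0] == sum(chi.values()) / len(graphs)           # M_0 = E[chi]
    assert seq[-1] == chi[g]                                   # M_n = chi(G)
    assert all(abs(a - b) <= 1 for a, b in zip(seq, seq[1:]))  # Lipschitz
```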
The reason for the Lipschitz property is that we clearly have |χ(H) − χ(H′)| ≤ 1 if H and H′ are two graphs on the same vertex set whose symmetric difference is a set of edges incident to a single vertex. Now, when we reveal the states of the edges between v_{i+1} and {v_1, ..., v_i}, we can write M_i^V = E[M_{i+1}^V | G[v_1, ..., v_i]], and think of the right hand side as a double averaging: first over the states of the edges not spanned by {v_1, ..., v_i, v_{i+1}}, then over the edges between v_{i+1} and {v_1, ..., v_i}. Fixing the states of the edges not spanned by {v_1, ..., v_i, v_{i+1}} and varying the states of the edges between v_{i+1} and {v_1, ..., v_i}, we get a set of possible outcomes whose Y-values differ by at most 1. Hence the M_{i+1}^V-values, given M_i^V, all differ by at most 1. But their average is M_i^V, hence they can all be at distance at most 1 from M_i^V, as claimed. We give two more examples of this kind:

Exercise 6.11. Let G = (V, E) be an arbitrary finite graph, 1 ≤ k ≤ |V| an integer, and K a uniform random k-element subset of V(G). Then the chromatic number χ(G[K]) of the subgraph of G spanned by K is concentrated: for the number c(G, k) := E[χ(G[K])], we have

    P[ |χ(G[K]) − c(G, k)| > λk ] ≤ 2 exp(−λ²k/2).

The concentration of measure phenomenon shown by the next exercise is strongly related to isoperimetric inequalities in high-dimensional spaces. For a subset A of the hypercube {0, 1}^n, let B(A, t) := {x ∈ {0, 1}^n : dist(x, A) ≤ t}.

Exercise 6.12. Let λ, ε > 0 be constants satisfying exp(−λ²/2) = ε. Then, for A ⊆ {0, 1}^n,

    |A| ≥ ε 2^n  ⟹  |B(A, 2λ√n)| ≥ (1 − ε) 2^n.

That is, even small sets become huge if we enlarge them by a little.

There are two basic and very useful groups of results regarding martingales. One is known as the Martingale Convergence Theorems: e.g., any bounded martingale M_n converges to some limiting variable M_∞, almost surely and in L¹. An example of this was Example 3, in the case when f is bounded and τ < ∞ almost surely. More generally, in Example 4, we have a natural candidate for the limit: M_n → Y. This convergence follows from F_n ↗ F_∞, but in a non-trivial way, known as Lévy's 0-1 law (see Theorem 9.14 in Section 9.4). We will state but not prove a general version of the Martingale Convergence Theorem as Theorem 9.7 in Section 9.2; a thorough source is [Dur96, Chapter 4]. One version of the Martingale Convergence Theorem implies that Example 4 is not at all that special: the class of martingales arising there coincides with the uniformly integrable ones:

    lim_{K→∞} sup_{n≥0} E[ |M_n| 1_{{|M_n| > K}} ] = 0,   (6.5)

where the corresponding Y is the a.s. and L¹-limit of M_n, as n → ∞; see [Dur96, Theorem 4.5.6].

The second important group of results about martingales is the Optional Stopping Theorems: given a stopping time τ for a martingale (i.e., the event {τ > k} is F_k-measurable for all k ∈ N), when do we have E[M_τ] = E[M_0]? In Example 3, assuming that τ < ∞ a.s., we had E_x[M_τ] = E_x[Y] = f(x) = M_0 almost by definition. More generally, if M_n is a uniformly integrable martingale, as in (6.5), and τ < ∞ a.s., then E[M_τ] = E[M_0] does hold, see [Dur96, Section 4.7]. On the other hand, in Example 2, we had E[M_τ] = 1 ≠ 0 = M_0. An even simpler counterexample is SRW on Z, started from 1, viewed as a martingale (M_n)_{n=0}^∞, with τ being the first hitting time on 0. By recurrence, τ < ∞ a.s., but E[M_τ] = 0 ≠ 1 = M_0.


Let us sketch the proof of the Optional Stopping Theorem for bounded martingales. M̃_n := M_{n∧τ} is a martingale again, hence E[M̃_n] = E[M̃_0] = E[M_0] for any n ∈ N. On the other hand, τ < ∞ (a.s.) implies that lim_n M̃_n = M_τ almost surely. Hence the Dominated Convergence Theorem says that E[M_τ] = lim_n E[M̃_n] = lim_n E[M_0] = E[M_0], as desired.

We have already seen applications of martingales to concentration results and to harmonic functions defined on general graphs, and we will see more later, but let us demonstrate now that martingale techniques are useful even in the simplest example, random walk on Z.

Start a symmetric simple random walk X_0, X_1, ... at k ∈ {0, 1, ..., n}, and stop it when first hitting 0 or n, at time τ := τ_0 ∧ τ_n. What is h(k) := P_k[τ_0 > τ_n]? Since (X_i) is a bounded martingale, we have E_k[X_τ] = k by the Optional Stopping Theorem. On the other hand, E_k[X_τ] = h(k)·n + (1 − h(k))·0, thus h(k) = k/n. This is of course also the harmonic extension of h(0) = 0 and h(n) = 1, but (at least in principle) one would get this discrete harmonic extension by solving a system of linear equations, which can be considered less elegant than using the Optional Stopping Theorem. Also, one can use similar ideas in the asymmetric case:

Exercise 6.13. Consider asymmetric simple random walk (X_i) on Z, with probability p > 1/2 for a right step and 1 − p for a left step. Find a martingale of the form r^{X_i} for some r > 0, and calculate P_k[τ_0 > τ_n]. Then find a martingale of the form X_i − μi for some μ > 0, and calculate E_k[τ_0 ∧ τ_n]. (Hint: to prove that the second martingale is uniformly integrable, first show that τ_0 ∧ τ_n has an exponential tail.)

Now, condition the symmetric simple random walk (X_i) to reach n before 0. This conditioning

concerns the entire trajectory, hence it might happen, a priori, that we get a complicated non-Markovian process. A simple but beautiful result is that, in fact, we get a nice Markov process. This construction is a version of Doob's h-transform [Doo59], with h being the harmonic function h(k) := P_k[A] below:

Lemma 6.7. Let (X_i)_{i=0}^∞ be any time-homogeneous Markov chain on the state space N, and A := {τ_A < τ_Z} for some A, Z ⊂ N (more generally, it could be any event in the invariant σ-field of the chain, see Definition 9.4 in Section 9.4). Then (X_i) conditioned on A is again a Markov chain, with transition probabilities

    P[X_{i+1} = ℓ | X_i = k, A] = (P_ℓ[A] / P_k[A]) P[X_{i+1} = ℓ | X_i = k],

where P_k[A] = P[A | X_0 = k] is supposed to be positive.

Proof. Note that P[A | X_{i+1} = ℓ, X_i = k] = P[A | X_{i+1} = ℓ] = P_ℓ[A]. Then,

    P[X_{i+1} = ℓ | X_i = k, A] = P[X_{i+1} = ℓ, X_i = k, A] / P[X_i = k, A]
        = ( P[A | X_{i+1} = ℓ, X_i = k] P[X_{i+1} = ℓ, X_i = k] ) / ( P[A | X_i = k] P[X_i = k] )
        = (P_ℓ[A] / P_k[A]) P[X_{i+1} = ℓ | X_i = k],

as claimed.

Back to our example, if (X_i) is simple random walk with X_0 = k, stopped at 0 and n, and A = {τ_n < τ_0}, then P_k[A] = k/n. Therefore, the new transition probabilities are

    p(k, k−1) = (k−1)/(2k),   p(k, k+1) = (k+1)/(2k),   (6.6)

for k = 1, ..., n−1. Note the consistency property that these values do not depend on n. In particular, the conditional measures have a weak limit as n → ∞: the Markov chain with transition probabilities given in (6.6) for all k = 1, 2, .... This chain can naturally be called SRW on Z conditioned not to ever hit zero. It is the discrete analogue of the Bessel(3) process dX_t = (1/X_t) dt + dB_t, for those who have or will see stochastic differential equations. The reason for the index 3 is that the Bessel(n) process, given by dX_t = ((n−1)/(2X_t)) dt + dB_t, is the Euclidean distance of an

n-dimensional Brownian motion from the origin. (But I do not think that anyone knows a direct link between Brownian motion in R3 and the conditioned one in R.) Exercise 6.14. Show that the conditioned random walk (6.6) is transient. (Hint: construct an electric network on N that gives rise to this random walk, then use Subsections 6.1 and/or 6.2.)
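A short computational sanity check of (6.6): the sketch below (in Python, with exact rational arithmetic) solves for h(k) = Pₖ[τₙ < τ₀] from the harmonic equation without assuming the answer k/n, and then forms the h-transformed kernel of Lemma 6.7.

```python
from fractions import Fraction

def gambler_h(n):
    """h(k) = P_k[tau_n < tau_0] for SRW on {0,...,n}, found by solving the
    harmonic equation h(k) = (h(k-1) + h(k+1))/2 with h(0) = 0, h(n) = 1.
    We deliberately solve the recursion blindly instead of plugging in k/n."""
    h = [Fraction(0), Fraction(1)]           # tentative unit slope at the start
    for k in range(1, n):
        h.append(2 * h[k] - h[k - 1])        # h(k+1) = 2h(k) - h(k-1)
    scale = Fraction(1) / h[n]               # calibrate so that h(n) = 1
    return [scale * v for v in h]

n = 10
h = gambler_h(n)
for k in range(1, n):
    # Doob h-transform of Lemma 6.7: p^(k, l) = (h(l)/h(k)) * p(k, l)
    up = Fraction(1, 2) * h[k + 1] / h[k]
    down = Fraction(1, 2) * h[k - 1] / h[k]
    assert up == Fraction(k + 1, 2 * k) and down == Fraction(k - 1, 2 * k)
```

The assertions confirm that the conditioned kernel is exactly (6.6), independently of n.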

7 Cheeger constant and spectral gap {s.Cheeger}

The previous section introduced a certain geometric view on reversible Markov chains: many natural dynamically defined objects (hitting probabilities, Green's functions) turned out to have good encodings as harmonic functions over the associated electric network, and a basic probabilistic property (recurrence versus transience) turned out to have a useful reformulation via the Dirichlet energy of flows (the usefulness having been demonstrated by Kanai's quasi-isometry invariance Theorem 6.6). We will now make these connections even richer: isoperimetric inequalities satisfied by the underlying graph (the electric network) will be expressed as linear algebraic or functional analytic properties of the Markov operator acting on functions over the state space, which can then be translated into probabilistic behaviour of the Markov chain itself.

7.1 Spectral radius and the Markov operator norm {ss.specrad}

Consider some Markov chain P on the graph (V, E). We will assume in this section that P is reversible. Given any reversible measure π, we can consider the associated electric network c(x,y) = π(x)p(x,y) = c(y,x), with C_x = Σ_y c(x,y), and the usual inner products (·,·)_C on ℓ⁰(V) and (·,·)_r on ℓ⁰(E). Note that C_x = π(x) now. The Markov operator P : ℓ⁰(V) → ℓ⁰(V) clearly satisfies

‖P‖ := sup_{f ∈ ℓ⁰(V)} ‖Pf‖ / ‖f‖ ≤ 1.

Moreover, the extension P : ℓ²(V) → ℓ²(V) is self-adjoint with respect to (·,·)_C, i.e., (Pf, g) = (f, Pg).

Observe furthermore that the Dirichlet energy can be written as

E(f) = E_{P,π}(f) = (∇f, ∇f)_r = (f, (I − P)f) = ‖f‖² − (f, Pf).   (7.1) {e.DIP}

Recall the definition of the spectral radius from (1.4),

ρ(P) = lim sup_{n→∞} pₙ(o, o)^{1/n}.

Now here is the reason for calling ρ(P) the spectral radius:

Proposition 7.1. {p.specrad} ‖P‖ = ρ(P).

Proof. Using self-adjointness,

(P^{2n} δ_o, δ_o)^{1/2n} = (Pⁿ δ_o, Pⁿ δ_o)^{1/2n} = ‖Pⁿ δ_o‖^{1/n} ≤ (‖P‖ⁿ ‖δ_o‖)^{1/n} → ‖P‖.

Thus we have found that (C_o p_{2n}(o,o))^{1/2n} ≤ ‖P‖ in the limit, hence ρ(P) ≤ ‖P‖.

For the other direction, for any function f ∈ ℓ⁰(V) with norm 1,

‖P^{n+1} f‖² = (P^{n+1}f, P^{n+1}f) = (Pⁿf, P^{n+2}f) ≤ ‖Pⁿf‖ · ‖P^{n+2}f‖.

This second equality holds since P is self-adjoint, and the final inequality is by the Cauchy-Schwarz inequality. So, we get

‖P^{n+1}f‖ / ‖Pⁿf‖ ≤ ‖P^{n+2}f‖ / ‖P^{n+1}f‖.   (7.2) {e.ratioineq}

For any non-negative sequence (aₙ)ₙ₌₁^∞ we know that

lim inf_n a_{n+1}/aₙ ≤ lim inf_n aₙ^{1/n} ≤ lim sup_n aₙ^{1/n} ≤ lim sup_n a_{n+1}/aₙ.

So, if both limits limₙ ‖P^{n+1}f‖/‖Pⁿf‖ and limₙ ‖Pⁿf‖^{1/n} exist, then they must be equal. But (7.2) implies that

lim inf_n ‖P^{n+1}f‖/‖Pⁿf‖ = lim sup_n ‖P^{n+1}f‖/‖Pⁿf‖,

so, to show that the limits exist, it is sufficient to show that any of these are finite.

We will now show that lim sup_n ‖Pⁿf‖^{1/n} ≤ ρ(P). We have

‖Pⁿf‖² = (f, P^{2n}f) = Σ_x f(x) P^{2n}f(x) C_x = Σ_{x,y} f(x)f(y) p_{2n}(x,y) C_x
≤ Σ_{x,y} f(x)f(y) 1{f(x)f(y) > 0} p_{2n}(x,y) C_x.   (7.3) {e.xyfxfy}

This is a finite sum since f has finite support. Moreover, lim sup_n p_{2n}(x,y)^{1/2n} = ρ, since the radius of convergence of G(x,y|z) does not depend on x and y. So for every ε > 0 there is some N > 0 such that for all n > N, and for every pair x, y, we have p_{2n}(x,y)^{1/2n} < ρ + ε. Thus, by (7.3),

‖Pⁿf‖² ≤ Σ_{x,y} f(x)f(y) 1{f(x)f(y) > 0} (ρ + ε)^{2n} C_x < C (ρ + ε)^{2n}

for some finite constant C > 0, since f is finitely supported. Taking 2n-th roots, and since ε > 0 was arbitrary, we get that

limₙ ‖P^{n+1}f‖/‖Pⁿf‖ = limₙ ‖Pⁿf‖^{1/n} ≤ ρ.

Starting the chain of inequalities (7.2) from n = 0 implies that ‖Pf‖ ≤ ρ ‖f‖, and we are done.

Lemma 7.2. {l.hilbertnorm} For a self-adjoint operator P on a Hilbert space H = ℓ²(V),

‖P‖ := sup_{f ∈ H} ‖Pf‖/‖f‖ = sup_{f ∈ ℓ⁰(V)} (Pf, f)/‖f‖².

Proof. We give two proofs. The first uses theory, the second uses a trick. (Grothendieck had the program of doing all of math without tricks: the right abstract definitions should lead to solutions automatically. I think the problem with this is that, without understanding the tricks first, people would not find or even recognize the right definitions.)

In the finite-dimensional case (i.e., when V is finite, hence ℓ²(V) = ℓ⁰(V)), since P is self-adjoint, its eigenvalues are real, and then both sup_{f ∈ H} ‖Pf‖/‖f‖ and sup_{f ∈ ℓ⁰(V)} (Pf,f)/‖f‖² are expressions for the largest eigenvalue of P. For the infinite-dimensional case, one has to use the spectral theorem, and note that it is enough to consider the supremum over the dense subset ℓ⁰(V). For the details, see, e.g., [Rud73, Theorem 12.25].

For the tricky proof, which I learnt from [LyPer10, Exercise 6.6], first notice that

sup_{f ∈ ℓ⁰(V)} ‖Pf‖/‖f‖ ≥ sup_{f ∈ ℓ⁰(V)} |(Pf, f)|/‖f‖² ≥ sup_{f ∈ ℓ⁰(V)} (Pf, f)/‖f‖²,

where the first inequality follows from (Pf, Pf)(f, f) ≥ |(Pf, f)|², which is just Cauchy-Schwarz.

For the other direction, if the rightmost supremum is C, then, using that P is self-adjoint,

(Pf, g) = [(P(f+g), f+g) − (P(f−g), f−g)] / 4 ≤ C [(f+g, f+g) + (f−g, f−g)] / 4 = C [(f, f) + (g, g)] / 2.

Taking g := Pf · ‖f‖/‖Pf‖ yields ‖Pf‖ ≤ C ‖f‖. Again by the denseness of ℓ⁰(V) in H, this finishes the proof.
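Proposition 7.1 can also be illustrated numerically. For SRW on the 3-regular tree T₃, with ρ = 2√2/3 (cf. Section 1.1), the return probability p_{2n}(o,o) is computable by projecting the walk to its distance from o. A Python sketch; since p_{2n}(o,o)^{1/2n} converges to ρ from below with a polynomial correction, the tolerance is generous:

```python
import math

def return_prob_T3(t):
    """p_t(o,o) for SRW on the 3-regular tree T_3, via the distance-from-o
    birth-and-death chain: from 0 the walk must move to 1; from d >= 1 it
    moves to d+1 w.p. 2/3 and to d-1 w.p. 1/3."""
    dist = [0.0] * (t + 2)
    dist[0] = 1.0
    for _ in range(t):
        new = [0.0] * (t + 2)
        new[1] += dist[0]
        for d in range(1, t + 1):
            new[d - 1] += dist[d] / 3
            new[d + 1] += dist[d] * 2 / 3
        dist = new
    return dist[0]

rho = 2 * math.sqrt(2) / 3            # spectral radius of T_3
n = 600
estimate = return_prob_T3(2 * n) ** (1 / (2 * n))
assert estimate <= rho + 1e-9         # p_{2n}(o,o) <= rho^{2n} exactly
assert rho - estimate < 0.02          # slow convergence from below
```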

7.2 The infinite case: the Kesten-Cheeger-Dodziuk-Mohar theorem {ss.kesten}

In this section, we are going to prove Kesten's characterization of the amenability of a group through the spectral radius of the simple random walk, Theorem 5.4. However, we want to do this in larger generality. The isoperimetric inequalities of Chapter 5 were defined not only for groups but also for locally finite graphs; moreover, we can naturally define them for electric networks, as follows. Satisfying the (edge) isoperimetric inequality IP∞ will mean that there exists a κ > 0 such that C(∂_E S) ≥ κ π(S) for any finite connected subset S, where π(S) = Σ_{x ∈ S} C_x is the natural stationary measure for the associated random walk, and C(∂_E S) = Σ_{x ∈ S, y ∉ S} c(x,y). The largest possible κ, i.e.,

ι_{∞,E,c} := inf_S C(∂_E S) / π(S),

is the Cheeger constant of the network. Note that if we are given a reversible Markov chain (V, P), then the Cheeger constant of the associated electric network does not depend on which reversible measure we take, so we can talk about ι_{∞,E}(P) = ι_∞(P), the Cheeger constant of the chain.

The following theorem was proved for groups by Kesten in his PhD thesis [Kes59]. A differential geometric version was proved in [Che70]. Then [Dod84] and [Moh88] proved generalizations for graphs and reversible Markov chains, respectively. We will be proving Mohar's generalization, following [Woe00, Sections 4.A and 10.A].

Theorem 7.3 (Kesten, Cheeger, Dodziuk, Mohar). {t.kesten} Let (V, P) be an infinite reversible Markov chain on the state space V, with Markov operator P. The following are equivalent:

(1) (V, P) satisfies IP∞ with κ > 0;

(2) For all f ∈ ℓ⁰(V) the Dirichlet inequality α ‖f‖₂² ≤ E_{P,π}(f) is satisfied for some α > 0, where E_{P,π}(f) = (∇f, ∇f)_r = ½ Σ_{x,y} |f(x) − f(y)|² c(x,y) and ‖f‖₂² = (f, f) = Σ_x f²(x) C_x;

(3) ρ(P) ≤ 1 − α for some α > 0, where the spectral radius satisfies ρ(P) = ‖P‖, as shown in the previous section.

In fact, the optimal constants are related by ι_∞(P)²/2 ≤ 1 − ρ(P) ≤ ι_∞(P).

Proof of (2) ⇔ (3). Recall from (7.1) that E_{P,π}(f) = ‖f‖₂² − (f, Pf). Now rearrange the Dirichlet inequality α ‖f‖₂² ≤ ‖f‖₂² − (f, Pf) into (f, Pf) ≤ (1 − α) ‖f‖₂² for all f ∈ ℓ⁰(V). By Lemma 7.2, this is true precisely when ‖P‖ ≤ 1 − α.
Proof of (2) ⇒ (1). Take any finite connected set S. Then

π(S) = Σ_{x ∈ S} C_x = ‖1_S‖₂², and E_{P,π}(1_S) = ½ Σ_{x,y} |1_S(x) − 1_S(y)|² c(x,y) = C(∂_E S).

Now apply the Dirichlet inequality to 1_S to get C(∂_E S) = E_{P,π}(1_S) ≥ α ‖1_S‖₂² = α π(S), and so (V, P) satisfies IP∞ with κ = α.

To show that the first statement implies the second, we want to show that the existence of almost invariant functions, i.e., functions f such that (Pf, f) is close to ‖f‖², gives the existence of almost invariant sets, i.e., sets S with ‖P 1_S‖ close to ‖1_S‖, which are exactly the Følner sets.

Definition 7.1. {d.sobolev} The Sobolev norm of f is

S_{P,π}(f) := ½ Σ_{x,y} |f(x) − f(y)| c(x,y) = ‖∇f‖₁.

Proposition 7.4. {p.sobolevIP} For any d ∈ [1, ∞], a reversible chain (V, P) satisfies IP_d(κ) if and only if the Sobolev inequality

κ ‖f‖_{d/(d−1)} ≤ S_{P,π}(f)

holds for all f ∈ ℓ⁰(V), where IP_d(κ) means C(∂_E S) ≥ κ π(S)^{(d−1)/d} for any finite connected subset S, and ‖f‖_p = (Σ_x |f(x)|^p C_x)^{1/p}.

Proof. To prove that the Sobolev inequality implies good isoperimetry, note that ‖1_S‖_{d/(d−1)} = π(S)^{(d−1)/d} and S_{P,π}(1_S) = C(∂_E S).

To prove the other direction, first note that S_{P,π}(f) ≥ S_{P,π}(|f|) by the triangle inequality, so we may assume f ≥ 0. For t > 0, we are going to look at the super-level sets S_t = {f > t}, which will be finite since f ∈ ℓ⁰. Now,

S_{P,π}(f) = Σ_{x ∼ y : f(y) > f(x)} (f(y) − f(x)) c(x,y)
= Σ_{x ∼ y : f(y) > f(x)} c(x,y) ∫₀^∞ 1_{[f(x), f(y))}(t) dt
= ∫₀^∞ Σ_{x,y s.t. f(x) ≤ t < f(y)} c(x,y) dt
= ∫₀^∞ C(∂_E {f > t}) dt.

For d = ∞ (when d/(d−1) = 1), we get

S_{P,π}(f) = ∫₀^∞ C(∂_E {f > t}) dt ≥ κ ∫₀^∞ π({f > t}) dt = κ E[f] = κ ‖f‖₁.

For d = 1, since π({f > t})⁰ = 1 for 0 < t ≤ ‖f‖_∞, we get that S_{P,π}(f) ≥ κ ∫₀^{‖f‖_∞} dt = κ ‖f‖_∞.

The case 1 < d < ∞ is just slightly more complicated than the d = ∞ case, and is left as an exercise.

Exercise 7.1. Let d ∈ (1, ∞), and let p = d/(d−1).

(a) Show that ∫₀^∞ p t^{p−1} F(t)^p dt ≤ (∫₀^∞ F(t) dt)^p for F ≥ 0 decreasing.

(b) Using part (a), finish the proof of the proposition, i.e., the 1 < d < ∞ case.

We now use the proposition to complete the proof of the Kesten theorem.

Proof of (1) ⇒ (2) of Theorem 7.3.

‖f‖₂² = ‖f²‖₁ ≤ (1/κ) S_{P,π}(f²)   by the proposition
= (1/κ) · ½ Σ_{x,y} |f²(x) − f²(y)| c(x,y)
≤ (1/κ) · ½ Σ_{x,y} |f(x) − f(y)| · (|f(x)| + |f(y)|) c(x,y)
≤ (1/κ) ( ½ Σ_{x,y} |f(x) − f(y)|² c(x,y) )^{1/2} ( ½ Σ_{x,y} (|f(x)| + |f(y)|)² c(x,y) )^{1/2},

where the last inequality follows by Cauchy-Schwarz. The first factor above is precisely E_{P,π}(f)^{1/2}. The second factor can be upper bounded by √2 ‖f‖₂, using the inequality ½ (|x| + |y|)² ≤ x² + y². So, after squaring the entire inequality, we have

‖f‖₂⁴ ≤ (2/κ²) E_{P,π}(f) ‖f‖₂², which gives ‖f‖₂² ≤ (2/κ²) E_{P,π}(f).

Therefore, the first statement in the Kesten theorem implies the second, with α = κ²/2.
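The relation ι_∞(P)²/2 ≤ 1 − ρ(P) ≤ ι_∞(P) can be sanity-checked on the d-regular tree T_d, using ρ = 2√(d−1)/d (cf. Section 1.1) and ι_∞ = (d−2)/d; the latter value follows since a subtree on m vertices has m(d−2)+2 boundary edges, each of unit conductance, while π(S) = md. A Python sketch:

```python
import math

# Sandwich iota^2/2 <= 1 - rho <= iota for SRW on T_d, d >= 3,
# using the closed-form values stated above.
for d in range(3, 50):
    rho = 2 * math.sqrt(d - 1) / d    # spectral radius of T_d
    iota = (d - 2) / d                # Cheeger constant of T_d
    assert iota ** 2 / 2 <= 1 - rho <= iota
```

Note that for d = 3 the lower bound is nearly tight: 1/18 ≈ 0.0556 versus 1 − ρ ≈ 0.0572.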

7.3 The finite case: expanders and mixing {ss.mixing}

The decay of the return probabilities has a very important analogue for finite Markov chains: convergence to the stationary distribution. In most chains, the random walk will gradually forget its starting distribution, i.e., it gets mixed, and the speed of this mixing is a central topic in probability theory and theoretical computer science, with applications to almost all branches of sciences and technology. (As I heard once from László Lovász, when I was still in high school: "With the right glasses, everything is a Markov chain.") A great introductory textbook to Markov chain mixing is [LevPW09], from which we will borrow several things in this section.

Let (V, P) be a finite Markov chain, with the Markov operator Pf(x) = Σ_y p(x,y) f(y) = E[f(X₁) | X₀ = x] acting on ℓ²(V). The constant 1 function is obviously an eigenfunction with eigenvalue 1, i.e., P1 = 1. On the other hand, recall that a probability measure π on V is called stationary if πP = π, i.e., if πP(y) = Σ_x p(x,y) π(x) = π(y), i.e., if it is a left eigenfunction with eigenvalue 1. The existence of such a π is much less obvious than the case of the above constant eigenfunction, but we are dealing in this section only with reversible chains, for which the reversible distribution is also stationary.

Exercise 7.2. {ex.genspec} Let (V, P) be a reversible, finite Markov chain, with the stationary distribution π(x). Note that P is self-adjoint with respect to (f, g) = Σ_{x ∈ V} f(x) g(x) π(x). Show:

(a) All eigenvalues λᵢ satisfy −1 ≤ λᵢ ≤ 1;

(b) If we write 1 = λ₁ ≥ λ₂ ≥ ⋯ ≥ λₙ ≥ −1, then λ₂ < 1 if and only if (V, P) is connected (the chain is irreducible);

(c) λₙ > −1 if and only if (V, P) is not bipartite. (Recall here the easy lemma that a graph is bipartite if and only if all cycles are even.)

Recall from the Kesten Theorem 7.3 that ι_∞(P) > 0 if and only if ρ(P) < 1 for infinite reversible Markov chains. For finite chains, we have the following analogue, where 1 − λ₂ is usually called the spectral gap.

Theorem 7.5 (Alon, Milman, Dodziuk [Alo86, AloM85, Dod84]). {t.AlonMilman} For a finite reversible Markov chain (V, P) with stationary distribution π and conductances c(x,y) = π(x)p(x,y), set

h(V, P) := inf_{S ⊆ V} C(∂_E S) / (π(S) ∧ π(S^c)),

the Cheeger constant of the finite chain. Then

h²/2 ≤ 1 − λ₂ ≤ 2h.

Sketch of proof. The proof is almost identical to the infinite case, with two differences. One, the definition of the Cheeger constant is slightly different now, with the complement of the set also appearing in the denominator. Two, we have to deal with the fact that 1 is an eigenvalue, so we are interested in the spectral gap, not in the spectral radius. For this, recall that λ₂ = sup{(Pf, f)/‖f‖² : Σ_x f(x)π(x) = 0} (Rayleigh, Ritz, Courant, Fischer), since this is the subspace orthogonal to the constant functions, the eigenspace corresponding to the eigenvalue 1. By an obvious modification of Lemma 7.2 and by (7.1), this says that

1 − λ₂ = inf{ E(f)/‖f‖₂² : Σ_x f(x)π(x) = 0 }.   (7.4) {e.gapDir}

Now, what are the effects of these modifications on the proofs? When we have some isoperimetric constant h(V, P) and want to bound the spectral gap from below, start with a function f attaining the infimum in (7.4), and consider its super-level sets {x : f(x) > t}. From the argument of Proposition 7.4, for f₊ := f ∨ 0, we get

S_{P,π}(f₊) ≥ h ∫₀^∞ π({f > t}) ∧ π({f ≤ t}) dt.

By symmetry, we may assume that π({f > 0}) ≤ 1/2, hence π({f > t}) ≤ π({f ≤ t}) for all t ≥ 0, and we get S_{P,π}(f₊) ≥ h ‖f₊‖₁. Similarly, we get S_{P,π}(f₊²) ≥ h ‖f₊²‖₁ = h ‖f₊‖₂². Now, just as in the proof of (1) ⇒ (2) of Theorem 7.3, we have S_{P,π}(f₊²) ≤ √2 E(f₊)^{1/2} ‖f₊‖₂. These two inequalities combined,

h²/2 ≤ E(f₊)/‖f₊‖₂² ≤ E(f)/‖f‖₂² = 1 − λ₂,

where the second inequality is left as a simple exercise.

For the other direction, if we have a subset A with small boundary, then take f = 1_A − a 1_{A^c} instead of f = 1_A, with a chosen to make the average 0, and compute the Dirichlet norm.

Exercise 7.3. Fill in the missing details of the above proof.

The main reason for the interest in the spectral gap of a chain is the following result, saying that if the gap is large, then the chain has a small mixing time: starting from any vertex or any initial distribution, after not very many steps, the distribution will be close to the stationary measure. The speed of convergence to stationarity is basically the finite analogue of the heat kernel decay in the infinite case, and the theorem is the finite analogue of the result that a spectral radius smaller than 1 implies exponential heat kernel decay (i.e., the quite obvious inequality ρ(P) ≤ ‖P‖ in Proposition 7.1). The converse direction (does fast mixing require a uniformly positive spectral gap?) has some subtleties (pointed out to me by Ádám Timár), which will be discussed after the proof of the theorem, together with proper definitions of the different notions of mixing time.

Theorem 7.6. {t.gapmix} Let (V, P) be a reversible, finite Markov chain. Set g_abs := 1 − max_{i ≥ 2} |λᵢ|, and π_min := min_{x ∈ V} π(x). Then for all x and y,

| p_t(x,y)/π(y) − 1 | ≤ (1/π_min) (1 − g_abs)^t,

where p_t(x,y) is the probability of being at y after t steps starting from x.

The result is empty for a chain that is not irreducible: there g_abs = 0, and the upper bound 1/π_min is trivial. Note also that we use g_abs, called the absolute spectral gap, instead of 1 − λ₂. The reason is that λₙ being close to −1 is an obvious obstacle for mixing: say, if λₙ = −1, then the graph is bipartite (Exercise 7.2 (c)), and the distribution at time t depends strongly on the parity of t. Before proving the theorem, let us see four examples.

Example 1. For simple random walk on a complete n-vertex graph with loops, whose transition matrix has all entries equal to 1/n, the eigenvalues are 1, 0, …, 0, so the bound in Theorem 7.6 is 0: that is, p_t(x,y) is exactly 1/n for every t ≥ 1.

Example 2. For simple random walk on the n-cycle Cₙ, the exercise below tells us that the eigenvalues are cos(2πj/n), j = 0, 1, …, n−1. If n is even, Cₙ is bipartite, and indeed −1 is an eigenvalue. If n is odd, then the absolute spectral gap is 1 − cos(2π/n) = Θ(n⁻²). So, the upper bound in the theorem gets small at Cn² log n. The true mixing time is actually of order n², as we will see in Exercises 7.11 (b), 7.14 (d), and in Section 8.2.

Exercise 7.4. {ex.specCn} Verifying the above example, compute the spectrum of simple random walk on the n-cycle Cₙ.
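The sandwich of Theorem 7.5 can be checked by brute force on small cycles, where λ₂ = cos(2π/n) is known from Example 2 and Exercise 7.4. A Python sketch:

```python
import math
from itertools import combinations

def cheeger_cycle(n):
    """Brute-force h(V,P) for SRW on the n-cycle: pi is uniform (1/n),
    so each edge has conductance c(x,y) = (1/n)*(1/2)."""
    best = float('inf')
    for size in range(1, n // 2 + 1):
        for S in combinations(range(n), size):
            Sset = set(S)
            # each boundary edge is counted once (its endpoint in S)
            boundary = sum(1 for x in S for y in ((x - 1) % n, (x + 1) % n)
                           if y not in Sset) / (2 * n)
            best = min(best, boundary / min(size / n, (n - size) / n))
    return best

for n in range(3, 10):
    h = cheeger_cycle(n)
    gap = 1 - math.cos(2 * math.pi / n)     # 1 - lambda_2, from Exercise 7.4
    assert h ** 2 / 2 <= gap + 1e-12 <= 2 * h + 1e-12
```

For the cycle, the infimum is attained by arcs, giving h = 1/⌊n/2⌋; at n = 4 the upper bound 1 − λ₂ ≤ 2h is attained with equality.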

Example 3. A random walk can be made lazy by flipping a fair coin before each step, and only moving if the coin turns up heads (otherwise, the walk stays in the same place for that step). If {0,1}^k is the k-dimensional hypercube, consider the simple random walk on this hypercube and let P be its Markov operator. Then P̃ = (I + P)/2 is the Markov operator for the lazy random walk on the hypercube, and it has eigenvalues (1 + λᵢ)/2. It turns out that g_abs = 1/k in this case, and π_min = 2^{−k}. So, the upper bound in the theorem gets small for t = Ck², but the true mixing time is actually of order k log k; see Exercises 7.11 (b), 7.14 (b), and Exercise 8.6.

Exercise 7.5. {ex.hyperspec} Compute the spectrum of P̃ on {0,1}^k in the above example. (Hint: think of this as a product chain of random walks.)

Our last example concerns graphs that are easy to define, but whose existence is far from clear. See the next section for more on this.

Definition 7.2. {ndcExpander} An (n, d, c)-expander is a d-regular graph on n vertices, with h(V) ≥ c. A d-regular expander sequence is a sequence of (nₖ, d, c)-expanders with nₖ → ∞ and c > 0. By Theorem 7.5, the existence of this c > 0 is equivalent to having a uniformly positive spectral gap.

Example 4. For simple random walk on an (n, d, c)-expander, π is uniform, and π_min = 1/n. For t = C(d, c) log n, the bound in the theorem is small. This is sharp, up to a constant factor. In fact, note that any d-regular graph with n vertices has diameter at least log_{d−1} n, which implies that max_{x,y} | p_t(x,y)/π(y) − 1 | = 1 for t ≤ log_{d−1} n. Therefore, an expander mixes basically as fast as possible for a constant degree graph. (The converse is false, see Exercise 7.16.)
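Before turning to the proof, the bound of Theorem 7.6 is easy to test numerically, say for SRW on the 5-cycle (odd, hence with g_abs > 0), using the spectrum from Example 2. A Python sketch:

```python
import math

n = 5                                  # odd cycle: aperiodic SRW
P = [[0.0] * n for _ in range(n)]
for x in range(n):
    P[x][(x - 1) % n] = P[x][(x + 1) % n] = 0.5

pi_min = 1.0 / n                       # pi is uniform
# eigenvalues are cos(2*pi*j/n); the absolute gap drops j = 0:
g_abs = 1 - max(abs(math.cos(2 * math.pi * j / n)) for j in range(1, n))

dist = [1.0, 0.0, 0.0, 0.0, 0.0]       # start at vertex 0
for t in range(1, 31):
    dist = [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]
    for y in range(n):
        # |p_t(0,y)/pi(y) - 1| <= (1 - g_abs)^t / pi_min
        assert abs(dist[y] * n - 1) <= (1 - g_abs) ** t / pi_min + 1e-12
```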
Proof of Theorem 7.6. Define ψ_x(z) := 1_{x=z} / π(x). Then (P^t ψ_y)(x) = P[X_t = y | X₀ = x] / π(y) = p_t(x,y)/π(y). So

p_t(x,y)/π(y) − 1 = (P^t ψ_y − 1, ψ_x) = (P^t (ψ_y − 1), ψ_x) ≤ ‖ψ_x‖ · ‖P^t (ψ_y − 1)‖ = (1/√π(x)) ‖P^t (ψ_y − 1)‖,

where the inequality is from Cauchy-Schwarz. Now, since (ψ_y − 1, 1) = 0, we can write ψ_y − 1 = Σᵢ₌₂ⁿ aᵢ fᵢ, where f₁, …, fₙ is an orthonormal basis of eigenvectors. So

‖P^t (ψ_y − 1)‖² = ‖ Σᵢ₌₂ⁿ aᵢ λᵢᵗ fᵢ ‖² = Σᵢ₌₂ⁿ |λᵢᵗ|² aᵢ² ≤ λ*^{2t} ‖ Σᵢ₌₂ⁿ aᵢ fᵢ ‖² = λ*^{2t} ‖ψ_y − 1‖² ≤ λ*^{2t} ‖ψ_y‖²,

where λ* := max_{i ≥ 2} |λᵢ|, and the last inequality is by the Pythagorean theorem. So

| p_t(x,y)/π(y) − 1 | ≤ λ*ᵗ / √(π(x)π(y)) ≤ (1 − g_abs)ᵗ / π_min,

and we are done.
This theorem measures closeness to stationarity in a very strong sense, in the L^∞-norm between the functions p_t(x,·)/π(·) and 1(·), called the uniform distance. The proof itself used, as an intermediate step, the L²-distance

‖P^t ψ_y − 1‖₂ = ‖ p_t(·,y)/π(y) − 1(·) ‖₂ = ‖ p_t(y,·)/π(·) − 1(·) ‖₂ = χ²(p_t(y,·), π(·)),

where the middle equality uses that p_t(x,y)/π(y) = p_t(y,x)/π(x) for reversible chains, and χ² is the chi-square distance, the following asymmetric distance between two measures:

χ²(μ, π) := ( Σ_{x ∈ V} ( μ(x)/π(x) − 1 )² π(x) )^{1/2},   (7.5) {e.chi2dist}

which we will again use in Section 8.3. The most popular notion of distance uses the L¹-norm, or more precisely the total variation distance, defined as follows: for any two probability measures μ and ν on V,

d_TV(μ, ν) := max{ |μ(A) − ν(A)| : A ⊆ V }
= ½ Σ_{x ∈ V} |μ(x) − ν(x)| = ½ Σ_{x ∈ V} | μ(x)/ν(x) − 1 | ν(x).   (7.6) {e.TV}
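The equality of the two lines of (7.6) is easy to check numerically on a toy example (which does not, of course, replace the proof asked for in the exercise below). A Python sketch:

```python
from itertools import chain, combinations

mu = [0.1, 0.4, 0.2, 0.3]              # two arbitrary probability vectors
nu = [0.25, 0.25, 0.25, 0.25]

# First line of (7.6): maximum of |mu(A) - nu(A)| over all subsets A
subsets = chain.from_iterable(combinations(range(4), r) for r in range(5))
tv_max = max(abs(sum(mu[x] for x in A) - sum(nu[x] for x in A))
             for A in subsets)

# Second line of (7.6): half the L^1 distance
tv_l1 = 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

assert abs(tv_max - tv_l1) < 1e-12     # both equal 0.2 here
```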

Exercise 7.6. Prove the equality between the two lines of (7.6). This also shows d_TV to be a metric.

Exercise 7.7. {ex.TVcoupling} Show that d_TV(μ, ν) = min{ P[X ≠ Y] : (X, Y) is a coupling of μ and ν }.

Of course, one can use any L^p norm, or quantities related to the entropy, and so on, see [SaC97, BobT06]. In any case, given a notion of distance, the mixing time of a chain is usually defined to be the smallest time t when the distance between p_t(x,·) and π(·) becomes less than some small constant, say 1/4. This time will be denoted by t_mix(1/4). Why is this a good definition? Let us discuss the case of the total variation distance. Let

d(t) := sup_{x ∈ V} d_TV( p_t(x,·), π(·) ) and d̄(t) := sup_{x,y ∈ V} d_TV( p_t(x,·), p_t(y,·) ).

The following two exercises explain why we introduced d̄(t).

Exercise 7.8. Show that d(t) ≤ d̄(t) ≤ 2 d(t).

Exercise 7.9. Using Exercise 7.7, show that d̄(t + s) ≤ d̄(t) d̄(s).

Therefore, for t_mix(1/4) = t_mix^TV(1/4),

d(ℓ t_mix(1/4)) ≤ d̄(ℓ t_mix(1/4)) ≤ d̄(t_mix(1/4))^ℓ ≤ (2 d(t_mix(1/4)))^ℓ ≤ 2^{−ℓ}.

For instance, t_mix(2^{−100}) ≤ 100 t_mix(1/4), so t_mix(1/4) captures well the magnitude of the time needed for the chain to get close to stationarity.

Define the separation distance at time t by s(t) := sup_{x,y ∈ V} ( 1 − p_t(x,y)/π(y) ). Note that this is a one-sided version of the uniform distance: s(t) < ε means that all states have collected at least a (1 − ε)-factor of their share at stationarity, but there still could be states where the walk is very likely to be.

Exercise 7.10.* {ex.separation} Show that, in any finite reversible chain, d(t) ≤ s(t) ≤ 4 d(t/2).

The relaxation time of the chain is defined to be t_relax := 1/g_abs, where g_abs is the absolute spectral gap. This is certainly related to mixing: Theorem 7.6 implies that

t_mix^∞(1/e) ≤ (1 + ln(1/π_min)) t_relax.   (7.7) {e.inftyrelax}

It also has an interpretation as a genuine temporal quantity, shown by the following exercise that is just little more than a reformulation of the second part of the proof of Theorem 7.6:

Exercise 7.11. {ex.L2mixing}

(a) For f : V → R, let Var_π[f] := E_π[f²] − (E_π f)² = Σ_x f(x)² π(x) − (Σ_x f(x) π(x))². Show that g_abs > 0 implies that lim_t P^t f(x) = E_π f for all x ∈ V. Moreover,

Var_π[P^t f] ≤ (1 − g_abs)^{2t} Var_π[f],

with equality at the eigenfunction corresponding to the λᵢ giving g_abs = 1 − |λᵢ|. Hence t_relax is the time needed to reduce the standard deviation of any function to 1/e of its original standard deviation.

(b) Show that if the chain (V, P) is transitive, then

4 d_TV( p_t(x,·), π(·) )² ≤ ‖ p_t(x,·)/π(·) − 1(·) ‖₂² = Σᵢ₌₂ⁿ λᵢ^{2t}.

For instance, assuming the answer to Exercise 7.5, one can easily get the bound d(½ k ln k + ck) ≤ e^{−2c}/2 for c > 1 on the TV distance for the lazy walk on the hypercube {0,1}^k. (This is sharp even regarding the constant 1/2 in front of k ln k; see Exercise 7.14 below.) Also, assuming the answer to Exercise 7.4, one can get that t_mix^TV(Cₙ) = O(n²) for the n-cycle.

Furthermore, the finite chain analogue of the proof of the ‖P‖ ≤ ρ(P) inequality in Proposition 7.1 gives the following:

Proposition 7.7. {pr.convspeed} For the distance in total variation, lim_t d(t)^{1/t} = 1 − g_abs.

Proof. The ≤ direction follows almost immediately from Theorem 7.6. For the other direction, let f be an eigenfunction corresponding to the λᵢ giving g_abs = 1 − |λᵢ|. This is orthogonal to the constant functions (the eigenfunctions for λ₁ = 1), hence Σ_x f(x) π(x) = 0. Therefore,

|λᵢᵗ f(x)| = |P^t f(x)| = | Σ_y ( p_t(x,y) − π(y) ) f(y) | ≤ ‖f‖_∞ · 2 d_TV( p_t(x,·), π(·) ) ≤ 2 ‖f‖_∞ d(t).

Taking x ∈ V with |f(x)| = ‖f‖_∞, we get (1 − g_abs)ᵗ ≤ 2 d(t), then taking t-th roots gives the result.

The above inequality (1 − g_abs)ᵗ ≤ 2 d(t) easily gives that g_abs ≥ c / t_mix^TV(1/4), or in other words,

t_relax ≤ C t_mix^TV(1/4),   (7.8) {e.relaxTV}

for some absolute constants 0 < c, C < ∞.

Besides (7.8), it is clear that t_mix^TV(ε) ≤ t_mix^∞(ε) for any ε. It is therefore more than natural to ask: going from the relaxation time to the uniform mixing time, how bad is the ln(1/π_min) factor in (7.7), and when and how could we eliminate it? In some cases, this factor is not very important: e.g., in the example of simple random walk on the cycle Cₙ, it gives a bound O(n² log n), instead of the true mixing time of order n². However, for rapidly mixing chains, where t_relax = O(log |V|), this is a serious factor. For the case of SRW on an expander, it is certainly essential, and (7.7) is basically sharp. In some other cases, this factor will turn out to be a big loss: e.g., for SRW on the hypercube {0,1}ⁿ, or on a lamplighter group, say, on Z₂ ≀ Zₙᵈ, or for chains on spin systems, such as the Glauber dynamics of the Ising model in a box Zₙᵈ at critical or supercritical temperature. In these cases, |Vₙ| is exponential in the parameter n that is the dimension of the hypercube or the linear size of the base box Zₙᵈ, while we still would like to have polynomial mixing in n, and the factor ln(1/π_min) might ruin the exponent of the polynomial.

Looking at the proof of Theorem 7.6, this factor comes from two sources. The first one is an application of Cauchy-Schwarz that looks indeed awfully wasteful: while it seems very unlikely that P^t ψ_y − 1 is almost collinear with ψ_x, it seems even more unlikely that this happens for a lot of y's and a fixed x, and it is certainly impossible for a fixed y and several x's, since different ψ_x's are orthogonal. This suggests that if we do not want mixing in L^∞, only in some average sense, then we should be able to avoid this wasteful Cauchy-Schwarz. The second source is the possibility that all the non-unit eigenvalues are close to λ₂. To exclude this, one needs to understand the spectrum very well. The above Exercise 7.11 (b) gave two examples where both sources of waste can be eliminated.

Yet another method to bound total variation mixing times from above is by Exercise 7.7 above: for any coupling of two simple random walks x = X₀, X₁, X₂, … and y = Y₀, Y₁, Y₂, … on V(G),

d_TV( p_n(x,·), p_n(y,·) ) ≤ P[Xₙ ≠ Yₙ].

Here is an example. For the lazy walk on the hypercube {0,1}^k, a simple coupling is that we first choose a coordinate i ∈ {1, 2, …, k} uniformly; then, if Xₙ(i) = Yₙ(i), the two walks move or stay put together, while if Xₙ(i) ≠ Yₙ(i), then either Xₙ or Yₙ moves in the ith coordinate, each with probability half. In this coupling, we reach Xₙ = Yₙ when all the h coordinates in which X₀ and Y₀ differed have already been sampled, which is a classical coupon collector problem: in expectation, this takes time k/h + k/(h−1) + ⋯ + k/1 ≈ k ln h if both k and h are large. In the worst case h = k, this is ≈ k ln k, and, because of the independence of stages in collecting the coupons, one clearly expects sharp concentration around this expectation (see the next exercise). If we want the distribution of p_n(x,·) to get close to π(·), then y = Y₀ can be chosen uniformly, which makes h concentrated around k/2, but the expected time to couple is still ≈ k ln k, not better than what we got from the worst case.

Exercise 7.12. Use the above coupling for the lazy walk on {0,1}^k to show the total variation distance bound d(k ln k + tk) ≤ C e^{−ct} for some 0 < c, C < ∞ and all t > 0. Note that Exercise 7.11 (b) shows that this k ln k is suboptimal at least by a factor of 2.
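The coupon collector heuristic above is easy to simulate. In the following Python sketch (an illustration only, with Y₀ chosen uniformly so that h concentrates around k/2), a disagreeing coordinate is repaired exactly when it is first selected, so the coupling time is the time to collect the h coupons:

```python
import math, random

def coupling_time(k, rng):
    """Steps until the coupling described above merges two lazy walks on
    {0,1}^k: coupon collecting over the initial disagreement set."""
    disagree = {i for i in range(k) if rng.random() < 0.5}   # Y_0 uniform
    t = 0
    while disagree:
        t += 1
        disagree.discard(rng.randrange(k))   # coordinate chosen this step
    return t

rng = random.Random(42)
k, runs = 200, 200
avg = sum(coupling_time(k, rng) for _ in range(runs)) / runs
# Expectation is roughly k * ln(k/2) when h is concentrated near k/2:
assert 0.5 * k * math.log(k / 2) < avg < 2 * k * math.log(k / 2)
```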

Before giving some lower bounds on mixing times, let us quote the following very natural result from [LevPW09, Section 7.3]:

Exercise 7.13. {ex.TVfunctions} Show that if f is a function on the state space such that |E_μ f − E_ν f| ≥ r σ, where σ² = (Var_μ f + Var_ν f)/2 and r > 0, then

d_TV(μ, ν) ≥ 1 − 4/(4 + r²).

Exercise 7.14 (Lower bounds from eigenfunctions and similar observables). {ex.mixlower}

(a) For the lazy random walk on {0,1}^k, consider the total Hamming weight W(x) := Σᵢ₌₁ᵏ xᵢ. Estimate its Dirichlet energy E(W) and variance Var_π[W] = ‖W − E_π W‖₂², and deduce by the variational formula (7.4) that the spectral gap of the walk is at most of order 1/k (which we already knew from the exact computation in Exercise 7.5), i.e., the relaxation time is at least of order k. Then find the L²-decomposition of W into the eigenfunctions of the Markov operator, and deduce the decay of Var_π[P^t W] as a function of t.

(b)* For the lazy random walk {X_t}_{t ∈ N} on {0,1}^k and W as in part (a), compute

E[ W(X_t) | X₀ = 0 ] = (k/2) ( 1 − (1 − 1/k)ᵗ ) and Var[ W(X_t) | X₀ = 0 ] ≤ k/4.

Using Exercise 7.13, deduce that d(½ k ln k − ck) ≥ 1 − 8 e^{−2c+1}. See [LevPW09, Proposition 7.13] for help, if needed. Comparing with Exercise 7.11 (b), we see a sharp transition in the TV distance at time ½ k ln k: this is a standard example of the cutoff phenomenon [LevPW09, Chapter 18].

(c) Take SRW on the n-cycle Cₙ. Find a function that shows by (7.4) that the spectral gap of the chain is at most of order 1/n². (Again, we already knew this from computing the spectrum exactly in Exercise 7.4, but we want to point out here that finding one good observable is typically much easier than determining the entire spectrum.)

(d) For SRW on the n-cycle Cₙ, consider the function F(i) := 1{n/4 ≤ i < 3n/4} for i ∈ Cₙ. Observe that although this F gives only an Ω(n) lower bound on the relaxation time if the variational formula (7.4) is used, the variance Var_π[P^t F] does not start decaying before t reaches order n², hence Exercise 7.11 (b) does imply a quadratic lower bound on the relaxation time. Explain how this is possible in terms of the L²-decomposition of F into eigenfunctions of the Markov operator. Similarly, using the strategy of part (b), show that t_mix^TV(Cₙ) = Ω(n²), which is the right order, by Exercise 7.11 (b). (We already had this lower bound from part (c) and (7.8), but we wanted to point out that the same observable might fail or succeed, depending on the method.)

In the previous exercise, part (b) showed that the TV mixing time on the hypercube is significantly larger than the relaxation time, and showed the cutoff phenomenon. In part (d), on the cycle, we did not draw such conclusions. This was not an accident:

Exercise 7.15. Show that if t_mix^TV(Vₙ) ≍ t_relax(Vₙ) for a sequence of finite reversible Markov chains on n vertices, then the sequence cannot exhibit cutoff for the total variation mixing time.

There are natural chains where the three mixing times, t_relax ≪ t_mix^TV ≪ t_mix^∞, are in fact of different orders: e.g., simple random walk on the lamplighter groups Z₂ ≀ Zₙᵈ, with a natural set of generators. For d = 2,

t_relax ≍ n² log n,   t_mix^TV(ε) ≍ n² log² n,   t_mix^∞(ε) ≍ n⁴.   (7.9) {e.LLmix}

The reason is, very briefly, that the relaxation time is given by the maximal hitting time for SRW on the base graph torus Zₙ², the TV mixing time is given by the expected cover time of the base graph, while the uniform mixing time is given by the time when the set S_t of unvisited sites on the base graph satisfies E[ 2^{|S_t|} ] < 1 + ε. See [PerR04]. It is especially interesting to notice the difference between the separation and uniform mixing times: by time C n² log² n with large C, the small pieces left out by the SRW on the torus are not enough to make the p_t(x,·)-measure of any specific state too small, but they still allow for certain states to carry very high measures.

In this example, (7.9), the relaxation and the mixing times are on the same polynomial order of magnitude; their ratio is just log n, which is ≍ log log |Vₙ|. This seems to be a rather general phenomenon; e.g., I do not know examples of Markov chains over spin systems where this does not hold. However, I cannot formulate a general result or even conjecture on this, because of counterexamples like expanders, or the ones in Exercise 7.16 (b).

Let us now discuss whether a uniformly positive spectral gap is required for fast mixing. This will depend on the exact definition of fast mixing. First of all, Proposition 7.7 implies that in a sequence of finite chains, we have convergence to stationarity with a uniformly exponential speed if and only if the absolute spectral gaps are uniformly positive. On the other hand, if we define fast mixing by the mixing time being small, i.e., t_mix^TV(1/4) = O(log |V|), then (7.8) does not imply a uniformly positive absolute spectral gap. Indeed, that does not hold in general:

Exercise 7.16. {ex.qexpander} You may accept here that transitive expanders exist.

(a) Give a sequence of d-regular transitive graphs Gₙ = (Vₙ, Eₙ) with |Vₙ| → ∞ that mix rapidly, t_mix^TV(1/4) = O(log |Vₙ|), but do not form an expander sequence.

(b) In a similar manner, give a sequence Gₙ = (Vₙ, Eₙ) satisfying t_relax ≍ t_mix^TV(1/4) ≍ |Vₙ|^γ log |Vₙ| with some 0 < γ < 1.

A nice property of simple random walk on an expander is that the consecutive steps behave in several ways similarly to an i.i.d. sequence, but the walk can be generated using much less randomness. This is useful in pseudorandom generators, derandomization of algorithms, etc. To conclude this subsection, here is an exact statement of this sort:

{pr.AKSz}

Proposition 7.8. Let (V, P) be a reversible Markov chain with stationary measure π and spectral gap 1 − λ₂ > 0, and let A ⊂ V have stationary measure π(A) ≤ β < 1. Let (X_i)_{i≥0} be the trajectory of the chain in stationarity (i.e., X₀ ∼ π). Then

P[ X_i ∈ A for all i = 0, 1, …, t ] ≤ C (1 − δ)^t ,

where δ = δ(λ₂, β) > 0 and C is an absolute constant.

This was first proved by Ajtai, Komlós and Szemerédi [AjKSz87]; see [HooLW06, Section 3] for a proof and applications. I am giving here what I think is the simplest possible proof. Stronger and extended versions of this argument can be found in [AloS00, Section 9.2] and [HamMP12]. We will use the following simple observation:

{ex.suppcontr}

Exercise 7.17. If (V, P) is a reversible Markov chain with stationary measure π and spectral gap 1 − λ₂ > 0, and f ∈ L²(V, π) has the property that π(supp f) ≤ β < 1, then (Pf, f) ≤ (1 − δ₁)(f, f) and (Pf, Pf) ≤ (1 − δ₂)(f, f), for some δ_i = δ_i(λ₂, β) > 0.

Proof of Proposition 7.8. We want to rewrite the exit probability in question in a functional analytic language, since we want to use the notion of spectral gap. So, consider the projection Q : f ↦ f 1_A for f ∈ L²(V, π), and note that

P[ X_i ∈ A for i = 0, 1, …, 2t+1 ] = ( Q(PQ)^{2t+1} 1, 1 )

= ( P(QP)^t Q1, (QP)^t Q1 ) , by self-adjointness of P and Q

≤ (1 − δ₁) ( (QP)^t Q1, (QP)^t Q1 ) , by Exercise 7.17

≤ (1 − δ₁) ( P(QP)^{t−1} Q1, P(QP)^{t−1} Q1 ) , by Q being a projection

≤ (1 − δ₁)(1 − δ₂) ( (QP)^{t−1} Q1, (QP)^{t−1} Q1 ) , by Exercise 7.17

≤ (1 − δ₁)(1 − δ₂)^t ( Q1, Q1 ) , by iterating the previous two steps

≤ (1 − δ₁)(1 − δ₂)^t β ,

and we are done, at least for odd times. For even times, we can just use monotonicity in t.
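To see the statement of Proposition 7.8 concretely, one can compute the exit probability exactly by linear algebra on a toy example; the specific setup below (SRW on the complete graph K₈, with A half of the vertices) is only an illustration chosen by us, not part of the proof.

```python
import numpy as np

# Exact computation of P[X_0, ..., X_t all in A] for SRW on the complete
# graph K_8 (a toy "expander"), with A consisting of half of the vertices.
n, m = 8, 4                                    # |V| and |A|
P = (np.ones((n, n)) - np.eye(n)) / (n - 1)    # SRW transition matrix
pi = np.full(n, 1 / n)                         # uniform stationary measure
A = list(range(m))
PA = P[np.ix_(A, A)]                           # substochastic restriction to A

probs = []
v = pi[A]                                      # start in stationarity, restricted to A
for t in range(12):
    probs.append(v.sum())                      # P[X_0, ..., X_t in A]
    v = v @ PA

# the decay rate approaches the top eigenvalue of PA, here 3/7 < 1
ratios = [probs[t + 1] / probs[t] for t in range(len(probs) - 1)]
print(probs[0], ratios[-1])                    # -> 0.5 and 3/7 = 0.4285...
```

Here the uniform vector happens to be the top eigenvector of the restricted matrix, so the ratio is exactly 3/7 at every step; for a general reversible chain one only gets the exponential upper bound of the proposition.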

7.4 From infinite to finite: Kazhdan groups and expanders

{s.expanders}

From the previous section, one might think that transitive expanders are the finite analogues of non-amenable groups. This is true in some sense, but it does not mean that we can easily produce transitive expanders from non-amenable groups. A simple but crucial obstacle is the following:

Exercise 7.18.

(a) If G′ → G is a covering map of infinite graphs, then the spectral radii satisfy ρ(G′) ≤ ρ(G), i.e., the larger graph is more non-amenable. In particular, if G is an infinite k-regular graph, then ρ(G) ≥ ρ(T_k) = 2√(k−1)/k.

(b) If G′ → G is a covering map of finite graphs, then λ₂(G′) ≥ λ₂(G), i.e., the larger graph is a worse expander.

Let Γ be a finitely generated group, with a finite generating set S.

{d.kazhdan}

Definition 7.3. We say that Γ has Kazhdan's property (T) if there exists a κ > 0 such that for any unitary representation ρ : Γ → U(H) on a complex Hilbert space H without fixed (non-zero) vectors, and any v ∈ H, there exists a generator s ∈ S with ‖ρ(s)v − v‖ ≥ κ‖v‖. The maximal such κ will be denoted by κ(Γ, S) ≥ 0.

Recall that a vector fixed by Γ would be an element v ∈ H such that ρ(g)v = v for all g ∈ Γ. So, the definition says that having no invariant vectors in a representation always means there are no almost invariant vectors either. Note that if we have an action of Γ on a real Hilbert space by orthogonal transformations, then we can also consider it as a unitary action on a complex Hilbert space, hence the Kazhdan property will apply to real actions, too.

Exercise 7.19. A group having property (T) is well-defined: if κ(Γ, S₁) > 0, then κ(Γ, S₂) > 0, for any pair S₁, S₂ of finite generating sets of Γ.

Example: If Γ is finite, then κ(Γ, Γ) ≥ √2. Let us assume, on the contrary, that v is a vector in H with norm 1 such that ‖ρ(g)v − v‖ < √2 for all g ∈ Γ. Then ρ(g)v is in the open half-space H⁺_v := {w ∈ H : (v, w) > 0} for all g ∈ Γ. If we average over all g ∈ Γ,

v₀ := (1/|Γ|) Σ_{g∈Γ} ρ(g)v ,   (7.10) {e.Kaka}

we obtain an invariant vector which is in the interior of H⁺_v, so it is non-zero.

Exercise 7.20. If a group is Kazhdan (i.e., has Kazhdan's property (T)) and it is also amenable, then it is a finite group. (Hint: from Følner sets (which are almost invariant), we can produce invariant vectors in L²(Γ).)

F₂, the free group on two generators, is not Kazhdan, since there exists a surjection F₂ → Z (the obvious projection) onto an amenable group, and then F₂ acts on L²(Z). It is not obvious how to produce infinite Kazhdan groups. The main example is that SL_d(Z), for d ≥ 3, are Kazhdan. See [BekdHV08].

The first reason for property (T) entering probability theory was the construction of expanders. On one hand, we have the following not very difficult theorem; see [Lub94, Proposition 1.2.1] or [HooLW06, Lemma 1.9]:

Theorem 7.9 ([Pin73],[Pip77]). Asymptotically almost every d-regular graph is an expander.

However, as with some other combinatorial problems, although there are many expanders, it is hard to find an infinite family explicitly.

Exercise 7.21 (Margulis 1973). If Γ is Kazhdan and infinite, with finite factor groups Γ_n, let G_n = G(Γ_n, S) be the Cayley graph of Γ_n with a fixed generating set S of Γ. Then the G_n are expanders.

This was the first explicit construction of an expander family, from an infinite Kazhdan group with infinitely many different finite factors. Since then, much easier constructions have been found. An early example that uses only some simple harmonic analysis is [GaGa81]. More recently, a completely elementary construction was found by Reingold, Vadhan and Wigderson (2002), using the so-called zig-zag product; see [HooLW06, Section 9].

Any infinite k-regular transitive graph G has spectral radius ρ(G) ≥ ρ(T_k) = 2√(k−1)/k, and any finite k-regular graph on n vertices has second largest eigenvalue λ₂(G) ≥ 2√(k−1)/k − o(1) as n → ∞; see [HooLW06, Section 5.2]. A sequence of k-regular graphs G_n is called a sequence of Ramanujan graphs if

lim sup_{n→∞} λ₂(G_n) ≤ 2√(k−1)/k .

So, the Ramanujan graphs are the ones with the largest possible spectral gap.

Again, for any ε > 0, asymptotically almost every k-regular graph has λ₂(G) ≤ 2√(k−1)/k + ε; this is due to Joel Friedman, and is much harder to prove than just being expanders. There are also explicit constructions, first done by Lubotzky, Phillips and Sarnak in 1988. They showed that Cayley graphs of PSL₂(F_p), with appropriate generating sets S, are Ramanujan graphs. See [HooLW06] and [Lub94].

Conjecture 7.10 (Question on k-regular graphs). What is the asymptotic distribution of λ₂ if we choose k-regular graphs uniformly at random? The limiting distribution should be

λ₂ ≈ 2√(k−1)/k + f(n) + g(n) · TW_{β=1} ,

for some unknown normalizing functions f, g, where TW_β stands for the Tracy-Widom distribution for the fluctuations of the largest eigenvalue of the β-ensemble of random matrices; in particular, for β = 1, it is the GOE ensemble, the real symmetric Gaussian random matrices. The motivation behind this conjecture is that the adjacency matrix of a random k-regular graph is a sparse version of a random real symmetric Gaussian matrix. As we will see in Exercise ?? and Section 14.2, the typical eigenvalues of large random k-regular graphs in the k → ∞ limit also look like the typical eigenvalues of large random matrices. See, for instance, [Dei07] on the random matrix ensembles and their universal appearance throughout science.
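The Ramanujan bound λ₂ ≤ 2√(k−1)/k can be checked directly on a small graph; the example below (the 3-regular Petersen graph) is our own choice of illustration, not one of the constructions mentioned above.

```python
import numpy as np

# Check the Ramanujan bound 2*sqrt(k-1)/k for a small 3-regular graph:
# the Petersen graph (outer 5-cycle, inner pentagram, five spokes).
k = 3
edges = ([(i, (i + 1) % 5) for i in range(5)] +          # outer cycle
         [(i, i + 5) for i in range(5)] +                # spokes
         [(5 + i, 5 + (i + 2) % 5) for i in range(5)])   # pentagram
A = np.zeros((10, 10))
for u, v in edges:
    A[u, v] = A[v, u] = 1

eigs = np.sort(np.linalg.eigvalsh(A / k))    # transition-matrix spectrum
lam2 = max(eigs[-2], -eigs[0])               # largest nontrivial |eigenvalue|
print(lam2, 2 * np.sqrt(k - 1) / k)          # -> 2/3 < 0.9428..., so Ramanujan
```

The adjacency spectrum of the Petersen graph is {3, 1 (×5), −2 (×4)}, so the largest nontrivial normalized eigenvalue is 2/3, comfortably below the Ramanujan threshold 2√2/3 ≈ 0.943.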

8 Isoperimetric inequalities and return probabilities in general

{s.return}

The main topic of this section is a big generalization of the deduction of exponential heat kernel decay from non-amenability (or fast mixing from a positive Cheeger constant): general isoperimetric inequalities also give sharp upper bounds on the return probabilities. This was first discovered by Varopoulos [Var85a], and developed further with his students [VarSCC92]. This approach is very much functional analytic, a continuation of the methods encountered in Subsections 7.1 and 7.2, proving and using the so-called Nash inequalities. We will sketch this approach in the next section, then will study in more depth the method of evolving sets, a beautiful probabilistic approach developed by Ben Morris and Yuval Peres [MorP05]. An important special case can be formulated as follows:

{t.varopoulos}

Theorem 8.1. IP_d implies p_n(x, y) ≤ C n^{−d/2}. For transitive graphs, this is an equivalence.

As can be seen from the second statement of the theorem, there are also bounds going in the other direction. For instance, an n^{−d/2} heat kernel decay implies a so-called d-dimensional Faber-Krahn inequality; see [Cou00] for a nice overview, and [CouGP01] for a geometric approach for groups. For general graphs, this is strong enough to deduce an at least d-dimensional volume growth, but not IP_d. However, lower bounds on the volume growth, without any regularity assumption, do not imply upper bounds on return probabilities. We will not study these questions, but before diving into Nash inequalities and evolving sets, let us give two quick exercises:

Exercise 8.1. (a) Using the Carne-Varopoulos bound, Theorem 9.2 below, show that |B_n(x)| = o(n²/log n) volume growth in a bounded degree graph implies recurrence.

(b) Construct a recurrent tree with exponential volume growth.
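The n^{−d/2} decay of Theorem 8.1 can be seen exactly for SRW on Z², where the return probabilities factorize into two one-dimensional walks (after rotating the lattice by 45 degrees); this quick numerical check is just an illustration of the theorem, not part of its proof.

```python
from math import comb, pi

# On Z^2, SRW return probabilities factorize into two 1-d walks:
#   p_{2n}(0,0) = (binom(2n,n) * 2^{-2n})^2  ~  1/(pi*n),
# matching the n^{-d/2} decay of Theorem 8.1 with d = 2.
def p2n(n):
    return (comb(2 * n, n) / 4**n) ** 2

vals = [p2n(n) * pi * n for n in (10, 100, 1000)]
print(vals)   # -> approaches 1 from below as n grows
```

By Stirling's formula, binom(2n,n) 2^{−2n} = (πn)^{−1/2}(1 − 1/(8n) + …), so the normalized values tend to 1.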

8.1 Poincaré and Nash inequalities

{ss.PoincareNash}

Recall the Sobolev inequality from Proposition 7.4: if V satisfies IP_d(κ), then for f ∈ c₀(V),

‖f‖_{d/(d−1)} ≤ C(κ) ‖∇f‖₁ .   (8.1) {e.Sobolev}

To compare this inequality for different d values, take f_n to be roughly constant with a support of π-measure n. Then ‖f_n‖_{d/(d−1)} ≈ ‖f_n‖₁ n^{−1/d}, since for f_n ≈ c on its support we have ‖f_n‖_{d/(d−1)} ≈ c n^{(d−1)/d}. So, one way of thinking of this inequality is that the function f becomes smaller by taking the derivative in any direction, but if the space has better isoperimetry, then from each point we have more directions, hence the integral ‖∇f‖₁ loses less mass compared to ‖f‖₁. So, in (8.1), to make up for the loss caused by taking the derivative, we are taking different norms on the two sides, depending on the isoperimetry of the space. In particular, for the non-amenable case (d = ∞), we already have ‖f‖₁ ≤ C(κ) ‖∇f‖₁. Another natural way of compensation

would be to multiply the right hand side by a factor depending on the size of the support. This is done in the Poincaré inequalities, well-known from PDEs [Eva98, Section 5.8]: if U ⊂ R^d with Lipschitz boundary, then

‖f − f_{RU}‖_{L^p(RU)} ≤ C_U R ‖∇f‖_{L^p(RU)} ,   (8.2) {e.Poincare}

with f_{RU} = ∫_{RU} f(x) dx / vol(RU).

Since the quality of the isoperimetry now does not appear in (8.2), one might hope that this can be generalized to almost arbitrary spaces. Indeed, the following holds:

{ex.SC}

Exercise 8.2 (Saloff-Coste's Poincaré inequality [Kle10]). Show that if f : Γ → R is any function on a group Γ, and B_R is a ball of radius R in a Cayley graph of Γ, then

‖f − f_{B_R}‖_{ℓ²(B_R)} ≤ (vol(B_{2R})/vol(B_R)) R ‖∇f‖_{ℓ²(B_{3R})} .   (8.3) {e.SC}

Hints: we have

|f(y) − f_{B_R}| ≤ (1/vol(B_R)) Σ_{z∈B_R} |f(z) − f(y)| ≤ (1/vol(B_R)) Σ_{g∈B_{2R}} |f(yg) − f(y)| ,   (8.4) {e.Hint a}

and if g = s₁…s_m with m ≤ 2R and generators s_i ∈ S, then

|f(yg) − f(y)| ≤ |f(ys₁…s_m) − f(ys₁…s_{m−1})| + … + |f(ys₁) − f(y)| .   (8.5) {e.Hint b}

Note that for non-amenable groups, we have the Dirichlet inequality ‖f‖₂ ≤ C ‖∇f‖₂, see Theorem 7.3, i.e., the factor R on the right hand side of (8.2) can be spared. So, just like in the case of Sobolev inequalities, no compensation is needed.

The following exercise says that harmonic functions show that the Poincaré inequality (8.3) is essentially sharp. The statement can be regarded as a discrete analogue of the classical theorem that functions satisfying the Mean Value Property are smooth. (Such a function cannot go up and down too much: whenever there is an edge contributing something to ‖∇f‖, the harmonicity carries this contribution to a large distance.)

{ex.revPoincare}

Exercise 8.3 (Reverse Poincaré inequality). Show that there is a constant c = c(Γ, S) > 0 such that for any harmonic function f on the Cayley graph G(Γ, S),

c R ‖∇f‖_{ℓ²(B_R)} ≤ ‖f‖_{ℓ²(B_{2R})} .   (8.6) {e.reverse}

There are several ways in which Poincaré or similar inequalities can be used for proving heat kernel estimates. The first work in this direction was by Varopoulos [Var85a]. He proved sharp heat kernel upper bounds by first proving that an isoperimetric inequality IP_F(κ), with F increasing, implies the following Nash inequality:

‖f‖₂² ≤ g( ‖f‖₁² / ‖f‖₂² ) E_{P,π}(f) ,  ∀ f ∈ c₀(V) ,   (8.7) {e.Nash}

where g(t) = C ( t / F(4t) )². The proof is quite similar to the proofs of Proposition 7.4 and Theorem 7.3. To make the result more readable, consider the case of IP_d(κ), i.e., F(t) = t^{(d−1)/d}. Then we have g(t) = C t^{2/d}, and (8.7) reads as

‖f‖₂ ≤ C ( ‖f‖₁² / ‖f‖₂² )^{1/d} ‖∇f‖₂ ,

some kind of mixture of the Sobolev and Poincaré inequalities (8.1) and (8.3). For instance, when f = 1_{B_R}, we get the usual ‖f‖₂ ≤ C R ‖∇f‖₂. But we will apply (8.7) to some other functions.

The next step is to note that

E_{P,π}(f) ≤ 2 ( ‖f‖₂² − ‖P̃f‖₂² ) ,

where P̃ = (I + P)/2 is the Markov operator for the walk made lazy. Therefore, applying the Nash inequality to f = P̃ⁿ δ_x for each n, we arrive at a difference inequality for the sequence

u(n) := ‖P̃ⁿ δ_x‖₂² = ( P̃^{2n} δ_x, δ_x ) = p̃_{2n}(x,x) C_x ,

namely

u(n) ≤ 2 g( 1/u(n) ) ( u(n) − u(n+1) ) .

This gives an upper bound on the decay rate of u(n), as a solution of a simple differential equation. Then we can use that p_{2n}(x,x) ≤ 2 p̃_{2n}(x,x). See [Woe00, Section 14.A] for more detail.

A different approach, with stronger results, is given in [Del99]. For Markov chains on graphs with nice regular d-dimensional volume growth conditions and Poincaré inequalities, he proves that

c₁ n^{−d/2} exp( −C₁ d(x,y)²/n ) ≤ p_n(x,y) ≤ c₂ n^{−d/2} exp( −C₂ d(x,y)²/n ) .

Let us note here that a very general Gaussian estimate on the off-diagonal heat kernel decay is given by the Carne-Varopoulos Theorem 9.2 below.
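The difference inequality for u(n) can be iterated numerically to see the predicted n^{−d/2} decay emerge; below we solve it as an equality with C = 1 and d = 2 (a sketch of the mechanism only, with our own normalization).

```python
# Iterate the difference inequality coming from the Nash inequality,
#   u(n+1) = u(n) - u(n) / (2*g(1/u(n))),   g(t) = t^(2/d),
# treating the inequality as an equality with C = 1, and watch the
# solution follow the predicted n^{-d/2} decay.
d = 2
u = 1.0
N = 100000
for n in range(N):
    u = u - u ** (1 + 2 / d) / 2

scaled = u * N ** (d / 2)
print(scaled)   # for d = 2, u(n) ~ 2/n, so this is close to 2
```

For d = 2 the recursion is u(n+1) = u(n) − u(n)²/2, whose solution satisfies 1/u(n) ≈ n/2 + O(log n), i.e., u(n) ≈ 2/n, matching p_n(x,x) ≍ n^{−1} on Z².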

8.2 Evolving sets: the results

{ss.evolving}

Recall that a Markov chain is reversible if there is a measure π with π(x)p(x,y) = π(y)p(y,x); such a π is always stationary. This happens iff the transition probabilities can be given by symmetric conductances: c(x,y) = c(y,x) with p(x,y) = c(x,y)/C_x, and then π(x) = C_x is a good choice. Even for the non-reversible case, when the stationary measure π(x) is not C_x, we define Q(x,y) := π(x)p(x,y), and the isoperimetric profile is

φ(r) := inf { Q(S, S^c)/π(S) : π(S) ≤ r } .

For instance, IP_d(κ) implies φ(r) ≽ r^{−1/d}. On a finite chain, we take π(V) = 1, and let φ(r) = φ(1/2) for r > 1/2.

{t.Morris-Peres}

Theorem 8.2 (Morris-Peres [MorP05]). Suppose 0 < γ < 1/2 and p(x,x) ≥ γ for all x ∈ V. If

n ≥ 1 + (1−γ)²/γ² ∫_{4 min(π(x), π(y))}^{4/ε} 4 du / ( u φ(u)² ) ,   (8.8) {e.Morris-Peres}

then p_n(x,y)/π(y) < ε or |p_n(x,y)/π(y) − 1| < ε, depending on whether the chain is infinite or finite.

For us, the most important special cases are the following:

1) By the Coulhon-Saloff-Coste isoperimetric inequality, Theorem 5.7, any group of polynomial growth d satisfies IP_d. Then the integral in (8.8) becomes

C ∫_4^{4/ε} 4 du / u^{1−2/d} ≍ (1/ε)^{2/d} .

That is, the return probability will be less than ε after n ≍ ε^{−2/d} steps, so p_n(x,x) < C n^{−d/2}.

2) IP_∞ turns the integral into C log(1/ε), hence p_n(x,x) < exp(−cn), as we know from the Kesten-Cheeger-Dodziuk-Mohar Theorem 7.3.

3) For groups of exponential growth, the CSC isoperimetric inequality implies φ(r) ≥ c/log r. Then the integral becomes

∫_1^{1/ε} ( log² u / u ) du ≍ log³(1/ε) ,

thus p_n(x,y) ≤ exp(−c n^{1/3}). This is again the best possible, by the following exercise.

Exercise 8.4.* Show that on the lamplighter group Z₂ ≀ Z we have p_n(x,x) ≍ exp(−c n^{1/3}).

As we discussed in the introduction to this section, these bounds were first proved by Varopoulos, using Nash inequalities. The beauty of the Morris-Peres approach is that it is completely probabilistic, defining and using the so-called evolving set process. Moreover, it works also for the non-reversible case, unlike the functional analytic tools.

The integral (8.8) using the isoperimetric profile was first found by Lovász and Kannan [LovKa99], for finite Markov chains, but they deduced mixing only in total variation distance, not uniformly as Morris and Peres. To demystify the formula a bit, note that for a finite Markov chain on V with uniform stationary distribution, using the bound φ(r) ≥ φ(1/2) = h for all r (since r = 1/2 is an infimum over a larger set), where h is the Cheeger constant, the Lovász-Kannan integral bound implies the upper bound

∫_{1/|V_n|}^1 dr / ( r h² ) = log |V_n| / h²

on the mixing time, as was shown earlier in Theorem 7.6. The idea for the improvement via the integral is that, especially in geometric or transitive settings, small subsets often have better isoperimetry than the large ones. (Recall here the end of Section 5.3, and also see the next exercise.)

Example: Take an n-box in Z^d. A sub-box of side-length t has stationary measure r = t^d/n^d and boundary ≍ t^{d−1}/n^d, hence the isoperimetric profile is φ(r) ≍ 1/t = r^{−1/d}/n. (Of course, one would need to show that sub-boxes are at least roughly optimal for the isoperimetric problem. This can be done similarly to the infinite case Z^d, see Section 5.4, though it is not at all obvious from that.) This is clearly decreasing as r grows. Therefore, the Cheeger constant is h ≍ 1/n, achieved at r ≍ 1. Using Theorem 7.5, we get that the spectral gap is at least of order 1/n², which is still sharp, but then Theorem 7.6 gives only a uniform mixing time C_d n² log n. However, using the Lovász-Kannan integral, the mixing time comes to C_d n².

Another standard example for random walk mixing is the hypercube {0,1}^d with the usual edges.

Exercise 8.5. For 0 ≤ m ≤ d, show that the minimal edge-boundary for a subset of 2^m vertices in the hypercube {0,1}^d is achieved by the m-dimensional sub-cubes. (Hint: use induction.)

The isoperimetric profile of {0,1}^d can also be computed, but it is rather irregular and hard to work with in the Lovász-Kannan integral, and it does not yield the optimal bound. But, as we will see, with the evolving sets approach, one needs the isoperimetry only for sets that do arise in the evolving sets process:

{ex.hypermix}

Exercise 8.6.* Prove O(n log n) mixing time for {0,1}ⁿ using evolving sets and the standard one-dimensional Central Limit Theorem.
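For very small dimensions, the claim of Exercise 8.5 can be verified by brute force (this is of course no substitute for the induction proof; d = 4 below is our own choice).

```python
from itertools import combinations

# Brute-force check of Exercise 8.5 in the hypercube {0,1}^4:
# among all subsets of size 2^m, the minimal edge boundary is attained
# by m-dimensional sub-cubes, giving |boundary| = 2^m * (d - m).
d = 4
V = range(2 ** d)

def edge_boundary(S):
    S = set(S)
    return sum(1 for x in S for i in range(d) if x ^ (1 << i) not in S)

mins = {m: min(edge_boundary(S) for S in combinations(V, 2 ** m))
        for m in (1, 2, 3)}
print(mins)   # -> {1: 6, 2: 8, 3: 8}, i.e. 2^m * (d - m)
```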

8.3 Evolving sets: the proof

{ss.evolproof}

We first need to give the necessary definitions for the evolving sets approach, and state (with or without formal proofs) a few basic lemmas. Then, before embarking on the proof of the full Theorem 8.2, we will show how the method works in the simplest case, by giving a quick proof of Kesten's Theorem 7.3, the exponential decay of return probabilities in the non-amenable case, even for non-reversible chains.

Let S ⊆ V. Define

S̃ := { y : Q(S, y) ≥ U π(y) } ,  where U ∼ Unif[0,1].

Remember: Q(S, y) = Σ_{x∈S} π(x) p(x,y), with πP = π (a stationary measure). This is one step of the evolving set process. Thus

P[ y ∈ S̃ | S ] = P[ Q(S,y) ≥ U π(y) ] = Q(S,y)/π(y) ,

which leads to

E[ π(S̃) | S ] = Σ_y π(y) P[ y ∈ S̃ | S ] = Σ_y Q(S,y) = π(S) .

Therefore, for the evolving set process S_{n+1} = S̃_n, the sequence {π(S_n)} is a martingale. Now take S₀ = {x}; then E[ π(S_n) ] = π(x). Moreover:

Lemma 8.3. P[ y ∈ S_n | S₀ = {x} ] π(y)/π(x) = p_n(x,y).

Proof. We use induction on n:

p_n(x,y) = Σ_z p_{n−1}(x,z) p(z,y) = Σ_z P[ z ∈ S_{n−1} | S₀ = {x} ] ( π(z)/π(x) ) p(z,y) .

Note:

Σ_z P[ z ∈ S_{n−1} | S₀ = {x} ] Q(z,y)/π(x) = Σ_z E[ 1_{{z∈S_{n−1}}} | S₀ = {x} ] Q(z,y)/π(x) = E[ Q(S_{n−1}, y) | S₀ = {x} ] / π(x) .

But from before, P[ y ∈ S_n | S_{n−1} ] = Q(S_{n−1}, y)/π(y). Thus,

p_n(x,y) = P[ y ∈ S_n | S₀ = {x} ] π(y)/π(x) .

These two properties, that π(S_n) is a martingale and that P_{{x}}[ y ∈ S_n ] π(y) = p_n(x,y) π(x), are shared by the Markov chain trajectory {X_n}, provided the chain is reversible. However, since P[ y ∈ S̃ | S ] = Q(S,y)/π(y), the size of S̃ will have a conditional variance depending on the size of the boundary of S: the larger the boundary, the larger the conditional variance that the evolving set has. This makes it extremely useful if we want to study how the isoperimetric profile affects the random walk.
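Both the martingale property and the identity of Lemma 8.3 are easy to test by Monte Carlo; the sketch below uses our own toy setup (lazy SRW on the cycle Z₁₀ with counting measure π ≡ 1, so that Q(S,y) = Σ_{x∈S} p(x,y)).

```python
import numpy as np

# Monte Carlo check of two evolving-set identities for lazy SRW on Z_10:
#   E|S_n| = |S_0|  (martingale)  and  P[y in S_n] = p_n(x,y)  (Lemma 8.3,
#   with pi = counting measure, so pi(x) = pi(y) = 1).
rng = np.random.default_rng(0)
N, n, trials = 10, 4, 50000
P = np.zeros((N, N))
for x in range(N):
    P[x, x] = 0.5
    P[x, (x - 1) % N] = P[x, (x + 1) % N] = 0.25

sizes = np.zeros(trials)
hits = np.zeros(N)                         # counts of {y in S_n}
for t in range(trials):
    S = np.zeros(N, dtype=bool); S[0] = True   # S_0 = {0}
    for _ in range(n):
        Q = S @ P                          # Q(S,y) = sum_{x in S} p(x,y)
        S = Q >= rng.uniform()             # one uniform U per step
    sizes[t] = S.sum()
    hits += S

Pn = np.linalg.matrix_power(P, n)
mean_size = sizes.mean()
est = hits[3] / trials
print(mean_size)        # close to 1 (martingale started from |S_0| = 1)
print(est, Pn[0, 3])    # both close to p_4(0,3) = 1/32
```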

Recall that, if f is a concave function, then by Jensen's inequality, E f(X) ≤ f(E X). Moreover, if there is a variance, then the inequality is strict. For example, if f is the square root:

Exercise 8.7. If Var[X] ≥ c (E X)², then E √X ≤ (1 − c′) √(E X), where c′ > 0 depends only on c > 0.

Recall that Q(x,y) = π(x) p(x,y), thus Q(A,B) = Σ_{a∈A, b∈B} π(a) p(a,b).

Proof of Theorem 8.2. We will focus on the case of an infinite state space. Let us first state a lemma that formalizes the above observations: a large boundary for S means a large conditional variance for S̃, and that in turn means a definite decrease in E √π(S̃) compared to √π(S). We omit the proof, because it is slightly technical, and we have already explained the main ideas anyway.

{l.teclem1}

Lemma 8.4. Assume 0 < γ ≤ 1/2 and p(x,x) ≥ γ for all x ∈ V. If

ψ(S) := 1 − E_S √( π(S̃)/π(S) ) ,

then, for each S ⊆ V, we have

ψ(S) ≥ γ² φ(S)² / ( 2(1−γ)² ) ,  where φ(S) := Q(S, S^c)/π(S) .

Notice that, if we define ψ(r) := inf { ψ(S) : π(S) ≤ r }, then the lemma gives

ψ(r) ≥ γ² φ(r)² / ( 2(1−γ)² ) ,

and the proof of Theorem 8.2 reduces to showing that p_n(x,y)/π(y) < ε whenever

n > ∫ 1/( r ψ(r) ) dr ,

with the limits of integration as in (8.8).

As promised, an example of the usefulness of this method is that we can immediately get exponential decay for non-amenable chains: if φ(r) ≥ h > 0 for all r, then Lemma 8.4 implies that ψ(r) ≥ h′ > 0, i.e., E[ √π(S̃) | S ] ≤ (1 − h′) √π(S). So, by iterating from S₀ = {x} to S_n, we obtain

E_{{x}} √π(S_n) ≤ (1 − h′)ⁿ √π(x) ≤ exp(−Cn) .

If we assume π(x) ≡ 1 (the counting measure), for example in the group case, this implies that

p_n(x,y) = P_{{x}}[ y ∈ S_n ] π(y)/π(x) ≤ E_{{x}} √|S_n| ≤ exp(−Cn) ,

as desired, where the middle inequality comes from the ridiculous bound P_{{x}}[ y ∈ S_n ] ≤ P_{{x}}[ |S_n| ≥ 1 ] and Markov's inequality applied to √|S_n|.

However, for the general case we will have to work harder. The replacement for the last ridiculous bound will be (8.12) below. A more serious issue is the non-uniformity of the isoperimetric bound. Recall that usually (e.g., on the lattice Z^d) the boundary-to-volume ratio is smaller for larger sets, hence, if at some point π(S_n) happens to be big (which is not very likely for a martingale at any given time n, since E π(S_n) = π(x), but it still may happen at some random times), then our bound from Lemma 8.4 becomes weak, and we lose control. It would be much better to have a stronger downward push for larger sets; when π(S_n) is small, then we are happy anyway. For this reason, we introduce some kind of dual process.

Denote the evolving set transition kernel by K(S, A) := P_S[ S̃ = A ], the probability that, being in S, the next step is A. Now, consider a non-negative function h on the state space X = 2^V of the evolving set process, which equals 1 on some Δ ⊆ X, equals 0 on another subset ∂ ⊆ X, and is harmonic on X \ (Δ ∪ ∂). Then, we can take Doob's h-transform:

K̂(S, A) := ( h(A)/h(S) ) K(S, A) ,

which is indeed a transition kernel: by the harmonicity of h, for S ∉ Δ ∪ ∂,

Σ_A K̂(S, A) = Σ_A ( h(A)/h(S) ) K(S, A) = h(S)/h(S) = 1 .

The usual reason for introducing the h-transform is that the process given by the kernel K̂ is exactly the original process conditioned to hit Δ before ∂ (in particular, it cannot be started in ∂, and once it reaches Δ it is killed).

Exercise 8.8. Show that SRW on Z, conditioned to hit some n ≥ 1 before 0, becomes a biased random walk with transition probabilities p̂(r, r±1) = (r±1)/(2r) for r = 1, …, n−1, independently of n.
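The claim of Exercise 8.8 can be tested directly by rejection sampling: run unconditioned SRW paths and keep only those that hit n before 0 (the parameters r = 2, n = 5 below are an arbitrary illustration of ours).

```python
import random

# Monte Carlo check of Exercise 8.8: SRW on Z started at r, conditioned
# to hit n before 0, makes its first step to r+1 with probability
# (r+1)/(2r), independently of n.
random.seed(1)

def first_step_up_freq(r, n, accepted=100000):
    up = total = 0
    while total < accepted:
        first = x = r + random.choice((-1, 1))    # record the first step
        while 0 < x < n:
            x += random.choice((-1, 1))
        if x == n:                                # keep walks hitting n before 0
            total += 1
            up += (first == r + 1)
    return up / total

freq = first_step_up_freq(2, 5)
print(freq, 3 / 4)   # both near (r+1)/(2r) = 3/4 for r = 2
```

Indeed, with h(r) = P_r[hit n before 0] = r/n, the h-transform gives p̂(r, r+1) = (1/2) h(r+1)/h(r) = (r+1)/(2r), and the n's cancel.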

For the evolving set process, we will apply the Doob transform with h(S) := π(S). Recall that π(S_n) is a martingale on X \ {∅} exactly because S ↦ π(S) is harmonic for S ≠ ∅. (We have harmonicity at S = V because that is an absorbing state of the original evolving sets chain.) So, for the infinite state space evolving set process, K̂ is the process conditioned on never becoming empty. We do not prove this carefully, since we will not use this fact.

We now define Z_n := 1/√π(S_n). Then

(1/π(x)) E_{{x}} √π(S_n) = Ê_{{x}} Z_n ,   (8.9) {e.ESEZ}

because, as it is easy to check from the definitions, Ê_S f(S_n) = E_S [ ( π(S_n)/π(S) ) f(S_n) ] for any function f. In the hat mean:

Ê[ Z_{n+1}/Z_n | S_n ] = E[ √( π(S_{n+1})/π(S_n) ) | S_n ] ≤ 1 − ψ(π(S_n)) ,   (8.10) {e.e-psi}

with the last inequality provided by the definition of ψ(r) after Lemma 8.4. When Z_n is large then π(S_n) is small, and ψ is good enough to get from (8.10) a significant decrease for Z_n in the new process. This sounds like black magic, but is nevertheless correct. It is a version of the technique of taking a strong stationary dual in [DiaF90].

Let us state another technical lemma (without proof) that formalizes how a control like (8.10), that we now have for Z_n in the K̂-process, actually leads to a fast decay in Ê.

{l.lem-tech2}

Lemma 8.5. Assume f₀ : (0,∞) → [0,1] is increasing and Z_n > 0 is a sequence of random variables with Z₀ fixed. Suppose that Ê[ Z_{n+1} | S_n ] ≤ Z_n (1 − f₀(Z_n)) for all n, and let f(Z) := f₀(Z/2)/2. If

n ≥ ∫_δ^{Z₀} dz / ( z f(z) )

for some δ, then Ê Z_n ≤ δ.

We now have to find a good way to deduce the smallness of return probabilities from the smallness of (8.9).

{d.chi2dist}

Definition 8.1 (Non-symmetric distance between two measures). We define

χ(μ, ν)² := Σ_y ν(y) ( μ(y)/ν(y) )² = Σ_y μ(y)²/ν(y) .

For finite chains, we had the version (7.5), but we will not discuss finite chains here in the proof. When μ = p_n(x, ·) and ν ≡ 1, then

χ( p_n(x, ·), 1 ) = ( Σ_y p_n(x,y)² )^{1/2} .

If p_n(x,y) = ε for y ∈ A_n with |A_n| = 1/ε, then χ = √ε, which is a reasonable measure of similarity. More generally,

p_{n₁+n₂}(x,z) / π(z) = Σ_y π(y) ( p_{n₁}(x,y)/π(y) ) ( p̂_{n₂}(z,y)/π(y) ) ≤ χ( p_{n₁}(x,·), π ) χ( p̂_{n₂}(z,·), π ) ,   (8.11) {e.p-chi}

using Cauchy-Schwarz, where p̂_{n₂}(x,y) := ( π(y)/π(x) ) p_{n₂}(y,x) stands for the stationary reversal.

Also, we can compare p_n with π using evolving sets:

χ²( p_n(x,·), π ) = Σ_y π(y) ( P_{{x}}[ y ∈ S_n ] / π(x) )² = (1/π(x)²) Σ_y π(y) P_{{x}}[ y ∈ S_n, y ∈ T_n ] = (1/π(x)²) E_{{x}}[ π(S_n ∩ T_n) ] ,

where {S_n} and {T_n} are two independent evolving set processes, both started from {x}. Then, applying Cauchy-Schwarz,

(1/π(x)²) E_{{x}} π(S_n ∩ T_n) ≤ (1/π(x)²) E_{{x}} [ √π(S_n) √π(T_n) ] = ( (1/π(x)) E_{{x}} √π(S_n) )² .

Thus

χ( p_n(x,·), π ) ≤ (1/π(x)) E_{{x}} √π(S_n) ,   (8.12) {e.p-esp}

which is better than what we had before. Recall that (1/π(x)) E_{{x}} √π(S_n) = Ê_{{x}}[Z_n]. Thus, using Lemma 8.5 with f₀(Z_n) = ψ(π(S_n)), we obtain that Ê_{{x}}[Z_n] is bounded by δ. By setting δ = √ε and using the inequalities (8.11) and (8.12), we obtain p_{n+n'}(x,z)/π(z) ≤ ε. We can retrieve the necessary number of steps n from the number of steps in Lemma 8.5, modified by the appropriate change of variables. Using the relation between ψ and φ from Lemma 8.4, we finish the proof of Theorem 8.2 with the appropriate number of steps n.

9 Speed, entropy, Liouville property, Poisson boundary

{s.speedetc}

9.1 Speed of random walks

{ss.Speed}

We have seen many results about the return probabilities p_n(x,x), i.e., about the on-diagonal heat kernel decay. We now say a few words about off-diagonal behaviour.

{l.offdiag}

Lemma 9.1. Given a reversible Markov chain with a stationary measure π(x), then:

a) p_n(x,y) ≤ √( π(y)/π(x) ) ρⁿ .

b) sup_{x,y} p_{2n}(x,y)/π(y) ≤ sup_x p_{2n}(x,x)/π(x) .

Proof. We will start by proving part a):

π(x) p_n(x,y) = ( δ_x, Pⁿ δ_y ) ≤ ‖δ_x‖ ‖Pⁿ‖ ‖δ_y‖ ≤ ρⁿ √π(x) √π(y) .

Dividing by π(x),

p_n(x,y) ≤ ρⁿ √( π(y)/π(x) ) .

For part b), we use the fact that P is self-adjoint:

π(x) p_{2n}(x,y) = ( δ_x, P^{2n} δ_y ) = ( Pⁿ δ_x, Pⁿ δ_y ) ≤ ( Pⁿ δ_x, Pⁿ δ_x )^{1/2} ( Pⁿ δ_y, Pⁿ δ_y )^{1/2} = ( ( δ_x, P^{2n} δ_x ) ( δ_y, P^{2n} δ_y ) )^{1/2} = ( π(x) p_{2n}(x,x) π(y) p_{2n}(y,y) )^{1/2} .

The inequality here, as usual, is from Cauchy-Schwarz. Now divide both sides by π(x) π(y) and take the supremum over all x and y:

sup_{x,y} p_{2n}(x,y)/π(y) ≤ sup_{x,y} ( p_{2n}(x,x)/π(x) )^{1/2} ( p_{2n}(y,y)/π(y) )^{1/2} = sup_x p_{2n}(x,x)/π(x) .
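Part b) is an exact inequality that holds for every reversible chain, so it can be checked on a randomly generated example (the random-conductance chain on six states below is our own toy setup).

```python
import numpy as np

# Check Lemma 9.1 (b) on a random reversible chain: put symmetric positive
# conductances c(x,y) on the complete graph K_6 and set p(x,y) = c(x,y)/C_x,
# so that pi(x) proportional to C_x is reversible.
rng = np.random.default_rng(3)
c = rng.uniform(0.1, 1.0, (6, 6))
c = (c + c.T) / 2
np.fill_diagonal(c, 0.0)
C = c.sum(axis=1)              # C_x = sum_y c(x,y)
P = c / C[:, None]             # p(x,y) = c(x,y)/C_x
pi = C / C.sum()               # reversible stationary distribution

ok = True
for n in (1, 2, 5):
    P2n = np.linalg.matrix_power(P, 2 * n)
    lhs = (P2n / pi[None, :]).max()        # sup_{x,y} p_2n(x,y)/pi(y)
    rhs = (np.diag(P2n) / pi).max()        # sup_x p_2n(x,x)/pi(x)
    ok = ok and lhs <= rhs + 1e-12
print(ok)   # -> True
```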
{t.carne}

Theorem 9.2 (Carne-Varopoulos [Car85, Var85b]). Given a reversible Markov chain,

p_n(x,y) ≤ 2 √( π(y)/π(x) ) ρⁿ exp( −dist(x,y)² / (2n) ) ,

where the distance is measured in the graph metric given by the graph of the Markov chain, i.e., there is an edge between x and y if and only if p(x,y) > 0.

This form is due to Carne, with a miraculous proof using Chebyshev polynomials; the Reader is strongly encouraged to read it in [LyPer10]. Varopoulos' result was a bit weaker, with a more complicated proof. There is now also a probabilistic proof, see [Pey08].

{p.speed}

Proposition 9.3. Consider a graph G with degrees bounded by D, and a reversible Markov chain on its vertex set, with not necessarily nearest-neighbour transitions. Assume that the reversible measure π(x) is bounded by B, that π(o) > 0, and that ρ < 1. Then, in the graph metric,

lim inf_{n→∞} dist(o, X_n)/n > 0 a.s.

Proof. Choose ε > 0 small enough that D^ε ρ < 1. Then,

P_o[ dist(o, X_n) ≤ εn ] = Σ_{x∈B_{εn}(o)} p_n(o,x) ≤ |B_{εn}(o)| C ρⁿ ≤ D^{εn} C ρⁿ ≤ C exp(−c n) .

Such a finite constant C exists because of part a) of Lemma 9.1 and the fact that π(x) is bounded. The second inequality holds because of the choice of ε. Now we know that

Σ_n P_o[ dist(o, X_n) ≤ εn ] < ∞ .

Thus, by Borel-Cantelli, dist(o, X_n) ≤ εn only finitely many times. So:

lim inf_{n→∞} dist(o, X_n)/n ≥ ε .
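The Carne-Varopoulos bound (Theorem 9.2) is easy to test in the simplest setting, SRW on Z, where ρ = 1 and π is constant; the choice n = 100 below is just an illustration.

```python
from math import comb, exp

# Carne-Varopoulos for SRW on Z (rho = 1, pi uniform):
#   p_n(0,k) = binom(n, (n+k)/2) * 2^{-n}  <=  2 * exp(-k^2 / (2n)),
# whenever n + k is even (otherwise p_n(0,k) = 0).
n = 100
checks = []
for k in range(0, n + 1, 2):              # parity: n + k must be even
    p = comb(n, (n + k) // 2) / 2**n
    checks.append(p <= 2 * exp(-k * k / (2 * n)))
print(all(checks))   # -> True
```

For SRW on Z this is essentially the Hoeffding/Azuma tail bound for a sum of ±1 steps.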
{l.speedexists}

Lemma 9.4. For a random walk on a group (with not necessarily nearest neighbour jumps in the Cayley graph that gives the metric), lim_{n→∞} E dist(o, X_n)/n exists.

Proof. First we define the sequence (a_n)_{n=1}^∞ by a_n := E dist(o, X_n). We can see that this sequence is subadditive:

E dist(o, X_{n+m}) ≤ E dist(o, X_n) + E dist(X_n, X_{n+m}) ≤ E dist(o, X_n) + E dist(o, X_m) .

So we may apply Fekete's lemma, which gives us that lim_n a_n/n exists.

We have seen that non-amenability implies positive speed. The converse is false, as the d ≥ 3 case of the following theorem shows.

{t.LLspeed}

Theorem 9.5. For the lamplighter groups Z₂ ≀ Z^d, the following bounds hold for d_n := dist(o, X_n):

1. d = 1 has E[d_n] ≍ √n;

2. d = 2 has E[d_n] ≍ n/log n;

3. d ≥ 3 has E[d_n] ≍ n.

In each case, we will prove here the statement only for some specific generating set.

{q.speedinv}

Question 9.6. Is it true that positive speed never depends on the finite symmetric generating set? The case of Z₂ ≀ Z^d is known [Ers04a].

Proof. In each case, we will consider the walk given by staying put with probability 1/4, switching the lamp at the present location with probability 1/4, and moving to one of the 2d neighbours with probability 1/(4d) each. Then, if X_n = (M_n, φ_n), where M_n is the position of the marker (the lamplighter) and φ_n is the configuration of the lamps, then it is easy to give some crude bounds on d_n: it is at least |supp φ_n|, the number of lamps on, plus the distance from the origin of the farthest lamp on, and it is at most the number of steps in any procedure that visits all the lamps that are on, switches them off, then goes back to the origin.

Figure 9.1: The generators for the walk on Z₂ ≀ Z^d. {f.LLd}

Moreover, the expected size of supp φ_n can be estimated based on the expected size of the range R_n of the movement of the lamplighter, using the following exercise:

{ex.lastvisit}

Exercise 9.1. Show that P[ the lamp at x is on after its last visit up to time n | x ∈ R_n ] ≥ 1/4.

Therefore, we have

E|R_n| / 4 ≤ E|supp φ_n| ≤ E|R_n| ,   (9.1) {e.litrange}

so estimating E|R_n| will certainly be important. Now, in the d = 1 case, we have

|min supp φ_n| ∨ |max supp φ_n| ≤ d_n ≤ |M_n| + 3|min supp φ_n| + 3|max supp φ_n| ,   (9.2) {e.LLd1}

using that the number of lamps on is at most |min supp φ_n| + |max supp φ_n|. By the Central Limit Theorem, P[ |M_n| > √n ] > δ for some absolute constant δ > 0. Thus, by Exercise 9.1, we have P[ supp φ_n ⊄ [−√n, √n] ] > δ/4. Now, by the left side of (9.2), we have

E[ d_n ] ≥ E[ |min supp φ_n| ∨ |max supp φ_n| ] ≥ P[ supp φ_n ⊄ [−√n, √n] ] √n ≥ δ √n / 4 ,

which is a suitable lower bound.

For an upper bound, let M*_n := max_{k≤n} |M_k|. Clearly, |min supp φ_n| ∨ |max supp φ_n| ≤ M*_n. From the CLT, it looks likely that not only M_n, but even M*_n cannot be much larger than √n, but if we do not want to lose log n-like factors, we have to be slightly clever. In fact, we claim that

P[ M*_n ≥ t ] ≤ 4 P[ M_n ≥ t ] .   (9.3) {e.maxreflection}

This follows from a version of the reflection principle. If T_t is the first time for M_n to hit t ∈ Z⁺, then, by the strong Markov property and the symmetry of the walk restarted from t,

P[ T_t ≤ n, M_n ≥ t ] ≥ (1/2) P[ T_t ≤ n ] ;

we don't have exact equality because of the possibility of M_n = t. Since {M_n ≥ t} ⊆ {T_t ≤ n}, this inequality can be rewritten as

2 P[ M_n ≥ t ] ≥ P[ T_t ≤ n ] = P[ max_{k≤n} M_k ≥ t ] .

Since P[ M*_n ≥ t ] ≤ 2 P[ max_{k≤n} M_k ≥ t ], we have proved (9.3).

Summing up (9.3) over t, we get E[ M*_n ] ≤ 2 E[ |M_n| ] ≤ C √n for some large C < ∞, from the CLT. Using (9.2), we see that E[ d_n ] ≤ C′ √n, and we are done.

Now consider the case d = 2. For the lamplighter, we have p_k(0,0) ≍ 1/k. (This comes from the general fact that p_k(0,0) ≍ k^{−d/2} on Z^d.) Thus, Σ_{k=1}^n p_k(0,0) ≍ log n. But Σ_{k=1}^n p_k(0,0) is exactly the expected number of visits to 0 by time n. This suggests that once a point is visited, it is typically visited roughly log n times in expectation, and thus the walk visited roughly n/log n different points. In fact, it is not hard to prove that

E|R_n| ≍ n / log n .   (9.4) {e.2dRn}

Exercise 9.2. Prove the above statement.

The precise asymptotics in the above statement is also known, a classical theorem by Erdős and Dvoretzky. Since, to get from X_n back to the origin, we have to switch off all the lamps, by (9.1) we have E|R_n|/4 ≤ E d_n. On the other hand, R_n is a connected subset of Z² containing the origin, so we can take a spanning tree in it, and wherever M_n ∈ R_n is, we can go around the tree to switch off all the lamps and return to the origin. This can be done in fewer than 3|R_n| steps in Z², while the switches take at most |R_n| steps, hence d_n ≤ 4|R_n|. Altogether, E d_n ≍ E|R_n|, so the d = 2 case follows from (9.4).

For the case d ≥ 3, consider the following exercise:

Exercise 9.3. For simple random walk on a transitive graph, we have

lim_{n→∞} E|R_n| / n = q := P_o[ never return to o ] .

From this exercise and (9.1), we can deduce that

E[d_n] ≥ E|supp φ_n| ≥ E|R_n| / 4 ∼ q n / 4 ,

thus proving the case d ≥ 3.

The distinction between positive and zero speed will be a central topic of the present section. And, of course, there are also finer questions about the rate of escape than just being linear or sublinear, which will be addressed in later sections.

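These displacement bounds are easy to probe in a small simulation. The sketch below is our own illustration, not part of the notes; it uses the standard "switch or move" generating set for Z2 ≀ Z (toggle the lamp with probability 1/2, otherwise move the marker ±1), so the constants may differ slightly from the walk in Figure 9.1. It checks the containment supp Φn ⊆ Rn behind (9.1) and the diffusive √n growth of the lower bound in (9.2):

```python
import random

def lamplighter_lower_bound(n, trials, seed=0):
    """'Switch or move' walk on Z2 wr Z: in each step, with prob. 1/2 toggle
    the lamp at the marker, otherwise move the marker by +-1.  Returns the
    average of |min supp| + |max supp|, the lower bound for d_n in (9.2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pos, lamps, visited = 0, set(), {0}
        for _ in range(n):
            u = rng.random()
            if u < 0.5:
                lamps ^= {pos}       # toggle the lamp at the marker
            elif u < 0.75:
                pos += 1
            else:
                pos -= 1
            visited.add(pos)
        assert lamps <= visited      # supp Phi_n is contained in the range R_n
        if lamps:
            total += abs(min(lamps)) + abs(max(lamps))
    return total / trials

m100 = lamplighter_lower_bound(100, 200)
m400 = lamplighter_lower_bound(400, 200)
```

Quadrupling n should roughly double the averaged lower bound, in accordance with the √n rate of escape in d = 1.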
9.2 The Liouville property for harmonic functions

As we saw in the lecture on evolving sets, harmonic functions with respect to a Markov chain are intimately related to martingales.

Definition 9.1. A submartingale is a sequence (Mn)_{n≥1} such that the following two conditions hold:

    ∀n : E|Mn| < ∞ ,
    ∀n : E[ Mn+1 | M1, . . . , Mn ] ≥ Mn .

A supermartingale is a similar sequence, except that the second condition is replaced by

    ∀n : E[ Mn+1 | M1, . . . , Mn ] ≤ Mn .

Theorem 9.7 (Martingale Convergence Theorem). Given a submartingale (Mn)_{n≥1} and some constant B with Mn ≤ B almost surely, there exists a random variable M∞ with Mn → M∞ a.s.

For a proof, with a condition relaxing boundedness, see [Dur96, Theorem 4.2.10].

Corollary 9.8. A recurrent Markov chain has no nonconstant bounded harmonic functions.

Proof. If f is a bounded harmonic function, then Mn = f(Xn) is a bounded martingale. If f, in addition, is nonconstant, then there are states x and y such that f(x) ≠ f(y). The walk is recurrent, so it returns to these states infinitely often. So ∀N ∃m, n > N such that Mm = f(x) and Mn = f(y). So Mn cannot converge to any value, which contradicts Theorem 9.7. Hence no such f can exist.
Theorem 9.9. For any Markov chain, if for all x, y ∈ V there is a coupling (Xn, Yn) of random walks starting from (x, y) such that P[ Xn ≠ Yn ] → 0, then it has the Liouville property, i.e., every bounded harmonic function is constant.

Proof. Suppose f is a bounded harmonic function, say |f| < B. Then f(Xn) and f(Yn) are both martingales, with E[ f(Xn) ] = f(x) and E[ f(Yn) ] = f(y). Thus, for all n,

    |f(x) − f(y)| = |E f(Xn) − E f(Yn)| ≤ E|f(Xn) − f(Yn)| ≤ P[ Xn ≠ Yn ] · 2B .

Therefore, |f(x) − f(y)| ≤ lim_{n→∞} P[ Xn ≠ Yn ] · 2B = 0, so f must be constant.

Theorem 9.10 (Blackwell 1955). Zd has the Liouville property.

Proof. Clearly, f is harmonic with respect to P if and only if it is harmonic with respect to (I + P)/2, the lazy version of P. So, consider the lazy random walk on Zd given by flipping a d-sided coin to determine which coordinate to move in, and then using the lazy simple random walk on Z in that coordinate. This is the product chain of lazy walks coordinatewise.

Now consider the following coupling: Xn and Yn always move in the same coordinate as one another. If their distance in that coordinate is zero, then they move (or remain still) together in that coordinate. If not, then when X moves, Y stays still, and when X stays still, Y moves. Each of X and Y, when considered independently, is still using the lazy random walk described.

Now, considering one coordinate at a time: if Xn and Yn have not yet come together in that coordinate, then whenever that coordinate is chosen, the distance goes up by one with probability 1/2 and down by one with probability 1/2. This is equivalent to a simple random walk on Z, which is recurrent, so with probability 1, Xn and Yn will eventually have distance zero in that coordinate, and then will remain together for the rest of the coupling. This happens in each coordinate, so with probability 1, ∃N such that Xn = Yn for all n ≥ N. So, by Theorem 9.9, this chain has the Liouville property.

Exercise 9.4.* Show that any harmonic function f on Zd with sublinear growth, i.e., satisfying lim_{‖x‖→∞} f(x)/‖x‖ = 0, must be constant.

Exercise 9.5. Any random walk on Zd with symmetric bounded jumps has the Liouville property.

In fact, the classical Choquet-Deny theorem (1960) says that any measure on any Abelian group has the Liouville property. See Theorem 9.23 below for a proof.

We now give an example that does not have the Liouville property: the d-regular tree Td with d ≥ 3.

Figure 9.2: Portion of a d-regular tree (d = 3)

Consider Figure 9.2, and let f(y) = Py[ {xn} ends up in A ]. By transience, and by any vertex being a cutpoint of the tree, lim_{n→∞} 1_{xn∈A} exists a.s., so this definition makes sense. It is obviously a bounded harmonic function. Let p = Px[ never hit o ] ∈ (0, 1); then f(x) = p + (1 − p) f(o) > f(o). This shows that f is nonconstant.

Exercise 9.6. Show that the lamplighter group Z2 ≀ Zd with d ≤ 2 has the Liouville property.

Exercise 9.7. Show that the lamplighter group Z2 ≀ Zd with d ≥ 3 does not have the Liouville property.

A hint for these two exercises is that positive speed and the existence of non-trivial bounded harmonic functions will turn out to be intimately related issues, so you may look at the proof of Theorem 9.5.

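The coalescing coupling in the proof of Theorem 9.10 can be sketched in a few lines. The following is our own illustration (function names are ours), run on Z2 with two distinct starting points; it simply counts how often the two walks have merged after many steps:

```python
import random

def coupled_lazy_walks(x, y, steps, rng):
    """Coupling from the proof of Theorem 9.10 on Z^d: each step a uniformly
    random coordinate is updated; where the walks already agree they move
    together (lazily), where they differ exactly one of them moves, so their
    difference performs a recurrent SRW on Z and eventually hits 0."""
    x, y = list(x), list(y)
    for _ in range(steps):
        i = rng.randrange(len(x))
        if x[i] == y[i]:
            step = rng.choice([-1, 0, 0, 1])   # lazy step, taken together
            x[i] += step
            y[i] += step
        elif rng.random() < 0.5:               # X moves, Y stays still
            x[i] += rng.choice([-1, 1])
        else:                                  # Y moves, X stays still
            y[i] += rng.choice([-1, 1])
    return x, y

rng = random.Random(1)
trials = 200
coalesced = 0
for _ in range(trials):
    xf, yf = coupled_lazy_walks((0, 0), (4, 3), 20000, rng)
    coalesced += (xf == yf)
frac = coalesced / trials
```

Note that each marginal is still the lazy walk (stay with probability 1/2, move ±1 with probability 1/4 each); the hitting time of 0 for SRW on Z is heavy-tailed, so a small fraction of pairs may not have merged yet after finitely many steps.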
9.3 Entropy, and the main equivalence theorem

As we have seen in the previous two sections, on the lamplighter groups Z2 ≀ Zd there is a strong relationship between positive speed and the existence of non-trivial bounded harmonic functions. The main result of the current section is that this equivalence holds on all groups; see Theorem 9.12 below. However, a transparent probabilistic reason currently exists only on the lamplighter groups. For the general case, the proof goes through more ergodic/information theoretic notions, like entropy and tail σ-fields.

We start by discussing entropy. Let μ be a finitely supported probability measure on a group Γ such that supp μ generates Γ. Then we get a nearest neighbour random walk on the directed right Cayley graph of Γ given by the generating set supp μ:

    P[ Xn+1 = h | Xn = g ] = μ(g⁻¹h) ,  where h, g ∈ Γ.

Then the law of Xn is the n-fold convolution μn = μ*ⁿ, where (μ ∗ ν)(g) := Σ_h μ(h) ν(h⁻¹g). This walk is reversible iff μ is symmetric, i.e., μ(g⁻¹) = μ(g) for all g ∈ Γ. All our previous random walks on groups were examples of such walks.

Definition 9.2. The Shannon entropy H(μ) of a probability measure μ is defined by

    H(μ) := −Σ_x μ(x) log μ(x) .

Similarly, for a random variable X with values in a countable set, H(X) is the entropy of the measure μ(x) = P[ X = x ]. Note that H(μ) ≤ log |supp μ| by Jensen's inequality, with equality for the uniform measure.

A rough interpretation of H(μ) is the number of bits needed to encode the amount of randomness in μ, or in other words, the amount of information contained in a random sample from μ. (The interpretation with bits works the best, of course, if we are using log2.) A quote from Claude Shannon (from Wikipedia):

My greatest concern was what to call it. I thought of calling it "information", but the word was overly used, so I decided to call it "uncertainty". When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, "You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage."

Definition 9.3. The asymptotic entropy h(μ) of the random walk generated by μ is defined by

    h(μ) := lim_{n→∞} H(Xn)/n = lim_{n→∞} H(μn)/n .

Exercise 9.8. Show that h(μ) exists for any μ on a group Γ.

As a corollary to the fact H(μn) ≤ log |supp μn|, if Γ has sub-exponential volume growth (in the directed Cayley graph given by the not necessarily symmetric supp μ), then h(μ) = 0.

Here is a fundamental theorem that helps comprehend what asymptotic entropy means:

Theorem 9.11 (Shannon-McMillan-Breiman, version by Kaimanovich-Vershik). For almost every random walk trajectory {Xn},

    lim_{n→∞} −log μn(Xn) / n = h(μ) .

To see why this theorem should be expected to hold, note that the expectation of −log μn(Xn) is exactly H(μn), hence the sequence converges in expectation to h(μ) by definition. So, this theorem is an analogue of the Law of Large Numbers. For a proof, see [KaiV83].

Exercise 9.9. Show that h(μ) = 0 iff for every ε > 0 there exists a sequence {An} with μn(An) > 1 − ε and log |An| = o(n). In words, zero entropy means that the random walk is mostly confined to a sub-exponentially growing set.

We now state a central theorem of the theory of random walks on groups. We will define the invariant σ-field (also called the Poisson boundary) only in the next section, so let us give here an intuitive meaning: it is the set of different places where infinite random walk trajectories can escape (with the natural measure induced by the random walk, called the harmonic measure at infinity). A great introduction to the Poisson boundary is the seminal paper [KaiV83].

Theorem 9.12. For any symmetric finitely supported random walk on a group, the following are equivalent:

(S) positive speed ℓ(μ) > 0;
(E) positive asymptotic entropy h(μ) > 0;
(H) existence of non-constant bounded harmonic functions (the non-Liouville property);
(I) non-triviality of the invariant σ-field (Poisson boundary).

Here are some one-sentence summaries of the proofs:

• The meaning of (S) ⟺ (E) is that positive speed for a reversible walk is equivalent to the walk being very much spread out. This will be proved in this section. Note that the reversibility of the walk (in other words, the symmetry of μ) is important: consider, e.g., biased random walk on Z, which has positive speed but zero entropy.

• The equivalence (I) ⟺ (H) holds for any Markov chain, and will be proved in Section 9.4: a bounded harmonic function evaluated along a random walk trajectory started at some vertex x is a bounded martingale, which has an almost sure limit whose expectation is the value at x. Therefore, from a bounded harmonic function we can construct a function at infinity, and vice versa.

• Finally, (E) ⟺ (I) will be proved in Section 9.5, by expressing the asymptotic entropy as the amount of information gained about the first step by knowing the limit of the random walk trajectory, see (9.10). Hence positive entropy means that non-trivial information is contained in the boundary.

This equivalence holds also for non-symmetric measures μ, but it is important for the walk to be group-invariant, as will be shown by an example of a non-amenable bounded degree graph that has positive speed and entropy, but has the Liouville property and trivial Poisson boundary.

Beyond the case of lamplighter groups discussed in the previous sections, the following is open:

Exercise 9.10.*** Find a direct probabilistic proof of (S) ⟺ (H) for symmetric bounded walks on any group. Is there a quantitative relationship between the speed and the amount of bounded harmonic functions, e.g., their von Neumann dimension, defined later? (It is clear from the discussion of the different equivalences above that both the symmetry and the group-invariance of the walk are needed.)

For symmetric finitely supported measures with support generating a non-amenable group, from Proposition 9.3 we know that the walk has positive speed and hence non-trivial Poisson boundary. The importance of the lamplighter group examples was pointed out first by [KaiV83]: Z2 ≀ Zd shows that exponential volume growth is not enough (d ≤ 2) and non-amenability is not needed (d ≥ 3) for positive speed. Nevertheless, it is worth comparing non-amenability and non-trivial Poisson boundary: the first means that μn(1) decays exponentially (Theorem 7.3), while the second means that μn(Xn) decays exponentially (by Theorem 9.11). We will see a characterization of amenability using the Poisson boundary in Theorem 9.22 below.

Now, we prove (S) ⟺ (E) by giving the following quantitative version:

Theorem 9.13 (Varopoulos [Var85b] and Vershik [Ver00]). For any symmetric finitely supported measure μ on a group,

    ℓ(μ)²/2 ≤ h(μ) ≤ ℓ(μ) v(μ) ,

where ℓ(μ) is the speed of the walk and v(μ) := lim_{n→∞} (log |Bn|)/n is the exponential rate of volume growth of the graph, both in the generating set given by supp μ. (The latter limit exists because |Bn+m| ≤ |Bn| |Bm|.)

Exercise 9.11. Prove the theorem: for the lower bound, use Carne-Varopoulos (Theorem 9.2) and Shannon-McMillan-Breiman (Theorem 9.11); for the upper bound, called the "fundamental inequality" by Vershik, use that entropy on a finite set is maximized by the uniform measure.

Exercise 9.12 (Vershik).*** Does there exist for any finitely generated group a finitely supported μ with h(μ) = ℓ(μ) v(μ)? The trouble with a strict inequality is that then the random walk measure is far from uniform on the sphere where it is typically located, i.e., sampling from the group using the random walk is not a good idea.

As we mentioned in Question 9.6, it is not known if the positivity of speed on a group is independent of the Cayley graph, or, more generally, invariant under quasi-isometries. On general graphs, the speed does not necessarily exist, but the Liouville property can clearly be defined. However, for the class of bounded degree graphs it is known that the Liouville property is not quasi-isometry invariant [LyT87, Ben91]. A simple example is given in [BenS96a], which we now describe.

The set Y = {0, 1}* of finite 0-1-words can naturally be viewed as an infinite binary tree. Let A be the subset of words in which the ratio of 1's among the letters is more than 2/3. Now consider the lattice Z4, take a bijection between the vertices along the x-axis of Z4 and the vertices in A, and put an edge between each pair of vertices given by this bijection. This graph will be G. It easily follows from the Law of Large Numbers that SRW on Y spends only finite time in A, and hence SRW on G, started from any y ∈ Y, will ever visit the Z4 part only with a probability bounded away from 1. Whenever the walk visits Z4, by the transience of the Z3 direction orthogonal to the x-axis, with positive probability it will eventually stay in Z4. So, the function

    v ↦ Pv[ the walk ends up in Z4 ]

for v ∈ G is a non-trivial bounded harmonic function.

Now, let Y′ be the tree where each edge (w, w0) of Y is replaced by a path of length k, for some large but fixed k ∈ N, and G′ is the same join of Y′ and Z4 as before. If k is large enough, then SRW on Y′ will visit A infinitely often, and hence it will enter Z4 infinitely often, and hence will almost surely end up in Z4. By the coupling results Theorem 9.9 and Theorem 9.10 for the triviality of bounded harmonic functions, this means that G′ has the Liouville property. However, G and G′ are obviously quasi-isometric to each other.

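To get a feeling for the quantities in Theorem 9.13, one can crudely estimate H(Xn)/n by the empirical (plug-in) entropy of many sampled walks. The sketch below is our own, not from the notes; it compares SRW on Z, where the entropy rate tends to 0, with SRW on the free group F2 (the 4-regular tree), where it is positive. The plug-in estimate is capped at log(number of samples), so for the tree it only gives a lower-bound flavour:

```python
import math
import random
from collections import Counter

def empirical_entropy(samples):
    """Plug-in estimate (in nats) of the entropy of the law of the samples."""
    cnt = Counter(samples)
    N = len(samples)
    return -sum((c / N) * math.log(c / N) for c in cnt.values())

def srw_Z(n, rng):
    """Position of SRW on Z after n steps."""
    return sum(rng.choice([-1, 1]) for _ in range(n))

def srw_F2(n, rng):
    """SRW on F2 = <a, b>: keep the freely reduced word as the position."""
    word, inv = [], {'a': 'A', 'A': 'a', 'b': 'B', 'B': 'b'}
    for _ in range(n):
        g = rng.choice(['a', 'A', 'b', 'B'])
        if word and word[-1] == inv[g]:
            word.pop()               # free reduction: cancel with inverse
        else:
            word.append(g)
    return tuple(word)

rng = random.Random(2)
n, N = 30, 4000
rate_Z  = empirical_entropy([srw_Z(n, rng) for _ in range(N)]) / n
rate_F2 = empirical_entropy([srw_F2(n, rng) for _ in range(N)]) / n
```

On Z the walk is confined to O(√n) values, so the rate is close to 0 already at n = 30, while on the tree the samples are essentially all distinct, reflecting positive h(μ).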
9.4 Liouville and Poisson

Having proved (S) ⟺ (E), we now turn to harmonic functions and the invariant σ-field. Let us start with a basic result of probability theory:

Theorem 9.14 (Lévy's Zero-One Law). Given a filtration Fn ↑ F∞ and X almost surely bounded, we have

    E[ X | Fn ] → E[ X | F∞ ]  a.s.

This is called a zero-one law since, for the case X = 1A with A ∈ F∞, it says P[ A | Fn ] → 1A a.s., hence the conditional probabilities P[ A | Fn ] converge to 0 or 1.

Proof. This is simply a special case of the Martingale Convergence Theorem 9.7. To see this, recall from Section 6.3 that Mn := E[ X | Fn ] is a bounded martingale, and then apply the theorem.

Although this result might seem obvious, it is quite powerful: for example, it implies the next theorem.

Theorem 9.15 (Kolmogorov's 0-1 law). If X1, X2, . . . are independent random variables and A is a tail event (i.e., its occurrence or failure is determined by the values of these random variables, but it is probabilistically independent of each finite subset of them), then P[ A ] = 0 or 1.

Proof. For each n, A is independent of Fn, so E[ 1A | Fn ] = P[ A ]. As n → ∞, the left-hand side converges to 1A almost surely, by Lévy's theorem. So P[ A ] = 1A almost surely, and it follows that P[ A ] ∈ {0, 1}.

Let P be a transition matrix for a Markov chain on a countable state space S. Let Ω be the measure space

    Ω = { {yj}_{j≥0} : P(yj, yj+1) > 0 ∀j } ,

with the usual Borel σ-field (the minimal σ-field that contains all the cylinders { {yj}_{j≥0} : y0 = x0, . . . , yk = xk }, for all k and x0, . . . , xk ∈ S), and the natural measure generated by P. Also, for any x ∈ S, define the measure space

    Ω(x) = { {yj}_{j≥0} : y0 = x, P(yj, yj+1) > 0 ∀j } ,

with the Borel σ-field. Now the measure Px generated by P on Ω(x) is a probability measure: the law of the Markov chain trajectories started from x.

We define two equivalence relations on Ω. For y, z ∈ Ω, let

    y ∼T z  ⟺  ∃n ∀m ≥ n : ym = zm ,
    y ∼I z  ⟺  ∃k ∃n ∀m ≥ n : ym = zm+k .

We identically define the equivalences for Ω(x), for any x ∈ S. We now define the tail σ-field T and the invariant σ-field I on Ω (and identically for Ω(x) for any x ∈ S):

Definition 9.4. A ∈ T if and only if (i) A is Borel-measurable; (ii) y ∈ A, y ∼T z ⟹ z ∈ A, i.e., A is a union of tail equivalence classes. Similarly, A ∈ I if and only if (i) A is Borel-measurable; (ii) y ∈ A, y ∼I z ⟹ z ∈ A, i.e., A is a union of invariant equivalence classes.

Definition 9.5. A function F : Ω → R is called a tail function if it is T-measurable (i.e., F is Borel-measurable and y ∼T z implies F(y) = F(z)). Similarly, F is called an invariant function if it is I-measurable. Of course, being an invariant function is a stronger condition than being a tail function. A key example of the inclusion being strict is simple random walk on a bipartite graph G(V, E) with parts V = V1 ∪ V2: the event A = { y2n ∈ V1 for all n } is in the tail field but is not invariant.

It is also important to notice that although the jumps in the Markov chain are independent, Kolmogorov's 0-1 law does not say that tail or invariant functions are always trivial, since the yn's themselves are not independent. For instance, if the chain has three states, {0, 1, 2}, where 1 and 2 are absorbing, and p(0, i) = 1/3 for all i, then B = { yn = 1 eventually } is an invariant event, with P0[B] = 1/2.

Call two tail or invariant functions f, g : Ω → R equivalent if Px[ f = g ] = 1 for any x ∈ S. Accordingly, we say that T or I is trivial if for any x ∈ S and any event A in the σ-field, Px[A] ∈ {0, 1}. Note that if we consider only the measures Px, then the distinction between T and I in the above example of simple random walk on a bipartite graph disappears: for x ∈ V1, we have Px[ A ] = 1, hence a Px-measure zero set can be added to A to put it into I, while for x ∈ V2, we have Px[ A ] = 0, hence a Px-measure zero set can be subtracted from A to put it into I. The following theorem shows that for SRW on groups, a similar collapse always happens; on the other hand, the exercise afterwards shows that this is not a triviality.

Theorem 9.16. For simple random walk on a finitely generated group Γ, with starting point x ∈ Γ, the tail and invariant σ-fields T and I coincide up to Px-measure zero sets.

Exercise 9.13. Give an example of a graph G = (V, E) and a vertex x ∈ V such that for simple random walk started at x, the σ-fields T and I are not the same up to Px-measure zero sets.

The proof of Theorem 9.16 relies on the following result, whose proof we omit, although the Reader is invited to think about it a little bit:

Theorem 9.17 (Derriennic's 0-2 law [Der76]). For SRW on a finitely generated group, for any k ∈ N, we have

    ‖μn − μn+k‖₁ = 2 dTV(μn, μn+k) = 2  for all n,  or it is o(1) as n → ∞.

Exercise 9.14. Prove Theorem 9.16 from Theorem 9.17.

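The three-state chain mentioned above (states {0, 1, 2}, with 1 and 2 absorbing and p(0, i) = 1/3 for all i) is easy to simulate. The following sketch (our own code, not from the notes) confirms that the invariant event B = { yn = 1 eventually } has P0[B] ≈ 1/2:

```python
import random

def run_to_absorption(rng):
    """Chain on {0, 1, 2}: from 0 jump to 0, 1, 2 with prob. 1/3 each;
    states 1 and 2 are absorbing. Returns the absorbing state."""
    state = 0
    while state == 0:
        state = rng.randrange(3)
    return state

rng = random.Random(3)
N = 4000
p1 = sum(run_to_absorption(rng) == 1 for _ in range(N)) / N
```

By symmetry between the two absorbing states, the estimate p1 should be close to 1/2, a non-trivial value for an invariant event, illustrating that Kolmogorov's 0-1 law does not apply here.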
We now have the following connection between the invariant σ-field and bounded harmonic functions for arbitrary Markov chains on a state space S. In some sense, it is a generalization of Theorem 9.9, which roughly said that if there is one possible limiting behavior of a Markov chain, then there are only trivial bounded harmonic functions.

Theorem 9.18. There is an invertible correspondence between bounded harmonic functions on S and equivalence classes of bounded invariant functions on Ω.

Proof. Let f : Ω → R be an invariant function representing an equivalence class; then the corresponding harmonic function u : S → R is

    u(x) = Ex[ f(X0, X1, . . .) ] ,

where Ex is the expectation operator with respect to Px. From the other direction, let u : S → R be harmonic; then the corresponding equivalence class is the one represented by the invariant function f : Ω → R defined by

    f(y0, y1, . . .) = lim sup_{n→∞} u(yn) .

We now prove this correspondence is invertible. For one direction, let u : S → R; we need to show u(x) = Ex[ lim sup u(Xn) ]. This follows easily from the fact that u(Xn) is a bounded martingale, and so by the Martingale Convergence Theorem 9.7 and the Dominated Convergence Theorem, we have that Ex[ lim sup u(Xn) ] = Ex[ u(X0) ] = u(x).

For the other direction, let f : Ω → R be invariant and represent an equivalence class; then we need to prove that for any x ∈ S the following equality holds Px-a.s.:

    f(y0, y1, . . .) = lim sup_{n→∞} Eyn[ f(X0, X1, . . .) ] .

In words, for almost every random walk trajectory {yn} started from y0 = x, the end-result is well-approximated by the average end-result of a new random walk started at a large yn. Indeed, for any x ∈ S we have by Lévy's 0-1 law (Theorem 9.14) that, Px-a.s.,

    f(y0, y1, . . .) = lim_{n→∞} Ex[ f(y0, . . . , yn, Xn+1, . . .) | X0 = y0, . . . , Xn = yn ] ,

but since f is an invariant function,

    Ex[ f(y0, . . . , yn, Xn+1, . . .) | X0 = y0, . . . , Xn = yn ] = Eyn[ f(X0, X1, . . .) ] ,

and we are done.

By looking at indicators of invariant events, we get the following immediate important consequence, a much more general version of the (I) ⟺ (H) equivalence in Theorem 9.12.

Corollary 9.19. The invariant σ-field of Ω is trivial if and only if all bounded harmonic functions on S are constant.

The invariant σ-field of a Markov chain (or equivalently, the space of bounded harmonic functions) is called the Poisson boundary. We can already see why it is a "boundary": it is the space of possible different behaviours of the Markov chain trajectories at infinity. The name "Poisson" comes from the following analogue.

If U ⊂ C is the open unit disk, and f : ∂U → R is a bounded Lebesgue-measurable function, then we can define its harmonic extension f̃ : U → R by the Poisson formula

    f̃(z) = ∫₀¹ (1 − |z|²) / |e^{2πiθ} − z|² · f(e^{2πiθ}) dθ ,

which is nothing else but integration against the harmonic measure from z, i.e., Ez[ f(B_τ) ], where τ is the first hitting time of ∂U by Brownian motion in U. That is, ∂U plays the role of the Poisson boundary for Brownian motion on U. This analogy between random walks on groups and complex analysis can be made even closer by equipping U with the hyperbolic metric, and then writing U = PSL(2, R)/SO(2), since PSL(2, R) is the group of the orientation-preserving hyperbolic isometries (the Möbius transformations) acting on U transitively, and SO(2) is the stabilizer of 0 ∈ U. Now, harmonic measure on ∂U w.r.t. hyperbolic Brownian motion is the same as w.r.t. the Euclidean one, except that the hyperbolic BM never hits the ideal boundary ∂U, only converges to it. So, the Poisson boundary of the infinite group PSL(2, R) can be realized geometrically as ∂U. This is true in much greater generality: e.g., the Poisson boundary of Gromov-hyperbolic groups can be naturally identified with their usual topological boundary [Kai00].

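The Poisson formula can be sanity-checked numerically: the harmonic extension of the boundary function f(e^{2πiθ}) = cos(2πθ) is Re z. Below is our own sketch, approximating the integral by a midpoint Riemann sum (for smooth periodic integrands this converges very fast):

```python
import cmath
import math

def poisson_extension(f, z, m=20000):
    """Harmonic extension of f : [0,1) -> R (boundary values on the unit
    circle, parametrized by theta) to the point z in the open unit disk,
    via a midpoint Riemann sum for the Poisson integral."""
    s = 0.0
    for k in range(m):
        t = (k + 0.5) / m
        kernel = (1 - abs(z) ** 2) / abs(cmath.exp(2j * math.pi * t) - z) ** 2
        s += kernel * f(t)
    return s / m

z = 0.3 + 0.2j
val = poisson_extension(lambda t: math.cos(2 * math.pi * t), z)
```

Since the harmonic function with boundary values Re(e^{2πiθ}) is Re z, the value computed at z = 0.3 + 0.2i should be very close to 0.3.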
9.5 The Poisson boundary and entropy. The importance of group-invariance

We now prove the (E) ⟺ (I) part of Theorem 9.12, in a more general form: for arbitrary finite entropy measures instead of just symmetric finitely supported ones.

Theorem 9.20 ([KaiV83]). For any countable group Γ and a measure μ on it with finite entropy H(μ) < ∞, the Poisson boundary of (Γ, μ) is trivial iff the asymptotic entropy vanishes, h(μ) = 0.

Proof. We will need the notion of conditional entropy: if X and Y are random variables taking values in some countable sets, then

    H(X | Y) := −Σ_{x,y} P[ X = x, Y = y ] log P[ X = x | Y = y ] = H(X, Y) − H(Y) .    (9.5)

In particular,

    H(X, Y) ≤ H(X) + H(Y) ,    (9.6)

with equality if and only if X and Y are independent.

With this definition, using the notation xi = y_{i−1}⁻¹ yi for i ≥ 1 on the trajectory space Ω(y0), from

    Py0[ Xi = yi for i = 1, . . . , k | Xn = yn ] = μ(x1) · · · μ(xk) μ_{n−k}(y_k⁻¹ yn) / μn(yn)    (9.7)

we easily get

    H(X1, . . . , Xk | Xn) = k h1 + h_{n−k} − hn ,    (9.8)

where hi := H(μi) = H(Xi). Now notice that Markovianity implies that H(X1, . . . , Xk | Xn, Xn+1) = H(X1, . . . , Xk | Xn), and (9.6) implies that H(X1, . . . , Xk | Xn, Xn+1) ≤ H(X1, . . . , Xk | Xn+1). Combined with the k = 1 case of (9.8), we get that hn − h_{n−1} ≥ h_{n+1} − hn. From this and the definition of h(μ), we get that h_{n+1} − hn decreases monotonically to h(μ). Now, take the limit n → ∞ in (9.8) to get

    lim_{n→∞} H(X1, . . . , Xk | Xn) = k h1 − k h(μ) = H(X1, . . . , Xk) − k h(μ) ,    (9.9)

where H(X1, . . . , Xk) = k H(μ) = k h1 follows from a computation similar to (9.7). Note that, in particular, we get the following formula for the asymptotic entropy:

    h(μ) = H(X1) − lim_{n→∞} H(X1 | Xn) .    (9.10)

By (9.6), we can reinterpret (9.9) as follows: h(μ) = 0 iff (X1, . . . , Xk) is asymptotically independent of Xn for all k ≥ 1, under any Py0. Note that, for any event A ∈ T, we have by Lévy's 0-1 law (Theorem 9.14) that lim_{n→∞} Py0[ A | Xn ] = Py0[ A ]. Since, conditioned on Xn, the tail event A is independent of (X1, . . . , Xk), the asymptotic independence of (X1, . . . , Xk) from Xn implies that A is independent of (X1, . . . , Xk). So, by Kolmogorov's 0-1 law (Theorem 9.15), the tail σ-field T is trivial. Vice versa, if T is trivial, then (X1, . . . , Xk) must be asymptotically independent of Xn, hence h(μ) = 0. Recall now from Theorem 9.16 that for SRW on groups, T = I up to Px-measure zero sets, and we are done.

In the above proof, and hence in the equivalence between positive speed and the non-Liouville property, it is important for the walk to be group-invariant, as shown by the following example, a non-amenable bounded degree graph that has positive speed and entropy, but has the Liouville property and trivial Poisson boundary. The main place where the above proof breaks down is in (9.7), where we wrote P[ Xn = yn | Xk = yk ] = μ_{n−k}(y_k⁻¹ yn), and hence got h_{n−k} instead of H(Xn | Xk) in (9.8).

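The identities (9.5) and (9.6) used in the proof can be checked on any concrete finite joint distribution; a small sketch of ours:

```python
import math

def H(p):
    """Shannon entropy (in nats) of an iterable of probabilities."""
    return -sum(q * math.log(q) for q in p if q > 0)

# a concrete joint distribution P[X = x, Y = y] on {0,1} x {0,1,2}:
joint = {(0, 0): 0.2, (0, 1): 0.1, (0, 2): 0.2,
         (1, 0): 0.1, (1, 1): 0.3, (1, 2): 0.1}

# marginals of X and Y:
pX = [sum(p for (x, y), p in joint.items() if x == x0) for x0 in (0, 1)]
pY = [sum(p for (x, y), p in joint.items() if y == y0) for y0 in (0, 1, 2)]

HXY = H(joint.values())
# H(X|Y) computed directly from the definition in (9.5):
HX_given_Y = -sum(p * math.log(p / pY[y]) for (x, y), p in joint.items())
```

The chain rule H(X|Y) = H(X,Y) − H(Y) and the subadditivity H(X,Y) ≤ H(X) + H(Y) then hold exactly (up to floating-point error).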
Example: a non-amenable Liouville graph [BenK10]. Take an infinite binary tree, and on the vertex set of each level Ln, place a 3-regular expander. Clearly, simple random walk on this graph G = (V, E) has positive speed (namely, 2/6 − 1/6 = 1/6) and positive entropy (after n steps, the walk is close to level n/6 with huge probability, and given that this level is k, the distribution is uniform on this set of size 2^k). However, this graph is Liouville. The main idea is that the expanders mix the random walk fast enough within the levels, so that there can never be a specific direction of escape. Here is the outline of the actual argument.

Let h : V → R be a bounded harmonic function. Let u, v ∈ V, and let ν_n^u denote the harmonic measure on level Ln of simple random walk started at u (i.e., the distribution of the first vertex hit in Ln), for n larger than the level of u. Then, by the Optional Stopping Theorem (see Section 6.3), for all large n,

    h(u) − h(v) = Σ_{w∈Ln} h(w) (ν_n^u(w) − ν_n^v(w)) ≤ max_{x∈G} |h(x)| · 2 dTV(ν_n^u, ν_n^v) .

So, we need to show that the total variation distance between the harmonic measures converges to zero. The idea for this is that when first getting from Ln to Ln+1, the walk with positive probability makes at least one step inside the expander on Ln, delivering some strict contractive effect in the L2-norm. (All Lp norms are understood w.r.t. the uniform probability measure on Ln.) It is also not hard to show that moving in the other directions is at least non-expanding. Formally, define the operator Tn : L2(Ln) → L2(Ln+1) by

    (Tn ψ)(y) := 2 Σ_{x∈Ln} ψ(x) ν_{n+1}^x(y) ,

where the factor 2 = |Ln+1|/|Ln| ensures that Tn 1 = 1, and, in fact, one can show that ‖Tn‖_{1→1} ≤ 1 and ‖Tn‖_{∞→∞} ≤ 1, so, by the Riesz-Thorin interpolation theorem, also ‖Tn‖_{2→2} ≤ 1. Then, by looking at the possible first steps of the walk before hitting Ln+1, we can write Tn as a weighted sum of compositions of other operators, all with L2-norm at most 1, while the one encoding the step within Ln is a strict L2-contraction on functions orthogonal to constants. Altogether, we get that for any function ψ ∈ L2(Ln) with Σ_{x∈Ln} ψ(x) = 0, we have ‖Tn ψ‖2 ≤ (1 − c)‖ψ‖2 with some c > 0. This implies that the function f_n^u := 2^n ν_n^u − 1 = T_{n−1} · · · T_{k+1}(2^{k+1} ν_{k+1}^u − 1), where u ∈ Lk, satisfies ‖f_n^u‖2 → 0 as n → ∞. This also easily implies that

    2 dTV(ν_n^u, ν_n^v) ≤ ‖f_n^u‖1 + ‖f_n^v‖1 ≤ ‖f_n^u‖2 + ‖f_n^v‖2 → 0 ,

and we are done. A key motivation for [BenK10] was the following conjecture:

Exercise 9.15 (Itai Benjamini).*** Show that there exists no infinite expander: this would be a bounded degree infinite graph such that every ball Bn(x) around every vertex x is itself an expander, with a uniform Cheeger constant c > 0. Show at least that this property is invariant under quasi-isometries, or at least independent of the choice of a finite generating set of a group.

Note that the graph in the above example has expander balls around the root of the binary tree. One may hope that the above proof could be generalized to show that simple random walk on an infinite expander would always have trivial Poisson boundary, and thus such a graph could not be a Cayley graph, at least. However, such a generalization seems problematic. It is important in this proof that the harmonic measure from the root on the levels Ln is uniform; in fact, if we start a random walk uniformly at x ∈ Ln, then the averaged harmonic measure on Ln+1 is uniform again. Formally, this is used in proving ‖Tn‖ ≤ 1 for all n. However, the expanders on the levels are expanders w.r.t. the uniform measure; say, if we place expanders on the levels of an irregular infinite tree, one could work with non-uniform measures on the levels in the definition of the operators Tn or in the operator norms, and still could have ‖Tn‖2 ≤ 1, but if the harmonic measure on a level is concentrated on a very small subset, then the walk might not feel the contractive effect in the horizontal direction. See, e.g., the following exercise.

Exercise 9.16 ([BenK10]).* Consider an imbalanced binary tree: from each vertex, put a double edge to the right child and a single edge to the left child. Then place a regular expander on each level in a way that the resulting graph has a non-trivial Poisson boundary.

This issue calls for the following question:

Exercise 9.17.*** Does every finitely generated group have a generating set in which the harmonic measures ν_n^o on the spheres Ln := ∂V Bn(o) are roughly uniform, in the sense that there exist 0 < c, C < ∞ such that for each n there is Un ⊆ Ln with ν_n^o(Un) > c and c < ν_n^o(x)/ν_n^o(y) < C for all x, y ∈ Un? This is very similar to Vershik's question, Exercise 9.12.

An affirmative answer to this question, together with an affirmative answer to the question of invariance of the infinite expander property under a change of generators, would give a proof of Benjamini's conjecture, Exercise 9.15, for Cayley graphs. On the other hand, one could try to find counterexamples to Exercises 9.17 and 9.12 among non-amenable torsion groups, say (see Section 15.2).

9.6 Unbounded measures

There are a lot of nice results showing that there are good reasons for sometimes leaving our usual world of finitely supported measures.

The following two theorems from [KaiV83], whose proof we omit, make a direct connection between non-trivial Poisson boundary and non-amenability. Recall that a measure $\mu$ on a group $\Gamma$ is called aperiodic if the greatest common divisor of the set $\{n : \mu^{*n}(1) > 0\}$ is 1.

Theorem 9.21. For any aperiodic measure $\mu$ whose (not necessarily finite) support generates the group $\Gamma$, the Poisson boundary of $(\Gamma, \mu)$ is trivial if and only if the convolutions $\mu^{*n}$ converge weakly to a left-invariant mean on $L^\infty(\Gamma)$, see Definition 5.3, or equivalently, $d_{TV}\big(\mu^{*n}(\cdot), \mu^{*n}(g^{-1}\cdot)\big) \to 0$.

This immediately implies that on a non-amenable group any non-degenerate $\mu$ has non-trivial Poisson boundary. (For symmetric finitely supported measures we knew this from Proposition 9.3.) Conversely, Theorem 9.21 can also be used to produce a symmetric measure with trivial Poisson boundary on any amenable group. The idea is to take an average of the uniform measures on larger and larger Følner sets. This establishes the following result, conjectured by Furstenberg.

Theorem 9.22. A group $\Gamma$ is amenable iff there is a measure $\mu$ supported on the entire $\Gamma$ whose Poisson boundary is trivial.

In this theorem, the support of the measure that shows amenability might indeed need to be large: it is shown in [Ers04a, Theorem 3.1] that, on the lamplighter groups $\mathbb{Z}_2 \wr \mathbb{Z}^d$ with $d \ge 3$, any measure with finite entropy has non-trivial Poisson boundary.

For non-amenable groups, the theorem says that positive speed cannot be ruined by strange large generating sets. Can the spectral radius being less than 1 be ruined? It turns out that there exist non-amenable groups where the spectral radius of finite symmetric walks can be arbitrarily close to 1 [Osi02, ArzBLRSV05], hence I expect that an infinite support can produce a spectral radius 1.

Exercise 9.18. ** Can there exist a symmetric measure $\mu$ whose infinite support generates a finitely generated non-amenable group $\Gamma$ such that the spectral radius is $\rho(\mu) = 1$?
We have already mentioned the Choquet-Deny theorem (1960), saying that any aperiodic measure on any Abelian group has the Liouville property. How small does a group have to be for this to remain true? Raugi found a very simple proof of the Choquet-Deny theorem, and extended it to nilpotent groups:

Theorem 9.23 ([Rau04]). For any aperiodic probability measure $\mu$ on a countable nilpotent group $\Gamma$, the associated Poisson boundary is trivial.

Proof for the Abelian case. Let $h : \Gamma \to \mathbb{R}$ be a bounded harmonic function, $T_1, T_2, \ldots$ independent steps w.r.t. $\mu$, and $X_n = X_0 + T_1 + \cdots + T_n$ the random walk. For all $n \ge 1$, define
\[
u_n(x) := E_x\big[\big(h(X_n) - h(X_{n-1})\big)^2\big] = \sum_{t_1,\ldots,t_n} \big(h(x + t_1 + \cdots + t_n) - h(x + t_1 + \cdots + t_{n-1})\big)^2 \,\mu(t_1)\cdots\mu(t_n)\,.
\]

Now, for $n \ge 2$,
\begin{align*}
u_n(x) &= E\Big[\, E_x\big[ \big(h(X_n) - h(X_{n-1})\big)^2 \,\big|\, T_2, \ldots, T_n \big] \Big] \\
&\ge E\Big[\, E_x\big[ h(X_n) - h(X_{n-1}) \,\big|\, T_2, \ldots, T_n \big]^2 \Big] && \text{by Jensen or Cauchy-Schwarz} \\
&= E\Big[ \big( h(x + T_2 + \cdots + T_n) - h(x + T_2 + \cdots + T_{n-1}) \big)^2 \Big] && \text{by harmonicity and commutativity} \\
&= E_x\big[ \big(h(X_{n-1}) - h(X_{n-2})\big)^2 \big] = u_{n-1}(x)\,.
\end{align*}
On the other hand, $u_n(x) = E_x\big[h(X_n)^2\big] - E_x\big[h(X_{n-1})^2\big]$ by the orthogonality of martingale increments (Pythagoras for martingales), hence $\sum_{n=1}^N u_n(x) = E_x\big[h(X_N)^2\big] - h(X_0)^2$. This is a sum of non-decreasing non-negative terms that remains bounded as $N \to \infty$, hence all the terms must be zero, which obviously implies that $h$ is constant.

We know from the entropy criterion for the Poisson boundary that a finitely supported measure on any group with subexponential growth has trivial boundary. On the other hand, Anna Erschler proved in [Ers04b] that there exists a group with subexponential growth (an example of Grigorchuk, see Section 15.1) and some infinitely supported measure on it with finite entropy that has a non-trivial Poisson boundary. The following question from [KaiV83] is still open: Is it true that on any group of exponential growth there is a $\mu$ with non-trivial Poisson boundary? For solvable groups, this was shown (with symmetric $\mu$ with finite entropy) in [Ers04a, Theorem 4.1]. On the other hand, Bartholdi and Erschler have recently shown that there exists a group of exponential growth on which any finitely supported $\mu$ has a trivial boundary [BarE11].

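The "Pythagoras for martingales" identity used in the proof above can be verified exactly on a toy example by enumerating all sample paths. The particular martingale below, $M_k = X_k^2 - k$ for SRW on $\mathbb{Z}$, is only an illustrative choice:

```python
from itertools import product

# Exact check of the orthogonality of martingale increments:
# for the martingale M_k = X_k^2 - k of SRW on Z,
# E[(M_n - M_0)^2] = sum_{k<n} E[(M_{k+1} - M_k)^2].
n = 10
lhs = rhs = 0.0
for steps in product([-1, 1], repeat=n):  # all 2^n equally likely paths
    X = [0]
    for s in steps:
        X.append(X[-1] + s)
    M = [X[k] ** 2 - k for k in range(n + 1)]
    lhs += (M[n] - M[0]) ** 2
    rhs += sum((M[k + 1] - M[k]) ** 2 for k in range(n))
lhs /= 2.0 ** n
rhs /= 2.0 ** n
assert abs(lhs - rhs) < 1e-9
assert abs(lhs - (2 * n * n - 2 * n)) < 1e-9  # both sides equal 2n^2 - 2n here
```

(Both sides equal $2n^2 - 2n$, since the increment $M_{k+1} - M_k = 2X_k s_{k+1}$ has second moment $4k$.)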
10 Growth of groups, of harmonic functions and of random walks

10.1 A proof of Gromov's theorem

We now switch to the theorem characterizing groups of polynomial growth due to Gromov [Gro81] and give a brief sketch of the new proof due to Kleiner [Kle10], also using ingredients and insights from [LeeP09] and [Tao10]. In fact, [Tao10] contains a self-contained and mostly elementary proof of Gromov's theorem. We will borrow some parts of the presentation there.

Theorem 10.1 (Gromov's theorem [Gro81]). A finitely generated group has polynomial growth if and only if it is virtually nilpotent.

We have already proved the if direction, in Theorem 4.8. For the harder only if direction, we will need several facts which we provide here without proper proofs but at least with a sketch of their main ideas. The first ingredient is the following:

Theorem 10.2 ([ColM97], [Kle10]). For fixed positive integers $d$ and $\ell$ there exists some constant $f(d, \ell)$ such that for any Cayley graph $G$ of any finitely generated group of polynomial growth with degree $d$, the space of harmonic functions on $G$ with growth degree $\le \ell$ has dimension at most $f(d, \ell)$ (so in particular is finite-dimensional).

For a tiny bit of intuition to start with, recall Exercise 9.4, saying that sublinear harmonic functions on $\mathbb{Z}^d$ are constant. But there, in the coupling proof, we used the product structure of $\mathbb{Z}^d$ heavily, while here we do not have any algebraic information; this is exactly the point. The Colding-Minicozzi proof used Gromov's theorem, so it was not good enough for Kleiner's purposes, but it did motivate his proof.

Sketch of proof of Theorem 10.2. We give Kleiner's proof for the case when the Cayley graph satisfies the so-called doubling condition: there is an absolute constant $D < \infty$ such that $|B(2R)| \le D\,|B(R)|$ for any radius $R$. This does not follow easily from being a group of polynomial growth, so this is an annoying but important technicality all along Kleiner's proof. Now, the key lemma is the following:

Lemma 10.3. Cover the ball $B_R$ by balls $B^i$ of radius $\varepsilon R$, and suppose that a harmonic function $f : G \to \mathbb{R}$ has mean zero on each $B^i$. Then
\[
\|f\|_{\ell^2(B_R)} \le C\varepsilon\, \|f\|_{\ell^2(B_{4R})}\,.
\]

Proof. Shrink each $B^i$ by a factor of two, and take a maximal subset of them in which all of them are disjoint. Then the corresponding original balls $B^i$ still cover $B_R$. Let the set of these $B^i$'s be $\mathcal{B}$. Enlarge now each $B^i \in \mathcal{B}$ by a factor of three. We claim that each point in $B_{2R}$ is covered by at most some $D^*$ of these $3B^i$'s. This is because of the doubling property: if there were more than $D^*$ of them covering a point $x$, then $B_{3.5\varepsilon R}(x)$ would contain $D^*$ disjoint balls $B^i/2$, so we would have $D^* |B_{\varepsilon R/2}| \le |B_{3.5\varepsilon R}|$, which cannot be for large enough $D^*$.

Now apply Saloff-Coste's Poincaré inequality, Exercise 8.2, to each $B^i$, sum over the $B^i$'s, use the existence of $D^*$, then apply the reverse Poincaré inequality, Exercise 8.3, to $B_{2R}$, to get
\[
\sum_{x \in B_R} |f(x)|^2 \le \sum_{B^i \in \mathcal{B}}\, \sum_{x \in B^i} |f(x)|^2 \le O(1)\,\varepsilon^2 R^2 \sum_{B^i \in \mathcal{B}}\, \sum_{x \in 3B^i} |\nabla f(x)|^2 \le O(1)\,\varepsilon^2 R^2 \sum_{x \in B_{2R}} |\nabla f(x)|^2 \le O(1)\,\varepsilon^2 \sum_{x \in B_{4R}} |f(x)|^2\,,
\]
which proves the lemma.

This lemma implies that by imposing relatively few constraints on a harmonic function we can make it grow quite rapidly. To finish the proof of Theorem 10.2, the idea is that if we take a vector space spanned by many independent harmonic functions, then this lemma implies that some of the functions in this space would grow quickly. So, the space of harmonic functions with moderate growth cannot be large. To realize this idea, the trick is to consider the Gram determinant
\[
\det\Big( (u_i, u_j)_{\ell^2(B_R)} \Big)_{i,j=1}^N\,,
\]
where $\{u_i : i = 1, \ldots, N\}$ is a basis for the harmonic functions of growth degree $\le \ell$ such that $\{u_i : 1 \le i \le M\}$ span the subspace where the mean on each $B^i \in \mathcal{B}$ is zero. This determinant is the squared volume of the parallelepiped spanned by the $u_i$'s. Now, on one hand, obviously $\|u_i\|_{\ell^2(B_R)} \le \|u_i\|_{\ell^2(B_{4R})}$ for all $i$, while, for $1 \le i \le M = N - |\mathcal{B}|$, we also have $\|u_i\|_{\ell^2(B_R)} \le C\varepsilon\, \|u_i\|_{\ell^2(B_{4R})}$ by Lemma 10.3. By the construction of $\mathcal{B}$ and the doubling property, $|\mathcal{B}| = O(\varepsilon^{-d})$. Altogether, the squared volume satisfies
\[
\det\Big( (u_i, u_j)_{\ell^2(B_R)} \Big)_{i,j=1}^N \le O(\varepsilon^2)^{N - O(\varepsilon^{-d})} \det\Big( (u_i, u_j)_{\ell^2(B_{4R})} \Big)_{i,j=1}^N \le O(\varepsilon^2)^{N - O(\varepsilon^{-d})}\, \big(O(1)\, 4^{d + 2\ell}\big)^N \det\Big( (u_i, u_j)_{\ell^2(B_R)} \Big)_{i,j=1}^N \le \frac{1}{2} \det\Big( (u_i, u_j)_{\ell^2(B_R)} \Big)_{i,j=1}^N\,,
\]
where the middle inequality is by the growth conditions, and the last one holds if $\varepsilon$ is small and $N$ is large. This means that the determinant is zero for all $R$ whenever $N$ is larger than some $f(d, \ell)$. This proves Theorem 10.2.

The second ingredient for the proof of Gromov's theorem is complementary to the previous one: it will imply that for groups of polynomial growth there do exist non-trivial harmonic functions of moderate growth. Combining the two ingredients, we will be able to construct a non-trivial representation of our group over a finite-dimensional vector space, and then use the better understanding of linear groups.

Theorem 10.4 ([Mok95], [KorS97], [Kle10], [LeeP09], [ShaT09]). Let $\Gamma$ be a finitely generated group without Kazhdan's property (T); in particular, anything amenable is fine. Then there exists an isometric (linear) action of $\Gamma$ on some real Hilbert space $\mathcal{H}$ without fixed points and a non-constant $\Gamma$-equivariant harmonic function $\Psi : \Gamma \to \mathcal{H}$. Equivariance means that $\Psi(gh) = g(\Psi(h))$ for all $g, h \in \Gamma$, and harmonicity is understood w.r.t. some symmetric finite generating set.

The first three of the five proofs listed above use ultrafilters to obtain a limit of almost harmonic functions into different Hilbert spaces. The last two proofs are constructive and quite similar to each other (done independently), inspired by random walks on amenable groups. We will briefly discuss now the proof of Lee and Peres [LeeP09]. Let us focus on the amenable case. (The trick for the general non-Kazhdan case is to consider the right Hilbert space action instead of the regular representation on $\ell^2(\Gamma)$ that is so closely related to random walks.) It works for all amenable transitive graphs, not only for groups. The key lemma is the following:

Lemma 10.5. For simple random walk on any transitive amenable graph $G(V, E)$, we have
\[
\inf_{\psi \in \ell^2(V)} \frac{\|(I - P)\psi\|_2^2}{\langle \psi, (I - P)\psi \rangle} = 0\,.
\]

As we have seen in Theorem 7.3 and Lemma 7.2, both $\|(I-P)\psi\|_2 / \|\psi\|_2$ and $\langle \psi, (I-P)\psi \rangle / \|\psi\|_2^2$ are small if we take $\psi$ to be the indicator function $1_A$ of a large Følner set $A \subseteq V$. But the lemma concerns the ratio of these two quantities, which would be of constant order for typical Følner indicator functions $1_A$. Instead, the right functions to take will be the smoothened out versions
\[
\psi_k := \sum_{i=0}^{k-1} P^i 1_A\,, \qquad \psi_k(x) = E_x \big|\{0 \le i \le k-1 : X_i \in A\}\big|\,,
\]
which are some truncated Green's functions.

Exercise 10.1.
(a) Show that if $(V, P)$ is transient or null-recurrent, then $P^i f \to 0$ pointwise for any $f \in \ell^2(V)$.
(b) Let $\psi_k = \psi_k(f) := \sum_{i=0}^{k-1} P^i f$. Show that $\frac{1}{k} \langle \psi_k, f \rangle \to 0$.
(c) Show that $\|(I-P)\psi_k\|_2^2 \le 4 \|f\|_2^2$ and $\langle \psi_k, (I-P)\psi_k \rangle = \langle 2\psi_k - \psi_{2k}, f \rangle$.

Exercise 10.2. * Show that if $\|(I-P)f\| / \|f\| < \varepsilon$ is small (e.g., for the indicator function of a large Følner set), then there is a $k \in \mathbb{N}$ such that $\langle 2\psi_k(f) - \psi_{2k}(f), f \rangle \ge L(\varepsilon)$ is large. (Hint: first show that there is an $\ell$ such that $\langle \psi_\ell(f), f \rangle$ is large, then use part (b) of the previous exercise.)

The combination of Exercise 10.1 (c) and Exercise 10.2 proves Lemma 10.5.

Let us note that the proof would have looked much simpler if we had worked with $\psi = \psi(f) = \sum_{i=0}^{\infty} P^i f$, with $f = 1_A$ for a large Følner set $A$. Namely, we would have $(I-P)\psi = 1_A$, hence $\|(I-P)\psi\|_2^2 = |A|$, while
\[
\langle \psi, (I-P)\psi \rangle = \sum_{x \in A} \psi(x) \ge r \big( |A| - d^r\, |\partial_V^{in} A| \big)
\]
for any given $r$, where $G$ is $d$-regular, since we stay in $A$ for at least $r$ steps of the walk if we start at least distance $r$ away from the boundary. The above identity and the inequality together give the required small ratio in Lemma 10.5. However, we would need to prove here that this $\psi$ exists (which is just transience) and is in $\ell^2(V)$, which is in fact true whenever the volume growth of $G$ is at least degree 5, but it is not clear how to show that without relying on Gromov's polynomial growth theorem, which we obviously want to avoid here.

We now want to prove Theorem 10.4 for the amenable case. Choose a sequence $\psi_j \in \ell^2(\Gamma)$ giving the infimum 0 in Lemma 10.5, and define $\Psi_j : \Gamma \to \ell^2(\Gamma)$ by
\[
\Psi_j(x) : g \mapsto \frac{\psi_j(g^{-1}x)}{\sqrt{2 \langle \psi_j, (I-P)\psi_j \rangle}}\,.
\]

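The identities in Exercise 10.1 (c) are purely algebraic (they need only that $P$ is symmetric with $\|P\| \le 1$), so they can be sanity-checked numerically. Here is a minimal sketch for SRW on a cycle, where the cycle length, the arc playing the role of $A$, and the value of $k$ are arbitrary illustrative choices:

```python
# Check Exercise 10.1 (c) for SRW on the cycle Z_m:
# (I - P) psi_k = f - P^k f, hence ||(I - P) psi_k||^2 <= 4 ||f||^2,
# and <psi_k, (I - P) psi_k> = <2 psi_k - psi_{2k}, f>.
m = 12

def step(v):  # one application of P, SRW on the cycle
    return [(v[(i - 1) % m] + v[(i + 1) % m]) / 2.0 for i in range(m)]

def psi(f, k):  # psi_k(f) = sum_{i=0}^{k-1} P^i f
    s, v = [0.0] * m, list(f)
    for _ in range(k):
        s = [a + b for a, b in zip(s, v)]
        v = step(v)
    return s

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

f = [1.0 if i < 4 else 0.0 for i in range(m)]  # indicator of an arc, a stand-in for 1_A
k = 5
pk = psi(f, k)
ipk = [a - b for a, b in zip(pk, step(pk))]    # (I - P) psi_k
assert dot(ipk, ipk) <= 4 * dot(f, f) + 1e-12
p2k = psi(f, 2 * k)
assert abs(dot(pk, ipk) - dot([2 * a - b for a, b in zip(pk, p2k)], f)) < 1e-9
```

The second assertion uses the symmetry of $P$, just as in the proof of the identity itself.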
These are clearly equivariant functions on $\Gamma$. It is easy to see that
\[
\sum_y p(x,y)\, \|\Psi_j(x) - \Psi_j(y)\|^2 = 1 \tag{10.1}
\]
for every $x \in \Gamma$, and
\[
\Big\| \Psi_j(x) - \sum_y p(x,y)\, \Psi_j(y) \Big\|^2 = \frac{\|(I-P)\psi_j\|_2^2}{2 \langle \psi_j, (I-P)\psi_j \rangle} \to 0 \tag{10.2}
\]
as $j \to \infty$. Using (10.1), the $\Psi_j$'s are uniformly Lipschitz, and by (10.2), they are almost harmonic. Then one can use a compactness argument to extract a limit, a harmonic equivariant function that is non-constant because of (10.1). This finishes the construction.

As the final ingredient for Gromov's theorem, here is what we mean by linear groups being better understood:

Theorem 10.6 (Tits alternative [Tit72]). Any finitely generated linear group (i.e., a subgroup of some $GL(n, \mathbb{F})$) is either almost solvable or contains a subgroup isomorphic to $F_2$.

Actually, the special case of a finitely generated subgroup $\Gamma$ of a compact linear Lie group $H \subseteq GL(n, \mathbb{C})$ will suffice for us, which is already a statement that can be proved in a quite elementary way, as I learnt from [Tao10]. First of all, $H$ is isomorphic to a subgroup of $U(n)$, since any inner product on $\mathbb{C}^n$ can be averaged by the Haar measure of $H$ to get an $H$-invariant inner product on $\mathbb{C}^n$, thereby $H$ becoming a subgroup of the unitary group associated to this inner product. Then, the key observation is that if $g, h \in U(n)$ are close to the identity (in the operator norm), then their commutator is even closer:
\[
\|[g,h] - 1\|_{op} = \|gh - hg\|_{op} = \|(g-1)(h-1) - (h-1)(g-1)\|_{op} \le 2\, \|g-1\|_{op}\, \|h-1\|_{op}\,, \tag{10.3}
\]
where the first equality used that $g, h$ are unitary and the last inequality is by the triangle inequality. Using this, one can already easily believe Jordan's theorem: any finite subgroup $\Gamma$ of $U(n)$ contains an Abelian subgroup $\Delta$ of index at most $C_n$. Why? The elements of $\Gamma$ lying in a small enough $\varepsilon$-neighbourhood of the identity will generate a good candidate for $\Delta$: the finite index comes from the compactness of $U(n)$ and the positivity of $\varepsilon$, while a good amount of commutativity comes from taking the element $g$ of $\Delta$ closest to the identity, and then using (10.3) to show that $[g,h] = 1$ for any $h \in \Delta$. Then one can use some sort of induction. Similarly to this and to the proof of Proposition 4.6, one can prove that any finitely generated subgroup of $U(n)$ of subexponential growth is almost Abelian.

Proof of Gromov's Theorem 10.1. We now use these facts to give a proof of the only if direction, by induction on the growth degree of $\Gamma$. The base case is clear: groups of growth degree 0 are precisely the finite ones, and they are almost nilpotent, since the trivial group is nilpotent. Suppose we already proved the result for groups of degree $\le d-1$, and let $\Gamma$ be finitely generated of polynomial growth with degree $\le d$.

$\Gamma$ is amenable, hence from Theorem 10.4 we have a non-trivial equivariant harmonic embedding $\Psi$ into a real Hilbert space $\mathcal{H}$. It is Lipschitz, because $\Psi(gs) - \Psi(g) = g(\Psi(s) - \Psi(e))$. Let $V$ be the vector space of harmonic real-valued Lipschitz functions on $\Gamma$; it is finite dimensional by Theorem 10.2. Since $\Psi : \Gamma \to \mathcal{H}$ is non-constant, there is a bounded linear functional $\phi : \mathcal{H} \to \mathbb{R}$ such that $u_0 := \phi \circ \Psi \in V$ is a non-constant harmonic function on $\Gamma$. This $u_0$ cannot attain its maximum by the maximum principle, so it takes infinitely many values.

Now, $\Gamma$ acts on $V$ via $g : u \mapsto u^g$, where $u^g(x) = u(g^{-1}x)$. Moreover, it preserves the Lipschitz norm, and also acts on the quotient space $W = V / \{\text{constants}\}$, where $u(x) \sim u(x) + c$ for any constant $c$. On $W$, the Lipschitz norm is a genuine norm, and on a finite dimensional vector space any two norms are equivalent (up to constant factor bounds), hence the action of $\Gamma$ preserves a Euclidean structure up to constant factors. Thus, we get a representation $\rho : \Gamma \to GL(W)$ with a precompact image. This image is infinite, because by applying elements of $\Gamma$ we can get infinitely many different functions from $u_0$. So, the group $B = \mathrm{Im}\,\rho$ is finitely generated, infinite, and of polynomial growth, inside a compact Lie group, so by the compact Lie case of the Tits alternative Theorem 10.6, it is almost solvable (and, in fact, almost Abelian, but let us use only solvability). Let $A_0 = A$ be the finite index solvable subgroup of $B$, and $A_k = [A_{k-1}, A_{k-1}]$. Since $A$ is infinite and solvable, there is a smallest index $\ell$ such that $A_\ell$ has infinite index in $A_{\ell-1}$. Then $A_{\ell-1}$ has finite index in $B$, so $\Gamma_1 := \rho^{-1}(A_{\ell-1})$ has finite index in $\Gamma$, so is also finitely generated of polynomial growth with degree $\le d$, by Corollary 3.5. It is enough to show that $\Gamma_1$ is almost nilpotent. The group $A_{\ell-1}/A_\ell$ is Abelian, infinite and finitely generated, hence it can be projected onto $\mathbb{Z}$. So, we get a projection $\phi : \Gamma_1 \to \mathbb{Z}$, and then an exact sequence $1 \to N \to \Gamma_1 \to \mathbb{Z} \to 1$. We now apply the results of Section 4.3: by Proposition 4.6, $N$ is finitely generated and of polynomial growth with degree $\le d-1$, so it is almost nilpotent by the inductive assumption. Then, by Proposition 4.5, $\Gamma_1$ is almost nilpotent.

This proof was made as quantitative as possible in [ShaT09]; one of the many kinds of things that Terry Tao likes to do. In particular, they showed that there is some small $c > 0$ such that a $|B(R)| \le R^{c (\log \log R)^c}$ volume growth implies almost nilpotency and hence polynomial growth. This is the best result to date towards verifying the conjectured gap between polynomial and $\exp(c\sqrt{n})$ volume growth.

We will see in Chapter 16 that there are transitive graphs that are very different from Cayley graphs: not quasi-isometric to any of them. But this does not happen in the polynomial growth regime: there is an extension of Gromov's theorem by [Tro85] and [Los87], see also [Woe00, Theorem 5.11], that any quasi-transitive graph of polynomial growth is quasi-isometric to some Cayley graph (of an almost nilpotent group, of course). In particular, it has a well-defined integer growth degree.

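Before moving on, the commutator estimate (10.3) used above is easy to test numerically. Here is a small self-contained check for $2 \times 2$ unitaries; the particular one-parameter families (rotations and diagonal phase matrices) are just convenient test cases, not part of the argument:

```python
import cmath
import math

# Check ||[g,h] - 1||_op = ||gh - hg||_op <= 2 ||g - 1||_op ||h - 1||_op
# for some 2x2 unitaries g, h (matrices as nested tuples of complex numbers).
def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def sub(A, B):
    return tuple(tuple(A[i][j] - B[i][j] for j in range(2)) for i in range(2))

def op_norm(A):
    # largest singular value of A, via eigenvalues of the Hermitian matrix A* A
    Astar = tuple(tuple(A[j][i].conjugate() for j in range(2)) for i in range(2))
    H = mul(Astar, A)
    tr = (H[0][0] + H[1][1]).real
    det = (H[0][0] * H[1][1] - H[0][1] * H[1][0]).real
    lam = (tr + math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2.0
    return math.sqrt(max(lam, 0.0))

I2x2 = ((1 + 0j, 0j), (0j, 1 + 0j))

def rot(t):      # a real rotation matrix, unitary
    return ((complex(math.cos(t)), complex(-math.sin(t))),
            (complex(math.sin(t)), complex(math.cos(t))))

def phases(a):   # the diagonal unitary diag(e^{ia}, e^{-ia})
    return ((cmath.exp(1j * a), 0j), (0j, cmath.exp(-1j * a)))

for t, a in [(0.3, 0.2), (0.05, 0.4), (0.01, 0.01)]:
    g, h = rot(t), phases(a)
    comm = mul(mul(g, h), mul(rot(-t), phases(-a)))   # [g,h] = g h g^{-1} h^{-1}
    lhs = op_norm(sub(comm, I2x2))
    assert abs(lhs - op_norm(sub(mul(g, h), mul(h, g)))) < 1e-9  # the unitarity step
    assert lhs <= 2 * op_norm(sub(g, I2x2)) * op_norm(sub(h, I2x2)) + 1e-9
```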
10.2 Random walks on groups are at least diffusive

We have seen that non-amenability implies positive speed of escape. However, we may ask for other rates of escape.

Exercise 10.3. Using the results seen for return probabilities in Chapter 8, show that:
(a) Any group with polynomial growth satisfies $E\, d(X_0, X_n) \ge c\sqrt{n}$. (Notice that this is sharp for $\mathbb{Z}^d$, $d \ge 1$, using the Central Limit Theorem.)
(b) If the group has exponential growth, then $E\, d(X_0, X_n) \ge c\, n^{1/3}$.

Although it may seem that a random walk on a group with exponential growth should be further away from the origin than on a group with polynomial growth, the walk with exponential growth also has more places to go which are not actually further away, or in other words, it is more spread-out than random walk on a group with polynomial growth. Moreover, there can be many dead ends in the Cayley graph: vertices from which all steps take us closer to the origin. Therefore, it is not at all obvious that the rate of escape is at least $c\sqrt{n}$ on any group, and this was in fact unknown for quite a while.

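A quick Monte Carlo illustration of the diffusive rate in part (a) above, for SRW on $\mathbb{Z}^2$ with its standard generators; the step count, sample size, and the constants in the final check are ad hoc choices, not part of the theorem:

```python
import math
import random

def mean_distance(n_steps, trials, rng):
    # SRW on Z^2; for the standard generators, graph distance = L^1 distance
    total = 0
    for _ in range(trials):
        x = y = 0
        for _ in range(n_steps):
            dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
            x += dx
            y += dy
        total += abs(x) + abs(y)
    return total / trials

rng = random.Random(0)
n = 400
d = mean_distance(n, 1000, rng)
# CLT heuristic: E d(X_0, X_n) is about 2*sqrt(n/pi), i.e. roughly 1.13*sqrt(n)
assert 0.5 * math.sqrt(n) < d < 2.0 * math.sqrt(n)
```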
Exercise 10.4 (Anna Erschler). Using the equivariant harmonic function $\Psi$ into a Hilbert space $\mathcal{H}$, claimed to exist for any amenable group in Theorem 10.4, show that $E\, d(X_0, X_n)^2 \ge c\, n$, which is slightly weaker than the rate $c\sqrt{n}$ for the expectation in the previous exercise. Hint: If $M_n$ is a martingale in $\mathcal{H}$, then $E \|M_n - M_0\|^2 = \sum_{k=0}^{n-1} E \|M_{k+1} - M_k\|^2$, a Pythagorean Theorem for martingales (the orthogonality of martingale increments) in Hilbert spaces.

If we want to improve the linear bound on the second moment to a square root bound on the first moment, we need to show that $d(X_0, X_n)$ is somewhat concentrated. One way to do that is to show that higher moments do not grow rapidly.

Exercise 10.5. * Show that $E \|\Psi(X_n) - \Psi(X_0)\|^4 \le C n^2$, using the orthogonality of martingale increments. Then deduce that $E\, d(X_0, X_n) \ge c\sqrt{n}$. (This improvement is due to Bálint Virág.)

Even stronger concentration is true:

Theorem 10.7 (James Lee and Yuval Peres [LeeP09]). There is an absolute constant $C < \infty$ such that, for SRW on any amenable transitive $d$-regular graph, for any $\varepsilon > 0$ and all $n \ge 1/\varepsilon^2$, we have
\[
\frac{1}{n} \sum_{k=0}^{n} P\Big[ d(X_0, X_k) < \varepsilon \sqrt{n/d} \Big] \le C \varepsilon\,.
\]

Let us point out that one does not really need an actual harmonic Lipschitz embedding of the transitive amenable graph $G$ into Hilbert space to bound the rate of escape from below, but can use the approximate harmonic functions $\Psi(1_A)$ or $\Psi_k(1_A)$ for large Følner sets $A$ directly, and get quantitative bounds: $E[d(X_0, X_n)^2] \ge n/d$, where $G$ is $d$-regular. There is a similar bound for finite transitive graphs: $E[d(X_0, X_n)^2] \ge n/(2d)$ for all $n < 1/(1 - \lambda_2)$, where $\lambda_2$ is the second largest eigenvalue of $G$. See [LeeP09].

Exercise 10.6. ** The tail of the first return time: is it true that $P_x[\tau_x^+ > n] \ge c/\sqrt{n}$ for all groups? (This was a question from Alex Bloemendal in class, and I did not know how hard it was. Later this turned out to be a recent theorem of Ori Gurel-Gurevich and Asaf Nachmias.)

11 Harmonic Dirichlet functions and Uniform Spanning Forests

11.1 Harmonic Dirichlet functions

We have seen that the question whether there are non-constant bounded harmonic functions is interesting. It is equally interesting whether a graph has non-constant harmonic Dirichlet functions (i.e., harmonic functions with finite Dirichlet energy; their space is denoted by $HD(G)$ from now on). Recall that the Dirichlet energy for functions $f : V(G) \to \mathbb{R}$ is defined as
\[
\sum_{x \sim y} |f(x) - f(y)|^2\,.
\]
This is not to be confused with our earlier result that there is a finite energy flow from $x$ to $\infty$ iff the network is transient: such a flow gives a harmonic function only off $x$. Note, however, that the difference of two

different finite energy flows from $x$ with the same inflow at $x$ is a non-constant harmonic Dirichlet function, hence the non-existence of non-constant HD functions is equivalent to a certain uniqueness of currents in the graph. We will explain this more in a later version of these notes, but, very briefly, the non-trivial harmonic Dirichlet functions lie between the extremal currents, the free and wired ones. This connection was first noted by [Doy88]. See [LyPer10, Chapter 9] for a complete account.

The property of having non-constant HD functions (i.e., $HD > \mathbb{R}$) is a quasi-isometry invariant, which is due to Soardi (1993), and can be proved somewhat similarly to Kanai's Theorem 6.6. Thus we can talk about a group having $HD > \mathbb{R}$ or not. In amenable groups, all harmonic Dirichlet functions are constants, $HD = \mathbb{R}$. One proof is by [BLPS01], using uniform spanning forests, see the next section.

Exercise 11.1. If $F$ is a free group, then $HD > \mathbb{R}$.

Exercise 11.2. * Any non-amenable group with at least two ends (actually, it would have a continuum of ends, by Stallings' Theorem 3.2) has $HD > \mathbb{R}$.

Theorem 11.1 (He-Schramm [HeS95], Benjamini-Schramm [BenS96a, BenS96b]). For any infinite triangulated planar graph, there exists a circle packing representation, i.e., the vertices are represented by disks and edges are represented by two disks touching. This circle packing representation either fills the plane, in which case the graph is recurrent, or it fills a domain conformally equivalent to a disk, in which case the graph is transient. In the latter case (a domain conformally equivalent to a disk), we also have $HD > \mathbb{R}$.

For the circle packing representation, see the original paper [HeS95] or [Woe00, Section 6.D]. For the HD results, see the papers or [LyPer10, Sections 9.3 and 9.4].

It is worth mentioning another theorem, which shows where Kazhdan groups fit.

Theorem 11.2 ([BekV97]). If $\Gamma$ is a Kazhdan group, then $HD = \mathbb{R}$ for any Cayley graph.

As $HD$ is a vector space, we can try to compute its dimension.
Of course, when $HD > \mathbb{R}$, this dimension is infinite, due to the action of $\Gamma$ on the Cayley graph. But there is a notion, the von Neumann dimension of Hilbert spaces with group action, denoted by $\dim_\Gamma \mathcal{H}$, which gives dimensions relative to $L^2(\Gamma)$: that is, $\dim_\Gamma(L^2(\Gamma)) = 1$, and in general the values can be in $\mathbb{R}_{\ge 0}$.

Theorem 11.3 ([BekV97]). We have that
\[
\dim_\Gamma HD = \beta_1^{(2)}(\Gamma)
\]
for any Cayley graph of $\Gamma$, where $\beta_1^{(2)}(\Gamma)$ is the first Betti number of the $L^2$-cohomology of the group. In particular, $\dim_\Gamma HD$ does not depend on the Cayley graph.

We will see a probabilistic definition of $\beta_1^{(2)}(\Gamma)$ in the next section, giving a way of measuring the difference between free and wired currents.

11.2 Loop-erased random walk, uniform spanning trees and forests

For this section, [LyPer10, Chapters 4 and 10] and the fantastic [BLPS01] are our main references.

On a finite graph we can use the loop-erased random walk to produce, uniformly at random, a spanning tree with Wilson's algorithm: In a connected finite graph $G$, we choose an order of the vertices $x_0, x_1, \ldots, x_n$, then produce a path from $x_1$ to $x_0$ by starting a random walk at $x_1$ and stopping it when we hit $x_0$. If the random walk produces some loops, we erase them in the order of appearance. Then we start a walk from $x_2$ till we hit the path between $x_0$ and $x_1$, take the loop-erasure of it, and so on.

Theorem 11.4 (Wilson's algorithm). If $G$ is a finite graph, then Wilson's algorithm (defined above) samples a spanning tree with the uniform distribution (from now on, UST, uniform spanning tree).

One corollary to this beautiful algorithm (but it was proved first by Kirchhoff in 1847) is that edge marginals in the UST have random walk and electric network interpretations. Moreover, the joint probability for $k$ edges being in the UST is given by a $k \times k$ determinant involving currents. This is the Transfer-Current Theorem of Burton and Pemantle [BurtP93] that we will include in a later version of these notes.

In an infinite graph $G$, we can take an exhaustion $G_n$ of it. In each step of the exhaustion, we obtain a UST, denoted by $UST_n$. Using the electric interpretations, it can be shown that if $S \subseteq E(G_n)$, then we have $P[S \subseteq UST_n] \ge P[S \subseteq UST_{n+1}]$, hence the limit exists for all finite $S$. These limit probabilities form a consistent family, since any consistency condition will be satisfied in some large enough $G_n$, hence the Kolmogorov extension theorem gives us a measure, the free uniform spanning forest, denoted by FUSF. It cannot have cycles, but it is not necessarily connected, since a $UST_n$ restricted to a finite subgraph has these properties, as well. If we have two exhaustions, $G_n$ and $G'_n$, then for any $n$ there is an $m$ such that $G_n \subseteq G'_m$, and vice versa, hence the limit $\lim_n P[S \subseteq UST_n] = P[S \subseteq FUSF]$ does not depend on the exhaustion.

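To make the loop-erasure and the algorithm concrete, here is a minimal sketch in code; the adjacency-list representation and the small grid used as a test case are my own illustrative choices, and no attempt is made at efficiency:

```python
import random

def loop_erase(path):
    # Erase loops in order of appearance: whenever the walk revisits a vertex
    # already on the current self-avoiding path, cut the path back to it.
    erased, pos = [], {}
    for v in path:
        if v in pos:
            for u in erased[pos[v] + 1:]:
                del pos[u]
            erased = erased[:pos[v] + 1]
        else:
            pos[v] = len(erased)
            erased.append(v)
    return erased

def wilson_ust(vertices, nbrs, rng):
    # Wilson's algorithm: root at vertices[0]; repeatedly run a random walk
    # from the next vertex not yet in the tree until the walk hits the tree,
    # then add the loop-erasure of that walk as a new branch.
    in_tree = {vertices[0]}
    edges = set()
    for x in vertices[1:]:
        if x in in_tree:
            continue
        path = [x]
        while path[-1] not in in_tree:
            path.append(rng.choice(nbrs[path[-1]]))
        branch = loop_erase(path)
        edges.update(frozenset(e) for e in zip(branch, branch[1:]))
        in_tree.update(branch)
    return edges

# A 3x3 grid graph as a test case.
n = 3
verts = [(i, j) for i in range(n) for j in range(n)]
nbrs = {(i, j): [(i + di, j + dj) for di, dj in [(1, 0), (-1, 0), (0, 1), (0, -1)]
                 if 0 <= i + di < n and 0 <= j + dj < n] for (i, j) in verts}
tree = wilson_ust(verts, nbrs, random.Random(0))
assert len(tree) == len(verts) - 1   # a spanning tree has |V| - 1 edges
```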
This also implies that the FUSF of a Cayley graph has a group-invariant law: if we translate a finite edge-set $S$ by some $g \in \Gamma$, and want to show $P[S \subseteq \mathrm{FUSF}] = P[g(S) \subseteq \mathrm{FUSF}]$, we can just translate the entire exhaustion we used in the definition.

Another possible approach for a limit object could be that, at each step in the exhaustion $G_n \nearrow G$, we wire all the boundary vertices of $G_n$ into a single vertex, obtaining $G_n^*$. (We can keep or delete the resulting loops, this will not matter.) In this wired $G_n^*$ we pick a UST and call it $UST_n^*$. It can now be shown that if $S \subseteq E(G_n^*)$, then $P[S \subseteq UST_n^*] \le P[S \subseteq UST_{n+1}^*]$, hence we again have a limit, called WUSF: wired uniform spanning forest. This again does not depend on the exhaustion, and on a Cayley graph has a group-invariant law.

Since $G_n^*$ can be considered as having the same edge set as $G_n$, but fewer vertices, it is intuitively clear that $UST_n$ stochastically dominates $UST_n^*$, i.e., they can be coupled so that $UST_n \supseteq UST_n^*$. (This domination can be proved using electric networks again.) This implies that, in the limit, FUSF stochastically dominates WUSF. However, there seems to be no unique canonical coupling on the

finite level, hence there does not seem to be a unique coupling in the limit, and the above proof of the group invariance breaks down.

Exercise 11.3. *** Does there exist a group invariant monotone coupling between FUSF and WUSF? A recent result of R. Lyons and A. Thom is that this is the case for sofic groups (see Question 14.1 below). On the other hand, the following theorem shows that the situation in general is not simple.

Theorem 11.5 ([Mes10]). There are two $\mathrm{Aut}(G)$-invariant processes $F, G \subseteq E(G)$ on $G = T_3 \times \mathbb{Z}$ such that there exists a monotone coupling $F \subseteq G$ (i.e., $G$ stochastically dominates $F$), but there exists no invariant coupling $F \subseteq G$.

The group invariance of WUSF also follows from Wilson's algorithm rooted at infinity, which is also a reason for considering it to be a very natural object:

Theorem 11.6. On transient graphs, the WUSF can be generated by Wilson's algorithm rooted at infinity: Order the vertices of the network as $\{x_1, x_2, \ldots\}$. Start a loop-erased random walk at $x_1$. This walk will escape to infinity. Now start a loop-erased random walk at $x_2$. Either this second walk will go to infinity, or it will intersect the first walk at some point (possibly at $x_2$). If it intersects, end the walk there. Repeat ad infinitum.

No algorithm is known for generating the FUSF in a similar way, but there are special cases where the WUSF is the same as the FUSF, for example, on all amenable transitive graphs [Häg95], as follows (and see Theorem 11.8 below for the exact condition for equality):

Proposition 11.7. On any amenable transitive graph, FUSF = WUSF almost surely.

Proof. Both FUSF and WUSF consist of infinite trees only. For any such forest $F$, if $F_n$ is a Følner sequence, then the average degree of $F$ along $F_n$ tends to 2. Hence we also have $E \deg_{\mathrm{WUSF}}(x) = E \deg_{\mathrm{FUSF}}(x) = 2$ for all $x$. On the other hand, we have the stochastic domination. Hence, by Exercise 13.2 (b), we must have equality.

The edge marginals of WUSF are given by wired currents, while those of the FUSF are given by free currents. More generally, the Transfer-Current Theorem mentioned above implies that both the FUSF and the WUSF are determinantal processes. Now, since the difference between free and wired currents is given by the non-trivial harmonic Dirichlet functions, see Section 11.1, we get the following result, with credits going to [Doy88, Häg95, BLPS01, Lyo09]:

Theorem 11.8. WUSF = FUSF on a Cayley graph if and only if $HD = \mathbb{R}$. More quantitatively:
(1) $E \deg_{\mathrm{WUSF}}(x) = 2$.
(2) $E \deg_{\mathrm{FUSF}}(x) = 2 + 2\beta_1^{(2)}(\Gamma)$.

Due to Wilson's algorithm rooted at infinity, WUSF is much better understood than FUSF, as shown below by a few nice results. On the other hand, the connection of FUSF to $\beta_1^{(2)}(\Gamma)$ makes the free one more interesting. For instance:

Exercise 11.4 (Gaboriau). *** Let Γ be a group, and ε > 0 be given. Does there exist an invariant percolation P with edge marginal at most ε such that FUSF ∪ P is connected? If the answer is yes, it would imply that cost(Γ) = 1 + β_1^{(2)}(Γ), see Section 13.4 below.

The corresponding question for WUSF has the following answer:

Theorem 11.9 ([BLPS01]). WUSF ∪ Ber(ε) is connected for all ε > 0 if and only if Γ is amenable.

Wilson's algorithm has the following important, but, due to the loop erasures, not at all immediate, corollary:

Theorem 11.10 ([LyPS03]). Let G be any network. The WUSF is a single tree a.s. iff two independent random walks started at any different states intersect with probability 1.

More generally, we have the following beautiful result:

Theorem 11.11 ([BenKPS04]). Consider USF = WUSF = FUSF on Z^d. If d ≤ 4, then the USF is a single tree. If 4 < d ≤ 8, then the USF contains infinitely many spanning trees, but all are neighbours, in the sense that given any two of these trees, there is a vertex in the first and a vertex in the second which are neighbours. If 8 < d ≤ 12, then the USF contains infinitely many spanning trees; it may be the case that there exist two such trees which are not neighbours, but they will have a common neighbour. In general, if 4n < d ≤ 4(n + 1), then the USF contains infinitely many spanning trees, and given any two such trees, there is a chain of trees, each the neighbour of the next, connecting the first to the second, with at most (n − 1) trees in the chain, not including the original two trees.
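Theorem 11.6 is the infinite-volume version of Wilson's algorithm. On a finite connected graph, the same scheme of successive loop-erased random walks, rooted at a fixed vertex, samples a uniform spanning tree. A minimal sketch (the helper name and the 4x4 grid example are our own choices, not from the notes), using the standard last-exit implementation of the loop erasure:

```python
import random

def wilson_ust(vertices, neighbors, root, rng=random):
    """Sample a uniform spanning tree by Wilson's algorithm:
    successive loop-erased random walks, each stopped on hitting
    the tree built so far."""
    in_tree = {root}
    parent = {}
    for start in vertices:
        if start in in_tree:
            continue
        # Random walk from `start`; overwriting path[v] on revisits
        # implements the loop erasure (last-exit decomposition).
        v, path = start, {}
        while v not in in_tree:
            path[v] = rng.choice(neighbors[v])
            v = path[v]
        # Retrace the loop-erased branch and attach it to the tree.
        v = start
        while v not in in_tree:
            parent[v] = path[v]
            in_tree.add(v)
            v = path[v]
    return parent  # parent pointers toward the root

n = 4  # a 4x4 grid as a toy finite network
V = [(i, j) for i in range(n) for j in range(n)]
nbrs = {(i, j): [(i + di, j + dj) for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= i + di < n and 0 <= j + dj < n] for (i, j) in V}
tree = wilson_ust(V, nbrs, root=(0, 0), rng=random.Random(7))
print(len(tree))  # 15 parent edges: a spanning tree of 16 vertices
```

Running the same scheme with walks stopped "at infinity" instead of at a fixed root is exactly the WUSF sampler of Theorem 11.6.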
12 Percolation theory

Percolation theory discusses the connected components (clusters) in random graphs. Frequently, we construct such a graph by taking a non-random graph, and erasing some of the edges based on some probability model. One example of this is Bernoulli(p) bond percolation, where each edge of a graph is erased (gets closed) with probability 1 − p, and kept (remains open) with probability p. Another version is site percolation, where we keep and delete vertices, and look at the components induced by the kept vertices. We will usually consider bond percolation, but the results are always very similar in the two cases. (In fact, bond percolation is just site percolation on the line graph of the original graph, hence site percolation is more general.) We will usually denote the Bernoulli distribution by Ber(p). Bernoulli percolation on Z^d is a classical subject, one of the main examples of statistical mechanics, see [Gri99]. Percolation in the plane is a key example in modern probability, the study of critical phenomena, see [Wer07]. Percolation on groups beyond Z^d was initiated by Benjamini and Schramm in 1996 [BenS96c], for which the standard reference is [LyPer10]. In a slightly different direction, Ber(p) bond percolation on the complete graph K_n on n vertices is the classical Erdős-Rényi (1960) model G(n, p) of probabilistic combinatorics, see [AloS00].
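For a concrete feel of the model, here is a small simulation sketch (our own, not from the notes): Ber(p) bond percolation on a finite grid, with cluster sizes read off by union-find.

```python
import random
from collections import Counter

def cluster_sizes(n, p, seed=0):
    """Ber(p) bond percolation on the n x n box of Z^2: keep each edge
    independently with probability p, return sorted cluster sizes."""
    rng = random.Random(seed)
    parent = {(i, j): (i, j) for i in range(n) for j in range(n)}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for (i, j) in list(parent):
        for v in ((i + 1, j), (i, j + 1)):   # the right and up edges
            if v in parent and rng.random() < p:
                parent[find((i, j))] = find(v)   # merge the two clusters

    return sorted(Counter(find(v) for v in parent).values())

# The largest cluster below vs. above pc(Z^2, bond) = 1/2:
print(cluster_sizes(40, 0.3)[-1], cluster_sizes(40, 0.7)[-1])
```

Already on a 40 x 40 box the contrast between the subcritical and supercritical largest clusters is striking.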


12.1 Percolation on infinite groups: pc, pu, unimodularity, and general invariant percolations

The most important feature of Bernoulli percolation on most infinite graphs is a simple phase transition: the existence of some critical pc ∈ (0, 1), above which there is an infinite connected component, and below which there is not. The precise definition is as follows:

Definition 12.1. Consider Ber(p) percolation on any infinite connected graph G(V, E). Define

pc := inf{ p : Pp[ ∃ an infinite cluster ] = 1 }.

We can also write pc = inf{ p : θ_x(p) > 0 ∀x ∈ V }, where θ_x(p) := Pp[ |C(x)| = ∞ ] and C(x) is the cluster of the vertex x.

It is not quite obvious that these two definitions are equivalent to each other, though pc(Z) = 1 is clear with either one, for instance. We will later see that for non-1-dimensional graphs one expects pc ∈ (0, 1), and will prove, among other things, that pc(T_{d+1}, bond) = pc(T_{d+1}, site) = 1/d and pc(Z^2, bond) = 1/2. But first some explanations are in order.

First of all, is { ∃ an infinite cluster } really an event, i.e., a Borel measurable subset of {0, 1}^{E(G)} with the product topology? Yes, since it is equal to ⋃_{x∈V} ⋂_{n≥1} { x ↔ ∂B_n(x) }. Then, is it obvious that Pp[ ∃ an infinite cluster ] is monotone in p? Well, a simple proof is by the standard coupling of all Ber(p) percolation measures for p ∈ [0, 1]: to each edge e (in the case of bond percolation) assign an independent U(e) ∼ Unif[0, 1] variable, and then ω_p := { e ∈ E(G) : U(e) ≤ p } is a Ber(p) bond percolation configuration for each p, while ω_p ⊆ ω_{p′} for p ≤ p′. In this coupling, { ∃∞ cluster in ω_p } ⊆ { ∃∞ cluster in ω_{p′} }, hence the probability is monotone. Moreover, whether or not an infinite cluster exists does not depend on the states of any finite set of edges, so this is a tail event. As a result, Kolmogorov's 0-1 law (see Theorem 9.15) implies that Pp[ ∃ an infinite cluster ] is either 0 or 1, and so the first definition makes perfect sense.

Now, if the probability that an infinite cluster exists is 0 for some p, then the probability that any given x is part of an infinite cluster must also be 0. For the other direction in the equivalence of the two definitions, it is clear that if there is an infinite cluster with positive probability, then there must be some x with Pp[ |C(x)| = ∞ ] > 0 (by the simplest union bound), but why does this happen for all x at the same p values in a non-transitive graph? We are going to show this in three different ways, just to introduce some basic techniques that will often be useful, but, before that, here is an exercise clarifying what the measurability of having an infinite cluster means:

Exercise 12.1. Let G(V, E) be any bounded degree infinite graph, and S_n ↗ V an exhaustion by finite connected subsets. Is it true that, for p > pc(G), we have

lim_{n→∞} Pp[ the largest cluster for percolation inside S_n is the subset of an infinite cluster ] = 1 ?

First proof of Pp[ ∃∞ cluster ] = 1 implying Pp[ |C_x| = ∞ ] > 0 for any x ∈ V(G). On any connected graph, the events E_n := { B_n(x) intersects an infinite cluster } increase to { ∃∞ cluster } as n → ∞, hence Pp[ ∃∞ cluster ] = 1 implies that there exists some n = n(x) such that

Pp[ ∂B_n(x) ↔ ∞ using only edges outside B_n(x) ] > 0.

On the other hand, we also have

Pp[ x ↔ y within B_n(x) for all y ∈ ∂B_n(x) ] > 0,

since this requires just finitely many bits to be open. These two events of positive probability depend on different bits, hence are independent of each other, hence their intersection also has positive probability. Since the intersection implies { x ↔ ∞ }, we are done.

Second proof. Bernoulli percolation has a basic property called finite energy or insertion and deletion tolerance. For the case of bond percolation, for any event A and e ∈ E(G), let

A ∪ {e} := { ω ∪ {e} : ω ∈ A },  A ∖ {e} := { ω ∖ {e} : ω ∈ A }.

Then the property is that P[ A ] > 0 implies that P[ A ∪ {e} ] > 0 and P[ A ∖ {e} ] > 0, as well, for any e ∈ E(G). In fact, for Ber(p) percolation, we have

Pp[ A ∪ {e} ] ≥ p Pp[ A ]  and  Pp[ A ∖ {e} ] ≥ (1 − p) Pp[ A ].

Why? First of all, we can focus only on finite cylinder events A = { ω : ω(e_i) = 1, i = 1, . . . , k, ω(f_j) = 0, j = 1, . . . , ℓ }, since any measurable event can be approximated by a finite union of such events. And then, Pp[ A ∪ {e} ] = Pp[ A ] if e ∈ {e_i}, Pp[ A ∪ {e} ] = p/(1 − p) · Pp[ A ] if e ∈ {f_j}, and Pp[ A ∪ {e} ] = p Pp[ A ] if e ∉ {e_i} ∪ {f_j}. Similarly, Pp[ A ∖ {e} ] ≥ (1 − p) Pp[ A ].
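The standard monotone coupling used above is easy to realize concretely; a minimal sketch (the path of edges and the seed are our own choices):

```python
import random

# One Unif[0,1] label per edge; omega_p keeps the edges with U(e) <= p.
# Then p <= p' forces omega_p to be a subset of omega_p', so any
# increasing event becomes more likely as p grows.
rng = random.Random(2024)
edges = [(i, i + 1) for i in range(1000)]
U = {e: rng.random() for e in edges}

def omega(p):
    return {e for e in edges if U[e] <= p}

assert omega(0.2) <= omega(0.5) <= omega(0.8)   # nested configurations
print(len(omega(0.2)), len(omega(0.5)), len(omega(0.8)))
```

The nesting of the configurations is deterministic in this coupling; only the labels U(e) are random.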
Why is this property called finite energy? One often wants to look at not just Bernoulli percolation, but more general percolation processes: a random subset of edges or vertices. For instance, the Uniform Spanning Tree and Forests from Section 11.2 are bond percolation processes, while the Ising model of magnetization, where the states of the vertices (spins) have a tendency to agree with their neighbours (with a stronger tendency if the temperature is lower), is a site percolation process; see Section 13.1. Such models are often defined by assigning some energy to each (finite) configuration, then giving larger probabilities to smaller energy configurations in some way (typically using a negative exponential of the energy). Then the finite energy property is that any finite modification of a configuration changes the energy by some finite additive constant, and hence the probability by a positive factor. If changing one edge or site changes the probability by a uniformly positive factor, independently of A, then the model has uniformly finite energy or uniform deletion/insertion tolerance. Bernoulli percolation and the Ising model are such examples, while the UST is neither deletion nor insertion tolerant.

We can now easily establish that if Pp[ |C(x)| = ∞ ] > 0 holds at some p for some x ∈ V(G), then the same holds for any y ∈ V(G) in place of x. Take a finite path γ between x and y, and then Pp[ y ↔ ∞ ] ≥ Pp[ γ is open, x ↔ ∞ ] ≥ p^{|γ|} Pp[ x ↔ ∞ ], by repeated applications of the insertion tolerance shown above.

Third proof. Given a partially ordered set Ω, we say that an event A ⊆ Ω is increasing if 1_A(ω_1) ≤ 1_A(ω_2) whenever ω_1 ≤ ω_2, where 1_A is the indicator of the event A. For bond percolation on G, the natural partially ordered set is, of course, the set of all subsets of E(G) ordered by inclusion. In other words, Ω = {0, 1}^{E(G)} with coordinate-wise ordering, where ω(e) = 1 represents the edge e being kept.

Theorem 12.1 (Harris-FKG inequality). Increasing events in the Bernoulli product measure are positively correlated: P[ A ∩ B ] ≥ P[ A ] P[ B ], or E[ fg ] ≥ E[ f ] E[ g ] if f and g are increasing functions with finite second moments.

Before proving this, let us go back to the equivalence in Definition 12.1 as an example: for any x, y ∈ V(G), the events { x ↔ y } and { y ↔ ∞ } are both monotone increasing, hence, by the Harris inequality,

Pp[ y ↔ ∞ ] ≥ Pp[ x ↔ y, x ↔ ∞ ] ≥ Pp[ x ↔ y ] Pp[ x ↔ ∞ ],

and we are done again.

Theorem 12.1 is in fact classical for the case when we have only one bit. Even more generally than just the Ber(p) measure on {0, 1}, one can consider two increasing functions on R with any probability measure μ, and then the statement is one of Chebyshev's inequalities:

∫_R f(x) g(x) dμ(x) ≥ ( ∫_R f(x) dμ(x) ) ( ∫_R g(x) dμ(x) ).   (12.1)

The proof of this is very easy:

∫_R ∫_R [f(x) − f(y)] [g(x) − g(y)] dμ(x) dμ(y) ≥ 0,

since the integrand is always nonnegative, and then expanding and rearranging gives (12.1). The general case, namely a product probability measure μ_1 ⊗ · · · ⊗ μ_d on R^d, can be proved by induction, coordinate-by-coordinate. Prove it yourself, or see [LyPer10, Section 7.2].

The explanation for the name Harris-FKG is that Harris proved it for product measures (just what we stated) in [Har60], a fundamental paper for percolation theory. But there is a generalization of Theorem 12.1 for dependent percolation processes, where the energy of a configuration is given by certain nice local functions favouring agreement: this is the FKG inequality, due to Fortuin, Kasteleyn and Ginibre (1971). For instance, the Ising model does satisfy this inequality. The proof of this must be very different from what is sketched above, since the measure does not have the product structure anymore. Indeed, we will do this in Theorem 13.1, via a beautiful Markov chain coupling argument. For now, see [Wik10b] for a very short introduction to the general FKG inequality, and [GeHM01] for a proper one. At the opposite end from the FKG inequality, in the UST, as usual for determinantal measures, increasing events supported on disjoint sets are negatively correlated [LyPer10, Section 4.2].

When looking at general percolation processes on groups and transitive graphs G, one usually wants that it is an invariant percolation process: a random subset of edges or vertices whose

law is invariant under the automorphism group of G. How does the automorphism group act on configurations and events? Functions can always be pulled back, even via noninvertible maps of the underlying spaces: for any γ ∈ Aut(G) and x ∈ V(G), we let ω^γ(x) := ω(γ(x)). Note that if δ is another automorphism, then (ω^γ)^δ(x) = ω^γ(δ(x)) = ω(γ(δ(x))) = ω^{γδ}(x), hence if Aut(G) acts from the left on G, then it acts from the right on configurations. An event A can be considered as a set of configurations, hence A^γ = { ω^γ : ω ∈ A }, and Aut(G) acts from the right again. For the event A = { |C_x| > 100 }, we have A^γ = { |C_{γ^{-1}(x)}| > 100 }, which is a different event, but in an invariant process its probability is the same. For the event B = { ∃x : |C_x| > 100 }, we have B^γ = B, hence this is an invariant event.

Besides Bernoulli percolation, the Uniform Spanning Forests from Section 11.2 and the Ising model from Section 13.1 are examples of invariant percolations. But studying percolation in this generality also turns out to be useful for Bernoulli percolation itself, as we will see later on. A possible basic property of invariant percolation processes is the following:

Lemma 12.2. Ber(p) bond (or site) percolation on any infinite transitive graph G is ergodic: any invariant event has probability 0 or 1. In fact, instead of transitivity, it is enough that there is an edge with an infinite orbit (and then it is easy to see that every edge has an infinite orbit).
Proof. Any measurable event A can be approximated by a finite union of finite cylinder events, i.e., for any ε > 0 there is an event A_ε depending only on finitely many coordinates such that Pp[ A △ A_ε ] < ε. Then, there exists a large enough γ ∈ Aut(G) such that the supports of A_ε and (A_ε)^γ are disjoint, hence Pp[ A_ε ∩ (A_ε)^γ ] = Pp[ A_ε ]². On the other hand, Pp[ A ∩ A^γ ] = Pp[ A ] by the invariance of A. Altogether, by choosing ε small enough, Pp[ A ] can be arbitrarily well approximated by Pp[ A ]², hence Pp[ A ] ∈ {0, 1}.

There are natural invariant percolations that are not ergodic: a trivial example is taking the empty set or the full vertex set with probability 1/2 each; a less trivial one is the Ising model on Z² at subcritical temperature with a free boundary condition, where there is more than one phase, see again Section 13.1. We will be concerned here mainly with ergodic measures, although see, e.g., Theorem 12.15 below. Most results and conjectures below will concern percolation on transitive graphs, but let us point out that there are always natural modifications (with almost identical proofs) for quasi-transitive graphs, i.e., when V(G) has finitely many orbits under Aut(G).

Here is a basic application of ergodicity and insertion tolerance:

Lemma 12.3. In any ergodic insertion tolerant invariant percolation process on any infinite transitive graph, the number of infinite clusters is an almost sure constant, namely 0, 1, or ∞.

Proof. For any k ∈ {0, 1, 2, . . . , ∞}, the event {the number of infinite clusters is k} is translation invariant, hence it has probability 0 or 1 by ergodicity. I.e., the number of infinite clusters is an almost sure constant. Now, assume that this constant is 1 < k < ∞. By the measurability of the number of clusters, for any c < 1 there exists an integer r such that the probability that the ball

B_r(o) intersects at least two infinite clusters is at least c. But then, by insertion tolerance, we can change everything in B_r(o) to open, resulting in an event with positive probability, on which the number of infinite clusters is at most k − 1: a contradiction.

Now, back to the question of pc ∈ (0, 1), it is easy to prove that 1/3 ≤ pc(Z²) ≤ 2/3. For the lower bound, note that if C(o) is infinite, then it contains an infinite self-avoiding path starting from o, while the number of such paths of length n in Z² is at most 4 · 3^{n−1}, hence

Pp[ ∃ open self-avoiding path of length n ] ≤ Ep[ number of open self-avoiding paths of length n ] ≤ 4 (3p)^n ≤ exp(−cn)

for p < 1/3, which implies by Borel-Cantelli that there is no infinite cluster almost surely. For the upper bound, we count closed circuits in the dual graph in a similar way (i.e., circuits formed by dual edges that correspond to primal edges that were closed in the percolation):

Pp[ ∃ closed self-avoiding dual circuit of length n surrounding o ] ≤ n · 4 (3(1 − p))^n ≤ exp(−cn)

for p > 2/3, where the factor n comes from the fact that any such dual circuit must intersect the segment [0, n] on the real axis of the plane. This is summable in n, hence, for N large enough, with positive probability there is no closed dual circuit longer than N surrounding the origin. This implies that B_{N/8}(o) must be connected to infinity. (Proof: If the union U of the open clusters intersecting B_{N/8}(o) is finite, then take its exterior edge boundary ∂_E^{out} U, i.e., the boundary edges with an endpoint in the infinite component of Z² ∖ U. The edges dual to these edges contain a closed circuit around o, with length larger than N.) I.e., there is an infinite cluster with positive probability. (This counting of dual circuits and using the first moment method is called the Peierls argument.)

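The first-moment counting in the Peierls argument can be checked by brute force for small n; a short sketch (the `count_saws` helper is our own):

```python
def count_saws(n):
    """Number of self-avoiding walks of length n in Z^2 from the origin."""
    def extend(path, left):
        if left == 0:
            return 1
        x, y = path[-1]
        return sum(extend(path + [s], left - 1)
                   for s in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
                   if s not in path)
    return extend([(0, 0)], n)

for n in range(1, 7):
    print(n, count_saws(n), 4 * 3 ** (n - 1))
# the counts 4, 12, 36, 100, 284, 780 stay below the bound 4 * 3^(n-1)
```

The bound is sharp for n ≤ 3 (no backtracking possible yet) and then the self-avoidance constraint starts to bite.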
Figure 12.1: Counting primal self-avoiding paths and dual circuits.

For other graphs, the lower bound generalizes easily: if G has maximal degree d, then pc(G) ≥ 1/(d − 1). However, the upper bound relied on planar duality, hence it is certainly less robust. The straightforward generalization that does hold is the following:

Exercise 12.2. Show that if in a graph G the number of minimal edge cutsets (a subset of edges whose removal disconnects a given vertex from infinity, minimal w.r.t. containment) of size n is at most exp(Cn) for some C < ∞, then pc(G) ≤ 1 − ε(C) < 1. Show that Z^d, d ≥ 2, has such an exponential bound. In particular, pc(Z^d) < 1, although we know that already from Z² ⊂ Z^d.

Conjecture 12.4 ([BenS96c]). If G is a non-one-dimensional graph, i.e., it satisfies an isoperimetric inequality IP_{1+ε} for some ε > 0 (defined in Subsection 5.1), then pc < 1.

This has been verified in many cases, in fact, for all known groups. We start with Cayley graphs of finitely presented groups.

Let G(V, E) be a bounded degree graph, and let ∂G be the set of its ends, see Section 3.1. Let CutCon(G) be the cutset-connectivity of G: the smallest t ∈ Z+ such that any minimal edge cutset Π between any two elements of V(G) ∪ ∂G is t-connected, in the sense that in any non-trivial partition of it into two subsets, Π = Π_1 ∪ Π_2, there are e_i ∈ Π_i whose distance in G is at most t. For instance, CutCon(Z²) = 1, CutCon(hexagonal lattice) = 2, and CutCon(T_d) = 1 for all d, despite the fact that cutsets separating a vertex from infinity (i.e., from the set of all ends of T_d) do not have bounded connectivity. We are going to prove that if Γ is a finitely presented group, then any finitely generated Cayley graph G(Γ, S) has CutCon(G) < ∞; the first proof [BabB99] used cohomology groups, but there is a few-line linear algebra proof by Ádám Timár [Tim07], which we now present. As we will see, this is closely related to having an exponential bound on the number of minimal cutsets. Timár also proved that CutCon(G) < ∞ and having an exponential bound are both quasi-isometry invariants.

Proposition 12.5. Note that each cycle in G can be viewed as a configuration of edges, i.e., an element of {0, 1}^{E(G)}, or even as an element of the vector space F_2^{E(G)}. The cycle space of G over F_2 is then the linear subspace spanned by all the cycles. Assume that the cycles of length at most t generate the entire cycle space of G. (This is obviously the case if Γ is finitely presented.) Then CutCon(G) ≤ t/2.

Proof. Let x, y ∈ V(G) ∪ ∂G and Π = Π_1 ∪ Π_2 a minimal edge cutset separating them, with a non-trivial partition. Note that Π must be finite. If x (or y) is an end, define x′ (y′) to be a vertex such that there is a path between x and x′ (y and y′) in G ∖ Π. Otherwise, let x′ := x (y′ := y). Because of the minimality of Π, there is a path P_i between x′ and y′ that avoids Π_{3−i}, for i = 1, 2. Now look at P_1 + P_2 ∈ F_2^{E(G)}. This is clearly in the cycle space, so we can write

P_1 + P_2 = Σ_{c∈K} c

for some set K of cycles of length at most t. Let K_1 ⊆ K be the subset of cycles that are disjoint from Π_2, and write

σ := P_1 + Σ_{c∈K_1} c = P_2 + Σ_{c∈K∖K_1} c.

We see from the first sum that this σ ∈ F_2^{E(G)} is disjoint from Π_2. But the only odd degree vertices in it are x′ and y′, so it must contain a path from x′ to y′. That path must intersect Π, but is disjoint from Π_2, hence it intersects Π_1. But in the second sum, P_2 is disjoint from Π_1, so there must be some cycle in K ∖ K_1 that intersects Π_1. It also intersects Π_2, and its length is at most t, which proves the claim.
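The F_2 arithmetic used in the proof is just symmetric difference of edge sets. A toy check (our own illustration) that two adjacent unit squares of Z² sum to the outer 6-cycle, the way short cycles generate longer ones:

```python
# An edge is the frozenset of its two endpoints; a cycle is a set of edges.
def edge(u, v):
    return frozenset((u, v))

square1 = {edge((0, 0), (1, 0)), edge((1, 0), (1, 1)),
           edge((1, 1), (0, 1)), edge((0, 1), (0, 0))}
square2 = {edge((1, 0), (2, 0)), edge((2, 0), (2, 1)),
           edge((2, 1), (1, 1)), edge((1, 1), (1, 0))}

# Addition over F_2 is symmetric difference: the shared edge cancels.
outer = square1 ^ square2
print(len(outer))  # 6: the boundary cycle of the 2x1 rectangle
```

The same cancellation is what makes the sums σ in the proof well defined edge configurations.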
Now, if Γ is a finitely presented group with one end, then the proposition says that minimal cutsets between a vertex o and infinity have bounded connectivity. This implies that once we fix an edge in a minimal cutset of size n, there are only exponentially many possibilities for the cutset. So, if we show that there is a set S_n of edges with size at most exponential in n such that each such cutset must intersect it, then we get an exponential upper bound on the number of such cutsets, and hence, by Exercise 12.2, pc(G) < 1.

We claim that the ball B_{An}(o) is a suitable choice for S_n for A > 0 large enough. If Γ has at least quadratic volume growth, then for any finite set K ⊂ V(G) the Coulhon-Saloff-Coste isoperimetric inequality, Theorem 5.7, says that |∂_E K| ≥ c √|K|. Therefore, if the component of o in G ∖ Π contains B_{An}(o), then |Π| ≥ c √|B_{An}(o)| ≥ c′ An. So, by choosing A large enough, a cutset of size n around o must intersect B_{An}(o), as desired. And Γ must have volume growth at least quadratic: if it has polynomial growth, then by Gromov's Theorem 10.1 it is almost nilpotent, hence it has an integer growth rate that cannot be 1, because then Γ would be a finite extension of Z and it would have two ends.

Now, if an infinite group does not have 1 or 2 ends, then it has continuum many, see Exercise 3.4. But in this case it is non-amenable by Exercise 5.3. And, for any non-amenable graph G, we have pc(G) ≤ 1/(h + 1) < 1, even without finite presentedness or even transitivity, as proved by [BenS96c] (or see [LyPer10, Theorem 6.24]):

Proposition 12.6. For bond percolation on any graph G with edge Cheeger constant h > 0, we have pc(G) ≤ 1/(h + 1) < 1. Moreover, for any p > 1/(h + 1), we have Pp[ n < |C(o)| < ∞ ] < exp(−cn) for some c = c(p) > 0.

Proof. Fix an arbitrary ordering of the edges, E(G) = {e_1, e_2, . . . }. Explore the cluster of a fixed vertex o by taking the first e_i with an endpoint in o, examining its state, extending the cluster of o by this edge if it is open, then taking the first unexamined e_i with one endpoint in the current cluster of o and the other endpoint outside, and so on. If the full C(o) is finite, then this process stops after exploring an open spanning tree of the cluster plus its closed boundary and possibly other closed edges between vertices in the spanning tree. If |C(o)| = n, then we have found n − 1 open edges and at least |∂_E C(o)| ≥ hn closed edges in this process of examining i.i.d. Ber(p) variables. But if p > 1/(h + 1), then this is exponentially unlikely, by a standard large deviation estimate, say, Proposition 1.6. Furthermore, with positive probability there is no such n, just like a biased random walk on Z might never get back to the origin.

It is not very hard to show (but needs some stochastic domination techniques we have not discussed yet) that pc(G(Γ, S)) < 1 is independent of the generating set. Moreover, it is invariant under quasi-isometries. It is also known [LyPer10, Theorem 7.24] that any group of exponential growth has pc < 1; see Exercise 12.4 below for an example. On the other hand, recall that groups of polynomial growth are all almost nilpotent and hence finitely presented, see Exercise 4.2 and Section 10.1. This means that among groups, Conjecture 12.4 remains open only for groups of intermediate growth. However,

all known examples of such groups, see Section 15.1, have Cayley graphs containing a copy of Z², hence the conjecture holds also for them [MuP01]. The general Conjecture 12.4 is also known to hold for planar graphs of polynomial growth that have an embedding into the plane without vertex accumulation points [Koz07].

Exercise 12.3. * Show that the Cayley graph of the lamplighter group Γ = Z_2 ≀ Z with generating set S = {R, Rs, L, sL} is the Diestel-Leader graph DL(2, 2), see [LyPer10, Example 7.2], and satisfies CutCon(G(Γ, S)) = ∞.

Exercise 12.4. Show that the lamplighter group with generating set S = {L, R, s} has pc(G(Γ, S)) < 1, by finding a subgraph in it isomorphic to the so-called Fibonacci tree F: a directed universal cover of the directed graph with vertices {1, 2} and edges {(12), (21), (22)}. (There are two directed covers, with root either 1 or 2.) Find the exponential volume growth of F, lim inf_n (log |B_n|)/n, and just quote the result (due to Russ Lyons) that this being positive implies pc(F) < 1.

As may be guessed from the previous exercise, percolation on general trees is quite well understood, largely due to the work of Russ Lyons. Of course, the simplest case is percolation on a regular tree, which turns out to be a special case of a classical object: the cluster of a fixed vertex in Ber(p) percolation on a (k+1)-regular tree T_{k+1} is basically a Galton-Watson process with offspring distribution ξ = Binom(k, p). A usual GW process is a random tree where we start with a root in the zeroth generation, then each individual in the nth generation gives birth to an independent number of offspring with distribution ξ, together giving the (n+1)th generation. For Ber(p) percolation on T_{k+1}, the root has a special offspring distribution Binom(k+1, p). However, this difference does not affect questions like the existence of infinite clusters. So, to find pc(T_{k+1}), it is enough to understand when a GW process survives with positive probability.

A standard method for this is the following. Consider the probability generating function of the offspring distribution,

f(s) := E[ s^ξ ] = Σ_{k≥0} P[ ξ = k ] s^k.   (12.2)

Notice that if Z_n is the size of the nth generation, then E[ s^{Z_n} ] = f^n(s), the n-fold iterate of f, therefore we have

q := P[ extinction ] = lim_{n→∞} P[ Z_n = 0 ] = lim_{n→∞} f^n(0).   (12.3)

Assuming that P[ ξ = 1 ] ≠ 1, the function f(s) is strictly increasing and strictly convex on [0, 1], so (12.3) easily implies that q is the smallest root of s = f(s) in [0, 1]. (Just draw a figure of how f(s) may look and what the iteration f^n(0) does!)

Exercise 12.5. Using the above considerations, show that if E[ ξ ] ≤ 1 but P[ ξ = 1 ] ≠ 1, then the GW process almost surely dies out. Deduce that pc(T_{k+1}) = 1/k and θ(pc) = 0.

A quite different strategy is the following:
Exercise 12.6. *
(a) Consider a GW process with offspring distribution ξ, E ξ = μ. Let Z_n be the size of the nth level, with Z_0 = 1, the root. Show that Z_n/μ^n is a martingale, and using this, assuming P[ ξ = 1 ] ≠ 1, that μ ≤ 1 implies that the GW process dies out almost surely.
(b) On the other hand, if μ > 1 and E[ ξ² ] < ∞, then E[ Z_n² ] ≤ C (E Z_n)², and by the Second Moment Method (if X ≥ 0 a.s., then P[ X > 0 ] ≥ (E X)²/E[ X² ]; prove this), deduce that the GW process survives with positive probability.

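The characterization of q as the smallest fixed point of f in (12.3) is easy to check numerically; a sketch for ξ = Binom(k, p), matching Ber(p) percolation on T_{k+1} (the helper name and parameter values are our own choices):

```python
from math import comb

def extinction_prob(k, p, iters=200):
    """Iterate f(s) = E[s^xi] for xi = Binom(k, p), starting from 0;
    by (12.3) the limit is the extinction probability q."""
    def f(s):
        return sum(comb(k, j) * p**j * (1 - p)**(k - j) * s**j
                   for j in range(k + 1))
    s = 0.0
    for _ in range(iters):
        s = f(s)
    return s

# On T_3 (k = 2), survival kicks in exactly at p = 1/k = 1/2:
print(round(extinction_prob(2, 0.4), 4))  # 1.0: subcritical, dies out
print(round(extinction_prob(2, 0.7), 4))  # 0.1837: survives with prob 1 - q
```

For k = 2, p = 0.7, the fixed points of s = (0.3 + 0.7s)² are 9/49 and 1, and the iteration indeed converges to the smaller one.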
Let us describe yet another strategy, which is robust enough to be used in different finite random graph models (see, e.g., [vdHof13]). Consider the following exploration process of any rooted tree, with the children of each vertex being ordered. During the process, vertices will be active, inactive, or neutral, and active vertices will be ordered. In the 0th step, start with the root as the only active vertex; all other vertices are neutral. In the (i+1)th step, examine the children of the first vertex v in the active list after the ith step, put these children at the beginning of the active list, and turn v inactive. If the tree is finite, the process will end up with all vertices being inactive; if the tree is infinite, then the process will run forever, with the vertices of the first infinite ray all being put in the active list. See Figure 12.2.

Figure 12.2: On the left, the shades of vertex colours show the order in which the sets of children are put into the active list. The labels on the vertices show the order in which they turn inactive. On the right, the steps of the walk are indexed by the vertices as they turn inactive, while the height shows the current number of active vertices.

The above exploration process is well-suited to construct a subcritical GW tree fully or a supercritical GW tree partially. Indeed, let {X_i : i ≥ 1} be a sequence of i.i.d. variables from the offspring distribution ξ, which will be the number of children being put into the active list at each step, let S_0 = 1, and then S_{i+1} = S_i + X_{i+1} − 1 is the size of the active list after the (i+1)th step. This is a random walk with i.i.d. increments distributed as ξ − 1. If E ξ ≤ 1 but P[ ξ = 1 ] ≠ 1, then the walk is either recurrent or has a negative drift, hence S_n = 0 will eventually happen almost surely. If E ξ > 1, then the walk has a positive drift, hence with positive probability will go to infinity without ever reaching zero. Hence we have reached the same conclusion as in Exercises 12.5 and 12.6.

For general trees T, Lyons defined an average branching number br(T), also measuring the

Hausdorff dimension of the boundary of the tree with a natural metric, and showed that pc(T) = 1/br(T). He has also found very close connections between percolation and random walks on T; see, e.g., [Lyo92], and, of course, [LyPer10].

A main conjecture in percolation theory is that the critical behaviour θ(pc) = 0 should hold in general:

Conjecture 12.7 ([BenS96c]). On any transitive graph with pc < 1, θ(pc) = 0.

(It can be shown that the only possible discontinuity of θ(p) is at p = pc, hence the conjecture says that θ(p) is continuous everywhere.) That transitivity is needed can be seen from the case of general trees:

Exercise 12.7. Consider a spherically symmetric tree T where each vertex on the nth level T_n has d_n ∈ {k, k+1} children, such that lim_n |T_n|^{1/n} = k, but Σ_{n≥0} k^n/|T_n| < ∞. Using the second moment method, show that pc = 1/k and θ(pc) > 0.

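The reduction of the exploration process (described before Figure 12.2) to a random walk with increments ξ − 1 is easy to test by simulation; a sketch with Poisson offspring (the distribution choice and helper names are our own, not the notes'):

```python
import random
from math import exp

def poisson(lam, rng):
    # Knuth's inversion sampler; fine for small lam
    L, k, prod = exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod < L:
            return k
        k += 1

def survives(mean, max_steps=10_000, seed=0):
    """Exploration walk S_{i+1} = S_i + X_{i+1} - 1 with X ~ Poisson(mean):
    returns False if the active list ever empties (extinction)."""
    rng = random.Random(seed)
    S = 1
    for _ in range(max_steps):
        S += poisson(mean, rng) - 1
        if S == 0:
            return False
    return True  # positive drift: still alive after many steps

sub = sum(survives(0.8, seed=i) for i in range(20))
sup = sum(survives(1.5, seed=i) for i in range(20))
print(sub, sup)  # subcritical runs all die out; several supercritical survive
```

With mean 0.8 the walk has negative drift and every run hits zero quickly; with mean 1.5 a positive fraction of runs escapes to infinity, matching Exercises 12.5 and 12.6.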
The main reasons to conjecture continuity at pc are the following: (1) it is known to hold in the extreme cases (Z² and regular trees) and in some other important examples, as we will discuss below; (2) computer simulations show it on Euclidean lattices (Z³ being one of the most famous problems of statistical physics); (3) simple models of statistical physics tend to have a continuous phase transition. (In physics language, the phase transition is of second order, i.e., key observables like θ(p) are continuous but non-differentiable at pc.) However, it should be noted that the FK(p, q) random cluster models (discussed in Section 13.1; the q = 1 case is just percolation) on the lattice Z² conjecturally have a first order (i.e., discontinuous) phase transition for q > 4. For this and some other reasons to be discussed in a later version of these notes, having a first versus second order transition should be thought of as a quantitative question, rather than a philosophical one.

Besides regular trees, Conjecture 12.7 is quite classical also on nice planar lattices, where even the critical value is known in some cases, e.g., pc(Z², bond) = pc(TG, site) = 1/2, where TG is the triangular grid; these are Kesten's theorems from 1980. Percolation, and more generally, statistical mechanics at criticality in the plane is indeed a miraculous world, exhibiting conformal invariance. This field has seen amazing progress in the past few years, due to the work of Schramm, Smirnov, and others; see Section 12.3 for a bit more details. As an appetizer: although the value of pc for percolation is a lattice-dependent local quantity (see Conjecture 14.11), critical percolation itself should be universal: viewed from far, it should look the same and be conformally invariant on any planar lattice, even though criticality happens at different densities. For instance, critical exponents such as θ(p) = (p − pc)^{5/36+o(1)} as p ↘ pc should always hold, though the existence and values of such exponents are proved only for site percolation on the triangular lattice.

Conjecture 12.7 is also known for Z^d with d ≥ 19, using a perturbative Fourier-type expansion method called the Hara-Slade lace expansion. Again, this method is good enough to calculate, e.g., θ(p) = (p − pc)^{1+o(1)} as p ↘ pc. This, together with other critical exponents, are conjecturally shared by many, many transitive graphs, namely all mean-field graphs; this should include Euclidean lattices for d > 6, all non-amenable groups, and probably most groups in between. This is known

only in few cases, like highly non-amenable graphs (including regular trees, of course); again see Section 12.3 for more information.

The following important general theorem settles Conjecture 12.7 for all non-amenable groups. On the other hand, the cases of Z^d with 3 ≤ d ≤ 18, and of all non-Abelian amenable groups, remain wide open.

Theorem 12.8 (Benjamini-Lyons-Peres-Schramm [BLPS99a, BLPS99b]). For any non-amenable Cayley (or more generally, unimodular transitive) graph, percolation at pc dies out.

We will give a rough sketch of a proof of this theorem, but first we need to define when a transitive graph G is called unimodular. Let Γ be the automorphism group of G, and Γ_x the stabilizer of the vertex x. Then the condition is that |Γ_x y| = |Γ_y x| for all (x, y) ∈ E(G). So, not only does the graph look the same from all vertices, but it also does not have + and − directions in which it looks different on a quantitative level. (But different directions are still possible in a finer sense: see Exercise 12.8.)

A simple example of a non-unimodular transitive graph is the grandparent graph: take a 3-regular tree, pick an end of it, and add an edge from each vertex to its grandparent towards the fixed end. Another class of examples are the Diestel-Leader graphs DL(k, ℓ) with k ≠ ℓ. A more complicated example can be found in [Tim06b].

The importance of unimodularity is the Mass Transport Principle, discovered by Olle Häggström: G is unimodular iff for any random function f(x, y, ω), where x, y ∈ V(G) and ω is the randomness, that is diagonally invariant, i.e., f(γx, γy, γω) has the same distribution as f(x, y, ω) for any γ ∈ Aut(G), we have

Σ_{y∈V} E f(x, y, ω) = Σ_{y∈V} E f(y, x, ω).   (12.4)

We think of f(x, y, ω) as the mass sent from x to y when the situation is given by ω, e.g., the percolation configuration. Then the MTP means the conservation of mass on average. Given a non-unimodular transitive graph, one can easily construct a deterministic mass transport rule that does not satisfy (12.4): for instance, in the grandparent graph, if every vertex sends mass 1 to each grandchild, then the outgoing mass is 4, but the incoming mass is only 1. In the other direction, given a unimodular graph, a simple resummation argument gives (12.4). As the simplest case, Cayley graphs do satisfy the Mass Transport Principle (and hence are unimodular): using F(x, y) := E f(x, y, ω), we have

    ∑_{x∈Γ} F(o, x) = ∑_{x∈Γ} F(x^{-1}, o) = ∑_{y∈Γ} F(y, o),

where, in the first equality, we used that multiplying from the left by a group element is a graph-automorphism, and in the second equality we used that x ↦ x^{-1} is a self-bijection of Γ. See [LyPer10, Sections 8.1, 8.2] for more details on MTP and unimodularity.

Exercise 12.8.
(a) Give an example of a unimodular transitive graph G such that there exist neighbours x, y ∈ V(G) such that there is no graph-automorphism interchanging them.
(b)* Can you give an example with a Cayley graph?
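As a toy numerical illustration of this resummation (not from the notes; the cyclic group Z_12 and the function g below are hypothetical choices), any left-invariant mass transport F(x, y) = g(y − x) on a finite group sends exactly as much mass out of o as into o:

```python
# Toy check of the Mass Transport Principle resummation on a finite group,
# here the cyclic group Z_n with n = 12 (a hypothetical choice, not from the
# notes). Any F invariant under left translation conserves mass at o.
import random

n = 12                          # the group Z_n, written additively mod n
rng = random.Random(1)

# A left-invariant mass transport: F(x, y) := g(y - x) satisfies
# F(c + x, c + y) = F(x, y) automatically, for any function g.
g = [rng.random() for _ in range(n)]

def F(x, y):
    return g[(y - x) % n]

o = 0
mass_out = sum(F(o, x) for x in range(n))   # total mass sent from o
mass_in = sum(F(y, o) for y in range(n))    # total mass received at o
print(mass_out, mass_in)                    # the two sums agree
```

The equality holds for the same reason as in the displayed derivation: both sums reindex to the full sum of g over the group, via the bijection x ↦ −x.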


In some sense, MTP is a weak form of averaging. Exercise 12.11 below is a good example of this. And it is indeed weaker than usual averaging:

Exercise 12.9 (Soardi-Woess 1990). Show that amenable transitive graphs are unimodular.

A typical way of using the MTP is to show that whenever there exist some invariantly defined special points of infinite clusters in an invariant percolation process, then there must be infinitely many in any infinite cluster. For instance:

Exercise 12.10. Given a percolation configuration ω ⊆ E(G), a trifurcation point is a vertex in an infinite cluster C whose removal from C would break it into at least three infinite connected components.
(a) In any invariant percolation process on any transitive graph G, show that the number of trifurcation points is either almost surely 0 or almost surely ∞.
(b) In an invariant percolation process on a unimodular transitive graph G, show that almost surely the number of trifurcation points in each infinite cluster is 0 or ∞.
(c) Give an invariant percolation on a non-unimodular transitive graph with infinitely many trifurcation points a.s., but only finitely many in each infinite cluster.

A more quantitative use of the MTP is the following:

Exercise 12.11. Let ω be any invariant bond percolation on a transitive unimodular graph G. Let α_K be the average degree inside a finite subgraph K ⊆ G, and let α(G) be the supremum of α_K over all finite K. Then clearly α(G) + h_E(G) = deg_G(o), where h_E is the edge Cheeger constant. Show that if E[deg_ω(o)] > α(G), then ω has an infinite cluster with positive probability.

In words, since α(G) is the supremum of the average degrees in finite subgraphs, it is not surprising that a mean degree larger than α(G) implies the existence of an infinite cluster. The MTP is needed to pass from spatial averages to means w.r.t. invariant measures.

Exercise 12.12. The bound of Exercise 12.11 is tight: show that for the set of invariant bond percolations on the 3-regular tree T_3 without an infinite cluster, the supremum of edge-marginals is 2/3. (Hint: the complement of a perfect matching has density 2/3 and consists of Z components.)

Exercise 12.11 is, of course, vacuous if G is amenable. And non-amenability is in fact essential for the conclusion that a large edge-marginal implies the existence of an infinite cluster, as can be seen from the following two exercises:

Exercise 12.13. For any ε > 0, give an example of an invariant bond percolation process on Z^d with only finite clusters a.s., but E[deg_ω(o)] > 2d − ε.



Exercise 12.14. Generalize the previous exercise to all amenable Cayley graphs. (Hint: close the boundaries of a set of randomly translated Følner sets, with a density high enough to ensure having only finite clusters, but low enough to ensure high edge-marginals. The statement holds not only for Cayley graphs, but also for all amenable transitive graphs; but, for any two vertices in a Cayley graph, there is a unique natural automorphism moving one to the other, which makes the proof a bit easier.)

Note that Exercises 12.11 and 12.14 together give a percolation characterization of amenability, similar to Kesten's random walk characterization, Theorem 7.3:

Proposition 12.9 ([BLPS99a]). A unimodular transitive graph is amenable iff for any ε > 0 there is an invariant bond percolation with finite clusters only and edge-marginal P[e ∈ ω] > 1 − ε.

A related characterization can be found in Exercise 13.17. On the other hand, maybe surprisingly, a very similar condition characterizes not non-amenability but a stronger property, Kazhdan's (T); see Theorem 12.15 below.

Sketch of proof of Theorem 12.8. We will base our sketch on [BLPS99b]. First of all, by the ergodicity of Bernoulli percolation (Lemma 12.2) and by Lemma 12.3, we need to rule out the case of a unique and the case of infinitely many infinite clusters.

Let ω ⊆ E(G) be the percolation configuration at p_c, and assume it has a unique infinite cluster C. For any ε > 0, we define a new invariant percolation ω_ε on E(G), thicker than ω in some sense and sparser in another. For each x ∈ V(G), there is a random finite set {x*_i} of vertices that are the closest points of C to x. Choose one of them uniformly at random, denoted by x*. Let ζ_ε be an independent Ber(ε) bond percolation. Then let the edge (x, y) ∈ E(G) be open in ω_ε if dist(x, C) < 1/ε, dist(y, C) < 1/ε, and x* and y* are connected in ω \ ζ_ε. It is easy to see that lim_{ε→0} P[(x, y) ∈ ω_ε] = 1. Hence, by Exercise 12.11, for some small enough ε > 0, there is an infinite cluster in ω_ε with positive probability. However, it is also easy to check that such an infinite cluster implies the existence of an infinite cluster in ω \ ζ_ε. But this is just Ber(p_c(1 − ε)) percolation, so it has no infinite clusters a.s., a contradiction.

Assume now that there are infinitely many infinite clusters in ω. Using insertion tolerance, they can be glued to get infinitely many trifurcation points, as in Exercise 12.10. Moreover, the same application of MTP shows that, almost surely, if a trifurcation point is removed, then each of the resulting infinite clusters still has infinitely many trifurcation points. Let V* ⊆ V(G) denote the set of trifurcation points; see Figure 12.3 (a). There is an obvious graph structure G* on V* as vertices, with an edge between two trifurcation points if there is a path connecting them in ω that does not go through any other trifurcation point; see Figure 12.3 (b). This graph, although not at all a subgraph of G, represents well the structure of the infinite clusters of ω, which is a critical percolation structure; on the other hand, each component of G* is kind of tree-like, with a lot of branching. These two properties appear to point in different directions, so we can hope to derive a contradiction. For


the actual proof, we will exhibit a non-amenable spanning forest F inside each component of G* in an invariant way.

For each v ∈ V*, let {C_i(v) : i = 1, ..., j_v} be the set of infinite clusters of G* \ {v} neighbouring v, where j_v ≥ 3. For each 1 ≤ i ≤ j_v, let {w_i^ℓ : 1 ≤ ℓ ≤ k_i} be the set of G*-neighbours of v in C_i(v). We now use some extra randomness: assign an i.i.d. Unif[0,1] label to each v ∈ V*, and for each v and 1 ≤ i ≤ j_v, draw an edge from v to that element of {w_i^ℓ : 1 ≤ ℓ ≤ k_i} that has the smallest label. See Figure 12.3 (c). We then forget the orientations of the edges to get F. It is not very hard to show that F has no cycles; the Reader is invited to write a proof or look it up in [BLPS99b].


Figure 12.3: Constructing the graph G* and forest F of trifurcation points in an infinite cluster.

Now again let ζ_ε be an independent Ber(ε) bond percolation. Then ω \ ζ_ε has only finite clusters.


Let F_ε be the following subgraph of F: if (x, y) ∈ E(F), then (x, y) ∈ F_ε if x and y are in the same cluster of ω \ ζ_ε. Clearly, F_ε is an invariant bond percolation process on V(G), with edges usually not in E(G). It has only finite clusters, while lim_{ε→0} P[(x, y) ∈ F \ F_ε] = 0. But each tree of F is a non-amenable tree with minimum degree at least 3, suggesting that this cannot happen. We cannot just use Exercise 12.11, since F itself is not a transitive unimodular graph, but a similar MTP argument on the entire V(G) can be set up to find the contradiction. We omit the details.

The above proof strategy clearly breaks down for non-unimodular transitive graphs. Nevertheless,

Conjecture 12.7 has been established for most such known graphs in the union of [Tim06b] and [PerPS06].

The number of infinite clusters in Bernoulli percolation is also a basic and interesting topic. We proved in Lemma 12.3 that it is an almost sure constant: 0, 1, or ∞. It is pretty obvious that for all p ∈ (p_c(T_k), 1), there are a.s. infinitely many infinite clusters on the k-regular tree. On the other hand, there is the following very elegant theorem:

Theorem 12.10 (Burton-Keane [BurtK89]). For any insertion-tolerant ergodic invariant percolation on any amenable transitive graph, the number of infinite clusters is almost surely 0 or 1.

Proof. If there were infinitely many infinite clusters, then, using insertion tolerance, they could be glued, as in the proof of Lemma 12.3 above, to get that a given vertex is a trifurcation point with

positive probability. Hence, in a large Følner set F_n, the expected number of trifurcation points X_n grows linearly with |F_n|. On the other hand, deterministically, the inner vertex boundary of F_n has to be at least X_n + 2. (This requires a little thought. See Figure 12.3.) Combining these two facts, |∂F_n| should grow linearly with |F_n|, contradicting the definition of a Følner sequence.

For the non-amenable transitive case, where having infinitely many infinite clusters is a possibility, one can define a second critical point, p_u(G) := inf{p : P_p[∃! infinite cluster] > 0}. It is known that for all p ∈ (p_u, 1] there is uniqueness a.s., and for all p ∈ (p_c, p_u) there is non-uniqueness [HäPS99]. Note that there is no monotonicity that would make this obvious: as we raise p, infinite clusters can

merge, reducing their number on one hand, but new ones can also appear by finite clusters merging on the other hand. So, how is this result proved? But, first things first, what does it really mean that clusters merge as we raise p? Recall the standard monotone coupling {ω_p : p ∈ [0,1]} of the Ber(p) percolations, using Unif[0,1] labels. And then the real result is that, a.s. in this standard coupling, if p is such that ω_p has an infinite cluster a.s., and p′ ≥ p, then each infinite cluster of ω_{p′} contains an infinite cluster of ω_p. The proof of this uses Invasion Percolation; see Section 13.3.

At p_u the situation could be either way, but more importantly, it is not known when the intervals (p_c, p_u) and (p_u, 1) are non-empty:

Conjecture 12.11 ([BenS96c]). For transitive graphs, p_c < p_u iff G is non-amenable.

Conjecture 12.12 ([BenS96c]). For transitive non-amenable graphs with one end, p_u < 1.

Exercise 12.15. Show that in a transitive graph with infinitely many ends, p_u = 1.

Before discussing what is known about the non-triviality of these intervals, here is an important characterization of uniqueness:

Theorem 12.13 ([LySch99]). If ω is an ergodic insertion-tolerant invariant percolation on a unimodular transitive graph G satisfying inf_{x,y∈V(G)} P[x ↔ y] > 0, then ω has a unique infinite cluster. Conversely, if an ergodic invariant ω satisfying the FKG inequality has a unique infinite cluster, then inf_{x,y∈V(G)} P[x ↔ y] > 0.

The second part is obvious: if C_∞ is the unique infinite cluster of ω, then {x ∈ C_∞} is an increasing event with a positive probability p_∞ > 0 that is independent of x; hence, by FKG, P[x ↔ y] ≥ P[x, y ∈ C_∞] ≥ p_∞^2.

Exercise 12.16. Give an example of a Ber(p) percolation on a Cayley graph G that has non-uniqueness, but there is a sequence x_n ∈ V(G) with dist(x_0, x_n) → ∞ and inf_n P_p[x_0 ↔ x_n] > 0.

Exercise 12.17. * Give an example of an ergodic uniformly insertion-tolerant invariant percolation on Z^2 with a unique infinite cluster but inf_{x,y∈Z^2} P[x ↔ y] = 0. (Hint: you can use the ideas of [HäM09].)


The proof of the first part of Theorem 12.13 in [LySch99] uses a fundamental result from the same paper:

Theorem 12.14 (Cluster indistinguishability [LySch99]). If ω is an ergodic insertion-tolerant invariant percolation on a unimodular transitive graph G, and A is a Borel-measurable translation-invariant set of subgraphs of G, then either all infinite clusters of ω are in A a.s., or none.

A very rough intuitive explanation of how Theorem 12.14 implies Theorem 12.13 is the following. If inf_{x,y} P[x ↔ y] > 0, then each infinite cluster must have a positive density, which may be measured by the frequency of visits by a simple random walk, independently of the starting point of the walk. This density must be the same for each cluster by indistinguishability, while the sum of the densities of the infinite clusters must be at most 1. Hence, there are only finitely many infinite clusters, and by insertion tolerance and ergodicity, there must be a unique one.

Regarding p_c < p_u, it was shown by Pak and Smirnova-Nagnibeda [PaSN00] that every non-amenable group has a generating set satisfying this. Of course, this property should be independent of the generating set taken (probably it is also a quasi-isometry invariant). Here is a brief explanation of what kind of Cayley graphs make p_c < p_u easier. As usual, we will consider bond percolation. In a transitive graph G, let a_n be the number of simple loops of length n starting (and ending) at a given vertex o, and let γ(G) := lim sup_n a_n^{1/n}. The smaller this is, the more tree-like the graph is, hence there is hope that p_u will be larger. Indeed, as proved by Schramm, see [LyPer10, Theorem 7.30],

    p_u(G) ≥ 1/γ(G),    (12.5)

for any transitive graph G. The proof is a nice counting argument, which we outline briefly. Taking p < p_+ < 1/γ, it is enough to prove, by the easy direction of Theorem 12.13, that for o and x far away from each other, P_p[o ↔ x] must be small. If o ↔ x at level p (in the standard coupling), then either there are many cut-edges between them already at level p_+, or there are many p_+-open simple loops. In the first scenario, keeping many (say, k) cut-edges open even at level p is exponentially costly: the probability is (p/p_+)^k, conditionally on the connection at p_+. But the second scenario is unlikely because there are not enough simple loops in G: if u_k(r) denotes the probability that there is some x ∉ B_r(o) such that o and x are p_+-connected with at most k cut-edges, then u_0(r) → 0 as r → ∞ because of p_+ < 1/γ, and the same can be shown for each u_k(r) by induction on k, by noticing that u_{k+1}(r) ≤ u_0(s) + |B_s(o)| u_k(r − s). Then, by choosing k, then r, large enough, both contributions to P_p[o ↔ x] will be small, proving (12.5).

Now, if G is d-regular, then a_n/d^n ≤ p_n(o, o) for the SRW on G, hence

    1/(d_G ρ(G)) ≤ 1/γ(G),    (12.6)

where ρ is the spectral radius of the SRW. On the other hand, by Proposition 12.6,

    p_c(G) ≤ 1/(h_E(G) + 1) < 1/h_E(G) = 1/(d_G ι_E(G)),    (12.7)


where h_E is the edge Cheeger constant defined using the ratios |∂_E S|/|S|, while ι_E is the edge Cheeger constant of the Markov chain (the SRW), i.e., it uses the ratios C(∂_E S)/π(S), as in Section 7.2.

After comparing the three displayed inequalities (12.5), (12.6), (12.7), the aim becomes to find a Cayley graph G for which ι_E is close to 1 and ρ is close to 0. Fortunately, these two aims are really the same, by the quantitative bound ι_E^2/2 ≤ 1 − ρ ≤ ι_E from Kesten's Theorem 7.3. And making ρ small is easy: take any finite generating set S, for which we have some ρ(G(Γ, S)) = ρ_0 < 1, then take G(Γ, S^k) for some large k, where S^k = S ⋯ S is the multiset of all possible k-wise products (i.e., we keep the multiplicities with which group elements occur as k-wise products). The transition matrix for SRW on the resulting multigraph is just the kth power of the original transition matrix, hence ρ(G(Γ, S^k)) = ρ_0^k → 0 as k → ∞, and we are done.

It is not known if this ρ → 0 can be achieved with generating sets without multiplicities. For instance, would the ball B_k^S(1) work, given by any finite generating set S? The answer is yes for groups having a free subgroup F_2, or having the so-called Rapid Decay property, which includes all Gromov-hyperbolic groups. See [PaSN00] for the (easy) proofs of these statements.

The following exercise shows that if we do not insist on adding edges in a group-invariant way (i.e., by increasing the generating set), but still want to add them only locally, then we indeed can push h_E close to the degree (or, in other words, can push ι_E close to 1) without using multiple edges. Even the outer vertex Cheeger constant h_V^out := inf |∂_V^out S|/|S| can be close to the degree, which is of course stronger, since |∂_V^out S| ≤ |∂_E S| ≤ (d − 1)|S| in a d-regular graph.

Exercise 12.18. Show that for any d-regular non-amenable graph G and any ε > 0, there exists K < ∞ such that we can add edges connecting vertices at distance at most K, such that the new graph G′ will be d′-regular with no multiple edges, and ι_V^out(G′) := h_V^out(G′)/d′ will be larger than 1 − ε, for the outer vertex Cheeger constant. (Hint: use the wobbling paradoxical decomposition from Exercise 5.6. The Mass Transport Principle shows that this proof cannot work in a group-invariant way.)

Exercise 12.19. ***
(a) Is it true that h_E(G(Γ, B_k^S))/|B_k^S| → 1 as k → ∞ for any non-amenable group and the ball of radius k in any finite generating set S?
(b) Is it true that h_V(G(Γ, B_k^S))/|B_k^S| → 1 for any group and any finite generating set S?
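For orientation, here is how the three displayed inequalities combine; this spelled-out consequence is ours, not a display from the notes:

```latex
% Chaining (12.7), (12.6), (12.5): a sufficient condition for p_c < p_u.
p_c(G) \;<\; \frac{1}{d_G\,\iota_E(G)}
\qquad\text{and}\qquad
p_u(G) \;\ge\; \frac{1}{\gamma(G)} \;\ge\; \frac{1}{d_G\,\rho(G)},
\qquad\text{so}\qquad
\rho(G) \le \iota_E(G) \;\Longrightarrow\; p_c(G) < p_u(G).
```

This is exactly why pushing ι_E towards 1 while pushing ρ towards 0 settles p_c < p_u for the generating set in question.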

Possibly the best attempt so far at proving p_c < p_u in general is an unpublished argument of Oded Schramm, which we discuss in Section 12.4 below; see Theorem 12.29. There is also an interpretation of p_c < p_u in terms of the Free and Wired Minimal Spanning Forests; see Theorem 13.5 below.

Conjecture 12.12 on p_u < 1 is known under some additional assumptions (besides being non-amenable and having one end): CutCon(G) < ∞ (for instance, being a finitely presented Cayley graph) or being a Kazhdan group are sufficient; see [BabB99, Tim07] or [LyPer10, Section 7.6], and [LySch99], respectively. Here is how being Kazhdan plays a role:



Theorem 12.15 ([GlW97]). A f.g. infinite group Γ is Kazhdan iff the measure ν_half on 2^Γ that gives probability half to the empty set and probability half to all of Γ is not in the weak* closure of the Γ-invariant ergodic probability measures on 2^Γ. In other words, for a transitive d-regular graph G(V, E) and any o ∈ V, let

    ι_erg(G) := sup { (1/d_G) E |{(o, x) ∈ E : ω(x) = ω(o)}| : ω an ergodic invariant measure on {−1, 1}^V with E ω(o) = 0 }.

Then, a f.g. group is Kazhdan iff any (or one) of its Cayley graphs G has ι_erg(G) < 1.

There are some natural variants of ι_erg(G): instead of all ergodic measures, we can take only the tail-trivial ones (i.e., all tail events have probability 0 or 1), or only those that are factors of an i.i.d. Unif[0,1] process on V or E (i.e., we get ω as a measurable function f of the i.i.d. process η, where f commutes with the action of Γ on the configurations ω and η). The corresponding suprema are denoted by ι_tt and ι_id. Clearly, ι_erg(G) ≥ ι_tt(G) ≥ ι_id(G) (wait, not that clearly...), but are they really different? An unpublished result of Benjy Weiss and Russ Lyons is that, for Cayley graphs, ι_id(G) < 1 is equivalent to non-amenability. (One direction is easy: see Exercise 12.20 (a) below.) On the other hand, it is only conjectured that ι_tt(G) < 1 is again equivalent to non-amenability. In general, it is a hard task to find out what tail-trivial processes are factors of some i.i.d. process. See [LyNaz11] and Subsection 14.2 for a bit more on these issues.

Exercise 12.20.
(a) Show that ι_id(G) = 1 for any amenable transitive graph G. (Hint: have i.i.d. coin flips in large Følner neighbourhoods vote on the ω-value of each vertex.)
(b) Show that ι_erg(T_3) = 1. (Hint: free groups are not Kazhdan, e.g., because they surject onto Z.)

Exercise 12.21. *** (a) Find the value of ι_id(T_3). (b) Show that ι_tt(T_3) < 1.

We come back now to the question of p_u < 1:

Corollary 12.16 ([LySch99]). If G is a Cayley graph of an infinite Kazhdan group Γ, then p_u(G) < 1. Moreover, at p_u there is non-uniqueness.

Sketch of proof of p_u < 1. Assume p_u(G) = 1, and let ω_p be Ber(p) percolation at some p < 1. Let ω̃_p be the invariant site percolation where the vertex set of each cluster of ω_p is completely deleted with probability 1/2, independently from other clusters. By Theorem 12.13, we have inf_{x,y} P_p[x ↔ y] = 0. It is not surprising that this implies that ω̃_p is ergodic. On the other hand, as p → 1, it is clear that ω̃_p converges to ν_half in the weak* topology, so Theorem 12.15 says that Γ could not be Kazhdan.

Let us also sketch a non-probabilistic proof that uses directly Definition 7.3 of being Kazhdan, instead of the characterization in Theorem 12.15. It is due to [IKT09], partly following a suggestion made in [LySch99].


Sketch of another proof of p_u < 1. Let Ω = {0,1}^Γ, with the product Bernoulli measure P_p with density p. Given the right Cayley graph G(Γ, S) for a finite generating set S, and a site percolation configuration ω ∈ Ω, let C_ω be the set of its clusters, C_ω(g) be the cluster of g, and ℓ^2(C_ω) be the Hilbert space of square summable real-valued sequences defined on C_ω. Consider now the direct integral of these Hilbert spaces over all ω,

    H := ∫_Ω ℓ^2(C_ω) dP_p(ω),   (φ, ψ)_p := ∫_Ω ∑_{C∈C_ω} φ(ω, C) ψ(ω, C) dP_p(ω), for φ, ψ ∈ H.

Γ has a natural unitary (in fact, orthogonal) representation on H, translating vectors by g(φ)(ω, C) := φ(ω^g, g^{-1}C) for C ∈ C_ω, where ω^g(h) := ω(gh); note here that g^{-1}C ∈ C_{ω^g}, since g^{-1}x ↔_{ω^g} g^{-1}y iff x ↔_ω y.

Now, if χ_ω ∈ H is the vector that is 1 at C_ω(1) and 0 at the other clusters of ω, for each ω, then ‖χ‖_p = 1, and

    P_p[g ↔ h] = P_p[C_ω(g) = C_ω(h)] = (g(χ), h(χ))_p = (χ, gh^{-1}(χ))_p.

This implies, by the way, that P_p[g ↔ h] is a positive definite function.

Take an ε > 0 smaller than the Kazhdan constant κ(Γ, S) > 0. If p is close enough to 1, then ‖χ − s(χ)‖_p^2 = 2 − 2(χ, s(χ))_p = 2 P_p[1 ↮ s] < ε for all s ∈ S. Hence, by the Kazhdan property, there is an invariant vector ψ ∈ H.

It is easy to see that a vector ψ is invariant under the representation iff ψ(ω, C) = ψ(ω^g, g^{-1}C) for all g ∈ Γ. On the other hand, for each ω, we have ψ(ω, ·) ∈ ℓ^2(C_ω), hence its maximum is attained at finitely many clusters. Taking these clusters, we get an invariant choice of finitely many clusters. By a simple application of the Mass Transport Principle, all of these clusters must be infinite. If there were infinitely many infinite clusters in ω_p, then, by the cluster indistinguishability Theorem 12.14, there would be no invariant way to choose finitely many of them; hence we must have a unique infinite cluster instead.

Exercise 12.22. Fill in the missing details in either of the above proof sketches for p_u < 1 for Kazhdan groups.

Exercise 12.23 (Todor Tsankov). *** Prove p_c < p_u for Kazhdan groups by finding an appropriate representation.

12.2 Percolation on finite graphs. Threshold phenomena

The best-known example is the Erdős-Rényi random graph model G(n, p), which is just Ber(p) percolation on the complete graph K_n. We will give a very brief introduction; see [JaLR00] for more on random graphs, [AloS00] for probabilistic combinatorics in general, and [KalS06] for a nice survey of influences and threshold phenomena, which we will define in a second. [Gri10] contains a bit of everything, similarly to the present notes.

A graph property A over some vertex set V is a subset of {0,1}^{V×V} that is invariant under the diagonal action of the permutation group Sym(V), or in other words, that does not care about


the labelling of the vertices. Examples are containing a triangle, being connected, being 3-colourable, and so on. Such properties are most often monotone: increasing or decreasing. It was noticed by Erdős and Rényi [ErdR60] that, in the G(n, p) model, monotone graph properties have a relatively sharp threshold: there is a short interval of p values in which they become extremely likely or unlikely. Here is a simple example:

Let X be the number of triangles contained in G(n, p) as a subgraph. Clearly,

    E_p X = \binom{n}{3} p^3,

hence, if p = p(n) = o(1/n), then P_p[X ≥ 1] ≤ E_p X → 0. What can we say if p(n) n → ∞? We have E_p X → ∞, but, in order to conclude that P_p[X ≥ 1] → 1, we also need that X is somewhat concentrated. This is easiest to do via the Second Moment Method: for any random variable X ≥ 0, applying Cauchy-Schwarz to E[X] = E[1_{X>0} X] gives

    P[X > 0] ≥ (EX)^2 / E[X^2].    (12.8)

Now, back to the number of triangles:

    E_p[X^2] = ∑_{Δ,Δ′ ⊆ E(K_n) triangles} P_p[Δ, Δ′ are both open]
             = ∑_{Δ∩Δ′=∅} + ∑_{|Δ∩Δ′|=1} + ∑_{Δ=Δ′}
             ≤ \binom{n}{3}^2 p^6 + \binom{n}{3} · 3(n−3) p^5 + \binom{n}{3} p^3.

Thus, for p > c/n, we have E_p[X^2] ≤ C (E_p X)^2, and (12.8) yields that P_p[X ≥ 1] ≥ c′ > 0. Moreover, if pn → ∞, then E_p[X^2] = (1 + o(1))(E_p X)^2, hence P_p[X ≥ 1] → 1.

Here are now the exact general definitions for threshold functions:
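A quick Monte Carlo sketch of this triangle threshold; the parameters (n = 40, the densities c/n, the number of trials) are hypothetical choices for illustration, not from the notes:

```python
# Monte Carlo illustration of the triangle threshold at p = c/n in G(n, p).
# Hypothetical parameters; a sketch, not part of the original notes.
import random
from itertools import combinations

def has_triangle(n, p, rng):
    """Sample G(n, p) and report whether it contains a triangle."""
    adj = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i][j] = adj[j][i] = True
    return any(adj[i][j] and adj[j][k] and adj[i][k]
               for i, j, k in combinations(range(n), 3))

def triangle_probability(n, c, trials=200, seed=0):
    """Estimate P_p[X >= 1] at p = c/n."""
    rng = random.Random(seed)
    return sum(has_triangle(n, c / n, rng) for _ in range(trials)) / trials

n = 40
low = triangle_probability(n, 0.2)   # E_p X is tiny, so almost never a triangle
high = triangle_probability(n, 5.0)  # E_p X is large and X is concentrated
print(low, high)                     # low should be near 0, high near 1
```

The two regimes match the first and second moment bounds above: at c = 0.2 the expectation already kills the event, while at c = 5 the second moment method guarantees triangles with high probability.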

Definition 12.2. Consider the Ber(p) product measure on the base sets [n] = {1, ..., n}, and let A_n ⊆ {0,1}^{[n]} be a sequence of increasing events (not the empty and not the full one). For t ∈ [0, 1], let p_A^t(n) be the p for which P_p[A_n] = t, and call p_A(n) := p_A^{1/2}(n) the critical probability for A. (These exist since p ↦ P_p[A_n] is strictly increasing and continuous, equalling 0 and 1 at p = 0 and p = 1, respectively.) The sequence A = A_n is said to have a threshold if

    P_{p(n)}[A_n] → 1 if (p(n)/p_A(n)) · ((1 − p_A(n))/(1 − p(n))) → ∞,
    P_{p(n)}[A_n] → 0 if (p(n)/p_A(n)) · ((1 − p_A(n))/(1 − p(n))) → 0.

Furthermore, the threshold is sharp if for any ε > 0 we have

    (p_A^{1−ε}(n) − p_A^{ε}(n)) / (p_A(n) (1 − p_A(n))) → 0 as n → ∞,

and it is coarse if there are ε, c > 0 such that the above ratio is larger than c for all n. Similar definitions can be made for decreasing events.

A sequence A_n of events will often be defined only for some subsequence n_k → ∞: for instance, the base set [n_k] may stand for the vertex or the edge set of some graph G_k(V_k, E_k).


A short threshold interval means that P_p[A] changes rapidly with p, hence the following result, called the Margulis-Russo formula, is fundamental in the study of threshold phenomena: for any event A,

    d/dp P_p[A] = ∑_{i∈[n]} Ĩ_p^A(i), which is ∑_{i∈[n]} P_p[i is pivotal for A] for A increasing,    (12.9)

where Ĩ_p^A(i) := P_p[∂_i A] − P_p[∂^i A] is the signed influence of the variable i on A, with ∂_i A := {ω : ω ∪ {i} ∈ A} and ∂^i A := {ω : ω \ {i} ∈ A}, while pivotal means that changing the variable in a given configuration changes the outcome of the event, i.e., that ω ∈ ∂_i A △ ∂^i A. The ordinary (unsigned) influence is just I_p^A(i) := P_p[i is pivotal for A], equalling Ĩ_p^A(i) for A increasing.

The proof of (12.9) is simple: write P_p[A] = ∑_ω 1_A(ω) P_p[ω], compute the derivative for each term,

    d/dp P_p[A] = ∑_ω 1_A(ω) P_p[ω] ( |ω|/p − (n − |ω|)/(1 − p) ),    (12.10)

then, by monitoring for what configurations ω̃ a given configuration ω ∈ A appears as ω̃ ∪ {i} = ω (hence ω̃ ∈ ∂_i A), notice that

    ∑_{i=1}^n P_p[∂_i A] = ∑_ω 1_A(ω) P_p[ω] ( |ω| + |ω| (1 − p)/p ),

and similarly,

    ∑_{i=1}^n P_p[∂^i A] = ∑_ω 1_A(ω) P_p[ω] ( (n − |ω|) + (n − |ω|) p/(1 − p) ),

and the difference of the last two equations is indeed equal to (12.10).

One intuition behind the formula (which could be turned into a second proof) is the following. Take an increasing A, for simplicity. In the standard coupling of the Ber(p) measures for p ∈ [0, 1], when we gradually raise the density from p to p + ε, then the increase in the probability of the event is exactly the probability that there is a newly opened variable that is pivotal at that moment. The expected number of these pivotal openings is ∫_p^{p+ε} E_q[number of pivotals for A] dq. For very small ε (depending even on n, the number of bits), it is reasonable to guess that (1) this expectation is about ε E_p[number of pivotals for A], and (2) the probability of having a pivotal opening is close to this expected number. Dividing by ε and taking lim_{ε→0} gives (12.9).

So, one could prove a sharp threshold for some increasing A = A_n by showing that the total influence I_p^A := ∑_i I_p^A(i) is large for all p around the critical density p_A(n). But note that I_p^A is the size of the edge boundary of A in {0,1}^{[n]}, measured using P_p, hence we are back at proving isoperimetric inequalities in the hypercube. For instance, for the uniform measure p = 1/2, the precise relationship between total influence and edge boundary is I_{1/2}^A = |∂_E A|/2^{n−1}, and the standard edge-isoperimetric inequality for the hypercube (which can be proved, e.g., using a version of Theorem 5.9) can be written as

    I_{1/2}^A ≥ 2 P_{1/2}[A] log_2 (1/P_{1/2}[A]).    (12.11)
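The Margulis-Russo formula (12.9) is easy to sanity-check numerically on a toy example; the increasing event used here ("majority of n = 5 bits") is a hypothetical choice for illustration, not from the notes:

```python
# Numerical check of the Margulis-Russo formula (12.9) on a toy increasing
# event: "at least 3 of n = 5 bits are open". Hypothetical example, a sketch.
from itertools import product

n = 5

def in_A(w):                      # the increasing event A
    return sum(w) >= 3

def prob(p):                      # P_p[A], summing over all 2^n configurations
    return sum(p**sum(w) * (1 - p)**(n - sum(w))
               for w in product((0, 1), repeat=n) if in_A(w))

def total_influence(p):           # sum_i P_p[i is pivotal for A]
    s = 0.0
    for w in product((0, 1), repeat=n):
        weight = p**sum(w) * (1 - p)**(n - sum(w))
        for i in range(n):
            hi, lo = list(w), list(w)
            hi[i], lo[i] = 1, 0
            if in_A(hi) != in_A(lo):   # bit i is pivotal in w
                s += weight
    return s

p, h = 0.3, 1e-6
derivative = (prob(p + h) - prob(p - h)) / (2 * h)
print(derivative, total_influence(p))  # both are about 30 p^2 (1-p)^2 = 1.323
```

For this event the pivotal sum can also be computed by hand, 5 · C(4,2) p^2 (1−p)^2 = 30 p^2 (1−p)^2, which the numeric derivative of P_p[A] reproduces.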


An easier inequality is the following Poincaré inequality (connecting isoperimetry and variance, just like in Section 8.1):

    I_{1/2}^A ≥ 4 P_{1/2}[A] (1 − P_{1/2}[A]).    (12.12)

Exercise 12.24.
(a) Prove the identity I_{1/2}^A = |∂_E A|/2^{n−1}.
(b) Show that, among all monotone events A on [n], the total influence I_{1/2}^A is maximized by the majority Maj_n, and find the value. (Therefore, Maj_n has the sharpest possible threshold at p = 1/2. For general p, but still bounded away from 0 and 1, the optimum remains similar: see (12.15).)

Exercise 12.25. Prove the Poincaré inequality (12.12). Hint: Define a map from the set of pairs (ω, ω̃) ∈ A × A^c into ∂_E A. Alternatively, use discrete Fourier analysis, defined very briefly as follows: For any function f : {−1, 1}^n → R of n bits, define the Fourier-Walsh coefficients f̂(S) := E_{1/2}[f χ_S], where χ_S(ω) := ∏_{i∈S} ω(i). Then {χ_S : S ⊆ [n]} is an orthonormal basis w.r.t. (f, g) := E_{1/2}[f g]. (In a slightly different language, these are the characters of the group Z_2^n.) Determine the variance Var f, and, for Boolean functions f = 1_A, the total influence I_{1/2}^A, in terms of the squared coefficients f̂(S)^2.
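The Fourier-Walsh facts used in this hint can be verified directly on a tiny cube; the random function f below is a hypothetical toy choice, not from the notes:

```python
# Numeric sanity check of the Fourier-Walsh basis on {-1,1}^3: the characters
# chi_S are orthonormal, and Parseval gives Var f = sum over nonempty S of
# fhat(S)^2. Hypothetical toy f; a sketch, not part of the original notes.
import random
from itertools import combinations, product

n = 3
cube = list(product((-1, 1), repeat=n))
rng = random.Random(0)
f = {w: rng.random() for w in cube}          # an arbitrary function of n bits

def chi(S, w):                               # character chi_S(w) = prod_{i in S} w_i
    out = 1
    for i in S:
        out *= w[i]
    return out

def inner(u, v):                             # (u, v) := E_{1/2}[u v], uniform measure
    return sum(u(w) * v(w) for w in cube) / len(cube)

subsets = [S for k in range(n + 1) for S in combinations(range(n), k)]

# Orthonormality: (chi_S, chi_T) = 1 if S == T, else 0.
for S in subsets:
    for T in subsets:
        ip = inner(lambda w: chi(S, w), lambda w: chi(T, w))
        assert abs(ip - (1.0 if S == T else 0.0)) < 1e-12

# Parseval: Var f = sum_{S nonempty} fhat(S)^2, where fhat(S) = (f, chi_S).
fhat = {S: inner(lambda w: f[w], lambda w: chi(S, w)) for S in subsets}
mean = sum(f.values()) / len(cube)
var = sum((v - mean) ** 2 for v in f.values()) / len(cube)
print(var, sum(fhat[S] ** 2 for S in subsets if S))  # the two agree
```

The orthonormality check is exactly the fact that χ_S χ_T = χ_{S△T} averages to 0 unless S = T; the variance identity is what the exercise asks you to derive.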

For balanced events, i.e., when P_{1/2}[A] = 1/2, the stronger (12.11) gives I_{1/2}^A ≥ 1, which is sharp, as shown by the dictator: D_i(ω) := ω(i) as a Boolean function, or D_i := {ω : i ∈ ω} as an event, for some fixed i ∈ [n]. For this event, P_p[D_i] = p, hence it has a threshold, but a very coarse one. In general, we have the following basic result, due to Bollobás and Thomason [BolT87], who used isoperimetric considerations to prove it.

Exercise 12.26. Prove that for any sequence of monotone events A = A_n and any ε there is C_ε < ∞ such that p_A^{1−ε}(n) − p_A^{ε}(n) < C_ε p_A(n) (1 − p_A(n)). Conclude that every sequence of monotone events has a threshold. (Hint: take many independent copies of a low density percolation to get success with good probability at a larger density.)

Now, what properties have sharp thresholds? There is an ultimate answer by Friedgut and Bourgain [FriB99]: an increasing property has a coarse threshold iff it is local, i.e., its probability can be significantly increased by conditioning on a bounded set of bits to be present. Typical examples are the events of containing a triangle or some other fixed subgraph, either anywhere in the graph, or on a fixed subset of the bits, as in a dictator event. Sharp thresholds correspond to global properties, such as connectivity, and k-colourability for k ≥ 3. The exact results are slightly different in the cases of graph and hypergraph properties (Friedgut) and general events (Bourgain), and we omit them. Note that although it is easy to show locality of events that are obviously local, it might be much harder to prove that something is global and hence has a sharp threshold. Therefore, it is still useful to have more robust conditions for the quantification of how sharp a threshold is. The following is a key theorem, a generalization of the p = 1/2 case proved in [KahKL88], which strengthens (12.12):



Theorem 12.17 ([BouKKKL92]). For the Ber(p) product measure on [n], and any nontrivial event A ⊆ {0,1}^{[n]}, we have

    I_p^A ≥ c P_p[A] (1 − P_p[A]) log (1/(2 m_p^A)),    (12.13)

where m_p^A := max_i I_p^A(i). Furthermore,

    m_p^A ≥ c P_p[A] (1 − P_p[A]) (log n)/n.    (12.14)

In both inequalities, c > 0 is an absolute constant.

For instance, if we can prove for some sequence of monotone events A = A_n that m^A_p → 0 as n → ∞, uniformly in a large enough interval of p values around p^A(n), then (12.13) and the Margulis-Russo formula (12.9) show that the threshold interval is small: p^A_{1−ε}(n) − p^A_ε(n) → 0 for any ε > 0. Note that this condition is going in the direction of excluding locality: a bounded set of small-influence bits usually does not have a noticeable influence even together. Furthermore, if there is a transitive group on [n] under which A_n is invariant (such as a graph property), then all the individual influences are the same, so I^A_p = n m^A_p, and (12.14) implies that the threshold interval is at most C/log n. We will see some applications of these ideas in Subsections 12.3 and 13.2. The proofs of these influence results, including the Friedgut-Bourgain theorem, use Fourier analysis on the hypercube Z_2^n, as defined briefly in Exercise 12.25.

From the viewpoint of percolation, the most relevant properties are related to connectivity and sizes of clusters. On a finite graph, instead of infinite clusters, we talk about giant clusters, i.e., clusters that occupy a positive fraction of the vertices of G_n = (V_n, E_n). For the Erdős-Rényi model G(n, p), the threshold for the appearance of a giant cluster is p^A(n) = 1/n: below that the largest cluster has size O(log n), while above there is a unique cluster of macroscopic volume, and all other clusters are O(log n). The threshold window is of size n^{−4/3}, and within this window, the largest, second largest, etc. clusters all have sizes around n^{2/3}, comparable to each other. A beautiful description of this critical regime is done in [Ald97], using Brownian excursions.

But there is much beyond percolation on the complete graph K_n. The first question is the analogue of p_c < 1 for non-1-dimensional graphs:

Conjecture 12.18 (Benjamini). Let G_n = (V_n, E_n) be a sequence of connected finite transitive graphs with |V_n| → ∞ and diameter diam(G_n) = o(|V_n|/log |V_n|). Then there are a, ε > 0 such that P_{1−ε}[there is a connected component of size at least a|V_n|] > ε for all large enough n.

Exercise 12.27. Show by example that the o(|V_n|/log |V_n|) assumption is sharp.

The next question is the uniqueness of giant clusters.
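The G(n, p) giant-cluster threshold at 1/n is easy to observe numerically; the following sketch (an added illustration, with arbitrary sample sizes) draws one subcritical and one supercritical graph and measures the largest cluster with union-find:

```python
import random
from collections import Counter

def largest_cluster(n, p, rng):
    """Largest connected component size of one Erdos-Renyi G(n, p) sample,
    using union-find with path halving."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:            # each pair is an edge with prob p
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv
    return max(Counter(find(v) for v in range(n)).values())

rng = random.Random(1)
n = 800
sub = largest_cluster(n, 0.5 / n, rng)  # below the 1/n threshold: O(log n) clusters
sup = largest_cluster(n, 3.0 / n, rng)  # above the threshold: a giant cluster
```

Below the threshold the largest cluster is tiny compared to n; above it, a macroscopic fraction of the vertices is swallowed by one cluster.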

Conjecture 12.19 ([AloBS04]). If G_n is a sequence of connected finite transitive graphs with |V_n| → ∞, then for any a > 0 and ε > 0,

    sup_{p < 1−ε} P_p[there is more than one connected component of size at least a|V_n|] → 0

as n → ∞, where P_p denotes the probability with respect to Ber(p) percolation.

Exercise 12.28. Show by example that the 1 − ε cutoff is needed.

Besides the Erdős-Rényi G(n, p), another classical example where these conjectures are known to hold is the hypercube {0,1}^n, again with critical value (1 + o(1))/n [AjKSz82]. Furthermore, they are known for expanders, even without transitivity [AloBS04]. The proof of Conjecture 12.19 for

this case proceeds by first showing a simple general upper bound on the average influence, somewhat complementing (12.14): for any increasing event A ⊆ {0,1}^[n], if p ∈ (ε, 1−ε), then there is a δ = δ(ε) such that, for a uniformly chosen random i ∈ [n],

    P_p[i is pivotal for A] ≤ δ/√n;    (12.15)

in other words, the total influence is always I^A_p = O(√n). (This generalizes Exercise 12.24 (b) from the case p = 1/2.) This bound is applied to proving that there cannot be many edges whose insertion would connect two large clusters. On the other hand, two macroscopic clusters in an expander would necessarily produce such pivotal edges, hence there is uniqueness.

Despite the conjectured uniqueness of the giant cluster, there is a possible analogue of the non-uniqueness phase of the non-amenable case: in the intermediate regime, the identity embedding of the giant cluster into the original graph should have large metric distortion. (Locally we see many large clusters, only later they hook up, due to the finiteness of the graph itself.) For the hypercube, the conjectured second critical value is around 1/√n [AngB07].

12.3 Critical percolation: the plane, scaling limits, critical exponents, mean field theory

Statistical mechanics systems with phase transitions are typically the most interesting at criticality, and Bernoulli percolation is a key example for the study of critical phenomena. Critical percolation is best understood in the plane and on tree-like (so-called mean field) graphs, where the latter is understood broadly and vaguely: e.g., Z^d for high d is locally tree-like enough and the global structure is simple enough so that critical percolation can be understood quite well, while Gromov-hyperbolic groups are very much tree-like globally, but that presently is not enough. This section will concentrate on critical planar percolation, with some discussion on more general ideas and the mean field theory.

The planar self-duality of the lattice Z² was apparent in our proof of the upper bound in 1/3 ≤ p_c(Z²) ≤ 2/3. A more striking (but still very simple) consequence of this self-duality is the following:

Lemma 12.20. The events of having an open left-right crossing in an n × (n+1) rectangle in Ber(1/2) bond percolation on Z², and also in an n × n rhombus in Ber(1/2) site percolation on TG, both have a probability exactly 1/2, regardless of n.
Figure 12.4: The self-duality of percolation on TG and Z².

For the proof, we will need some basic deterministic planar topological results. First of all, opening/closing the sites of TG is the same as colouring the faces of the hexagonal lattice white/black, so, for better visualization, we will use the latter, as in the left hand picture of Figure 12.4. Now, it is intuitively quite clear that, in any two-colouring, either there is a left-right white crossing, or a top-bottom black crossing in the rhombus, exactly one of the two possibilities. (In other words, a hex game will always end with a winner.) The fact that there cannot be both types of crossings is a discrete version of Jordan's curve theorem, and the fact that at least one of the crossings must occur is a discrete version of Brouwer's fixed point theorem in two dimensions: any continuous map from the closed disk to itself must have a fixed point. However, to actually prove the discrete versions (even assuming the topology theorems), some combinatorial hacking is inevitable, as shown, e.g., by colouring the faces of Z²: if being neighbours requires a common edge, then it can happen that neither crossing is present, while if it requires only a common corner, then both crossings might be present at the same time.

I am not going to do the discrete hacking in full detail, but let me mention two approaches. The most elegant one I have seen is an inductive proof via Shannon's and Schensted's game of Y, see [PerW10, Section 1.2.3]. A more natural approach is to use the exploration interface. Add an extra row of hexagons (an outer boundary) to each of the four sides, the left and right rows coloured white, the top and bottom rows coloured black; the colours of the four corner hexagons will not matter. Now start a path on the edges of the hexagonal lattice in the lower right corner, with black hexagons on the right, white ones on the left. One can show that this path exploring the percolation configuration cannot get stuck, and will end either at the upper left or the lower right corner. Again, one can show that in the first case the right boundary of the set of explored hexagons will form a black path from the top to the bottom side of the rhombus, while, in the second case, the left boundary will form a white path from left to right, and we are done.

For bond percolation on Z², a similar statement holds. Given a percolation configuration in the n × (n+1) rectangle (the red bonds on the right hand picture of Figure 12.4), consider the dual (n+1) × n rectangle on the dual lattice, with the dual configuration (the blue bonds): a dual edge is open iff the corresponding primal edge was closed. Again, either there is a left-right crossing on the primal graph, or a top-bottom crossing on the dual graph, exactly one of the two possibilities. The exploration interface now goes with primal bonds on its left and dual bonds on its right, with the left and right sides of the rectangle fixed to be present in the primal configuration, and the top and bottom sides fixed to be in the dual configuration.

Exercise 12.29. Assuming the fact that at least one type of crossing is present in any two-colouring of the n × n rhombus, prove Brouwer's fixed point theorem in two dimensions.

Proof of Lemma 12.20. By the above discussion, if A is the event of a left-right primal crossing in Ber(p) percolation, then the complement A^c is the event of a top-bottom dual crossing (on TG, primal/dual simply mean open/closed). But, by colour-flipping and planar symmetry, we have P_p[A^c] = P_{1−p}[A]. This says that P_p[A] + P_{1−p}[A] = 1, hence, for p = 1 − p = 1/2, we have P_{1/2}[A] = 1/2, as desired.

For a physicist, the fact that at p = 1/2 there is a non-trivial event with a non-trivial probability that is independent of the size of the system suggests that this should be the critical density p_c. This intuition becomes slightly more grounded once we know that the special domains considered in Lemma 12.20, where percolation took place, are actually not that special:

Proposition 12.21 (Russo-Seymour-Welsh estimates). Consider Ber(1/2) site percolation on TG_δ, the triangular grid with mesh size δ, represented as a black-and-white colouring of the hexagonal lattice, so that connections are understood as white paths. We will use the notation P = P_{1/2}.

(i) Let D ⊂ C be homeomorphic to [0,1]², with piecewise smooth boundary, and let a, b, c, d ∈ ∂D be the images of the corners of [0,1]². Then

    0 < c₀ < P[ab ↔ cd in percolation on TG_δ inside D] < c₁ < 1    (12.16)

for some c_i(D, a, b, c, d) and all 0 < δ < δ₀(D, a, b, c, d) small enough.

(ii) Let A ⊂ C be homeomorphic to an annulus, with piecewise smooth inner and outer boundary pieces ∂₁A and ∂₂A. Then

    0 < c₀ < P[∂₁A ↔ ∂₂A in percolation on TG_δ inside A] < c₁ < 1    (12.17)

for some c_i(A) and all 0 < δ < δ₀(A) small enough.

Similar statements hold for Ber(1/2) bond percolation on Z², with any reasonable definition of open crossing of a domain.

Proof. The key special case is the existence of some s > r such that crossing an r × s rectangle in the harder (length s) direction has a uniformly positive probability, depending only on r/s. Indeed, as one can easily convince themselves, Lemma 12.20 fed into the following inequality, together with a lot of applications of the Harris-FKG inequality, imply all the claims:

    P[LR(r, 2s)] ≥ P[LR(r, s)]² / 4,    (12.18)

where LR(r, s) is the left-to-right crossing event in the rectangle r × s in the horizontal (length s) direction, and r, s are arbitrary, except that they are chosen relative to the mesh in a way that the vertical midline of the r × 2s rectangle is an axis of symmetry of the lattice rectangle (i.e., the lattice rectangle is the union of lattice hexagons intersecting the rectangle), and both halves basically agree with the r × s lattice rectangle (the hexagons cut into two by the midline are considered to be part of both halves). See Figure 12.5.a.

The following very simple proof is due to Stas Smirnov. (Anecdote: when Oded Schramm received the proof from Smirnov, in the form of a one-page fax containing basically just (12.18) and Figure 12.5, he almost posted the fax to the arXiv under Smirnov's name, to make sure that everyone gets to know this beautiful proof.)

Figure 12.5: Smirnov's proof of RSW. Open/white denoted by yellow, closed/black denoted by blue.

Let D be the r × 2s lattice rectangle, with left and right halves A and B. Fixing the outer boundary black along the top side of A and white along its left side, start an exploration interface in the upper left corner until it hits one of the two other sides of A. The event of hitting the midline between A and B is equivalent to LR(A): the right (white) boundary of the stopped interface will be the uppermost left-right crossing of A (provided that a crossing exists). Condition now on this event and also on the right (white) boundary γ of the interface. See again Figure 12.5.a. Note that the configuration in the part of A below γ and in the entire B is independent of γ.

Now let the reflection of γ across the midline between A and B be γ′; if γ ended at a hexagon cut into two by the midline, then this hexagon will be in γ and not in γ′. See Figure 12.5.b. In the subdomain of D below γ ∪ γ′, we can repeat the argument of Lemma 12.20, with boundary arcs γ, α, γ′, α′, in a clockwise order, where α is the bottom side of A together with the part of the left side below γ, and similarly for α′ in B. (The possible middle hexagon on the bottom side, just below the midline, will be in α and not in α′.) Namely, there is either a white crossing between γ and α′ or a black crossing between α and γ′, exactly one of the two possibilities. By (almost-)symmetry, we have P[γ ↔ α′] ≥ 1/2. If the white crossing between γ and α′ occurs, then, together with the white γ, we get a white crossing from the left side of A to the bottom or right side of B. So, denoting this event by LR(|A, B), we get, after averaging the conditional probability lower bound 1/2 over all possible choices of γ, that P[LR(|A, B)] ≥ P[LR(A)] · 1/2. Similarly, we have P[LR(A, B|)] ≥ P[LR(B)] · 1/2. Since γ and its reflection intersect only in at most one hexagon, the intersection of the events LR(|A, B) and LR(A, B|) implies LR(D), see Figure 12.5.c. So, by the FKG-Harris inequality,

    P[LR(D)] ≥ P[LR(|A, B) ∩ LR(A, B|)] ≥ P[LR(A)] P[LR(B)] / 4,

which is just (12.18). The next exercise finishes the proof.

Exercise 12.30. Using (12.18), complete the proofs of (i) and (ii) of the proposition, and explain what to change so that the proof works for bond percolation on Z², too.

There are several other proofs of Proposition 12.21, which is important, since they generalize in different ways. The above proof used the symmetry between primal and dual percolation, and hence p_c = 1 − p_c, in a crucial way. Nevertheless, the result is also known, e.g., for critical site percolation on Z²; in particular, θ(p_c) = 0 is known there. One symmetry is still needed there: there is no difference between horizontal and vertical primal crossings, since the lattice has that non-trivial symmetry. One of the most general such results is [GriM11]. Some people (including Grimmett) call Proposition 12.21 the box-crossing lemma, and (12.18) or (12.19) the RSW-inequality. It is also important that on such nice lattices, a bound of the type (12.18) holds for all p, even in a stronger form; see [Gri99, Lemma 11.73]:

    P_p[LR(r, ρr)] ≥ f_ρ(P_p[LR(r)]), with f_ρ(x) ≥ c(ρ) x and lim_{x→1} f_ρ(x) = 1.    (12.19)

As we mentioned above, these RSW estimates are the sign of criticality. For instance, they imply the following polynomial decay, which means that p = 1/2 is neither very subcritical nor supercritical:

Exercise 12.31. Show that for p = 1/2 site percolation on TG or bond percolation on Z², there exist constants c_i, α_i such that c₁ n^{−α₁} ≤ P[0 ↔ ∂B_n(0)] ≤ c₂ n^{−α₂}.

However, more work was needed to fully establish the natural conjecture:
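Both the deterministic hex fact and Lemma 12.20 are easy to test in simulation. The sketch below (an added illustration; the rhombus is encoded as the standard n × n hex board with the usual six-neighbour adjacency) checks in every sample that exactly one of the two crossings occurs, and that the left-right white crossing frequency at p = 1/2 is near the exact value 1/2:

```python
import random
from collections import deque

# Six neighbours of a cell on the hex-board rhombus (axial coordinates).
NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def has_crossing(grid, colour, cross_cols):
    """BFS: does `colour` cross the n x n rhombus?
    cross_cols=True: left-to-right (columns); False: top-to-bottom (rows)."""
    n = len(grid)
    if cross_cols:
        seeds = [(r, 0) for r in range(n) if grid[r][0] == colour]
        reached = lambda r, c: c == n - 1
    else:
        seeds = [(0, c) for c in range(n) if grid[0][c] == colour]
        reached = lambda r, c: r == n - 1
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        if reached(r, c):
            return True
        for dr, dc in NEIGHBOURS:
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n and (rr, cc) not in seen \
                    and grid[rr][cc] == colour:
                seen.add((rr, cc))
                queue.append((rr, cc))
    return False

rng = random.Random(0)
n, trials = 15, 2000
white_wins = 0
for _ in range(trials):
    grid = [[rng.random() < 0.5 for _ in range(n)] for _ in range(n)]  # True = white
    lr_white = has_crossing(grid, True, True)
    tb_black = has_crossing(grid, False, False)
    assert lr_white != tb_black  # exactly one crossing: "hex has a winner"
    white_wins += lr_white
freq = white_wins / trials
```

Since transposing the board is a lattice automorphism and swapping the colours maps one crossing event to the other, the exact crossing probability is 1/2, and `freq` fluctuates around it.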


Theorem 12.22 (Harris-Kesten theorem [Har60], [Kes80]). p_c(Z², bond) = p_c(TG, site) = 1/2. Moreover, θ(p_c) = 0.

Sketch of proof. The simpler direction is p_c ≥ 1/2: the RSW estimates in annuli, Proposition 12.21 (ii), imply the polynomial decay of Exercise 12.31, hence θ(1/2) = 0. However, Harris did not have the RSW estimates available, hence followed a different route. If there is an infinite white cluster on TG at p = 1/2 a.s., then, by symmetry, there is also an infinite black cluster. We know from Theorem 12.10 that there is a unique white and a unique black infinite cluster. Given the insertion and deletion tolerance and the FKG property, it is pretty hard to imagine that a unique infinite white cluster can exist and still leave space for the infinite black cluster, so this really should not happen. The simplest realization of this idea is due to Yu Zhang (1988, unpublished), as follows.

Figure 12.6: Zhang's argument for the impossibility of θ(1/2) > 0.

Let C_i(n) for i ∈ {N, E, S, W} be the event that the North, East, South, West side of the n × n box B(n) is connected to infinity within R² \ B(n), respectively. Also, let D_i(n) be the same events in the dual percolation. Assuming θ(1/2) > 0, the box B(n) will intersect the infinite cluster for n large enough, hence, using the FKG-inequality, we have

    P[C_i(n)^c]^4 ≤ P[∩_{i∈{N,E,S,W}} C_i(n)^c] → 0    as n → ∞,

and the same for D_i(n). Therefore,

    P[∩_{i∈{N,E,S,W}} C_i(n) ∩ D_i(n)] > 0

for n large enough. However, this event, together with the uniqueness of both the primal and the dual infinite cluster, is impossible due to planar topology. See Figure 12.6.

For the direction p_c ≤ 1/2, the idea is that at p = 1/2 we already have that large boxes are crossed with positive probability (the RSW lemma), hence, with a bit of experience with the sharp threshold phenomena in critical systems, discussed in Section 12.2, it seems plausible that by raising the density to any 1/2 + ε, the probability of crossings in very large boxes will be very close to 1. Indeed, this can be proved in several ways. Kesten, in order to use the Margulis-Russo formula (12.9), first of all proved that E_{1/2}[|Piv(n)|] ≥ c log n, where Piv(n) is the set of pivotal bits for the left-right crossing of the n × n square. This is not surprising: it is easy to prove using two exploration paths that there is a pivotal point with positive probability, at distance at least n/4, say, from the boundary of the square, and then one can use RSW in dyadic annuli around that first pivotal point to produce logarithmically many pivotals with high conditional probability. See Figure 12.7.


Figure 12.7: Producing c log n pivotals in an n-box with positive probability.

In fact, the same proof shows that, assuming that p satisfies 1 − δ > P_p[LR(n)] > δ, we still have E_p[|Piv(n)|] ≥ c(δ) log n. Therefore, by the Margulis-Russo formula,

    (d/dp) P_p[LR(n)] ≥ c(δ) log n

for all p with 1 − δ > P_p[LR(n)] > δ, but that means that the size of this δ-threshold interval for LR(n) must be at most C/log n. Therefore, for all ε, δ > 0, if n > n_{ε,δ}, then P_{1/2+ε}[LR(n)] > 1 − δ. Moreover, either from the RSW-inequality (12.19) for general p, or from doing the above Margulis-Russo argument in the n × 2n rectangle, for large enough n we have

    P_{1/2+ε}[LR(n, 2n)] > 1 − δ.    (12.20)
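The Margulis-Russo formula d/dp P_p[A] = Σ_i P_p[i is pivotal for A] that drives this argument can be verified exactly on a toy system by brute force (an added illustration, not from the notes): take A to be the left-right crossing of the 3 × 3 grid graph with its 12 bonds.

```python
from itertools import product

# Bonds of the 3x3 grid graph (vertices (x, y) with 0 <= x, y <= 2).
VERTS = [(x, y) for x in range(3) for y in range(3)]
EDGES = [((x, y), (x + 1, y)) for x in range(2) for y in range(3)] + \
        [((x, y), (x, y + 1)) for x in range(3) for y in range(2)]
M = len(EDGES)  # 12 bonds

memo = {}
def left_right(omega):
    """Is there an open path from the column x = 0 to the column x = 2?"""
    if omega in memo:
        return memo[omega]
    adj = {v: [] for v in VERTS}
    for (u, v), s in zip(EDGES, omega):
        if s:
            adj[u].append(v)
            adj[v].append(u)
    stack = [(0, y) for y in range(3)]
    seen = set(stack)
    ans = False
    while stack:
        v = stack.pop()
        if v[0] == 2:
            ans = True
            break
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    memo[omega] = ans
    return ans

p = 0.4
deriv = 0.0          # exact d/dp P_p[LR], differentiating p^k (1-p)^(M-k) term by term
pivotal_sum = 0.0    # sum over bonds of P_p[bond is pivotal]
for omega in product((0, 1), repeat=M):
    k = sum(omega)
    w = p ** k * (1 - p) ** (M - k)
    if left_right(omega):
        deriv += w * (k / p - (M - k) / (1 - p))
    for i in range(M):
        flipped = omega[:i] + (1 - omega[i],) + omega[i + 1:]
        if left_right(omega) != left_right(flipped):
            pivotal_sum += w  # pivotality does not depend on bond i's own state
```

The two totals agree up to floating-point error, which is exactly the Margulis-Russo identity at p = 0.4.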

A slightly different route to arrive at the same conclusion, with less percolation-specific arguments but with more sharp threshold technology, is to use the BKKKL Theorem 12.17 instead of the Margulis-Russo formula. Then we need to bound only the maximal influence m_p^{LR(n)} of any bit on the event LR(n). Well, if a bit is pivotal, then it has the alternating 4-arm event to the sides of the n × n square. Wherever the bit is located, at least one of these primal and one of these dual arms is of length n/2, hence

    m_p^{LR(n)} ≤ P_p[0 ↔ ∂B(n/2)] · P_{1−p}[0 ↔ ∂B(n/2)] ≤ c₂² n^{−2α₂}    for all p,

by Exercise 12.31. Then Theorem 12.17 shows that the δ-threshold interval for LR(n) is at most C/log n, and then we get (12.20) again.

Now, the highly probable large crossings given by (12.20) can be combined using the FKG property and a simple renormalization idea to give long connections. Namely, take a tiling of the infinite lattice by n × n boxes, giving a grid G_n isomorphic to Z² (with the n-boxes as vertices and with the obvious neighbouring relation), and define the following dependent bond percolation process on G_n: declare the edge between two boxes open if the n × 2n or 2n × n rectangle given by the union of these two boxes is crossed in the long direction and each box is crossed in the orthogonal direction. See Figure 12.8. The probability of each edge of G_n to be open is larger than 1 − δ if n is large enough, and if two edges do not share an endpoint, then their states are independent of each other. Therefore, a Peierls argument establishes the existence of an infinite cluster in this renormalized percolation process, which implies the existence of an infinite cluster also in the original percolation. This finishes the proof of θ(1/2 + ε) > 0.

Figure 12.8: Supercritical renormalization: bond percolation on the grid G_n of n-boxes.

At criticality, there are large finite clusters, but there is no infinite one. It is tempting to try to define a just-infinite cluster at p_c. A natural idea is to hope that the conditional measures P_p[· | 0 ↔ ∂B_n(o)] have a weak limit as n → ∞. This limit indeed exists [Kes86], and is called Kesten's Incipient Infinite Cluster. It was later shown by Kesten's PhD student Antal Járai that many other natural definitions yield the same measure [Jár03]. See also [HamPS12].

Exercise 12.32.
(a) Show that the conditional FKG-inequality does not hold: find three increasing events A, B, C in some Ber(p) product measure space such that P_p[A ∩ B | C] < P_p[A | C] P_p[B | C].
(b) Show that conditional FKG would imply that P_p[· | 0 ↔ ∂B_{n+1}(o)] stochastically dominates P_p[· | 0 ↔ ∂B_n(o)] restricted to any box B_m(0) with m < n. (However, this monotonicity is not known and might be false, and hence it was proved without relying on it that, for p = p_c(Z²), these measures have a weak limit as n → ∞, the IIC.)

Here will come a very short intro to the miraculous world of statistical mechanics in the plane at criticality, see [Wer07, Wer03, Schr07]. Very briefly, the main point is that many such models have conformally invariant scaling limits: the classical example is that simple random walk on any planar lattice converges after suitable rescaling to the conformally invariant planar Brownian motion (Paul Lévy, 1948). Similar results hold or should hold for the uniform spanning tree, the loop-erased random walk, critical percolation, the critical Ising model, the self-avoiding walk (the n → ∞ limit of the uniform measures on self-avoiding walks of length n), domino tilings (the uniform measure on perfect matchings), the Gaussian Free Field (the canonical random height function), and the FK random-cluster models. For instance, for percolation, although the value of p_c is a lattice-dependent local quantity (see Conjecture 14.11), critical percolation itself should be universal: viewed from far, it should look the same and be conformally invariant on any planar lattice, even though

criticality happens at different densities. However, the existence and conformal invariance of a critical percolation scaling limit has been proved so far only for site percolation on the triangular lattice, by Stas Smirnov (2001); there is some small combinatorial miracle there that makes the proof work. On the other hand, Oded Schramm noticed in 1999 that using the conformal invariance and the Markov property inherent in such random models, many questions can be translated (via the Stochastic Löwner Evolution) to Itô calculus questions driven by a one-dimensional Brownian motion. The methods developed in the last decade are good enough to attack the popular percolation questions regarding critical exponents: for instance, the near-critical percolation probability satisfies θ(p) = (p − p_c)^{5/36+o(1)} as p ↘ p_c (proved for site percolation on the triangular lattice).

On the other hand, Conjecture 12.7 is open for Z^d with 3 ≤ d ≤ 18, and all non-Abelian amenable groups. For Z^d with d ≥ 19, a perturbative Fourier-type expansion method called the Hara-Slade lace expansion works, see [Sla06]. Again, this method is good enough to calculate critical exponents, e.g., θ(p) = (p − p_c)^{1+o(1)} as p ↘ p_c, which are conjecturally shared by many-many transitive graphs, namely all mean-field graphs; this should include Euclidean lattices for d > 6, all non-amenable groups, and probably most groups in between. This is very much open, partly due to the problem that lace expansion works best with Fourier analysis, which does not really exist outside Z^d. The method relates Fourier expansions of percolation (or self-avoiding walk, etc.) quantities to those of simple random walk quantities, and if we know everything about Green's function, we can understand how percolation, etc., behaves. (One paper where it is done entirely in x-space, without Fourier, is [HarvdHS03].) The method also identifies that the scaling limit should be the Integrated Super-Brownian Excursion on R^d, but does not actually prove the existence of any scaling limit. A readable simple introduction to this scaling limit object is [Sla02].
Exercise 12.33. Using Exercise 12.5 or 12.6, show that on a regular tree T_k, k ≥ 3, we have θ(p) ≍ p − p_c as p ↘ p_c.

Mean-field criticality has been proved without lace expansion in some non-amenable cases other than regular trees: for highly non-amenable graphs [PaSN00, Scho01], for groups with infinitely many ends and for planar transitive non-amenable graphs with one end [Scho02], and for a direct product of two trees [Koz11]. For tree-like graphs even p_c can be computed in many cases, using multi-type branching processes [Spa09]. However, for these non-amenable cases it is not clear what the scaling limit should be: it still probably should be the Integrated Super-Brownian Excursion, but on what object? Some sort of scaling limit of Cayley graphs, a bit similarly to how R^d arises from Z^d, is the construction of asymptotic cones; in particular, Mark Sapir conjectures [Sap07] that if two groups have isometric asymptotic cones, then they have the same critical exponents. An issue regarding the use of asymptotic cones in probability theory was pointed out by Itai Benjamini [Ben08]: the asymptotic cone of Z^d is R^d with the ℓ¹-metric, while the scaling limits of interesting stochastic processes are usually rotationally invariant, hence the ℓ²-distance should be more relevant. A possible solution Benjamini suggested was to consider somehow random geodesics in the definition of the asymptotic cone; for instance, a uniform random ℓ¹-geodesic in Z² between (0,0) and (n,n) is very likely to be O(√n)-close to the ℓ² geodesic, the straight diagonal line. However, it is far from clear that this idea would work in other groups.
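The mean-field exponent of Exercise 12.33 can be checked numerically in the simplest case (an added sketch; it works with the rooted binary tree, whose root cluster is a Galton-Watson process with Binomial(2, p) offspring, p_c = 1/2, and the same exponent as T₃): the extinction probability q solves q = (1 − p + pq)², which gives the closed form θ(p) = 1 − ((1−p)/p)² for p > 1/2, and θ(p_c + ε)/ε stays bounded away from 0 and ∞, consistent with θ(p) ≍ p − p_c.

```python
def theta_binary_tree(p, iters=100_000):
    """Survival probability of the root cluster for Ber(p) percolation on the
    rooted binary tree: iterate the extinction-probability recursion
    q -> (1 - p + p*q)**2 from q = 0, which converges to the smallest
    nonnegative fixed point; theta = 1 - q."""
    q = 0.0
    for _ in range(iters):
        q = (1 - p + p * q) ** 2
    return 1 - q

pc = 0.5
# For comparison: closed form theta(p) = 1 - ((1-p)/p)**2 for p > pc.
thetas = {eps: theta_binary_tree(pc + eps) for eps in (0.1, 0.05, 0.025)}
ratios = [thetas[eps] / eps for eps in (0.1, 0.05, 0.025)]
```

As ε ↓ 0 the ratios increase towards the exact slope θ′(p_c+) = 8, rather than diverging or vanishing, i.e., the critical exponent is 1.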

12.4 Geometry and random walks on percolation clusters

When we take an infinite cluster C_∞ at p > p_c(G) on a transitive graph, are the large-scale geometric and random walk properties of G inherited? Of course, no percolation cluster (at p < 1) can be non-amenable, or could satisfy any IP_d, since arbitrarily bad pieces occur because of the randomness. There are relaxed versions of isoperimetric inequalities that are conjectured to survive (known in some cases), but I don't want to discuss them right now. Instead, let me just give some of my favourite conjectures, and see [Pet08] for more details. As we will see in a minute, these questions and ideas are related also to such fundamental questions as Conjecture 12.11 on p_c < p_u above, and Conjecture 14.11 on the locality of p_c below.

Conjecture 12.23 ([BenLS99]).
(i) If G is a transient transitive graph, then all infinite percolation clusters are also transient.
(ii) Similarly, positive speed and zero speed also survive.

Part (i) is known for graphs with only exponentially many minimal cutsets (such as Cayley graphs of finitely presented groups) for p close enough to 1 [Pet08]. Part (ii) and hence also (i) are known for non-amenable Cayley graphs for all p > p_c [BenLS99]. Part (ii) is known also for p close enough to 1 on the lamplighter groups Z₂ ≀ Z^d, d ≥ 3, and for all p > p_c for d ≤ 2 [ChPP04].

Exercise 12.34. * Without consulting [LyPer10], show that any supercritical GW tree (you may assume E[ξ²] < ∞ for the offspring distribution, if needed), conditioned on survival, is a.s. transient.

Conjecture 12.24 (folklore). The giant cluster at p = (1 + ε)/n on the hypercube {0,1}^n has poly(n) mixing time, possibly C n log n.

By the result of [AngB07] mentioned in the last paragraph of Section 12.2, the conjecture is proved for p > n^{−1/2+ε}, for any ε > 0.
A simple discovery I made (in 2003, literally on the margin of a paper by my advisor) was that good isoperimetry inside C_∞ would follow from certain large deviation results saying that it is unlikely for the cluster C_o of the origin to be large but finite. My favourite formulation is the following, called exponential cluster repulsion.

Conjecture 12.25 ([Pet08]). If G is a transitive (unimodular?) graph, then for all p > p_c(G),

    P_p[|C_o| < ∞, but there is an infinite cluster C_∞ with e(C_o, C_∞) > n] < C_p exp(−c_p n),

where e(C₁, C₂) is the number of edges with one endpoint in C₁, another in C₂.

In [Pet08], I proved this result for Z^d, all p > p_c(Z^d), and for any infinite graph with only exponentially many minimal cutsets for p close enough to 1. The result implies that a weaker (the so-called anchored) version of any isoperimetric inequality satisfied by the original graph is still satisfied by any infinite cluster. These anchored isoperimetric inequalities are enough to prove, e.g.,

transience (using Thomassen's criterion (5.2)), therefore, Conjecture 12.25 would imply part (i) of Conjecture 12.23. However, they are only conjecturally strong enough for return probabilities. But, on Z^d, not only the survival of the anchored d-dimensional isoperimetric inequality follows, but also the survival of almost the entire isoperimetric profile (as defined in Section 8.2), hence, using the evolving sets theorem, the return probabilities inside C_∞ are p_n(x, x) ≤ C_d n^{−d/2}. This was known before [Pet08], but this is the simplest proof by far, or at least the most conceptual. A nice result related to Conjecture 12.25 (but very far from the exponential decay) is a theorem by Timár: in a non-amenable unimodular transitive graph, any two infinite clusters touch each other only finitely many times, a.s. [Tim06a]

The reason that Conjecture 12.25 is known on Z^d for p arbitrarily close to p_c is due to a fundamental method of statistical physics (in particular, percolation theory), presently available only on Z^d, called renormalization. See [Pet08] for a short description and [Gri99] for a thorough one. This technique uses that Z^d has a tiling with large boxes such that the tiling graph again looks like Z^d itself. One algebraic reason for this tiling is the subgroup sequence (2^k Z^d)_{k=0}^∞, given by the expanding endomorphism x ↦ 2x, already mentioned in Section 4.4. It is unclear on what groups such tilings are possible:
Question 12.26 ([NekP09]). A scale-invariant tiling of a transitive graph G is a decomposition of its vertex set into finite sets {T_i : i ∈ I} such that (1) the subgraphs induced by these tiles T_i are connected and all isomorphic to each other; (2) the following tiling graph G^1 is isomorphic to G: the vertex set is I, and (i, j) is an edge of G^1 iff there is an edge of G connecting T_i with T_j; (3) for each n ≥ 1, there is such a tiling graph G^{n+1} on G^n, in such a way that the resulting nested sequence of tiles T^n(x) ⊆ G^n containing any fixed vertex x of G exhausts G. Furthermore, G has a strongly scale-invariant tiling if each T^n is isomorphic to T^{n+1}. If G has a scale-invariant tiling, is it necessarily of polynomial growth?

From the expanding endomorphism of Z^d it is trivial to construct a strongly scale-invariant tiling, but this uses the commutativity of the group very much. A harder result proved in [NekP09] is that the Heisenberg group also has Cayley graphs with strongly scale-invariant tilings. Although this geometric property of the existence of scale-invariant tilings looks like a main motivation and ingredient for percolation renormalization, it is actually not that important for the presently existing forms of the method. See [NekP09] for more on this. On the other hand, there is a key probabilistic ingredient missing, whose proof appears to be a huge challenge:

Question 12.27 ([NekP09]). Let G be an amenable transitive graph, and let C∞ be its unique infinite percolation cluster at some p > pc(G), with density θ(p). For a finite vertex set W ⊂ G, let c_i(W) denote the number of vertices in the i-th largest connected component of W. Does there exist a connected Følner sequence F_n ⊂ G such that, for almost all percolation configurations,

lim_{n→∞} c_2(F_n ∩ C∞) / c_1(F_n ∩ C∞) = 0,

moreover,

lim_{n→∞} c_1(F_n ∩ C∞) / |F_n| = lim_{n→∞} |F_n ∩ C∞| / |F_n| = θ(p) ?

Why would this be true? The main idea is that the intersection of the unique percolation cluster with a large Følner set should not fall apart into small pieces, because if two points are connected to each other in the percolation configuration, then the shortest connection should not be very long. This is also the underlying idea why random walk on the infinite cluster should behave similarly to the original graph. Furthermore, it is closely related to Conjecture 12.19 on the uniqueness of the giant cluster in finite transitive graphs, and would clearly imply Oded Schramm's conjecture on the locality of pc in the amenable setting, Conjecture 14.11 below. An affirmative answer would follow from an affirmative answer to the following question, which can be asked for any transitive graph, not only amenable ones. Let ω denote the percolation configuration, and dist_ω(x,y) the chemical distance, i.e., the distance measured inside the percolation clusters (infinite if x and y are not connected).

Question 12.28. Is it true that, for any transitive (unimodular?) graph, for any p > pu(G), there is a K(p) < ∞ such that for any x, y ∈ V(G),

K_{x,y}(p) := E[ dist_ω(x,y) / dist(x,y) | dist_ω(x,y) < ∞ ] ≤ K(p) < ∞,   (12.21)
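The two ratios in Question 12.27 are easy to probe numerically on Z², at least in a finite window. Below is a quick Monte Carlo sketch (not from the notes; the helper name, the box size, the value p = 0.7, and the use of site rather than bond percolation are all arbitrary illustrative choices): in a supercritical box, the second largest open cluster is tiny compared to the largest one.

```python
import random
from collections import deque

def largest_two_clusters(n, p, seed=0):
    """Bernoulli site percolation on an n-by-n box; return the sizes
    c1 >= c2 of the two largest open clusters (c2 = 0 if fewer than two)."""
    rng = random.Random(seed)
    open_site = [[rng.random() < p for _ in range(n)] for _ in range(n)]
    seen = [[False] * n for _ in range(n)]
    sizes = []
    for i in range(n):
        for j in range(n):
            if open_site[i][j] and not seen[i][j]:
                # BFS over the open cluster of (i, j)
                queue, size = deque([(i, j)]), 0
                seen[i][j] = True
                while queue:
                    x, y = queue.popleft()
                    size += 1
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        u, v = x + dx, y + dy
                        if 0 <= u < n and 0 <= v < n and open_site[u][v] and not seen[u][v]:
                            seen[u][v] = True
                            queue.append((u, v))
                sizes.append(size)
    sizes.sort(reverse=True)
    sizes += [0, 0]
    return sizes[0], sizes[1]

c1, c2 = largest_two_clusters(60, 0.7)
print(c1, c2, c2 / c1)
```

On a 60 × 60 box well above criticality, c2/c1 typically comes out as a few percent, in line with the conjectured limit 0.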
moreover, is there a ξ < 1 with K(p) ≤ O(1) (p − pu)^{−ξ+o(1)} as p ↘ pu? The existence of K(p) < ∞ is known on Z^d [AntP96], but the p ↘ pc behaviour is not known. The finiteness K(p) < ∞ might hold for almost all values p > pc, and the exponent ξ < 1 might hold with pu replaced by pc; however, there could exist special values (such as p = pu on certain graphs) with K(p) = ∞. I do not presently see a conceptual reason for ξ < 1 to hold; this condition is there only because we will need it below. On the other hand, I do not know any example where some ξ > 0 is actually needed.

A non-trivial example where one might hope for explicit calculations is site percolation on the hexagonal lattice [Wer07]. If x and y are neighbours in G, then the event that they are connected in ω and the shortest path between them leaves the r-ball but does not leave the 2r-ball around them is comparable to the largest radius of an alternating 4-arm event around them being between r and 2r. (This rough equivalence uses the so-called separation of arms phenomenon.) On the other hand, the outer boundary of a closed cluster of radius roughly r is an open path having length r^{4/3+o(1)} with large probability. (This follows from the dimension being 4/3 and from [GarPS10b].) Together with the 4-arm exponent 5/4 and the obvious observation that P[dist_ω(x,y) < ∞] > c > 0, these imply that

n^{−5/4+o(1)} = c_1 α_4(n) ≤ P[ dist_ω(x,y) > n | dist_ω(x,y) < ∞ ] ≤ c_2 α_4(n^{3/4}) = n^{−15/16+o(1)},

which does not decide the finiteness of the mean at pc, but, using the theory of near-critical percolation and the Antal–Pisztora chemical distance results (which are easy in 2 dimensions), it implies that ξ ≤ 1/9 in (12.21). This value comes from the following facts: on the level of exponents,
ignoring smaller order effects: (1) at p = pc + ε, percolation looks critical up to scales ε^{−4/3} and supercritical at larger scales; (2) the probability that the radius of the 4-arm event is about ε^{−4/3} is (ε^{−4/3})^{−5/4} = ε^{5/3}; and (3) at this scale, the length of the path is at most (ε^{−4/3})^{4/3} = ε^{−16/9}; so, altogether, we get an expectation at most ε^{5/3−16/9} = ε^{−1/9}.

Further examples to try are the groups where critical percolation has the mean-field behaviour: here, whenever the mean-field behaviour is actually proved (such as Z^d, d ≥ 19, or planar non-amenable unimodular transitive graphs), the expected chemical distance at pc between neighbours is known to be finite [KoN09a]. However, this is not known to imply that lim_{p↘pc} K(p) < ∞, because continuity is far from clear here.

As Gady Kozma pointed out to me, a natural candidate to kill Question 12.28 for all p > pc could be percolation near pu on graphs where there is already a unique infinite cluster at pu. The canonical example of such a graph is a non-amenable planar graph G with one end; here pu(G) is equal to 1 − pc(G*), where G* is the planar dual to G. Here, at pu(G), many things can be understood using the mean-field theory at pc(G*), and our back-of-the-envelope arguments, based on [KoN09b], suggest that P_{pu}[ ∞ > dist_ω(x,y) > r ] ≈ 1/r for neighbours x, y. Hence K_{x,y}(pu) = ∞, though barely. So, probably any ξ > 0 would do for p ↘ pu; but, again, proving this seems hard.

Exercise 12.35.*** Is it true that the chemical distance in percolation at pu between neighbours in a planar unimodular transitive non-amenable graph with one end satisfies K_{x,y}(pu) = ∞?

An affirmative answer to Question 12.28 would also prove that pc < pu on non-amenable transitive graphs, via the following unpublished gem of Oded Schramm (which can now be watched in [Per09] or read in [Koz11]):

Theorem 12.29 (O. Schramm). Let G be a transitive unimodular non-amenable graph, with spectral radius ρ < 1. Consider percolation at pc, where we know from Theorem 12.8 that there are only finite clusters. Take a SRW (X_n)_{n=0}^∞ on G, independent from the percolation. Then, in the joint probability space of the percolation and the SRW,

P[ X_0 and X_n are in the same percolation cluster at pc ] ≤ 2 ρ^n.

Proof. Consider the following branching random walk on G. Start m + 1 particles from o, doing independent simple random walks for n steps. Then each particle of this first generation branches into m particles, and all the (m+1)m particles (the second generation) continue with independent SRWs for n more steps, then each particle branches into m particles again, and so on. The expected number of particles at o at the end of the t-th generation, i.e., after tn SRW steps, is p_{tn}(o,o) (m+1) m^{t−1} ≍ ρ^{tn} m^t, where ≍ means equal exponential growth rates. If m < ρ^{−n}, then this expectation decays exponentially fast in t. Since p_{2n}(o,x) ≤ p_{2n}(o,o) for any x ∈ V(G) by Lemma 9.1, we get that, for any fixed radius r > 0, the expected number of total visits to B_r(o) by all the particles is finite, i.e., the branching random walk is transient.

Now, using this branching walk and Ber(pc) percolation on G, we define a bond percolation process ω̂ on the rooted (m+1)-regular tree T_{m+1} that indexes the branching random walk: if v is a child of u in T_{m+1}, then the edge (u, v) will be open in ω̂ if the branching random walk particles corresponding to u and v are at two vertices of G that are connected in the Ber(pc) percolation. So, the probability for any given edge of T_{m+1} to be open in ω̂ is P[X_0 ↔_{pc} X_n], but this is, of course, not a Bernoulli percolation. At first sight, it does not even look invariant, since the tree is rooted and the flow of time for the random walk has a direction. However, using that G is unimodular and SRW is reversible, it can be shown that ω̂ is in fact Aut(T_{m+1})-invariant:

Exercise 12.36. Show that the above percolation process ω̂ on T_{m+1} is automorphism-invariant.

We now have all the ingredients ready for the proof. We know from Theorem 12.8 that we have only finite clusters in Ber(pc). Together with the transience of the branching random walk for m < ρ^{−n}, we get that ω̂ has only finite clusters. So, by Exercise 12.11, the average degree in ω̂ is at most 2. That is, (m+1) P[X_0 ↔_{pc} X_n] ≤ 2. So, taking m := ⌈ρ^{−n}⌉ − 1, we get that P[X_0 ↔_{pc} X_n] ≤ 2/⌈ρ^{−n}⌉ ≤ 2ρ^n, as desired.

The way this theorem is related to pc < pu is the following. (1) The theorem implies that we have a definite exponential decay of connectivity at pc in certain directions; (2) it seems reasonable that we should still have a decay at pc + ε for some small enough ε > 0 (we will explain this in a second); (3) but then pc + ε < pu, since from the FKG-inequality it is clear that at p > pu any two points are connected with a uniformly positive probability: at least with θ(p)².

The chemical distance result would be used in making the second step rigorous. Assuming pc = pu and taking p = pu + ε with small ε > 0, we would have X_0 and X_n in the same cluster with probability at least θ(p)² > 0. On the other hand, their distance in G is at most n, hence, by (12.21) and Markov's inequality,

Pp[ dist_ω(X_0, X_n) > 2K(p) n | dist_ω(X_0, X_n) < ∞ ] < 1/2.

So, when lowering the level of percolation from p to pu, the probability that we have a connection between X_0 and X_n at pu = pc is at least θ(p)²/2 · (1−ε)^{2K(p)n} ≥ exp(−ε^{1−ξ+o(1)} n), which contradicts Theorem 12.29 if ξ < 1 and ε is chosen sufficiently small.

Finally, as pointed out in [Tim06a], good bounds on the chemical distance (see Question 12.28) can imply good cluster repulsion (see Conjecture 12.25) not only through the renormalization method (see Question 12.27), but also directly. The idea is that, for neighbours x, y ∈ V(G), if C_x ≠ C_y, then a large touching set e(C_x, C_y) = {e_i : i ∈ I} would mean that, after inserting the edge {x,y} into the bond percolation configuration, the endpoints of the edges e_i would be connected but with large chemical distances:

Exercise 12.37 (Ádám Timár). Let G be a unimodular transitive graph. Assume that, for some Ber(p) bond percolation, there are infinitely many infinite clusters, and for any neighbours x, y ∈ V(G) we have Ep[ dist_ω(x,y) | C_x = C_y ] < ∞. Show that Ep[ e(C_x, C_y) | C_x ≠ C_y ] < ∞. (Hint: use the idea described above, implemented using a suitable Mass Transport.)
13 Further spatial models

13.1 Ising, Potts, and the FK random cluster models
The Ising and the more general Potts models have a huge literature, partly because they are more important for physics than Bernoulli percolation is. The FK random cluster models, in some sense, form a joint generalization of Potts, percolation, and the Uniform Spanning Trees and Forests, and hence are truly fundamental. Since percolation and the USF already exhibit many of their main features, we will discuss them rather briefly. See [GeHM01] for a great introduction and survey of these models (and further ones, e.g., the hard core model), [Lyo00] specifically for their phase transitions on non-amenable graphs, and [BerKMP05, Sly10] for relationships between phase transitions in spatial, dynamical, and computational complexity behaviour. A future version of our notes will hopefully expand on these phase transitions a bit more.

The Ising model is the most natural site percolation model with correlations between the sites. The Potts(q) model is the obvious generalization with q states for each vertex instead of the Ising case q = 2. For the definition of these models, we first need to focus on finite graphs G(V,E), with a possibly empty subset ∂V ⊆ V of so-called boundary vertices. For spin configurations σ : V → {0, 1, ..., q−1}, consider the Hamiltonian (or energy function)

H(σ) := Σ_{(x,y)∈E(G)} 1{σ(x) ≠ σ(y)},   (13.1)

then fix β ≥ 0 and define the following probability measure (called Gibbs measure) on configurations that agree with some given boundary configuration τ on ∂V:

P^τ_β[σ] := exp(−2β H(σ)) / Z^τ_β, where Z^τ_β := Σ_{σ : σ|∂V = τ} exp(−2β H(σ)).   (13.2)
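On a graph small enough to enumerate all q^|V| configurations, (13.1)–(13.2) can be implemented verbatim. A minimal brute-force sketch (not from the notes; the function name and the example graph are my own choices):

```python
import itertools
import math

def gibbs_measure(vertices, edges, boundary, tau, q, beta):
    """Brute-force the Potts(q) Gibbs measure (13.2) on a tiny graph:
    H(sigma) = #{edges with disagreeing endpoints}, weight exp(-2*beta*H),
    restricted to configurations agreeing with tau on the boundary vertices."""
    weights = {}
    for sigma in itertools.product(range(q), repeat=len(vertices)):
        if any(sigma[v] != tau[v] for v in boundary):
            continue  # does not match the boundary configuration
        H = sum(1 for (x, y) in edges if sigma[x] != sigma[y])
        weights[sigma] = math.exp(-2 * beta * H)
    Z = sum(weights.values())  # the partition function Z^tau_beta
    return {s: w / Z for s, w in weights.items()}, Z

# A path on 4 vertices, boundary spin fixed at vertex 0.
mu, Z = gibbs_measure(range(4), [(0, 1), (1, 2), (2, 3)], {0}, {0: 1}, q=2, beta=0.8)
print(sum(mu.values()))  # probabilities sum to 1
```

At beta = 0 the same code returns the uniform measure on the 2³ configurations compatible with the boundary, matching the "thermal noise takes over" description below.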
(The reason for the factor 2 in the exponent will become clear in the next paragraph.) This Z^τ_β is called the partition function. The more disagreements between spins there are, the larger the Hamiltonian and the smaller the probability of the configuration is. The interpretation of β is the inverse temperature 1/T: at large β, disagreements are punished more, the system prefers order, while at small β the system does not care that much: thermal noise takes over. In particular, β = 0 gives the uniform measure on all configurations.

A natural extension is to add an external field with which spins like to agree. From now on, we will focus on the Ising case, q = 2, and therefore switch to the usual Ising setup σ : V(G) → {−1, +1}, to be interpreted as − and + magnetic spins, instead of σ(x) ∈ {0, 1, ..., q−1}. So, we will consider the Hamiltonian

H(σ, h) := −h Σ_{x∈V(G)} σ(x) − Σ_{(x,y)∈E(G)} σ(x) σ(y),   (13.3)

where h > 0 means spins like to be +1, while h < 0 means they like to be −1. Note that H(σ, 0) = 2H(σ) − |E(G)| with the definition of (13.1), hence, if we now define the corresponding measure and

partition function without the factor 2, as

P^τ_{β,h}[σ] := exp(−β H(σ,h)) / Z^τ_{β,h}, where Z^τ_{β,h} := Σ_{σ : σ|∂V = τ} exp(−β H(σ,h)),   (13.4)

then exp(−β H(σ,0)) = K exp(−2β H(σ)) with a constant K, and hence P^τ_{β,0}[σ] = P^τ_β[σ].
A further generalization is where edges (x,y) and vertices x have their own coupling strengths J_{x,y} and J_x, instead of the constant 1 and h, respectively. If J_{x,y} > 0 for all (x,y) ∈ E(G), i.e., neighbouring spins like to agree, the model is called ferromagnetic, while if J_{x,y} < 0 for all (x,y), then it is called antiferromagnetic. But we will stick to (13.3).
If this is the first time the Reader sees a model like this, they might think that the way of defining the measure P^τ_{β,h} from the Hamiltonian was a bit arbitrary, and that Z^τ_{β,h} is just a boring normalization factor. However, these are not at all true. First of all, since we defined the measure P^τ_{β,h} using a product over edges of G, it clearly satisfies the spatial Markov property, or in other words, it is a Markov random field: if U ⊆ V(G) and we set ∂V = V \ U, then, for any boundary condition τ on ∂V, the measure P^τ_{β,h} on U is already determined by τ|_{∂_out U}. In fact, the Hammersley–Clifford theorem says that, for any graph G(V,E), the Markov random fields satisfying the positive energy condition are exactly the Gibbs random fields: measures that are given by an exponential of a Hamiltonian that is a sum over cliques (complete subgraphs) of G. See [Gri10, Section 7.2]. The role of the positive energy condition will be clear from the following example. A colouring of the vertices of a graph with q colours is called proper if no neighbours share their colours. The uniform distribution on proper q-colourings of a graph can be considered as the zero temperature antiferromagnetic Potts(q) model, and it is certainly a Markov field. But it does not have the finite energy property, and it is not a Gibbs measure, only in a certain limit.

Another good reason for defining the measure from the Hamiltonian via exp(−βH) is that, for any given energy level E ∈ R, among all probability measures μ on {−1,+1}^{V(G)} that satisfy the boundary condition and have E_μ[H(σ)] = E, our Gibbs measures P^τ_{β,h} maximize the entropy. This is probably due to Boltzmann, proved using Lagrange multipliers. See, e.g., [CovT06, Section 12.1]. So, if we accept the Second Law of Thermodynamics, then we are forced to consider these Gibbs measures.

Similarly to generating functions in combinatorics, the partition function contains a lot of information about the model. The first signs of this are the following:

Exercise 13.1.
(a) Show that the expected total energy is E^τ_{β,h}[H] = −∂_β ln Z^τ_{β,h}, with variance Var^τ_{β,h}[H] = −∂_β E^τ_{β,h}[H].
(b) The average free energy is defined by f(β,h) := (−β|V|)^{−1} ln Z^τ_{β,h}. Show that, for the total average magnetization M(σ) := |V|^{−1} Σ_{x∈V} σ(x), we have E^τ_{β,h}[M] = −∂_h f(β,h).
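Exercise 13.1(a) can be sanity-checked by brute force on a tiny graph, comparing the expectation of H directly against a numerical derivative of ln Z. A sketch with arbitrarily chosen graph and parameters (not part of the notes):

```python
import itertools
import math

def log_Z(beta, h, edges, n):
    """ln Z_{beta,h} for the Ising Hamiltonian (13.3) on n vertices, no boundary."""
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        H = -h * sum(sigma) - sum(sigma[x] * sigma[y] for x, y in edges)
        total += math.exp(-beta * H)
    return math.log(total)

def expected_H(beta, h, edges, n):
    """E[H] under the Gibbs measure, by direct enumeration."""
    num = den = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        H = -h * sum(sigma) - sum(sigma[x] * sigma[y] for x, y in edges)
        w = math.exp(-beta * H)
        num += H * w
        den += w
    return num / den

edges, n, beta, h, eps = [(0, 1), (1, 2), (2, 0)], 3, 0.7, 0.3, 1e-6
lhs = expected_H(beta, h, edges, n)
# central finite difference for -d/d(beta) ln Z
rhs = -(log_Z(beta + eps, h, edges, n) - log_Z(beta - eps, h, edges, n)) / (2 * eps)
print(abs(lhs - rhs))  # tiny: the identity E[H] = -d(ln Z)/d(beta) holds
```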
So far, we have been talking about the Ising (and Potts) model on finite graphs only. As opposed to Bernoulli percolation, it is not obvious what the measure on an infinite graph should be. We certainly want any infinite volume measure to satisfy the spatial Markov property, and, for any finite U ⊂ V(G) and boundary condition τ on ∂_out U, the distribution should follow (13.4). Note that this requirement already suggests how to construct such measures: if P_{β,h} is such a Markov field on the infinite graph G(V,E), and V_n ↗ V is an exhaustion by finite connected subsets, with induced subgraphs G_n(V_n, E_n), and τ_n are random boundary conditions on ∂_out V_n sampled according to P_{β,h}, then the laws E[P^{τ_n}_{β,h}] (with the expectation taken over τ_n) will converge weakly to P_{β,h}. Vice versa, any weak limit point of a sequence of measures P^{τ_n}_{β,h} will be a suitable Markov field. That
limit points exist is clear from the Banach–Alaoglu theorem, since {−1,+1}^V or {0,1,...,q−1}^V with the product topology is compact. Therefore, the question is: what is the set of (the convex combinations of) the limit points given by finite exhaustions? Is there only one limit point, or several? In particular, do different boundary conditions have an effect even in the limit? Intuitively, a larger β increases correlations and hence helps the effect travel farther, while β = 0 is just the product measure, hence there are no correlations and there is a single limit measure. Is there a phase transition in β, i.e., a non-trivial critical β_c ∈ (0, ∞)? This certainly seems easier when h = 0, since setting h > 0, say, affects every single spin directly, which probably overrides the effect of even a completely negative boundary condition, at least for amenable graphs. In the case h = 0, do we expect a phase transition for all larger-than-one-dimensional graphs, as in percolation? Actually, is this question of a phase transition in the correlation decay related in any way to the connectivity phase transition in percolation?

Before starting to answer these questions, let us prove the FKG inequality for the Ising model, as promised in Section 12.1:

Theorem 13.1. Take the Ising model on any finite graph G(V,E), with any boundary condition τ on ∂V ⊆ V, and consider the natural partial order on the configuration space {−1,+1}^V. Then, any two increasing events A and B are positively correlated: P^τ_{β,h}[A | B] ≥ P^τ_{β,h}[A].

Proof. Recall that a probability measure μ on a poset (P, ≥) is said to stochastically dominate another probability measure ν if μ(A) ≥ ν(A) for any increasing measurable set A ⊆ P. Strassen's theorem says that this domination is equivalent to the existence of a monotone coupling φ of the measures μ and ν on P × P; i.e., a coupling such that x ≥ y for φ-almost every pair (x,y) ∈ P × P. The simplest possible example, which we will actually need in a minute, is the following: for P = {−1,+1}, we have μ ≥ ν iff μ(+1) ≥ ν(+1), and in this case,

φ(+1,+1) := ν(+1), φ(+1,−1) := μ(+1) − ν(+1), φ(−1,−1) := μ(−1)   (13.5)

is the unique monotone coupling φ. (In general, Strassen's coupling need not be unique. This is one reason for the issues around Exercise 11.3.)

Now, the FKG inequality says that P[· | B] stochastically dominates P[·]. We will prove this using the heat-bath dynamics or Gibbs sampler, which is the most natural Markov chain with the Ising model at a given temperature as stationary measure: each vertex has an independent exponential clock of rate 1, and when a clock rings, the spin at that vertex is updated according to the Ising measure conditioned on the current spins of the neighbours. One can easily check that for β < ∞ the chain is reversible. This is an example of Glauber dynamics, which is the class of Markov chains that use independent local updates and keep Ising stationary; another example is the Metropolis algorithm, which we do not define here.

We will run two Markov chains, {X_i^+}_{i≥0} and {X_i^−}_{i≥0}, started from the all-plus and all-minus configurations on V \ ∂V, respectively, and fixed to equal τ on ∂V forever. {X_i^−}_{i≥0} is just standard heat-bath dynamics, with stationary measure P^τ_{β,h}, while {X_i^+}_{i≥0} is a modified heat-bath dynamics, where steps in which a +1 would change into a −1 such that the resulting configuration would cease to satisfy B are simply suppressed (or in other words, replaced by a +1 update). A simple general claim about reversible Markov chains (immediate from the electric network representation) is that the stationary measure of the latter chain is simply the stationary measure of the original chain conditioned on the event that we are keeping: P^τ_{β,h}[· | B]. Now, we are not running these two Markov chains independently, but coupled in the following way: the clocks on the vertices ring at the same time in the two chains, and we couple the updates such that X_i^+ ≥ X_i^− is maintained for all i ≥ 0. Why is this possible? If X_i^+ ≥ X_i^− and the clock of a vertex v rings, then v has at least as many +1 neighbours in X_i^+ as in X_i^−, hence the probability of the outcome +1 in a standard heat-bath dynamics update would be at least as big for X_i^+ as for X_i^−, and it is even more so if we take into account that some of the −1 moves in {X_i^+}_{i≥0} are suppressed. Hence, by example (13.5), we can still have X_{i+1}^+ ≥ X_{i+1}^− after the update.

The Markov chains {X_i^+}_{i≥0} and {X_i^−}_{i≥0} are clearly ergodic, converging to their stationary distributions. Since X_i^+ ≥ X_i^− holds in the coupling for all i, this stochastic domination also holds for the stationary measures, and we are done.

An immediate corollary is that if τ ≥ τ′ on ∂V, then P^τ_{β,h} ≥ P^{τ′}_{β,h}. In particular, if U
⊆ U′ ⊆ V(G), and τ and τ′ are the all-plus configurations on ∂_out U and ∂_out U′, respectively, then P^τ_{β,h} ≥ P^{τ′}_{β,h} on U. Consider now an exhaustion V_n ↗ V(G) with the all-plus configurations τ_n^+ on ∂_out V_n, giving rise to the monotone decreasing (w.r.t. stochastic domination) sequence of measures P^{τ_n^+}_{β,h}. Any weak limit point of this sequence dominates all other possible limits, given by any sequence V_n ↗ V(G) and any τ_n on ∂_out V_n, since for any V_n there exists an m_0(n) such that V_n ⊆ V_m for all m ≥ m_0(n). Therefore (see Exercise 13.2), this limit point for (V_n, τ_n^+) must be unique, and cannot depend even on the exhaustion. It will be denoted by P^+_{β,h}. Similarly, there is a unique minimal measure, denoted by P^−_{β,h}.

Exercise 13.2.
(a) Show that if μ and ν are probability measures on a finite poset (P, ≥), and both μ ≥ ν and ν ≥ μ, then μ = ν. Conclude that if two infinite volume limits of Ising measures on an infinite graph dominate each other, then they are equal.
(b) Let P = {−1,+1}^V with coordinatewise ordering. Show that if μ ≥ ν on P, and μ|_x = ν|_x for all x ∈ V (i.e., all the marginals coincide), then μ = ν. (Hint: use Strassen's coupling.)
(c) On any transitive infinite graph, the limit measures P^+_{β,h} and P^−_{β,h} are translation invariant. They are equal iff E^+_{β,h}[σ(x)] = E^−_{β,h}[σ(x)] for one or any x ∈ V(G).
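The monotone coupling used in the proof of Theorem 13.1 is easy to observe in simulation: run two heat-bath chains from the all-plus and all-minus starts with shared update randomness, and X_i^+ ≥ X_i^− never fails. A discrete-time sketch (not from the notes; it replaces the exponential clocks by uniformly random vertex choices, takes h = 0 with no boundary, and couples the two unconditioned chains, which already exhibits the monotonicity):

```python
import math
import random

def coupled_heat_bath(edges, n, beta, steps, seed=1):
    """Heat-bath (Glauber) dynamics for the h=0 Ising model from the all-plus
    and all-minus starts, with shared vertex choices and shared uniforms;
    checks that the plus chain stays coordinatewise above the minus chain."""
    nbrs = {v: [] for v in range(n)}
    for x, y in edges:
        nbrs[x].append(y)
        nbrs[y].append(x)
    rng = random.Random(seed)
    plus, minus = [1] * n, [-1] * n
    for _ in range(steps):
        v = rng.randrange(n)   # discrete-time stand-in for the Poisson clocks
        u = rng.random()       # shared uniform => monotone update, as in (13.5)
        for chain in (plus, minus):
            s = sum(chain[w] for w in nbrs[v])
            p_plus = math.exp(beta * s) / (math.exp(beta * s) + math.exp(-beta * s))
            chain[v] = 1 if u < p_plus else -1
        assert all(a >= b for a, b in zip(plus, minus))  # X+ >= X- maintained
    return plus, minus

# 4-cycle; the internal assert verifies the domination at every step
plus, minus = coupled_heat_bath([(0, 1), (1, 2), (2, 3), (3, 0)], 4, beta=0.5, steps=2000)
print(plus, minus)
```

The key point mirrors the proof: with plus ≥ minus, the vertex v has at least as many +1 neighbours in the plus chain, so its conditional probability of +1 is larger, and using the same uniform u for both chains preserves the order.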
Exercise 13.3. On any transitive infinite graph, the limit measures P^+_{β,h} and P^−_{β,h} are ergodic.

In summary, at given β and h, all infinite volume measures are sandwiched between P^−_{β,h} and P^+_{β,h}. So, to answer the question of the uniqueness of infinite volume measures, we just need to decide if P^−_{β,h} = P^+_{β,h}. Ernst Ising proved in 1924 in his PhD thesis that the Ising model on Z has no phase transition: there is a unique infinite volume limit for any given h ∈ R and β ≥ 0. Based on this, he guessed that there is no phase transition in any dimension. However, he turned out to be wrong: using a variant of the contour method that we saw in the elementary percolation result 1/3 ≤ pc(Z², bond) ≤ 2/3, Rudolf Peierls showed in 1936 that for h = 0 there is a phase transition on Z^d, d ≥ 2. More precisely, he proved the existence of some values 0 < β_−(d) ≤ β_+(d) < ∞ such that, if τ_n is the all-plus spin configuration on ∂_out [−n,n]^d ⊂ Z^d, then

lim inf_{n→∞} E^{τ_n}_{β,0}[σ(0)] > 0   for β > β_+,
lim_{n→∞} E^{τ_n}_{β,0}[σ(0)] = 0   for β < β_−.   (13.6)

In particular, for β > β_+(d), there are at least two ergodic translation-invariant infinite volume measures on Z^d, while for β < β_−(d) there is uniqueness (see Exercise 13.2 (c)). We will prove (13.6) directly from 1/3 ≤ pc(Z², bond) ≤ 2/3 once we have defined the FK random cluster measures. A similar result holds for the Potts(q) models, as well.
In 1944, in one the most fundamental works of statistical mechanics, Lars Onsager showed (employing partly non-rigorous math, with the gaps lled in during the next few decades) that 1 ln(1 + 2) 0.440687 for h = 0: for c , there is a unique innite volume measure, c (Z2 ) = 2

n while there is non-uniqueness for > c . He also computed critical exponents like E c [ (0)] =

n1/8+o(1) .

= 0.440687

= 0.45

Figure 13.1: The Ising model with Dobrushin boundary conditions (black on the right side of the box and white on the left), at the critical and a slightly supercritical inverse temperature.

152

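For reference, the Onsager critical point quoted above can be evaluated directly (a trivial numerical check, not part of the notes):

```python
import math

# beta_c(Z^2) = (1/2) ln(1 + sqrt(2)), Onsager's critical inverse temperature
beta_c = 0.5 * math.log(1 + math.sqrt(2))
print(beta_c)  # 0.44068679...
```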
Onsager proved his results by looking at the partition function: the critical points need to occur at the singularities of the limiting average free energy f (, h) = lim|Vn | ( |Vn |)1 log Z,h , and the critical behaviour must be encoded in the analytic properties of the singularities. Why is this so? From Exercise 13.1, we can see that a big change of the free energy corresponds to big changes of quantities like the total energy and the average magnetization. These quantities are global, involving the entire nite domain Vn , not just a xed window inside the domain, far from the boundary, hence it is not immediately clear that local quantities that behave well under taking weak limits, like E[ (0) ] or even the average magnetization in the innite limit measure, will also have interesting changes in their behaviour. Nevertheless, using that Zd is amenable, the contribution of the boundary to these global quantities turns out to be negligible, and it can be proved that the singularities of f (, h) describe the critical points. That there is no phase transition on Zd for h = 0 was rst proved in 1972 by Lebowitz and Martinz |S | ai,j lie on the unit circle.

L of and by Ruelle, using the Lee-Yang circle theorem: if A = (ai,j )n i,j =1 is a real symmetric matrix, then all the roots of the polynomial P (z ) := Therefore, the Ising partition function Z,h on any nite graph, as a polynomial in h, can have roots only at purely imaginary values of h. This can be used to prove that, for any h = 0, any limit f (, h) is dierentiable in . This works for any innite graph G(V, E ). A dierent approach (by Preston in 1974) is to use the so-called GHS concavity inequalities [GHS70] to prove the same dierentiability. However, the connection between dierentiability and uniqueness works only on amenable transitive graphs: Jonasson and Steif proved that a transitive graph is nonamenable iff
there is an h = 0 such that P+ ,h = P,h for some < . See [JoS99] and the references there for S [n] iS,j S

pointers to the above discussion.

We have hinted a couple of times at the Ising correlation and uniqueness of measure questions being analogous to the phase transition in the existence of innite clusters in Bernoulli percolation. Indeed, correlations between Ising spins can be interpreted as connectivity in a dierent model, the FK(p, q ) random cluster model with q = 2. Here is the denition of the model, due to Fortuin and Kasteleyn [ForK72]; see [Gri06] for a thorough treatment of the model, including the history. disjoint subsets, for any E , let P FK(p,q) [ ] := On a nite graph G(V, E ), with a so-called boundary V V together with a partition of it into p|| (1 p)|E \| q k () ZFK( p,q)
with ZFK( p,q) := E

p|| (1 p)|E \| q k () ,

(13.7) {e.FK}

where k ( ) is the number of clusters of / , i.e., the clusters given by in the graph where the vertices in each part of are collapsed into a single vertex. we punish a larger number of components more and more, so we get a single spanning cluster in the limit, and then if we let p 0, then we will have as few edges as possible, but otherwise all congurations will have the same probability; that is, we recover the Uniform Spanning Tree, see Section 11.2. If G(V, E ) is innite, and Vn V is an exhaustion by nite subsets, then taking
in Vn := gives FUSF in the limit, while taking Vn := V Vn with n := {Vn } gives WUSF.

The q = 1 case is clearly just Bernoulli(p) bond percolation. Furthermore, if we let q 0, then

153

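On a tiny graph, (13.7) can be computed by enumerating all 2^|E| configurations. The sketch below (my own illustration, with empty boundary, so k is the plain cluster count) also checks that q = 1 reduces to Bernoulli(p):

```python
import itertools

def fk_measure(edges, n, p, q):
    """Brute-force the FK(p, q) random cluster measure (13.7) on a tiny graph
    with empty boundary: weight p^|w| (1-p)^|E\\w| q^k(w)."""
    def num_clusters(omega):
        # union-find on the n vertices, using only the open edges
        parent = list(range(n))
        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a
        for x, y in omega:
            parent[find(x)] = find(y)
        return len({find(v) for v in range(n)})
    weights = {}
    for r in range(len(edges) + 1):
        for omega in itertools.combinations(edges, r):
            w = p ** len(omega) * (1 - p) ** (len(edges) - len(omega)) * q ** num_clusters(omega)
            weights[omega] = w
    Z = sum(weights.values())
    return {om: w / Z for om, w in weights.items()}

tri = [(0, 1), (1, 2), (2, 0)]
mu1 = fk_measure(tri, 3, p=0.4, q=1)
# for q = 1 the cluster term is constant, so each configuration has its
# plain Bernoulli probability
print(abs(mu1[((0, 1),)] - 0.4 * 0.6 * 0.6) < 1e-12)
```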
was introduced somewhat implicitly in [SwW87] and explicitly in [EdS88]. Given G(V, E ) and a boundary partition , color each cluster of / in the FK(p, q ) model independently with one of q colors, then forget , just look at the q -coloring of the vertices. We will prove in a second condition on V , instead of a function : V {0, 1, . . . , q 1}, will be only a partition telling which spins in V have to agree with each other. Note that, in the resulting Potts(q ) model, the
1 that we indeed get the Potts(q ) model, with = (p) = 2 ln(1 p), except that the boundary

For q {2, 3, . . . }, we can retrieve the Potts(q ) model via the Edwards-Sokal coupling, which

correlation between the spins (x) and (y ) is exactly the probability P FK(p,q) [xy ], since if they are connected, they get the same color in the coupling, and if they are not, then their colors will be independent. In particular, the Ising expectation E+ (p),2 [ (0)] is the probability of the connection to get the Ising + measure, we need to condition the measure to have the +1 spin on [n, n]d , correlations as connections also shows that it is more than natural that a larger p gives a higher
1 ln(1 p). in the formula (p) = 2

{0 [n, n]d} in the wired FK measure FK(p, 2) on [n, n]d . (In the Edwards-Sokal coupling, but, by symmetry, that does not change the probability of the connection.) The interpretation of

Let us denote the Edwards-Sokal coupling of the FK(p, q ) conguration and the q -coloring P ES(p,q) [, ] = p|| (1 p)|E \|
(x,y ) ZFK( p,q)

of its clusters by 1{(x)=(y)} , (13.8) {e.ES}

where the formula is clear from the facts that for any given , the number of compatible q -colorings is q k () , and that we need to arrive at (13.7) after summing over these colorings . So, we need to prove that the marginal on is the Potts(q ) model with = (p) and boundary partition . Fix a q -coloring ; we may assume that it is compatible with (i.e., (x) = (y ) for all (x, y ) ), otherwise with P ES(p,q) [, ]

= 0 come in pairs: (x, y ) can be kept in or deleted, leaving other edges intact.

P ES(p,q) [, ] = 0. If (x, y ) E and (x) = (y ), then the congurations

On the other hand, if (x) = (y ), then (x, y ) whenever P ES(p,q) [, ] = 0, i.e., such an edge (x, y ) always contributes a factor 1 p to the probability of the conguration. Summarizing, P ES(p,q) [, ] =

1 ZFK( p,q) 1
ZFK( p,q)

(x,y)E (x)=(y)

(p + 1 p)

ary partition , and with normalization ZFK(p,q) instead of Z,q , but since we are talking about probability measures, these normalizations actually have to agree:
ZFK( p,q) = Z (p),q .

using the choice (1 p) = exp(2 ). This is exactly the Potts(q )-measure (13.2), just with bound-

exp 2

(x,y)E \ (x)=(y)

(1 p)

(x,y )

1{(x)=(y)}

(x,y )E \

1 { ( x ) = ( y ) }

(x,y )

1{(x)=(y)} ,

(13.9) {e.PottsFKpart}

This concludes the verication of the Edwards-Sokal coupling.

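The marginal computation (13.9) can be verified numerically on a small example: summing the Edwards–Sokal weights (13.8) over ω for each coloring σ gives exactly the Potts(q) weights at β = −(1/2) ln(1−p). A brute-force sketch with empty boundary (function names and parameters are mine):

```python
import itertools
import math

def es_sigma_marginal(edges, n, p, q):
    """Sum the Edwards-Sokal weights (13.8) over omega for each coloring sigma
    (empty boundary), returning the normalized marginal on colorings."""
    marg = {}
    for sigma in itertools.product(range(q), repeat=n):
        total = 0.0
        for r in range(len(edges) + 1):
            for omega in itertools.combinations(edges, r):
                # only omegas whose open edges have monochromatic endpoints count
                if all(sigma[x] == sigma[y] for x, y in omega):
                    total += p ** len(omega) * (1 - p) ** (len(edges) - len(omega))
        marg[sigma] = total
    Z = sum(marg.values())
    return {s: w / Z for s, w in marg.items()}

edges, n, p, q = [(0, 1), (1, 2), (2, 0)], 3, 0.5, 3
beta = -0.5 * math.log(1 - p)       # the coupling beta(p) = -(1/2) ln(1-p)
marg = es_sigma_marginal(edges, n, p, q)
potts = {s: math.exp(-2 * beta * sum(1 for x, y in edges if s[x] != s[y])) for s in marg}
Zp = sum(potts.values())
ok = all(abs(marg[s] - potts[s] / Zp) < 1e-9 for s in marg)
print(ok)
```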
154

The FK(p,q) model satisfies the FKG-inequality for q ≥ 1. First of all, why is q ≥ 1 important? Well, for an edge e = (x,y) ∈ E,

P^π_FK(p,q)[ e ∈ ω | ω∣_{E∖{e}} ] = p, if {x ↔ y} in ω∣_{E∖{e}}, and p/(p + (1−p)q) otherwise;    (13.10)

i.e., an existing connection increases the probability of an edge iff q > 1. Having noted this, the proof of the FKG inequality becomes very similar to the Ising case, Theorem 13.1: consider the FK heat-bath dynamics with independent exponential clocks on the edges, and updates following (13.10). For q ≥ 1, this dynamics {ω_i}_{i≥0} is attractive in the sense that if ω_i ≤ ω̃_i in the natural partial order on {0,1}^E, then P[e ∈ ω_{i+1} | ω_i] ≤ P[e ∈ ω̃_{i+1} | ω̃_i], and hence we can maintain the monotone coupling of the proof of Theorem 13.1 for all i ≥ 0, proving the FKG inequality.
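Formula (13.10) itself can be checked by exact enumeration on a triangle; the script below (an illustration of mine, with free boundary) computes the conditional law of one edge given the states of the other two, in both the connected and the disconnected case:

```python
from itertools import product

def k(omega, nverts=3):
    # Number of connected components of the subgraph omega (tiny union-find).
    parent = list(range(nverts))
    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a
    for (x, y) in omega:
        parent[find(x)] = find(y)
    return len({find(v) for v in range(nverts)})

E = [(0, 1), (1, 2), (0, 2)]
p, q = 0.3, 3.0

def weight(state):  # state[i] = 1 iff edge E[i] is open
    omega = [e for e, s in zip(E, state) if s]
    return p**len(omega) * (1-p)**(len(E) - len(omega)) * q**k(omega)

for s2, s3 in product((0, 1), repeat=2):
    w_open, w_closed = weight((1, s2, s3)), weight((0, s2, s3))
    cond = w_open / (w_open + w_closed)
    # Endpoints 0,1 of E[0] are connected off this edge iff both others are open:
    expected = p if (s2 and s3) else p / (p + (1-p)*q)
    assert abs(cond - expected) < 1e-12
print("conditional edge probabilities match (13.10)")
```

The factor q^{k(ω)} only matters when closing the edge changes the number of clusters, which is exactly the disconnected case.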

For q < 1, there should be negative correlations, but this is proved only for the UST, which is a determinantal process. This is an open problem that has been bugging quite a few people for quite a long time; see [BorBL09] for recent results on negative correlations.

Exercise 13.4. For the FK(p,q) model on any finite graph G(V,E) with boundary ∂V ⊆ V, show the following two types of stochastic domination:

(a) If π ≤ π′ on ∂V, then P^π_FK(p,q) ≤ P^{π′}_FK(p,q) on ω.

(b) Given any π on ∂V, if p ≤ p′, then P^π_FK(p,q) ≤ P^π_FK(p′,q) on ω.

(c) Conclude for the + limit Ising measure on any infinite graph G(V,E) that if E^+_{β,h}[σ(x)] > 0 for some x ∈ V, then the same holds for any β′ > β. Consequently, the uniqueness of the Ising limit measures is monotone in β.

Exercise 13.5. Show a third type of stochastic domination for the FK(p,q) model on a finite graph: if p ∈ (0,1) and 1 ≤ q′ ≤ q, then P^π_FK(p,q) ≤ P^π_FK(p,q′).

Similarly to the Ising model, the FKG inequality implies that the fully wired boundary condition (where π has just one part, ∂V) dominates all other boundary conditions, and the free boundary (where π consists of singletons) is dominated by all other conditions. Therefore, in any infinite graph, the limits of free and wired FK measures along any finite exhaustion exist and are unique, denoted by FFK(p,q) and WFK(p,q). On a transitive graph, they are translation invariant and ergodic. However, regarding limit measures, there is an important difference compared to the Ising model. In formulating the spatial Markov property over finite domains U ⊂ V(G) for an infinite volume measure, the boundary conditioning is on all the connections in V∖U (just like in (13.10)), which is not as local as it was for the Potts(q) model. Therefore, it is not clear that any infinite volume limit measure actually satisfies the spatial Markov property, nor that any Markov measure is a convex combination of limit measures. In fact, although these statements are expected to hold, they have not been proved; see [Gri06, Chapter 4] for more information. Nevertheless, it is at least known that all Markov measures and all infinite volume measures are sandwiched between FFK(p,q) and WFK(p,q) in terms of stochastic domination.

Since we are interested in connections in the FK model, it is natural to define the critical point pc(q) in any infinite volume limit measure as the infimum of p values with an infinite cluster.

Fortunately, on any amenable transitive graph, there is actually only one pc(q), independently of which limit measure is taken, because of the following argument, see [Gri06]. Using convexity, any of the limiting average free energy functions has only a countable number of singularities in p, which implies that, for any q, there is only a countable number of p values where FFK(p,q) ≠ WFK(p,q). It is clear that p^F_c(q) ≥ p^W_c(q), but if this were a strict inequality, then the free and wired measures would differ for the entire interval p ∈ (p^W_c(q), p^F_c(q)), contradicting countability.

We can easily show that 0 < pc(q) < 1 on Z^d, for any d ≥ 2 and q ≥ 1. The key is to notice that (13.10) implies that P^π_FK(p,q) stochastically dominates Bernoulli(p̃) bond percolation for p̃ = p/(p + (1−p)q), and is stochastically dominated by Bernoulli(p) bond percolation, for any π, and then the claim follows from 0 < pc(Z^d) < 1 in Bernoulli percolation. The proof of (13.6) is also clear now: as we noticed above, the Ising expectation E^+_{β(p),2}[σ(0)] is exactly the probability of the connection {0 ↔ ∂[−n,n]^d} in the wired FK measure FK(p,2) on [−n,n]^d. But this probability is bounded from above and below by the probabilities of the same event in Bernoulli(p) and Bernoulli(p̃) percolation, respectively, and we are done.

The random cluster model also provides us with an explanation where Onsager's value βc(Z²) = (1/2) ln(1+√2) ≈ 0.440687 comes from. Recall that the percolation critical values pc(Z², bond) = 1/2 and pc(TG, site) = 1/2 came from planar self-duality, plus two main probabilistic ingredients: RSW bounds proved using the FKG-inequality, and the Margulis-Russo formula; see Theorem 12.22. So, to start with: is there some planar self-duality in FK(p,q)? Consider the planar dual ω* to a configuration ω on a box with free boundary, say: we get a configuration on a box with wired boundary. See Figure 13.2. What is the law of ω*?

Figure 13.2: The planar dual of an FK configuration on Z².

Clearly, |ω*| = |E| − |ω|, and k(ω*) equals the number of faces of ω (which is 2 in the figure). So, by Euler's formula, |V| − |ω| + k(ω*) = 1 + k(ω). If we now let y := p/(1−p), then

P_FK(p,q)[ω] ∝ y^{|ω|} q^{k(ω)} ∝ y^{−|ω*|} q^{k(ω*) + |ω*|} ∝ (q/y)^{|ω*|} q^{k(ω*)}.

Well, this is a random cluster model for ω*, with the same q and y* = q/y! Or, in terms of p and p*, we get

p p* / ((1−p)(1−p*)) = q.

Therefore, p = p_sd(q) = √q / (1+√q) is the self-dual point on Z²: the planar dual of the free measure becomes the wired measure at the same value p_sd. Just as in the case of percolation with p = 1/2, one naturally expects that this is also the critical point pc(q). This was proved for all q ≥ 1 only recently, by Vincent Beffara and Hugo Duminil-Copin [BefDC10]. Substituting q = 2 into β(p) = −(1/2) ln(1−p) gives Onsager's value.
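Both the self-duality computation and the resulting Onsager value can be verified numerically; here is a quick check (an illustration of mine, with helper names of my own choosing):

```python
import math

def p_dual(p, q):
    # Solve p p* / ((1-p)(1-p*)) = q for p*, via y* = q/y with y = p/(1-p).
    y_star = q * (1 - p) / p
    return y_star / (1 + y_star)

q = 2.0
p_sd = math.sqrt(q) / (1 + math.sqrt(q))    # claimed self-dual point
assert abs(p_dual(p_sd, q) - p_sd) < 1e-12  # indeed a fixed point of duality

beta_c = -0.5 * math.log(1 - p_sd)          # beta(p) = -1/2 ln(1-p)
assert abs(beta_c - 0.5 * math.log(1 + math.sqrt(2))) < 1e-12
print(round(beta_c, 6))  # 0.440687
```

So p_sd(2) = √2/(1+√2) ≈ 0.5858 indeed yields Onsager's (1/2) ln(1+√2).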

The main obstacle Beffara and Duminil-Copin needed to overcome was that the previously known RSW proofs for percolation (e.g., the one we presented in Proposition 12.21) do not work in the presence of dependencies: an exploration path has two sides, one having positive, the other having negative influence on increasing events, so we cannot use FKG. Nevertheless, they found an only slightly more complicated argument, exploring crossings and gluing them, where the information revealed can be easily compared with symmetric domains with free-wired-free-wired boundary conditions, and hence one can use symmetry, just as in the percolation case. However, here comes another issue: the dual of the free measure is wired, so, what measure should we work in to have exact self-duality and the symmetries needed? The solution is to use periodic boundary conditions, i.e., to work on a large torus, and draw the rectangle that we want to cross inside there. Finally, the replacement for the sharp threshold results for product measures via the Margulis-Russo formula or the BKKKL theorem (see (12.9) and Theorem 12.17) is provided by [GraG06].

Finally, what about the uniqueness of infinite volume measures for FK(p,q)? On Z², uniqueness is known for q = 2 and all p. Non-uniqueness at a single pc(q) is expected for q > 4, and proved for q > 25.72. See [DumCS11, Proposition 3.10] and [Gri06]. Some further results and questions on the FK model will be mentioned in Subsection 14.2.

13.2 Bootstrap percolation and zero temperature Glauber dynamics

Bootstrap percolation on an arbitrary graph has a Ber(p) initial configuration, and a deterministic spreading rule with a fixed parameter k: if a vacant site has at least k occupied neighbors at a certain time step, then it becomes occupied in the next step. Complete occupation is the event that every vertex becomes occupied during the process. The main problem is to determine the critical probability p(G,k) for complete occupation: the infimum of the initial probabilities p that make P_p[complete occupation] > 0.

Exercise 13.6 ([vEnt87]). Show that p(Z², 2) = 0 and p(Z², 3) = 1. (Hint for k = 2: show that a single large enough completely occupied box (a seed) has a positive chance to occupy everything.)

Exercise 13.7 ([Scho92]). * Show that p(Z^d, k) = 0 for k ≤ d and = 1 for k ≥ d+1. (Hint: use the d = 2 idea, the threshold result Exercise 12.26, and induction on the dimension.)

Exercise 13.8 ([BalPP06]). *

(a) Show that the 3-regular tree has p(T3, 2) = 1/2. More generally, show that for 2 ≤ k ≤ d, p(T_{d+1}, k) is the supremum of all p for which the equation

P[ Binom(d, (1−x)(1−p)) ≤ d−k ] = x

has a real root x ∈ (0,1).

(b) Deduce from part (a) that for any constant γ ∈ [0,1] and a sequence of integers k_d with lim_{d→∞} k_d/d = γ,

lim_{d→∞} p(T_d, k_d) = γ.
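Part (a) can be explored numerically. The sketch below (my own, assuming the reconstructed form of the fixed-point equation above) scans for a root x ∈ (0,1) of P[Binom(d, (1−x)(1−p)) ≤ d−k] = x in the case d = 2, k = 2, and finds roots exactly for p < 1/2, consistent with p(T3, 2) = 1/2:

```python
import math

def binom_cdf(d, m, prob):
    # P[Binom(d, prob) <= m]
    return sum(math.comb(d, j) * prob**j * (1 - prob)**(d - j)
               for j in range(m + 1))

def has_root(p, d=2, k=2, grid=10000):
    # Look for a sign change of f(x) = P[Binom(d,(1-x)(1-p)) <= d-k] - x
    # on an interior grid of (0,1).
    f = lambda x: binom_cdf(d, d - k, (1 - x) * (1 - p)) - x
    xs = [i / grid for i in range(1, grid)]
    return any(f(a) * f(b) <= 0 for a, b in zip(xs, xs[1:]))

assert has_root(0.4) and has_root(0.49)
assert not has_root(0.51) and not has_root(0.6)
print("interior root exists exactly for p < 1/2, matching p(T_3, 2) = 1/2")
```

For d = 2, k = 2 the equation is quadratic, with roots x = 1 and x = p²/(1−p)², so the interior root disappears exactly at p = 1/2.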

The previous exercises show a clear difference between the behaviour of the critical probability on Z^d and T_d: on Z^d, once k is small enough so that there are no local obstacles that obviously make p(Z^d, k) = 1, we already have p(Z^d, k) = 0. So, one can ask the usual question:

Question 13.2. Is a group amenable if and only if for any finite generating set, the resulting r-regular Cayley graph has p(G_r, k) ∈ {0, 1} for any k-neighbor rule?

The answer is known to be affirmative for symmetric generating sets of Z^d and for any finitely generated non-amenable group that contains a free subgroup on two elements.

Exercise 13.9. *** Find the truth for at least one more group (that is not a finite extension of Z^d, of course).

See [Scho92, BalPP06, BalP07, Hol07, BalBM10, BalBDCM10] and the references there for more on this model, both on infinite and finite graphs.

Bootstrap percolation results have often been applied to the study of the zero temperature Glauber dynamics of the Ising model. This Glauber dynamics was already defined in Section 13.1, but the zero temperature case can actually be described without mentioning the Ising model at all. Given a locally finite infinite graph G(V,E) with an initial spin configuration σ0 ∈ {+,−}^V, the dynamics is that each site has an independent Poissonian clock, and if the clock of some site x ∈ V rings at some time t > 0, then σt(x) becomes the majority of the spins of the neighbours of x, or, if there is an equal number of neighbours in each state, then the new state of x is chosen uniformly at random. Now let px(G) be the infimum of p values for which this dynamics started from a Ber(p) initial configuration of + spins fixates at + (i.e., the spin of each site x will be + from a finite time T(x) onwards) almost surely.

From the symmetry of the two competing colours, it is clear that px(G) ≥ 1/2. For what graphs is it equal to 1/2? It is non-trivial to prove that px(Z) = 1, see [Arr83].
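The cooperative nature of the k = 2 rule on Z² (cf. the hint of Exercise 13.6) is easy to see in a toy simulation of my own: even the bare diagonal of an n × n box, a set of density 1/n, spreads to occupy everything.

```python
def bootstrap(occupied, n, k=2):
    # Synchronous k-neighbour bootstrap percolation on the n x n grid:
    # a vacant site with >= k occupied neighbours becomes occupied.
    occ = set(occupied)
    while True:
        new = {(x, y)
               for x in range(n) for y in range(n)
               if (x, y) not in occ
               and sum((x + dx, y + dy) in occ
                       for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))) >= k}
        if not new:
            return occ
        occ |= new

n = 30
final = bootstrap({(i, i) for i in range(n)}, n)
assert len(final) == n * n   # the diagonal seed occupies the whole box
print("diagonal of length", n, "fills the", n, "x", n, "box")
```

The occupied region grows as a widening band around the diagonal, one layer per step, which is the same rectangle-growing mechanism behind the seed argument.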

Question 13.3. Is it true that px(Z^d) = 1/2 for all d ≥ 2? And px(T_d) = 1/2 for d ≥ 4?

Here is what is known. Using very refined knowledge of bootstrap percolation, [Mor10] proved that lim_{d→∞} px(Z^d) = 1/2. On the other hand, px(T3) > 1/2 [How00]; the reason is that a density 1/2 for the − phase is just barely subcritical for producing a bi-infinite path, while the + phase is just barely supercritical, so in a short time in the dynamics, bi-infinite − paths will form (somewhere in the huge non-amenable tree), making + fixation impossible.

Exercise 13.10. * Show that lim_{d→∞} px(T_d) = 1/2. (This follows from [CapM06], but here is a hint for a simpler proof, coming from Rob Morris. Exercise 13.8 (b) says that for p > 1/2 + ε, if d is large enough, then the probability of everything becoming − in the d/2-neighbour bootstrap for the − spins is zero. Prove that, moreover, the probability that an initially + site ever becomes − is tending to 0, as d → ∞. This implies that the probability that a given vertex fixates at + is tending to 1. But then, the majority of the neighbours of any given vertex fixate at +, fixing the state of that vertex, as well.)

It is also interesting what happens at the critical density. By definition, non-fixation can happen in two ways: either some sites fixate at + while some other sites fixate at −, or every site changes its state infinitely often. For p = 1/2 on Z², it is known that every site changes its state infinitely often [NaNS00], while, for p = 1/2 on the hexagonal lattice, some sites fixate at + while all other sites fixate at − [HowN03]. The reason for the difference is the odd degree on the hexagonal lattice. Of course, these results do not imply that px = 1/2 on these graphs. What happens on T_d for d ≥ 4 at p = 1/2 is not known, either.

13.3 Minimal Spanning Forests

Our main references here will be [LyPS06] and [LyPer10]. While the Uniform Spanning Forests are related to random walks and harmonic functions, the Minimal Spanning Forests are related to percolation. The Minimal Spanning Tree (MST) on a finite graph is constructed by taking i.i.d. Unif[0,1] labels U(e) on the edges, then taking the spanning tree with the minimal sum of labels. Note that this is naturally coupled to Ber(p) bond percolation for all p ∈ [0,1] at once.

Exercise 13.11. Give a finite graph on which MST ≠ UST with positive probability.

For an infinite graph G, we again have two options: we can try to take the weak limit of the MST along any finite exhaustion, with free or wired boundary conditions. The limiting measures can be directly constructed. For any e ∈ E(G), define

ZF(e) := inf_γ max{ U(f) : f ∈ γ },

where the infimum is taken over paths γ in G∖{e} that connect the endpoints of e, and define

ZW(e) := inf_γ sup{ U(f) : f ∈ γ },

where the infimum is taken over generalized paths γ in G∖{e} that connect the endpoints of e, i.e., γ can also be a disjoint union of two half-infinite paths, one emanating from each endpoint of e. Then, the Free and Wired Minimal Spanning Forests are

FMSF := {e : U(e) ≤ ZF(e)}   and   WMSF := {e : U(e) ≤ ZW(e)}.

The connection between WMSF and critical percolation becomes clear through invasion percolation. For a vertex v and the labels {U(e)}, let T0 = {v}; then, inductively, given Tn, let T_{n+1} = Tn ∪ {e_{n+1}}, where e_{n+1} is the edge in the edge boundary ∂_E Tn with the smallest label U. The Invasion Tree of v is then IT(v) := ∪_{n≥0} Tn. We now have the following deterministic result:

Exercise 13.12. Prove that if U : E(G) → R is an injective labelling of a locally finite graph, then WMSF = ∪_{v∈V(G)} IT(v).

Once the invasion tree enters an infinite p-percolation cluster C^p := {e : U(e) ≤ p}, it will not use edges outside it. Furthermore, it is not surprising (though non-trivial to prove, see [HäPS99]) that for any transitive graph G and any p > pc(G), the invasion tree eventually enters an infinite p-cluster. Therefore, lim sup_n U(e_{n+1}) = pc for any v ∈ V(G). This already suggests that invasion percolation is a self-organized criticality version of critical percolation.

Exercise 13.13. ** Show that for G transitive amenable, θ(pc) = 0 is equivalent to IT(v) having density zero, measured along any Følner exhaustion of G.

Non-percolation at pc also has an interpretation as the smallness of the WMSF trees, which we state without a proof:

Theorem 13.4 ([LyPS06]). On a transitive unimodular graph, θ(pc) = 0 implies that a.s. each tree of WMSF has one end.

By Benjamini-Lyons-Peres-Schramm's Theorem 12.8, in the non-amenable case this gives that each tree of WMSF has one end. For the amenable case, the following two exercises, combined with the fact (obvious from Exercise 13.12) that all the trees of WMSF are infinite, almost give this one end result:

Exercise 13.14. Show that for any invariant spanning forest F on a transitive amenable G, the expected degree is at most 2. Moreover, if all trees of F are infinite a.s., then the expected degree is exactly 2.

Exercise 13.15. Show that for any invariant spanning forest F on a transitive amenable G, it is not possible that a.s. all trees of F have at least 3 ends. Moreover, if all the trees are infinite, then each has 1 or 2 ends. (Hint: use the previous exercise.)

Exercise 13.16. *** Show that all the trees in the WMSF on any transitive graph have one end almost surely.

On the other hand, FMSF is more related to percolation at pu. On a transitive unimodular graph, each tree in it can intersect at most one infinite cluster of pu-percolation in the standard coupling; moreover, if pu > pc, then each tree intersects exactly one infinite pu-cluster. Furthermore, adding an independent Ber(ε) bond percolation to FMSF makes it connected. Finally, we have the following:



Theorem 13.5 ([LyPS06]). On any connected graph G, we have a.s. at most one infinite cluster for almost all p ∈ [0,1] if and only if FMSF = WMSF a.s. In particular, for transitive unimodular graphs, pc = pu is equivalent to FMSF = WMSF a.s.

Proof. Since G is countable, WMSF ≠ FMSF is equivalent to having some e ∈ E(G) with the property that P[ ZW(e) < U(e) ≤ ZF(e) ] > 0. This probability equals the probability of the event A(e) that the two endpoints of e are in different infinite clusters of U(e)-percolation on G∖{e}. If this is positive, then, by the independence of U(e) from other edges, there is a positive Lebesgue measure set of possibilities for this U(e), and for these percolation parameters we clearly have at least two infinite clusters with positive probability. Conversely, if there is a positive measure set of p values for which there are more than one infinite p-clusters with positive probability, then, by insertion tolerance, there is an edge e whose endpoints are in different infinite p-clusters with positive probability, and hence, by the independence of U(e) from other edges, there is a positive probability for A(e).

There are many open questions here; see [LyPS06] or [LyPer10]. For instance, must the number of trees in the FMSF and the WMSF in a transitive graph be either 1 or ∞ a.s.? In which Z^d is the MSF a single tree? The answer is yes for d = 2, using planarity. On the other hand, for d ≥ 19, where the lace expansion shows θ(pc) = 0 and many other results, one definitely expects infinitely many trees. Regarding the critical dimension, where the change from one to infinity happens, some contradictory conjectures have been made: d = 6 [JaR09] and d = 8 [NewS96]. It might also be that both answers are right in some sense: for Z^d, the critical dimension might be 8, but for the scaling limit of the MSF, which is a spanning tree of R^d in a well-defined sense (very roughly, for any finite collection of points there is a tree connecting them, with some natural compatibility relations between the different trees), the critical dimension might be 6 [AiBNW99]: in d = 7, say, there could be quite long connections between some points of Z^d, escaping to infinity in the scaling limit. Finally, on the triangular grid, the scaling limit of a version of the MST (adapted to site percolation) is known to exist, is rotationally and scale invariant, but is conjectured not to be conformally invariant [GarPS10a], a behaviour that goes against the physicists' rule of thumb about conformal invariance.

13.4 Measurable group theory and orbit equivalence

Consider a (right) action x → g(x) = x^g of a discrete group Γ on some probability space (X, B, µ) by measure-preserving transformations. We will usually assume that the action is ergodic (i.e., if U ⊆ X satisfies g(U) = U for all g ∈ Γ, then µ(U) ∈ {0,1}) and essentially free (i.e., µ{x ∈ X : g(x) = x} = 0 for any g ≠ 1). These conditions are satisfied for the natural translation action on bond or site percolation configurations under an ergodic probability measure, say Ber(p) percolation: ω^g(h) := ω(gh), as we did in the second proof of Corollary 12.16. This action of a group Γ on {0,1}^Γ, or S^Γ with a countable S, or [0,1]^Γ, with the product of Ber(p) or {p_s}_{s∈S} or Leb[0,1] measures, is usually called a Bernoulli shift.

The obvious notion for probability measure preserving (abbreviated p.m.p.) actions of some groups Γi (i = 1, 2) on some (Xi, Bi, µi) being the same is that there is a group isomorphism φ : Γ1 → Γ2 and a measure-preserving map Φ : X1 → X2 such that Φ(x^g) = Φ(x)^{φ(g)} for almost all x ∈ X1. There is a famous theorem of Ornstein and Weiss [OW87] that two Bernoulli shifts of a given f.g. amenable group (in most cases) are equivalent in this sense iff the entropies (generalizing −∑_{s∈S} p_s log p_s from the case of Z-actions suitably) are equal. We will consider here a cruder

equivalence relation, which is the natural measure-theoretical analogue of the virtual isomorphism of groups, exactly as quasi-isometry was the geometric analogue; see Exercises 3.7 and 3.8.

Definition 13.1. Two f.g. groups, Γ1 and Γ2, are measure equivalent if they admit commuting (not necessarily probability) measure preserving essentially free actions on some measure space (X, B, µ), each with a positive finite measure fundamental domain.

Another natural notion is the following:

Definition 13.2. Two p.m.p. actions, Γi acting on (Xi, Bi, µi) for i = 1, 2, are called orbit equivalent if there is a measure-preserving map Φ : X1 → X2 such that Φ(x^{Γ1}) = Φ(x)^{Γ2} for almost all x ∈ X1. (Note that this is indeed an equivalence relation.)

A small relaxation is that the actions are stably orbit equivalent: for each i = 1, 2 there exists a Borel subset Yi ⊆ Xi that meets each orbit of Γi, and there is a measure-scaling isomorphism Φ : Y1 → Y2 such that Φ(x^{Γ1} ∩ Y1) = Φ(x)^{Γ2} ∩ Y2 for a.e. x ∈ X1. Now, the point is that two f.g. groups are measure equivalent iff they admit stably orbit equivalent actions. The proof, similar to Exercise 3.7, can be found in [Gab02], which is a great introduction to orbit equivalence. Of course, if the two actions are equivalent in the usual sense, then they are also orbit equivalent. But this notion is much more flexible, as shown, for instance, by the following actions of Z and Z².

Consider the set X = {0,1}^N of infinite binary sequences, with the Ber(1/2) product measure. The adding machine action of Z on X is defined by the following recursive rule: for the generator a of Z, and any w ∈ X,

(0w)^a = 1w,    (1w)^a = 0 w^a.    (13.11)

Note that this definition can also be made for finite sequences w, and the actions on the starting finite segments of an infinite word are compatible with each other, hence the definition indeed makes sense for infinite words. For a finite word w = w0 w1 ... wk, if we write v(w) := ∑_{i=0}^{k} w_i 2^i, then this action has the interpretation of adding 1 in binary expansion, v(w^a) = v(w) + 1, hence the name. (One thing to be careful about is that for a finite sequence w of all 1s this v-interpretation breaks down: one needs to add at least one zero at the end of w to get it right.) The action of Z on X is clearly measure-preserving.
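The recursion (13.11) and its odometer interpretation are easy to test on finite words (a small script of my own; lowest-order bit first, with the all-1s wrap-around made explicit):

```python
def act(w):
    # Adding machine (13.11) on a finite 0-1 word:
    # (0w)^a = 1w,  (1w)^a = 0 w^a;  the all-1s word wraps around to all 0s.
    if not w:
        return []
    if w[0] == 0:
        return [1] + w[1:]
    return [0] + act(w[1:])

def val(w):  # binary value sum w_i 2^i
    return sum(b << i for i, b in enumerate(w))

k = 10
for m in range(2**k):
    w = [(m >> i) & 1 for i in range(k)]
    # adding 1 in binary, modulo 2^k because of the wrap-around:
    assert val(act(w)) == (val(w) + 1) % 2**k
print("the adding machine is the +1 odometer mod 2^k")
```

On infinite words the recursion only ever modifies a finite prefix (up to the first 0), which is why the definition is compatible with the finite truncations.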


We can similarly define an action of Z² on X × X = {0,1}^{N⊔N}, simply doing the Z-action coordinate-wise. Now, these two actions are orbit-equivalent. Why? The orbit of a word w ∈ X under the Z action is the set of all words with the same tail as w. (Note that we can apply a or a^{−1} only finitely many times.) Similarly, the orbit of some (w, w′) ∈ X × X is the set of all pairs with each coordinate having the correct tail. Now, the interlacing map Φ : X × X → X defined by Φ(w0 w1 ..., w′0 w′1 ...) = w0 w′0 w1 w′1 ... is clearly measure-preserving and establishes an orbit-equivalence.

An extreme generalization of the previous example is another famous result of Ornstein and Weiss from 1980, using some of the machinery of the entropy/isomorphism theorem, see [KecM04]: any two ergodic free p.m.p. actions of amenable groups are orbit equivalent to each other. A closely related probabilistic statement is the following:

Exercise 13.17. Show that a Cayley graph G(Γ, S) is amenable iff it has a Γ-invariant random spanning Z subgraph. (Hint: for one direction, produce the invariant Z using coarser and coarser quasi-tilings that come from Exercise 12.14; for the other direction, produce an invariant mean from the invariant Z.)

On the other hand, as a combination of the work of several people, see [Gab10, Section 10], a recent result is that any non-amenable group has continuum many orbit-inequivalent actions. A key ingredient is the following theorem, which has a percolation-theoretical proof, to be discussed in a future version of these notes:

Theorem 13.6 ([GabL09]). For any non-amenable countable group Γ, the orbit equivalence relation of the Bernoulli shift action on ([0,1]^Γ, Leb^Γ) contains a subrelation generated by a free ergodic p.m.p. action of the free group F2.

In other words, the orbits of the Bernoulli shift can be decomposed into the orbits of an ergodic F2 action. Or more probabilistically, there is an invariant spanning forest of 4-regular trees on Γ that is a factor of i.i.d. Unif[0,1] labels on the vertices, and the trees are indistinguishable. A nice interpretation of this result is that any non-amenable group has an F2 randosubgroup, in the following sense. Consider the set S_{G,H} := {f : G → H, f(1) = 1}. The group G acts on this set by

f^g(t) := f(g)^{−1} f(gt);    (13.12)

it is easy to check that this is indeed an action from the right. A group-homomorphism is just a fixed point of this G-action. A randomorphism is a G-invariant probability distribution on S_{G,H}. So, G is a randosubgroup of H if there is a G-invariant distribution on injective maps in S_{G,H}. (It cannot be called a random subgroup, since it is not a distribution on actual subgroups.)

Exercise 13.18. A randosubgroup of an amenable group is also amenable.

Now, what is the connection to orbit equivalence? If the orbit equivalence relation on X generated by G is a subrelation of the one on Y generated by H, i.e., there is a p.m.p. map Φ : X → Y such that Φ(x^G) ⊆ Φ(x)^H for almost all x ∈ X, then for a.e. x ∈ X and every g ∈ G there is an

α(x, g) ∈ H such that Φ(x^g) = Φ(x)^{α(x,g)}. Moreover, if the H-action is free, then this α(x, g) is uniquely determined, and it satisfies the so-called cocycle equation

α(x, gh) = α(x, g) α(x^g, h).    (13.13)

By writing αx(g) := α(x, g), for a.e. x ∈ X we get a map αx ∈ S_{G,H}. Now, what is the action of G on such elements of S_{G,H}? By (13.12), α^g_x(t) = αx(g)^{−1} αx(gt), which, by (13.13), is αx(g)^{−1} αx(g) α_{x^g}(t) = α_{x^g}(t). So, we have a G-equivariant action on the set {αx : x ∈ X}, and if we take a random point x ∈ X w.r.t. the G-invariant probability measure µ, then we get a G-invariant measure on {αx : x ∈ X}, i.e., a randoembedding of G into H.

Given that all amenable groups are measure equivalent, in order to distinguish non-amenable groups from each other, one needs some non-trivial invariants. The ℓ²-Betti numbers and the cost of groups mentioned in Chapter 11 are such examples. A future version of these notes will hopefully discuss them in a bit more detail, but see [Gab02, Gab10] for now. Well, let's try.

Given a probability space (X, B, µ), a graphing on X is simply a measurable oriented graph: a countable set of edges Φ = {φi : Ai → Bi}_{i∈I}, which are measure-preserving isomorphisms between measurable subsets Ai and Bi. The cost of a graphing, which we can also think of as the average out-degree of a random vertex in X, is

cost(Φ) := ∑_{i∈I} µ(Ai) = ∫_X ∑_{i∈I} 1_{Ai}(x) dµ(x).

Any graphing Φ generates a measurable equivalence relation R_Φ on X: the equivalence classes are the connected components of Φ̄, which is the graph obtained from Φ by forgetting the orientations. The cost of an equivalence relation R ⊆ X × X is

cost(R) := inf{ cost(Φ) : Φ generates R }.

A usual way of obtaining a graphing is to consider the Schreier graph of a p.m.p. action of a group Γ on X, with Φ = {φi : X → X}_{i∈I} given by a generating set {γi}_{i∈I} of Γ. Then the equivalence relation generated by Φ is the orbit equivalence relation of the action. Now, the cost of a group is defined as

cost(Γ) := inf{ cost(R) : R is the orbit equiv. rel. of some free p.m.p. action of Γ on some (X, µ) }.    (13.14)

A group is said to have a fixed price if the orbit equivalence relations of all free p.m.p. actions have the same cost, and it is not known whether all groups have a fixed price. Here is a more tangible definition of the cost of a group for probabilists:

cost(Γ) = (1/2) inf{ E_µ[deg_ω(o)] : µ is the law of a Γ-invariant random connected spanning graph ω on Γ }.    (13.15)

How is this the same cost as before? A Γ-invariant random graph is a probability measure µ on Ω = {0,1}^{Γ×Γ} that is concentrated on symmetric functions ω and is invariant under the diagonal action of Γ. A corresponding graphing (which may be called the cluster graphing) is the following.

Fix an element γ ∈ Γ, then let ω, ω′ ∈ Ω be connected by φγ if ω′ = ω^γ and the edge from o to γo is open in the graph ω. The domain Aγ of this measurable edge consists of those ω in which the edge (o, γo) is open, hence the cost of this graphing (measured in µ) is the µ-expected out-degree of o, or one half of the expected total degree. (Note here that φ_{γ^{−1}} is an edge from ω^γ back to ω, since (ω^γ)^{γ^{−1}} = ω, and the edge (o, γ^{−1}o) is open in ω^γ iff (γo, o) is open in ω.) This is a sub-graphing of the full graphing, in which ω, ω′ are connected by φγ iff ω′ = ω^γ, and which is just the Schreier graphing of the action of Γ on Ω. Clearly, the cluster graphing µ-almost surely generates the orbit equivalence relation of the action iff µ-almost surely the graph ω spans the entire Γ. Thus, (13.15) is indeed the infimum of some costs; however, we considered here only some special p.m.p. actions (Ω, µ), and only some special graphings generating the corresponding orbit equivalence relations. And we do not even get that (13.15) is at least as big as (13.14), because some of these actions on (Ω, µ) might not be essentially free. But all these issues can be easily solved: If (X, µ) is a free p.m.p. action, and Φ is a graphing generating its orbit equivalence relation, then for each x ∈ X we can consider ω_x ∈ Ω = {0,1}^{Γ×Γ} given by: (g, h) is an open edge in ω_x iff (x^g, x^h) is an edge in Φ. Then the pushforward of µ under x → ω_x gives an invariant spanning graph on Γ, and E[deg o]/2 = cost(Φ). This shows that (13.15) is not larger than (13.14). For the other direction, given an invariant random spanning graph on Γ, we can produce a free p.m.p. action by taking a direct product of (Ω, µ) with a fixed free action (say, a Bernoulli shift) (Y, ν), and considering the natural extension of the cluster graphing that we had before, to Ω × Y. For the details, see [KecM04, Proposition 29.5].

The notion of cost resembles the rank (i.e., the minimal number of generators) d(Γ) of a group. Here is an explicit formulation of this idea, which has several nice applications, for hyperbolic groups, etc.; see [AbN07].

Let Γ = Γ0 ≥ Γ1 ≥ Γ2 ≥ ... be a chain of finite index subgroups. Corresponding to such a subgroup sequence, the right coset tree T has the root Γ = Γ0, and a coset Γ_{n+1} y is a child of Γ_n x if Γ_{n+1} y ⊆ Γ_n x. The number of children of Γ_n x is [Γ_n : Γ_{n+1}]. The set of rays ξ = Γ0 x0 ⊇ Γ1 x1 ⊇ Γ2 x2 ⊇ ... in T is the boundary ∂T of the tree, equipped with the usual metrizable topology. If we have normal subgroups, Γ_n ⊲ Γ, then ∂T can be equipped with a group structure, and is called the profinite completion of Γ with respect to the series (Γ_n)_{n≥0}; see e.g. [Wil98].

Let us now assume the so-called Farber condition on the sequence (Γ_n)_{n≥0}: the natural action of Γ on the boundary ∂T(Γ, (Γ_n)) of the coset tree, with the natural Borel probability measure, is essentially free. The rank gradient of the chain is defined by

RG(Γ, (Γ_n)) := lim_{n→∞} (d(Γ_n) − 1) / [Γ : Γ_n].    (13.16)

Exercise 13.19. *

(a) Show that the Farber condition is satisfied if each Γ_n is normal in Γ and ∩_n Γ_n = {1}.


(b) Show that the limit in the definition of RG always exists.

(c) Show that, for the free group on k generators, RG(F_k, (Γ_n)) = k − 1, regardless of (Γ_n).

Theorem 13.7 ([AbN07]). Let R denote the orbit equivalence relation of the Γ action on ∂T(Γ, (Γ_n)). Then RG(Γ, (Γ_n)) = cost(R) − 1.

We can also obtain a random rooted graph from a graphing: pick a random root x ∈ X according to µ, and take its connected component Φ̄(x) in Φ̄. This random rooted graph will be unimodular (with a definition more general than what we gave before, which applies to non-regular graphs): by the φi's in Φ being measure-preserving, the equivalence relation R_Φ is also measure-preserving, i.e., for any measurable F : X × X → R, we have

∫_X ∑_{y ∈ R[x]} F(x, y) dµ(x) = ∫_X ∑_{y ∈ R[x]} F(y, x) dµ(x),

where R[x] = Φ̄(x) is the equivalence class or connected component of x. Now, this can be taken as the definition of the Mass Transport Principle, hence unimodularity, for the random rooted graph Φ̄(x). See also Definition 14.1 and the MTP (14.1) in Section 14.1.

14 Local approximations to Cayley graphs

For many probabilistic models, it is easy to think that understanding the model in a large box of Zd is basically the same as understanding it on the infinite lattice. Okay, sometimes the finite problem is actually harder (for instance, compare Conjecture 12.19 with Lemma 12.3), but they are certainly closely related. Why exactly is this so? In what sense do the boxes [n]^d converge to Zd?

14.1 Unimodularity and soficity

A sequence of finite graphs Gn is said to converge to a transitive graph G in the Benjamini-Schramm sense [BenS01] (also called local weak convergence [AldS04]) if for any ε > 0 and r ∈ N+ there is an n0(ε, r) such that for all n > n0, at least a (1 − ε)-proportion of the vertices of Gn have an r-neighbourhood isomorphic to the r-ball of G.

For instance, the cubes {1, . . . , n}^d converge to Zd. On the other hand, if we take the balls Bn(o) in the d-regular tree Td, then the proportion of leaves in Bn(o) converges to (d − 2)/(d − 1) as n → ∞, and more generally, the proportion of vertices at distance k ∈ N from the set of leaves (i.e., on the (n − k)th level L_{n−k}) converges to p_k := (d − 2)/(d − 1)^{k+1}. And, for a vertex in L_{n−k}, the sequence of its r-neighbourhoods in Bn(o) (for r = 1, 2, . . . ) is not at all the same as in an infinite regular tree, and depends on the value of k. Therefore, the limit of the balls Bn(o) is certainly not Td, or any other transitive graph. More generally, we have the following exercise:
Exercise 14.1. Show that a transitive graph G has a sequence Gn of subgraphs converging to it in the local weak sense iff it is amenable.
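Returning to the balls Bn(o): the level proportions p_k come from elementary counting, and can be checked numerically. A minimal sketch (the choices d = 3 and n = 30 are arbitrary):

```python
def level_sizes(d, n):
    """Vertex counts of the levels of the ball B_n(o) in the d-regular tree:
    the root has d children, every other internal vertex has d-1 children."""
    sizes = [1]  # depth 0: the root
    for j in range(1, n + 1):
        sizes.append(d * (d - 1) ** (j - 1))
    return sizes

d, n = 3, 30
sizes = level_sizes(d, n)
total = sum(sizes)
for k in range(4):
    # vertices at distance k from the leaves sit at depth n - k
    proportion = sizes[n - k] / total
    p_k = (d - 2) / (d - 1) ** (k + 1)
    assert abs(proportion - p_k) < 1e-6
```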


The sequence of balls in a regular tree does not converge to any transitive graph, but there is still a meaningful limit structure, a random rooted graph. Namely, generalizing our previous definition, we say that a sequence of finite graphs Gn converges in the local weak sense to a probability distribution on rooted bounded degree graphs (G, ρ), where ρ ∈ V(G) is the root, if for any r ∈ N, taking a uniform random root ρn ∈ V(Gn), the distribution we get on the r-neighbourhoods around ρn in Gn converges to the distribution of r-neighbourhoods around ρ in G. There is a further obvious generalization, for graphs whose edges and/or vertices are labelled by elements of some finite set, or more generally, of some compact metric space: two such labelled rooted graphs are close if the graphs agree in a large neighbourhood of the root and all the corresponding labels are close to each other. These structures are usually called random rooted networks. So, for instance, we can talk about the local weak convergence of edge-labeled digraphs (i.e., directed graphs) to Cayley diagrams of infinite groups (see Definition 2.3).

Continuing the previous example, the balls in Td converge to the random rooted graph depicted on Figure 14.1, which is a fixed infinite tree T̃d with infinitely many leaves (denoted by level L0) and one end, together with a random root ρ that is on level L_k with probability p_k = (d − 2)/(d − 1)^{k+1}, the weights that we computed above. It does not matter how this probability is distributed among the vertices of the level; say, all of it could be given to a single vertex. The reason for this is that the vertices on a given level lie in a single orbit of Aut(T̃d), and our notion of convergence looks only at isomorphism classes of r-neighbourhoods. So, in fact, the right abstract definition for our limiting random rooted graph is a Borel probability measure on rooted isomorphism classes of rooted graphs, denoted by G•, equipped with the obvious topology.

[Figure 14.1: The tree T̃d with a random root (now on level L1), which is the local weak limit of the balls in the d-regular tree Td, for d = 3. The levels L0, L1, L2, L3, . . . carry root probabilities p_0 = (d−2)/(d−1), p_1 = (d−2)/(d−1)², p_2 = (d−2)/(d−1)³, p_3 = (d−2)/(d−1)⁴, . . .]

It is clear how the probabilities p_k for the root in T̃d arise from the sequence of finite balls in Td. However, there is also a tempting interpretation in terms of T̃d itself: if ρ was chosen uniformly at random among all vertices of T̃d, then it should have probability p_k to be on level L_k, since evidently there are p_k / p_{k+1} = d − 1 times more vertices on level L_k than on L_{k+1}. Now, even though with the counting measure on the vertices this argument does not make sense, there are ways to make it work: one should be reminded of the definition of unimodularity in Section 12.1, just after Theorem 12.8. In fact, one can make the following more general definition:

Definition 14.1. (a) Given a d-regular random rooted graph (or network) (G, ρ) sampled from a measure μ, choose a neighbour of ρ uniformly at random, call it ρ′, and consider the joint distribution of (G, ρ, ρ′) on G••, which is the set of all (double-rooted) isomorphism-classes of triples (G, x, y), where G is a bounded degree graph and x, y ∈ V(G) are neighbours, equipped with the natural topology. Now take the step to ρ′ and look back, i.e., take (G, ρ′, ρ), which is again a random rooted graph, with root ρ′, plus a neighbour ρ. If the two laws are the same, then the random rooted graph (G, ρ), or rather the measure μ, is called unimodular.

(b) For non-regular random graphs, if the degrees are μ-a.s. bounded by some d, one can add d − deg(x) half-loops to each vertex x to make the graph d-regular. In other words, in the above definition of the step from ρ to ρ′, consider the delayed random walk that stays put with probability (d − deg(x))/d. (This is a natural definition for random subgraphs of a given transitive graph.)

(c) If the degrees are not bounded, one still can take the following limiting procedure: take a large d, add the half-loops to each vertex with degree less than d to get G_d, and then require that the total variation distance between (G_d, ρ, ρ′) and (G_d, ρ′, ρ), divided by P[ρ′ ≠ ρ] (with the randomness given by both sampling (G, ρ) and then making the SRW step; we do this to make up for the laziness we introduced), tends to 0 as d → ∞. Another possibility in the unbounded case is to truncate (G, ρ) in any reasonable way to have maximal degree d, and require that the resulting random graph be unimodular, for any d. One reasonable truncation is to remove uniformly at random the excess number of incident edges for each vertex with a degree larger than d. (Some edges might get deleted twice, which is fine, and we might get several components, which is also fine.)

(d) For the case E deg(ρ) < ∞, there is an alternative to using the delayed SRW. Let μ̃ be the probability measure on (G, ρ) size-biased by deg(ρ), i.e., with Radon-Nikodym derivative (dμ̃/dμ)(G, ρ) = deg(ρ) / E_μ deg(ρ). Now, if we sample (G, ρ) from μ̃, and then take a non-delayed SRW step to a neighbor ρ′, then (G, ρ, ρ′) is required to have the same distribution as (G, ρ′, ρ).

It is easy to see that if we take a transitive graph G, then it will be unimodular in our old sense (|Γ_x y| = |Γ_y x| for all x ∼ y, where Γ = Aut(G)) if and only if the Dirac measure on (G, o), with an arbitrary o ∈ V(G), is unimodular in the sense of Definition 14.1. Indeed, for a double-rooted equivalence class (G, x, y), with (x, y) ∈ E(G), we will have P[(G, ρ, ρ′) ∈ (G, x, y)] = |Γ_x y|/d, where d is the degree of the graph, while P[(G, ρ′, ρ) ∈ (G, x, y)] = |Γ_y x|/d, and these are equal for all pairs (x, y) ∈ E(G) iff G is unimodular.

The simplest possible example of a unimodular random rooted graph is any fixed finite graph G with a uniformly chosen root ρ. Here, unimodularity is the most transparent by part (d) of Definition 14.1: if we size-bias the root by its degree, then the edge (ρ, ρ′) given by non-delayed SRW is simply a uniform random element of E(G), obviously invariant under exchanging ρ and ρ′.

The example of finite graphs is in fact a crucial one. It is not hard to see that the Benjamini-Schramm limit of finite graphs is still a unimodular random rooted graph. For instance, the rooted tree (T̃d, ρ) of Figure 14.1 was a local limit of finite graphs. Rather famous examples are the Uniform Infinite Planar Triangulation and Uniform Infinite Planar Quadrangulation, which are the local weak limits of uniform planar maps with n triangle or quadrangle faces, respectively; see [Ang03] and [BenC10], and the references there. Note that here the sequence of finite graphs Gn is random, so we want that in the joint distribution of this randomness and of picking a uniform random vertex of Gn, the r-ball around this vertex converges in law to the r-ball of G.

Exercise 14.2. If G is a transitive graph with a sequence of finite Gn converging to it locally, then it must be unimodular. Same for random rooted networks that are local weak limits.

Unimodularity is clearly a strengthening of the stationarity of the delayed random walk, i.e., that the Markov chain (G, ρ) → (G, ρ′) given by the delayed SRW is stationary on (G•, μ). This strengthening is very similar to reversibility, and one could try an alternative description of looking back from ρ′ and working with G••: namely, one could require that the Markov chain (G, ρ) → (G, ρ′) given by the delayed SRW be reversible on (G•, μ). This does not always work: e.g., if (G, ρ) is a deterministic transitive graph, then reversibility holds even without unimodularity. Nevertheless, we have the following simple equivalences:

Exercise 14.3. Consider Ber(p) percolation with 0 < p < 1 on a transitive graph G. Show that the following are equivalent: (a) G is unimodular; (b) the cluster of a fixed vertex is a unimodular random graph with that vertex as its root; (c) the delayed SRW on the cluster generates a reversible chain on G•. Formulate a version for general invariant percolations on G.

Let us remark that a random rooted graph (G, ρ) is called stationary in [BenC10] if (non-delayed) SRW generates a stationary chain on G•, and is called reversible if the chain generated on G• by the looking back procedure is stationary. In other words, using the inverse of Definition 14.1 (d): if we bias the distribution by 1/deg(ρ), then we get a unimodular random rooted graph.

Exercise 14.4. (a) Show that the Galton-Watson tree with offspring distribution Poisson(λ), denoted by PGW(λ), rooted as normally, is unimodular.
(b) Show that if we size-bias PGW(λ) by deg(ρ), we get a rooted tree given by connecting the roots of two i.i.d. copies of PGW(λ) by an edge, then choosing the root uniformly from the two.
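The finite-graph example can be checked mechanically: size-biasing the root by its degree and then stepping to a uniform neighbour yields exactly the uniform measure on directed edges, which is symmetric under looking back. A minimal sketch on an arbitrary small graph (a path on 4 vertices, chosen non-regular so that the biasing matters):

```python
from fractions import Fraction
from collections import defaultdict

edges = [(0, 1), (1, 2), (2, 3)]          # a fixed finite graph: the path P_4
nbrs = defaultdict(list)
for u, v in edges:
    nbrs[u].append(v)
    nbrs[v].append(u)

deg_sum = sum(len(nbrs[v]) for v in nbrs)
law = defaultdict(Fraction)
for v in nbrs:                            # size-biased root: P[rho = v] = deg(v)/deg_sum
    for w in nbrs[v]:                     # uniform neighbour: 1/deg(v)
        law[(v, w)] += Fraction(len(nbrs[v]), deg_sum) * Fraction(1, len(nbrs[v]))

# the law of (rho, rho') is uniform on directed edges, hence invariant under looking back
assert all(p == Fraction(1, 2 * len(edges)) for p in law.values())
assert all(law[(v, w)] == law[(w, v)] for (v, w) in law)
```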


Before going further, let us state a last equivalent version of Definition 14.1: we may require that the Mass Transport Principle holds, in the form that the probability measure μ on rooted networks has the property that for any Borel-measurable function F on G••, we have

    ∫ Σ_{x ∈ V(G)} F(G, ρ, x) dμ(G, ρ) = ∫ Σ_{y ∈ V(G)} F(G, y, ρ) dμ(G, ρ) .        (14.1)

How do we get back our old Mass Transport Principle (12.4) for transitive graphs? In some sense, the transitivity of G and the diagonal invariance of f is now replaced by considering rooted networks and functions on double-rooted isomorphism-classes. Namely, if f is a diagonally invariant random function on a transitive graph G, then, in (14.1), we can take μ to be a Dirac mass on (G, o), with an arbitrary o ∈ V(G), and F(G, x, y) := E f(x, y).

We have seen so far two large classes of unimodular random rooted networks: Cayley graphs (and their percolation clusters) and local weak limits of finite graphs and networks. We have also seen that amenable Cayley graphs are actually local weak limits of finite graphs. Here comes an obvious fundamental question: can we get all Cayley graphs and unimodular random rooted networks as local weak limits of finite graphs and networks? To start with, can we get regular trees as limits?

Exercise 14.5. Show that for all λ ∈ R+ and k ∈ Z+, the probability that the smallest cycle in the Erdős-Rényi random graph G(n, λ/n) is shorter than k tends to 0 as n → ∞.

Exercise 14.6. Show that for all λ ∈ R+, the local weak limit of the Erdős-Rényi random graphs G(n, λ/n) is the PGW(λ) tree of Exercise 14.4.
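Exercise 14.6 is plausible already at the level of root degrees: the degree of a fixed vertex of G(n, λ/n) is Binomial(n − 1, λ/n), which approaches Poisson(λ). A numerical sketch (the truncation cutoff and the values of n are arbitrary choices):

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def tv_distance(n, lam, cutoff=80):
    # total variation between Binomial(n-1, lam/n), the degree of a fixed
    # vertex of G(n, lam/n), and Poisson(lam); the far tail is negligible
    p = lam / n
    return 0.5 * sum(abs(binom_pmf(k, n - 1, p) - poisson_pmf(k, lam))
                     for k in range(cutoff))

lam = 2.0
d1, d2 = tv_distance(100, lam), tv_distance(10000, lam)
assert d2 < d1 < 0.1    # the distance shrinks as n grows
```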

Exercise 14.7. Fix d ∈ Z+, take d independent uniformly random permutations π1, . . . , πd on [n], and consider the bipartite graph V = [2n], E = {(v, n + πi(v)) : 1 ≤ i ≤ d, 1 ≤ v ≤ n}.
(a) Show that the number of multiple edges remains tight as n → ∞.
(b) Show that the local weak limit of these random bipartite graphs is the d-regular tree Td.

It is also true that the sequence of uniformly chosen random d-regular graphs converges to the d-regular tree; this follows from Corollary 2.19 of [Bol01]. This model is usually handled by proving things for the so-called configuration model (which is rather close to a union of d independent perfect matchings), and then verifying that the two measures are not far from each other from the point of view of what we are proving.

Now that we know that amenable transitive graphs and regular trees are local weak approximable (Exercises 14.1 and 14.7), it is not completely ridiculous to ask the converse to Exercise 14.2:

Question 14.1 ([Gro99, Wei00], [AldL07]). Is every f.g. group sofic, i.e., does every Cayley diagram have a sequence of labelled finite digraphs Gn converging to it in the Benjamini-Schramm sense? In particular, is every Cayley graph a local weak limit of finite graphs? And more generally, is every unimodular random rooted graph or network a local weak limit?

We need to clarify a few things about these three questions. First of all, a group being sofic has a definition that is independent of its Cayley diagram, i.e., of the generating set considered. This

definition is that the group has a sequence of almost-faithful almost-actions on finite permutation groups: a sequence {φi}_{i=1}^∞ of maps φi : G → Sym(ni) with ni → ∞ such that

    ∀ f, g ∈ G :  lim_{i→∞} (1/ni) |{1 ≤ p ≤ ni : φi(f)φi(g)(p) = φi(fg)(p)}| = 1 ;        (14.2)
    ∀ f ≠ g ∈ G :  lim_{i→∞} (1/ni) |{1 ≤ p ≤ ni : φi(f)(p) ≠ φi(g)(p)}| = 1 .
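The simplest instance of (14.2) is G = Z, which is sofic (indeed residually finite): sending k ∈ Z to the k-th power of an n-cycle gives maps that are exactly multiplicative and, for fixed distinct group elements, disagree everywhere. A sketch:

```python
# phi(n, k): the permutation p -> p + k (mod n) of {0, ..., n-1},
# i.e., the image of k in Z under the almost-action on n points
def phi(n, k):
    return [(p + k) % n for p in range(n)]

def compose(s, t):
    return [s[t[p]] for p in range(len(t))]

n = 101
for k in range(-5, 6):
    for l in range(-5, 6):
        # multiplicativity in (14.2) holds exactly, not just asymptotically
        assert compose(phi(n, k), phi(n, l)) == phi(n, k + l)
        if k != l:
            # distinct elements give permutations that disagree at every point
            assert all(phi(n, k)[p] != phi(n, l)[p] for p in range(n))
```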

Exercise 14.8. Show that a f.g. group is sofic in the almost-action-by-permutations sense (14.2) if and only if one or any Cayley diagram of it is local weak approximable by finite labelled digraphs.

If we have a convergent sequence of labelled digraphs, we can just forget the edge orientations and labels, and get a convergent sequence of graphs. The converse is false: a result of Ádám Timár [Tim11] says that there is a sequence of graphs Gn converging to a Cayley graph G of the group (Z ∗ Z2) × Z4 for which the edges of Gn cannot be oriented and labelled in such a way that the resulting networks converge to the Cayley diagram that gave G. The construction uses Theorem 14.8 below, due to Bollobás: although the vertex set of the 3-regular tree T3, the standard Cayley graph of Z ∗ Z2, can trivially be decomposed into two independent sets in an alternating manner, there exists a sequence {Hn} of finite 3-regular graphs with girth going to infinity (hence locally converging to T3), for which the density of any independent set is less than 1/2 − ε for some absolute constant ε > 0. Timár constructed a Cayley diagram of (Z ∗ Z2) × Z4 that sees the alternating decomposition of T3, hence can be locally approximated by the graphs Hn × C4 only if one forgets the labels; for a local approximation of the Cayley diagram, one needs to use finite 3-regular bipartite graphs with high girth, for which the alternating decomposition into two independent sets does exist.

This example of Timár shows that going from the local approximability of graphs to the soficity of groups might be a complicated issue. Indeed, answering both of the following questions with yes is expected to be hard:

Question 14.2. (a) Does the local weak approximability of one Cayley graph of a group imply the local weak approximability of all its Cayley graphs? (Note that the answer to the same question for approximations of Cayley diagrams by labelled digraphs is an easy yes, by Exercise 14.8 above.)
(b) Is it true that if all finitely generated Cayley graphs of a group are local weak approximable, then the group is sofic?

A probably simpler version of this question is the following:

Exercise 14.9. *** By encoding edge labels by finite graphs, is it possible to show that every Cayley graph of every group being local weak approximable would imply that every group is sofic?

It is also important to be aware of the fact that local approximability by Cayley diagrams of finite groups is a strictly stronger notion than soficity. This is called the LEF property (locally embeddable into finite groups), because, similarly to (14.2), it can be formalized as follows: for any finite subset F ⊆ G, there is a finite symmetric group Sym(n) and an injective map ψn : F → Sym(n) such that ψn(fg) = ψn(f)ψn(g) whenever f, g, fg ∈ F. Now, there are groups that are known to be sofic but not LEF: there exist solvable non-LEF groups [GoV97]. Furthermore, the Burger-Mozes group (a nice non-trivial lattice in Aut(Tm) × Aut(Tn) [BurgM00]) is a finitely presented infinite simple group, hence cannot be LEF by the following exercise. (In fact, it might even be a non-sofic group.)

(b) Show that a nitely presented LEF group is residually nite. Conclude that a nitely presented innite simple group cannot have the LEF property. Exercise 14.11. (a) Let = 0 > 1 > . . . be a sequence of subgroups, and S a nite generating set of . Consider before (13.16). The levels of G(, T , S ) are the Schreier graphs Gn := G(, n , S ), as dened just the Schreier graphs G(, T , S ) of the action of on the corresponding coset tree T as dened just

before Exercise 2.10. Show that (Gn )n0 converges in the Benjamini-Schramm sense to the Cayley graph G(, S ) iff the sequence (n )n0 satises the Farber condition. converging locally to the tree T4 .

(b) Using the residual niteness of F2 (see Exercise 2.8), give a sequence of nite transitive graphs

Amenability and residual niteness are two very dierent reasons for socity. One can also combine them easily: Exercise 14.12. Dene a good notion of residual amenability, and show that residually amenable groups are soc. There are examples of soc groups that are not residually amenable [ElSz06]. Further sources of provably soc groups can be found in [Ele12]. Most people think that the answer to Question 14.1 is no (maybe everyone except for Russ Lyons, but he also has some good reasons). For instance, Gromov says that if a property is true for all groups than it must be trivial. (A counterexample to this meta-statement is the Ershler-Lee-Peres theorem about the c n escape rate, but maybe quantitative results do not count. But the (pc ) = 0 percolation conjecture is certainly a serious candidate. Maybe probability does not count? Or these questions are not really about groups, but, say, transitive graphs? Note that these two things are not the same see Chapter 16.) One reason that this is an important question is that there are many results known for all soc groups. See [AldL07] for local weak convergence and unimodularity and [Pes08] for socity. In the next subsection, we will discuss the relevance of local weak convergence to probability on groups.

14.2 Spectral measures and other probabilistic questions

Although the question of local weak convergence is orthogonal to being quasi-isometric (in the sense that the former cares about local structure only, while the latter cares about global structure), it still preserves a lot of information about probabilistic behaviour. There are also examples where the information preserved is not quite what one may first expect. In fact, the Benjamini-Schramm paper where local weak convergence was introduced is about such an example:

Theorem 14.3 ([BenS01]). Let Gn be a locally weakly convergent sequence of finite planar graphs with uniformly bounded degrees. Then the limiting unimodular random rooted graph (G, o) is almost surely recurrent.

For instance, the balls in a hyperbolic planar tiling converge locally to a recurrent unimodular random graph that is somewhat similar to the one-ended tree of Figure 14.1. The proof of the theorem is based on two theorems on circle packings. The first is Koebe's theorem that any planar graph can be represented as a circle packing; moreover, for triangulated graphs the representation is unique up to Möbius transformations. This helps normalize the circle packing representations of Gn such that we get a circle packing for the limit graph G. Then one needs to show that this limiting circle packing has at most one accumulation point of centres, which implies by a result of He and Schramm that the graph is recurrent; see Theorem 11.1.

A generalization: if there exists a finite graph such that none of the Gn contains it as a minor (e.g., planar graphs are characterized by not having either K5 or K3,3), then the limit is almost surely recurrent [AngSz]. The converse is false:

Exercise 14.13. Construct a locally weakly convergent sequence of uniformly bounded degree finite graphs Gn such that for any finite graph F, if n > nF, then Gn contains F as a minor, but the limit is almost surely recurrent.
(Hint 1: Show that any finite graph can be embedded into R³ without edge intersections, hence any finite graph is contained in a large enough lattice box [0, 1, . . . , k]³ as a minor, and then construct the required sequence using boxes of random size.
Hint 2: You get a probabilistically trivial but graph-theoretically slightly harder example by showing that the Z² lattice with the diagonals added as edges contains any finite graph as a minor.)

An extension of Theorem 14.3 in a different direction is to relax the condition on uniformly bounded degrees to sequences where the degree of the uniform random root has at most an exponential tail [GuGN13]. The importance of this result is that it shows that the uniform infinite planar triangulation and quadrangulation (UIPT and UIPQ, mentioned shortly after Definition 14.1) are recurrent.

Now let us turn to the question of what properties local weak convergence preserves.

Definition 14.2. Let p(G) be a graph parameter: a number or some more complicated object assigned to isomorphism classes of finite (or sometimes also infinite) graphs. It is said to be locally approximable, or simply local, or testable, if it is continuous w.r.t. local weak convergence: whenever {Gn} is a convergent sequence of finite graphs, {p(Gn)} also converges.

A simple but important example is the set of simple random walk return probabilities p_k^G(o, o) = P[X_k = o | X_0 = o]: if the r-neighbourhood of a vertex o in a graph G is isomorphic to the r-neighbourhood of o′ in G′, and r is at least k/2, then obviously p_k^G(o, o) = p_k^{G′}(o′, o′) in the two graphs. In particular, if a sequence of finite graphs Gn, with a uniform random root ρn, converges to a unimodular random rooted graph (G, ρ), then, for any fixed k ∈ Z+, {p_k^{Gn}(ρn, ρn)}_{n=1}^∞ converges in distribution to p_k^G(ρ, ρ). A fancy reformulation of this observation is that the spectral measure of the random walk is locally approximable. Here is what we mean by this.

For any reversible Markov chain, e.g., simple random walk on a graph G(V, E), the Markov operator P is self-adjoint w.r.t. (·, ·)_π, and hence we can take its spectral decomposition P = ∫_{−1}^{1} t dE(t), where E is a projection-valued measure on ℓ²(V, π), a resolution of the identity, I = ∫_{−1}^{1} dE(t). (If G is a finite graph on n vertices, then P has n eigenvalues 1 ≥ λ_i ≥ −1, with orthogonal eigenvectors f_i with ℓ²(V, π)-norm 1; for each of them we have the projection E_i(f) := (f, f_i)_π f_i on the eigenline spanned by f_i, and P = Σ_{i=1}^{n} λ_i E_i.) Now consider the unit vectors ψ_x := δ_x / √π(x) for each x ∈ V, and define, for any measurable S ⊆ [−1, 1],

    σ_{x,y}(S) := (ψ_x, E(S) ψ_y) = ∫_S d(ψ_x, E(t) ψ_y) ,        (14.3)

sometimes called the Kesten spectral measures or Plancherel measures. For x = y, these are probability measures on [−1, 1], while, for x ≠ y, they are signed measures with zero total mass.

Exercise 14.14. (a) If G(V, E) is a finite graph on n vertices, it is natural to take the average of the above spectral measures: σ_G := (1/n) Σ_{x ∈ V} σ_{x,x}. On the other hand, it is also natural to take the normalized counting measure on the eigenvalues of the Markov operator, (1/n) Σ_{i=1}^{n} δ_{λ_i}, and call that the spectral measure of G. Show that σ_{x,x}({λ_i}) = f_i(x)² π(x) with the unit eigenvectors f_i, and deduce that the two definitions give the same spectral measure σ_G.
(b) ** One may also consider taking the weighted average σ̃_G := Σ_{x ∈ V} σ_{x,x} π(x), similarly to the size-biasing in Definition 14.1 (d). Is there a nice way to express σ̃_G using the eigenvalues {λ_i}?

Now notice that

    √(π(x)/π(y)) p_n(x, y) = (ψ_x, P^n ψ_y) = ∫_{−1}^{1} t^n dσ_{x,y}(t) ,        (14.4)

hence the return probabilities are given by the moments of the Kesten spectral measures σ_{x,x}. Since these measures have compact support, they are determined by their moments, and the local approximability of the return probabilities (and of the ratio π(x)/π(y) = deg(x)/deg(y)) implies that the spectral measures are also locally approximable. More precisely:

Exercise 14.15. Let Gn be finite graphs converging to (G, ρ) in the local weak sense. Then the spectral measure of Gn, as defined in Exercise 14.14 (a), converges weakly (weak convergence of measures) to the Kesten spectral measure E[σ_{ρ,ρ}] of (G, ρ), averaged w.r.t. the randomness in the limit (G, ρ). However, note that this does not mean that the supports of these measures also converge: for instance, each finite Gn has 1 in its spectrum, while, if G is a non-amenable transitive graph, then its spectral measure is bounded away from 1.
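For a finite regular graph, (14.4) is just the eigendecomposition of the symmetric Markov operator. A quick numerical sanity check on the cycle C6 (an arbitrary small example), using numpy:

```python
import numpy as np

# SRW on the cycle C_6: P is symmetric, pi is uniform, and (14.4) reads
# p_k(x,x) = sum_i lam_i^k f_i(x)^2 with orthonormal eigenvectors f_i.
n = 6
P = np.zeros((n, n))
for v in range(n):
    P[v, (v + 1) % n] = P[v, (v - 1) % n] = 0.5

lam, f = np.linalg.eigh(P)      # columns of f are orthonormal eigenvectors
for k in (2, 5, 10):
    Pk = np.linalg.matrix_power(P, k)
    for x in range(n):
        moment = sum(lam[i] ** k * f[x, i] ** 2 for i in range(n))
        assert abs(Pk[x, x] - moment) < 1e-10
```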


Exercise 14.16. * Give a sequence of finite non-expander Cayley graphs converging locally to the free group. (Hint: give first a sequence of infinite solvable groups converging locally to the free group.)

A corollary to Exercise 14.15 is the following:

Theorem 14.4 (Lück approximation for combinatorialists [Lüc94]). Let {Gn}_{n=1}^∞ be a sequence of finite graphs with degrees at most D, with edges labelled by integers from {−D, . . . , D}. Assume that {Gn} converges in the local weak sense (with the obvious generalization handling the labels). Let An = Adj(Gn) be the adjacency matrices with the labels being the entries. Then dim ker_Q An / |V(Gn)| converges.

Proof. First of all, note that dim ker_Q An / |V(Gn)| = μn({0}), where μn is the spectral measure of An. Consider now all the nonzero eigenvalues λ_i ∈ R of An. Their product is a coefficient of the characteristic polynomial, hence is a nonzero integer. But all the absolute values |λ_i| are at most D², hence not too many of them can be very small. More precisely, the spectral measure of An satisfies μn([−ε, ε] \ {0}) < C_D / log(1/ε) for any ε > 0, with some C_D depending on D. Together with the weak convergence of the spectral measures, this implies that μn({0}) also converges, and we are done.

The spectral measure of random walks on infinite graphs is a widely studied area [MohW89]. A basic result is that the spectral measure σ_{x,x} of the k-regular tree T_k is supported on [−2√(k−1)/k, 2√(k−1)/k], and, as k → ∞, it converges to Wigner's semicircle law, which is also the n → ∞ limit of the normalized counting measure on the eigenvalues of the n × n classical β = 1, 2, 4 random matrix ensembles.

Exercise 14.17. For SRW on Z, the Kesten spectral measure is dσ_{x,x}(t) = (1/π) (1 − t²)^{−1/2} 1_{[−1,1]}(t) dt. (Hint: you could do this in at least two ways: either from the spectrum of the cycle Cn, i.e., a combination of Exercises 7.4 and 14.15, or from (14.4) and arguing that the spectral measure is determined by its moments.)

Exercise 14.18. Show that for the spectral measures σ^{G_i}_{x,y} associated to the adjacency matrices (as opposed to the Markov operators) of two graphs G_i, i = 1, 2, if u = (u1, u2) and v = (v1, v2) are two vertices in the direct product G1 × G2, then σ^{G1×G2}_{u,v} = σ^{G1}_{u1,v1} ∗ σ^{G2}_{u2,v2}, a convolution of measures. Note that this can be easily translated to the spectral measures of the Markov operators only when both G_i are regular graphs. Deduce, for instance, that the direct product of two amenable groups is amenable.
Exercise 14.19 (Kaimanovich). * Show that for any symmetric nitely supported on an amenable group, the associated random walk has not only (P ) = 1, but 1 also lies in the support of the spectral measure dx,x (t). In fact, for any h < 1 and > 0, x,x [1 h, 1] 1 2/h2 , |A |

{ex.amenspecmeas}

where A is any Flner set with |A g A | < |A | for all g supp .

175

For a long while, it was not known if simple random walk on a group can have a spectral measure not absolutely continuous with respect to Lebesgue measure. Discrete spectral measures are usually associated with random walk like operators on random underlying structures, e.g., with the adjacency or the transition matrices of random trees [BhES09], or with random Schr odinger operators + diag(Vi ) on Zd or Td , with Vi being i.i.d. random potentials on the vertices [Kir07]. Note that the latter is a special case of the former in some sense, since the Vi can be considered as loops with random weights added to the graph. Here, Anderson localization is the phenomenon that for large enough randomness in the potentials, the usual picture of absolutely continuous limiting spectral measure, with repulsing (or even lattice-like) eigenvalues and spatially extended eigenfunctions in the nite approximations, disappears, and Poisson statistics for the eigenvalues appears, with localized eigenfunctions, giving L2 -eigenfunctions, hence atoms, in the limiting spectral measure. However, the exact relationship between the behaviour of the limiting spectral measure and the eigenvalue statistics of the nite approximations has only been partially established, and it is also unclear when exactly localization and delocalization happen. Assuming that the random potentials have a nice distribution, localization is known for arbitrary small (but xed) variance on Z, and very large variance on other graphs, while delocalization is conjectured for small variance on Zd , d 3, proved on Td . The case of Z2 is unclear even conjecturally. In the context of our present notes, it looks like a strange handicap for the subject that the role of the Benjamini-Schramm convergence was discovered as a big surprise only in [AiW06], but still without realizing that this local convergence has a well-established theory, e.g., for random walks. 
Nevertheless, deterministic groups can also exhibit discrete spectrum: Grigorchuk and Zuk showed that the lamplighter group Z2 Z has a self-similar action on the innite binary tree (see Subsection 15.1, around (15.6)), and, with the corresponding generating set, the resulting Cayley graph G has a pure discrete spectral measure. They used the nite Schreier graphs Gn for the action on the nth level of the tree converging locally to G. See [GriZ01]. An alternative proof was found in [DicS02], later extended by [LeNW07], which interpret the return probabilities of SRW on the lamplighter group F Zd as the averaged return probabilities of a SRW on a p-percolation conguration on Zd , with parameter p = 1/|F |, where the walk is killed when it wants to exit the underlying structure is somehow still there.) On the other hand, the following is still open. For a rough intuitive denition of the von Neumann dimension dim , see the paragraph before Theorem 11.3. Conjecture 14.5 (Atiyah 1976). Simple random walk on a torsion-free group cannot have atoms in its spectral measure dx,x , and more generally, any operator on 2 () given by multiplication by a non-zero element of the group algebra C has a trivial kernel. Even more generally, the kernel of any matrix A Mnk (C), as a linear operator 2 ()n 2 ()k , has integer -dimension
n

cluster of the starting vertex. (So, in a certain sense, the picture of a random walk on a random

{c.Atiyah}

dim (ker A) :=
i=1

(ker A ei , ei )2 ()n ,

176

where H is the orthogonal projection onto the subspace H 2 ()n and ei is the standard basis vector of 2 ()n , with a Kronecker e (g ) function (where e is the identity) in the ith coordinate. How is the claim about the trivial kernel a generalization of the random walk spectral measure being atomless? The Markov operator P , generated by a nitely supported symmetric measure , can be represented as multiplication of elements = Now, because of group-invariance, having an atom in dx,x (t) at t0 [1, 1] is equivalent to having t0 . Indeed, dim (ker( t0 e)) is exactly the size x,x ({t0 }) of the atom.
g

(g )g 2 () by =

(g )g R0 .

an atom at dE (t0 ), and that means there is a non-trivial eigenspace in 2 () for P with eigenvalue The conjecture used to have a part predicting the sizes of atoms for groups with torsion; a strong version was disproved by the lamplighter result [GriZ01], while all possible weaker versions have

recently been disproved by Austin and Grabowski [Aus09, Gra10]. The proof of Austin is motivated by the above percolation picture.

Here is a nice proof of the Atiyah conjecture for $\mathbb{Z}$ that I learnt from Andreas Thom (which might well be the standard proof). The Fourier transform

$$a = (a_n)_{n\in\mathbb{Z}} \mapsto \widehat{a}(t) = \sum_{n\in\mathbb{Z}} a_n \exp(2\pi i t n), \qquad t \in S^1,$$

is a Hilbert space isomorphism between $\ell^2(\mathbb{Z})$ and $L^2(S^1)$, and also identifies multiplication in $\mathbb{C}\mathbb{Z}$ (which is a convolution of the coefficients) with pointwise multiplication of functions on $S^1$. So, the kernel $H_a$ for multiplication by $a \in \mathbb{C}\mathbb{Z}$ in $\ell^2(\mathbb{Z})$ is identified with $\widehat{H}_a = \{ f(t) \in L^2(S^1) : \widehat{a}(t) f(t) = 0 \}$, and projection on $\widehat{H}_a$ is just multiplication by $1_{Z_{\widehat a}}(t)$, the indicator function of the zero set of $\widehat{a}(t)$. Then, by the Hilbert space isomorphism,

$$\dim_{\mathbb{Z}}(H_a) = \big( P_{H_a} e, e \big)_{\ell^2(\mathbb{Z})} = \int_{S^1} 1_{Z_{\widehat a}}(t)\, dt = \mathrm{Leb}(Z_{\widehat a}) = 0, \qquad (14.5)$$

since $\widehat{a}(t)$ is a nonzero trigonometric polynomial, and we are done.

Exercise 14.20. ** A variable $X \in [0,1]$ follows the arcsine law if $\mathbb{P}[X < x] = \frac{2}{\pi}\arcsin(\sqrt{x})$, or, in other words, has density $\big(\pi\sqrt{x(1-x)}\big)^{-1}$. This distribution comes up in several ways for Brownian motion $B_t$ on $\mathbb{R}$, the scaling limit of SRW on $\mathbb{Z}$: the location of the maximum of $\{B_t : t \in [0,1]\}$, the location of the last zero in $[0,1]$, and the Lebesgue measure of $\{t \in [0,1] : B_t > 0\}$ all have this distribution. Is there a direct relation to the spectral measure density in Exercise 14.17? A possibly related question: is there a quantitative version of (14.5) relating the return probabilities $n^{-1/2}$ on $\mathbb{Z}$ to the dimension $1/2$ of the zeroes of Brownian motion? This relationship is classical, but can you formulate it using projections? See [MöP10] for background on Brownian motion.

To see a probabilistic interpretation of atoms in the spectral measure, note that for SRW on a group $\Gamma$, by (14.4), there is no atom at the spectral radius $\rho = \rho(P)$ iff $p_n(x,x) = o(\rho^n)$. And this estimate has been proved using random walks and harmonic functions, even in a stronger form: Theorem 7.8 of [Woe00], a result originated in the work of Guivarc'h and worked out by Woess, says that whenever a group is transient (i.e., not quasi-isometric to $\mathbb{Z}$ or $\mathbb{Z}^2$), then it is also $\rho$-transient, meaning that $G(x, y \mid 1/\rho) < \infty$, for Green's function evaluated at the spectral radius $\rho = \rho(P)$. This clearly implies that there is no atom at $\rho$.

Let us turn for a second to the question what types of spectral behaviour are robust under quasi-isometries, or just under a change of generators. The spectral radius being less than 1 is of course robust, since it is the same as non-amenability. On the other hand, the polynomial correction in $p_n(x,x) = o(\rho^n)$ can already be sensitive: for the standard generators in the free product $\mathbb{Z}^d * \mathbb{Z}^d$, for $d \geq 5$, we have $p_n(x,x) \asymp n^{-5/2}\rho^n$, while, if we take a very large weight for one of the generators in each factor $\mathbb{Z}^d$, then random walk in the free product will behave like random walk on a regular tree, giving $p_n(x,x) \asymp n^{-3/2}\rho^n$, see Exercise 1.5. This instability of the exponent was discovered by Cartwright, see [Woe00, Section 17.B].
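The $n^{-1/2}$ return probability decay on $\mathbb{Z}$ invoked above is easy to check numerically: $p_{2n}(0,0) = \binom{2n}{n} 4^{-n} \sim (\pi n)^{-1/2}$ by Stirling's formula. A small illustrative sketch (not from the notes; the helper name is my own):

```python
import math

def return_prob_2n(n):
    """p_{2n}(0,0) for SRW on Z: C(2n,n)/4^n, computed via log-gamma to avoid overflow."""
    log_p = math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1) - 2 * n * math.log(2)
    return math.exp(log_p)

# Stirling's formula: p_{2n}(0,0) * sqrt(pi*n) -> 1 as n -> infinity.
for n in (10, 100, 10000):
    print(n, return_prob_2n(n) * math.sqrt(math.pi * n))
```

The log-gamma trick matters here: for $n = 10^4$ the binomial coefficient far exceeds floating-point range, while its logarithm is perfectly tame.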

Exercise 14.21. Explain why the strategy of Exercises 1.3, 1.4, 1.5 to prove the exponent $3/2$ in the free group does not necessarily yield the same $3/2$ in $\mathbb{Z}^d * \mathbb{Z}^d$.

In a work in progress, Grabowski and Virág show that the discreteness of the spectral measure can be completely ruined by a change of variables: in the lamplighter group, by changing the generators, they can set the spectrum to be basically anything, from purely discrete to absolutely continuous.

The locality of the spectral measure suggests that if a graph parameter can be expressed via simple random walk return probabilities on the graph, then it might also be local. We have seen in Subsection 11.2 that the Uniform Spanning Tree measure of a finite graph is closely related to random walks. This motivates the following result of Russ Lyons:

Theorem 14.6 (Locality of the tree entropy [Lyo05]). For any finite graph $G(V,E)$, let $\tau(G)$ be the number of its spanning trees, and let $h_{\mathrm{tree}}(G) := \frac{\log \tau(G)}{|V(G)|}$ be its tree entropy. If $G_n$ converges in the Benjamini-Schramm sense to the unimodular random rooted graph $(G,\rho)$, then, under mild conditions on the unimodular limit graph (e.g., having bounded degrees suffices),

$$\lim_{n\to\infty} h_{\mathrm{tree}}(G_n) = \mathbb{E}\Big[ \log \deg(\rho) - \sum_{k\geq 1} \frac{p_k^G(\rho,\rho)}{k} \Big], \qquad (14.6)$$

where $p_k^G(\rho,\rho)$ is the SRW return probability on $G$.

Sketch of the proof for a special case. Let $L_G = D_G - A_G$ be the graph Laplacian matrix: $D_G$ is the diagonal matrix of degrees and $A_G$ is the adjacency matrix. (For a $d$-regular graph, this is just $d$ times our usual Markov Laplacian $I - P$.) The Matrix-Tree Theorem says that $\tau(G)$ equals $\det(L_G^{ii})$, where the superscript $ii$ means that the $i$th row and column are erased. By looking at the characteristic polynomial of $L_G$, it is easy to see that this truncated determinant is the same as $\frac{1}{n}\prod_{i=2}^n \mu_i$, where $|V(G)| = n$ and $0 = \mu_1 \leq \cdots \leq \mu_n$ are the eigenvalues of $L_G$.

Assume for easier notation that $|V(G_n)| = n$. A less trivial simplification is that we will assume that each $G_n$ is $d$-regular. Then $\mu_i = d(1 - \lambda_i)$, where $1 = \lambda_1 \geq \cdots \geq \lambda_n \geq -1$ are the eigenvalues of the Markov operator $P$. Thus

$$\frac{1}{n}\log \tau(G_n) = -\frac{\log n}{n} + \frac{n-1}{n}\log d + \frac{1}{n}\sum_{i=2}^n \log(1 - \lambda_i). \qquad (14.7)$$

Consider the Taylor series $\log(1-\lambda) = -\sum_{k\geq 1} \frac{\lambda^k}{k}$, for $\lambda$ bounded away from 1. Recall that, for the uniform random root $\rho$ in $G_n$, the invariance of the trace w.r.t. the choice of basis implies that $\mathbb{E}\, p_k^{G_n}(\rho,\rho) = \frac{1}{n}\sum_{i=1}^n \lambda_i^k$. Putting these ingredients together,

$$\frac{1}{n}\sum_{i=2}^n \log(1-\lambda_i) = -\sum_{k\geq 1}\frac{1}{k}\Big( \mathbb{E}\, p_k^{G_n}(\rho,\rho) - \frac{1}{n} \Big). \qquad (14.8)$$

We are on the right track towards formula (14.6); we just have to address how to interchange the infinite sum over $k$ and the limit $n \to \infty$.

For lazy SRW in a fixed graph, the distribution after a large number $k$ of steps converges to the stationary one, which is constant $1/n$ in a $d$-regular graph with $n$ vertices. Let us therefore consider the lazy walk in $G_n$, with Markov operator $\widehat{P} = (I + P)/2$ and return probabilities $\widehat{p}_k^{\,G_n}(\rho,\rho)$. Any connected graph is at least 1-dimensional in the sense that any finite subset of the vertices that is not the entire vertex set has at least one boundary edge. Thus, Theorem 8.2 implies that

$$\mathbb{E}\, \widehat{p}_k^{\,G_n}(\rho,\rho) - \frac{1}{n} \leq \frac{C_d}{\sqrt{k}}, \qquad (14.9)$$

for all $n$. So, if we had $\widehat{p}_k$ instead of $p_k$ on the RHS of (14.8), then we could use the summability of $k^{-3/2}$ to get a control in (14.8) that is uniform in $n$. But how could we relate the lazy SRW to the original walk?

If we add $d$ half-loops at each vertex, so that we get a $2d$-regular graph $\widehat{G}_n$, then SRW on this graph is the same as the lazy SRW on $G_n$, hence we can again write down the identities (14.7) and (14.8), now with $\widehat{G}_n$ and $\widehat{p}_k$. On the other hand, we obviously have $\tau(\widehat{G}_n) = \tau(G_n)$. Thus,

$$\frac{\log\tau(G_n)}{n} = -\frac{\log n}{n} + \frac{n-1}{n}\log(2d) - \sum_{k\geq 1}\frac{1}{k}\Big( \mathbb{E}\, \widehat{p}_k^{\,G_n}(\rho,\rho) - \frac{1}{n} \Big). \qquad (14.10)$$

Now (14.9) implies that for any $\epsilon > 0$, if $K$ and $n$ are large enough, then

$$\Big| \frac{\log\tau(G_n)}{n} - \log(2d) + \sum_{k=1}^K \frac{1}{k}\Big( \mathbb{E}\, \widehat{p}_k^{\,G_n}(\rho,\rho) - \frac{1}{n} \Big) \Big| < \epsilon.$$

Take $n\to\infty$ and then $K\to\infty$. For each fixed $k$, we have $\mathbb{E}\, \widehat{p}_k^{\,G_n}(\rho,\rho) \to \mathbb{E}\, \widehat{p}_k^{\,G}(\rho,\rho)$, which yields

$$\lim_{n\to\infty} \frac{\log\tau(G_n)}{n} = \log(2d) - \sum_{k=1}^{\infty} \frac{1}{k}\, \mathbb{E}\, \widehat{p}_k^{\,G}(\rho,\rho).$$

Note here that the infinite sum is finite either by taking the limit $n\to\infty$ in (14.9) or by the infinite chain version of the same Theorem 8.2. This already shows the locality of the tree entropy, but we still would like to prove formula (14.6). This is accomplished by Exercise 14.22 below, the infinite graph version of the identity

$$\log 2 - \sum_{k\geq 1}\frac{1}{k}\Big( \mathbb{E}\, \widehat{p}_k^{\,G_n}(\rho,\rho) - \frac{1}{n} \Big) = -\sum_{k\geq 1}\frac{1}{k}\Big( \mathbb{E}\, p_k^{G_n}(\rho,\rho) - \frac{1}{n} \Big)$$

that we get for any finite graph $G_n$ by comparing (14.7, 14.8) with (14.10).
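The Matrix-Tree determinant used in the proof sketch is easy to test on small graphs. A minimal pure-Python sketch (the helper names are my own, not from the notes), checked against Cayley's formula $\tau(K_n) = n^{n-2}$ and the cycle $C_n$ with $\tau(C_n) = n$:

```python
from fractions import Fraction

def det(m):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, sign = len(m), 1
    for i in range(n):
        piv = next((r for r in range(i, n) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            sign = -sign
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    p = Fraction(sign)
    for i in range(n):
        p *= m[i][i]
    return p

def spanning_trees(n, edges):
    """tau(G) = det of the Laplacian L_G with row and column 0 erased (Matrix-Tree Theorem)."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1; L[v][v] += 1
        L[u][v] -= 1; L[v][u] -= 1
    minor = [row[1:] for row in L[1:]]
    return det(minor)

# tau(K4) = 4^{4-2} = 16 by Cayley's formula; tau(C5) = 5 (drop any one of the 5 edges).
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
c5 = [(i, (i + 1) % 5) for i in range(5)]
print(spanning_trees(4, k4), spanning_trees(5, c5))  # → 16 5
```

Exact rational arithmetic keeps the determinant an integer, as the Matrix-Tree Theorem promises.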


Exercise 14.22. Let $P$ be the transition matrix of any infinite Markov chain. For $\epsilon \in [0,1]$, define the lazy transition matrix $Q := \epsilon I + (1-\epsilon)P$. For a state $x$, let $p_k(x)$ and $q_k(x)$ denote the return probabilities to $x$ after $k$ steps in the two chains. Then

$$\sum_{k\geq 1} q_k(x)/k = -\log(1-\epsilon) + \sum_{k\geq 1} p_k(x)/k.$$

(Hint: For any $z \in (0,1)$, write $\sum_{k\geq 1} q_k(x) z^k/k$ as an inner product using the operator $-\log(I - zQ)$, then let $z \nearrow 1$.)
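On a finite chain both sums diverge (since $p_k(x) \to \pi(x) > 0$), but the generating-function identity behind the hint, $\sum_{k\geq 1} q_k(x) z^k/k = -\log(1-\epsilon z) + \sum_{k\geq 1} p_k(x) w^k/k$ with $w = z(1-\epsilon)/(1-\epsilon z)$, which comes from factoring $I - zQ = (1-\epsilon z)(I - wP)$, can be tested numerically on any toy chain. An illustrative sketch (the chain and function names are my own):

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def log_series(M, z, x, K=2000):
    """sum_{k>=1} z^k (M^k)_{xx} / k, truncated at K terms (geometric tail, so K=2000 is plenty)."""
    total, power = 0.0, [row[:] for row in M]
    for k in range(1, K + 1):
        total += z ** k * power[x][x] / k
        power = matmul(power, M)
    return total

P = [[0.3, 0.7], [0.6, 0.4]]          # a toy 2-state chain
eps, z = 0.25, 0.9
Q = [[eps * (i == j) + (1 - eps) * P[i][j] for j in range(2)] for i in range(2)]
w = z * (1 - eps) / (1 - eps * z)

lhs = log_series(Q, z, 0)
rhs = -math.log(1 - eps * z) + log_series(P, w, 0)
print(abs(lhs - rhs) < 1e-9)  # → True
```

Letting $z \nearrow 1$ in this identity (when the sums converge) gives exactly the statement of the exercise.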

The tree entropy $\lim_n h_{\mathrm{tree}}(G_n) = h_{\mathrm{tree}}(G,\rho)$ can actually be calculated sometimes. From the connection between spanning trees and domino tilings in planar bipartite graphs, one can show that $h_{\mathrm{tree}}(\mathbb{Z}^2) = 4G/\pi \approx 1.166$, where $G := \sum_{k\geq 0} (-1)^k/(2k+1)^2$ is Catalan's constant; see the examples after Theorem 6.1 of [BurtP93]. For the 4-regular tree, for instance, from Theorem 14.6 and the tree's Green function (1.5), Lyons deduced $h_{\mathrm{tree}}(\mathbb{T}_4) = 3\log(3/2)$ in [Lyo05].

A similarly defined notion of entropy is the $q$-colouring entropy $h_{q\text{-col}}(G) := \frac{\log \mathrm{ch}(G,q)}{|V(G)|}$, where $\mathrm{ch}(G,q)$ is the number of proper colourings of $G$ with $q$ colours (i.e., colourings of $V(G)$ such that neighbours never get the same colour). This $\mathrm{ch}(G,q)$ is called the chromatic polynomial of $G$, for the following reason:

Exercise 14.23.
(a) Show that, for any $q \in \mathbb{Z}_+$ and any $e \in E(G)$, we have $\mathrm{ch}(G,q) = \mathrm{ch}(G\setminus e, q) - \mathrm{ch}(G/e, q)$, where $G\setminus e$ is the graph obtained from $G$ by deleting $e$, and $G/e$ is obtained by gluing the endpoints of $e$ and erasing the resulting loops.
(b) Deduce that $\mathrm{ch}(G,q)$ is a polynomial in $q$, of the form $q^n + a_{n-1}(G)q^{n-1} + \cdots + a_1(G)q$, where $|V(G)| = n$.
(c) Show that $\sum_{S\subseteq V(G)} \mathrm{ch}(G[S], x)\, \mathrm{ch}(G[V\setminus S], y) = \mathrm{ch}(G, x+y)$, where $G[S]$ is the subgraph of $G$ induced by $S$.
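A minimal sketch of the deletion-contraction recursion of Exercise 14.23 (a), with illustrative names of my own; it evaluates $\mathrm{ch}(G,q)$ at integer $q$ and reproduces the well-known values $\mathrm{ch}(K_n, q) = q(q-1)\cdots(q-n+1)$:

```python
def chromatic(vertices, edges, q):
    """ch(G,q) via deletion-contraction: ch(G) = ch(G \\ e) - ch(G / e)."""
    edges = set(frozenset(e) for e in edges)
    if not edges:
        return q ** len(vertices)          # no edges: all q^{|V|} colourings are proper
    e = edges.pop()
    u, v = sorted(e)
    deleted = chromatic(vertices, edges, q)
    # contract e: replace v by u everywhere, dropping loops and duplicate parallel edges
    contracted_edges = set()
    for f in edges:
        g = frozenset(u if w == v else w for w in f)
        if len(g) == 2:
            contracted_edges.add(g)
    contracted = chromatic(vertices - {v}, contracted_edges, q)
    return deleted - contracted

tri = [(0, 1), (0, 2), (1, 2)]
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
print(chromatic(frozenset(range(3)), tri, 7))   # → 210  (= 7*6*5)
print(chromatic(frozenset(range(4)), k4, 4))    # → 24   (= 4!)
```

Since the recursion only ever subtracts polynomials in $q$ with leading term $q^{|V|}$, it also makes part (b) of the exercise plausible at a glance.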

The roots of the chromatic polynomial $\mathrm{ch}(G,q) = \prod_{i=1}^n (q - \lambda_i)$ are called the chromatic roots of $G(V,E)$, and (analogously to the spectral measure) the counting measure on the roots normalized to have total mass 1 is called the chromatic measure $\mu_G^{\mathrm{col}}$. We can express the $q$-colouring entropy using this measure:

$$h_{q\text{-col}}(G) = \frac{1}{n}\log \mathrm{ch}(G,q) = \int \log(q - z)\, d\mu_G^{\mathrm{col}}(z), \qquad (14.11)$$

whenever $q \notin \{\lambda_1,\dots,\lambda_n\}$. Note that the 4-colour theorem says that any planar graph $G$ satisfies $\mathrm{ch}(G,q) > 0$ for all integers $q \geq 4$. A certain generalization has been proved by Alan Sokal [Sok01]: there exists an absolute constant $C < 8$, such that if $G$ has maximal degree $d$, then all roots of $\mathrm{ch}(G,q)$ in $\mathbb{C}$ are contained in the ball of radius $Cd$. There is a huge body of work on the chromatic polynomial by Sokal and coauthors (just search the arXiv).

What about locality? It was proved in [BorChKL13] that $h_{q\text{-col}}(G)$ is local for graphs with degrees bounded by $d$, whenever $q > 2d$. They used the Dobrushin uniqueness theorem: with this many colours, the effect of any boundary condition decays exponentially fast with the distance, and hence there is only one Gibbs measure in any infinite volume limit (i.e., on any limiting unimodular random

rooted graph). It should not be surprising that this has to do with the locality of how many different colourings are possible, but the proof is somewhat tricky.

Now, in light of (14.11) and the locality of the random walk spectral measure (Exercise 14.15), it is natural to ask about the locality of the chromatic measure. This was addressed by Abért and Hubai [AbH12]; then Csikvári and Frenkel [CsF12] found a simpler argument that generalizes to a wide class of graph polynomials. We start with the definitions needed.

A graph polynomial $f(G,z) = \sum_{k=0}^n a_k(G) z^k$ is called isomorphism invariant if it depends only on the isomorphism class of $G$. It is called multiplicative if $f(G_1 \sqcup G_2, z) = f(G_1,z)\, f(G_2,z)$, where $\sqcup$ denotes disjoint union. It is of exponential type if it satisfies the identity of Exercise 14.23 (c). It is of bounded exponential type if there is a function $R : \mathbb{N} \to \mathbb{R}_{\geq 0}$ not depending on $G$ such that, for any graph $G$ with all degrees at most $d$, any $v \in V(G)$, and any $t \geq 1$, we have

$$\sum_{\substack{v \in S \subseteq V(G) \\ |S| = t}} |a_1(G[S])| \leq R(d)^{t-1}. \qquad (14.12)$$

Besides the chromatic polynomial, here are some further examples that satisfy all these properties (see [CsF12] for proofs and references):

(1) The Tutte polynomial of $G(V,E)$ is defined as

$$T(G,x,y) := \sum_{\Gamma \subseteq E} (x-1)^{k(\Gamma)-k(E)} (y-1)^{k(\Gamma)+|\Gamma|-|V|}, \qquad (14.13)$$

where $k(\Gamma)$ is the number of connected components of $(V,\Gamma)$. This is almost the same as the partition function $Z_{\mathrm{FK}}(p,q)$ of the FK model in (13.7); the exact relation is

$$T(G,x,y) = \frac{y^{|E|}}{(x-1)^{k(E)}\,(y-1)^{|V|}}\; Z_{\mathrm{FK}}\Big( \frac{y-1}{y},\, (x-1)(y-1) \Big). \qquad (14.14)$$

A third common version is $F(G,q,v) := \sum_{\Gamma \subseteq E} q^{k(\Gamma)} v^{|\Gamma|}$. The variable $v$ corresponds to $p/(1-p)$ in $\mathrm{FK}(p,q)$. This third form is the best now, since its degree in $q$ is $|V|$. In fact, it satisfies the above properties for any fixed $v$. Note that $\mathrm{ch}(G,q) = F(G,q,-1)$.

(2) The Laplacian characteristic polynomial $L(G,z)$ is the characteristic polynomial of the Laplacian matrix $L_G = D_G - A_G$, featured in the proof of Theorem 14.6.

(3) The (modified) matching polynomial is $M(G,z) := z^n - m_1(G)z^{n-1} + m_2(G)z^{n-2} - m_3(G)z^{n-3} + \cdots$, where $m_k(G)$ is the number of matchings of size $k$.

Sokal's theorem above about the chromatic roots of bounded degree graphs generalizes rather easily to graph polynomials of bounded exponential type: Theorem 1.6 of [CsF12] says that if $f(G,z)$ has a bounding function $R$ in (14.12), and $G$ has degrees less than $d$, then the absolute value of any root is less than $c\,R(d)$, where $c < 7.04$. In other words, the polynomial then has bounded roots. Now, the main result of [CsF12], generalizing the case of chromatic polynomials from [AbH12], is the following:
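The identity $\mathrm{ch}(G,q) = F(G,q,-1)$ can be tested by brute force over edge subsets; with $v = -1$ the sum defining $F$ is just Whitney's inclusion-exclusion for proper colourings. A small illustrative sketch (graph and names are mine, not from the notes):

```python
from itertools import combinations, product

def F(vertices, edges, q, v):
    """F(G,q,v) = sum over edge subsets Gamma of q^{k(Gamma)} v^{|Gamma|}."""
    total = 0
    for r in range(len(edges) + 1):
        for sub in combinations(edges, r):
            total += q ** components(vertices, sub) * v ** r
    return total

def components(vertices, edges):
    """Number of connected components of (V, Gamma), via union-find."""
    parent = {x: x for x in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(x) for x in vertices})

def proper_colourings(vertices, edges, q):
    """Direct count of proper q-colourings, i.e., ch(G,q)."""
    vs = sorted(vertices)
    count = 0
    for col in product(range(q), repeat=len(vs)):
        c = dict(zip(vs, col))
        if all(c[a] != c[b] for a, b in edges):
            count += 1
    return count

V, E = range(4), [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # a 4-cycle with a chord
print(F(V, E, 3, -1) == proper_colourings(V, E, 3))  # → True
```

The chosen graph is two triangles glued along an edge, so $\mathrm{ch}(G,q) = q(q-1)(q-2)^2$, which equals 6 at $q = 3$.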

Theorem 14.7 (Csikvári-Frenkel [CsF12]). Let $f(G,z)$ be an isomorphism-invariant monic multiplicative graph polynomial of exponential type. Assume that it has bounded roots. Let $G_n$ be a sequence that converges in the Benjamini-Schramm sense, and $K \subset \mathbb{C}$ a compact domain that contains all the roots of $f(G_n,z)$. Let $\mu_G^f$ be the uniform distribution on the roots of $f(G,z)$. Then the holomorphic moments $\int_K z^k\, d\mu_{G_n}^f(z)$ converge for all $k \in \mathbb{N}$. In particular, if we define the $f$-entropy or free energy at $\lambda \in \mathbb{C}\setminus K$ by

$$h_{f,\lambda}(G) := \frac{1}{|V(G)|}\log|f(G,\lambda)| = \int \log|\lambda - z|\, d\mu_G^f(z), \qquad (14.15)$$

then the Taylor series of $\log|z|$ shows that $h_{f,\lambda}(G_n)$ converges to a harmonic function locally uniformly on $\mathbb{C}\setminus K$. That is, for $\lambda \in \mathbb{C}\setminus K$, the $f$-entropy at $\lambda$ is local.

As opposed to moments of a compactly supported measure on $\mathbb{R}$, the holomorphic moments do not characterize uniquely a compactly supported measure on $\mathbb{C}$. And, somewhat surprisingly, the chromatic measure itself is in fact not local, as shown by the following exercise:

Exercise 14.24.
(a) Show that the chromatic polynomial of the path on $n$ vertices is $\mathrm{ch}(P_n,q) = q(q-1)^{n-1}$, while the chromatic polynomial of the cycle on $n$ vertices is $\mathrm{ch}(C_n,q) = (q-1)^n + (-1)^n(q-1)$.
(b) Deduce that the chromatic measure of $C_n$ converges weakly to the uniform distribution on the circle of radius 1 around $z = 1$, while the chromatic measure of $P_n$ converges weakly to the point mass at $z = 1$.
(c)*** Find a good definition according to which one of the limiting measures is the canonical one.

Non-trivially, it follows from the locality of the matching polynomial root moments that the matching ratio, i.e., the maximal size of a disjoint set of edges divided by the number of vertices, is a local parameter. See also [ElL10].

Surprisingly at first sight, the analogous notion with independent subsets of vertices instead of edges behaves very differently. The independence ratio of a finite graph $G(V,E)$ is the size of the largest independent set (i.e., a subset of the vertices such that no two of them are neighbours) divided by $|V|$. For instance, the independence ratio of a balanced bipartite graph (i.e., the two parts have the same size) is at least $1/2$.

Theorem 14.8 (The independence ratio is not local [Bol81]). For any $d \geq 3$, there exists an $\epsilon > 0$ and a sequence of $d$-regular graphs with girth tending to infinity (i.e., converging locally to $\mathbb{T}_d$) for which the independence ratio is less than $1/2 - \epsilon$.

Basically, random $d$-regular graphs will do. Since uniformly random $d$-regular balanced bipartite graphs also converge to $\mathbb{T}_d$ in the local weak sense (see Exercise 14.7), this shows that the independence ratio of $d$-regular graphs is not local.

What is then the independence ratio of random $d$-regular graphs? This seems to be intimately related to the question of how dense an invariant independent set can be defined on the $d$-regular tree $\mathbb{T}_d$ that is a factor of an i.i.d. process. One direction is clear: given any measurable function of an i.i.d. process on $\mathbb{T}_d$ that produces an independent set, we can approximate that by functions depending only on a bounded neighbourhood, then we can apply the same local functions on any

sequence of finite graphs with girth going to infinity. Several people (Balázs Szegedy, Endre Csóka, maybe David Aldous) have independently arrived at the conjecture that on random $d$-regular graphs, the optimal density (in fact, all possible limit densities) can be achieved by such a local construction:

Conjecture 14.9 (Balázs Szegedy). The set of possible values for the densities of independent sets in random $d$-regular graphs coincides with the set of possible densities of independent sets as i.i.d. factors in $d$-regular trees.

(This is part of a much more general conjecture on the so-called local-global limits that I plan to discuss in a later version; see [HatLSz12] for now.) The ideology is that random $d$-regular graphs have no global structure, hence all limit processes can be constructed locally (i.e., as a factor of i.i.d.) on the local limit graph (the regular tree). It is an important question what kind of processes can be realized as a factor of an i.i.d. process. See [LyNaz11, HatLSz12, Mes11], for instance. And here is another beautiful conjecture:

Conjecture 14.10 (Abért-Szegedy). Let $G_n$ be a Benjamini-Schramm-convergent sequence of finite graphs, $\omega_n$ an i.i.d. process on $G_n$, and $F(G_n,\omega_n)$ a factor process. Let $h(F,G_n)$ be the entropy of the resulting measure. (For simplicity, we can assume that there are finitely many possible configurations of $F(G_n,\omega_n)$.) Then $\lim_{n\to\infty} h(F,G_n)/|G_n|$ exists.

We have mentioned in Subsection 13.4 that Ornstein-Weiss have developed a very good entropy theory for amenable groups. An affirmative answer to Conjecture 14.10 would say that there is a good entropy notion also for i.i.d. factor processes on all sofic groups. Another interesting corollary would be the (mod $p$) version of the Lück Approximation Theorem 14.4: if the $A_n$'s are the adjacency matrices of a convergent sequence of finite graphs $G_n$, then $\dim \ker_{\mathbb{F}_p} A_n / |V(G_n)|$ converges. Indeed, this is equal to $1 - \dim \mathrm{Im}_{\mathbb{F}_p} A_n / |V(G_n)|$, and the normalized dimension of the image, times $\log p$, is the normalized entropy of the uniform measure on the image space. And we can easily get this uniform distribution as a factor of an i.i.d. process: assign i.i.d. uniform random labels to the vertices of $G_n$ from $\{0,1,\dots,p-1\}$, then write on each vertex the (mod $p$) sum of its neighbouring labels.

Let us turn to a locality question that is quite different from spectral measures and entropies: the critical parameter in Bernoulli percolation.

Conjecture 14.11 (Locality of $p_c$, O. Schramm). If $G_n$ are infinite transitive graphs locally converging to $G$, with $\sup_n p_c(G_n) < 1$, then $p_c(G_n) \to p_c(G)$.

See [BenNP11] for partial results. As mentioned in Section 12.4, it would follow from Questions 12.27 or 12.28.

Exercise 14.25. Show that the $\sup_n p_c(G_n) < 1$ condition in Conjecture 14.11 is necessary.

Of course, one can also ask about the locality of $p_c(q)$ in the $\mathrm{FK}(p,q)$ random cluster measures. But there is an even more basic question:
Question 14.12. If $G_n$ converges to a non-amenable transitive $G$ in the local weak sense, is it true for all $\mathrm{FK}(p,q)$ random cluster models (especially for the much more accessible $q > 1$ case) that the limit measure from the $G_n$'s is the wired measure on $G$? This is easy to see for the WUSF (the $q = p = 0$ case) using Wilson's algorithm, and also the $q = 2$ Ising case is known when $G$ is a regular tree [MonMS12]. In the amenable case, it is clear that both the free and the wired measures can be achieved by local limits.

15 Some more exotic groups

We now present some less classical constructions of groups, which have played a central role in geometric group theory in the last two decades or so, and may provide or have already provided exciting examples for probability on groups.

15.1 Self-similar groups of finite automata

This section shows some natural ways to produce group actions on rooted trees, which come up independently in complex and symbolic dynamical systems and computer science. A key source of interest in this field is Grigorchuk's construction of groups of intermediate volume growth (1984). The main references are [GriNS00, BarGN03, Nek05].

In Section 13.4 we already encountered the adding machine action of $\mathbb{Z}$: the action of the group on finite and infinite binary sequences was

$$(0w)^a = 1w, \qquad (1w)^a = 0w^a. \qquad (15.1)$$

By representing finite and infinite binary sequences as vertices and rays in a binary tree, we clearly get an action by tree automorphisms. The first picture of Figure 15.1 shows this action, with the Schreier graphs of the first few levels. The second picture is called the profile of the action of the generator $a$; the picture should be self-explanatory once one knows that the switches of the subtrees are to be read off from bottom to top, towards the root. Finally, the third picture of Figure 15.1 is called the Moore diagram of the automaton generating the group action. This automaton is a directed labeled graph $G(V,E)$, whose vertices (the states) correspond to some tree automorphisms, as follows. Given any initial state $s_0 \in V$, for any infinite binary sequence $x_1 x_2 \dots$ the automaton produces a new sequence $y_1 y_2 \dots$: from $s_0$ we follow the arrow whose first label is $x_1$, we output the second label, this will be $y_1$, and then we continue with the target state of the arrow (call it $s_1$) and the next letter $x_2$, and so on. The group generated by the tree automorphisms given by the states $V$ is called the group generated by the automaton. There is of course a version with labels from $\{0,1,\dots,b-1\}$ instead of binary sequences, generating an action on the $b$-ary tree $T_b$.
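The recursion (15.1) is just the binary carry rule: with digits read least significant first, $a$ acts on the level-$L$ vertices as addition of 1 mod $2^L$. A quick illustrative sketch (not from the notes; the function names are mine):

```python
def act_a(word):
    """Adding machine action (15.1) on a binary word (tuple of 0/1, least significant digit first)."""
    if not word:
        return word
    if word[0] == 0:
        return (1,) + word[1:]          # (0w)^a = 1w
    return (0,) + act_a(word[1:])       # (1w)^a = 0 w^a

def to_word(n, length):
    return tuple((n >> i) & 1 for i in range(length))

# On the 2^L vertices of level L, the generator a acts as addition of 1 mod 2^L:
L = 8
assert all(act_a(to_word(n, L)) == to_word((n + 1) % 2 ** L, L) for n in range(2 ** L))
print("a acts as +1 on all", 2 ** L, "vertices of level", L)
```

This is exactly why the Schreier graph of level $L$ is a cycle of length $2^L$.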
Figure 15.1: The adding machine action of $\mathbb{Z}$: (a) the Schreier graphs on the levels; (b) the profile of the generator $a$'s action; (c) the Moore diagram of the automaton.

Definition 15.1. The action of a group $\Gamma$ on the $b$-ary tree $T_b$ ($b \geq 1$) is called self-similar if for any $g \in \Gamma$, any letter $x \in \{0,1,\dots,b-1\}$, and any finite or infinite word $w$ on this alphabet, there is a letter $y$ and $h \in \Gamma$ such that $(xw)^g = y(w^h)$. If $S$ generates $\Gamma$ as a semigroup, and for any $s \in S$ and word $xw$ there is a letter $y$ and $t \in S$ such that $(xw)^s = y(w^t)$, then $S$ is called a self-similar generating set. Then the group can clearly be generated by an automaton with states $S$. If there is a finite such $S$, then $\Gamma$ is called a finite-state self-similar group.

For a self-similar action by $\Gamma$, for any $g \in \Gamma$ and finite word $v$ there is a word $u$ of the same length and $h \in \Gamma$ such that $(vw)^g = u(w^h)$ for any word $w$. This $h$ is called the restriction $h = g|_v$, and we get an action of $\Gamma$ on the subtree starting at $v$. The action of the full automorphism group of $T_b$ is of course self-similar, and there is the obvious wreath product decomposition

$$\mathrm{Aut}(T_b) \simeq \mathrm{Aut}(T_b) \wr \mathrm{Sym}_b, \qquad (15.2)$$

corresponding to the restriction actions inside the $b$ subtrees at the root and then permuting them. (But, of course, $\mathrm{Aut}(T_b)$ is not finitely generated.) For a general self-similar action by $\Gamma \leq \mathrm{Aut}(T_b)$, the isomorphism (15.2) gives an embedding

$$\Gamma \hookrightarrow \Gamma \wr \mathrm{Sym}_b. \qquad (15.3)$$

Using this embedding, we can write the action (13.11) very concisely as $a = (1,a)\sigma$, where $(g,h)$ is the tree-automorphism acting like $g$ on the 0-subtree and like $h$ on the 1-subtree, $\sigma$ is switching these two subtrees, and the order of the multiplication is dictated by having a right action on the tree. In particular, $(g,h)(g',h') = (gg',hh')$ and $\sigma(g,h)\sigma = (h,g)$.

There is also a nice geometric way of arriving at the adding machine action of $\mathbb{Z}$. Consider $\psi : x \mapsto 2x$, an expanding automorphism of the Lie group $(\mathbb{R},+)$. Since $\psi(\mathbb{Z}) = 2\mathbb{Z} \leq \mathbb{Z}$, we can consider $\overline\psi : \mathbb{R}/\mathbb{Z} \to \mathbb{R}/\mathbb{Z}$, a twofold self-covering of the circle $S^1$. Pick a base point $x \in S^1$, and a loop $\gamma : [0,1] \to S^1$ starting and ending at $x$, going around $S^1$ once. Now, $\overline\psi^{-1}(x)$ consists of two

points, call them $x_0$ and $x_1$. Using Proposition 2.7, we can lift $\gamma$ starting from either point, getting $\gamma_0$ and $\gamma_1$, respectively. Clearly, $\gamma_i$ ends at $x_{1-i}$, $i = 0,1$. Following these $\gamma_i$'s we get a permutation on $\overline\psi^{-1}(x)$, the transposition $(x_0\, x_1)$ in the present case. (For a general, possibly branched, $b$-fold covering $\overline\psi : X \to X$, we should start with one loop for each generator of $\pi_1(X)$, and then we would get an action of $\pi_1(X)$ on $\overline\psi^{-1}(x)$, as in Section 2.2.) We can now iterate this procedure, taking the preimage set $\overline\psi^{-2}(x) = \overline\psi^{-1}(x_0) \cup \overline\psi^{-1}(x_1) = \{x_{00}, x_{01}, x_{10}, x_{11}\}$, etc., and we get an action of $\pi_1(S^1) = \mathbb{Z}$ on the entire binary tree. It is easy to see that the lifts $\gamma_0, \gamma_1, \gamma_{00}$, etc., will be geometric representations of the edges in the Schreier graph of the adding machine action. For a general $b$-fold covering, it is possible that the resulting action on the $b$-ary tree is not faithful, so the actual group of tree-automorphisms that we get will be a factor of $\pi_1(X)$. This is called the Iterated Monodromy Group $\mathrm{IMG}(\overline\psi)$ of the covering map.

A particularly nice case is when $X$ is a Riemannian manifold, and $\overline\psi : X_1 \to X$ is a locally expanding partial $b$-fold self-covering map for some $X_1 \subseteq X$ (we typically get $X_1$ by removing the branch points from $X$ in the case of a branched covering). Then the resulting action of $\Gamma = \mathrm{IMG}(\overline\psi)$ on $T_b$ is finite state self-similar, moreover, contracting: there is a finite set $\mathcal{N} \subseteq \Gamma$ such that for every $g \in \Gamma$ there is a $k \in \mathbb{N}$ such that the restriction $g|_v$ is in $\mathcal{N}$ for all vertices $v \in T_b$ of depth at least $k$. Moreover, if, in addition, $X_1 = X$ is a compact Riemannian manifold, like $X = S^1$ in our case, then the action of $\pi_1(X)$ is faithful, so $\mathrm{IMG}(\overline\psi) = \pi_1(X)$. See [Nek03] or [Nek05] for proofs.

Exercise 15.1. Show that, for any contracting self-similar action of some $\Gamma$ on $T_b$, there is a unique minimal set $\mathcal{N}$ giving the contraction property. It is called the nucleus of the self-similar action. Show that $\mathcal{N}$ is a self-similar generating set of $\langle\mathcal{N}\rangle$. Give an example where $\langle\mathcal{N}\rangle \neq \Gamma$.

At this point, the Reader might feel a bit uneasy: we have been talking about countable groups acting on trees, so where do these continuum objects $\mathbb{R}$ and $S^1$ and Riemannian manifolds come from? Well, their appearance in the story is not accidental: it is possible to reconstruct $X$ and $\overline\psi$, at least topologically, from the action itself. Given any contracting action by $\Gamma$ on $T_b$, one can define the limit space $\mathcal{J}_\Gamma$ as the quotient of the set of left-infinite sequences $\{0,\dots,b-1\}^{-\mathbb{N}}$ by the following asymptotic equivalence relation: the sequences $(\dots,x_{-1},x_0)$ and $(\dots,y_{-1},y_0)$ are equivalent iff there exists a finite subset $K \subseteq \Gamma$ such that for all $k \in \mathbb{N}$ there is some $g_k \in K$ with $(x_{-k},\dots,x_0)^{g_k} = (y_{-k},\dots,y_0)$ with the action of $\Gamma$ on $T_b$, i.e., with $x_{-k}$ on the first level, $(x_{-k},x_{-k+1})$ on the second level, etc. (In particular, this equivalence is very different from two rays in $\partial T_b = \{0,\dots,b-1\}^{\mathbb{N}}$ being in the same $\Gamma$-orbit.)

A similar notion is the limit solenoid $\mathcal{S}_\Gamma$, the quotient of $\{0,\dots,b-1\}^{\mathbb{Z}}$ by the equivalence relation that $(\dots,x_{-1},x_0,x_1,\dots) \sim (\dots,y_{-1},y_0,y_1,\dots)$ iff there is a finite $K \subseteq \Gamma$ such that $\forall k \in \mathbb{N}$ $\exists g_k \in K$ with $(x_{-k},x_{-k+1},\dots)^{g_k} = (y_{-k},y_{-k+1},\dots)$ in $\partial T_b$. On $\mathcal{J}_\Gamma$ we take the topology to be the image of the product topology under the equivalence quotient map, while on $\mathcal{S}_\Gamma$

it will be the image of the topology on $\{0,\dots,b-1\}^{\mathbb{Z}}$ that is product topology on the left tail but discrete on the right; to emphasize the asymmetric topology, we can denote this space by $\overleftarrow{\mathcal{S}}_\Gamma$.

Now, given a finite word $w = x_{-k} x_{-k+1} \dots x_0$, with $k \in \mathbb{N}$, we can consider the tile

$$\mathcal{T}_w := \big\{ \dots x_{-k-2}\, x_{-k-1}\, w : x_{-k-i} \in \{0,1,\dots,b-1\},\ i \geq 1 \big\},$$

a subset of $\mathcal{J}_\Gamma$ after the factorization. Similarly, given an infinite word $w = x_{-k} x_{-k+1} \dots$, $k \in \mathbb{Z}$, the tile $\mathcal{T}_w$ will be a subset of $\mathcal{S}_\Gamma$. Because of the factorization, these tiles are not at all disjoint for different $w$'s with the same starting level $k = k(w)$. However, if the action of $\Gamma$ is nice enough, then their interiors are disjoint, and the intersections of the boundaries are given by the action of the nucleus $\mathcal{N}$: $\mathcal{T}_{w^g} \cap \mathcal{T}_{w^h} \neq \emptyset$ iff $g^{-1}h \in \mathcal{N}$. In particular, if $\langle\mathcal{N}\rangle = \Gamma$, then the adjacencies are given by the Schreier graph on that level. In $\mathcal{J}_\Gamma$, as we take the level $k \to \infty$, the tiles are getting smaller and smaller, and hence the Schreier graphs, drawn on the tiles, approximate the structure of $\mathcal{J}_\Gamma$ more and more. The situation in $\mathcal{S}_\Gamma$ is a bit more complicated: it is a highly disconnected space, so we need to restrict our attention to the leaves $\mathcal{L}_{O(w)} := \bigcup_{w^g \in O(w)} \mathcal{T}_{w^g} \subseteq \mathcal{S}_\Gamma$ corresponding to $\Gamma$-orbits $O(w)$ in $\partial T_b$.

Finally, consider the shift action $\mathsf{s}$ on $\{0,\dots,b-1\}^{-\mathbb{N}}$ that deletes the last (the 0th) letter, hence is $b$-to-1, or that moves the origin in $\{0,\dots,b-1\}^{\mathbb{Z}}$ to the left. In both cases, $\mathsf{s}$ preserves the asymptotic equivalence relation, and thus we get the dynamical systems $(\mathcal{J}_\Gamma, \mathsf{s})$ and $(\mathcal{S}_\Gamma, \mathsf{s})$.

The upshot (proved by Nekrashevych) is that when the contracting action is obtained from a locally expanding map $\overline\psi : X_1 \to X$, then, under mild additional assumptions, $(\mathcal{J}_\Gamma, \mathsf{s})$ is topologically conjugate to $(J(X,\overline\psi), \overline\psi)$ (i.e., there is a homeomorphism between the spaces that interchanges the actions), where

$$J(X,\overline\psi) := \overline{\bigcup_{n=0}^{\infty} \overline\psi^{-n}(x)} \subseteq X$$

is the Julia set of $\overline\psi$, easily shown to be independent of $x \in X$. For instance, if $X_1 = X$ is a compact Riemannian manifold, then $J(X,\overline\psi) = X$. Recall that $\pi_1(X) = \mathrm{IMG}(\overline\psi)$ in this case, so, $\Gamma = \mathrm{IMG}(\overline\psi)$ and the limit space construction $X = \mathcal{J}_\Gamma$ are true inverses of each other.

Exercise 15.2. For the adding machine action, give a homeomorphism between $\mathcal{J}_{\mathbb{Z}}$ and $S^1$ that interchanges the actions $x \mapsto 2x$ on $S^1$ and $\mathsf{s}$ on $\mathcal{J}_{\mathbb{Z}}$. Show that for any $w \in \partial T_2$, the leaf $\mathcal{L}_{O(w)}$ is homeomorphic to $\mathbb{R}$.

Exercise 15.3. Show that the limit set $\mathcal{J}_\Gamma$ for the self-similar action

$$a = (a,1,1)(12), \qquad b = (1,b,1)(02), \qquad c = (1,1,c)(01)$$

on the alphabet $\{0,1,2\}$ is homeomorphic to the Sierpiński gasket.

Expanding endomorphisms of the real and integer Heisenberg groups $H_3(\mathbb{R})$ and $H_3(\mathbb{Z})$ were used in [NekP09] to produce nice contracting self-similar actions, with the tiles and the shift map $\mathsf{s}$ leading to scale-invariant tilings in some Cayley graphs of $H_3(\mathbb{Z})$, as defined in Question 12.26. The same paper used the self-similar actions of several groups of exponential growth to show that they are scale-invariant (see Theorem 4.12): the lamplighter group $\mathbb{Z}_2 \wr \mathbb{Z}$, the solvable Baumslag-Solitar groups $\mathrm{BS}(1,m)$, and the affine groups $\mathbb{Z}^d \rtimes \mathrm{GL}(d,\mathbb{Z})$.

When a group turns out to have a finite state self-similar action, a huge box of great tools opens up, but it is far from obvious if a given group has such an action. The lamplighter group $G = \mathbb{Z}_2 \wr \mathbb{Z}$ is a famous unexpected example, whose self-similarity was first noticed and proved by Grigorchuk and Żuk in [GriZ01], but there is a much simpler proof, found by [GriNS00], using

Z2[[t]], which we will present below. The self-similar action of G was used in [GriZ01] to show that the spectrum of the simple random walk on the Cayley graph generated by the self-similar generators a and b below is discrete; see Section 14.2 for a bit more on this. The scale-invariance of G proved in [NekP09] is closely related to the way the spectrum was computed, even though the way we found it followed an orthogonal direction of thought, namely, that the Diestel-Leader graph DL(2, 2) is the Cayley graph of G with the generators R, Rs on one hand, and of the index two subgroup Γ1 = ⟨Rs, sR⟩ on the other.

We now show that G can be generated by a finite automaton. Let H be the additive subgroup of the group Z2[[t]] of formal power series over Z2 consisting of finite Laurent polynomials of (1 + t), and consider the injective endomorphism Φ(F(t)) := t F(t) for F(t) ∈ Z2[[t]]. Since t F(t) = (1 + t) F(t) − F(t), we have that Φ(H) ≤ H. Observe that (1 + t)^k − 1 ∈ Φ(H) for any k ∈ Z; this easily implies that Φ(H) is exactly the subgroup of H of power series divisible by t, with index [H : H1] = 2. We then let Hn := Φ^n(H), a nested sequence of finite index isomorphic subgroups.

Consider now the right coset tree T corresponding to the subgroup sequence (Hn)n≥0, as defined before (13.16). We have T = T2, and the boundary ∂T is a topological group: the profinite additive group Z2[[t]], via the identification

Ψ : x1 x2 . . . ↦ Σ_{i≥1} x_i t^{i−1},    (15.4) {e.Phi}

where x = x1 x2 . . . is the shorthand notation for the ray H = H0 x0 ⊃ H1 x1 ⊃ H2 x2 ⊃ . . . in T.

Let A be the cyclic group Z acting on H by multiplication by (1 + t). Thus the semidirect product G = A ⋉ H is the group of the following transformations of Z2[[t]]:

F(t) ↦ (1 + t)^m F(t) + Σ_{k∈Z} f(k) (1 + t)^k,    (15.5) {e.trafo}

where m ∈ Z and f : Z → Z2 is any function with finitely many non-zero values. This group G = Z ⋉ ⊕_Z Z2 = Z2 ≀ Z is the standard lamplighter group; for each element (m, f), one can think of m ∈ Z as the position of the lamplighter, while f : Z → Z2 is the configuration of the lamps. We can represent f by the finite set supp f ⊂ Z. The usual wreath product generators are s and R, representing switch and Right; we will also use L = R^{−1}. So, for example, Rs = (1, {1}). In terms of the representation (15.5), the action of s is F(t) ↦ F(t) + 1, while the action of R is F(t) ↦ (1 + t) F(t).

The action of G on the infinite binary tree T can now be described by the combination of (15.4) and (15.5), and it turns out to be a finite-state self-similar action. Namely, consider the following new generators of the lamplighter group: a := Rs, b := R. Note that s = b^{−1} a = a^{−1} b. Then, the action of these generators on the binary tree T can be easily checked to be

(0w)^a = 1 w^b,   (1w)^a = 0 w^a,   (0w)^b = 0 w^b,   (1w)^b = 1 w^a,    (15.6) {e.GZ}

for any finite or infinite {0, 1} word w. Hence {a, b} is a finite self-similar generating set. Another

usual notation for this self-similar action, using (15.3), is

a = (b, a) σ,   b = (b, a),    (15.7) {e.LLselfsim}
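As a quick sanity check of the recursions (15.6), one can act on truncated elements of Z2[[t]] directly: multiplication by (1 + t) is lower-triangular on coefficient sequences, so the action is exact on finite truncations. The following Python sketch (our own illustration; all function names are hypothetical, not from the notes) verifies the four recursions on all words of a given length.

```python
# Truncated elements of Z_2[[t]] are lists [x1, x2, ...] of coefficients of
# 1, t, t^2, ...; the generator b = R acts by F(t) -> (1+t)F(t), and
# a = Rs acts by F(t) -> (1+t)F(t) + 1.
from itertools import product

def act_b(x):
    # multiplication by (1+t) over Z_2; lower-triangular, hence exact on truncations
    return [(x[i] + (x[i - 1] if i > 0 else 0)) % 2 for i in range(len(x))]

def act_a(x):
    # act_b followed by adding the constant 1, i.e. flipping the first coefficient
    y = act_b(x)
    return [(y[0] + 1) % 2] + y[1:]

def check_recursions(n):
    # verify (0w)^a = 1 w^b, (1w)^a = 0 w^a, (0w)^b = 0 w^b, (1w)^b = 1 w^a
    for x in product([0, 1], repeat=n):
        x, w = list(x), list(x[1:])
        if x[0] == 0:
            assert act_a(x) == [1] + act_b(w) and act_b(x) == [0] + act_b(w)
        else:
            assert act_a(x) == [0] + act_a(w) and act_b(x) == [1] + act_a(w)
    return True

check_recursions(8)
```

So the two-state automaton {a, b} really does implement the lamplighter action under the identification (15.4).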

We note that in the literature there are a few slightly different versions of (15.7) to describe the lamplighter group. This is partly due to the fact that interchanging the generators a and b induces an automorphism of G, see e.g. [GriZ01].

One can easily write down a couple of formulas like (15.7), but the resulting group might be very complicated. We will briefly discuss two famous examples, Grigorchuk's group of intermediate growth and the Basilica group. Grigorchuk's first group G is defined by the following self-similar action on the binary tree:

a = σ,   b = (a, c),   c = (a, d),   d = (1, b).    (15.8) {e.chuk}

If this looks a bit ad hoc, writing down the profiles of the generators will make it clearer, see Figure 15.2.

Figure 15.2: The profiles of the generators b, c, d in Grigorchuk's group. {f.chuk}

Now, we have the following easy exercise:

Exercise 15.4. For Grigorchuk's group (15.8), check that the stabilizer Gv of any vertex in the binary tree is isomorphic to the original group, hence G has G × G as an index 2 subgroup. Moreover, using the third level stabilizers, one can show that there is an expanding virtual isomorphism from the direct product of eight copies of G to itself, hence, by Lemma 4.16, G has intermediate growth. See [GriP08] for more details.

Most of our course has been about how algebraic and geometric properties of a group influence the behaviour of SRW on it, though Kleiner's proof of Gromov's theorem was already somewhat in the other direction, probabilistic ideas giving algebraic results, see Section 10.1. But the first example of SRW applied to a group theory problem was by Bartholdi and Virág [BarV05]: they showed that the so-called Basilica group is amenable by finding a finite generating system on it for which they were able to compute that the speed is zero. At that point this was the furthest known example of an amenable group from abelian groups: namely, it cannot be built from groups of subexponential growth via group extensions. This group is again generated by a finite automaton:

a = (1, b),   b = (1, a) σ.    (15.9) {e.Basil}

Continuations of this Basilica work include [BarKN10] and [AmAV09]. The activity growth ActG(n) of a finite state self-similar group is the number of length n words w such that the section g|w is not the identity for some of the self-similar generators g.

Exercise 15.5. Show that any finite state self-similar group with bounded activity growth is contracting (as defined just before Exercise 15.1).

Exercise 15.6. Show that the activity growth ActG(n) is either polynomial or exponential.

Sidki [Sid00] showed that a polynomial activity self-similar group cannot contain a free subgroup F2. Are they always amenable? [BarKN10] showed this for bounded activity groups, while [AmAV09] for linear activity groups. For at most quadratic activity, the Poisson boundary is conjectured to be trivial (proved for at most linear growth), but not for larger activity. This is quite analogous to the case of lamplighter groups Z2 ≀ Zd. Furthermore, it is not known if all contracting groups are amenable.
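For concreteness, here is a small Python sketch (ours, with hypothetical names) that computes ActG(n) by brute force from the section data of the generators: in the lamplighter recursion (15.7) every section is nontrivial, giving exponential activity, while in the Basilica recursion (15.9) only the word 11...1 carries a nontrivial section, so the activity is bounded, in line with the amenability result of [BarKN10] for bounded activity groups.

```python
from itertools import product

def activity(sections, n):
    # sections: dict g -> (g|_0, g|_1); entries are generator names, or None
    # for the identity. ActG(n) counts length-n words carrying a nontrivial section.
    count = 0
    for word in product([0, 1], repeat=n):
        live = set(sections)  # generators whose section along the prefix read so far is nontrivial
        for x in word:
            live = {sections[g][x] for g in live} - {None}
            if not live:
                break
        count += bool(live)
    return count

# Section data read off from the wreath recursions; the permutation parts are
# irrelevant for activity. Lamplighter: a = (b,a)σ, b = (b,a). Basilica: a = (1,b), b = (1,a)σ.
lamplighter = {'a': ('b', 'a'), 'b': ('b', 'a')}
basilica = {'a': (None, 'b'), 'b': (None, 'a')}

print([activity(lamplighter, n) for n in range(1, 7)])  # 2, 4, 8, ...: exponential
print([activity(basilica, n) for n in range(1, 7)])     # stays at 1: bounded
```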

15.2 Constructing monsters using hyperbolicity {ss.monsters}

15.3 Thompson's group F {ss.ThompF}

A very famous example of a group whose amenability is not known is the following. See [CanFP96] for some background, and [Cal09] for more recent stories.

Definition 15.2. Consider the set F of orientation preserving piecewise linear homeomorphisms of [0, 1] to itself whose graphs satisfy the following conditions: a) all slopes are of the form 2^a, a ∈ Z; b) all break points have first coordinate of the form k/2^n, k, n ∈ N. {d.Thompson}

Clearly, F is a group with composition of maps as the multiplication operation; this group is called Thompson's group F.

Question 15.1. Determine whether Thompson's group F is amenable or not.

Kaimanovich has proved that SRW has positive speed on Thompson's group F for some generating set, hence this probabilistic direction of attack is not available here.
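The two defining conditions are easy to check mechanically with exact rational arithmetic. The sketch below is our own illustration; the breakpoint list for the generator commonly denoted x0 is the standard one from the literature, not taken from these notes.

```python
from fractions import Fraction

def is_pow2(q):
    # q = 2^a for some integer a iff numerator and denominator are powers of two
    return q > 0 and q.numerator & (q.numerator - 1) == 0 \
        and q.denominator & (q.denominator - 1) == 0

def is_dyadic(q):
    # q = k / 2^n for integers k, n
    return q.denominator & (q.denominator - 1) == 0

def satisfies_F(vertices):
    # vertices: breakpoints (x, y) of the graph as Fractions, from (0,0) to (1,1),
    # with strictly increasing x-coordinates
    if vertices[0] != (0, 0) or vertices[-1] != (1, 1):
        return False
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        slope = (y1 - y0) / (x1 - x0)
        if not (is_pow2(slope) and is_dyadic(x0)):
            return False
    return True

Fr = Fraction
# standard generator x0 of F: slopes 1/2, 1, 2 on [0,1/2], [1/2,3/4], [3/4,1]
x0_gen = [(Fr(0), Fr(0)), (Fr(1, 2), Fr(1, 4)), (Fr(3, 4), Fr(1, 2)), (Fr(1), Fr(1))]
```

Composing two such maps again yields power-of-two slopes and dyadic breakpoints, which is why F is closed under composition.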

16 Quasi-isometric rigidity and embeddings {s.qirigid}

It is a huge project posed by Gromov (1981) to classify groups up to quasi-isometries. One step towards such a classification would be to describe all self-quasi-isometries of a given group. Certain


groups, e.g., fundamental groups of compact hyperbolic manifolds of dimension n ≥ 3, are quite rigid: all quasi-isometries come in some sense from group automorphisms. A similar, more classical, result is:

Theorem 16.1 (Mostow rigidity 1968). If two complete finite volume hyperbolic manifolds M, N have π1(M) ≅ π1(N), then they are isometric; moreover, the group isomorphism is induced by an isometry of Hn. {t.Mostow}

Let us give here the rough strategy of the proof. The group isomorphism induces a quasi-isometry of Hn, which then induces a quasi-conformal map on the ideal boundary S^{n−1}. This turns out (because of what?) to be a Möbius map, i.e., it comes from an isometry of Hn.

An application of the Mostow rigidity theorem that is interesting from a probabilistic point of view, related to Theorem 11.1, was by Thurston, who proved that any finite triangulated planar graph has an essentially unique circle packing representation. However, there is also a simple elementary proof, due to Oded Schramm, using a maximum principle argument [Wik10a].

On the other end, the quasi-isometry group of Z is huge and not known. Here is a completely probabilistic problem of the same flavor. See [Pel07] for details.

Question 16.2 (Balázs Szegedy). Take two independent Poisson point processes on R. Are they quasi-isometric to each other a.s.?

This is motivated by the probably even harder question of Miklós Abért (2003): are two independent infinite clusters of Ber(p) percolation on the same transitive graph quasi-isometric almost surely?

Gromov conjectured that if two groups are quasi-isometric to each other, then they are also bi-Lipschitz equivalent, i.e., there is a bijective quasi-isometry. The following proof of a special case I learnt from Gábor Elek:

Exercise 16.1.* Using wobbling paradoxical decompositions, show that if two non-amenable groups are quasi-isometric to each other, then they are also bi-Lipschitz equivalent.

However, the conjecture turned out to be false for solvable groups [Dym10]. But it is very much open for nilpotent groups, a favourite question of mine:

Question 16.3. Are quasi-isometric nilpotent groups also bi-Lipschitz equivalent?

I think the answer is yes, Bruce Kleiner thinks it's no. It is known to be yes if one of the groups is Zd, see [Sha04].

For a long while it was not known if all transitive graphs are quasi-isometric to some Cayley graph. For instance, transitive graphs of polynomial growth are always quasi-isometric to a nilpotent group. This is an interesting question from the percolation point of view, since we expect most properties to be quasi-isometry invariant, while a key tool, the Mass Transport Principle (12.4), works only for unimodular ones (including Cayley graphs). So, an affirmative answer would mean that percolation


theory is somehow on the wrong track. Fortunately, Eskin-Fisher-Whyte proved, as a byproduct of their work [EsFW07] on the quasi-isometric rigidity of the lamplighter groups F ≀ Z, where F is a finite Abelian group, that the non-unimodular Diestel-Leader graphs DL(k, ℓ) with k ≠ ℓ (see [Woe05] for their definition) are counterexamples.

Question 16.4. Is every unimodular transitive graph quasi-isometric to a Cayley graph?

The Eskin-Fisher-Whyte proof introduces something called coarse metric differentiation, a technique similar to the one used by Cheeger and Kleiner to prove that the Heisenberg group does not have a Lipschitz embedding into L1, see [ChKN09], and by Lee and Raghavendra to show that there are finite planar graphs needing at least a Lipschitz constant 2 [LeeRa10]. It is conjectured that a universal constant suffices for all planar graphs.

In general, it is a huge subject what finite and infinite metric spaces embed into what Lp space with how much metric distortion. One motivation is from theoretical computer science: analyzing large data sets (equipped with a natural metric, like the number of disagreements in two DNA sequences) is much easier if the data set is a subset of some nice space. We have also used nice harmonic embeddings into L2 to gain algebraic information (in Kleiner's proof of Gromov's theorem) and to analyze random walks (in the Ershler-Lee-Peres results). There are a lot of connections between random walks and embeddings, see [NaoP08]. The target case of L2 is easier, L1 is more mysterious.

Exercise 16.2. Show that any finite subset of L2 embeds isometrically into L1.
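For Exercise 16.2, one standard route (our suggestion, not spelled out in the notes) is via Gaussian projections: for a standard Gaussian vector g, the map p ↦ ⟨g, p⟩ satisfies E|⟨g, u⟩ − ⟨g, v⟩| = √(2/π) ‖u − v‖2, so averaging many independent projections gives an ℓ1 embedding whose distances converge to the Euclidean ones. A quick numerical illustration in Python (function name and constants are ours):

```python
import math
import random

def l1_embedding_error(points, m=20000, seed=0):
    # Embed each point p in R^d into l_1^m via the coordinates
    # sqrt(pi/2)/m * <g_j, p> for i.i.d. standard Gaussian directions g_j;
    # return the worst relative mismatch between l_1 and Euclidean distances.
    rng = random.Random(seed)
    d = len(points[0])
    dirs = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]
    scale = math.sqrt(math.pi / 2) / m
    emb = [[scale * sum(g[k] * p[k] for k in range(d)) for g in dirs] for p in points]
    worst = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            l2 = math.dist(points[i], points[j])
            l1 = sum(abs(u - v) for u, v in zip(emb[i], emb[j]))
            worst = max(worst, abs(l1 - l2) / l2)
    return worst
```

The relative error is of order 1/√m; turning this approximation into an exact isometric embedding is the content of the exercise.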

References
[AbH12] M. Abért and T. Hubai. Benjamini-Schramm convergence and the distribution of chromatic roots for sparse graphs. Preprint, arXiv:1201.3861 [math.CO]

[AbN07] M. Abért and N. Nikolov. Rank gradient, cost of groups and the rank versus Heegaard genus problem. J. Eur. Math. Soc., to appear. [arXiv:math.GR/0701361]

[AiBNW99] M. Aizenman, A. Burchard, C. M. Newman and D. B. Wilson. Scaling limits for minimal and random spanning trees in two dimensions. Random Structures Algorithms 15 (1999), 319–367. [arXiv:math.PR/9809145]

[AiW06] M. Aizenman and S. Warzel. The canopy graph and level statistics for random operators on trees. Math. Phys. Anal. Geom. 9 (2006), no. 4, 291–333 (2007). [arXiv:math-ph/0607021]

[AjKSz82] M. Ajtai, J. Komlós and E. Szemerédi. Largest random component of a k-cube. Combinatorica 2 (1982), 1–7.

[AjKSz87] M. Ajtai, J. Komlós and E. Szemerédi. Deterministic simulation in LOGSPACE. In Proc. of the 19th Ann. ACM Symp. on Theory of Computing, pp. 132–140, 1987.

[Ald97] D. Aldous. Brownian excursions, critical random graphs and the multiplicative coalescent. Ann. Probab. 25 (1997), 812–854. http://www.stat.berkeley.edu/~aldous/Papers/me73.ps.Z

[AldL07] D. Aldous and R. Lyons. Processes on unimodular random networks. Electron. J. Probab. 12 (2007), Paper 54, 1454–1508. http://128.208.128.142/~ejpecp/viewarticle.php?id=1754

[AldS04] D. Aldous and J. M. Steele. The objective method: probabilistic combinatorial optimization and local weak convergence. In Probability on Discrete Structures, vol. 110 of Encyclopaedia Math. Sci., pages 1–72. Springer, Berlin, 2004. http://www.stat.berkeley.edu/~aldous/Papers/me101.pdf

[Alo86] N. Alon. Eigenvalues and expanders. Combinatorica 6 (1986), 83–96.

[AloBS04] N. Alon, I. Benjamini and A. Stacey. Percolation on finite graphs and isoperimetric inequalities. Ann. Probab. 33 (2004), 1727–1745.

[AloM85] N. Alon and V. D. Milman. λ1, isoperimetric inequalities for graphs, and superconcentrators. J. Combin. Theory Ser. B 38 (1985), no. 1, 73–88.

[AloS00] N. Alon and J. H. Spencer. The Probabilistic Method. 2nd edition. Wiley, 2000.

[AmAV09] G. Amir, O. Angel and B. Virág. Amenability of linear-activity automaton groups. Preprint, arXiv:0905.2007 [math.GR].

[Ang03] O. Angel. Growth and percolation on the uniform infinite planar triangulation. Geom. Funct. Anal. 13 (2003), no. 5, 935–974. [arXiv:math.PR/0208123]

[AngB07] O. Angel and I. Benjamini. A phase transition for the metric distortion of percolation on the hypercube. Combinatorica 27 (2007), 645–658. [arXiv:math.PR/0306355]

[AngSz] O. Angel and B. Szegedy. On the recurrence of the weak limits of discrete structures. In preparation.

[AntP96] P. Antal and A. Pisztora. On the chemical distance in supercritical Bernoulli percolation. Ann. Probab. 24 (1996), 1036–1048.

[Arr83] R. Arratia. Site recurrence for annihilating random walks on Zd. Ann. Probab. 11 (1983), 706–713.

[ArzBLRSV05] G. N. Arzhantseva, J. Burillo, M. Lustig, L. Reeves, H. Short and E. Ventura. Uniform non-amenability. Adv. Math. 197 (2005), 499–522.

[Aus09] T. Austin. Rational group ring elements with kernels having irrational dimension. Preprint, arXiv:0909.2360v2 [math.GR].


[BabSz92] L. Babai and M. Szegedy. Local expansion of symmetrical graphs. Combin. Probab. & Computing 1 (1992), 1–11.

[BabB99] E. Babson and I. Benjamini. Cut sets and normed cohomology, with applications to percolation. Proc. Amer. Math. Soc. 127 (1999), 589–597.

[BaliBo] P. Balister and B. Bollobás. Projections, entropy and sumsets. Preprint, arXiv:0711.1151v1 [math.CO]

[BalBDCM10] J. Balogh, B. Bollobás, H. Duminil-Copin and R. Morris. The sharp threshold for bootstrap percolation in all dimensions. Trans. Amer. Math. Soc., to appear. arXiv:1010.3326 [math.PR]

[BalBM10] J. Balogh, B. Bollobás and R. Morris. Bootstrap percolation in high dimensions. Combinatorics, Probability & Computing 19 (2010), 643–692. arXiv:0907.3097 [math.PR]

[BalP07] J. Balogh and B. Pittel. Bootstrap percolation on random regular graphs. Random Struc. & Alg. 30 (2007), 257–286.

[BalPP06] J. Balogh, Y. Peres and G. Pete. Bootstrap percolation on infinite trees and non-amenable groups. Combin. Probab. & Computing 15 (2006), no. 5, 715–730. [arXiv:math.PR/0311125]

[BarE11] L. Bartholdi and A. Erschler. Poisson-Furstenberg boundary and growth of groups. Preprint, arXiv:1107.5499 [math.GR]

[BarGN03] L. Bartholdi, R. Grigorchuk and V. Nekrashevych. From fractal groups to fractal sets. Fractals in Graz (P. Grabner and W. Woess, eds.), Trends in Mathematics, Birkhäuser Verlag, Basel, 2003, pp. 25–118. [arXiv:math.GR/0202001v4]

[BarKN10] L. Bartholdi, V. Kaimanovich and V. Nekrashevych. On amenability of automata groups. Duke Math. J. 154 (2010), no. 3, 575–598. arXiv:0802.2837 [math.GR]

[BarV05] L. Bartholdi and B. Virág. Amenability via random walks. Duke Math. J. 130 (2005), 39–56. [arXiv:math.GR/0305262]

[BefDC10] V. Beffara and H. Duminil-Copin. The self-dual point of the two-dimensional random-cluster model is critical for q ≥ 1. Probab. Theory Related Fields, to appear. arXiv:1006.5073 [math.PR]

[BekdHV08] B. Bekka, P. de la Harpe and A. Valette. Kazhdan's property (T). Cambridge University Press, 2008. http://perso.univ-rennes1.fr/bachir.bekka/KazhdanTotal.pdf

[BekV97] M. Bekka and A. Valette. Group cohomology, harmonic functions and the first L2-Betti number. Potential Anal. 6 (1997), no. 4, 313–326.


[Bel03] I. Belegradek. On co-Hopfian nilpotent groups. Bull. London Math. Soc. 35 (2003), 805–811.

[Ben91] I. Benjamini. Instability of the Liouville property for quasi-isometric graphs and manifolds of polynomial volume growth. J. Theoret. Probab. 4 (1991), no. 3, 631–637.

[Ben08] I. Benjamini. Discussion at an American Institute of Mathematics workshop, Palo Alto, May 2008.

[BenC10] I. Benjamini and N. Curien. Ergodic theory on stationary random graphs. Preprint, arXiv:1011.2526 [math.PR]

[BenKPS04] I. Benjamini, H. Kesten, Y. Peres and O. Schramm. Geometry of the uniform spanning forest: transitions in dimensions 4, 8, 12, . . . . Ann. of Math. (2) 160 (2004), 465–491.

[BenK10] I. Benjamini and G. Kozma. Nonamenable Liouville graphs. Preprint, arXiv:1010.3365 [math.MG]

[BLPS99a] I. Benjamini, R. Lyons, Y. Peres and O. Schramm. Group-invariant percolation on graphs. Geom. Funct. Anal. 9 (1999), 29–66.

[BLPS99b] I. Benjamini, R. Lyons, Y. Peres and O. Schramm. Critical percolation on any nonamenable group has no infinite clusters. Ann. Probab. 27 (1999), 1347–1356.

[BLPS01] I. Benjamini, R. Lyons, Y. Peres and O. Schramm. Uniform spanning forests. Ann. Probab. 29 (2001), 1–65.

[BenLS99] I. Benjamini, R. Lyons and O. Schramm. Percolation perturbations in potential theory and random walks. In: Random walks and discrete potential theory (Cortona, 1997), Sympos. Math. XXXIX, M. Picardello and W. Woess (eds.), Cambridge Univ. Press, Cambridge, 1999, pp. 56–84. [arXiv:math.PR/9804010]

[BenNP11] I. Benjamini, A. Nachmias and Y. Peres. Is the critical percolation probability local? Probab. Theory Related Fields 149 (2011), 261–269. arXiv:0901.4616 [math.PR]

[BenS96a] I. Benjamini and O. Schramm. Harmonic functions on planar and almost planar graphs and manifolds, via circle packings. Invent. Math. 126 (1996), 565–587.

[BenS96b] I. Benjamini and O. Schramm. Random walks and harmonic functions on infinite planar graphs using square tilings. Ann. Probab. 24 (1996), 1219–1238.

[BenS96c] I. Benjamini and O. Schramm. Percolation beyond Zd, many questions and a few answers. Elect. Commun. Probab. 1 (1996), 71–82. http://www.math.washington.edu/~ejpecp/ECP/viewarticle.php?id=1561&layout=abstract

[BenS01] I. Benjamini and O. Schramm. Recurrence of distributional limits of finite planar graphs. Electron. J. Probab. 6, no. 23, 13 pp. (electronic). [arXiv:math.PR/0011019]

[BerKMP05] N. Berger, C. Kenyon, E. Mossel and Y. Peres. Glauber dynamics on trees and hyperbolic graphs. Probab. Theory Related Fields 131 (2005), no. 3, 311–340. [arXiv:math.PR/0308284]

[Bergm68] G. Bergman. On groups acting on locally finite graphs. Ann. Math. 88 (1968), 335–340.

[BhES09] S. Bhamidi, S. N. Evans and A. Sen. Spectra of large random trees. J. Theoret. Probab., to appear. arXiv:0903.3589v2 [math.PR].

[Bil86] P. Billingsley. Probability and measure. Second edition. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons, Inc., New York, 1986.

[BobT06] S. G. Bobkov and P. Tetali. Modified logarithmic Sobolev inequalities in discrete settings. J. Theoret. Probab. 19 (2006), no. 2, 289–336. http://people.math.gatech.edu/~tetali/PUBLIS/BT_ENT.pdf

[Bol81] B. Bollobás. The independence ratio of regular graphs. Proc. Amer. Math. Soc. 83 (1981), 433–436. http://www.jstor.org/stable/2043545

[Bol01] B. Bollobás. Random graphs. 2nd edition. Cambridge University Press, 2001.

[BolL91] B. Bollobás and I. Leader. Edge-isoperimetric inequalities in the grid. Combinatorica 11 (1991), 299–314.

[BolT87] B. Bollobás and A. Thomason. Threshold functions. Combinatorica 7 (1987), 35–38.

[BorBL09] J. Borcea, P. Brändén and T. M. Liggett. Negative dependence and the geometry of polynomials. J. Amer. Math. Soc. 22 (2009), 521–567. arXiv:0707.2340 [math.PR]

[BorChKL13] C. Borgs, J. Chayes, J. Kahn and L. Lovász. Left and right convergence of graphs with bounded degree. Random Structures & Algorithms 42 (2013), 1–28. arXiv:1002.0115 [math.CO]

[BouKKKL92] J. Bourgain, J. Kahn, G. Kalai, Y. Katznelson and N. Linial. The influence of variables in product spaces. Israel J. Math. 77 (1992), 55–64.

[BurgM00] M. Burger and S. Mozes. Lattices in products of trees. Publ. Math. I.H.E.S. 92 (2000), 151–194.

[BurtK89] R. M. Burton and M. Keane. Density and uniqueness in percolation. Comm. Math. Phys. 121 (1989), 501–505.

[BurtP93] R. M. Burton and R. Pemantle. Local characteristics, entropy and limit theorems for spanning trees and domino tilings via transfer-impedances. Ann. Probab. 21 (1993), 1329–1371. [arXiv:math.PR/0404048]


[Cal09] D. Calegari. Amenability of Thompson's group F? Geometry and the imagination, weblog. http://lamington.wordpress.com/2009/07/06/amenability-of-thompsons-group-f/

[CanFP96] J. W. Cannon, W. J. Floyd and W. R. Parry. Introductory notes on Richard Thompson's groups. Enseign. Math. (2) 42 (1996), no. 3-4, 215–256.

[CapM06] P. Caputo and F. Martinelli. Phase ordering after a deep quench: the stochastic Ising and hard core gas models on a tree. Probab. Theory Relat. Fields 136 (2006), 37–80. [arXiv:math.PR/0412450]

[Car85] T. K. Carne. A transmutation formula for Markov chains. Bull. Sci. Math. (2) 109 (1985), 399–405.

[Che70] J. Cheeger. A lower bound for the smallest eigenvalue of the Laplacian. In: R. C. Gunning, editor, Problems in Analysis, pages 195–199, Princeton Univ. Press, Princeton, NJ, 1970. A symposium in honor of Salomon Bochner, Princeton University, 1–3 April 1969.

[ChKN09] J. Cheeger, B. Kleiner and A. Naor. Compression bounds for Lipschitz maps from the Heisenberg group to L1. arXiv:0910.2026 [math.MG]

[ChPP04] D. Chen and Y. Peres, with an appendix by G. Pete. Anchored expansion, percolation and speed. Ann. Probab. 32 (2004), 2978–2995. [arXiv:math.PR/0303321]

[ColM97] T. H. Colding and W. P. Minicozzi II. Harmonic functions on manifolds. Ann. of Math. (2) 146 (1997), no. 3, 725–747.

[Cou00] T. Coulhon. Random walks and geometry on infinite graphs. Lecture notes on analysis on metric spaces, Trento, C.I.R.M., 1999. L. Ambrosio, F. Serra Cassano, eds., Scuola Normale Superiore di Pisa, 2000, pp. 5–30. http://www.u-cergy.fr/rech/pages/coulhon/trento.ps

[CouGP01] T. Coulhon, A. Grigoryan and C. Pittet. A geometric approach to on-diagonal heat kernel lower bounds on groups. Ann. Inst. Fourier (Grenoble) 51 (2001), 1763–1827.

[CouSC93] T. Coulhon and L. Saloff-Coste. Isopérimétrie pour les groupes et les variétés. Rev. Mat. Iberoamericana 9 (1993), 293–314.

[CovT06] T. M. Cover and J. A. Thomas. Elements of information theory. Second edition. John Wiley and Sons, New York, 2006.

[CsF12] P. Csikvári and P. E. Frenkel. Benjamini–Schramm continuity of root moments of graph polynomials. Preprint, arXiv:1204.0463 [math.CO]

[Dei07] P. Deift. Universality for mathematical and physical systems. In International Congress of Mathematicians (Madrid, 2006), Vol. I, pp. 125–152. Eur. Math. Soc., Zürich, 2007. http://www.icm2006.org/proceedings/Vol_I/11.pdf

[Del99] T. Delmotte. Parabolic Harnack inequality and estimates of Markov chains on graphs. Revista Matemática Iberoamericana 15 (1999), no. 1, 181–232.

[Der76] Y. Derriennic. Lois zéro ou deux pour les processus de Markov. Applications aux marches aléatoires. Ann. Inst. H. Poincaré Sect. B (N.S.) 12 (1976), no. 2, 111–129.

[DiaF90] P. Diaconis and J. A. Fill. Strong stationary times via a new form of duality. Ann. Probab. 18 (1990), 1483–1522.

[DicS02] W. Dicks and T. Schick. The spectral measure of certain elements of the complex group ring of a wreath product. Geom. Dedicata 93 (2002), 121–137.

[Die00] R. Diestel. Graph Theory. 2nd Edition. Graduate Texts in Mathematics, vol. 173. Springer, New York, 2000.

[DieL01] R. Diestel and I. Leader. A conjecture concerning a limit of non-Cayley graphs. J. Alg. Combin. 14 (2001), 17–25.

[Dod84] J. Dodziuk. Difference equations, isoperimetric inequality and transience of certain random walks. Trans. Amer. Math. Soc. 284 (1984), 787–794.

[Doo59] J. L. Doob. Discrete potential theory and boundaries. J. Math. Mech. 8 (1959), 433–458; erratum 993.

[Doy88] P. Doyle. Electric currents in infinite networks (preliminary version). Unpublished manuscript, available at [arXiv:math.PR/0703899].

[DrK09] C. Druţu and M. Kapovich. Lectures on geometric group theory. http://www.math.ucdavis.edu/~kapovich/EPR/ggt.pdf

[DumCS11] H. Duminil-Copin and S. Smirnov. Conformal invariance of lattice models. Lecture notes for the Clay Mathematical Institute Summer School in Buzios, 2010. arXiv:1109.1549 [math.PR]

[Dur96] R. Durrett. Probability: theory and examples. Second edition. Duxbury Press, 1996.

[Dym10] T. Dymarz. Bilipschitz equivalence is not equivalent to quasi-isometric equivalence for finitely generated groups. Duke Math. J. 154 (2010), 509–526. arXiv:0904.3764 [math.GR]

[EdS88] R. G. Edwards and A. D. Sokal. Generalization of the Fortuin-Kasteleyn-Swendsen-Wang representation and Monte Carlo algorithm. Phys. Rev. D (3) 38 (1988), no. 6, 2009–2012.

[Ele12] G. Elek. Full groups and soficity. Preprint, arXiv:1211.0621 [math.GR]

[ElL10] G. Elek and G. Lippner. Borel oracles. An analytical approach to constant-time algorithms. Proc. Amer. Math. Soc. 138 (2010), 2939–2947. arXiv:0907.1805 [math.CO]

[ElSz06] G. Elek and E. Szabó. On sofic groups. J. of Group Theory 9 (2006), 161–171. [arXiv:math.GR/0305352]

[ElTS05] G. Elek and V. T. Sós. Paradoxical decompositions and growth conditions. Combin., Probab. & Comput. 14 (2005), 81–105.

[vEnt87] A. C. D. van Enter. Proof of Straley's argument for bootstrap percolation. J. Stat. Phys. 48 (1987), 943–945.

[ErdR60] P. Erdős and A. Rényi. On the evolution of random graphs. Mat. Kutató Int. Közl. 5 (1960), 17–60.

[Ers04a] A. Erschler. Liouville property for groups and manifolds. Invent. Math. 155 (2004), 55–80.

[Ers04b] A. Erschler. Boundary behavior for groups of subexponential growth. Ann. of Math. (2) 160 (2004), no. 3, 1183–1210.

[EsFW07] A. Eskin, D. Fisher and K. Whyte. Quasi-isometries and rigidity of solvable groups. Pure and Applied Mathematics Quarterly 3 (2007), 927–947. [arXiv:math.GR/0511647]

[Eva98] L. C. Evans. Partial differential equations. Graduate Studies in Mathematics, 19. American Mathematical Society, Providence, RI, 1998.

[Føl55] E. Følner. On groups with full Banach mean values. Math. Scand. 3 (1955), 243–254.

[ForK72] C. M. Fortuin and P. W. Kasteleyn. On the random-cluster model. I. Introduction and relation to other models. Physica 57 (1972), 536–564.

[FriB99] E. Friedgut, with an Appendix by J. Bourgain. Sharp thresholds of graph properties, and the k-sat problem. Jour. Amer. Math. Soc. 12 (1999), 1017–1054. http://www.ma.huji.ac.il/~ehudf/docs/thre.ps http://www.ma.huji.ac.il/~ehudf/docs/bo.ps

[GaGa81] O. Gabber and Z. Galil. Explicit constructions of linear-sized superconcentrators. J. Comput. System Sci. 22 (1981), no. 3, 407–420.

[Gab02] D. Gaboriau. On orbit equivalence of measure preserving actions. Rigidity in dynamics and geometry (Cambridge, 2000), pp. 167–186, Springer, Berlin, 2002. http://www.umpa.ens-lyon.fr/~gaboriau/Travaux-Publi/Cambridge/Cambridge.html

[Gab05] D. Gaboriau. Invariant percolation and harmonic Dirichlet functions. Geom. Funct. Anal. 15 (2005), no. 5, 1004–1051. [arXiv:math.PR/0405458]

[Gab10] D. Gaboriau. Orbit equivalence and measured group theory. In: Proceedings of the 2010 Hyderabad ICM, to appear. arXiv:1009.0132v1 [math.GR]


[GabL09] D. Gaboriau and R. Lyons. A measurable-group-theoretic solution to von Neumann's problem. Invent. Math. 177 (2009), no. 3, 533–540. arXiv:0711.1643 [math.GR]

[GarPS10a] C. Garban, G. Pete and O. Schramm. The scaling limit of the Minimal Spanning Tree – a preliminary report. XVIth International Congress on Mathematical Physics (Prague 2009), ed. P. Exner, pp. 475–480. World Scientific, Singapore, 2010. arXiv:0909.3138 [math.PR]

[GarPS10b] C. Garban, G. Pete and O. Schramm. Pivotal, cluster and interface measures for critical planar percolation. Preprint, arXiv:1008.1378 [math.PR].

[GeHM01] H-O. Georgii, O. Häggström and C. Maes. The random geometry of equilibrium phases. Phase transitions and critical phenomena, Vol. 18, Academic Press, San Diego, CA, pp. 1–142. [arXiv:math.PR/9905031v1]

[GlW97] E. Glasner and B. Weiss. Kazhdan's property T and the geometry of the collection of invariant measures. Geom. Funct. Anal. 7 (1997), 917–935.

[GoV97] E. I. Gordon and A. M. Vershik. Groups that are locally embeddable in the class of finite groups. Algebra i Analiz 9 (1997), no. 1, 71–97. English translation: St. Petersburg Math. J. 9 (1997), no. 1, 49–67. http://www.pdmi.ras.ru/~vershik/gordon.ps

[Gra10] L. Grabowski. On Turing machines, dynamical systems and the Atiyah problem. Preprint, arXiv:1004.2030 [math.GR].

[GraG06] B. T. Graham and G. R. Grimmett. Influence and sharp threshold theorems for monotonic measures. Ann. Probab. 34 (2006), 1726–1745.

[GHS70] R. B. Griffiths, C. A. Hurst and S. Sherman. Concavity of magnetization of an Ising ferromagnet in a positive external field. J. Math. Phys. 11 (1970), 790–795.

[Gri83] R. I. Grigorchuk. On the Milnor problem of group growth. Soviet Math. Dokl. 28 (1983), no. 1, 23–26.

[GriNS00] R. I. Grigorchuk, V. V. Nekrashevich and V. I. Sushchanskii. Automata, dynamical systems and groups. Proceedings of the Steklov Institute of Mathematics 231 (2000), 128–203. http://www.math.tamu.edu/~grigorch/publications/PSIM128.PS

[GriP08] R. Grigorchuk and I. Pak. Groups of intermediate growth, an introduction. Enseign. Math. (2) 54 (2008), no. 3-4, 251–272. http://www.math.tamu.edu/~grigorch/publications/grigorchuk_pak_intermediate_growth.pdf

[GriZ01] R. Grigorchuk and A. Żuk. The lamplighter group as a group generated by a 2-state automaton and its spectrum. Geom. Dedicata 87 (2001), no. 1-3, 209–244.


[Gri99] G. Grimmett. Percolation. Second edition. Grundlehren der Mathematischen Wissenschaften, 321. Springer-Verlag, Berlin, 1999.

[Gri06] G. Grimmett. The random-cluster model. Grundlehren der mathematischen Wissenschaften, 333. Springer-Verlag, Berlin, 2006.

[Gri10] G. Grimmett. Probability on Graphs. IMS Textbook Series, Vol. 1. Cambridge University Press, 2010. http://www.statslab.cam.ac.uk/~grg/books/pgs.html

[GriM11] G. R. Grimmett and I. Manolescu. Inhomogeneous bond percolation on square, triangular, and hexagonal lattices. Preprint, arXiv:1105.5535 [math.PR].

[Gro81] M. Gromov. Groups of polynomial growth and expanding maps. Appendix by J. Tits. Publ. Math. I.H.E.S. 53 (1981), 53–73, 74–78.

[Gro93] M. Gromov. Asymptotic invariants of infinite groups. Geometric group theory, Vol. 2 (Sussex, 1991), pages 1–295. Cambridge Univ. Press, Cambridge, 1993.

[Gro99] M. Gromov. Endomorphisms of symbolic algebraic varieties. J. European Math. Soc. 1 (1999), 109–197.

[GuGN13] O. Gurel-Gurevich and A. Nachmias. Recurrence of planar graph limits. Ann. Math. 177 (2013), no. 2, 761–781. arXiv:1206.0707 [math.PR]

[Häg95] Ö. Häggström. Random-cluster measures and uniform spanning trees. Stochastic Process. Appl. 59 (1995), 267–275.

[HäM09] Ö. Häggström and P. Mester. Some two-dimensional finite energy percolation processes. Electr. Comm. Probab. 14 (2009), 42–54. arXiv:1011.2872 [math.PR]

[HäPS99] Ö. Häggström, Y. Peres and R. H. Schonmann. Percolation on transitive graphs as a coalescent process: Relentless merging followed by simultaneous uniqueness. Perplexing Problems in Probability, Festschrift in honor of Harry Kesten, ed. M. Bramson and R. Durrett, pages 69–90. Birkhäuser, Boston, 1999.

[HamMP12] A. Hammond, E. Mossel and G. Pete. Exit time tails from pairwise decorrelation in hidden Markov chains, with applications to dynamical percolation. Elect. J. Probab. 17 (2012), Article no. 68, 1–16. arXiv:1111.6618 [math.PR].

[HamPS12] A. Hammond, G. Pete and O. Schramm. Local time on the exceptional set of dynamical percolation, and the Incipient Infinite Cluster. Preprint, arXiv:1208.3826 [math.PR]

[HarvdHS03] T. Hara, R. van der Hofstad and G. Slade. Critical two-point functions and the lace expansion for spread-out high-dimensional percolation and related models. Ann. Probab. 31 (2003), no. 1, 349–408. http://www.math.ubc.ca/~slade/xspace_AOP128.pdf


[dlHar00] P. de la Harpe. Topics in geometric group theory. Chicago Lectures in Mathematics, University of Chicago Press, 2000.

[Har60] T. E. Harris. A lower bound for the critical probability in a certain percolation process. Proc. Cambridge Philos. Soc. 56 (1960), 13–20.

[HatLSz12] H. Hatami, L. Lovász and B. Szegedy. Limits of local-global convergent graph sequences. Preprint, arXiv:1205.4356 [math.CO].

[HeS95] Z. X. He and O. Schramm. Hyperbolic and parabolic packings. Discrete Comput. Geom. 14 (1995), 123–149.

[vdHof13] R. van der Hofstad. Random graphs and complex networks. Book in preparation. http://www.win.tue.nl/~rhofstad/NotesRGCN.pdf

[Hol07] A. E. Holroyd. Astonishing cellular automata (expository article). Bulletin du Centre de Recherches Mathématiques (2007), Vol. 13, No. 1, 10–13. http://www.math.ubc.ca/~holroyd/papers/cell.pdf

[HooLW06] S. Hoory, N. Linial and A. Wigderson. Expander graphs and their applications. Bulletin of the American Mathematical Society 43 (2006), No. 4, 439–561. http://www.cs.huji.ac.il/~nati/PAPERS/expander_survey.pdf

[How00] C. D. Howard. Zero-temperature Ising spin dynamics on the homogeneous tree of degree three. J. Appl. Prob. 37 (2000), 736–747.

[HowN03] C. D. Howard and C. M. Newman. The percolation transition for the zero-temperature stochastic Ising model on the hexagonal lattice. J. Stat. Physics 111 (2003), 57–72.

[IKT09] A. Ioana, A. S. Kechris and T. Tsankov. Subequivalence relations and positive-definite functions. Groups Geom. Dyn. 3 (2009), no. 4, 579–625. arXiv:0806.0430 [math.DS]

[JaR09] T. S. Jackson and N. Read. Theory of minimum spanning trees I: Mean-field theory and strongly disordered spin-glass model. arXiv:0902.3651v3 [cond-mat.stat-mech]

[JaLR00] S. Janson, T. Łuczak and A. Ruciński. Random graphs. Wiley-Interscience, New York, 2000.

[Jár03] A. Járai. Incipient infinite percolation clusters in 2D. Ann. Probab. 31 (2003), no. 1, 444–485.

[JoS99] J. Jonasson and J. E. Steif. Amenability and phase transition in the Ising model. J. Theoret. Probab. 12 (1999), 549–559. http://www.math.chalmers.se/~steif/p25.pdf

[KahKL88] J. Kahn, G. Kalai and N. Linial. The influence of variables on boolean functions. 29th Annual Symposium on Foundations of Computer Science, 68–80, 1988. http://www.ma.huji.ac.il/~kalai/kkl.ps

[Kai00] V. Kaimanovich. The Poisson formula for groups with hyperbolic properties. Ann. of Math. (2) 152 (2000), no. 3, 659–692.

[KaiV83] V. A. Kaimanovich and A. M. Vershik. Random walks on discrete groups: boundary and entropy. Ann. Probab. 11 (1983), 457–490.

[KalS06] G. Kalai and S. Safra. Threshold phenomena and influence: perspectives from mathematics, computer science, and economics. In Computational complexity and statistical physics, Santa Fe Inst. Stud. Sci. Complex., pages 25–60. Oxford Univ. Press, New York, 2006. http://www.ma.huji.ac.il/~kalai/ML.pdf

[Kap07] M. Kapovich. Energy of harmonic functions and Gromov's proof of the Stallings theorem. Preprint, arXiv:0707.4231 [math.GR], 2007.

[KecM04] A. S. Kechris and B. D. Miller. Topics in orbit equivalence. Springer, 2004.

[Kes59] H. Kesten. Full Banach mean values on countable groups. Math. Scand. 7 (1959), 146–156.

[Kes80] H. Kesten. The critical probability of bond percolation on the square lattice equals 1/2. Comm. Math. Phys. 74 (1980), no. 1, 41–59.

[Kes86] H. Kesten. The incipient infinite cluster in two-dimensional percolation. Probab. Th. Rel. Fields 73 (1986), 369–394.

[Kir07] W. Kirsch. An invitation to random Schrödinger operators. Lecture notes, arXiv:0709.3707 [math-ph].

[Kle10] B. Kleiner. A new proof of Gromov's theorem on groups of polynomial growth. J. Amer. Math. Soc. 23 (2010), 815–829. arXiv:0710.4593 [math.GR]

[KorS97] N. J. Korevaar and R. M. Schoen. Global existence theorems for harmonic maps to non-locally compact spaces. Comm. Anal. Geom. 5 (1997), no. 2, 333–387.

[Koz07] G. Kozma. Percolation, perimetry, planarity. Rev. Math. Iberoam. 23 (2007), no. 2, 671–676. [arXiv:math.PR/0509235]

[Koz11] G. Kozma. Percolation on a product of two trees. Ann. Probab. 39 (2011), 1864–1895. Special issue dedicated to the memory of Oded Schramm. arXiv:1003.5240 [math.PR].

[KoN09a] G. Kozma and A. Nachmias. The Alexander-Orbach conjecture holds in high dimensions. Invent. Math. 178 (2009), no. 3, 635–654. arXiv:0806.1442 [math.PR]

[KoN09b] G. Kozma and A. Nachmias. Arm exponents in high dimensional percolation. J. Amer. Math. Soc., to appear. arXiv:0911.0871v1 [math.PR].


[LeeP09] J. Lee and Y. Peres. Harmonic maps on amenable groups and a diffusive lower bound for random walks. Ann. Probab., to appear. arXiv:0911.0274 [math.PR]

[LeeRa10] J. Lee and P. Raghavendra. Coarse differentiation and multi-flows in planar graphs. Discrete Comput. Geom. 43 (2010), 346–362. arXiv:0804.1573v4 [math.MG]

[LeNW07] F. Lehner, M. Neuhauser and W. Woess. On the spectrum of lamplighter groups and percolation clusters. Math. Ann., to appear. arXiv:0712.3135v2 [math.FA]

[LevPW09] D. A. Levin, Y. Peres and E. L. Wilmer. Markov chains and mixing times. With a chapter by J. G. Propp and D. B. Wilson. American Mathematical Society, Providence, RI, 2009. http://www.uoregon.edu/~dlevin/MARKOV/

[Los87] V. Losert. On the structure of groups with polynomial growth. Math. Z. 195 (1987), 109–117.

[LovKa99] L. Lovász and R. Kannan. Faster mixing via average conductance. Proceedings of the 27th Annual ACM Symposium on theory of computing, 1999.

[Lub94] A. Lubotzky. Discrete groups, expanding graphs and invariant measures. Progress in Mathematics, 125. Birkhäuser Verlag, Basel, 1994. With an appendix by J. D. Rogawski.

[Lüc94] W. Lück. Approximating L²-invariants by their finite-dimensional analogues. Geom. Funct. Anal. 4 (1994), no. 4, 455–481.

[Lyo92] R. Lyons. Random walks, capacity, and percolation on trees. Ann. Probab. 20 (1992), 2043–2088. http://mypage.iu.edu/~rdlyons/ps/cap11.ps

[Lyo00] R. Lyons. Phase transitions on nonamenable graphs. J. Math. Phys. 41 (2000), 1099–1126. [arXiv:math.PR/9908177]

[Lyo05] R. Lyons. Asymptotic enumeration of spanning trees. Combin. Probab. & Comput. 14 (2005), 491–522. [arXiv:math.CO/0212165]

[Lyo09] R. Lyons. Random complexes and ℓ²-Betti numbers. J. Top. Analysis 1 (2009), 153–175. arXiv:0811.2933 [math.PR]

[LyNaz11] R. Lyons and F. Nazarov. Perfect matchings as IID factors on non-amenable groups. Europ. J. Combin. 32 (2011), 1115–1125. arXiv:0911.0092v2 [math.PR]

[LyPer10] R. Lyons, with Y. Peres. Probability on trees and networks. Book in preparation, present version is at http://mypage.iu.edu/~rdlyons.

[LyPS03] R. Lyons, Y. Peres and O. Schramm. Markov chain intersections and the loop-erased walk. Ann. Inst. H. Poincaré, Probab. Statist. 39 (2003), 779–791. [arXiv:math.PR/0107055]


[LyPS06] R. Lyons, Y. Peres and O. Schramm. Minimal spanning forests. Ann. Probab. 34 (2006), no. 5, 1665–1692. [arXiv:math.PR/0412263v5]

[LySch99] R. Lyons and O. Schramm. Indistinguishability of percolation clusters. Ann. Probab. 27 (1999), 1809–1836. [arXiv:math.PR/9811170v2]

[LyT83] T. Lyons. A simple criterion for transience of a reversible Markov chain. Ann. Probab. 11 (1983), 393–402.

[LyT87] T. Lyons. Instability of the Liouville property for quasi-isometric Riemannian manifolds and reversible Markov chains. J. Differential Geom. 26 (1987), no. 1, 33–66.

[Mes10] P. Mester. Invariant monotone coupling need not exist. Ann. Probab., to appear. arXiv:1011.2283 [math.PR]

[Mes11] P. Mester. A factor of i.i.d. with uniform marginals and infinite clusters spanned by equal labels. Preprint, arXiv:1111.3067 [math.PR].

[Mil68a] J. Milnor. A note on curvature and fundamental group. J. Diff. Geom. 2 (1968), 1–7.

[Mil68b] J. Milnor. Growth of finitely generated solvable groups. J. Diff. Geom. 2 (1968), 447–449.

[Moh88] B. Mohar. Isoperimetric inequalities, growth, and the spectrum of graphs. Linear Algebra Appl. 103 (1988), 119–131.

[MohW89] B. Mohar and W. Woess. A survey on spectra of infinite graphs. Bull. London Math. Soc. 21 (1989), no. 3, 209–234.

[Mok95] N. Mok. Harmonic forms with values in locally constant Hilbert bundles. Proceedings of the Conference in Honor of Jean-Pierre Kahane (Orsay, 1993). J. Fourier Anal. Appl. (1995), Special Issue, 433–453.

[MonMS12] A. Montanari, E. Mossel and A. Sly. The weak limit of Ising models on locally tree-like graphs. Probab. Theory Related Fields 152 (2012), 31–51. arXiv:0912.0719 [math.PR]

[MorP05] B. Morris and Y. Peres. Evolving sets, mixing and heat kernel bounds. Prob. Th. Rel. Fields 133 (2005), no. 2, 245–266. [arXiv:math.PR/0305349]

[Mor10] R. Morris. Zero-temperature Glauber dynamics on Zd. Prob. Th. Rel. Fields (2010), arXiv:0809.0353v2 [math-ph]

[MöP10] P. Mörters and Y. Peres. Brownian motion. Cambridge Series in Statistical and Probabilistic Mathematics, Vol. 30. Cambridge University Press, 2010. http://people.bath.ac.uk/maspm/book.pdf

[MuP01] R. Muchnik and I. Pak. Percolation on Grigorchuk groups. Comm. Algebra 29 (2001), 661–671. http://www.math.ucla.edu/~pak/papers/perc.ps

[NaNS00] S. Nanda, C. M. Newman and D. Stein. Dynamics of Ising spin systems at zero temperature. In On Dobrushin's way (From Probability Theory to Statistical Mechanics), eds. R. Minlos, S. Shlosman and Y. Suhov, Am. Math. Soc. Transl. (2) 198 (2000), 183–194.

[NaoP08] A. Naor and Y. Peres. Embeddings of discrete groups and the speed of random walks. International Mathematics Research Notices (2008), article ID rnn076, 34 pages. arXiv:0708.0853 [math.MG]

[Nek03] V. V. Nekrashevych. Iterated monodromy groups. [arXiv:math.DS/0312306]

[Nek05] V. V. Nekrashevych. Self-similar groups. Mathematical Surveys and Monographs, Vol. 117, Amer. Math. Soc., Providence, RI, 2005.

[NekP09] V. Nekrashevych and G. Pete. Scale-invariant groups. Groups, Geometry and Dynamics, to appear. arXiv:0811.0220v4 [math.GR]

[NewS96] C. M. Newman and D. L. Stein. Ground-state structure in a highly disordered spin-glass model. J. Statist. Phys. 82 (1996), 1113–1132.

[OW87] D. S. Ornstein and B. Weiss. Entropy and isomorphism theorems for actions of amenable groups. J. Analyse Math. 48 (1987), 1–141.

[Osi02] D. V. Osin. Weakly amenable groups. Computational and statistical group theory (Las Vegas, NV/Hoboken, NJ, 2001), Contemporary Mathematics, vol. 298, American Mathematical Society, Providence, RI, 2002, pp. 105–113.

[PaSN00] I. Pak and T. Smirnova-Nagnibeda. On non-uniqueness of percolation on nonamenable Cayley graphs. Comptes Rendus de l'Académie des Sciences, Series I, Mathematics 330 (2000), No. 6, 495–500. http://www.math.ucla.edu/~pak/papers/psn.pdf

[Pel07] R. Peled. On rough isometries of Poisson processes on the line. Annals of Applied Probability, to appear. arXiv:0709.2383v2 [math.PR]

[Per04] Y. Peres. Course notes for Probability on Trees and Networks, UC Berkeley, Fall 2004. Edited by Asaf Nachmias. http://stat-www.berkeley.edu/~peres/notes1.pdf

[Per09] Y. Peres. Connectivity probability in critical percolation: An unpublished gem from Oded. Lecture at the Oded Schramm Memorial Conference, August 30-31, 2009, Microsoft Research. Viewable with Internet Explorer and Windows Media Player at http://content.digitalwell.washington.edu/msr/external_release_talks_12_05_2005/17504/lectu

[PerPS06] Y. Peres, G. Pete and A. Scolnicov. Critical percolation on certain non-unimodular graphs. New York J. of Math. 12 (2006), 1–18. [arXiv:math.PR/0501532]

[PerR04] Y. Peres and D. Revelle. Mixing times for random walks on finite lamplighter groups. Electron. J. Probab. 26 (2004), 825–845. [arXiv:math.PR/0404190]

[PerW10] Y. Peres, with D. Wilson. Game Theory: Alive. Book in preparation, available at http://dbwilson.com/games/.

[Pes08] V. G. Pestov. Hyperlinear and sofic groups: a brief guide. Bulletin of Symbolic Logic 14 (2008), 449–480. arXiv:0804.3968v8 [math.GR]

[Pet08] G. Pete. A note on percolation on Zd: isoperimetric profile via exponential cluster repulsion. Elect. Comm. Probab. 13 (2008), 377–392. [arXiv:math.PR/0702474v4]

[Pey08] R. Peyre. A probabilistic approach to Carne's bound. Potential Analysis 29 (2008), 17–36. http://www.normalesup.org/~rpeyre/pro/research/carne.pdf

[Pin73] M. Pinsker. On the complexity of a concentrator. 7th International Teletraffic Conference, Stockholm, 1973. 318/1-4.

[Pip77] N. Pippenger. Superconcentrators. SIAM J. Comput. 6 (1977), no. 2, 298–304.

[Rau04] A. Raugi. A general Choquet-Deny theorem for nilpotent groups. Ann. Inst. H. Poincaré Probab. Statist. 40 (2004), no. 6, 677–683.

[Rud73] W. Rudin. Functional Analysis. McGraw-Hill, 1973.

[SaC97] L. Saloff-Coste. Lectures on finite Markov chains. Lectures on probability theory and statistics (Saint-Flour, 1996), 301–413. LNM 1665, Springer, Berlin, 1997.

[Sap07] M. V. Sapir. Some group theory problems. Internat. J. of Algebra and Comput. 17 (2007), 1189–1214. arXiv:0704.2899v1 [math.GR]

[Scho92] R. H. Schonmann. On the behavior of some cellular automata related to bootstrap percolation. Ann. Probab. 20 (1992), no. 1, 174–193.

[Scho01] R. H. Schonmann. Multiplicity of phase transitions and mean-field criticality on highly non-amenable graphs. Comm. Math. Phys. 219 (2001), no. 2, 271–322.

[Scho02] R. H. Schonmann. Mean-field criticality for percolation on planar non-amenable graphs. Comm. Math. Phys. 225 (2002), no. 3, 453–463.

[Schr07] Oded Schramm. Conformally invariant scaling limits: an overview and a collection of problems. In International Congress of Mathematicians (Madrid, 2006), Vol. I, pages 513–543. Eur. Math. Soc., Zürich, 2007. [arXiv:math.PR/0602151]

[Sha04] Y. Shalom. Harmonic analysis, cohomology, and the large-scale geometry of amenable groups. Acta Math. 192 (2004), 119–185.

[ShaT09] Y. Shalom and T. Tao. A finitary version of Gromov's polynomial growth theorem. Preprint, arXiv:0910.4148 [math.GR].


[Sid00] S. Sidki. Automorphisms of one-rooted trees: growth, circuit structure, and acyclicity. J. Math. Sci. (New York) 100 (2000), no. 1, 1925–1943.

[Sla02] G. Slade. Scaling limits and super-Brownian motion. Notices of Amer. Math. Soc. 49 (2002), no. 9, 1056–1067. http://www.ams.org/notices/200209/index.html

[Sla06] G. Slade. The lace expansion and its applications. Lecture Notes in Mathematics 1879, xiv + 232 pages. Springer, Berlin, 2006.

[Sly10] A. Sly. Computational transition at the uniqueness threshold. Extended abstract to appear in the Proceedings of FOCS 2010. Preprint at arXiv:1005.5584 [cs.CC].

[Sok01] A. D. Sokal. Bounds on the complex zeros of (di)chromatic polynomials and Potts-model partition functions. Combinatorics, Probability and Computing 10 (2001), 41–77. [arXiv:cond-mat/9904146]

[Spa09] I. Spakulova. Critical percolation of virtually free groups and other tree-like graphs. Ann. Probab. 37 (2009), no. 6, 2262–2296. arXiv:0801.4153 [math.PR]

[Sta68] J. Stallings. On torsion-free groups with infinitely many ends. Ann. of Math. 88 (1968), 312–334.

[SwW87] R. H. Swendsen and J. S. Wang. Nonuniversal critical dynamics in Monte Carlo simulations. Phys. Rev. Lett. 58 (1987), no. 2, 86–88.

[Tao10] T. Tao. A proof of Gromov's theorem. What's new blog entry, February 2010, http://terrytao.wordpress.com/2010/02/18/a-proof-of-gromovs-theorem/

[Tho92] C. Thomassen. Isoperimetric inequalities and transient random walks on graphs. Ann. Probab. 20 (1992), 1592–1600.

[Tim06a] Á. Timár. Neighboring clusters in Bernoulli percolation. Ann. Probab. 34 (2006), no. 6, 2332–2343. [arXiv:math.PR/0702873]

[Tim06b] Á. Timár. Percolation on nonunimodular transitive graphs. Ann. Probab. 34 (2006), no. 6, 2344–2364. [arXiv:math.PR/0702875]

[Tim07] Á. Timár. Cutsets in infinite graphs. Combin. Probab. & Comput. 16 (2007), no. 1, 159–166. arXiv:0711.1711 [math.CO]

[Tim11] Á. Timár. Approximating Cayley diagrams versus Cayley graphs. Preprint, arXiv:1103.4968 [math.CO].

[Tit72] J. Tits. Free subgroups in linear groups. J. Algebra 20 (1972), 250–270.

[Tro85] V. I. Trofimov. Graphs with polynomial growth. Math. USSR Sbornik 51 (1985), 405–417.


[Var85a] N. Th. Varopoulos. Isoperimetric inequalities and Markov chains. J. Funct. Anal. 63 (1985), 215–239.

[Var85b] N. Th. Varopoulos. Long range estimates for Markov chains. Bull. Sci. Math. (2) 109 (1985), no. 3, 225–252.

[VarSCC92] N. Th. Varopoulos, L. Saloff-Coste and T. Coulhon. Analysis and geometry on groups. Cambridge University Press, Cambridge, 1992.

[Ver00] A. M. Vershik. Dynamic theory of growth in groups: Entropy, boundaries, examples. Russ. Math. Surveys 55 (2000), 667–733.

[Wag93] S. Wagon. The Banach-Tarski paradox. With a foreword by Jan Mycielski. Corrected reprint of the 1985 original. Cambridge University Press, Cambridge, 1993.

[Wei00] B. Weiss. Sofic groups and dynamical systems. Ergodic theory and harmonic analysis (Mumbai, 1999). Sankhyā: The Indian J. of Stat., Ser. A 62 (2000), no. 3, 350–359.

[Wer03] W. Werner. Random planar curves and Schramm-Loewner evolutions. Lecture notes from the 2002 Saint-Flour Summer School, [arXiv:math.PR/0303354].

[Wer07] W. Werner. Lectures on two-dimensional critical percolation, IAS Park City Graduate Summer School, 2007. arXiv:0710.0856v2 [math.PR]

[Wik10a] English Wikipedia. Circle packing theorem. Version of May 7, 2010. http://en.wikipedia.org/wiki/Circle_packing_theorem#A_uniqueness_statement

[Wik10b] English Wikipedia. FKG inequality. Version of May 7, 2010. http://en.wikipedia.org/wiki/FKG_inequality

[Wil98] J. S. Wilson. Profinite groups. London Mathematical Society Monographs. New Series, 19. The Clarendon Press, Oxford University Press, New York, 1998.

[Wil09] H. Wilton. 392C Geometric Group Theory, taught at the University of Texas at Austin, Spring 2009. http://392c.wordpress.com/2009/01/

[Woe00] W. Woess. Random walks on infinite graphs and groups. Cambridge Tracts in Mathematics Vol. 138, Cambridge University Press, 2000.

[Woe05] W. Woess. Lamplighters, Diestel-Leader graphs, random walks, and harmonic functions. Combin. Probab. & Comput. 14 (2005), 415–433.

[Wol68] J. Wolf. Growth of finitely generated solvable groups and curvature of Riemannian manifolds. J. Diff. Geom. 2 (1968), 421–446.

[Zuk00] A. Żuk. On an isoperimetric inequality for infinite finitely generated groups. Topology 39 (2000), no. 5, 947–956.

