Nicholas Nethercote
Computer Laboratory
University of Cambridge
United Kingdom
njn25@cam.ac.uk

Jeremy Fitzhardinge
San Francisco
United States
jeremy@goop.org
ABSTRACT
This paper presents a new technique for performing bounds-checking. We use
dynamic binary instrumentation to modify programs at run-time, track pointer
bounds information and check all memory accesses. The technique is neither
sound nor complete, however it has several very useful characteristics: it
works with programs written in any programming language, and it requires no
compiler support, no code recompilation, no source code, and no special
treatment for libraries. The technique performs best when debug information
and symbol tables are present in the compiled program, but degrades
gracefully when this information is missing: fewer errors are found, but
false positives do not increase. We describe our prototype implementation,
and consider how it could be improved by better interaction with a compiler.
Keywords
Bounds-checking, memory debuggers
1. INTRODUCTION
Low-level programming languages like C and C++ provide raw memory pointers,
permit pointer arithmetic, and do not check bounds when accessing arrays.
This can result in very efficient code, but the unfortunate side-effect is
that accidentally accessing the wrong memory is a very common programming
error.
SPACE 2004 Venice, Italy
Copyright 2003 ACM X-XXXXX-XX-X/XX/XX ...$5.00.
The most obvious example in this class of errors is exceeding the bounds of
an array. However the bounds of non-array data objects can also be violated,
such as heap blocks, C structs, and stack frames. We will describe as a
bounds error any memory access which falls outside the intended memory range.
These errors are not difficult to introduce, and they can cause a huge range
of bugs, some of which can be extremely subtle and lurk undetected for years.
Because of this, tools for preventing and identifying them are extremely
useful.
Many such tools are available, using a variety of techniques. None are ideal,
and each one has a different set of characteristics, including:
the kinds of bugs they can and cannot spot;
In this paper we describe a new technique for identifying bounds errors. The
basic idea is that all data objects that have bounds we wish to check are
tracked, as are all pointers; each pointer has a legitimate memory range it
can access, and accesses outside this range are flagged as errors. Our tool
effectively turns a program's pointers into fat pointers on-the-fly. However,
the pointer ranges are maintained separately from the pointers themselves, so
pointer sizes do not change.
The technique spots many, but not all, bounds errors in the heap, the stack,
and in static memory; it gives few false positives; it is dynamic, and relies
on dynamic binary translation. These characteristics are not particularly
exciting.
The main contribution of this paper is that our bounds-checking technique has
several unique characteristics: it does not require any compiler support; it
works with programs written in any language; and it checks entire programs,
without requiring any special treatment for libraries. This last
characteristic is particularly important: inadequate treatment of library
code is the single major shortcoming of many previous bounds-checking
techniques.
Thus, our technique has its advantages and disadvantages, and is unlikely to
be a world-beater in its own right. However, it is a useful complement to
existing techniques, and
2. THEORY
This section provides a high-level description of our technique. Certain
details are kept vague, and subsequently fleshed out in Section 3.
2.1 Overview
The basic idea is simple. Every pointer has a range of addresses it can
legitimately access. The range depends on what the pointer originally pointed
to. For example, the legitimate range of a pointer returned by malloc() is
the bounds of the heap block pointed to by that pointer. We call that range
the pointer's segment. All memory accesses are checked to make sure that the
memory accessed is within the accessing pointer's segment. Any violations are
reported.
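The core check can be illustrated with a minimal sketch. It assumes a segment is described only by a base address and a size; the Segment type and the access_in_segment() function are hypothetical names introduced for illustration and are not taken from the prototype.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment record: the legitimate address range of a pointer. */
typedef struct {
    uintptr_t base;  /* first valid address */
    size_t    size;  /* number of valid bytes starting at base */
} Segment;

/* Returns true if the access [addr, addr + len) lies entirely inside seg. */
static bool access_in_segment(const Segment *seg, uintptr_t addr, size_t len)
{
    if (addr < seg->base)
        return false;                     /* underrun */
    uintptr_t offset = addr - seg->base;
    /* Compare against the remaining space so that addr + len cannot overflow. */
    return offset <= seg->size && len <= seg->size - offset;
}
```

For a 10-byte segment, a 4-byte access starting at offset 6 passes the check, while the same access starting at offset 8 would be reported as a violation.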
Pointers are often used in operations other than memory accesses. The obvious
example is pointer arithmetic; for example, array elements are indexed using
addition on the array's base pointer. However, if two pointers are added, the
result should not be used to access memory; nor should a non-pointer value be
used to access memory. Thus we also need to know which program values are
non-pointers. The result is that every value has a run-time type, and we need
a simple type system to determine the type of all values produced by the
program.
The following sections describe aspects of the technique in more detail.
2.2 Metadata
The technique requires maintenance of metadata describing the run-time types
of data objects. We say this metadata shadows the real values. This metadata
has one of the following four forms.
In principle, every value produced, of any size, can be assigned a type.
However, our most important checking takes place when memory is accessed,
which is always through word-sized pointers. Therefore, the shadow value
tracking can be done at word-sized granularity.
: Do nothing.
# load a[0]
# load a[5]
# load a[5]
1
Section 3.9 explains how we can deal with programs that use custom allocators.
# load a[5]
# load a[5]
# load a[10]
2.5 Operations
Every data-manipulating instruction executed by the program must be shadowed
by an operation to manipulate the relevant metadata. These shadow operations
are described in the following sections.
[Figure 1: shadow operation tables for (a) Add, (b) Multiply, (c) Bitwise-and, (d) Bitwise-xor]
on pointers, e.g. when putting pointers through a hash function. Several
times we had to remove warnings on pointer arithmetic operations we had
assumed were ridiculous, because real programs occasionally do them.
Generally, this should not be a problem because the result is always marked
as n, and any subsequent use of the result to access memory will be flagged
as an error.
Bitwise-and, shown in Figure 1(c), is more subtle. If a non-pointer is
bitwise-and'd with a pointer, the result can be a non-pointer or a pointer,
depending on the non-pointer value. For example, if the non-pointer has value
0xfffffff0, the operation is probably finding some kind of base value, and
the result is a pointer. If the non-pointer has value 0x000000ff, the
operation is probably finding some kind of offset, and the result is a
non-pointer. We deal with these possibilities by assuming the result is a
pointer, but also doing the range test on the result and converting it to n
if necessary. The resulting shadow operation is thus the same as that for
addition.
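The following is a rough sketch of such a shadow operation, not the prototype's code. The Shadow type, the function names, and in particular the two thresholds used for the range test are assumptions made for illustration; the text above does not specify how the range test is defined, nor how operands of type u are treated (here they simply yield u).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TYPE_N, TYPE_U, TYPE_P } Kind;   /* n, u, p(X) */

typedef struct {
    Kind        kind;
    const void *segment;   /* X when kind == TYPE_P, otherwise NULL */
} Shadow;

/* Illustrative range test: the thresholds are assumptions, not from the text. */
#define PLAUSIBLE_PTR_MIN 0x01000000u
#define PLAUSIBLE_PTR_MAX 0xff000000u

static int plausible_pointer(uint32_t v)
{
    return v >= PLAUSIBLE_PTR_MIN && v <= PLAUSIBLE_PTR_MAX;
}

static Shadow make_shadow(Kind kind, const void *segment)
{
    Shadow s = { kind, segment };
    return s;
}

/* Shadow operation shared by addition and bitwise-and.
 * 'result' is the real 32-bit value computed by the instruction. */
static Shadow shadow_add_or_and(Shadow a, Shadow b, uint32_t result)
{
    if (a.kind == TYPE_P && b.kind == TYPE_P)
        return make_shadow(TYPE_N, NULL);        /* pointer with pointer: n */
    if (a.kind == TYPE_N && b.kind == TYPE_N)
        return make_shadow(TYPE_N, NULL);        /* n with n: n */
    if ((a.kind == TYPE_P && b.kind == TYPE_N) ||
        (a.kind == TYPE_N && b.kind == TYPE_P)) {
        Shadow p = (a.kind == TYPE_P) ? a : b;
        /* Assume a pointer, then demote to n if the value fails the range test. */
        return plausible_pointer(result) ? p : make_shadow(TYPE_N, NULL);
    }
    return make_shadow(TYPE_U, NULL);            /* assumption: u operands give u */
}
```

With these assumed thresholds, masking a pointer 0x0804a010 with 0xfffffff0 keeps the type p(X), while masking it with 0x000000ff yields 0x10, which fails the range test and becomes n.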
For bitwise-xor, shown in Figure 1(d), we do not try anything tricky; we
simply use a range test to choose either u or n. This is because there are
not any sensible ways to transform a pointer with bitwise-xor. However, there
are two cases where bitwise-xor could be used in a non-transformative way.
First, the following C code swaps two pointers using bitwise-xor.
p1 ^= p2;
p2 ^= p1;
p1 ^= p2;
2.5.4 Subtraction
We have not mentioned subtraction yet. It is somewhat similar to addition,
and is shown in Figure 2. Subtracting two non-pointers gives a non-pointer;
subtracting a non-pointer from a pointer gives a pointer; subtracting a
pointer from a non-pointer is considered an error.
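These rules can be written down as a small decision function. The sketch below is illustrative only: the enum and function names are hypothetical, operands of type u are left out, and the pointer-minus-pointer case is simply mapped to a non-pointer without modelling any check on whether the two segments match.

```c
#include <assert.h>

typedef enum { SH_N, SH_P } ShadowKind;              /* n or p(X) operand */
typedef enum { SUB_N, SUB_P, SUB_ERROR } SubResult;  /* type of the difference */

/* Shadow operation for 'minuend - subtrahend'. */
static SubResult shadow_sub(ShadowKind minuend, ShadowKind subtrahend)
{
    if (minuend == SH_N && subtrahend == SH_N)
        return SUB_N;       /* n - n gives n */
    if (minuend == SH_P && subtrahend == SH_N)
        return SUB_P;       /* p(X) - n gives p(X), keeping the same segment */
    if (minuend == SH_N && subtrahend == SH_P)
        return SUB_ERROR;   /* n - p(X) is considered an error */
    return SUB_N;           /* p(X) - p(Y) gives a non-pointer */
}
```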
[Figure 2: shadow operation table for subtraction]
The big complication is that subtracting one pointer from another is
legitimate, and the result is a non-pointer. If the two pointers involved in
the subtraction point to the same segment, there is no problem. However
consider this C code:
char p1[10];
char p2[10];
int diff = p2 - p1;
p1[diff] = 0;
2.6 Comments
The technique is very heavyweight, since it relies on (a) instrumenting every
instruction in the program that moves an existing value, or produces a new
value, and (b) shadowing every word in registers and memory with a shadow
value.
3. PRACTICE
5
Shadow type information for words in memory is stored in an equally-sized
piece of shadow memory. Similarly, each register has a corresponding shadow
register.
6
This is a simplified garbage collection, as the traversals will only be one
level deep, since segment structures cannot point to other segment structures.
A segment-type X is represented by a dynamically allocated segment structure
containing its base address, size, a pointer to an "execution context" (a
stack-trace from the time the segment is allocated, which is updated again
when the segment is freed), and a tag indicating which part of memory the
segment is in (heap, stack, or static) and its status (in-use or freed). Each
structure is 16 bytes. Execution contexts are stored separately, in such a
way that no single context is stored more than once, because repeated
contexts are very common. Each execution context contains n pointers, where n
is the stack-trace depth chosen by the user (default depth is four).
A non-pointer-type n is represented by a constant NONPTR.
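One possible layout for such a 16-byte segment structure on a 32-bit machine is sketched below. The field names, the use of 32-bit integers to stand in for 32-bit addresses, and the packing of area and status into one tag word are all assumptions; the text gives only the list of fields and the total size.

```c
#include <assert.h>
#include <stdint.h>

enum { SEG_HEAP = 0, SEG_STACK = 1, SEG_STATIC = 2 };  /* part of memory */
enum { SEG_IN_USE = 0, SEG_FREED = 1 };                /* status */

/* Hypothetical layout: four 32-bit words, 16 bytes in total. */
typedef struct {
    uint32_t base;     /* base address of the segment */
    uint32_t size;     /* size of the segment in bytes */
    uint32_t context;  /* 32-bit address of the shared execution context */
    uint32_t tag;      /* bits 0-1: memory area, bit 2: status (assumed packing) */
} Segment;

static_assert(sizeof(Segment) == 16, "segment structure should be 16 bytes");

static uint32_t make_tag(uint32_t area, uint32_t status)
{
    return (area & 3u) | ((status & 1u) << 2);
}

static uint32_t tag_area(uint32_t tag)   { return tag & 3u; }
static uint32_t tag_status(uint32_t tag) { return (tag >> 2) & 1u; }
```

Freeing a heap block would then amount to rewriting the tag with make_tag(tag_area(tag), SEG_FREED) and pointing the context field at the stack-trace of the free.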
freed-queue this should happen extremely rarely. Also, recycled segments
could be marked, and error messages arising from them could include a
disclaimer that there is a small chance that the range given is wrong.
Alternatively, since the p(X) representation is a pointer to a segment
structure, which is aligned, there are two bits available in the pointer
which could be used to store a small generation number. If the generation
number in the pointer doesn't match the generation number in the segment
structure itself, a warning can be issued.
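A minimal sketch of this idea follows, assuming segment structures are at least 4-byte aligned so that the two low bits of their address are always zero. The type and function names are hypothetical, and the generation counter field is an assumed addition to the structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t base;
    uint32_t size;
    uint32_t generation;  /* incremented whenever the structure is recycled */
} Segment;

static_assert(_Alignof(Segment) >= 4, "two low pointer bits must be free");

#define GEN_MASK ((uintptr_t)3)

/* Encode p(X): the segment's address with its generation in the low two bits. */
static uintptr_t make_shadow(const Segment *seg)
{
    return (uintptr_t)seg | ((uintptr_t)seg->generation & GEN_MASK);
}

/* Recover the segment structure from an encoded shadow value. */
static Segment *shadow_segment(uintptr_t shadow)
{
    return (Segment *)(shadow & ~GEN_MASK);
}

/* False if the segment has been recycled since the shadow value was made. */
static bool shadow_is_current(uintptr_t shadow)
{
    const Segment *seg = shadow_segment(shadow);
    return (shadow & GEN_MASK) == ((uintptr_t)seg->generation & GEN_MASK);
}
```

Because only two bits are kept, a structure recycled a multiple of four times would go unnoticed, so the check reduces rather than eliminates the chance of a wrong range being reported.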
One other characteristic of recycling is worth mentioning. If the program
being checked leaks heap blocks, the corresponding segments will also be
leaked and lost by our tool. This would not happen with garbage collection.
So the trade-off between the two approaches is basically that garbage
collection could introduce pauses, whereas recycling has a small chance of
causing incorrect error messages. Currently our prototype uses recycling,
which was easier to implement.
3.12 Performance
This section presents some basic performance figures for our prototype. All
experiments were performed on a 1400 MHz AMD Athlon with 1GB of RAM, running
Red Hat Linux 9.0, kernel version 2.4.19. The test programs are a subset of
the SPEC2000 suite. All were tested with the "test" (smallest) inputs.
Table 1 shows the performance of our prototype. Column 1 gives the benchmark
name, column 2 gives its normal running time in seconds, and column 3 gives
the slow-down factor. Programs above the line are integer programs, those
below are floating point programs.
Program
bzip2
crafty
gap
gcc
gzip
mcf
parser
twolf
vortex
ammp
art
equake
mesa
median
3.10 Leniency
Some common programming practices cause bounds to be exceeded. Most notably,
glibc has heavily optimised versions of functions like memcpy(), which read
arrays one word at a time. On 32-bit x86 machines, these functions can read
up to three bytes past the end of an array. In practice, this does not cause
problems. Therefore, by default we allow aligned, word-sized reads to exceed
bounds by up to three bytes, although there is a command-line option to turn
on stricter checking that flags these as errors.
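The leniency rule can be sketched as follows, assuming 4-byte words and 32-bit addresses. The function name and the strict parameter (standing in for the command-line option, whose actual name is not given here) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_SIZE 4u   /* word size on 32-bit x86 */

/* Decide whether a read of 'len' bytes at 'addr' is accepted for a segment
 * [base, base + size). With strict == false, an aligned word-sized read that
 * starts inside the segment may run past its end by up to three bytes. */
static bool read_allowed(uint32_t base, uint32_t size,
                         uint32_t addr, uint32_t len, bool strict)
{
    if (addr < base)
        return false;                       /* underruns are never tolerated */
    uint32_t offset = addr - base;
    if (offset <= size && len <= size - offset)
        return true;                        /* entirely within bounds */
    if (strict)
        return false;
    /* Starting inside the segment bounds the overrun to WORD_SIZE - 1 bytes. */
    return len == WORD_SIZE && addr % WORD_SIZE == 0 && offset < size;
}
```

For a 10-byte array at 0x1000, a word read at 0x1008 covers bytes 8 to 11 and is accepted by default but rejected in strict mode; a word read at 0x100c lies wholly outside the array and is rejected in both modes.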
3.11 Examples
This section shows some example errors given by our prototype. Figure 3 shows
a short C program, bad.c. This contrived program shows three common errors:
two array overruns, and an access to a freed heap block.
Figure 4 shows the output produced by our prototype. The error messages are
sent to standard error by default, but can be redirected to any other file
descriptor, file, or socket.
Each line is prefixed with the running program's process ID. Each error
report consists of a description of the error, the location of the error, a
description of the segment(s) involved, and the location where the segment
was allocated or freed (whichever happened most recently). The functions
malloc() and free() are identified as being in the file vg_replace_malloc.c
because that is the file that contains our tool's implementations of these
functions, which override the standard ones.
The program was compiled with -g to include debugging
4. SHORTCOMINGS
Like all error-checking techniques, ours is far from perfect. How well it
does depends on the circumstances. Happily, it exhibits "graceful
degradation"; as the situation becomes less favourable, more and more p(X)
metavalues will be lost and seen instead as u. Thus it will detect fewer
errors, but will not give more false positives.
4.4 No Symbols
4.2 Implementation
Our implementation suffers from a few more shortcomings, mostly because the
optimal case is too complex to implement.
5. RELATED WORK
This section describes several tools that find bounds errors for C and C++
programs, and compares them to our technique. No single technique is best;
each has its strengths and weaknesses, and they complement each other.
5.1 Redzones
The most common kinds of bounds-checking tools dynamically check accesses to
objects on the heap. This approach is common because heap bounds-checking is
easy to do.
The simplest approach is to replace the standard versions of malloc(), new,
and new[] to produce heap blocks with a few bytes of padding at their ends
(redzones). These redzones are filled with a distinctive value, and should
never be accessed by a program. When the heap block is freed with free(),
delete or delete[], the redzones are checked, and if they have been written
to, a warning is issued. The documentation for mpatrol [13] lists many tools
that use this technique.
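The basic redzone scheme can be sketched in a few lines of C. This is an illustrative wrapper around malloc() and free(), not the code of any of the tools mentioned; the names rz_malloc() and rz_free(), the 16-byte header and redzone sizes, and the fill value are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_BYTES  16u    /* holds the user size; keeps the payload aligned */
#define REDZONE_BYTES 16u
#define REDZONE_FILL  0xA5   /* distinctive fill value */

/* Block layout: [header][front redzone][user bytes][back redzone] */
static void *rz_malloc(size_t n)
{
    if (n > SIZE_MAX - HEADER_BYTES - 2u * REDZONE_BYTES)
        return NULL;
    unsigned char *raw = malloc(HEADER_BYTES + 2u * REDZONE_BYTES + n);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &n, sizeof n);                      /* remember the user size */
    memset(raw + HEADER_BYTES, REDZONE_FILL, REDZONE_BYTES);
    memset(raw + HEADER_BYTES + REDZONE_BYTES + n, REDZONE_FILL, REDZONE_BYTES);
    return raw + HEADER_BYTES + REDZONE_BYTES;
}

/* Frees the block; returns false if either redzone was written to. */
static bool rz_free(void *p)
{
    unsigned char *user = p;
    unsigned char *raw = user - REDZONE_BYTES - HEADER_BYTES;
    size_t n;
    memcpy(&n, raw, sizeof n);
    bool intact = true;
    for (size_t i = 0; i < REDZONE_BYTES; i++) {
        if (raw[HEADER_BYTES + i] != REDZONE_FILL || user[n + i] != REDZONE_FILL)
            intact = false;
    }
    free(raw);
    return intact;
}
```

A one-byte overrun such as writing zero to element 10 of a 10-byte block changes the first byte of the back redzone, so rz_free() returns false. A write that happens to store the fill value, any read, and any write that lands beyond the redzone would all go unnoticed.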
This technique is very simple, but it has many shortcomings.
1. It only detects small overruns/underruns, within the redzones; larger
overruns or completely wild accesses could access the middle of another heap
block, or non-heap memory.
2. It only detects writes that exceed bounds, not reads.
3. It only reports errors when a heap block is freed, which causes two
problems: first, not all heap blocks will necessarily be freed, and second,
this gives no information about where the overrun/underrun occurred.
Alternatively, calls to a heap-checking function can be inserted, but that
requires source code modification, will cause pauses while the entire heap is
checked, and still does not give precise information about when an error
occurs.
4. Accesses to freed heap blocks via dangling pointers are not detected,
unless they happen to hit another block's redzone (even then, identifying the
problem will be difficult).
5. It does not work with heap blocks allocated with custom allocators
(although the technique can be built into custom allocators).
6. It only works with heap blocks; stack and static blocks are pre-allocated
by the compiler, and so redzones cannot (without great difficulty) be used
for them.
This technique has too many problems to be considered further. All these
problems are avoided by techniques that track pointer bounds, such as ours.
Some of these shortcomings are overcome by Electric Fence [11], another
malloc() replacement that uses entire virtual pages as redzones. These pages
are marked as inaccessible, so that any overruns/underruns cause the program
to abort immediately, whereupon the offending instruction can be found using
a debugger. This avoids problems 2 and 3 above, and mitigates problem 1
(because the redzones are so big). However, it increases virtual memory usage
massively, making it impractical for use with large programs.
A better approach is used by the Valgrind tool Memcheck [9], and Purify [5].
They too replace malloc() et al with versions that produce redzones, but they
also maintain addressability metadata about each byte of memory, and check
this metadata before all loads and stores. Because the redzones are marked as
inaccessible, all heap overruns/underruns within the redzones are spotted
immediately, avoiding problems 2 and 3 above. If the freeing of heap blocks
is delayed, this can mitigate problem 4. These tools also provide hooks that
a custom allocator can use to tell them when new memory is allocated,
alleviating problem 5.
Purify is also capable of inserting redzones around static variables in a
pre-link step, in certain circumstances, as explained in the Purify manual:
Purify inserts guard zones into the data section only if all data references
are to known data variables. If Purify finds a data reference that is
relative to the start of the data section as opposed to a known data
variable, Purify is unable to determine which variable the reference
involves. In this case, Purify inserts guard zones at the beginning and end
of the data section only, not between data variables.
Similarly, the Solaris implementation of Purify can also insert redzones at
the base of each new stack frame, and so detect overruns into the parent
frame. We do not know the details of how Purify does this, but we suspect
that the way the SPARC stack is handled makes it much easier to do than on
x86.
Redzones and addressability tracking work very well, which accounts for the
widespread use of Memcheck and Purify. However, the remaining shortcomings, 1
and particularly 6 (even with Purify's partial solution), are important
enough that tools tracking pointer bounds are worth having.
2. All code must be recompiled to use fat pointers, including libraries. In
practice, this can be an enormous hassle. Alternatively, parts of the program
can be left unrecompiled, so long as interface code is produced that converts
fat pointers to normal pointers and vice versa when moving between the two
kinds of code. Producing this code requires a lot of work, as there are many
libraries used by normal programs. If this work is done, two kinds of errors
can still be missed. First, pointers produced by the library code may lack
the bounds metadata and thus not be checked when they are used in the "fat"
code. Second, library code will not check the bounds data of fat pointers
when performing accesses.
3. Changing the size of a fundamental data type will break any code that
relies on the size of pointers, for example, code that casts pointers to
integers or vice versa, or C code that does not have accurate function
prototypes.
4. Support may be required in not only the compiler, but also the linker
(some pointer bounds cannot be known by the compiler), and possibly debuggers
(if the fat pointers are to be treated transparently).
Jones and Kelly describe a better implementation in [6]. Each pointer's
metadata is stored separately from the pointer itself, so it preserves
backward compatibility with existing programs, avoiding problem 3, and
reducing problem 4. Patil and Fischer [10] also store metadata separately, in
order to perform the checking operations on a second processor. We believe
that if library code was handled better, this technique would be much more
widely used.
Both these approaches require compiler support, and the checking is only done
within modules that have been recompiled by the compiler, and on pointers
that were created within these recompiled modules.
Our technique has the same basic idea of tracking a pointer's bounds, but the
implementation is entirely different, as it works on already-compiled code,
and naturally covers the entire program, including libraries. It also works
with programs written in any language; this is useful for systems written in
a mix of C or C++ and another language. Our technique is less accurate,
however, and the overhead is much greater.
7. ACKNOWLEDGMENTS
Many thanks to: Julian Seward, for creating Valgrind, and for many useful
discussions, and to Alan Mycroft and the anonymous reviewers for their
comments about this paper. The first author gratefully acknowledges the
financial support of Trinity College, Cambridge.
8. REFERENCES
[1] T. M. Austin, S. E. Breach, and G. S. Sohi. Efficient detection of all
pointer and array access errors. In Proceedings of the ACM SIGPLAN Conference
on Programming Language Design and Implementation (PLDI '94), pages 290-301,
Orlando, Florida, USA, June 1994.
[2] M. Burrows, S. N. Freund, and J. L. Wiener. Run-time type checking for
binary programs. In Proceedings of CC 2003, pages 90-105, Warsaw, Poland,
Apr. 2003.