
CS352: Systems Programming & UNIX

Lecture 26:
Advanced Pointers
Marshall McMullen
Computer Science Department
University of Arizona

CSc 352 Systems Programming and Unix Lecture 26 – Advanced Pointers – 1


Outline
● Dynamic Storage Allocation
● Dynamically Allocated Strings
● Dynamically Allocated Arrays
● realloc Function
● Deallocating Storage
● Structure Pointers
● Pointers to Pointers
● Pointers to Functions



Dynamic Storage Allocation
● C's data structures are normally fixed in size and cannot be
modified after declaration. An array, for example, has a fixed
number of elements and each element has a fixed size
● C also supports dynamic storage allocation: the ability to allocate
storage (memory) that can grow and shrink during program
execution
● Although dynamic memory allocation is available for all types in
C, it is used primarily for strings, arrays, and structures.
Dynamically allocated structures are the most important since we
can use them to create abstract data types such as lists, trees, and
other ADTs.
– We've already used some of these techniques this semester, but now it is
time to formalize them and take them to a much deeper level



Dynamic Memory Allocation (cont)
● To allocate storage dynamically, we'll need to call one of the three
memory allocation functions declared in <stdlib.h>:
#include <stdlib.h>
void *calloc(size_t nmemb, size_t size);
void *malloc(size_t size);
void *realloc(void *ptr, size_t size);

– malloc Allocates a block of size bytes and returns a pointer to it.
Memory is not initialized.
– calloc Allocates a block of nmemb * size bytes (room for nmemb
elements of size bytes each), returns a pointer to it, and
initializes it to 0
– realloc Resizes a previously allocated block of memory
● malloc is by far the most frequently used for several reasons:
– More efficient than calloc since memory is not initialized
– realloc is very often misunderstood and hence used incorrectly
Dynamic Memory Allocation (cont)
● void * type:
– These functions have no way of knowing what type of data we're storing at
the new memory so they cannot return an ordinary int or char pointer.
– Instead, these functions return a value of type void *. A void * value
is a “generic” pointer – essentially just a memory address
– Since a void * variable is “generic” any pointer type can be assigned to
a void * variable, and vice versa.
– No need to cast pointers returned from malloc functions. In Classic C,
this was not the case. A void * was not generic, so you were forced to
cast the return value from these functions.
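– Standard C does not define sizeof(void) or arithmetic on a void *.
GCC, as an extension, treats sizeof(void) as 1, so arithmetic on a
void * is scaled by 1 byte, not by sizeof(int). For portable code,
convert to a char * before doing byte arithmetic. This is a common
point of confusion.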
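The points above about void * can be sketched in a few lines. This is only an illustration: make_ints and byte_offset are hypothetical helper names, not part of any library. It shows a void * being assigned to an int * without a cast, and byte-wise stepping done portably through an unsigned char *.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a block of `count` ints set to 0,
   or NULL if the allocation fails. The caller must free it. */
int *make_ints(size_t count)
{
    void *raw = malloc(count * sizeof(int)); /* generic pointer */
    if (raw == NULL)
        return NULL;

    int *nums = raw;   /* void * converts to int * with no cast in C */
    memset(nums, 0, count * sizeof(int));
    return nums;
}

/* Hypothetical helper: steps `nbytes` bytes past `base`.
   Converting to unsigned char * first keeps the arithmetic
   within standard C (no reliance on the GCC extension). */
void *byte_offset(void *base, size_t nbytes)
{
    unsigned char *bytes = base;
    return bytes + nbytes;
}
```

For an int array a, byte_offset(a, sizeof(int)) yields the same address as &a[1].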



Dynamic Memory Allocation (cont)
● If any of the memory allocation functions fail to allocate the
requested amount of memory, they will return a null pointer.
– A null pointer is simply a “pointer to nothing”: a special value that can be
distinguished from all valid pointers
– The macro NULL is defined in six header files: <locale.h>
<stddef.h> <stdio.h> <stdlib.h> <string.h> <time.h>
and typically expands to (void *)0
– We should always test if memory allocation was successful as follows:
p = malloc(1000);
if (p == NULL){...} // same as if(!p){...}



Dynamically Allocated Strings
● Dynamic storage allocation is extremely useful for manipulating
strings since we don't always know ahead of time how much space
we need for a given string
● Using malloc to allocate memory for a string is easy because the C
standard guarantees that a char value requires exactly one byte of
storage (sizeof(char) is always 1). To allocate space for n
characters we'd write:
char *p = malloc(n+1);
– Note: The generic pointer that malloc returns is converted to a char *
automatically; no cast is necessary.
– Don't forget to include room for the NUL character, hence n+1 bytes have
to be allocated for a string of n characters.
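As a small sketch of the n+1 rule, the function below joins two strings into freshly allocated storage. The name concat is chosen here for illustration; it is not a standard library function.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding
   s1 followed by s2, or NULL if malloc fails.
   The caller is responsible for freeing the result. */
char *concat(const char *s1, const char *s2)
{
    size_t n = strlen(s1) + strlen(s2);  /* characters, excluding NUL */
    char *result = malloc(n + 1);        /* +1 for the terminating NUL */

    if (result == NULL)
        return NULL;

    strcpy(result, s1);
    strcat(result, s2);
    return result;
}
```

A call such as concat("abc", "def") allocates 7 bytes: six characters plus the NUL.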



Dynamically Allocated Strings (cont)
● Recall that memory allocated via malloc is not cleared or
initialized so it will contain garbage.
● Exercise: Let's consider the task of writing a readline function
that reads a line of input of arbitrary length from the user and
returns a pointer to the allocated memory for that string. Approach:
– Allocate an initial buffer and read characters into that memory until it is
filled. Once filled, allocate a larger buffer (twice as large), copy over
the old string, then continue reading
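One possible solution to the exercise is sketched below. It follows the approach literally (allocate a bigger buffer, copy, release the old one). The name read_line, the FILE * parameter, and the starting size INITIAL_CAPACITY are assumptions made for this sketch, not part of the original exercise statement.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_CAPACITY 16   /* assumed starting size */

/* Reads one line from `in`, without the trailing newline.
   Returns a malloc'd string that the caller must free, or NULL
   on allocation failure or if EOF arrives before any character. */
char *read_line(FILE *in)
{
    size_t capacity = INITIAL_CAPACITY;
    size_t length = 0;
    char *buf = malloc(capacity);
    int ch;

    if (buf == NULL)
        return NULL;

    while ((ch = getc(in)) != EOF && ch != '\n') {
        if (length + 1 == capacity) {       /* keep one byte for NUL */
            size_t new_capacity = capacity * 2;
            char *bigger = malloc(new_capacity);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            memcpy(bigger, buf, length);    /* copy over the old string */
            free(buf);
            buf = bigger;
            capacity = new_capacity;
        }
        buf[length++] = (char)ch;
    }

    if (ch == EOF && length == 0) {         /* nothing left to read */
        free(buf);
        return NULL;
    }

    buf[length] = '\0';
    return buf;
}
```

Doubling the buffer keeps the number of copies small: a line of n characters is copied only about log2(n) times.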



Arrays of Allocated Strings
● Recall the problem of ragged arrays:
Planets
0  M e r c u r y \0
1  V e n u s \0
2  E a r t h \0
3  M a r s \0
4  J u p i t e r \0
5  S a t u r n \0
6  U r a n u s \0
7  N e p t u n e \0
8  P l u t o \0
● Problem: How would we allocate the memory for “Planets” and for
each string inside it?
Arrays of Allocated Strings (cont)
● Previously we did this as follows:
char *planets[] = {"Mercury", "Venus", "Earth", "Mars",
"Jupiter", "Saturn", "Uranus", "Neptune", "Pluto"};
● This is an array of pointers. These pointers point to string literals,
which must not be modified! The following code has undefined
behavior and typically segfaults:
*planets[0] = 'm';
● How can we instead allocate room for each string such that it is not
read-only? First create our array large enough to hold the required
9 pointers:
char *planets2[9];
● Now we can allocate room for each element of the array and put
the appropriate string into that element using a simple for loop:



Arrays of Allocated Strings (cont)
int i;
for (i = 0; i < 9; i++) {   /* 9 strings: indices 0 through 8 */
    planets2[i] = malloc(strlen(planets[i]) + 1);
    if (!planets2[i]) {
        fprintf(stderr, "Memory allocation failed!\n");
        exit(1);
    }
    strcpy(planets2[i], planets[i]);
}
● Note: We had to allocate strlen + 1 to leave room for the NUL
character
● Note: Notice how I check to make sure the memory allocation was
successful, and if not, print an error message and exit
● Finally I used strcpy to copy the string literal characters into the
newly allocated string memory



Dynamically Allocated Arrays
● When we're writing a program it's often difficult to predict ahead of
time how many elements we'll need in an array – it would be far
more convenient if we could let the program determine this
dynamically at runtime
● This can be achieved by allocating space for the array, then using
pointer arithmetic from the start of the allocated space to access
the various elements of the array
● We can use malloc to allocate room for the array much the same
way we used it to allocate space for a string. The primary
difference is that the elements of an array may not necessarily be
only one byte long (consider an array of ints or floats!).
– As a result, we must use the sizeof operator to calculate the amount of
space required for each element



Dynamically Allocated Arrays (cont)
● Consider a simple example. Suppose we want an array of n ints.
This could be dynamically allocated at runtime as follows:
int *nums = malloc(n * sizeof(int));
● Always use the sizeof operator to calculate the correct amount of
space to allocate. Never make any assumptions about the size of
the type on your platform. Failing to allocate enough memory will
have extremely severe adverse consequences!
● Important Note: When dynamically allocating memory, you
always want to allocate the required number multiplied by the size
of the type you are pointing to. In the above example, we're
creating an int *, so when we allocate space, we ask for
sizeof(int), not sizeof(int *).
– This is perhaps the most common error new C programmers make with
regards to memory allocation.
Dynamically Allocated Arrays (cont)
● Once we've allocated the memory for our array, we can ignore the
fact that it is a pointer and use array subscript notation to traverse
its elements:
for (i = 0; i < n; i++)
    nums[i] = i;
● An alternative to malloc is calloc, which initializes the
memory to zero for us. When allocating arrays, this is a common
requirement, and it avoids writing a loop to initialize the memory!
int *nums = calloc(n, sizeof(int));
– The first argument is how many elements there are, and the second
argument is the size of one element.
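The two allocation styles can be compared side by side in the sketch below. The helper names make_sequence and make_counters are made up for illustration. Writing sizeof *nums instead of sizeof(int) is one common way to avoid the "wrong type in sizeof" mistake, since the size always follows the pointer's declared type.

```c
#include <stdlib.h>

/* Hypothetical helper: returns an array holding 0, 1, ..., n-1,
   or NULL if malloc fails. The caller must free it. */
int *make_sequence(size_t n)
{
    int *nums = malloc(n * sizeof *nums);  /* sizeof *nums == sizeof(int) */
    if (nums == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        nums[i] = (int)i;                  /* subscript notation on a pointer */
    return nums;
}

/* Hypothetical helper: returns n counters, all starting at 0,
   or NULL if calloc fails. No initialization loop is needed. */
int *make_counters(size_t n)
{
    return calloc(n, sizeof(int));
}
```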



realloc Function
● Suppose we have a dynamically allocated array that we later decide
is too small. We could allocate a larger array and copy over all the
elements. This would work, but it is error-prone and not as
efficient as it could be. A better approach is
to use void *realloc(void *ptr, size_t size);
● realloc works as follows:
– When realloc is called, ptr must point to a memory block obtained from
a previous call to malloc, calloc, or realloc.
– size represents the new size of the block, which may be larger or smaller
than the original size.
– When it expands a memory block, realloc doesn't initialize the bytes
added to a block
– If it cannot enlarge the memory block as requested, it returns a null pointer
and the data in the old memory location is unchanged
realloc Function (cont)
– If realloc is called with a null pointer as its first argument, it behaves
just like malloc
– If realloc is called with 0 as its second argument, C89 says it frees the
memory block. Later C standards no longer guarantee this, so portable
code should call free instead
● Although the C standard does not specify how realloc must be
implemented, we can still expect it to be reasonably efficient:
– When asked to reduce the size of memory, realloc should shrink the
block “in place” without moving the data
– realloc should attempt to expand memory without moving it. If there is
free memory available after the block to be expanded, it can just expand
into that free memory
– Thus, realloc avoids unnecessary copying of memory when possible
● The most serious mistake people make with realloc is forgetting
that it is free to move your memory to another location
realloc Function (cont)
● If your memory is moved, any pointers you had that previously
pointed to the memory you just moved are now invalid!!
● For this reason, once realloc returns, you must update all
pointers to the memory block since it's possible that it has moved
the block elsewhere
● Consider the following example:
char * s = malloc( size * sizeof( char ));
char * p = s;
size += BLOCK_SIZE;
s = realloc( s, size * sizeof( char ));
*p++ = ... /* WRONG */
● Since realloc is free to move the block that s points to, p may
no longer point into that block when realloc returns. A better
technique here is to not use p at all, but to index s with an
integer offset.
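A sketch of the offset technique follows. The buffer struct and buffer_append are hypothetical names, and the value of BLOCK_SIZE is an assumption. The position in the block is kept as an integer (used) rather than as a second pointer, and the result of realloc goes into a temporary first so the old block is not lost if the call fails.

```c
#include <stdlib.h>

#define BLOCK_SIZE 8   /* assumed growth step */

/* Hypothetical growable byte buffer. Start it as {NULL, 0, 0}. */
typedef struct {
    char *data;    /* start of the allocated block */
    size_t size;   /* bytes currently allocated */
    size_t used;   /* bytes in use: an offset, not a pointer */
} buffer;

/* Appends one byte, growing the block by BLOCK_SIZE when it is full.
   Returns 0 on success, -1 if realloc fails (buffer left intact). */
int buffer_append(buffer *b, char c)
{
    if (b->used == b->size) {
        size_t new_size = b->size + BLOCK_SIZE;
        /* realloc(NULL, n) behaves like malloc(n) */
        char *tmp = realloc(b->data, new_size);
        if (tmp == NULL)
            return -1;         /* old block is still valid and owned by b */
        b->data = tmp;         /* the block may have moved */
        b->size = new_size;
    }
    b->data[b->used++] = c;    /* the offset stays correct after a move */
    return 0;
}
```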
Deallocating Storage
● Dynamic memory is allocated in the heap. Calling these functions
often, or asking them for very large amounts of memory can
exhaust the heap, causing the functions to return a null pointer.
● To make matters worse, a program may allocate blocks of memory
and then lose track of them, thereby wasting space:
p = malloc(...);
q = malloc(...);
p = q;
● After this last assignment statement, no pointer points to the
memory region originally allocated for p, so we'll never be able
to use it again – it is effectively lost.
● A block of memory that is no longer accessible to a program is said
to be garbage. A program that leaves behind garbage has a
memory leak.
Deallocating Storage (cont)
● Some languages (such as Java) provide a garbage collector that
automatically locates and recycles garbage, but standard C does
not provide one
● Instead, each C program is responsible for recycling its own
garbage by calling the free function to release unneeded memory
● The free function has the following prototype in <stdlib.h>
void free (void *ptr);
● Using free is easy. We just pass it the pointer we want to release:
p = malloc(...);
q = malloc(...);
free(p);
p = q;



Deallocating Storage (cont)
● Calling free releases the block of memory that p points to. This
block is returned to the pool of free memory that the allocator
maintains, making it available for subsequent memory allocation



Dangling Pointers
● Although free allows us to reclaim memory no longer needed,
using it leads to a new problem: dangling pointers.
● The call free(p) deallocates the memory block that p points to,
but does not modify the contents of p itself. If we forget that p no
longer points to a valid memory block, chaos ensues:
char *p = malloc(7);
strcpy(p, "apples");
printf("p = %s\n", p); // prints "apples"
free(p);
printf("p = %s\n", p); // WRONG: may print "apples" or
// garbage or nothing
strcpy(p, "abc"); /**** WRONG ****/



Dangling Pointers (cont)
● Attempting to read memory that has been freed will have
unpredictable results
● Attempting to modify memory that has been freed is a serious
error. It can cause erratic behavior or a segfault, and it can be
difficult to spot if you have many pointers pointing to the same
block of memory
● Strictly speaking, any use of a freed pointer's value, even without
indirection, can theoretically lead to trouble!
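One common defensive habit, sketched here under the hypothetical name release, is to set a pointer to NULL right after freeing it. The helper takes a pointer to the pointer so that it can change the caller's variable. It does not fix other copies of the same address, which still dangle.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the block *pp points to and then
   resets *pp to NULL so the caller cannot reuse the stale address.
   Calling it twice is harmless because free(NULL) does nothing. */
void release(char **pp)
{
    free(*pp);
    *pp = NULL;
}
```

After release(&p), a test such as if (p != NULL) reliably tells whether p still refers to live memory.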



Structure Pointers
● It is particularly useful to be able to create pointers to structures.
– This allows us to create complex Abstract Data Types such as linked lists,
stacks, queues, trees, etc.
– Recall that structs are passed to and returned from functions by value
(copied), which can be inefficient for large structs. Passing a pointer
to a struct is usually far more efficient
● Example: Linked List. First we declare a node for our linked list:
typedef struct node {
int value;
struct node *next;
} node;
– Next we need to know how to create new nodes:
node *new_node = malloc(sizeof(node));
– Be careful to give sizeof the name of the type to be allocated, not the
name of the pointer to that type, or else not enough memory may be allocated!
Structure Pointers (cont)
● Next, we need to know how to store data into the value member of
the new node:
(*new_node).value = 10;
– Are the parentheses in the above example required? In this case yes! The
precedence of “.” operator is higher than the “*” operator. Thus we have
to first indirect new_node to get the struct it points at, then get the value
member
● Accessing a member of a structure using a pointer is so common in
C that it provides a useful shortcut known as right arrow selection:
new_node->value = 10;
– The -> operator is a combination of the * and . operators: it performs
indirection on new_node to first locate the structure it points to, then
selects the value member of the structure
– -> produces an lvalue, so we can use it wherever an ordinary variable is
allowed
Structure Pointers (cont)
● We can insert something at the start of our linked list by
maintaining a “first” pointer initially set to NULL. To insert at the
beginning, just do the following:
first = NULL;
new_node = malloc(sizeof(node));
new_node->value = 10;
new_node->next = first;
first = new_node;
● To remove a node, we simply locate the node to be deleted, alter
the previous node so that it skips over the node we're about to
delete, then call free to reclaim space occupied by the deleted node
● One useful idiom is searching through a linked list:
for(p = first; p != NULL; p = p->next)
...
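The insert, search, and remove steps described above can be put together as follows. This is a sketch, not the course's reference solution: the function names list_push, list_find, list_remove, and list_free are invented here, and passing a node ** for the head pointer is one design choice among several.

```c
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node;

/* Inserts a new node at the front. Returns 0, or -1 if malloc fails. */
int list_push(node **first, int value)
{
    node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return -1;
    new_node->value = value;
    new_node->next = *first;
    *first = new_node;
    return 0;
}

/* Returns the first node holding `value`, or NULL if there is none. */
node *list_find(node *first, int value)
{
    node *p;
    for (p = first; p != NULL; p = p->next)
        if (p->value == value)
            return p;
    return NULL;
}

/* Removes the first node holding `value`. Returns 1 if a node was
   removed and freed, 0 if the value was not found. */
int list_remove(node **first, int value)
{
    node *prev = NULL, *cur = *first;

    while (cur != NULL && cur->value != value) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == NULL)
        return 0;
    if (prev == NULL)
        *first = cur->next;       /* deleting the head */
    else
        prev->next = cur->next;   /* skip over the deleted node */
    free(cur);
    return 1;
}

/* Frees every node in the list. */
void list_free(node *first)
{
    while (first != NULL) {
        node *next = first->next; /* save the link before freeing */
        free(first);
        first = next;
    }
}
```

The list starts as node *first = NULL; and each call passes &first so the function can update the head.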
Pointers to Functions
● So far, we've only used pointers to refer to data, however C does
not limit us in this fashion. We can also have pointers to code – in
particular we can have pointers to functions (function pointers)
● Function pointers can be passed as arguments to functions.
Suppose we want to write a sort function and make it as generic as
possible. We could do so by giving it the array, and a pointer to a
compare function.
● First, how do we declare a function which takes a function pointer?
void sort(int *a,int size,int (*comp)(int, int));
– This looks a little odd, so let's dissect it one piece at a time.
– The parentheses around *comp indicate that comp is a pointer to a
function, not a function that returns a pointer. A more user-friendly
looking version is also available (though not commonly used):
void sort(int *a,int size, int comp(int, int));
Pointers to Functions (cont)
– The parenthesized list (int, int) gives the parameter types of the function
that comp points to: it takes two ints
● In order to call the sort function, you will have to pass it a function
pointer for how to compare elements of the array, as in:
sort(array, 100, compare);
– Where “compare” is the name of a function which takes two integers and
compares them, returning an integer.
– Note that there are no parentheses after “compare”. When a function name
isn't followed by parentheses, C produces a pointer to the function instead
of generating code for a function call.
– Inside our “sort” function, we'll have access to the “compare” function
through our parameter “comp”. There is no need to indirect the function
pointer, though some do:
comp(a[i],a[j]) is equivalent to (*comp)(a[i],a[j]);
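To make this concrete, here is one way the sort function might be written. The slides do not say which algorithm sort uses or what the comparison's return value means, so this sketch assumes an insertion sort and a strcmp-style convention (negative, zero, or positive). The names ascending and descending are illustrative.

```c
/* Sorts a[0..size-1] in the order defined by comp.
   Assumed convention: comp(x, y) > 0 means x belongs after y. */
void sort(int *a, int size, int (*comp)(int, int))
{
    for (int i = 1; i < size; i++) {
        int key = a[i];
        int j = i - 1;

        /* shift elements that belong after key one slot to the right */
        while (j >= 0 && comp(a[j], key) > 0) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}

/* Two interchangeable comparison functions. */
int ascending(int x, int y)  { return (x > y) - (x < y); }
int descending(int x, int y) { return (y > x) - (y < x); }
```

Calling sort(array, 100, ascending) and sort(array, 100, descending) reuses the same sorting code with different orderings, which is the point of passing the function pointer.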



Pointers to Functions (cont)
● The C standard library provides a function qsort that sorts an
array (the name suggests quicksort, but the standard does not
require a particular algorithm):
#include <stdlib.h>
void qsort(void *base, size_t nmemb, size_t size,
int(*compar)(const void *, const void *));
– For more details, read the man page on qsort.
● We can also write functions which return function pointers
● We can also store function pointers into variables:
void (*pf)(int) = f;
– pf can point to any function which takes an int as an argument
and returns void. If f is such a function, then this is legal.
– We can call f through pf, as in: pf(i);
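A short sketch of both ideas follows: calling qsort with a comparison function, and storing a function pointer in a variable. The names compare_ints, sort_ints, record, last_seen, and call_through_pointer are all made up for this example.

```c
#include <stdlib.h>

/* Comparison function in the form qsort expects: it receives
   pointers to two elements and returns <0, 0, or >0. */
static int compare_ints(const void *p, const void *q)
{
    int x = *(const int *)p;
    int y = *(const int *)q;
    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Sorts n ints into ascending order using qsort. */
void sort_ints(int *a, size_t n)
{
    qsort(a, n, sizeof a[0], compare_ints);
}

/* A function matching void (*)(int): it records its argument. */
static int last_seen;
static void record(int i) { last_seen = i; }

/* Stores record in a function pointer variable, calls it through
   the variable, and returns the value that was recorded. */
int call_through_pointer(int i)
{
    void (*pf)(int) = record;   /* no parentheses after record */
    pf(i);                      /* same effect as record(i) */
    return last_seen;
}
```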



Reading Assignments
● C Programming: A Modern Approach. By K.N. King.
– Chapter 17



Acknowledgments
● C Programming: A Modern Approach. By K.N. King.
