
Introduction

Proper error detection and recovery is often ignored by UNIX developers. The lack of
exceptions in the C language and the rudimentary error mechanisms in the standard C library
certainly contribute to this. This article familiarizes you with UNIX error reporting in the standard
C library and (hopefully) encourages you to report and handle errors in a user-friendly way.
Error reporting in C programs
C is the most commonly used programming language on UNIX platforms. Despite the popularity
of other languages on UNIX (such as Java, C++, Python, or Perl), all of the application
programming interfaces (APIs) of the system have been created for C. The standard C library, part of every C
compiler suite, is the foundation upon which UNIX standards, such as Portable Operating System
Interface (POSIX) and the Single UNIX Specification, were created.
When C and UNIX were developed in the early 1970s, the concept of exceptions, which interrupt
the flow of an application when some condition occurs, was fairly new or non-existent. The
libraries had to use other conventions for reporting errors.
While you're poring over the C library, or almost any other UNIX library, you'll discover two
common ways of reporting failures:
1) The function returns an error or success code; if it's an error code, the code itself can be used
to figure out what went wrong.
2) The function returns a specific value (or range of values) to indicate an error, and the global
variable errno is set to indicate the cause of the problem.
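The two conventions can be sketched side by side. This is only an illustration: the wrapper names alloc_aligned() and probe_file() are hypothetical, posix_memalign() stands in for the first category (it returns the error code directly), and open() stands in for the second (it returns -1 and sets errno).

```c
#define _POSIX_C_SOURCE 200809L  // expose the POSIX declarations used below

#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

// Convention 1: the return value is itself the error code (0 means success).
// posix_memalign() reports failure this way instead of through errno.
int alloc_aligned( void **out, size_t alignment, size_t size )
{
    return posix_memalign( out, alignment, size );
}

// Convention 2: a special return value (-1 here) signals failure, and
// errno holds the cause. Returns 0 on success, or the errno value
// captured immediately after the failed call.
int probe_file( const char *path )
{
    int fd = open( path, O_RDONLY );
    if( fd < 0 ) {
        return errno;
    }
    close( fd );
    return 0;
}
```

With these wrappers, an alignment that isn't a power of two comes back as EINVAL from alloc_aligned(), while a missing file shows up as ENOENT from probe_file().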
The errno global variable (or, more accurately, symbol, since on systems with a thread-safe C
library, errno is actually a function or macro that ensures each thread has its own errno) is defined
in the <errno.h> system header, along with all of its possible values defined as standard constants.
Many of the functions in the first category actually return one of the standard errno codes, but it's
impossible to tell how a function behaves and what it returns without checking the Returns section
of the manual page. If you're lucky, the function's man page lists all of its possible return values
and what they mean in the context of this particular function. Third-party libraries often have a
single convention that's followed by all of the functions in the library but, again, you'll have to
check the library's documentation before making any assumptions.
Let's take a quick look at some code demonstrating errno and a couple of functions that you can
use to transform that error code into something more human-readable.
Reporting failure
In Listing 1, you'll find a short program that tries to open a file that is unlikely to exist and reports
the error to whomever is running the program, using two different techniques.
Listing 1. The errno variable records your failures
// errno for fun and profit

#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

const char *FILE_NAME = "/tmp/this_file_does_not_exist.yarly";

int main( int argc, char **argv )
{
    int fd = 0;

    printf( "Opening %s...\n", FILE_NAME );
    fd = open( FILE_NAME, O_RDONLY, 0644 );
    if( fd < 0 ) {
        // Error, as expected.
        perror( "Error opening file" );
        printf( "Error opening file: %s\n", strerror( errno ) );
    }

    return EXIT_SUCCESS;
}
When you run this program, you'll see something like Listing 2.
Listing 2. The output from Listing 1
chrish@dhcp2 [507]$ ./Debug/errnoDemo
Opening /tmp/this_file_does_not_exist.yarly...
Error opening file: No such file or directory
Error opening file: No such file or directory
As you can see from the output (Listing 2), the perror() function displays the string you pass to it,
followed by a colon, a space, and then the textual representation of the current errno value. You
can simulate this yourself by using a printf() call and the strerror() function, which returns a pointer
to the textual representation of the current errno value.
One detail you can't see from the output is that perror() writes its message to the standard error
channel (stderr); the printf() call in Listing 1 is writing to the standard output channel (stdout).
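If you want printf()-style control over the message but still want it to land on stderr as perror() does, a sketch like the following works. The helper names format_errno_message() and report_to_stderr() are made up for this example; only snprintf(), fprintf(), and strerror() are standard.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

// Build "<what>: <description of err>" in buf, the same layout perror() uses.
// Returns the snprintf() result (the length the full message needs).
int format_errno_message( char *buf, size_t size, const char *what, int err )
{
    return snprintf( buf, size, "%s: %s", what, strerror( err ) );
}

// Report the current errno on stderr, like perror().
void report_to_stderr( const char *what )
{
    int saved = errno;  // capture errno before any other call can change it
    char msg[256];

    format_errno_message( msg, sizeof( msg ), what, saved );
    fprintf( stderr, "%s\n", msg );
}
```

Copying errno into a local variable first is a deliberate choice: any library call made while building the message is allowed to modify errno.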
The strerror() function isn't necessarily thread-safe; for unknown values, it formats an error
message in a static buffer and returns a pointer to that buffer. Additional calls to strerror() will overwrite
the contents of that buffer.
The POSIX 1003.1 standard defines strerror_r(), which accepts a pointer to a buffer and a buffer
size in addition to the error value. Listing 3 shows you how to use this thread-safe version.
Listing 3. The thread-safe strerror_r() function in action
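// Thread-safe usage of strerror_r().
// Assumes the POSIX version of strerror_r(), which returns 0 on success.
#include <stdio.h>
#include <string.h>

void thread_safe( int err )
{
    char buff[256];

    if( strerror_r( err, buff, sizeof( buff ) ) == 0 ) {
        printf( "Error: %s\n", buff );
    }
}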
The perror() and strerror()/strerror_r() functions are probably the most commonly used error
reporting methods when dealing with standard errno values. Let's take a look at some additional
error-related global variables and the standard errno values defined by POSIX 1003.1.
Error global variables and standard values
So, the global errno variable is set by standard C library functions (and possibly others; read the
manual to find out if a function you intend to use sets errno) to indicate some kind of error, be it
some bad values passed in as arguments, or a failure while the function was performing its duties.
The perror() and strerror() functions pull their standard error descriptions from the global
variable sys_errlist.
The standard C library defines two additional error-related global variables, sys_nerr (an int) and
sys_errlist (an array of pointers to char). The first is the number of standard error messages stored
in sys_errlist. Historical applications (that is, horribly outdated legacy code) sometimes refer to
these directly, but produce errors during compilation because they're declared inconsistently.
The POSIX standard defines quite a few possible values for errno; not all of these are applicable
to every function, obviously, but they do provide developers with a large menu to choose from
when writing their own functions.
Here's an Eclipse tip: Eclipse can open the errno.h system header and highlight the declaration of errno.
In addition to noticing that my tab settings don't match those of whoever wrote this file, you'll see
several of the standard error values, their symbolic names, and a brief comment describing each.
Most system headers contain at least this much information for the standard errno values, so don't
be afraid to take a look. Your system headers and manual pages are also your only source of
information about the non-standard values that your system might support.
The standard errno values include:
E2BIG -- The argument list passed to the function was too long.
EACCES -- Access denied! The user running the program doesn't have permission to access a
file, directory, and so forth.
EAGAIN -- The required resource is temporarily unavailable; if you try the operation again later,
it might succeed.
EBADF -- A function tried to use a bad file descriptor (it doesn't refer to an open file, for example,
or it was used in an attempt to write to a file that was opened read-only).
EBUSY -- The requested resource is unavailable. For example, attempting to remove a directory
while another application is reading it. Note the ambiguity between EBUSY and EAGAIN;
obviously you'd be able to remove the directory later, when the reading program has finished.
ECHILD -- The wait() or waitpid() function tried to wait for a child process to exit, but all
children have already exited.
EDEADLK -- A resource deadlock would occur if the request continued. Note that this is not the
sort of deadlock you get in multithreaded code -- errno and its friends definitely can't help you
track those down.
EDOM -- The input argument is outside of the domain of a mathematical function.
EEXIST -- The file already exists, and that's a problem. For example, if you call mkdir() with a
path that names an existing file or directory.
EFAULT -- One of the function arguments refers to an invalid address. Most implementations
can't detect this (your program receives a SIGSEGV signal and exits instead).
EFBIG -- The request would cause a file to expand past the implementation-defined maximum file
size. This is generally around 2GB, but most modern file systems support much larger files,
sometimes requiring 64-bit versions of the read()/write() and lseek() functions.
EINTR -- The function was interrupted by a signal, which was caught by a signal handler in the
program, and the signal handler returned normally.
EINVAL -- You passed an invalid argument to the function.
EIO -- An I/O error occurred; this is usually generated in response to hardware problems.
EISDIR -- You called a function that requires a file argument with a directory argument.
EMFILE -- Too many files are already open in this process. Each process has OPEN_MAX file
descriptors, and you're trying to open (OPEN_MAX + 1) files. Remember that file descriptors
include things like sockets.
EMLINK -- The function call would cause a file to have more than LINK_MAX links.
ENAMETOOLONG -- You've created a path name longer than PATH_MAX, or you've created a
file or directory name longer than NAME_MAX.
ENFILE -- The system has too many simultaneously open files. This should be a temporary
condition, and it is unlikely to happen on a modern system.
ENODEV -- No such device or you're attempting to do something inappropriate for the specified
device (don't try reading from an ancient line printer, for example).
ENOENT -- No such file was found or the specified path name doesn't exist.
ENOEXEC -- You tried to run a file that isn't executable.
ENOLCK -- No locks are available; you've reached a system-wide limit on file or record locks.
ENOMEM -- The system is out of memory. Traditionally, applications (and the OS itself) don't
handle this gracefully, which is why you need to have more RAM than you expect to use,
especially on systems that can't dynamically increase the size of the on-disk swap space.
ENOSPC -- No space left on the device. You've tried to write to or create a file on a device that's
full. Again, it's traditional for applications and the OS to not handle this gracefully.
ENOSYS -- The system doesn't support that function. For example, if you call setpgid() on a
system without job control, you'll get an ENOSYS error.
ENOTDIR -- The specified path name needs to be a directory, but it isn't. This is the opposite of
the EISDIR error.
ENOTEMPTY -- The specified directory isn't empty, but it needs to be. Note that an empty
directory still contains the . and .. entries.
ENOTTY -- You've attempted an I/O control operation on a file or special file that doesn't support
that operation. Don't try setting the baud rate on a directory, for example.
ENXIO -- You've attempted an I/O request on a special file for a device that doesn't exist.
EPERM -- The operation isn't permitted; you don't have permission to access the specified
resource.
EPIPE -- You've attempted to write to a pipe that nothing is reading from any more. One of the
programs in the pipe chain has closed its part of the stream (by exiting, for example).
ERANGE -- You've called a function, and the return value is too large to be represented by the
return type. For example, if a function returns an unsigned char value but calculated a result of
256 or more (or -1 or less), errno would be set to ERANGE and the function would return some
irrelevant value. In cases like this, it's important to check your input data for sanity, or check errno
after every call.
EROFS -- You attempted to modify a file or directory stored on a read-only file system (or a file
system that was mounted in read-only mode).
ESPIPE -- You attempted to seek on a pipe or First In, First Out (FIFO).
ESRCH -- You've specified an invalid process ID or process group.
EXDEV -- You've attempted an operation that would move a link across devices. For example,
UNIX filesystems don't let you move a file between file systems (instead, you have to copy the
file, then delete the original).
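Some of the values above describe conditions that clear up on their own, and EINTR is the classic case: the call didn't really fail, it was just interrupted. A common response is to retry the call. Here's a minimal sketch; the wrapper name read_retrying() is an assumption for illustration, not a standard function.

```c
#include <errno.h>
#include <unistd.h>

// Call read() again whenever it fails only because a signal interrupted it.
// Any other failure (or success) is passed back to the caller unchanged.
ssize_t read_retrying( int fd, void *buf, size_t count )
{
    ssize_t n;

    do {
        n = read( fd, buf, count );
    } while( n < 0 && errno == EINTR );

    return n;
}
```

Errors other than EINTR still reach the caller with errno intact, so the usual error reporting applies to them.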
One annoying feature of the POSIX 1003.1 specification is the lack of a no error value. When
errno is set to 0, you've encountered no problems, except you can't refer to this with a standard
symbolic constant. I've programmed on platforms that had E_OK, EOK, and ENOERROR in their
errno.h, and I've seen loads of code that includes something like Listing 4. It would've been nice
to have this covered in the specification in order to avoid doing this sort of thing.
Listing 4. The no error error value
#if !defined( EOK )
#  define EOK 0         /* no error */
#endif
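Because 0 means no error, you can clear errno before calling a function whose return value alone can't tell you whether it failed. The standard strtol() function is a typical case, since it reports an out-of-range result only by setting errno to ERANGE. The sketch below assumes a hypothetical helper, parse_long(), that returns 0 or an errno-style code.

```c
#include <errno.h>
#include <stdlib.h>

// Convert text to a long. Returns 0 on success, ERANGE if the value
// doesn't fit in a long, or EINVAL if text isn't entirely a number.
int parse_long( const char *text, long *out )
{
    char *end = NULL;
    long value;

    errno = 0;  // 0 means "no error", so start from a clean slate
    value = strtol( text, &end, 10 );

    if( errno == ERANGE ) {
        return ERANGE;
    }
    if( end == text || *end != '\0' ) {
        return EINVAL;  // nothing converted, or trailing junk
    }

    *out = value;
    return 0;
}
```

Library functions never reset errno to 0 on success, which is why the caller has to clear it before the call.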

Using the sys_nerr global variable and the strerror() function, you can easily whip up some code
(see Listing 5) to print out all of the built-in error messages of the system. Remember, this also dumps
all of the additional implementation-defined (that is, non-standard) errno values supported by the
system you're using. Only the errors listed above are required to exist on a POSIX 1003.1-conforming
system; anything else is gravy.
Listing 5. Showing off all of your errors
// Print out all known errors on the system.
// Note: sys_nerr is a legacy global; some newer C libraries no longer declare it.
void print_errs( void )
{
    int idx = 0;

    for( idx = 0; idx < sys_nerr; idx++ ) {
        printf( "Error #%3d: %s\n", idx, strerror( idx ) );
    }
}
I won't bore you with a complete list of all the errno values supported by my system (Mac OS X
10.4.7 at the time of this writing), but here's a sample of the output from the print_errs() function
(see Listing 6).
Listing 6. There sure are a lot of possible standard error values
Error # 0: Unknown error: 0
Error # 1: Operation not permitted
Error # 2: No such file or directory
Error # 3: No such process
Error # 4: Interrupted system call
Error # 5: Input/output error
Error # 6: Device not configured
Error # 7: Argument list too long
Error # 8: Exec format error
Error # 9: Bad file descriptor
Error # 10: No child processes
Error # 93: Attribute not found
Error # 94: Bad message
Error # 95: EMULTIHOP (Reserved)
Error # 96: No message available on STREAM
Error # 97: ENOLINK (Reserved)
Error # 98: No STREAM resources
Error # 99: Not a STREAM
Error #100: Protocol error
Error #101: STREAM ioctl timeout
Error #102: Operation not supported on socket
That's quite a lot of errors! Luckily, most functions will only have a few possible errors to report,
so it's usually not that hard to handle them appropriately.
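As a rough illustration of handling just the few errors that matter for one call, the sketch below maps some likely open() failures to advice for the user and falls back to strerror() for everything else. The function name advice_for_open_error() and the wording of the messages are invented for this example.

```c
#include <errno.h>
#include <string.h>

// Turn an errno value from a failed open() into advice for the user.
// Values without a special case fall back to the standard description.
const char *advice_for_open_error( int err )
{
    switch( err ) {
    case ENOENT:
        return "Check the path; the file or one of its directories does not exist.";
    case EACCES:
        return "Check the file's permissions, or run as a user who can read it.";
    case EMFILE:
    case ENFILE:
        return "Too many files are open; close some and try again.";
    default:
        return strerror( err );
    }
}
```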

Dealing with errors


Adding error-handling code to your program can be annoying, tedious, and time-consuming. It can
clutter up the elegance of your code, and you can get bogged down adding handlers for every
conceivable error. Developers often hate doing it.
But you're not doing it for yourself; you're doing it for the people who are going to actually use
your program. If something can fail, they need to know why it failed and, more importantly, what
they can do to fix the problem.
That last part is the bit that developers often miss. Telling the user "File not found" isn't nearly
as helpful as telling them "Unable to find the SuperWidget configuration file," and then giving them
the option to select the missing file (give them a file selection widget or something), search for the
missing file (have the program look in likely places for the file), or create a new version of the file
filled with the default data.
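Here's a minimal sketch of the last option, creating a new file filled with default data. Everything specific in it is an assumption made for illustration: the function name open_config_with_recovery(), the default contents, and the decision to recover automatically instead of asking the user first.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

// Hypothetical default contents for a newly created configuration file.
static const char DEFAULT_CONFIG[] = "widget_count=1\n";

// Open the configuration file for reading. If it doesn't exist, say so
// in plain language and create it with default data instead of giving up.
// Returns NULL (with a message on stderr) if recovery isn't possible.
FILE *open_config_with_recovery( const char *path )
{
    FILE *fp = fopen( path, "r" );
    if( fp != NULL ) {
        return fp;
    }

    if( errno != ENOENT ) {
        fprintf( stderr, "Unable to read the SuperWidget configuration file %s: %s\n",
                 path, strerror( errno ) );
        return NULL;
    }

    fprintf( stderr, "Unable to find the SuperWidget configuration file %s; "
             "creating it with default settings.\n", path );

    fp = fopen( path, "w+" );  // create the file, readable and writable
    if( fp == NULL ) {
        fprintf( stderr, "Unable to create %s: %s\n", path, strerror( errno ) );
        return NULL;
    }
    if( fputs( DEFAULT_CONFIG, fp ) == EOF || fflush( fp ) == EOF ) {
        fprintf( stderr, "Unable to write default settings to %s\n", path );
        fclose( fp );
        return NULL;
    }

    rewind( fp );  // let the caller read the defaults from the start
    return fp;
}
```

A caller reads from the returned stream the same way whether the file already existed or was just created, so the recovery stays invisible to the rest of the program.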
Yes, I know this interrupts the flow of your code, but slick error-handling and recovery really
makes your application a hit with the users. And, because other developers are often lacking when
it comes to error-handling, it's easy to do better than everyone else.
Summary
On UNIX, the standard error reporting mechanisms are pretty minimalistic, but that's no reason
for your application to handle runtime errors by crashing or exiting without telling the user what's
going on.
The standard C library and POSIX 1003.1 define a number of possible standard error values, and a
couple of handy functions for reporting errors and translating the errors into something humans
can read. But these aren't really enough; developers should try harder to tell the user what's going
on and give them ways of fixing or working around the problem.
