Systems Programming
00. Introduction
Alexander Holupirek
Database and Information Systems Group
Department of Computer & Information Science
University of Konstanz
Visiting Card
Your Tutors
Jochen Oekonomopulos
jochen.oekonomopulos@uni-konstanz.de
Alexander Holupirek
alexander.holupirek@uni-konstanz.de
V 504
http://www.inf.uni-konstanz.de/~holupire
88 4440
E 217
Thomas Zink
thomas.zink@uni-konstanz.de
Enrolled in master studies Information Engineering
V 504
Tutorial Groups
1
7
Organizational Matters
I
I
Have fun!
Literature
10
Literature
11
12
Systems Programming
I
Operating System
13
14
standards4
OS as Black Box
15
16
.
I
I
The latest version of POSIX.1 has been jointly developed by the IEEE
and The Open Group. As such it is both an IEEE and an Open
Group Technical Standard:
- IEEE Std 1003.1, 2004 Edition
- The Open Group Technical Standard Base Specifications, Issue 6
OS as Black Box
OS kernel
http://www.opengroup.org/overview/members/membership list.htm
17
Part 4: Rationale
18
http://www.unix.org/version3/xsh contents.html
20
UNIX Architecture
applications
shell
system calls
Example
System Interface Table. Lists 1123 interfaces.
kernel
http://www.opengroup.org/onlinepubs/009695399/functions/atoi.html
http://www.opengroup.org/onlinepubs/009695399/functions/read.html
library routines
21
22
0
1
2
3
4
5
6
7
8
9
man(1) on Linux
man(1) on OpenBSD
23
24
26
Essentials
- Good knowledge of C.
- Knowledge about the services an OS provides:
  - system calls.
  - C libraries.
28
A Tutorial Introduction
Systems Programming
Arrays
Functions
Call by Value, Call by Reference
Character Arrays
Variables, Declarations and Scope
30
29
Quick introduction
Provide examples
Show the basics, such as
I
I
I
I
I
I
I
Compile it successfully
Run it
2
3
4
pointers
structures
standard library
6
7
8
31
#include <stdio.h>

int
main(void)
{
    printf("Hello, world\n");
    return (0);
}
32
Compilation On A UNIX-like OS
C Programs
$ cc -Wall hello.c
$ ls
hello.c a.out
$ ./a.out
Hello, world
$
engine        filename  description
              hello.c   source code
preprocessor  hello.i   source w/ preproc. directives expanded
compiler      hello.s   assembler code
assembler     hello.o   object code ready to be linked
linker        a.out     executable
functions
statements
variables
arguments
33
34
2
3
4
I
I
5
6
7
8
35
int
main(void)
{
    printf("Hello, world\n");
    return (0);
}
I
"Hello world\n"
I
$ cc hello.c
hello.c:6:16: missing terminating " character
hello.c:7:9: missing terminating " character
hello.c: In function main:
hello.c:8: error: syntax error before "return"
38
37
Escape Sequences
our first program could just as well have been written like
below to produce identical output
\\      backslash
\?      question mark
\'      single quote
\"      double quote
\ooo    octal number
\xhh    hexadecimal number
39
40
2
3
4
5
6
7
Arrays
Functions
    lower = 0;      /* lower limit */
    upper = 300;    /* upper limit */
    step = 20;      /* step size */

    fahr = lower;
    while (fahr <= upper) {
        celsius = 5 * (fahr - 32) / 9;
        printf("%d\t%d\n", fahr, celsius);
        fahr = fahr + step;
    }
    return (0);
14
15
Character Arrays
16
17
18
19
20
21
Output (Fahrenheit, Celsius):
0       -17
20      -6
40      4
60      15
80      26
100     37
120     48
140     60
160     71
180     82
200     93
220     104
240     115
260     126
280     137
300     148

type    description
char    a single byte, capable of holding one character in the local character set
int     an integer, typically reflecting the natural size of integers on the host machine
float   single-precision floating point
double  double-precision floating point
43
44
15
16
17
18
Numerical limits are documented in <limits.h> and <float.h>.
Additional limits are specified in <stdint.h>.
Assignment
10
11
Integer Division
46
printf(3) Revisited
#include <stdio.h>

int
printf(const char *format, ...);

Each % conversion in the first argument is paired with the corresponding
second, third, ... argument.

printf("%d\t%d\n", fahr, celsius);
17
47
12
1
2
Fixing problems
3
4
6
8
9
10
11
12
NAME
    diff - compare files line by line
SYNOPSIS
    diff [OPTION]... FILES
DESCRIPTION
    Compare files line by line.
    -u  -U NUM  --unified[=NUM]
        Output NUM (default 3) lines of unified context.
    -p  --show-c-function
        Show which C function each change is in.

@@ -13,8 +13,8 @@ main(void)
 
     fahr = lower;
     while (fahr <= upper) {
-        celsius = 5 * (fahr - 32) / 9;
-        printf("%d\t%d\n", fahr, celsius);
+        celsius = (5.0/9.0) * (fahr - 32.0);
+        printf("%3.0f\t%6.1f\n", fahr, celsius);
         fahr = fahr + step;
     }
     return (0);
50
49
Fahrenheit-Celsius Converter v2
1
2
3
fahrenheit_v1_v2.diff
4
5
6
7
8
    lower = 0;      /* lower limit */
    upper = 300;    /* upper limit */
    step = 20;      /* step size */

    fahr = lower;
    while (fahr <= upper) {
        celsius = (5.0/9.0) * (fahr - 32.0);
        printf("%3.0f\t%6.1f\n", fahr, celsius);
        fahr = fahr + step;
    }
    return (0);
14
15
16
17
18
fahrenheit_v1.c.orig
19
20
21
51
  0      -17.8
 20       -6.7
 40        4.4
 60       15.6
 80       26.7
100       37.8
120       48.9
140       60.0
160       71.1
180       82.2
200       93.3
220      104.4
240      115.6
...      ...
}
52
conversion  print as ...
%d          decimal integer
%6d         decimal integer, at least 6 characters wide
%f          floating point
%6f         floating point, at least 6 characters wide
%.2f        floating point, 2 characters after decimal point
%6.2f       floating point, at least 6 wide and 2 after decimal point
1
2
3
4
5
6
7
9
10
11
return (0);
12
13
  0      -17.8
 20       -6.7
 40        4.4
 60       15.6
 80       26.7
100       37.8
120       48.9
140       60.0
160       71.1
180       82.2
200       93.3
220      104.4
240      115.6
260      126.7
280      137.8
300      148.9
54
53
A Tutorial Introduction
Variables and Arithmetic Expressions
Character Input and Output
2
3
4
5
Arrays
#define LOWER 0     /* lower limit of table */
#define UPPER 300   /* upper limit */
#define STEP  20    /* step size */
Functions
6
7
8
9
10
11
12
13
14
15
return (0);
16
17
}
55
56
int
getchar(void);

int
putchar(int c);
57
File Copying
58
File Copying, v1
read a character
while (character is not end-of-file indicator)
output the character just read
read a character
    c = getchar();
    while (c != EOF) {
        putchar(c);
        c = getchar();
    }
9
10
11
12
13
14
return (0);
15
16
59
60
File Copying, v2
I
Character Counting, v1
3
4
2
3
4
5
6
7
    nc = 0;
    while (getchar() != EOF)
        ++nc;
    printf("%ld\n", nc);
9
10
11
12
9
10
13
return (0);
14
11
15
return (0);
12
13
61
Character Counting, v2
62
Line Counting
I
2
3
4
5
6
7
3
4
5
6
7
9
10
11
    nl = 0;
    while ((c = getchar()) != EOF)
        if (c == '\n')
            ++nl;
    printf("%d\n", nl);
9
10
12
11
return (0);
13
14
12
13
14
return (0);
15
16
63
}
64
Word Counting
2
3
4
#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */
5
6
NAME
8
9
SYNOPSIS
wc [-c | -m] [-hlw] [file ...]
10
11
    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n')
            ++nl;
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
    return (0);
12
DESCRIPTION
The wc utility reads one or more input text files, and,
by default, writes the number of lines, words, and bytes
contained in each input file to the standard output
13
14
15
16
17
$ wc /etc/services
     285    1398    9732 /etc/services
$ cc count_words.c
$ cat /etc/services | ./a.out
285 1398 9732
18
19
20
21
22
23
24
25
26
27
65
66
It will help us to . . .
Arrays
Functions
Call by Value, Call by Reference
introduce arrays
Character Arrays
67
68
2
3
4
5
6
7
8
A Tutorial Introduction
Variables and Arithmetic Expressions
Character Input and Output
    nwhite = nother = 0;
    for (i = 0; i < 10; ++i)
        ndigit[i] = 0;
10
11
12
Arrays
13
14
15
16
17
18
19
20
Functions
Call by Value, Call by Reference
Character Arrays
21
22
23
24
25
26
return (0);
27
28
70
69
power(m,n)
2
3
4
5
Functions, self-defined
6
7
8
9
10
11
12
13
14
return-type
function-name(parameter declarations, if any)
{
declarations
statements
}
15
16
17
18
19
20
21
    p = 1;
    for (i = 1; i <= n; ++i)
        p = p * base;
    return p;
22
23
24
25
26
13
Only handles positive powers of small integers, in real life take pow(3).
71
72
Function Terminology
A Tutorial Introduction
Variables and Arithmetic Expressions
line 3: function declaration (function prototype), says that power is a
function that expects two int arguments and returns an int
line 17: function definition starts with the declaration of the parameter
types and names, and the type of the result that the function
returns (has to match with the prototype)
I
Arrays
Functions
Call by Value, Call by Reference
Character Arrays
Variables, Declarations and Scope
74
73
Arguments: Call by Value/Reference
75
the function can access and alter any element of the array
76
Character Arrays
A Tutorial Introduction
Variables and Arithmetic Expressions
Program outline:
Functions
77
78
8
9
10
11
12
13
14
15
    max = 0;
    while ((len = getline(line, MAXLINE)) > 0)
        if (len > max) {
            max = len;
            copy(longest, line);
        }
    if (max > 0)    /* there was a line */
        printf("%s", longest);
    return (0);
16
17
18
19
1
2
20
21
22
3
4
5
    int len;                /* current line length */
    int max;                /* maximum length seen so far */
    char line[MAXLINE];     /* current input line */
    char longest[MAXLINE];  /* longest line saved here */
23
24
25
79
80
Getting A Line
27
28
29
30
31
copy()
43
44
45
46
32
47
33
34
35
36
37
38
39
40
41
48
    i = 0;
    while ((to[i] = from[i]) != '\0')
        ++i;
49
50
51
52
}
I
}
"hello\n" is stored as
h  e  l  l  o  \n  \0
81
82
Automatic Variables
A Tutorial Introduction
Variables and Arithmetic Expressions
Character Arrays
Arrays
Functions
External Variables
They retain their values even after the functions that set them
have returned
86
85
1
2
30
31
32
3
4
5
6
33
int max;                /* maximum length seen so far */
char line[MAXLINE];     /* current input line */
char longest[MAXLINE];  /* longest line saved here */
34
35
37
38
39
10
11
12
13
14
15
16
17
40
41
42
43
44
45
47
48
    max = 0;
    while ((len = getline()) > 0)
        if (len > max) {
            max = len;
            copy();
        }
    if (max > 0)    /* there was a line */
        printf("%s", longest);
    return (0);
19
20
21
22
23
24
25
26
27
46
18
28
36
7
8
int
getline(void)
{
    int c, i;
    extern char line[];
49
50
51
void
copy(void)
{
    int i;
    extern char line[], longest[];

    i = 0;
    while ((longest[i] = line[i]) != '\0')
        ++i;
53
54
55
56
87
}
88
void
f(unsigned int m, long n)
{
    static int i;
    ...
}

We will see later how to define external variables and functions that are
visible only within a single source file; once again, the keyword is static.
90
89
Register Variables
The register declaration
void
f(register unsigned int m, register long n)
{
    register int i;
    ...
}

static void
f(register unsigned int m, register long n)
{
    ...
}
91
92
Initialization
I
int x = 1;
char squote = '\'';
long day = 1000L * 60L * 60L * 24L;     /* milliseconds/day */

if (n > 0) {
    int i;
94
93
int x;
int y;

void
f(double x)
{
    double y;
    ...
}
95
96
Storage Classes
Construction of an Executable
Relocation Process
97
C source code:
    int a, b;
    a = b * b;

Intel IA-32 assembler source code and the resulting machine instructions
(processor instructions):

Address  Memory content        Assembler source code
4012ee   a1 30 30 40 00        mov   0x403030,%eax
4012f3   0f af 05 30 30 40 00  imul  0x403030,%eax
4012fa   a3 20 30 40 00        mov   %eax,0x403020
C source code with the executable binary code generated for it
(memory address, memory content (= machine instruction), assembler source code):

int a=4, b;
int main(void) {
  if (a>5)
 8048344:  83 3d 94 94 04 08 05   cmpl  $0x5,0x8049494
 804834b:  7e 0c                  jle   8048359
    b=1;
 804834d:  c7 05 8c 95 04 08 01   movl  $0x1,0x804958c
 8048354:  00 00 00
 8048357:  eb 0a                  jmp   8048363
  else
    b=0;
 8048359:  c7 05 8c 95 04 08 00   movl  $0x0,0x804958c
 8048360:  00 00 00
}
 8048363:  c9                     ...
99
100
Address Space
Byte Ordering
[Figure: address space. Memory addresses run from the lowest possible address
(start of memory) to the highest possible address (end of memory, max.).
Example: a 16-byte data block with start address 0x10000000 and last byte
address 0x1000000f; every single byte has its own address.]

[Figure: byte ordering. Data (4 bytes): d3 d2 d1 d0, with d3 = MSB and d0 = LSB.]

Big-endian system:
Addr.    n    n+1  n+2  n+3
Content  d3   d2   d1   d0
         MSB            LSB

Little-endian system:
Addr.    n    n+1  n+2  n+3
Content  d0   d1   d2   d3
         LSB            MSB

The program accesses the 4-byte datum through address n.
MSB = Most Significant Byte
LSB = Least Significant Byte
101
Alignment Rules
[Figure: alignment(1) versus alignment(4), showing addresses (hexadecimal) in
the address space and the data bus. A 4-byte long word stored at byte addresses
0x35 to 0x38 is misaligned: it spans two bus words (byte address offsets +0 to
+3 starting at 0x34 and at 0x38), so it takes two accesses (1st access,
2nd access) to read it.]
103
arrays, functions, pointers, structures, unions (we will discuss them later)
104
Storage Classes
Repetition Computer Architecture
Storage Classes
Construction of an Executable
Relocation Process
105
Automatic Objects
I
17
Static Objects
106
In both cases, they retain their values across exit from and
reentry to functions and blocks
108
Intermediate Summary
I
I
dynamic data
I
I
stack or heap
storage space not known
volatile life time
110
109
Program Sections
[Figure: program sections in the address space: .text (PROM), .data (RAM), .bss (RAM).]
PROM: Programmable Read Only Memory (memory chip that cannot be written during operation)
RAM: Random Access Memory
111
The Stack
The Heap
112
A Program In Memory
[Figure: three alternative layouts of a program in memory, drawn over the
addresses starting at 0 / the program start address. Each layout contains code
and constants, initialized data and uninitialized data (static data), and heap
and stack (dynamic data); the layouts differ in the order of these regions.]
113
Memory Segments
114
115
116
int a;
static int b;

void
func(void)
{
    char c;         /* only for the life time of func() */
                    /* but 2 x; visible only in func() */
    static int d;   /* I'm unique, exist once at a stable */
                    /* address, visible only in func() */
}

int
main(void)
{
    int e;          /* life time of main() */
}
117
Variable Placement
[Figure: variable placement in the address space from address 0 to max.:
code (1st, 2nd, 3rd, 4th instruction, ...; program counter PC at t=0 and t=x),
data (a, b, d), heap, and stack (c, e; stack pointer SP at t=0 and t=x).]
118
119
120
Storage Classes
Construction of an Executable
Relocation Process
121
Stage         Input files                               Output files
Preprocessor  *.c/*.cc/*.cpp (C/C++ source code)        *.i/*.ii (preprocessed C/C++ source code)
Compiler      *.i/*.ii                                  *.s (assembler source code)
Assembler     *.s (assembler source code)               *.o (object file, not linked)
Linker        *.o/*.a (object file, library file)       a.out (executable file = loadable object file)

For any given input file, the file name suffix determines what kind
of compilation is done (see gcc(1) for more details and suffixes):

suffix  compilation step
.c      C source code which must be preprocessed
.i      C source code which should not be preprocessed
.h      Header file to be turned into a precompiled header
.s      Assembler code
.o      An object file to be fed straight into linking
123
124
The C Preprocessor
[Figure: (Filename).c is compiled (command gcc) to (Filename).s, assembled
(command gas) to (Filename).o, and linked (command ld) together with
object/library files to a.out. Legend: operation, command, input or output.]
Macro Substitution
Conditional Compilation
125
File Inclusion
126
Macro Substitution
#include <filename>

Note
The characters in the name filename must not include > or \n, and
the effect is undefined if it contains any of ", ', \, or /*.

Location
The named file is searched for in a sequence of implementation-dependent
places (often starting in /usr/include).

Example
#define EXIT_FAILURE 1
#define EXIT_SUCCESS 0
#define S_IRWXU 0000700
#define S_IRUSR 0000400
#define S_IWUSR 0000200
#define S_IXUSR 0000100
128
#undef identifier

Example
#define S_ISDIR(m)   ((m & 0170000) == 0040000)   /* directory */
#define S_ISCHR(m)   ((m & 0170000) == 0020000)   /* char sp. */
#define S_ISBLK(m)   ((m & 0170000) == 0060000)   /* block sp. */
#define S_ISREG(m)   ((m & 0170000) == 0100000)   /* regular */
#define S_ISFIFO(m)  ((m & 0170000) == 0010000)   /* fifo */

Example
/*
 * Some header files may define an abs macro.
 * If defined, undef it to prevent a syntax error
 * and issue a warning.
 * #warning is a pragma (implementation-dependent action)
 */
#ifdef abs
#undef abs
#warning abs macro collides with abs() prototype, undefining
#endif
130
129
Conditional Inclusion
Example
#ifndef NULL
#ifdef __GNUG__
#define NULL __null
#else
#define NULL 0L
#endif
#endif

Predefined Names
__LINE__
__FILE__
__DATE__
__TIME__
__STDC__
131
132
Compilation
Assembly
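[Figure: compilation. The compiler translates text (source code) into assembler
source code (text) and produces a translation listing with error messages.]

[Figure: assembly. The assembler translates assembler source code (text) into
object format (machine code and additional information) and produces a
translation listing with error messages and a symbol table (text).]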
133
134
Linking
Repetition Computer Architecture
Storage Classes
[Figure: linking. The linker combines object format files (machine code and
additional information) with library object format files found by library
search, and produces binary code or object format, plus a link map (address
space usage) and a symbol list (text).]
Construction of an Executable
Relocation Process
135
136
[Figure: sections after compilation and after linking.
After compilation: each object file (OBJ2, OBJ3, ...) has its own sections
(.text1, .data1, .bss1; .text2, .bss2; .text3, .data3, .bss3). Every section
starts at address 0; sections are logical address spaces of the compiler.
After linking: OBJtotal contains .text1, .text2, .text3, .data1, .data3, .bss1,
.bss2, .bss3; all sections are placed absolutely in the address space
(0 to 0xffffffff; example addresses 0x08048244 and 0x08049370).]

.text: code
.data: initialized variables
.bss: uninitialized variables

Linking:
- centralization of sections
- relocation of addresses
137
Relocation Records
I
138
int
main(void)
{
    static int c;

    b = 5;
    c = b + a + 16;
    return c;
}
139
140
$ file compile.o
ELF 32-bit LSB relocatable, Intel 80386, version 1, not stripped

$ objdump -x compile.o
compile.o:     file format elf32-i386
compile.o
architecture: i386, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x00000000

Sections:
Idx Name      Size      VMA       LMA       File off  Algn
  0 .text     0000005a  00000000  00000000  00000034  2**2
              CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data     00000004  00000000  00000000  00000090  2**2
              CONTENTS, ALLOC, LOAD, DATA
  2 .bss      00000004  00000000  00000000  00000094  2**2
              ALLOC
  3 .rodata   00000005  ...                           2**0
              CONTENTS, ...

SYMBOL TABLE:
00000000 l    df *ABS*    00000000 compile.c
00000000 l    d  .text    00000000
00000000 l    d  .data    00000000
00000000 l    d  .bss     00000000
00000000 l     O .bss     00000004 c.0
00000000 l    d  .rodata  00000000
00000000 g     O .data    00000004 a
00000000 g     F .text    0000005a main
00000004       O *COM*    00000004 b
141
compile.o:     file format elf32-i386

Disassembly of section .text:

00000000 <main>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 18                sub    $0x18,%esp
   6:   83 e4 f0                and    $0xfffffff0,%esp
   9:   b8 00 00 00 00          mov    $0x0,%eax
   e:   29 c4                   sub    %eax,%esp
  10:   a1 00 00 00 00          mov    0x0,%eax
  15:   89 45 e8                mov    %eax,0xffffffe8(%ebp)
  18:   c7 05 00 00 00 00 05    movl   $0x5,0x0
  1f:   00 00 00
  22:   a1 00 00 00 00          mov    0x0,%eax
  27:   03 05 00 00 00 00       add    0x0,%eax
  2d:   83 c0 10                add    $0x10,%eax
  30:   a3 00 00 00 00          mov    %eax,0x0
  35:   a1 00 00 00 00          mov    0x0,%eax
  3a:   8b 55 e8                mov    0xffffffe8(%ebp),%edx
  3d:   3b 15 00 00 00 00       cmp    0x0,%edx
  43:   74 13                   je     58 <main+0x58>
  45:   83 ec 08                sub    $0x8,%esp
  48:   ff 75 e8                pushl  0xffffffe8(%ebp)
  4b:   68 00 00 00 00          push   $0x0
  50:   e8 fc ff ff ff          call   51 <main+0x51>
  55:   83 c4 10                add    $0x10,%esp
  58:   c9                      leave
  59:   c3                      ret

The same disassembly, interleaved with the source code:

compile.o:     file format elf32-i386

Disassembly of section .text:

00000000 <main>:
int b;          /* Global variable, uninitialized -> .bss */

int
main(void)
{
   0:   55                      push   %ebp
   ... 6 more lines ...
  15:   89 45 e8                mov    %eax,0xffffffe8(%ebp)
    static int c;   /* Local, static variable -> .bss */

    b = 5;
  18:   c7 05 00 00 00 00 05    movl   $0x5,0x0
  1f:   00 00 00
    c = b + a + 16;
  22:   a1 00 00 00 00          mov    0x0,%eax
  27:   03 05 00 00 00 00       add    0x0,%eax
  2d:   83 c0 10                add    $0x10,%eax
  30:   a3 00 00 00 00          mov    %eax,0x0
    return c;
  35:   a1 00 00 00 00          mov    0x0,%eax
}
   ... 10 more lines ...
143
144
compile:     file format elf32-i386
compile
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x1c000408

Sections:
Idx Name      Size      VMA       LMA       File off  Algn
...
  9 .text     ...                                     2**2
...
 12 .data     ...                           00001008  2**2
...
 20 .bss      00000184  3c003100  3c003100  00001100  2**5
              ALLOC

SYMBOL TABLE:
3c003140 l     O .bss   00000004 c.0
3c003280 g     O .bss   00000004 b
1c0005c0 g     F .text  0000005a main
3c001018 g     O .data  00000004 a

int
main(void)
{
1c0005c0:   55                      push   %ebp
1c0005c1:   89 e5                   mov    %esp,%ebp
1c0005c3:   83 ec 18                sub    $0x18,%esp
1c0005c6:   83 e4 f0                and    $0xfffffff0,%esp
1c0005c9:   b8 00 00 00 00          mov    $0x0,%eax
1c0005ce:   29 c4                   sub    %eax,%esp
1c0005d0:   a1 00 31 00 3c          mov    0x3c003100,%eax
1c0005d5:   89 45 e8                mov    %eax,0xffffffe8(%ebp)
    static int c;   /* Local, static variable -> .bss */

    b = 5;
1c0005d8:   c7 05 80 32 00 3c 05    movl   $0x5,0x3c003280
1c0005df:   00 00 00
    c = b + a + 16;
1c0005e2:   a1 18 10 00 3c          mov    0x3c001018,%eax
1c0005e7:   03 05 80 32 00 3c       add    0x3c003280,%eax
1c0005ed:   83 c0 10                add    $0x10,%eax
1c0005f0:   a3 40 31 00 3c          mov    %eax,0x3c003140
    return c;
1c0005f5:   a1 40 31 00 3c          mov    0x3c003140,%eax
}
145
146
Storage Classes
Relocation Process
147
148
Before linking (compile.o):
      18:   c7 05 00 00 00 00 05    movl   $0x5,0x0
After linking (compile):
1c0005d8:   c7 05 80 32 00 3c 05    movl   $0x5,0x3c003280
149
150
Systems Programming
03. Functions and Program Structure
151
152
Basics Of Functions
Basics of Functions
Functions Returning Non-integers
Basics of Functions
External Variables
Scope Rules
Header Files
Static Variables
A Program in Execution - Unix Run-time
154
153
Example:
Input: Text in /etc/services
Pattern: http
$ ./a.out < /etc/services
# See also http://www.iana.org/assignments/port-numbers
www             80/tcp          http    # WorldWideWeb HTTP
https           443/tcp                 # secure http (SSL)
155
As said, small pieces are easier to deal with than one big one
156
18
158
Function Definition
A function definition has the form:
return-type
function-name(parameter declarations, if any)
{
declarations
statements
}
I
void dummy(void) { }
which does nothing, accepts nothing, and returns nothing
}
I
19
159
by argument
values returned by the functions
through external variables
161
162
External Variables
Scope Rules
Header Files
Static Variables
20
163
/* atof: convert string s to double */
double
atof(char s[])
{
    double val, power;
    int i, sign;
    ...
}

/* rudimentary calculator */
int
main(void)
{
    double sum, atof(char []);
    char line[MAXLINE];
    int getline(char line[], int max);

    sum = 0;
    while (getline(line, MAXLINE) > 0)
        printf("\t%g\n", sum += atof(line));
    return (0);
}

Sample session (input line, then running sum):
+123.2
        123.2
-0.2
        123
+0.7
        123.7
-123.1
        0.6
165
166
The declaration
double sum, atof(char []);
says that sum is a double variable, and that atof is a function that
takes one char[] argument and returns double.
I
167
168
170
169
External Objects
Basics of Functions
Functions Returning Non-integers
External Variables
Scope Rules
Header Files
Static Variables
A Program in Execution - Unix Run-time
21
171
1 -1 -1 -1 -1 -9
2
4 4 9
5
input :
Function evaluation
Scope Rules
Program description
( 1 - 2 ) * ( 4 + 5 )
1 2 - 4 5 + *
I
I
The value on the top of the stack is popped and printed when
the end of the input line is encountered.
174
173
- Keep it in main.
- Pass the stack to the routines that push and pop it
  - But main doesn't need to know about the stack
  - main only does push and pop operations
175
176
4
5
# includes
# defines
        switch (type) {
        case NUMBER:
            push(atof(s));
            break;
        case '+':
            push(pop() + pop());
            break;
        case '*':
            push(pop() * pop());
            break;
        case '-':
            op2 = pop();
            push(pop() - op2);
            break;
        case '/':
            op2 = pop();
            if (op2 != 0.0)
                push(pop() / op2);
            else
                printf("error: zero divisor\n");
            break;
        case '\n':
            printf("\t%.8g\n", pop());
            break;
        default:
            printf("error: unknown command %s\n", s);
            break;
        }

For - and /, the left and right operands must be distinguished. To guarantee
the right order, it is necessary to pop the first value into a temporary
variable.
180
The stack itself and its fill factor (the stack pointer) are
shared by push and pop
Since they are defined outside any function, they are external
int sp = 0;             /* next free stack position */
double val[MAXVAL];     /* value stack */
}
Scope Rules
Header Files
Static Variables
A Program in Execution - Unix Run-time
187
188
Visibility Scope
Visibility Scope
main, sp, val, push, & pop defined in one file, in the order shown:
The scope of a name is the part of the program within which the
name can be used.
int
main(void)
{ ... }

int sp = 0;
double val[MAXVAL];

void
push(double f)
{ ... }

double
pop(void)
{ ... }
190
and serve as the declaration for the rest of that source file.
extern int sp;
extern double val[];
191
They declare for the rest of the source file that sp is an int
and val is a double[] (whose size is determined elsewhere)
Definition/Declaration Of Externals
Although it is not a likely organization for this program
23
extern int sp;
extern double val[];
There may also be extern declarations in the file containing the definition
194
193
External Variables
Scope Rules
Header Files
main main.c
getop getop.c
Static Variables
A Program in Execution - Unix Run-time
24
We separate them from the others because they would come from a
separately compiled library in a realistic program
195
196
Header File
Program Structure
calc.h:
    #define NUMBER '0'
    void push(double);
    double pop(void);
    int getop(char []);
    int getch(void);
    void ungetch(int);

Source files: main.c, getop.c, getch.c, stack.c

stack.c:
    int sp = 0;
    double val[MAXVAL];
198
197
Static Variables
Basics of Functions
The variables
Functions Returning Non-integers
External Variables
Scope Rules
Header Files
limits the scope of that object to the rest of the source file
External static thus provides a way to hide names like buf and
bufp in the getch-ungetch combination, which must be external so
they can be shared, yet which should not be visible to users of
getch and ungetch.
Static Variables
A Program in Execution - Unix Run-time
199
200
If the two functions and the two variables are compiled in one file:
The names will not conflict with the names in other files of
the same program
$ readelf -s global.o
Symbol table '.symtab' contains 7 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS global.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    2
     4: 00000000     0 SECTION LOCAL  DEFAULT    3
     5: 00000005     5 FUNC    LOCAL  DEFAULT    1 local_func
                               ^^^^^
     6: 00000000     5 FUNC    GLOBAL DEFAULT    1 global_func
                               ^^^^^^
202
201
$ readelf -a internal_static.o
Section Headers:
  [Nr] Name   Type      Addr      Off     Size    ES Flg Lk Inf Al
  [ 3] .data  PROGBITS  00000000  000074  000000  00  WA  0   0  4
  [ 4] .bss   NOBITS    00000000  000080  004000  00  WA  0   0 32
       ^^^^                               ^^^^^^

$ readelf -a internal_static.o
  [Nr] Name   Type      Addr      Off     Size    ES Flg Lk Inf Al
  [ 3] .data  PROGBITS  00000000  000080  004000  00  WA  0   0 32
       ^^^^^                              ^^^^^^
  [ 4] .bss   NOBITS    00000000  004080  000000  00  WA  0   0  4
204
Array Initialization
Basics of Functions
int main(void) {
    static int internal_static[4096] = { 1, 2, 3 };
}
I
Scope Rules
Header Files
Static Variables
A Program in Execution - Unix Run-time
206
205
Following slides about the function call stack are taken from:
Prof. Dr. Torsten Grust.
Buffer Overflow Exploits.
Talk at the University of Konstanz
February 2004
http://www3.in.tum.de/cms/members/grust
207
208
209
210
void *
malloc(size_t size);

void *
calloc(size_t nmemb, size_t size);
malloc(3)
211
212
void *
malloc(size_t size);

void *
realloc(void *ptr, size_t size);

void *
calloc(size_t nmemb, size_t size);
realloc(3)
I
calloc(3)
I
214
213
int * ip ;
215
216
int
main(void)
{
    char c;
    char *p;

    c = '@';    /* '@' = 0x40 */
    p = &c;
    return (0);
}
Command-line Arguments
218
217
[Figure: memory layout. The pointer p occupies the four bytes at 0xcfbe5174
to 0xcfbe5177; the character c is stored at 0xcfbe517b.]
p is said to point to c
& operator only applies to objects in memory
219
220
Pointing To Integers
(gdb) p /x i                        /* examine content of i in hex */
$2 = 0xdeadbeaf
(gdb) p pi                          /* print content of pi */
$3 = (unsigned int *) 0xcfbe2958    /* address uint is stored */
(gdb) x /4b 0xcfbe2958              /* examine 4 bytes at this address */
0xcfbe2958: 0xaf 0xbe 0xad 0xde     /* little endian */
(gdb) p &pi                         /* print the address of the pointer to int */
$4 = (unsigned int **) 0xcfbe2954
(gdb) x /4b 0xcfbe2954              /* print what is stored there */
0xcfbe2954: 0x58 0x29 0xbe 0xcf     /* the address of i */
int
main(void)
{
    unsigned int i;
    unsigned int *pi;   /* pointer to int */

    i = 0xdeadbeaf;
    pi = &i;            /* address of i in pointer variable */
    return (0);
}

pi: 0xcfbe2954: 0x58 -.
            55: 0x29  |
            56: 0xbe  |----.
            57: 0xcf -'    |
                           |
 i: 0xcfbe2958: 0xaf <-----'
            59: 0xbe
            5a: 0xad
            5b: 0xde
pi: 0xcfbe2954: 0xcfbe2958    (pointer to int)
 i: 0xcfbe2958: 0xdeadbeaf

(gdb) p /x &pi
$10 = 0xcfbe2954
(gdb) p pi
$3 = (unsigned int *) 0xcfbe2958
(gdb) p /x i
$13 = 0xdeadbeaf
int x = 1, y = 2, z[10];
int *ip;        /* ip is a pointer to int */

ip = &x;        /* address of an int */
y = *ip;        /* y is now 1 */
*ip = 0;        /* x is now 0 */
ip = &z[0];     /* ip now points to z[0] */
223
224
Declaration Of A Pointer
int *ip;

int *ip;
int x = 0;

ip = &x;
*ip = *ip + 10;
226
225
Unary operators * and & bind more tightly than arithmetic ones

y = *ip + 1;
*ip += 1;
++*ip;
(*ip)++;

Operators                              Associativity
( ) [ ] -> .                           left to right
! ~ ++ -- + - * & (type) sizeof        right to left
* / %                                  left to right
+ -                                    left to right
<< >>                                  left to right
< <= > >=                              left to right
== !=                                  left to right
&                                      left to right
^                                      left to right
|                                      left to right
&&                                     left to right
||                                     left to right
?:                                     right to left
= += -= *= /= %= &= ^= |= <<= >>=      right to left
,                                      left to right

Table: Unary +, -, and * have higher precedence than the binary forms
227
228
iq = ip ;
I
230
229
/* WRONG */
void
swap(int x, int y)
{
    int tmp;

    tmp = x;
    x = y;
    y = tmp;
}

void
swap(int *px, int *py)
{
    int tmp;

    tmp = *px;
    *px = *py;
    *py = tmp;
}

swap:    px: ---.
         py: ---+--.
                |  |
caller:   a: <--'  |
          b: <-----'
231
232
Problem Statement:
Selected Approach25 :
25
233
Usage Of getint()
It also increments n
236
int a[10];
int *pa;
int x;
a[0] a[1] a[2] a[3] a[4] a[5] a[6] a[7] a[8] a[9]
238
237
x = * pa ;
[Figure: array a with elements a[0] ... a[9]. After the assignment pa = &a[0];
pa points to a[0]. After the assignment x = *pa; x holds a copy of a[0].]
239
240
Adding 1 To A Pointer
Adding i To A Pointer
pa1 pa+1
a:
a:
a[0] a[1] a[2] a[3] a[4]
In general:
In general:
241
242
Array notation    Pointer notation
a[i]              *(a+i)
&a[i]             a+i
pa[i]             *(pa+i)
244
Indexing Backwards
Illegal to refer to objects that are not within the array bounds
p[-1]
*(p + -1)
*(p - 1)

int a[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

a+i          a[i]   (a+i+1) - (a+i) as addresses   (a+i+1) - (a+i)
0xcfbf2a44   0      0xcfbf2a48 - 0xcfbf2a44        1
0xcfbf2a48   1      0xcfbf2a4c - 0xcfbf2a48        1
0xcfbf2a4c   2      0xcfbf2a50 - 0xcfbf2a4c        1
...          ...    ...                            ...

char a[] = "abcdefghij";

a+i          a[i]   (a+i+1) - (a+i) as addresses   (a+i+1) - (a+i)
0xcfbf2a74   a      0xcfbf2a75 - 0xcfbf2a74        1
0xcfbf2a75   b      0xcfbf2a76 - 0xcfbf2a75        1
0xcfbf2a76   c      0xcfbf2a77 - 0xcfbf2a76        1
...          ...    ...                            ...
246
245
Arrays In C
A pointer is a variable
As such:
int *pa;
int a[3];
pa = a;      /* legal */
pa++;        /* legal */
/* a = pa;   illegal */
/* a++;      illegal */
* a = 1;
*( a + 1) = 2;
*( a + 1) = 3;
*( a + 2) = 4;
247
248
Definition:
The value of a variable (or expression) of type array is the
address of element zero of the array
I
250
249
Function definition:
As formal parameters char s[] and char *s are equivalent
I
f(a+2);
251
252
    for (n = 0; *s != '\0'; s++)
        n++;
    return n;
}
I
254
253
Correct Statements?
Explanation
Left block
int a[10];
int *pa;

pa = a;          /* OK */
*pa = 0;         /* OK */
*(pa+1) = 1;     /* OK */
pa[2] = 2;       /* OK */
pa = &a[5];      /* OK */
*pa = 5;         /* OK */
*(pa-1) = 4;     /* OK */
pa[1] = 6;       /* OK */
pa = &a[9];      /* OK */
*pa = 9;         /* OK */
pa[-1] = 8;      /* OK */

pa = a;
*(pa+10) = 0;    /* WRONG */
*(pa-1) = 0;     /* WRONG */

pa = &a[5];      /* OK */
*(pa+10) = 0;    /* WRONG */
pa = &a[10];     /* OK */
*pa = 0;         /* WRONG */

Right block
int *pa2;

pa = &a[5];      /* OK */
pa2 = pa + 10;   /* WRONG */
pa2 = pa - 10;   /* WRONG */
255
The first examples set pa to point into the array a but then
use overly-large offsets (+10, -1) which end up trying to store
a value outside of the array a
int a [10];
int * pa , * pa2 ;
pa = &a[5];
pa2 = pa + 10; /* WRONG */
*pa2 = 0; /* WRONG */
#define NULL 0L
257
Comparison Of Pointers
258
Pointer Subtraction
Pointer subtraction to determine string length
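A minimal sketch of the idea, assuming a helper named str_length (a hypothetical name, chosen to avoid clashing with the library's strlen): advance a second pointer to the terminating '\0' and subtract the start pointer from it.

```c
#include <stddef.h>

/* Return the length of s, computed as the distance between two pointers. */
size_t
str_length(const char *s)
{
    const char *p = s;

    while (*p != '\0')      /* walk p to the terminating null character */
        p++;
    return (size_t)(p - s); /* both pointers refer to the same array */
}
```

The subtraction is only meaningful because p and s point into the same array; the result counts elements, not bytes.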
Then relations like ==, !=, <, >=, etc. work properly
259
260
262
261
263
264
Good to remember:
C does not provide any operators for processing an entire string of
characters as a unit
I
pmessage:
char * pmessage ;
pmessage = " now is the time " ;
pmessage = " hello , world " ;
I
266
265
pmessage[0] = 'N';
Equivalent to *pmessage = 'N'
Why?
268
String Copy
Direct consequence of not treating a string as a unit
void
strcpy(char s[], char t[])
{
    int i;

    for (i = 0; t[i] != '\0'; i++)
        s[i] = t[i];
    s[i] = '\0';
}
269
270
Expressions like *p++ and *--p may seem cryptic at first sight
271
272
Comparing Strings
If the pointers are equal, they point to the same place, so they
certainly point to the same string
I
I
274
273
Intermediate Summary
/* strcmp as array version */
int
strcmp(char *s, char *t)
{
    int i;

    for (i = 0; s[i] == t[i]; i++)
        if (s[i] == '\0')
            return (0);
    return s[i] - t[i];
}
/* strcmp as pointer version */
int
strcmp(char *s, char *t)
{
    for (; *s == *t; s++, t++)
        if (*s == '\0')
            return (0);
    return *s - *t;
}
275
276
277
Coding Style
I
279
280
281
282
Program Startup
Dynamic Memory Allocation
Command-line Arguments
283
284
Command-line Arguments
argv:
echo\0
hello,\0
world\0
NULL
286
285
Environment
Environment (cont.)
Terminology
HOME=/home/holu\0
SHELL=/bin/ksh\0
PS1=\w \$\0
NULL
287
Systems Programming
int
main ( int argc , char * argv [])
{
int i ;
return (0);
}
289
290
Enumerations
Typedef
Pointers to Functions
Function Callbacks
The libxml2 library
291
292
Structures
294
293
Structure Declaration
keyword struct
295
296
Structure Tag
Tagged structure
I
copying it
assigning to it as a unit
Illegal operation
I
Initialization
Anonymous structure
struct {
int i ;
int j ;
} a;
by assignment
by calling a function returning a struct of apt type
298
297
struct point * pp ;
p1 = makepoint (0 ,0);
p2 = makepoint ( XMAX , YMAX );
299
300
Nested Structures
struct rect {
    struct point pt1;
    struct point pt2;
};
ps->m
301
302
Unions
I
I
Enumerations
I
I
Typedef
a
a
a
a
This reads in C:
Pointers to Functions
struct tnode {
char * word ;
int count ;
struct tnode * left ;
struct tnode * right ;
};
Function Callbacks
The libxml2 library
303
304
now is the time for all good men to come to the aid of their party
   1 aid
   1 all
   1 come
   1 for
   1 good
   1 is
   1 men
   1 now
   1 of
   1 party
   2 the
   1 their
   1 time
   2 to

[Figure: the words of the sentence stored in a binary tree rooted at "now"]
306
struct tnode *addtree(struct tnode *, char *);
void treeprint(struct tnode *);
int getword(char *, int);

int
main(void)
{
    struct tnode *root;
    char word[MAXWORD];

    root = NULL;
    while (getword(word, MAXWORD) != EOF)
        if (isalpha(word[0]))
            root = addtree(root, word);
    treeprint(root);
    return (0);
}

struct tnode *talloc(void);

/* make a duplicate of s */
char *
strdupl(char *s)
{
    char *p;

    p = (char *) malloc(strlen(s) + 1);
    if (p != NULL)
        strcpy(p, s);
    return p;
}
307
308
Unions
Basics of Structures
I
Self-referential Structures
Unions
Enumerations
Pointers to Functions
union u_tag {
int ival ;
float fval ;
char * sval ;
} u;
Function Callbacks
Typedef
309
310
Unions (cont.)
I
Unions may occur within structures and arrays, and vice versa
Basics of Structures
Self-referential Structures
Unions
struct {
char * name ;
int flags ;
int utype ;
union {
int ival ;
float fval ;
char * sval ;
} u;
} symtab [ NSYM ];
Enumerations
Typedef
Pointers to Functions
Function Callbacks
The libxml2 library
symtab[i].u.ival
*symtab[i].u.sval     /* first char of string sval */
symtab[i].u.sval[0]   /* ditto */
311
312
Enumerations
Enumerations (cont.)
314
313
Self-referential Structures
Unions
Enumerations
Typedef
Type Length can be used exactly in the same way as type int
Pointers to Functions
Portability issues
I
Function Callbacks
315
316
Further typedefs
Basics of Structures
I
Self-referential Structures
Unions
Enumerations
Typedef
Pointers to Functions
Function Callbacks
317
Pointers To Functions
318
I
I
#include <stdio.h>

void
print_one(void)
{
    printf("1\n");
}

void
print_two(void)
{
    printf("2\n");
}

int
main(void)
{
    /* array of pointers to functions taking no arguments */
    void (*func[])(void) = { print_one, print_two };

    (*func[0])();
    (*func[1])();
    return (0);
}
Enumerations
Typedef
Pointers to Functions
321
322
<d / >
<e / >
</b >
<f >
<g / >
<h >
The SAX parser reads its input sequentially, and once only.
c
<i / >
<j / >
</h >
h
i
</f >
</a >
26
324
SAX Events
Selected events defined by the SAX interface27 :
Event          ... triggered by          Formal arguments
startDocument  <?xml ...?>
endDocument    EOF
startElement   <t a1=v1 ... an=vn>       t, (a1, v1), ..., (an, vn)
endElement     </t>                      t
characters     text content              buffer pointer, length
comment        <!-- c -->                c
...            ...                       ...
Be careful with the characters event! For performance reasons the parser will
give you a pointer to its own memory space. Never write to this memory, and
never look further than the length given by the parser!
27
325
326
Function Callbacks
<?xml version="1.0" encoding="iso-8859-1"?>
<fs>
<dir name="etc">
<file name="services">
# $OpenBSD: services,v 1.67 2007/05/01 11:48:40 steven Exp $
#
# Network services, Internet style
#
# Note that it is presently the policy of IANA to assign a single ...
</file>
</dir>
<dir name="usr"/>
</fs>
Event          Actual arguments
startDocument
startElement   t = "fs"
startElement   t = "dir", a1 = "name", v1 = "etc"
startElement   t = "file", a1 = "name", v1 = "services"
characters     c = "# $OpenBSD: services ...", len = n
endElement     t = "file"
endElement     t = "dir"
...            ...
327
Whenever any of the SAX events occurs, the parser calls the
function that is registered for this event.
328
void
sax_start_document(void *ctx);
void
sax_end_document(void *ctx);
void
sax_start_element(void *ctx, const xmlChar *t, const xmlChar **atts);
void
sax_end_element(void *ctx, const xmlChar *t);
void
sax_characters(void *ctx, const xmlChar *c, int len);
329
330
struct xmlSAXHandler {
    startDocumentSAXFunc startDocument;
    endDocumentSAXFunc endDocument;
    startElementSAXFunc startElement;
    endElementSAXFunc endElement;
    charactersSAXFunc characters;
    ...
};
331
332
/* context pointer */
xmlParserCtxtPtr ctx ;
int
main ( void )
{
/* register callback function */
sax_handler . characters = sax_characters ;
333
Course repository:
pub/src/sax_xmp.c
334
int
main(void)
{
    /* context pointer */
    xmlParserCtxtPtr ctx;
    ...
    return (0);
}
Compilation Phase
cc -Wall -I/usr/local/include/libxml2 -I/usr/local/include -c sax_xmp.c
Linking Phase
cc -Wall -L/usr/local/lib -lxml2 sax_xmp.o
335
336
At one sweep
mond10:~> cc -Wall -I/usr/include/libxml2 -lxml2 sax_xmp.c
338
337
User Level
Kernel Level
Character
Block
Process control subsystem
Interprocess communication (IPC)
Scheduling
Memory management
Hardware control
Kernel Level
Hardware Level
Hardware
339
340
System Architecture
FUSE
Implementation
Implementation Steps
I
cat ~/Mail/a.msg/subject
pre/size/level
...
libc
libfuse
user
Expected result
kernel
FUSE
VFS
ext3
Figure: Extending the file hierarchy into the file.
341
342
a
b
f
g
c
d
h
i
343
344
Pre-Order/Post-Order Traversal
<a >
0 <a >
<b >
1 <b >
<c >
<d / >
<e / >
</c >
</b >
<f >
<g / >
<h >
<i / >
<j / >
</h >
</f >
</a >
2 <c >
3 <d / > 0
4 <e / > 1
</c > 2
</b > 3
5 <f >
6 <g / > 4
7 <h >
8 <i / > 5
9 <j / > 6
</h > 7
</f > 8
</a > 9
345
346
pre
0
1
2
3
4
5
6
7
8
9
post
9
3
2
0
1
8
4
7
5
6
Tree Reconstruction
8
Implementation Details
http://www-db.in.tum.de/~grust/files/xpath-accel.pdf (short)
http://www-db.in.tum.de/~grust/files/accelerating-locsteps.pdf
347
348
Tree Reconstruction
- Sequentially scan table T with encoded document (in pre-order).
- Maintain a stack with nodes whose start (but no end) tag was printed.
- Before processing a node n, check stack for nodes n' whose end tags have
  to be emitted first. This is the case for n' with post(n') < post(n).
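The three steps can be sketched in C as follows. This is an illustration under assumptions, not the course implementation: the row type struct node, the function name reconstruct, the fixed stack depth of 64 and the output into a caller-supplied buffer are all hypothetical choices; only element nodes are handled.

```c
#include <stdio.h>
#include <string.h>

struct node {               /* hypothetical row of table T, in pre-order */
    int post;               /* post-order rank */
    const char *name;       /* element name */
};

/* Append "<name>" or "</name>" to out; -1 if the buffer is too small. */
static int
emit(char *out, size_t outsz, size_t *len, int closing, const char *name)
{
    int w = snprintf(out + *len, outsz - *len, "<%s%s>",
        closing ? "/" : "", name);

    if (w < 0 || (size_t)w >= outsz - *len)
        return -1;
    *len += (size_t)w;
    return 0;
}

/* Serialize rows t[0..n-1] into out. Returns 0 on success, -1 on overflow. */
int
reconstruct(const struct node *t, int n, char *out, size_t outsz)
{
    int stack[64];          /* nodes whose start tag was printed */
    int top = 0, i;
    size_t len = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i <= n; i++) {
        /* close open nodes n' with post(n') < post(n); at the end, all */
        while (top > 0 && (i == n || t[stack[top - 1]].post < t[i].post))
            if (emit(out, outsz, &len, 1, t[stack[--top]].name) == -1)
                return -1;
        if (i == n)
            break;
        if (top == 64 || emit(out, outsz, &len, 0, t[i].name) == -1)
            return -1;
        stack[top++] = i;
    }
    return 0;
}
```

Each node is pushed and popped exactly once, so a single sequential scan of the table suffices.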
Tree Reconstruction
350
349
post
12
11
10
1
0
3
2
5
4
7
6
9
8
kind
elem
elem
elem
elem
text
elem
text
elem
text
elem
text
elem
text
content
fs
dir
file
date
Tue, 13 May 2008 17:48:56 +0200 (CEST)
from
Christian.Pich@uni-konstanz.de
to
inf@inf.uni-konstanz.de
subject
EM-Tipprunde
message
Lieber Fachbereich, es sind ...
351
352
pre
0
1
2
3
4
5
6
7
8
9
10
11
12
post
12
11
10
1
0
3
2
5
4
7
6
9
8
size
12
11
10
1
0
1
0
1
0
1
0
1
0
level
0
1
2
3
4
3
4
3
4
3
4
3
4
kind
elem
elem
elem
elem
text
elem
text
elem
text
elem
text
elem
text
content
fs
dir
file
date
Tue, 13 May 2008 17:48:56 +0200 (CEST)
from
Christian.Pich@uni-konstanz.de
to
inf@inf.uni-konstanz.de
subject
EM-Tipprunde
message
Lieber Fachbereich, es sind ...
354
353
Tree Reconstruction
Storing XML Trees in (R)DBMSs
http://www.pathfinder-xquery.org/
I
MonetDB/XQuery DBMS
http://monetdb-xquery.org/
Implementation Details
355
356
UNIX Filesystem
$ tree ./ a
0 a9
| - - 1 b3
|
-- 2 c 2
|
|-|
--- 5 f 8
| - - 6 g4
-- 7 h 7
|---
3 d0
4 e1
8 i5
9 j6
28
The stat(2) and fstat(2) functions return a structure containing all the
attributes of a file.
357
358
Mapi
MonetDB/XQuery
xls /mnt/fuse
FUSE
Implementation
FUSE XQuery
Module
ls /mnt/fuse
FSOps to
XQuery/XQUF
Pathfinder
XQuery Compiler
libc
libfuse
MonetDB Kernel
Tree Reconstruction
Storing XML Trees in (R)DBMSs
user
kernel
VFS
FUSE
ext3
Implementation Details
Figure: A filesystem in userspace implemented by a tree-aware DBMS
359
360
Tuple Representation
enum kind {
    ELEM = 1, ATTR, TXT, COMMENT, DOC, UDEF
};
struct tuple {
    int pre;
    int post;
    int level;
    int size;
    enum kind kind;
    char *name;
};
I
struct tuple {
unsigned int size ;
unsigned int level ;
enum kind kind ;
void * cnt ;
};
361
362
Alexander Holupirek
Database and Information Systems Group
Department of Computer & Information Science
University of Konstanz
15.06. 29.06.
363
364
So far:
I
I
I
366
365
Unbuffered I/O
shell
system calls
kernel
library routines
367
368
File Descriptor
File Descriptor
I
370
369
I/O Redirection
I
The user can redirect I/O to and from files with < and >:
In this case, the shell changes the default assignments for file
descriptors 0 and 1 to the named files.
In all cases, the file assignments are changed by the shell, not
by the program. The program does not know where its input
comes from nor where its output goes, so long as it uses file 0
for input and 1 and 2 for output.
$ strace ./hello
execve("./hello", ["./hello"], [/* 72 vars */]) = 0
brk(0)                                      = 0x602000
mmap(NULL, 4096, PROT_READ|PROT, ...)       = 0x2ac75deeb000
uname({sys="Linux", node="titan05", ...})   = 0
access("/etc/ld.so.preload", R_OK)          = -1 ENOENT
open("/etc/ld.so.cache", O_RDONLY)          = 3
fstat(3, {st_mode=S_IFREG|0644, ...})       = 0
mmap(NULL, 197102, PROT_READ, ...)          = 0x2ac75deec000
close(3)                                    = 0
open("/lib64/libc.so.6", O_RDONLY)          = 3
read(3, "\177ELF\2\1\1\0\0\0\03"..., 832)   = 832
fstat(3, {st_mode=S_IFREG|075, ...})        = 0
mmap(NULL, 4096, PROT_READ|...)             = 0x2ac75df1d000
mmap(NULL, 3412216, PROT_READ|....)         = 0x2ac75e0ed000
371
372
Low-Level I/O
File Descriptor
Open/Create/Close a File
Reposition File Offset
Read and Write a File
Properties of a File
Primitive System Data Types
The size-related Fields
Directory Properties and Functions
Device Numbers and Time-related Fields
374
373
Open A File
int
open(const char *path, int flags, mode_t mode);
O_RDONLY
O_WRONLY
O_RDWR
Table: One and only one of these three constants must be specified.
But . . .
29
376
O_APPEND
O_CREAT
O_EXCL
O_TRUNC
O_NONBLOCK
O_SYNC
O_RSYNC
O_DSYNC
int
creat(const char *path, mode_t mode);
O_TRUNC options now provided by open, a separate creat function is no longer needed:
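As a sketch, creat can be expressed through open with the flags O_WRONLY, O_CREAT and O_TRUNC. The wrapper name my_creat is hypothetical and only serves the illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/* Same effect as creat(path, mode): write-only, create if missing, truncate. */
int
my_creat(const char *path, mode_t mode)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
}
```

One limitation carries over from creat: the descriptor is write-only. A caller that wants to read the file back can pass O_RDWR to open instead.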
377
Close A File
378
Low-Level I/O
File Descriptor
Open/Create/Close a File
int
close ( int d );
Releases any record locks the process may have on the file.
I
I
Properties of a File
380
Offset Interpretation
off_t
lseek(int fildes, off_t offset, int whence);
whence     offset (re-)position
SEEK_SET   file's offset is set to offset bytes from the beginning of file.
SEEK_CUR   file's offset is set to its current value plus offset.
SEEK_END   file's offset is set to the size of the file plus offset.
381
382
Seeking Capability
Same goes to determine if a file is capable of seeking:
lseek will fail and the file pointer will remain unchanged if:
int
main(void)
{
    if (lseek(STDIN_FILENO, 0, SEEK_SET) == -1)
        err(errno, "can not seek [%d].", errno);
    else
        printf("seek OK.\n");
    return (0);
}
383
/* Illegal seek */
384
lseek only records the current file offset within the kernel.
Creating holes
The file's offset can be greater than the file's current size.

#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <unistd.h>

char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";

int
main(void)
{
    int fd;

    if ((fd = creat("file.hole", S_IRUSR | S_IWUSR)) < 0)
        err(1, "creat error");
    if (write(fd, buf1, 10) != 10)
        err(1, "buf1 write error");
    /* offset now = 10 */
    if (lseek(fd, 50*16384, SEEK_SET) == -1)
        err(1, "lseek error");
    /* offset now = 50 * 16384 */
    if (write(fd, buf2, 10) != 10)
        err(1, "buf2 write error");
    /* offset now = 50 * 16384 + 10 */
    return (0);
}
385
Reading Holes
386
Bytes in a file that have not been written are read back as 0.
$ ls -l
-rw------- 1 holu holu 819210 May 15 11:05 file.hole
$ od -c file.hole
0000000   a   b   c   d   e   f   g   h   i   j  \0  \0  \0  \0  \0  \0
0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
3100000   A   B   C   D   E   F   G   H   I   J
3100012
387
388
Low-Level I/O
File Descriptor
Input and output uses the read and write system calls.
Open/Create/Close a File
Reposition File Offset
Read and Write a File
Properties of a File
ssize_t
read ( int d , void * buf , size_t nbytes );
ssize_t
write ( int d , const void * buf , size_t nbytes );
390
An implementation of getchar()
I
391
392
int
getchar(void)
{
    static char buf[BUFSIZ];
    static char *bufp = buf;
    static int n = 0;
    if (n == 0) { /* buffer is empty */
        n = read(0, buf, sizeof buf);
        bufp = buf;
    }
    return (--n >= 0) ? (unsigned char) *bufp++ : EOF;
}
393
File Copying, v4
394
Low-Level I/O
File Descriptor
Open/Create/Close a File
Reposition File Offset
Read and Write a File
    if (argc != 3)
        error("Usage: cp from to");
    if ((f1 = open(argv[1], O_RDONLY, 0)) == -1)
        error("can't open %s", argv[1]);
    if ((f2 = creat(argv[2], PERMS)) == -1)
        error("can't create %s, mode %03o", argv[2], PERMS);
    while ((n = read(f1, buf, BUFSIZ)) > 0)
        if (write(f2, buf, n) != n)
            error("write error on file %s", argv[2]);
    return (0);
}
395
396
Properties of a File
int
stat ( const char * path , struct stat * sb );
int
fstat ( int fd , struct stat * sb );
int
lstat ( const char * path , struct stat * sb );
397
398
Low-Level I/O
struct stat {
    mode_t    st_mode;     /* inode's mode */
    uid_t     st_uid;      /* user ID of owner */
    gid_t     st_gid;      /* group ID of owner */
    off_t     st_size;     /* file size, in bytes */
    int64_t   st_blocks;   /* blocks allocated for file */
    u_int32_t st_blksize;  /* optimal blocksize for I/O */
    dev_t     st_dev;      /* device inode resides on */
    ino_t     st_ino;      /* inode's number */
    nlink_t   st_nlink;    /* number of hard links to the file */
    dev_t     st_rdev;     /* device type, for special file inode */
    struct timespec st_atimespec; /* last access */
    struct timespec st_mtimespec; /* last data modification */
    struct timespec st_ctimespec; /* last file status change */
    u_int32_t st_flags;    /* user defined flags for file */
    u_int32_t st_gen;      /* file generation number */
};
File Descriptor
Open/Create/Close a File
Reposition File Offset
Read and Write a File
Properties of a File
Primitive System Data Types
The size-related Fields
Directory Properties and Functions
400
File Types
Type        Description
dev_t       device numbers (major and minor)
fd_set      file descriptor sets
fpos_t      file position
gid_t       numeric group IDs
ino_t       inode numbers
mode_t      file type, file creation mode
nlink_t     link counts for directory entries
off_t       file sizes and offsets (signed)
pid_t       process IDs and process group IDs
ptrdiff_t   result of subtracting two pointers (signed)
size_t      size of objects (such as strings) (unsigned)
ssize_t     functions that return a count of bytes (signed) (e.g., read, write)
time_t      counter of seconds of calendar time
uid_t       numeric user IDs
wchar_t     can represent all distinct character codes
struct stat {
    mode_t st_mode; /* inode's mode */
    ...
};

typedef __uint32_t __mode_t;
typedef __mode_t mode_t;

Macro         Type of file
S_ISREG(m)    Regular file
S_ISDIR(m)    Directory file
S_ISCHR(m)    Character special file
S_ISBLK(m)    Block special file
S_ISFIFO(m)   FIFO
S_ISLNK(m)    Symbolic Link
S_ISSOCK(m)   Socket
403
404
File types:
/* sys/stat.h (OpenBSD 4.3) */
#define S_IFMT   0170000 /* type of file mask */
#define S_IFIFO  0010000 /* named pipe (fifo) */
#define S_IFCHR  0020000 /* character special */
#define S_IFDIR  0040000 /* directory */
#define S_IFBLK  0060000 /* block special */
#define S_IFREG  0100000 /* regular */
#define S_IFLNK  0120000 /* symbolic link */
#define S_IFSOCK 0140000 /* socket */
#define S_ISFIFO(m) ((m & 0170000) == 0010000) /* fifo */
#define S_ISCHR(m)  ((m & 0170000) == 0020000) /* char spec. */
#define S_ISDIR(m)  ((m & 0170000) == 0040000) /* directory */
#define S_ISBLK(m)  ((m & 0170000) == 0060000) /* block spec. */
#define S_ISREG(m)  ((m & 0170000) == 0100000) /* reg. file */
#define S_ISLNK(m)  ((m & 0170000) == 0120000) /* symb. link */
#define S_ISSOCK(m) ((m & 0170000) == 0140000) /* socket */

S_IFIFO     0   1   0   0   0   0
S_IFCHR     0   2   0   0   0   0
S_IFDIR     0   4   0   0   0   0
S_IFBLK     0   6   0   0   0   0
S_IFREG     1   0   0   0   0   0
S_IFLNK     1   2   0   0   0   0
S_IFSOCK    1   4   0   0   0   0
S_IFMT      1   7   0   0   0   0   /* type of file mask */
           --x xxx --- --- --- ---  /* used bits st_mode */
           mode
File Permissions
406
#define S_IRWXU 0000700 /* RWX mask for owner */
#define S_IRUSR 0000400 /* R for owner */
#define S_IWUSR 0000200 /* W for owner */
#define S_IXUSR 0000100 /* X for owner */

#define S_IRWXG 0000070 /* RWX mask for group */
#define S_IRGRP 0000040 /* R for group */
#define S_IWGRP 0000020 /* W for group */
#define S_IXGRP 0000010 /* X for group */

#define S_IRWXO 0000007 /* RWX mask for other */
#define S_IROTH 0000004 /* R for other */
#define S_IWOTH 0000002 /* W for other */
#define S_IXOTH 0000001 /* X for other */

#define S_IREAD  S_IRUSR
#define S_IWRITE S_IWUSR
#define S_IEXEC  S_IXUSR

S_IRWXO     0   0   0   0   0   7
S_IRWXG     0   0   0   0   7   0
S_IRWXU     0   0   0   7   0   0
S_IFMT      1   7   0   0   0   0
           --| ||| --- xxx xxx xxx
           mode        usr grp oth
408
File Ownership
Process IDs
struct stat {
    mode_t st_mode; /* inode's mode */
    uid_t  st_uid;  /* user ID of owner */
    gid_t  st_gid;  /* group ID of owner */
    ...
};
409
410
S_ISGID     0   0   2   0   0   0   /* set group id on exec */
S_ISUID     0   0   4   0   0   0   /* set user id on exec */
           --| ||| xx- ||| ||| |||  /* used bits st_mode */
           mode    ss- usr grp oth
411
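if (eff uid is 0) then grant
if (eff uid equals st_uid) then
    if (apt user permission) then grant else deny
if (eff gid equals st_gid) then
    if (apt group permission) then grant else deny
if (apt other permission) then grant else deny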
What is the search bit? Where does its name come from?
Whenever we want to open any file by name we must have execute
permission in each directory mentioned in the name (including the
current directory if it is implied). This is why the execute
permission bit for the directory is often called the search bit.
414
413
#define S_ISTXT 0001000 /* sticky bit */
If set, a copy of the program's text was saved in the swap area
on process termination. S_ISVTX as mnemonic for saved-text.
S_ISTXT     0   0   1   0   0   0   /* sticky bit */
           --x xxx xxx xxx xxx xxx  /* bits st_mode */
           mode    sst usr grp oth
416
418
417
Low-Level I/O
File Descriptor
struct stat {
    off_t     st_size;    /* file size, in bytes */
    int64_t   st_blocks;  /* blocks allocated for file */
    u_int32_t st_blksize; /* optimal blocksize for I/O */
    ...
};
st_blksize The preferred block size for I/O for the file.
420
int
main(void)
{
    int fd;
    ssize_t wb;
    struct stat s;
    struct statfs sfs;

    if ((fd = open("/tmp/file", O_CREAT | O_TRUNC | O_RDWR, 0600)) == -1)
        err(errno, "can not create file. [%d]", errno);
    if ((wb = write(fd, "a", 1)) == -1)
        err(errno, "can not write to fd %d", fd);
    if (fstat(fd, &s) != 0)
        err(errno, "fstat failed.");
    if (fstatfs(fd, &sfs) != 0)
        err(errno, "statfs error occurred.");
    printf("st_size:\t%lld\n", s.st_size);
    printf("st_blocks:\t%lld\n", s.st_blocks);
    printf("st_blksize:\t%d\n", s.st_blksize);
    printf("f_bsize:\t%d\n", sfs.f_bsize);
    printf("f_iosize:\t%u\n", sfs.f_iosize);
    return (0);
}

st_size:     1
st_blocks:   4
st_blksize:  16384
f_bsize:     2048
f_iosize:    16384

Compare it:
$ ls -ls /tmp/file
4 -rw-r--r-- 1 holu wheel
Occupies one filesystem data block (2K) to store the single byte.
421
422
int
fstatfs(int fd, struct statfs *buf);
struct statfs {
    u_int32_t f_flags;
    u_int32_t f_bsize;
    u_int32_t f_iosize;
};
int
truncate(const char *path, off_t length);
int
ftruncate(int fd, off_t length);
/* Both return 0 if OK, -1 on error */
423
424
Low-Level I/O
File Descriptor
struct stat {
    ino_t   st_ino;   /* inode's number */
    nlink_t st_nlink; /* number of hard links to the file */
    ...
};
Open/Create/Close a File
Reposition File Offset
Read and Write a File
Properties of a File
Primitive System Data Types
The size-related Fields
Directory Properties and Functions
Device Numbers and Time-related Fields
425
426
Figure: Cylinder groups inodes and data blocks in more detail. Two
directory entries point to the same inode.[Apue,Fig. 4.14]
Figure: A disk drive being divided into one or more partitions. Each
partition can contain a filesystem. Inodes are fixed-length entries that
contain most of the information about a file. [Apue,Fig. 4.13]
427
428
Only when the link count goes to zero can the file be deleted
(i.e., can the associated data blocks be released).
This is why unlinking a file does not always mean deleting the
blocks associated with the file.
If name1 is removed, the file name2 is not deleted and the link
count of the underlying object is decremented.
name1 must exist for the hard link to succeed and both
name1 and name2 must be in the same file system. As
mandated by Posix.1 name1 may not be a directory.
int
/* 0 if OK , -1 on error */
link ( const char * name1 , const char * name2 );
429
430
/* 0 if OK , -1 on error */
/* 0 if OK , -1 on error */
If one or more processes have the file open when the last link
is removed, the link is removed, but the removal of the file is
delayed until all references to it have been closed.
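The behaviour of the link count can be observed with a small experiment. The sketch below assumes a hypothetical helper link_demo that creates a file, adds a second name with link(2), removes the first name with unlink(2), and records st_nlink after each step through the still-open descriptor.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create name1, hard-link it as name2, then remove name1.
 * counts[0..2] receive the link count after each of the three steps.
 * Returns 0 on success, -1 on error. name2 is left in place.
 */
int
link_demo(const char *name1, const char *name2, long counts[3])
{
    struct stat sb;
    int fd;

    if ((fd = open(name1, O_WRONLY | O_CREAT | O_EXCL, 0600)) == -1)
        return -1;
    if (fstat(fd, &sb) == -1)
        goto fail;
    counts[0] = (long)sb.st_nlink;
    if (link(name1, name2) == -1 || fstat(fd, &sb) == -1)
        goto fail;
    counts[1] = (long)sb.st_nlink;
    if (unlink(name1) == -1 || fstat(fd, &sb) == -1)
        goto fail;
    counts[2] = (long)sb.st_nlink;
    close(fd);
    return 0;
fail:
    close(fd);
    return -1;
}
```

Both names must live on the same filesystem for link(2) to succeed. After the call the data is still reachable through name2.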
431
432
Both from and to must be of the same type (that is, both
directories or both non-directories), and must reside on the
same filesystem.
433
File Metadata
File type
434
/* byte offsets of the members are given in the comments */
struct ufs1_dinode {
    u_int16_t di_mode;            /*   0: */
    int16_t   di_nlink;           /*   2: */
    union {
        u_int16_t oldids[2];      /*   4: */
        u_int32_t inumber;        /*   4: */
    } di_u;
    u_int64_t di_size;            /*   8: */
    int32_t   di_atime;           /*  16: */
    int32_t   di_atimensec;       /*  20: */
    int32_t   di_mtime;           /*  24: */
    int32_t   di_mtimensec;       /*  28: */
    int32_t   di_ctime;           /*  32: */
    int32_t   di_ctimensec;       /*  36: */
    int32_t   di_db[NDADDR];      /*  40: */
    int32_t   di_ib[NIADDR];      /*  88: */
    u_int32_t di_flags;           /* 100: */
    int32_t   di_blocks;          /* 104: */
    int32_t   di_gen;             /* 108: */
    u_int32_t di_uid;             /* 112: */
    u_int32_t di_gid;             /* 116: */
    int32_t   di_spare[2];        /* 120: */
};

struct dirent {
    ino_t d_ino;                  /* inode number */
    char  d_name[NAME_MAX + 1];   /* null-terminated filename */
};
436
Reading Directories
# define NAME_MAX
We will see a similar approach when discussing the standard library I/O
function and the FILE structure.
438
437
Directory Functions
struct dirent *   /* ptr or NULL at end of dir / error */
readdir(DIR *dirp);
long              /* current location in directory stream */
telldir(const DIR *dirp);
int               /* 0 if OK, -1 on error */
closedir(DIR *dirp);
void
seekdir(DIR *dirp, long loc);
void
rewinddir(DIR *dirp);

int
main(int argc, char *argv[])
{
    DIR *dp;
    struct dirent *dent;

    if (argc != 2)
        errx(1, "single arg (directory name) required.");
    if ((dp = opendir(argv[1])) == NULL)
        err(errno, "can not open %s", argv[1]);
    while ((dent = readdir(dp)) != NULL)
        printf("%s\n", dent->d_name);
    closedir(dp);
    return (0);
}
439
440
int      /* 0 if OK, -1 on error */
chdir(const char *path);
int      /* 0 if OK, -1 on error */
fchdir(int fd);
char *   /* buf if ok, NULL on error */
getcwd(char *buf, size_t size);
441
Device Numbers
struct stat {
dev_t st_dev ; /* device inode resides on */
dev_t st_rdev ; /* device type , for special file inode */
...
};
I
Each filesystem on the same disk drive would usually have the
same major number, but a different minor number.
32
struct stat {
...
struct timespec st_atime ; /* last access */
struct timespec st_mtime ; /* last data modification */
struct timespec st_ctime ; /* last - change time of inode */
};
444
Systems Programming
Low-Level I/O:
Unbuffered I/O and control functions on file descriptors.
Alexander Holupirek
Filesystem Interface:
Functions for operating on directories and for manipulating
file attributes such as access modes and ownership.
445
Ease of use.
447
448
An error flag.
Incidental Remark
I
449
Predefined streams
450
Buffering
__BEGIN_DECLS
extern FILE __sF [];
__END_DECLS
# define stdin
# define stdout
# define stderr
451
Fully buffered
Line buffered
Unbuffered
452
Actual I/O takes place when the standard I/O buffer is filled.
453
Unbuffered I/O
454
Unbuffered I/O:
455
456
int
/* 0 if OK else EOF ( but stream is still functional ) */
setvbuf ( FILE * stream , char * buf , int mode , size_t size );
void
setbuf ( FILE * stream , char * buf );
May only be used after successful open and before first I/O.
buf and size can optionally specify a buffer and its size.
If buf is NULL the system chooses an apt size.
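A minimal sketch of a setvbuf call that leaves the buffer choice to the system. The wrapper name make_line_buffered is hypothetical; it has to be called after the stream was opened and before any I/O is performed on it.

```c
#include <stdio.h>

/* Request line buffering with a buffer allocated by the standard library. */
int
make_line_buffered(FILE *fp)
{
    /* buf == NULL: let the implementation pick the buffer */
    return setvbuf(fp, NULL, _IOLBF, 0);
}
```

Passing _IOFBF or _IONBF instead of _IOLBF selects full buffering or no buffering in the same way.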
System-dependent:
457
Function   mode     buf              Type of buffering
setbuf              nonnull          fully buffered
                    NULL             unbuffered
setvbuf    _IOFBF   nonnull / NULL   fully buffered
           _IOLBF   nonnull / NULL   line buffered
           _IONBF   (ignored)        unbuffered

Opening a Stream
FILE *
freopen(const char *path, const char *mode, FILE *stream);
I
int
/* 0 if OK , EOF on failure and errno set */
fflush ( FILE * stream );
459
460
mode   Description
r      open for reading
w      truncate to 0 length or create for writing
a      append; open for writing at end of file, or create for writing
r+     open for reading and writing
w+     truncate to 0 length or create for reading and writing
a+     open or create for reading and writing at end of file
461
Appending to a Stream
462
34
464
Each process has its own file table entry, but they share a
single v-node table entry (see Figure 3.7).
Since the file size has been extended, the kernel also updates
the current file size in the v-node to 1600.
Each time a write is performed for a file with this append flag
set, the current file offset in the file table entry is first set to
the current file size from the i-node table entry.
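The effect of the append flag can be illustrated with a short sketch. The helper name append_bytes is hypothetical; the lseek call is there only to show that an explicit reposition does not change where an O_APPEND write lands.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Append len bytes of buf to path. Returns the bytes written or -1. */
ssize_t
append_bytes(const char *path, const void *buf, size_t len)
{
    ssize_t n;
    int fd;

    if ((fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600)) == -1)
        return -1;
    /*
     * Seeking to the start has no effect on the write below: with
     * O_APPEND the offset is set to the file size before each write.
     */
    (void)lseek(fd, 0, SEEK_SET);
    n = write(fd, buf, len);
    close(fd);
    return n;
}
```

Because positioning and writing happen as one step in the kernel, two processes appending to the same file do not overwrite each other's data.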
I
35
Lost update.
465
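Restriction                            r   w   a   r+  w+  a+
file must already exist                x           x
previous contents of file discarded        x           x
stream can be read                     x           x   x   x
stream can be written                      x   x   x   x   x
stream can be written only at end              x           x

#include <stdio.h>
int   /* 0 if OK, else EOF / errno (no further access) */
fclose(FILE *stream);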
467
468
470
#include <stdio.h>
int
ferror(FILE *stream);
void
clearerr(FILE *stream);

Push-Back Characters
int   /* c if OK, EOF on failure */
ungetc(int c, FILE *stream);
Pushing back EOF will fail and the stream remains unchanged.
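A common use of ungetc is to look at the next character without consuming it. The sketch below assumes a hypothetical helper named peekc.

```c
#include <stdio.h>

/* Return the next character of fp without consuming it, or EOF. */
int
peekc(FILE *fp)
{
    int c = getc(fp);

    if (c != EOF)
        ungetc(c, fp);  /* one character of push-back is guaranteed */
    return c;
}
```

The pushed-back character is kept in the stream's buffer; it is not written to the underlying file.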
I
471
472
int
fputc ( int c , FILE * stream );
int
putc ( int c , FILE * stream );
str specifies the address of the buffer to read the line into.
473
474
Binary I/O
int   /* 0 on success and EOF on error */
fputs(const char *str, FILE *stream);
int   /* >=0 on success and EOF on error */
puts(const char *str);
I
I
475
476
size_t
fwrite ( const void * ptr , size_t size ,
size_t nmemb , FILE * stream );
/* Return number of objects read or written */
I
I
37
struct tuple {
unsigned int size ;
unsigned int level ;
enum kind kind ;
void * cnt ;
} tup ;
if ( fwrite (& tup , sizeof ( tup ) , 1 , fp ) != 1)
err (1 , " fwrite error . " );
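For illustration, the tuple can be written and read back with the matching fread call. The function name tuple_roundtrip is hypothetical, and the enum values are taken from the tuple representation shown earlier in the course.

```c
#include <stdio.h>

enum kind { ELEM = 1, ATTR, TXT, COMMENT, DOC, UDEF };

struct tuple {
    unsigned int size;
    unsigned int level;
    enum kind kind;
    void *cnt;
};

/* Write *in to fp, rewind, and read it back into *out. 0 on success. */
int
tuple_roundtrip(FILE *fp, const struct tuple *in, struct tuple *out)
{
    if (fwrite(in, sizeof(*in), 1, fp) != 1)
        return -1;
    rewind(fp);                 /* flushes and repositions to the start */
    if (fread(out, sizeof(*out), 1, fp) != 1)
        return -1;
    return 0;
}
```

The bytes written are the in-memory representation of the structure, so the file is only portable between programs with the same layout, and the pointer member cnt is meaningless once the writing process has exited.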
I
478
479
480
Positioning a Stream
int
/* file descriptor assoc . with the stream */
fileno ( FILE * stream );
They work similarly to lseek(2) and the whence options (SEEK_SET
etc.) are the same.
481
482
Alexander Holupirek
Database and Information Systems Group
Department of Computer & Information Science
University of Konstanz
483
484
NetBSD/puffs/reFUSE
I
http://www.netbsd.org/docs/puffs/
FreeBSD/fuse4bsd
I
http://fuse4bsd.creo.hu/
Mac OS X/MacFUSE
I
http://code.google.com/p/macfuse/
OpenSolaris
I
38
http://fuse.sourceforge.net/
http://opensolaris.org/os/project/fuse/
485
FUSE-based FS Implementations
NTFS-3G (http://www.ntfs-3g.org)
Comprehensive list of FUSE-based FSs.
I
http://fuse.sourceforge.net/wiki/index.php/FileSystems
486
487
488
This file can be opened multiple times, and the obtained file
descriptor is passed to the mount syscall, to match up the
descriptor with the mounted filesystem.
489
490
Incidental Remarks
491
492
FUSE Summary
A. Kernel Module39
The kernel module hooks into the VFS code and looks like a
filesystem module.
They are translated back into the form expected by the kernel.
39
494
B. FUSE library
I
495
496
int
main ( int argc , char * argv [])
{
return fuse_main ( argc , argv , & hello_oper , NULL );
}
497
498
static int
hello_getattr(const char *path, struct stat *stbuf)
{
    int res = 0;
    ...
    return res;
}

static int
hello_readdir(const char *path, void *buf,
    fuse_fill_dir_t filler, off_t offset,
    struct fuse_file_info *fi)
{
    (void) offset;
    (void) fi;
    ...
}
500
static int
hello_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((fi->flags & 3) != O_RDONLY)
        return -EACCES;
    return 0;
}
return size ;
}
501
502
I
I
503
504
505
506
Read Directory
fuse_lowlevel.h
/**
 * Read directory
 *
 * Send a buffer filled using fuse_add_direntry(), with size
 * not exceeding the requested size. Send an empty buffer on
 * end of stream.
 *
 * Valid replies:
 *   fuse_reply_buf
 *   fuse_reply_err
 *
 * @param req request handle
 * @param ino the inode number
 * @param size maximum number of bytes to send
 * @param off offset to continue reading the directory stream
 * @param fi file information
 */
void (*readdir) (fuse_req_t req, fuse_ino_t ino, size_t size,
                 off_t off, struct fuse_file_info *fi);
fuse_lowlevel.h
/**
 * Get file attributes
 *
 * Valid replies:
 *   fuse_reply_attr
 *   fuse_reply_err
 *
 * @param req request handle
 * @param ino the inode number
 * @param fi for future use, currently always NULL
 */
void (*getattr)(fuse_req_t req,
    fuse_ino_t ino,
    struct fuse_file_info *fi);
Open a File
fuse_lowlevel.h
/**
 * Open a file
 *
 * Open flags (with the exception of O_CREAT, O_EXCL,
 * O_NOCTTY and O_TRUNC) are available in fi->flags.
 *
 * ...
 * Valid replies:
 *   fuse_reply_open
 *   fuse_reply_err
 *
 * @param req request handle
 * @param ino the inode number
 * @param fi file information
 */
void (*open)(fuse_req_t req,
    fuse_ino_t ino,
    struct fuse_file_info *fi);
Read Data
fuse_lowlevel.h
/**
 * Read data
 *
 * Read should send exactly the number of bytes requested
 * except on EOF or error, otherwise the rest of the data
 * will be substituted with zeroes.
 * ...
 * Valid replies:
 *   fuse_reply_buf
 *   fuse_reply_err
 */
void (*read)(fuse_req_t req,        /* request handle */
    fuse_ino_t ino,                 /* inode number */
    size_t size,                    /* number of bytes to read */
    off_t off,                      /* offset to read from */
    struct fuse_file_info *fi);     /* file info */
Miscellaneous
Definitions
Caveats
fuse_lowlevel.h
/**
 * Reply with an error code or success
 *
 * Possible requests:
 *   all except forget
 *
 * unlink, rmdir, rename, flush, release, fsync, fsyncdir,
 * setxattr, removexattr and setlk may send a zero code
 *
 * @param req request handle
 * @param err the positive error value, or zero for success
 * @return zero for success, -errno for failure to send reply
 */
int fuse_reply_err(fuse_req_t req, int err);
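The sign conventions differ between the two API levels shown in these slides: the high-level callbacks return a negative errno value (for example -ENOENT in hello_open), while fuse_reply_err is documented to take the positive error value. The following sketch shows one way to bridge the two. The function to_reply_err is a hypothetical helper, not part of FUSE.

```c
#include <errno.h>

/*
 * Hypothetical adapter: map a high-level style result (0 or a
 * negative errno value) to the positive error value expected by
 * fuse_reply_err(). Zero and positive results count as success.
 */
int
to_reply_err(int res)
{
    return res < 0 ? -res : 0;
}
```

A low-level handler that reuses a high-level style check could then call fuse_reply_err(req, to_reply_err(res)) when res is negative.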
fuse_lowlevel.h
/**
 * Reply with a directory entry
 *
 * Possible requests:
 *   lookup, mknod, mkdir, symlink, link
 *
 * @param req request handle
 * @param e the entry parameters
 * @return zero for success, -errno for failure to send reply
 */
int fuse_reply_entry(fuse_req_t req,
    const struct fuse_entry_param *e);
fuse_lowlevel.h
/**
 * Reply with data
 *
 * Possible requests: read, readdir
 */
int                                 /* zero for success, -errno for failure */
fuse_reply_buf(fuse_req_t req,      /* request handle */
    const char *buf,                /* contains data */
    size_t size);                   /* data size in bytes */
fuse_lowlevel.h
/**
 * Reply with attributes
 *
 * Possible requests:
 *   getattr, setattr
 *
 * @param req request handle
 * @param attr the attributes
 * @param attr_timeout validity timeout (in seconds) for the attributes
 * @return zero for success, -errno for failure to send reply
 */
int fuse_reply_attr(fuse_req_t req,
    const struct stat *attr,
    double attr_timeout);
fuse_lowlevel.h
/**
 * Reply with open parameters
 *
 * currently the following members of fi are used:
 *   fh, direct_io, keep_cache
 *
 * Possible requests:
 *   open, opendir
 *
 * @param req request handle
 * @param fi file information
 * @return zero for success, -errno for failure to send reply
 */
int fuse_reply_open(fuse_req_t req,
    const struct fuse_file_info *fi);
Introduction to FUSE
Managing projects.
Project Overview
DBFS Commands
DBFS
dbfs_ll_getattr
dbfs_ll_lookup
dbfs_stat
static int
dbfs_stat(fuse_ino_t ino, struct stat *stbuf)
{
    enum kind_t kind = db_kind(ino);
    if (kind == FREG || kind == FDIR) {
        stbuf->st_ino = ino;
        stbuf->st_mode = db_st_mode(ino);
        stbuf->st_nlink = db_st_nlink(ino);
        stbuf->st_uid = db_st_uid(ino);
        stbuf->st_gid = db_st_gid(ino);
        stbuf->st_rdev = db_st_rdev(ino);
        stbuf->st_size = db_st_size(ino);
        stbuf->st_blocks = db_st_blocks(ino);
        stbuf->st_atime = db_st_atime(ino);
        stbuf->st_mtime = db_st_mtime(ino);
        stbuf->st_ctime = db_st_ctime(ino);
        return 0;
    }
    return -1;
}
dbfs_ll_read
dbfs_ll_readdir
static void
dbfs_ll_readdir(fuse_req_t req, fuse_ino_t ino, size_t size,
    off_t off, struct fuse_file_info *fi)
{
    (void) fi;    /* unused; size and off are used below */
    if (db_kind(ino) == FDIR) {
        unsigned int numchildren;
        unsigned int children[512];    /* holds at most 512 entries */
        struct dirbuf dbuf;
        unsigned int i;
        numchildren = db_children(children, 512, ino);
        memset(&dbuf, 0, sizeof(dbuf));
        for (i = 0; i < numchildren; i++) {
            dirbuf_add(
                req, &dbuf, db_cnt(children[i]), children[i]);
        }
        reply_buf_limited(req, dbuf.p, dbuf.size, off, size);
        free(dbuf.p);
    } else
        fuse_reply_err(req, ENOTDIR);
}
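The source does not show reply_buf_limited. Judging from its arguments and from the readdir documentation (the reply must not exceed the requested size, and an empty buffer signals the end of the stream), it presumably sends the part of the buffer that starts at off and is at most size bytes long. The sketch below computes only the length of that part, without any FUSE calls. The name reply_window is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: number of bytes that may be sent from a
 * buffer of bufsize bytes for a request that starts at off and asks
 * for at most maxsize bytes. Returns 0 once off is at or past the
 * end of the buffer, which corresponds to an empty reply.
 */
size_t
reply_window(size_t bufsize, off_t off, size_t maxsize)
{
    size_t rest;

    if (off < 0 || (size_t)off >= bufsize)
        return 0;
    rest = bufsize - (size_t)off;
    return rest < maxsize ? rest : maxsize;
}
```

Under this assumption a reply would carry reply_window(bufsize, off, size) bytes starting at buf + off, and the kernel keeps asking with a growing offset until it receives an empty buffer.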
Things to Remember
void pointer
I can't be dereferenced directly; convert it to a concrete pointer type first
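A short sketch of the void pointer rule: the generic pointer carries an address but no type, so it is converted to a typed pointer before the data is accessed. The function sum_ints is a hypothetical example.

```c
#include <stddef.h>

/*
 * Hypothetical example: sum n ints handed over as a void pointer.
 * Writing *data would not compile, because void has no size or
 * representation. The assignment below converts the pointer first.
 */
int
sum_ints(const void *data, size_t n)
{
    const int *p = data;    /* implicit conversion from void * in C */
    int sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += p[i];
    return sum;
}
```

The caller is responsible for passing a pointer that really refers to int objects; the compiler cannot check this once the type has been erased.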
Organizing Files
Header
I struct definitions
I typedefs
I function prototypes
I constants
I #defined macros
Common Pitfalls
I cyclic dependencies in two header files
I duplicate definitions
I duplicate instances
Source
I actual implementation
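The pitfalls listed above are commonly avoided with an include guard and forward declarations. The sketch below shows both in a single listing; the split into a header node.h and a source file node.c, as well as all names in it, are hypothetical.

```c
/* ---- would live in the header, e.g. node.h ---- */
#ifndef NODE_H
#define NODE_H          /* include guard: a second inclusion is skipped */

#include <stddef.h>

struct tree;            /* forward declaration instead of including
                           the header that defines struct tree; this
                           breaks a cyclic dependency */

typedef struct node {   /* struct definition and typedef */
    struct tree *owner;
    int value;
} node_t;

#define NODE_MAX_VALUE 100              /* constant as a macro */

int node_clamp(const node_t *n);        /* prototype only, no body */

#endif /* NODE_H */

/* ---- would live in the source file, e.g. node.c ---- */
int
node_clamp(const node_t *n)
{
    return n->value > NODE_MAX_VALUE ? NODE_MAX_VALUE : n->value;
}
```

Keeping function bodies and variable instances out of the header means each of them exists in exactly one object file, which avoids duplicate definitions at link time.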
High Level
I class structure
I optimizer
Vim
command line
I http://tldp.org/HOWTO/C-editing-with-VIM-HOWTO/index.html
Low Level
KDevelop
Anjuta