
UNIX

SOCKET PROGRAMMING

Outline
Socket and Internet Sockets
Network programming functions
Socket System Calls
TCP Sockets Programming
UDP Sockets Programming
Sockets
• Sockets provide mechanisms to communicate
between computers across a network

• There are different kinds of sockets

• Berkeley sockets are the most popular kind of
Internet socket
–run on Linux, FreeBSD, Windows
Network API
• The operating system provides an Application
Programming Interface (API) for network
applications

• The API is defined by a set of function types,
data structures, and constants

• The Application Programming Interface for
networks is called the sockets API
Internet Sockets

• Support stream and datagram communication
(e.g. TCP, UDP, IP)

• Similar to the UNIX file I/O API

• Based on C.
Types of Internet Sockets
• Different types of sockets implement different
communication types (stream vs. datagram)
• Type of socket: stream socket
– connection-oriented
– two way communication
– reliable (error free), in order delivery
– can use the Transmission Control Protocol
(TCP)
– e.g. telnet, http
• Type of socket: datagram socket
– connectionless, does not maintain an open
connection, each packet is independent
– can use the User Datagram Protocol (UDP)
– e.g. IP telephony
Data types

int8_t        signed 8-bit int
uint8_t       unsigned 8-bit int
int16_t       signed 16-bit int
uint16_t      unsigned 16-bit int
int32_t       signed 32-bit int
uint32_t      unsigned 32-bit int

u_char, u_short, u_int, u_long

More data types
sa_family_t   address family
socklen_t     length of struct
in_addr_t     IPv4 address
in_port_t     IP port number
Naming and Addressing
• Host name
–identifies a single host
–variable length string (e.g. www.berkeley.edu)
–is mapped to one or more IP addresses

• IP Address
–written as dotted octets (e.g. 10.0.0.1)
–32 bits. Written as four octets rather than as a single number, but it
must be converted to a 32-bit binary value (network byte order) before use.
• Port number
–identifies a process on a host
–16 bit number
–Reserved (well-known) ports: 0-1023
IP Address Data Structure
struct sockaddr_in {
    short int          sin_family;   // Address family (AF_INET)
    unsigned short int sin_port;     // Port number (network byte order)
    struct in_addr     sin_addr;     // Internet address
    unsigned char      sin_zero[8];  // Padding, set to 0
};

struct in_addr {
    in_addr_t s_addr;                // 4 bytes (IPv4 address, network byte order)
};
Generic Socket Address
• The sockets API is generic.

• There must be a generic way to
specify endpoint addresses.

• TCP/IP requires an IP address and a
port number for each endpoint
address.
Generic socket addresses
struct sockaddr {
    uint8_t     sa_len;       // BSD-derived systems only; not present on Linux
    sa_family_t sa_family;
    char        sa_data[14];
};

• sa_family specifies the address type.
• sa_data specifies the address value.
Generic Socket Address
• We don't need to fill in sockaddr
structures directly, since we will only deal with
one real protocol family.
• We can use sockaddr_in structures.

BUT: the socket calls take a generic address, so convert
the sockaddr_in pointer into a sockaddr pointer using type
casting.

sockaddr        sockaddr_in
---------       --------------------
sa_len          sin_len
sa_family       sin_family (AF_INET)
sa_data         sin_port
                sin_addr
                sin_zero
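
The fill-then-cast pattern described above can be sketched as follows. This is a minimal illustration, not code from the slides: the helper names fill_ipv4_addr and generic_family are hypothetical, and the address and port used with them are arbitrary.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill an IPv4 socket address.
 * Returns 0 on success, -1 if ip is not a valid dotted-decimal string. */
int fill_ipv4_addr(struct sockaddr_in *out, const char *ip, uint16_t port)
{
    assert(out != NULL && ip != NULL);
    memset(out, 0, sizeof(*out));       /* also clears sin_zero */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);        /* port in network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;
    return 0;
}

/* Hypothetical helper: read the address family through the generic view,
 * which is how a callee tells which concrete structure it was handed. */
int generic_family(const struct sockaddr *sa)
{
    return sa->sa_family;
}
```

A caller would pass (struct sockaddr *)&addr together with sizeof(addr) to bind() or connect(); the cast changes only the pointer type, not the bytes.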
Network Programming Functions
• Byte Ordering
• Byte Manipulation functions
• Addressing
• Socket system calls
Byte Ordering of Integers
• Different CPU architectures have different
byte ordering

Integer representation (2 bytes): D3 F2 (high-order byte D3, low-order byte F2)

                            memory address A    memory address A+1
Stored on little-endian
computer                    low-order byte      high-order byte
Stored on big-endian
computer                    high-order byte     low-order byte
Byte Order

• Byte Ordering
• Big Endian vs. Little Endian
• Little Endian (Intel, DEC): Least
significant byte of word is stored in the
lowest memory address
• Big Endian (Sun, SGI, HP): Most
significant byte of word is stored in the
lowest memory address
Network Byte Order Functions
16- and 32-bit conversion functions (for platform
independence)
Examples:
uint32_t m, n;
uint16_t s, t;
m = ntohl(n);   // net-to-host long (32-bit) translation
s = ntohs(t);   // net-to-host short (16-bit) translation
n = htonl(m);   // host-to-net long (32-bit) translation
t = htons(s);   // host-to-net short (16-bit) translation
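
To make the conversion concrete, here is a small sketch that inspects which byte ends up first in memory after htons(). The function names first_byte_on_wire and round_trip32 are made up for this illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the byte stored at the lowest memory address
 * of a 16-bit value after conversion to network byte order. */
uint8_t first_byte_on_wire(uint16_t host_value)
{
    uint16_t net = htons(host_value);
    uint8_t bytes[2];

    assert(sizeof(net) == sizeof(bytes));
    memcpy(bytes, &net, sizeof(bytes));
    return bytes[0];
}

/* Hypothetical helper: host -> network -> host should give back the input. */
uint32_t round_trip32(uint32_t host_value)
{
    return ntohl(htonl(host_value));
}
```

Network byte order is big-endian, so for the value 0xD3F2 the first byte is the high-order byte 0xD3 on any host, whatever its native ordering.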
Byte Manipulation Functions
• void bzero (void *dest, size_t nbytes)

• void bcopy (const void *src, void *dest, size_t nbytes)

• int bcmp (const void *ptr1, const void *ptr2, size_t nbytes)
IPv4 Address Conversion
int inet_aton(const char *, struct in_addr *);

Convert ASCII dotted-decimal IP address to
network byte order 32 bit value. Returns 1 on
success, 0 on failure.

char *inet_ntoa(struct in_addr);

Convert network byte ordered value to ASCII
dotted-decimal (a string).

a – ASCII dotted-decimal IPv4 address
n – 32 bit binary IPv4 address in network byte order
inet_ntop, inet_pton
• convert IPv4 and IPv6 addresses between binary and text form

• #include <arpa/inet.h>

• int inet_pton(int af, const char *src, void *dst);
• const char *inet_ntop(int af, const void *src, char *dst, socklen_t size);

• af
– (Input) Specifies the family of the address to be converted: AF_INET or AF_INET6.

• src
– (Input) For inet_pton, the pointer to the null-terminated character string that contains the text
presentation form of an IPv4 / IPv6 address. For inet_ntop, the pointer to the numeric address.

• dst
– (Output) For inet_pton, the pointer to a buffer into which the function stores the numeric
address. The calling application must ensure that the buffer referred to by dst is
large enough to hold the numeric address (4 bytes for AF_INET or 16 bytes for
AF_INET6). For inet_ntop, the buffer that receives the text form.
• size
– (Input) inet_ntop only: the size of the buffer pointed at by dst.
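
A round trip through both functions might look like the sketch below. The helper name normalize_ipv4 is an assumption for illustration; only inet_pton and inet_ntop come from the slide.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a dotted-decimal string into binary form and
 * print it back into out. Returns 0 on success, -1 on a malformed address
 * or an output buffer that is too small. */
int normalize_ipv4(const char *text, char *out, socklen_t outlen)
{
    struct in_addr addr;

    assert(text != NULL && out != NULL);
    if (inet_pton(AF_INET, text, &addr) != 1)   /* text -> 4-byte binary */
        return -1;
    if (inet_ntop(AF_INET, &addr, out, outlen) == NULL)  /* binary -> text */
        return -1;
    return 0;
}
```

A buffer of INET_ADDRSTRLEN bytes is always large enough for the IPv4 text form.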
Address Access/Conversion Functions

• All binary values are network byte ordered

• struct hostent* gethostbyname (const char* hostname);
• Translate a host name to an IP address
(uses DNS)
• struct hostent* gethostbyaddr (const void* addr, socklen_t len, int family);
• Translate an IP address to a host name
(not secure)
Socket code
• socket()- creates a TCP or UDP socket
• bind() – binds a socket to an address or file
name
• connect()- connect to another process via a
socket
• listen() – wait for connections over a socket
• accept() – create a communication socket
with a process which has connected to you
• send() – sends a string/information over a
socket
• recv() – receive a string/information which
was sent over a socket
• sendto() – UDP version of send()
• recvfrom() – UDP version of recv()
• unlink() – remove a Unix local domain socket
• close() – to close a socket
Simple TCP Client-Server Example
Client  --- request --->  Server
Client  <-- response ---  Server

Client                                      Server

                                            socket()
                                            bind()
socket()                                    listen()
connect()  <- connection establishment ->   accept()
write()    -- data request ------------->   read()
read()     <- data response -------------   write()
close()    -- end-of-file notification ->   read()
                                            close()
Creating a Socket

int socket(int family, int type, int protocol);

• family specifies the protocol family
(AF_INET for TCP/IP).
• type specifies the type of service
(SOCK_STREAM, SOCK_DGRAM).
• protocol specifies the specific protocol
(usually 0, which means the default).
socket()
• The socket() system call returns a
socket descriptor (small integer) or
-1 on error.

• socket() allocates resources needed
for a communication endpoint - but it
does not deal with endpoint
addressing.
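
As a minimal sketch of the call and its error check, assuming a hypothetical wrapper named open_inet_socket:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create an IPv4 socket of the given type.
 * Returns the descriptor, or -1 after printing the reason to stderr. */
int open_inet_socket(int type)
{
    int fd;

    assert(type == SOCK_STREAM || type == SOCK_DGRAM);
    fd = socket(AF_INET, type, 0);      /* 0 selects the default protocol */
    if (fd == -1)
        perror("socket");
    return fd;
}
```

The descriptor is not yet associated with any address; it must be released with close() like a file descriptor.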
bind()
• The bind() system call is used to
assign an address to an existing
socket.

• It tells the OS to assign a local IP
address and local port number to the
socket.

int bind( int sockfd,
const struct sockaddr *myaddr,
socklen_t addrlen);

• bind returns 0 if successful or -1 on error.


bind() Example
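int mysock, err;
struct sockaddr_in myaddr;

mysock = socket(AF_INET, SOCK_STREAM, 0);
memset(&myaddr, 0, sizeof(myaddr));         // clear sin_zero and unused fields
myaddr.sin_family = AF_INET;
myaddr.sin_port = htons(portnum);
myaddr.sin_addr.s_addr = htonl(ipaddress);  // sin_addr is a struct; set its s_addr member

err = bind(mysock, (struct sockaddr *) &myaddr, sizeof(myaddr));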
Uses for bind()
• There are a number of uses for
bind():

– Server would like to bind to a well known
address (port number).

– Client can bind to a specific port.

– Client can ask the O.S. to assign any
available port number.
myaddr.sin_port = htons(0);
Uses for bind()

What if the computer has multiple network
interfaces?
• There is no realistic way to know
the right IP address for bind()

• Specify the IP address as
INADDR_ANY; this tells the OS to take
care of things.
listen()
int listen( int sockfd, int backlog);

sockfd is the TCP socket (already bound to
an address)

Once we call listen(), the O.S. will
queue incoming connections
backlog is the number of incoming
connections the kernel should queue for
this socket.

listen() returns -1 on error (otherwise 0).
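
Putting socket(), bind() and listen() together, a passive TCP socket could be set up roughly as below. The helper name start_listener and the backlog of 5 are illustrative choices, not values from the slides; getsockname() is used to learn which port the OS picked when 0 is requested.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a listening TCP socket on the given port
 * (0 = let the OS choose). Returns the descriptor and stores the port
 * actually bound in *port_out, or returns -1 on error. */
int start_listener(uint16_t port, uint16_t *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port_out != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local interface */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, 5) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}
```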


accept()
int accept( int sockfd,
struct sockaddr* cliaddr,
socklen_t *addrlen);

sockfd is the passive mode TCP
socket.
cliaddr is a pointer to allocated
space.
addrlen is a value-result argument
– must be set to the size of cliaddr
– on return, will be set to be the
number of used bytes in cliaddr.
accept()

accept() returns a new socket
descriptor (small non-negative integer)
or -1 on error.
connect()
• TCP clients can call connect() which:

– takes care of establishing an endpoint
address for the client socket.
• No need to call bind(); the O.S. will take
care of assigning the local endpoint
address (TCP port number, IP address).

– attempts to establish a connection to the
specified server.
• 3-way handshake
connect()
int connect( int sockfd,
const struct sockaddr *server,
socklen_t addrlen);

sockfd is an already created TCP socket.

server contains the address of the server
(IP Address and TCP port number)

connect() returns 0 if OK, -1 on error

connect()

After the connection is established, I/O
can be done using the read() and write() system calls.
Terminating a TCP connection

• Either end of the connection can call
the close() system call.
Value-Result Parameters
• bind(), connect(), and sendto() pass a
socket address structure from the process to the
kernel; its length is passed by value.
• accept() and recvfrom() pass a socket
address structure from the kernel to the
process; its length is passed by pointer, as a
value-result argument.
• Value – Tells the kernel the size of
the structure.
• Result – Tells the process how much
information the kernel actually stored
in the structure.
Concurrent servers

• To handle multiple clients at the
same time

• The simplest way is to fork a
child process to handle each
client.
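
The fork-per-client idea can be sketched as follows. This is one possible shape under stated assumptions, not the slides' own server: the names echo_until_eof, spawn_handler and serve_forever are hypothetical, and the per-client work is a simple echo.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Echo everything received on fd back to the sender until end-of-file. */
static void echo_until_eof(int fd)
{
    char buf[512];
    ssize_t n;

    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        if (write(fd, buf, (size_t)n) != n)
            break;                      /* short write or error: give up */
    }
}

/* Hypothetical helper: hand an accepted connection to a forked child.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t spawn_handler(int conn_fd)
{
    pid_t pid;

    assert(conn_fd >= 0);
    pid = fork();
    if (pid == 0) {                     /* child: serves this client only */
        echo_until_eof(conn_fd);
        close(conn_fd);
        _exit(0);
    }
    close(conn_fd);                     /* parent: the child owns the connection */
    return pid;
}

/* Hypothetical accept loop: one child per client, finished children reaped. */
void serve_forever(int listen_fd)
{
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd == -1) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return;
        }
        spawn_handler(conn_fd);
        while (waitpid(-1, NULL, WNOHANG) > 0)
            ;                           /* collect exited children */
    }
}
```

The parent closes its copy of the connected descriptor right away; otherwise the connection would stay open after the child finishes. A fuller server would also close the listening socket in the child.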
fork() and exec()
• pid_t fork(void)
- Returns: 0 in child, process ID of child in parent,
-1 on error
• exec() – lets a process execute another program
- the process calls fork() to create a copy of itself
- one of the copies (typically the child) calls exec() to replace itself
with the new program
• Six functions - execl, execv, execle, execve, execlp,
execvp. They differ in:
- whether the file to execute is specified by filename
or pathname
- whether the arguments to the program are listed one by one or passed through an array of
pointers
- whether the environment of the calling process is
passed or a new environment is specified
• #include <unistd.h>
int execl(const char *path, const char *arg0, ... /*, (char *)0 */);
int execv(const char *path, char *const argv[]);
int execle(const char *path, const char *arg0, ... /*, (char *)0, char *const envp[] */);
int execve(const char *path, char *const argv[], char *const envp[]);
int execlp(const char *file, const char *arg0, ... /*, (char *)0 */);
int execvp(const char *file, char *const argv[]);

• The arguments specified by a program with one of
the exec functions shall be passed on to the new
process image in the corresponding main()
arguments.
• The argument path points to a pathname that
identifies the new process image file.
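
A short sketch of the fork-then-exec pattern using execv(). The helper name run_and_wait and the exit code 127 for a failed exec are assumptions made for this example (127 follows the shell convention).

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at path with the given argv and
 * wait for it. Returns its exit status (0-255), or -1 if the child could
 * not be created or did not exit normally. */
int run_and_wait(const char *path, char *const argv[])
{
    int status;
    pid_t pid;

    assert(path != NULL && argv != NULL);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        execv(path, argv);      /* returns only if the exec failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The argv array must end with a null pointer, and argv[0] is by convention the program name.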
• sock_ntop
- takes a pointer to a socket address structure,
looks inside the structure, and calls the
appropriate function to return the presentation
format of the address.
• #include "unp.h"

• char *sock_ntop(const struct sockaddr *sockaddr, socklen_t addrlen);

• Returns: non-null pointer if OK, NULL on error

• sockaddr points to a socket address structure whose length
is addrlen.
• The function uses its own static buffer to hold the result
and a pointer to this buffer is the return value.
readn, writen, and readline Functions
• A read or write on a stream socket might input or output
fewer bytes than requested.
• The reason is that buffer limits might be reached for the
socket in the kernel.
• All that is required to input or output the remaining bytes is
for the caller to invoke the read or write function again.
• This scenario is always a possibility on a stream socket
with read, but is normally seen with write only if the socket
is nonblocking.
• #include "unp.h"
• ssize_t readn(int filedes, void *buff, size_t nbytes);
• ssize_t writen(int filedes, const void *buff, size_t nbytes);
• ssize_t readline(int filedes, void *buff, size_t maxlen);

• All return: number of bytes read or written, –1 on error

• Note that our readline function calls the system's read
function once for every byte of data.
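
The retry loop behind readn and writen can be sketched like this. These are not the unp.h implementations; the names read_fully and write_fully are deliberately different to mark them as an illustration of the idea of calling read or write again until the requested count is reached.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to n bytes, retrying after short reads.
 * Returns the number of bytes read (less than n only at end-of-file),
 * or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    assert(buf != NULL || n == 0);
    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: call read again */
            return -1;
        }
        if (r == 0)
            break;              /* end-of-file */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}

/* Write exactly n bytes, retrying after short writes.
 * Returns n on success, -1 on error. */
ssize_t write_fully(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    assert(buf != NULL || n == 0);
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0) {
            if (w < 0 && errno == EINTR)
                continue;       /* interrupted: call write again */
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```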
• #include <sys/socket.h>

• int getsockname(int socket, struct sockaddr *address,
socklen_t *address_len);
• The getsockname() function retrieves the locally-bound
name of the specified socket, stores this address in the
sockaddr structure pointed to by the address argument,
and stores the length of this address in the object pointed
to by the address_len argument.
• If the actual length of the address is greater than the
length of the supplied sockaddr structure, the stored
address will be truncated.
• If the socket has not been bound to a local name, the value
stored in the object pointed to by address is unspecified.
• #include <sys/socket.h>
• int getpeername(int socket, struct sockaddr *address,
socklen_t *address_len);

• The getpeername() function retrieves the peer address of
the specified socket, stores this address in the sockaddr
structure pointed to by the address argument, and stores
the length of this address in the object pointed to by the
address_len argument.
• If the actual length of the address is greater than the
length of the supplied sockaddr structure, the stored
address will be truncated.
• If the protocol permits connections by unbound clients, and
the peer is not bound, then the value stored in the object
pointed to by address is unspecified.
Example: Client
Programming

• Create stream socket (socket() )
• Connect to server (connect() )
• While still connected:
– send message to server (write() )
– receive (read() ) data from server
and process it
• Close TCP connection and socket
(close())
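
The first two client steps can be sketched as one function. The name tcp_connect is a hypothetical helper, and it accepts only a dotted-decimal IPv4 string rather than a host name, to keep the example short.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to ip:port over TCP.
 * Returns a connected descriptor, or -1 on error. */
int tcp_connect(const char *ip, uint16_t port)
{
    struct sockaddr_in server;
    int fd;

    assert(ip != NULL);
    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &server.sin_addr) != 1)
        return -1;                      /* not a valid IPv4 string */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    /* No bind(): the OS picks the local address and port. */
    if (connect(fd, (struct sockaddr *)&server, sizeof(server)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After a successful return the caller can use write() and read() on the descriptor and finish with close().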
Example: Simple Server Programming

• Create stream socket (socket() )
• Bind port to socket (bind() )
• Listen for new client (listen() )
• While the server is running:
– accept user connection and create a
new socket (accept() )
– data arrives from client (read() )
– data has to be sent to client (write() )
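Creating & Binding TCP Socket
int mysock;
struct sockaddr_in myaddr;

mysock = socket(PF_INET, SOCK_STREAM, 0);

memset(&myaddr, 0, sizeof(myaddr));
myaddr.sin_family = AF_INET;
myaddr.sin_port = htons(80);                 // ports below 1024 usually need root privileges
myaddr.sin_addr.s_addr = htonl(INADDR_ANY);  // sin_addr is a struct; set its s_addr member

bind(mysock, (struct sockaddr *) &myaddr, sizeof(myaddr));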
Reading from a TCP socket
ssize_t read(int fd, void *buf, size_t max);

• By default read() will block until
data is available.
Writing to a TCP socket

ssize_t write(int fd, const void *buf, size_t num);
UDP Sockets Programming
Client                                   Server
                                         (connectionless protocol)

                                         socket()
                                         bind()
socket()                                 recvfrom()
bind()                                     blocks until data received from a client
sendto()    --- data (request) ------>
                                         process request
recvfrom()  <-- data (reply) ---------   sendto()

Socket system calls for connectionless protocol


Typical UDP client code
• Create UDP socket.

• Create sockaddr with address of
server.

• Call sendto(), sending request to the
server. No call to bind() is necessary!

• Possibly call recvfrom() (if we need a
reply).
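
The client steps above can be sketched as a single send helper. The name udp_send_to is hypothetical, and the destination is given as a dotted-decimal IPv4 string for brevity.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send one datagram from an unbound UDP socket.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t udp_send_to(int fd, const char *ip, uint16_t port,
                    const void *msg, size_t len)
{
    struct sockaddr_in to;

    assert(ip != NULL && msg != NULL);
    memset(&to, 0, sizeof(to));
    to.sin_family = AF_INET;
    to.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &to.sin_addr) != 1)
        return -1;
    /* The OS assigns the local port on the first sendto(). */
    return sendto(fd, msg, len, 0, (struct sockaddr *)&to, sizeof(to));
}
```

A reply, if one is expected, can be read from the same descriptor with recvfrom().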
Typical UDP Server code
• Create UDP socket and bind to well
known address.

• Call recvfrom() to get a request,
noting the address of the client.

• Process request and send reply back
with sendto().
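Creating & Binding UDP Socket

int mysock;
struct sockaddr_in myaddr;
mysock = socket(PF_INET, SOCK_DGRAM, 0);

memset(&myaddr, 0, sizeof(myaddr));
myaddr.sin_family = AF_INET;
myaddr.sin_port = htons(1234);
myaddr.sin_addr.s_addr = htonl(INADDR_ANY);  // sin_addr is a struct; set its s_addr member

bind(mysock, (struct sockaddr *) &myaddr, sizeof(myaddr));  // bind() takes the generic type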


Sending UDP Datagrams
ssize_t sendto( int sockfd,
const void *buff,
size_t nbytes,
int flags,
const struct sockaddr* to,
socklen_t addrlen);
sockfd is a UDP socket
buff is the address of the data (nbytes
long)
to is the address of a sockaddr containing
the destination address.
Return value is the number of bytes sent,
or -1 on error.
Receiving UDP Datagrams
ssize_t recvfrom( int sockfd,
void *buff,
size_t nbytes,
int flags,
struct sockaddr* from,
socklen_t *fromaddrlen);
sockfd is a UDP socket
buff is the address of a buffer (nbytes
long)
from is the address of a sockaddr.
Return value is the number of bytes received
and put into buff, or -1 on error.
recvfrom()
• If buff is not large enough, any extra
data is lost ...

• recvfrom doesn't return until there is
a datagram available.

• The sockaddr at from is filled in with
the address of the sender.

• Set fromaddrlen before calling.


UDP Echo Server
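int mysock;
struct sockaddr_in myaddr, cliaddr;
char msgbuf[MAXLEN];
socklen_t clilen;
ssize_t msglen;

mysock = socket(PF_INET, SOCK_DGRAM, 0);
memset(&myaddr, 0, sizeof(myaddr));
myaddr.sin_family = AF_INET;
myaddr.sin_port = htons(S_PORT);
myaddr.sin_addr.s_addr = htonl(INADDR_ANY);
bind(mysock, (struct sockaddr *) &myaddr, sizeof(myaddr));
while (1) {
    clilen = sizeof(cliaddr);   // value-result argument: reset before every call
    msglen = recvfrom(mysock, msgbuf, MAXLEN, 0,
                      (struct sockaddr *) &cliaddr, &clilen);
    if (msglen < 0)
        continue;               // skip failed receives
    sendto(mysock, msgbuf, msglen, 0,
           (struct sockaddr *) &cliaddr, clilen);
}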
socket
#include <sys/types.h>
#include <sys/socket.h>
int socket(int domain, int type, int protocol)
domain is either AF_UNIX, AF_INET, or AF_OSI, or ..
AF_UNIX is the Unix domain, it is used for
communication within a single computer system.
[AF_LOCAL is the Posix name for AF_UNIX.]
AF_INET is for communication on the internet to
IPv4 addresses.
type is either SOCK_STREAM (TCP, connection
oriented, reliable), or SOCK_DGRAM (UDP,
datagram, unreliable), or SOCK_RAW (IP level).
protocol specifies the protocol used. It is usually 0 to
say we want to use the default protocol for the
chosen domain and type.

Returns, if successful, a socket descriptor which is an


int. It returns -1 in case of failure.
bind
#include <sys/types.h>
#include <sys/socket.h>
int bind(int sd, const struct sockaddr *addr, socklen_t addrlen)
sd: File descriptor of local socket, as created by
the socket function.
addr: Pointer to protocol address structure of this
socket (e.g. sockaddr_in or sockaddr_un)
addrlen: Length in bytes of structure referenced
by addr.
Returns an integer, the return code (0=success,
-1=failure)
connect
#include <sys/types.h>
#include <sys/socket.h>
int connect(int sd, const struct sockaddr *addr, socklen_t addrlen)
sd file descriptor of local socket
addr pointer to protocol address of other socket
(i.e. the one you want to connect to)
addrlen length in bytes of address structure.
Returns an integer (0=success, -1=failure)
listen
int listen(int fd, int qlen)
fd file descriptor of a socket that has
already been bound
qlen specifies the maximum number
of connection requests that can wait
to be processed by the server while
the server is busy servicing
another connection request.
Returns an integer (0=success,
-1=failure)
accept
#include <sys/types.h>
#include <sys/socket.h>
int accept(int fd, struct sockaddr *addressp, socklen_t *addrlen)
fd is an int, the file descriptor of the socket the server was
listening on [in fact it is called the listening socket],
i.e. on which the server has successfully completed
socket, bind, and listen.
addressp points to an address. It will be filled with address
of the calling client. We can use this address to
determine the IP address and port of the client.
addrlen points to an integer that must hold the size of the
buffer at addressp before the call; on return it contains
the actual length of the address structure of the client.
Returns an integer representing a new socket (-1 in case of
failure). It is the socket that the server will use from
now on to communicate with the client that requested
connection [in fact it is called the connected socket].
Different calls to accept will result in different
connected sockets.
Remember that the default behaviour for this function is to
block the calling process until a connection is actually
accepted (you can change that with fcntl)
sendto (for sending over UDP)
#include <sys/types.h>
#include <sys/socket.h>
ssize_t sendto(int sd, const void *buff, size_t len, int flags,
const struct sockaddr *addressp, socklen_t addrlen)
sd, socket file descriptor
buff, address of buffer with the information to be
sent
len, size of the message
flags, usually 0; could be used for priority
messages, etc.
addressp, address of process we are sending
message to
addrlen, length of the address structure at addressp
Returns number of characters sent. It is -1 in
case of failure.
recvfrom (recv for UDP)
#include <sys/types.h>
#include <sys/socket.h>
ssize_t recvfrom(int sd, void *buff, size_t len, int flags,
struct sockaddr *addressp, socklen_t *addrlen)
sd, socket file descriptor
buff, address of buffer where message will be
stored
len, size of buffer
flags, usually 0; used for priority messages,
peeking etc.
addressp, buffer that will receive
address of process that sent message
addrlen, contains size of addressp structure on
entry and the length of the stored address on return
Returns number of characters received. It is -1 in
case of failure.
send
ssize_t send(int sockfd, const void *msg, size_t len,
int flags)
sockfd is the socket descriptor you want
to send data to (whether it's the one
returned by socket() or the one you
got with accept().)
msg is a pointer to the data you want to
send. It can be any sort of structure.
len is the length of that data in bytes.
flags is usually left as 0; see the man
page for more info.
Returns the number of bytes actually sent,
or -1 on error.
recv
ssize_t recv(int sockfd, void *buf, size_t len, int flags)
sockfd is the socket descriptor to read from
buf is the buffer to read the information into
len is the maximum length of the buffer, and
flags can again be set to 0. (See the recv() man
page for flag information.)
Returns the number of bytes actually read into
the buffer, or -1 on error.
gethostbyname – info about the hostname (e.g.
"www.mcgill.ca")
#include <netdb.h>
struct hostent *gethostbyname(const char *name)
As you see, it returns a pointer to a struct hostent, the layout of
which is as follows:
struct hostent {
    char *h_name;
    char **h_aliases;
    int h_addrtype;
    int h_length;
    char **h_addr_list;
};
#define h_addr h_addr_list[0]
And here are the descriptions of the fields in the struct hostent:
h_name -- Official name of the host.
h_aliases -- A NULL-terminated array of alternate names for the
host.
h_addrtype -- The type of address being returned; usually
AF_INET.
h_length -- The length of the address in bytes.
h_addr_list -- A zero-terminated array of network addresses for
the host. Host addresses are in Network Byte Order.
h_addr -- The first address in h_addr_list.
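
A minimal sketch of reading the first address out of a struct hostent follows. The helper name first_ipv4_of is an assumption for this example. Note that gethostbyname is considered obsolete on current systems (getaddrinfo is the usual replacement) and returns a pointer to static data, so this is shown only to illustrate the fields described above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: look up host and write its first IPv4 address
 * into out as dotted-decimal text. Returns 0 on success, -1 on failure. */
int first_ipv4_of(const char *host, char *out, socklen_t outlen)
{
    struct hostent *he;
    struct in_addr addr;

    assert(host != NULL && out != NULL);
    he = gethostbyname(host);
    if (he == NULL || he->h_addrtype != AF_INET ||
        he->h_length != (int)sizeof(addr) || he->h_addr_list[0] == NULL)
        return -1;
    /* h_addr_list entries are raw bytes in network byte order. */
    memcpy(&addr, he->h_addr_list[0], sizeof(addr));
    if (inet_ntop(AF_INET, &addr, out, outlen) == NULL)
        return -1;
    return 0;
}
```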
select
#include <sys/select.h>
#include <sys/time.h>

int select(int maxfdp1, fd_set * readset, fd_set * writeset,
fd_set * exceptset, struct timeval *timeout)

maxfdp1 the largest fd from the three fd_sets plus 1

readset set of fd’s for sockets you are waiting to read (so, e.g.
accept() or recv())

writeset set of fd’s for sockets you are waiting to write to (e.g.
you want to send())

exceptset set of fd’s you’re looking for an exception from


timeout select will block until either this amount of time has
elapsed, or one or more sockets from any of the three fd_sets
are ready, whichever comes first. To wait forever, timeout
should be a NULL pointer, to not block at all, timeout should
contain 0.
Returns the number of fd’s which are ready, from readset,
writeset, and exceptset
select utilities
void FD_ZERO(fd_set * fdset)
Set fdset to be the empty set – you should always start by
initializing your fd_set to empty

void FD_SET(int fd, fd_set * fdset)
Add file descriptor fd to fdset

void FD_CLR(int fd, fd_set * fdset)
Remove fd from fdset

int FD_ISSET(int fd, fd_set * fdset)
True iff select found that fd was ready (whether it was for reading,
writing, or an exception)

FD_SETSIZE
An fd_set can only hold descriptors numbered below FD_SETSIZE. The
value is system-dependent (256 on some older systems, commonly 1024
on Linux). This shouldn't be an issue right now, I just mention it for
completeness.
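
A small sketch of select() with a timeout on a single descriptor, using the macros above. The helper name wait_readable and the millisecond timeout parameter are choices made for this example.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readset;
    struct timeval tv;
    int ready;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readset);                  /* always start from the empty set */
    FD_SET(fd, &readset);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument is the highest descriptor in any set, plus 1. */
    ready = select(fd + 1, &readset, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;                   /* 0 = timeout, -1 = error */
    return FD_ISSET(fd, &readset) ? 1 : 0;
}
```

Because select() modifies the sets (and on some systems the timeout), both are rebuilt on every call.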
fcntl
#include <fcntl.h>
int fcntl(int fd, int cmd, … /* int arg */)
fd identifies the socket you wish to alter
cmd the command to execute. F_GETFL will cause fcntl to
return an integer which contains all flags for the socket.
In this case pass 0 for arg.
F_SETFL will cause the socket’s state to be set according to
the flag passed as arg.
(There are other options for this. See the man page)
arg … see above
For F_GETFL, returns the flags for socket fd.
For F_SETFL, returns >= 0 for success, else < 0

e.g.
flags = fcntl(fd, F_GETFL, 0)
flags |= O_NONBLOCK (to set the socket to non-blocking)
or flags &= ~O_NONBLOCK (to set the socket back to
blocking)
fcntl(fd, F_SETFL, flags)
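
The get-modify-set sequence above, wrapped in a function with error checks. The name set_nonblocking is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn O_NONBLOCK on (enable != 0) or off (enable == 0)
 * while preserving the descriptor's other flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd, int enable)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    if (enable)
        flags |= O_NONBLOCK;
    else
        flags &= ~O_NONBLOCK;
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

On a non-blocking descriptor, a read() with no data available returns -1 with errno set to EAGAIN or EWOULDBLOCK instead of blocking.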
Thank you
