7.4.2.5 Forking A Simple Server

In the previous section, we forked the client so that it could both read from and write to the same socket: one forked process reads from the socket and the other writes to it. This makes for a versatile client.

However, the servers we have seen so far have problems. A major one is that the server can respond to only one client at a time. Several clients can try to connect to the server at the same time, and they are able to make TCP connections, as we saw in the earlier example. But after making the TCP connection, all but one of the clients hang, waiting in a queue to be serviced. The size of this waiting queue is given by the Listen parameter when the server socket is created. As one client finishes being serviced, the next client in the queue moves to the front and gets serviced. If more clients try to connect than the queue can hold, the additional connection attempts fail; they are refused or, on some systems, ignored until the client times out.

This is clearly a problem. A real-world server must be able to handle several clients at the same time. For example, we expect a Web server such as the Apache Web server to serve many clients at once. If a single client takes 100% of the server’s time, the server is of little practical use. We therefore need a way for the server to handle many clients. This is feasible because servers frequently just wait, doing nothing, while a client is idle. The time a server would otherwise waste can be used to service additional clients.

In this section, we will discuss how we can fork a server such that the parent works like the main receptionist at an organization. As soon as a new request from a client arrives, the parent process forks a child process and hands over the responsibility of handling the client’s requirements to this child. The parent process immediately goes back to waiting for new client requests.

Thus, the arrangement is pretty simple. To service n clients, there will be n+1 processes in the system. One process is the parent, and each of the remaining n processes handles one client. When a client’s requests are handled completely and the client closes the connection, the corresponding child process is finished; it dies and is ultimately removed from the list of processes that the operating system maintains. There are some details associated with removing dead child processes. We have discussed some of this before in the section on forking. We will discuss it again a little later.

First, here is the client program. The client program is a slight modification of the simpleSend.pl program discussed earlier. It is called simpleSend1.pl and is given below.

 Program 7.33

#!/usr/bin/perl
#simpleSend1.pl
#works with simpleReceive.pl or simpleForkedReceive.pl

use IO::Socket::INET;
$sock = new IO::Socket::INET (PeerAddr => 'pikespeak',
         PeerPort => 1200,
         Proto => 'tcp');

die "Couldn't connect: $!" unless $sock;

foreach (1..10){
    print $sock "Repeating $_ time(s): How are you?\n";
    sleep 1; 
}

close ($sock);

It is exactly the same program as before except that, inside the foreach loop, the following instruction has been inserted to slow the program down.

    sleep 1;

It makes the client program sleep one second between iterations of the loop. This is done so that the client’s interaction with the server lasts long enough to be easily observed on the terminal of the server.

The server program is given below. It is also a modification of the server program simpleReceive.pl we have seen before. It is now called simpleForkedReceive.pl.

 Program 7.34

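#!/usr/bin/perl
#simpleForkedReceive.pl
use IO::Socket;

#reap a dead child whenever the kernel reports one
$SIG{CHLD} = sub {wait ();};

$sock = new IO::Socket::INET (LocalHost => 'pikespeak.uccs.edu',
         LocalPort => 1200,
         Proto => 'tcp',
         Listen => 5,
         Reuse => 1);
die "Couldn't create listening socket: $!" unless $sock;

while ($newSock = $sock->accept()) {
    $processID = fork ();
    die "Couldn't fork: $!" unless defined $processID;
    next if ($processID > 0);

    #child process
    close ($sock);    #the child does not need the listening socket
    while (defined ($buf = <$newSock>)) {
          print $buf;
    }
    close ($newSock);
    exit 0;    #without this, the child would loop back and call accept() itself
}
close ($sock);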

There are two aspects to forking the server. The first is how the parent and the child coordinate their tasks. The second is how the child is disposed of once the child has finished its task.

First, let us look at the main while loop in the server program. The conditional of the while loop is a call to accept(). accept blocks till there is a client to respond to. When a client requests service, a new socket called $newSock is created to serve the request.

Immediately inside the main while loop, a call is made to fork(). Thus, a child process is produced at this line if forking is successful, and the lines of code following the fork are run by two processes: the parent and the child. As we know from before, fork returns the positive process ID of the child to the parent, and 0 to the child itself. If the fork fails, it returns undef.
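
The return values of fork can be seen in isolation in the following minimal sketch. It is not one of the numbered programs; the subroutine name fork_once and the exit status 7 are made up for the illustration. The child exits at once, and the parent reaps it with waitpid and reads the child's exit status from $?.

```perl
#!/usr/bin/perl
# Illustrative sketch only: shows what fork returns to each process.
use strict;
use warnings;

# Fork one child. The child exits inside this subroutine;
# only the parent returns from it.
sub fork_once {
    my $pid = fork();
    die "Couldn't fork: $!" unless defined $pid;   # undef means fork failed

    if ($pid == 0) {
        # child: fork returned 0
        exit 7;        # hypothetical status, chosen only for this demonstration
    }

    # parent: fork returned the process ID of the child
    waitpid($pid, 0);              # reap this specific child
    my $status = $? >> 8;          # high byte of $? holds the exit status
    return ($pid, $status);
}

my ($childPID, $childStatus) = fork_once();
print "parent $$ forked child $childPID, which exited with status $childStatus\n";
```

The server above uses the same test in a more compact form: the parent, which sees a positive value, skips to the next iteration, and everything after that line is run by the child alone.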

Immediately after forking, we have the following statement.

    next if ($processID > 0);

This causes the parent process to go to the next iteration of the main while loop. In other words, the parent process does not do anything at all except evaluate the conditional of the main while loop and fork when needed.

The child processes are the workhorses of the program. Each child process is dedicated to servicing one client. The child process simply reads the socket and prints its contents onto the standard output.

We can start the server on its machine (pikespeak.uccs.edu in the program above) and then start two clients on two other machines. Because there are two child processes handling the two clients, the output printed on the server terminal is interleaved, unlike in the previous client-server pair, where one client owned the services of the server completely till it was fully serviced. The output at the server terminal is shown below.

 

Repeating 1 time(s): How are you?
Repeating 2 time(s): How are you?
Repeating 3 time(s): How are you?
Repeating 4 time(s): How are you?
Repeating 5 time(s): How are you?
Repeating 1 time(s): How are you?
Repeating 6 time(s): How are you?
Repeating 2 time(s): How are you?
Repeating 7 time(s): How are you?
Repeating 3 time(s): How are you?
Repeating 8 time(s): How are you?
Repeating 4 time(s): How are you?
Repeating 9 time(s): How are you?
Repeating 5 time(s): How are you?
Repeating 10 time(s): How are you?
Repeating 6 time(s): How are you?
Repeating 7 time(s): How are you?
Repeating 8 time(s): How are you?
Repeating 9 time(s): How are you?
Repeating 10 time(s): How are you?

 

Now, there is still the issue of disposing of child processes properly. Every child process should die off after the corresponding client is serviced. The child is created inside the main while loop of the simple forked server, so it does not execute the main while loop in its entirety; it executes only the portion of the code inside the main while loop after its creation. Thus, it executes the second, inner while loop, and once this inner loop is finished, the child’s work is done. Note that the child dies at this point only if it calls exit after closing $newSock; a child that does not exit returns to the top of the main while loop and calls accept() itself, competing with the parent for new clients.

A child process usually dies before the parent that creates it. The operating system maintains a list of the live (stopped, sleeping, or running) processes in the system at any time. If a child process dies before its parent, the ID of the child process remains in the process table maintained by the operating system even after death. Normally, the parent or the kernel takes care of removing dead processes from the process table. If the process table is not cleaned up, the number of dead processes whose parents are alive, called zombie processes, could become significantly large. This is because servers on a Unix machine, such as Web servers, sometimes run for months, creating possibly millions of child processes. In the case of a server, the parent usually lives on as the children die. If the number of recorded processes, including zombies, becomes large, the operating system may prohibit the creation of new child processes. Even if the creation of new processes is not hampered, having too many zombie processes in the process table is a nuisance. This is of no concern in our simple example, but in real life it could happen with a complex, long-running server program.

Thus, there is a potential problem with the deaths of child processes created by long-running server processes. The solution is to use the fact that, in Unix, the operating system allows a process that creates a child to inquire about the child’s status, in particular about its death. This is one of the very few ways a parent can relate to the children it creates. Otherwise, the relationship between a parent and a child is quite tenuous. There are two functions, wait and waitpid, that a parent process can use to wait till a child dies. wait or waitpid blocks the parent till a child (a child with a specific process ID in the case of waitpid) dies; the call then removes the dead child’s entry from the process table and returns control to the parent.

More precisely, wait or waitpid suspends execution of the calling process until a child (a specific child for waitpid) has exited, or until a signal is delivered whose action is to terminate the calling process or to call a signal-handling function. If a child (for waitpid, the specific child requested) has already exited by the time of the call, the function returns immediately. In both cases, the resources used by the child are freed. This is called reaping a child. The parent reaps a child by using wait or waitpid.
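
As a small illustration of reaping, the following sketch forks three children that exit immediately and then calls wait repeatedly until no children are left. It is a sketch written for this discussion, not part of the server; the hash names %expected and %reaped and the exit statuses 1 to 3 are chosen only for the example. wait returns the process ID of the child it reaped, or -1 when there are no children left, and leaves the child's status in $?.

```perl
#!/usr/bin/perl
# Illustrative sketch only: reap several children with wait.
use strict;
use warnings;

my %expected;    # child PID => exit status we told that child to use
foreach my $n (1..3) {
    my $pid = fork();
    die "Couldn't fork: $!" unless defined $pid;
    if ($pid == 0) {
        exit $n;                 # child: exit right away with status $n
    }
    $expected{$pid} = $n;        # parent: remember what to expect
}

my %reaped;      # child PID => exit status actually collected
while ((my $pid = wait()) > 0) { # -1 means there are no children left
    $reaped{$pid} = $? >> 8;     # high byte of $? holds the exit status
}

print "reaped ", scalar(keys %reaped), " children\n";
```

Here the parent deliberately blocks in wait because it has nothing else to do. A server cannot afford to block this way, which is why the discussion that follows turns to signals.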

To handle the problem of zombie child processes, we use signals. A process can send and receive signals. A signal can be generated intentionally by a user from a keyboard sequence like CONTROL-C and sent to a process, can be sent to a process by another process, or can be sent automatically by the kernel when special events take place, such as a process running out of activation stack space or a file hitting its size limit. One signal a process can receive is called the CHLD signal. This signal is sent by the kernel of the operating system to a parent process when a child process dies or is stopped.

To trap a signal that comes to a process, the process has to set up a handler. The handler should be small and do as little as possible, so that it is efficient and does not cause problems itself. Handlers run very close to the functioning of the operating system and should be written with caution. If a process does not trap a certain signal coming to it, the signal’s default action is taken, and the process has no control over what the signal does.

To trap or handle a signal, a Perl program uses the special built-in hash (associative array) called %SIG and stores a value in it for each signal it wants to handle. The signal names are the keys of the hash. The values are usually references to subroutines or names of subroutines; these values are the handlers. When a signal arrives for which %SIG holds a handler, the corresponding handler routine is called to take care of it.
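
The following sketch shows a handler being installed and triggered. It uses the USR1 signal, which is not used elsewhere in this section, simply because a process can safely send it to itself; the counter $caught and the retry bound are assumptions of the example. The handler does the minimum possible work: it increments a counter.

```perl
#!/usr/bin/perl
# Illustrative sketch only: install a handler in %SIG and trigger it.
use strict;
use warnings;

my $caught = 0;
$SIG{USR1} = sub { $caught++; };    # key is the signal name, value is the handler

kill 'USR1', $$;                    # send USR1 to this very process

# Give the handler a moment to run, without waiting forever.
my $tries = 0;
select(undef, undef, undef, 0.05) while !$caught && $tries++ < 100;

print "caught USR1 $caught time(s)\n";

$SIG{USR1} = 'DEFAULT';             # restore the default action for USR1
```

Besides a subroutine, the special strings 'DEFAULT' and 'IGNORE' can be stored in %SIG to restore the default action for a signal or to ignore it.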

In our program, we do not want the parent process to block and wait for the death of its children, because a server that is supposed to handle many clients over time must remain free to accept new connections. However, we want the parent process to help clean up the process table maintained by the operating system. The handler for the CHLD signal is set in the parent process, toward the beginning of the program and before any children are forked, with the following line of code.

    $SIG{CHLD} = sub {wait ();};

The subroutine simply contains a call to wait. When a child process dies, the kernel sends the parent a CHLD signal. The handler for the CHLD signal calls wait to clean up the process table. Because wait is called only after a child has died, there is no real waiting on the part of the parent for a child to die. Thus, the effect of wait is to clear up the process table after a child dies. The delay due to the use of wait is minimal.

However, there is a subtle potential problem in using wait in the handler for the CHLD signal: each run of the handler calls wait once, and wait reaps only one child, whichever dead child it finds first. If several child processes die at nearly the same time, the Unix operating system may combine their CHLD signals into one before delivering it to the parent process. This is because ordinary Unix signals are not queued; while a signal of a particular type is pending, further signals of the same type are merged with it. So, even if two children are dead, only one CHLD signal may be delivered, and only one dead child reaped. This is sometimes called a zombie leak. Thus, it is possible that we have many dead or zombie processes in the process table after a server has lived and worked for a while. A subsequent call to wait simply looks to see if there is a dead child, any one. If it finds one, it cleans up after it and then returns. If there are many zombie processes at a certain time, wait cleans up after only one, leaving the others in the process table.
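
One common way around the zombie leak is to loop inside the handler, calling waitpid with the WNOHANG option until no more dead children are found. The sketch below illustrates the idea under stated assumptions: the subroutine name reaper, the counter $reaped, and the count of five children are invented for the example, and this handler is not the one used in Program 7.34. With WNOHANG, waitpid returns at once instead of blocking when no child has exited.

```perl
#!/usr/bin/perl
# Illustrative sketch only: reap every dead child on each CHLD signal.
use strict;
use warnings;
use POSIX ":sys_wait_h";      # provides the WNOHANG constant

my $reaped = 0;

sub reaper {
    local ($!, $?);           # do not disturb the main program's $! and $?
    # -1 means "any child"; WNOHANG means "do not block".
    # waitpid returns a positive PID for each dead child it reaps.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped++;
    }
}
$SIG{CHLD} = \&reaper;

my $children = 5;
foreach (1..$children) {
    my $pid = fork();
    die "Couldn't fork: $!" unless defined $pid;
    exit 0 if $pid == 0;      # child: die immediately
}

# Parent: idle briefly until all children are reaped (bounded wait).
my $tries = 0;
while ($reaped < $children && $tries++ < 100) {
    select(undef, undef, undef, 0.1);
}
print "reaped $reaped of $children children\n";
```

Because the loop keeps going until waitpid finds nothing more to reap, it does not matter how many CHLD signals were merged into one. A handler of this kind also never blocks, so a CHLD signal caused by a stopped child does no harm. One further point to keep in mind in a server is that a blocking call such as accept() may return early with a failure when a signal handler runs; checking for the EINTR error and retrying is one way to deal with that.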

Problems also arise when a child is stopped or restarted using a signal. Stopping a child causes the operating system to send a CHLD signal to the parent although no child has actually died or exited. The handler then calls wait, which blocks until some child really does exit. This can make the whole server hang, waiting for nothing.

There are ways to solve these problems, typically built on waitpid with the WNOHANG option, but we do not discuss them in detail here.

In summary, there is a whole list of signals that can be created and received in a Unix system. In this chapter, we have already seen the kill command that sends a signal to a process. The name kill is, in some sense, a misnomer, because kill can send other signals as well. We have also seen how specific signals such as CHLD can be trapped and handled by a process.
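
As a closing illustration, Perl's built-in kill function can be used from a program in the same way the kill command is used from the shell. In the sketch below, which is written only for this summary, the parent forks a child that would otherwise sleep for a minute, checks with signal 0 that the child exists, sends it the TERM signal, and reaps it. The variable names and the 60-second sleep are assumptions of the example.

```perl
#!/usr/bin/perl
# Illustrative sketch only: send signals to a child with kill.
use strict;
use warnings;
use POSIX qw(SIGTERM);        # numeric value of the TERM signal

my $pid = fork();
die "Couldn't fork: $!" unless defined $pid;
if ($pid == 0) {
    sleep 60;                 # child: idle until it is signaled
    exit 0;
}

my $alive = kill 0, $pid;     # signal 0 sends nothing; it only checks the process
kill 'TERM', $pid;            # ask the child to terminate
waitpid($pid, 0);             # reap the child so it does not become a zombie
my $signal = $? & 127;        # low seven bits of $? hold the terminating signal

print "child alive before TERM: $alive; terminated by signal $signal\n";
```

The child is ended by the signal rather than by its own call to exit, which is why the signal number, not an exit status, is read from $?.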