Linux Applications Performance: Part II: Forking Servers

This chapter is part of a series of articles on Linux application performance.

In Part I: Iterative servers, we looked at a server that deals with one client request at a time. It called accept() only after it had finished serving a client, so connections were processed strictly one after the other. We know the problem with this approach: if there are many concurrent client connections, they all sit in a queue, waiting for the server to call accept(). In this “forking” server, we create a new child process every time we accept a client connection. The child deals with the client request, while the parent immediately goes back to blocking on accept(). So even while several requests are being served by child processes, the parent is always ready to accept more connections, and clients no longer have to wait while the server is busy with requests from other clients.

While this approach seems great and it looks like we’re all set for the rest of our lives as legendary Linux programmers, we will see why performance takes a big hit with this server architecture and how to fix it. In the remaining parts of this article series, we will explore more server architectures and measure their performance to understand how each is better or worse than the others.

void enter_server_loop(int server_socket)
{
    struct sockaddr_in client_addr;

    socklen_t client_addr_len = sizeof(client_addr);
    while (1)
    {
        int client_socket = accept(
                server_socket,
                (struct sockaddr *)&client_addr,
                &client_addr_len);
        if (client_socket == -1)
            fatal_error("accept()");

        /* fork() returns 0 in the child and the child's PID in the parent */
        pid_t pid = fork();
        if (pid == 0) {
            /* child process: serve this one client, then exit */
            signal(SIGINT, SIG_DFL);
            connect_to_redis_server(redis_host_ip);
            handle_client(client_socket);
            close(client_socket);
            exit(0);
        }
        else {
            /* parent process (also reached with pid == -1 if fork() failed):
               the parent is done with this socket either way */
            close(client_socket);
        }
    }
}

The main change in the forking server compared to the iterative server is in the enter_server_loop() function, and the changes are astonishingly simple. That is because of the simplicity of the fork() system call. The other changes relate to zombie process handling, which we describe in a later section.

I don’t think it would be a great idea to bring into scope how fork() works in this article series. That’s for another article by itself. It is enough to say for now that fork() creates an exact duplicate of the process, and both the parent and the child continue executing from the statement right after fork(). Everything is duplicated, including open files and sockets. While fork() returns the PID, or process ID, of the new child in the parent, in the child it returns 0. With this simple but crucial difference, you conditionally do what you need to do in the parent and in the child. In this example, we check the return value of fork(): if it is not zero, the program is executing in the parent process and we close the client socket. Remember, since everything is duplicated for the child, the child process also has the client socket. If the return value is zero, we are in the child, so we call handle_client(), passing in client_socket. You can also see that in the child process, once handle_client() returns, we close the client socket and exit, which ends the child process.
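To make the two return values of fork() concrete, here is a minimal sketch that is not part of ZeroHTTPd. The helper name fork_return_values_agree() is made up for illustration. The child sends its own PID to the parent over a pipe created before the fork, which also shows that open file descriptors are duplicated into the child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if the PID the child reports over an inherited pipe equals the
 * value fork() returned in the parent, 0 if it differs, -1 on error. */
int fork_return_values_agree(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: fork() returned 0. The pipe was duplicated along with
         * everything else, so the child can write to it. */
        pid_t self = getpid();
        close(fds[0]);
        ssize_t written = write(fds[1], &self, sizeof(self));
        _exit(written == (ssize_t)sizeof(self) ? 0 : 1);
    }

    /* Parent: fork() returned the child's PID. */
    close(fds[1]);
    pid_t reported = 0;
    ssize_t got = read(fds[0], &reported, sizeof(reported));
    close(fds[0]);
    waitpid(pid, NULL, 0);      /* reap the child so no zombie is left */

    if (got != (ssize_t)sizeof(reported))
        return -1;
    return reported == pid;
}
```

Calling fork_return_values_agree() from main() should return 1: the number the parent got back from fork() is the same PID the child sees when it calls getpid().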

To summarize: earlier we called handle_client() directly after accept(). Now we call fork(), check that we are in the child process, and call handle_client() there. In the parent, we close client_socket and go back to blocking on accept(), which means new clients don’t have to wait for their request to be accepted and served by our server.

Pro Tip #1

Technically, in Unix, fork() is a system call. Under Linux, however, when you make the call from your programs, glibc, the main C library most Linux distributions ship with, actually makes the clone() system call. In Linux, the clone() system call is the Swiss army knife of process creation. Under the hood, it is the same system call that is used to create threads.

Zombies!

The model where parent processes get real work done with the help of child processes or threads is pretty much the way almost everything works in Linux and in almost all other operating systems. In most cases, the parent also needs to know whether a child successfully did what it was supposed to do. To this end, a child process can pass a single integer value back to the parent. This is the int you see returned from the main() function of Linux programs. The convention is that a process that returns 0 succeeded, while any other value indicates that things didn’t go all that well. Parent processes collect this return value explicitly, and other things like process accounting (CPU and other resource usage) implicitly, when they call any one of the wait() family of system calls. When the children of a process die, Linux keeps them around in a weird state called the “zombie” state until the parent process explicitly calls wait(). Calling wait() is also known as reaping children: once wait() is called, the operating system can give the parent process the child’s return value and process accounting information and remove the child’s entry from the process table, which has a limited number of slots.
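As a rough illustration of how a parent collects a child’s return value, here is a small sketch. The helper name run_in_child() is made up for this example and is not part of ZeroHTTPd. The child exits with a chosen value, and the parent retrieves it with waitpid() and the WIFEXITED()/WEXITSTATUS() macros.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Forks a child that exits immediately with child_code and returns the exit
 * status the parent collected, or -1 on error. */
int run_in_child(int child_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;              /* fork() failed: no child was created */

    if (pid == 0)
        _exit(child_code);      /* child: hand this value back to the parent */

    /* Parent: block until this particular child has exited, then reap it. */
    int wstatus;
    if (waitpid(pid, &wstatus, 0) == -1)
        return -1;

    /* WIFEXITED() is true if the child ended by returning or calling exit();
     * WEXITSTATUS() then extracts the low 8 bits of its exit value. */
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

For example, run_in_child(0) reports success and run_in_child(42) reports 42. Since only the low 8 bits of the value survive, exit codes are best kept in the 0 to 255 range.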

Reaping children

Calling wait() in the parent avoids zombies, but wait() can block, and remember, we need to be blocked on accept() as much as possible, not on wait(). If we are blocked on wait(), we won’t be able to quickly accept and process client connections. Thankfully, Linux gives us a way to reap children without blocking. When a child process terminates, the operating system sends its parent process a SIGCHLD signal. The signal is sent to the parent even when you are not handling it; nothing happens, since SIGCHLD is ignored by default. You could set up a signal handler for this signal and call wait() in the handler. This collects the child’s return status, clears its process table entry and solves our problem. It does, however, introduce another problem. When the signal handler is done running, our program, which was most likely blocked on accept(), finds that accept() has returned -1 with the global errno set to EINTR, which tells us that accept() was interrupted by a signal. Luckily for us, you can ask the operating system to automatically restart the accept() system call if it is interrupted by the need to run a signal handler.

Apart from the changes to enter_server_loop(), the other main change is to the main() function, where we install the signal handler with the SA_RESTART flag, which tells the operating system to restart any interrupted system calls.

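    /* set up the SIGCHLD handler with SA_RESTART */
    struct sigaction sa;
    sa.sa_handler = sigchld_handler;
    sa.sa_flags = SA_RESTART;       /* restart interrupted system calls such as accept() */
    sigfillset(&sa.sa_mask);        /* block all other signals while the handler runs */
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        fatal_error("sigaction()");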

And here is the signal handler itself:

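void sigchld_handler(int signo) {
    (void)signo;
    /* waitpid() may change errno; preserve it for the interrupted code (needs <errno.h>) */
    int saved_errno = errno;

    /* Reap every child that has already exited so that none is left as a zombie */
    while (1) {
        int wstatus;
        pid_t pid = waitpid(-1, &wstatus, WNOHANG);
        if (pid == 0 || pid == -1) {
            /* 0: children exist but none has exited yet; -1: no children left (or error) */
            break;
        }
        child_processes++;
    }
    errno = saved_errno;
}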

The signal handler defined in sigchld_handler() is quite simple, really. Remember that earlier we did not want to call wait() since we would block? It turns out that Linux has a version of the wait() system call named waitpid() that allows a parent process to wait for a child with a particular PID. It can take an option, WNOHANG, which tells waitpid() not to wait if there are no children available to reap, which means it returns immediately. waitpid() also allows us to pass -1 for the process ID, meaning that it will wait for any child, not just a child with a particular process ID.

But why are we calling waitpid() in a loop? Because standard signals are not queued: if several children terminate before the handler gets a chance to run, the SIGCHLD handler can be called just once for all of them. So we call waitpid() in a loop, reaping as many children as possible.

Pro Tip #2

Another way to solve the problem of accept() being interrupted by signals is to wrap the call to accept() in a while loop where we check if errno is EINTR and call accept() again. This should be a good exercise for a budding Linux programmer. 250BPM has a good article on EINTR which makes for some very good reading. Remember that not all system calls can be automatically restarted after interruption by a signal. See the section “Interruption of system calls and library functions by signal handlers” in the signal(7) man page for a list of system calls that can be automatically restarted. Since the accept() call is in the list, we were able to get away with SA_RESTART. If you are stuck with system calls that do not support SA_RESTART, however, the technique described here should come in handy.
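If you want to check your answer to that exercise, one possible shape for the wrapper is sketched below. The name accept_retry() is made up for this example. It simply calls accept() again whenever the failure was caused by a signal, and passes every other result through unchanged.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper, for illustration only.
 * Behaves like accept(), except that it retries when the call was
 * interrupted by a signal handler (errno == EINTR). */
int accept_retry(int server_socket, struct sockaddr *addr, socklen_t *addr_len)
{
    /* accept() overwrites *addr_len, so remember the caller's buffer size. */
    socklen_t initial_len = (addr_len != NULL) ? *addr_len : 0;

    for (;;) {
        if (addr_len != NULL)
            *addr_len = initial_len;

        int client_socket = accept(server_socket, addr, addr_len);

        /* Success, or a real error: hand the result back to the caller. */
        if (client_socket != -1 || errno != EINTR)
            return client_socket;

        /* EINTR: a signal handler ran while we were blocked. Try again. */
    }
}
```

In enter_server_loop(), the call to accept() could be swapped for accept_retry() with the same arguments. Real errors still come back as -1 with errno set, so the existing fatal_error() check keeps working.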

Observing Zombies

It is a good exercise to observe zombies. You can do this by commenting out the sigaction() system call in main(), recompiling and running. Once you’ve served some requests, do a ps aux in another terminal while ZeroHTTPd is running and you’ll see some zombie processes. In ps’s state column, you can see the value “Z+” indicating that they are all zombies. ps also encloses the process name in brackets.

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
shuveb   12252  0.0  0.0   4516   796 pts/0    S+   09:43   0:00 forking
shuveb   12322  0.0  0.0      0     0 pts/0    Z+   09:44   0:00 [forking]
shuveb   12361  0.0  0.0      0     0 pts/0    Z+   09:44   0:00 [forking]
shuveb   12364  0.0  0.0      0     0 pts/0    Z+   09:44   0:00 [forking]
shuveb   12471  0.0  0.0      0     0 pts/0    Z+   10:01   0:00 [forking]
shuveb   12472  0.0  0.0      0     0 pts/0    Z+   10:01   0:00 [forking]

Pro Tip #3

Under Unix-like operating systems, a signal that the process does not handle is either ignored by default or terminates the process. We saw earlier that SIGCHLD is ignored by default unless the process sets up a signal handler for it. SIGINT, on the other hand, the signal the foreground process receives when you hit ctrl+c on your keyboard, kills the process by default. Your programs can either handle it or set it to be ignored. If you set it to be ignored, as in the following code snippet, ctrl+c will no longer kill the foreground process:

signal(SIGINT, SIG_IGN);

It is worth mentioning that there are some signals, like SIGKILL, that cannot be ignored. The main point of this tip is this: although the SIGCHLD signal is ignored by default, explicitly setting it to be ignored tells Linux to reap child processes automatically on the parent’s behalf, without the parent having to call any of the wait() family of system calls. If you are not interested in collecting the return value or gathering resource usage information of the children, this is an easy way out. You don’t have to worry about restarting system calls and such.

signal(SIGCHLD, SIG_IGN);

Please be aware that explicitly telling the operating system to ignore SIGCHLD, although it works on Linux by automatically reaping children, is not very portable across Unix-like operating systems. See this excerpt from the sigaction man page:

POSIX.1-1990 disallowed setting the action for SIGCHLD to SIG_IGN. POSIX.1-2001 and later allow this possibility, so that ignoring SIGCHLD can be used to prevent the creation of zombies (see wait(2)). Nevertheless, the historical BSD and System V behaviors for ignoring SIGCHLD differ, so that the only completely portable method of ensuring that terminated children do not become zombies is to catch the SIGCHLD signal and perform a wait(2) or similar.
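Here is a small sketch of what this automatic reaping looks like from the parent’s side on Linux. The helper name child_is_auto_reaped() is made up for illustration. With SIGCHLD explicitly ignored, there is nothing left for waitpid() to collect: it waits until the child is gone and then fails with ECHILD.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only (Linux behavior assumed).
 * Returns 1 if a child that exited while SIGCHLD was set to SIG_IGN left
 * nothing to wait for, 0 if it could still be waited for, -1 on error. */
int child_is_auto_reaped(void)
{
    /* signal() returns the previous disposition so we can restore it. */
    void (*old_handler)(int) = signal(SIGCHLD, SIG_IGN);
    if (old_handler == SIG_ERR)
        return -1;

    int result = -1;
    pid_t pid = fork();
    if (pid == 0)
        _exit(7);               /* child: exit with a status nobody will see */

    if (pid > 0) {
        /* The kernel discards the child's status as soon as it exits, so
         * waitpid() blocks until the child is gone and then fails. */
        int wstatus;
        pid_t r = waitpid(pid, &wstatus, 0);
        result = (r == -1 && errno == ECHILD);
    }

    signal(SIGCHLD, old_handler);
    return result;
}
```

The child’s exit value of 7 is lost here. That is the trade-off the tip describes: no zombies and no handler, but also no return value and no resource usage information.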

Pro Tip #4

Signals are a real pain to deal with and in the case of reaping children, signals are pretty much the only way to tackle this, especially if you are doing something in the parent, like in our case, accepting new client connections with accept(). Linux has an interesting system call signalfd(), which makes life slightly better. You should read more about it. It lets you handle signals in a more synchronous fashion.

Forking Server Performance

Let’s take a look at our performance numbers table.

requests/second

concurrency  iterative  forking          preforked        threaded         prethreaded  poll             epoll
20           7          112              2,100            1,800            2,250        1,900            2,050
50           7          190              2,200            1,700            2,200        2,000            2,000
100          7          245              2,200            1,700            2,200        2,150            2,100
200          7          330              2,300            1,750            2,300        2,200            2,100
300          -          380              2,200            1,800            2,400        2,250            2,150
400          -          410              2,200            1,750            2,600        2,000            2,000
500          -          440              2,300            1,850            2,700        1,900            2,212
600          -          460              2,400            1,800            2,500        1,700            2,519
700          -          460              2,400            1,600            2,490        1,550            2,607
800          -          460              2,400            1,600            2,540        1,400            2,553
900          -          460              2,300            1,600            2,472        1,200            2,567
1,000        -          475              2,300            1,700            2,485        1,150            2,439
1,500        -          490              2,400            1,550            2,620        900              2,479
2,000        -          350              2,400            1,400            2,396        550              2,200
2,500        -          280              2,100            1,300            2,453        490              2,262
3,000        -          280              1,900            1,250            2,502        wide variations  2,138
5,000        -          wide variations  1,600            1,100            2,519        -                2,235
8,000        -          -                1,200            wide variations  2,451        -                2,100
10,000       -          -                wide variations  -                2,200        -                2,200
11,000       -          -                -                -                2,200        -                2,122
12,000       -          -                -                -                970          -                1,958
13,000       -          -                -                -                730          -                1,897
14,000       -          -                -                -                590          -                1,466
15,000       -          -                -                -                532          -                1,281

While the iterative server maxes out at 7 requests/second, the forking server lets us wring out more performance, peaking at about 490 requests/second at a concurrency of 1,500. Beyond that, we see the server struggle: throughput drops as we increase concurrency from 2,000 to 3,000. For our measurements, we run each test 3 times at every concurrency level and then average the requests/second. At 5,000, the runs give numbers that vary so widely from each other that the result is too unreliable to report.

The forking server is way better than the iterative server, which is stuck at 7 requests/second no matter what the concurrency, so choosing a forking server design lets you handle many more concurrent users. Also, only minimal changes to your program are required to go from iterative to forking. The design of the fork() system call is very powerful and elegant, letting us keep the program structure simple while being able to parcel out work to child processes. If you are feeling really lazy, you can move up from an iterative server to the forking server with minimal effort while still seeing significant benefits.

Articles in this series

  1. Series Introduction
  2. Part I. Iterative Servers
  3. Part II. Forking Servers
  4. Part III. Pre-forking Servers
  5. Part IV. Threaded Servers
  6. Part V. Pre-threaded Servers
  7. Part VI: poll-based server
  8. Part VII: epoll-based server