os_cp:fork_exec

Fork
Exec…
Duplicating file descriptors
Blocking vs non-blocking operations
Pipes & FIFOs

Fork

Forking

To create a new process, POSIX defines the fork function:
Needs header: unistd.h
pid_t fork()
fork creates a copy¹⁾ of the running process.
Upon success, fork returns the pid of the child process in the parent process, and the value of 0 in the child process.
fork may fail, for instance when resource limits are exhausted; in that case it returns -1 and no child process is created.

There is a list of things that fork does not clone or that are reset for the child process upon fork. See the POSIX standard or the Linux manual page for fork for details. Notice that:

all stack and heap memory is copied
the list of open files is copied
memory mappings are copied (unless one provides an explicit flag not to copy a mapping upon creating it)
signal handlers are copied
pending signals and timers are not copied
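threads are not copied: the child contains only the thread that called fork
(but the state of mutexes and the values of semaphores are copied, so a mutex held by another thread at the time of fork stays locked in the child)
hence forking in a multithreaded application is dangerous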
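
The return values of fork and the copying of memory can be illustrated with a minimal sketch. The function name fork_copy_demo and its out-parameters are hypothetical, and the sketch relies on waitpid (described further below) only to collect the child's exit code.

```c
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once and reports the value of a local
 * variable as seen by the parent and by the child.
 * Returns 0 on success, -1 if fork or waitpid failed. */
int fork_copy_demo(int *seen_by_parent, int *seen_by_child)
{
    int value = 1;              /* lives on the stack, so it is copied */

    pid_t pid = fork();
    if (pid == -1)
        return -1;              /* fork failed, no child was created */

    if (pid == 0) {             /* child: fork returned 0 */
        value = 2;              /* changes only the child's copy */
        _exit(value);           /* pass the child's value as exit code */
    }

    /* parent: fork returned the pid of the child */
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;

    *seen_by_parent = value;                /* the parent's copy is untouched */
    *seen_by_child = WEXITSTATUS(status);   /* value reported by the child */
    return 0;
}
```

The child leaves with _exit rather than exit so that any stdio buffers copied from the parent are not flushed a second time.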

Learning one's own pid

To learn its own process identifier, the process can execute:
Needs header: unistd.h
pid_t getpid()
To learn the process identifier of its parent, the process can execute:
Needs header: unistd.h
pid_t getppid()

Waiting for a child

A terminated child remains in the process table as a zombie until its exit status is collected, so the parent process has to take care of its child processes once they terminate.
The programmer may either: 1) wait for the child to terminate, 2) set up a signal handler for SIGCHLD (it suffices to explicitly ignore the signal), 3) force the child out of the parent process's session (by using setsid() or a double fork).

To wait for the termination of any child, one can call wait. To wait for the termination of a specific pid, one can call waitpid:
Needs header: sys/wait.h
pid_t wait(int *stat_loc)
pid_t waitpid(pid_t pid, int *stat_loc, int options)
These functions return the process identifier of the child process which terminated.

The functions write the termination status to the memory pointed to by stat_loc (provided it is not NULL). The status tells whether the process terminated normally (by calling exit or returning from main) and, if it did, which value it returned. See the documentation for the list of macros (such as WIFEXITED and WEXITSTATUS) that extract this information.

If the pid in waitpid is positive, then it is considered the pid of the child to wait for. Non-positive values have special meanings. Among others, -1 waits for any child.

The options argument of waitpid is a combination of flags. The value 0 simply waits until the child terminates. The flag WNOHANG makes waitpid return immediately; a return value of 0 then means that no matching child has terminated yet.
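
A minimal sketch of waiting for a specific child with WNOHANG and decoding the status is given below. The function name exit_code_of_child and the one-millisecond polling interval are assumptions made for the illustration.

```c
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: starts a child that exits with `code` and returns
 * the code recovered from the wait status, or -1 on any failure. */
int exit_code_of_child(int code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);

    int status;
    pid_t done;
    struct timespec delay = {0, 1000000};   /* 1 ms between polls */

    /* With WNOHANG, waitpid returns 0 while the child is still running. */
    while ((done = waitpid(pid, &status, WNOHANG)) == 0)
        nanosleep(&delay, NULL);            /* other work could be done here */

    if (done == -1 || !WIFEXITED(status))
        return -1;          /* waitpid failed or the child did not exit normally */
    return WEXITSTATUS(status);
}
```

Passing 0 instead of WNOHANG would remove the need for the loop, at the price of blocking the parent until the child terminates.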

Exercises

Exercise 1 Write a program that:
1. sleeps five seconds,
2. prints a text with write,
3. forks,
4. prints another text with write,
5. sleeps another five seconds.
Run the program and observe it in a live process viewer.
sleep(unsigned seconds) sleeps with one-second resolution. (The other POSIX sleep function is nanosleep; since C11 the C standard includes the equivalent thrd_sleep.)
To keep an eye on the application, either use htop / top, or try watch -n 0.1 ps -lHC executable_name.

Exercise 2 Fork and print the result of fork, getpid and getppid.

Exercise 3 Fork. Print child in the child process and parent in the parent process.

Exercise 4 Fork. In the child process, exit immediately. In the parent, wait for input (e.g., do a read from 0 or invoke getchar).
Run the program and observe the zombie using htop / top / ps.

Exec…

In order to run an executable file, an existing process has to 'exec' into it – that is, the process has to ask the kernel to replace its memory with the code (and data) of the executable file.
So, typically to start a new process, one has to fork and then exec… in the child:

prog1              ,------.   prog1
-------------------| fork |--------------------------------------------------
pid: x (ppid: y)   `------\   pid: x (ppid: y)
                           \
                            \ prog1              ,-------. prog2
                             `-------------------| exec… |-------------------
                              pid: z (ppid: x)   `-------' pid: z (ppid: x)

To this end, a family of functions starting with exec is provided:
Needs header: unistd.h
int execlp(const char *file, const char *arg0, ... /*, (char *)0 */)
int execl (const char *path, const char *arg0, ... /*, (char *)0 */)
int execle(const char *path, const char *arg0, ... /*, (char *)0, char *const envp[]*/)
int execvp(const char *file, char *const argv[])
int execv (const char *path, char *const argv[])
int execve(const char *path, char *const argv[], char *const envp[])
(Those functions are documented more cleanly in the Linux manual).

After a successful execution of the exec… function the next instructions of the process are those of the new executable file. Hence, it is pointless to check the return value of exec… – it may only return -1, and if the instructions following exec… do execute, then exec… must have failed.

Importantly, upon executing exec… the list of open files is retained²⁾.
Almost all other resources are released. See the documentation for other exceptions.
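
The fork-then-exec pattern can be sketched as below. The function name run_and_wait is hypothetical, and the exit code 127 for a failed exec is borrowed from the shell convention rather than required by POSIX.

```c
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program argv[0] (looked up in PATH) with
 * the argument vector argv and waits for it to finish.
 * Returns the program's exit code, 127 if exec failed, -1 on other errors. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        /* This line is reached only if execvp failed. */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, with char *args[] = {"true", NULL}; the call run_and_wait(args) should return 0 on a system that provides the true utility.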

execl… vs execv…
The former takes a list of arguments (terminated by a NULL sentinel), the latter takes a pointer to the argument vector (the last element must be a NULL sentinel as well).

char arg0[] = "ls";     char arg2[] = "-a";
char arg1[] = "-l";     char arg3[] = "/tmp";
char *argv[] = {arg0, arg1, arg2, arg3, NULL};
// Argument list (the sentinel must be a null pointer of type char *):
execlp("ls",    arg0, arg1, arg2, arg3, (char *)NULL);
// Argument vector:
execvp("ls", argv);

Notice that the 0th argument is the program name.

exec…p vs exec… (without p):
The latter requires a path to the executable, and the former searches for the executable within directories specified by the PATH environmental variable if a name rather than a path was provided (ls is a name, /bin/ls is a path, ./ls is a path).

// succeeds iff there is an executable file named 'ls' in the current working directory
execl ("ls", "ls", "-la", "/tmp",  (char *)NULL);
// succeeds iff there is an executable file named 'ls' in one of the directories in the PATH list
execlp("ls", "ls", "-la", "/tmp",  (char *)NULL);
// succeeds iff '/bin/ls' is an executable file
execl ("/bin/ls", "ls", "-la", "/tmp",  (char *)NULL);
// as above - searching the PATH is abandoned if the argument is a path
execlp("/bin/ls", "ls", "-la", "/tmp",  (char *)NULL);

UNIX-like and/or POSIX-compliant operating systems use environment variables. By default the values of all such variables are inherited upon exec….
The exec…e functions take an extra argument that should point to an array of environment variables, which allows overriding them for the newly exec'ed process.
To access an unprocessed array of environment variables for the current process, one must have in the source code the following lines:

#include <unistd.h>
extern char **environ;

The environ external variable and the envp argument of exec…e functions use a NULL sentinel.
(Normally, to access such variables of the running process one shall use the getenv and setenv functions.
The environ variable is useful when one wants to pass a slightly modified set of environment variables.)
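
Passing a slightly modified environment can be sketched as follows. The variable GREETING and the function name are hypothetical, and the sketch assumes that /bin/sh exists and that GREETING is not already set in the current environment.

```c
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: runs a shell whose environment is the current one
 * plus GREETING=hello. The shell exits with 0 iff it sees the variable.
 * Returns the shell's exit code, or -1 on error. */
int run_with_extra_variable(void)
{
    size_t count = 0;
    while (environ[count] != NULL)          /* environ ends with NULL */
        count++;

    /* room for the old entries, one new entry and the NULL sentinel */
    char **envp = malloc((count + 2) * sizeof *envp);
    if (envp == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        envp[i] = environ[i];
    envp[count] = "GREETING=hello";
    envp[count + 1] = NULL;

    pid_t pid = fork();
    if (pid == 0) {
        execle("/bin/sh", "sh", "-c", "test \"$GREETING\" = hello",
               (char *)NULL, envp);
        _exit(127);                         /* reached only if execle failed */
    }

    free(envp);                 /* the child has its own copy of the array */
    if (pid == -1)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The environment of the calling process itself is not changed, which is the main difference from calling setenv before exec.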

Exercise 5 Write a program that executes ps with the argument -F. (Remember that the arguments include the command name.)

Exercise 6 Write a program prog that, when executed as prog [arg]..., executes the ls -l -t -r [arg]... command.

Exercise 7 Write a program prog that, when executed as prog cmd [arg]..., executes and measures the runtime (as wall clock time) of cmd [arg]....
To measure time, you can use the following C11 code:

#include <time.h>
...
    struct timespec start, end;
    timespec_get(&start, TIME_UTC);
    ...
    timespec_get(&end, TIME_UTC);
    double elapsedSec = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;

Exercise 8 Modify the code of the previous exercise:
• close the standard input and output of the parent process right after fork
• write output from parent process to the standard error
• output whether the child process executed normally and output its return value

Duplicating file descriptors

POSIX defines the following functions:
Needs header: unistd.h
int dup(int fildes)
int dup2(int fildes, int target)
to duplicate file descriptors.

Duplicating a file descriptor is different from opening the same file twice. When opening the same file twice, one can choose a different set of flags each time (such as O_RDONLY, O_RDWR, O_APPEND), and each descriptor has its own position in the file (the byte that will be read or written by the next read/write). Duplicated file descriptors share the same state of the open file (flags, position etc.).
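
The difference in file position can be observed with a small experiment, sketched below. The function names and the temporary file location under /tmp are assumptions made for the illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Reads one byte through `first`, then one byte through `second`, and
 * returns the byte obtained through `second` (or -1 on error). */
static int byte_from_second(int first, int second)
{
    char a, b;
    if (read(first, &a, 1) != 1 || read(second, &b, 1) != 1)
        return -1;
    return b;
}

/* Hypothetical helper: creates a file containing "ab" and reports which
 * byte the second descriptor sees when the file was opened twice and
 * when the descriptor was duplicated. Returns 0 on success, -1 on error. */
int compare_open_and_dup(int *opened_twice, int *duplicated)
{
    char path[] = "/tmp/dup_demo_XXXXXX";   /* hypothetical location */
    int tmp = mkstemp(path);
    if (tmp == -1)
        return -1;
    ssize_t stored = write(tmp, "ab", 2);
    close(tmp);

    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);     /* independent position */
    int fd3 = open(path, O_RDONLY);
    int fd4 = dup(fd3);                 /* position shared with fd3 */
    unlink(path);                       /* the open descriptors stay valid */

    int ok = stored == 2 && fd1 != -1 && fd2 != -1 && fd3 != -1 && fd4 != -1;
    if (ok) {
        *opened_twice = byte_from_second(fd1, fd2);  /* both start at 'a' */
        *duplicated = byte_from_second(fd3, fd4);    /* fd4 continues at 'b' */
    }
    close(fd1); close(fd2); close(fd3); close(fd4);
    return ok ? 0 : -1;
}
```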

The dup2 function atomically closes the descriptor target (if it is open) and then duplicates the descriptor fildes onto the descriptor target.
This is commonly used to replace standard streams, for example to redirect the standard output to a file.

Each file descriptor, including every duplicate, has to be closed separately.
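
A minimal sketch of redirecting the standard output with dup2 follows. The function name is hypothetical; it also restores the original standard output afterwards, which a program that is about to exec would normally not need to do.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: temporarily redirects standard output to the file
 * `path`, writes `text` to descriptor 1 and restores the old standard output.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_with_stdout_redirected(const char *path, const char *text)
{
    int saved = dup(STDOUT_FILENO);         /* keep the old standard output */
    if (saved == -1)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        close(saved);
        return -1;
    }

    ssize_t written = -1;
    if (dup2(fd, STDOUT_FILENO) != -1) {    /* descriptor 1 now refers to the file */
        written = write(STDOUT_FILENO, text, strlen(text));
        dup2(saved, STDOUT_FILENO);         /* put the old standard output back */
    }

    close(fd);      /* the duplicates are closed one by one */
    close(saved);
    return written;
}
```

The sketch uses write directly; with stdio functions such as printf the buffer would have to be flushed with fflush before the descriptor is switched back.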

Exercise 9 Write a program that executes ps with the argument -F and writes its output to a file output.

Exercise 10 Write a program that, when executed as prog fname [arg]..., will act as tr [arg]... < fname.

Exercise 11 Write a program that:
• assigns ELOOP to errno
• executes perror function so that the error is output on the terminal
• executes perror function so that the error is written to a file
• assigns EMFILE to errno
• executes perror function so that the error is output on the terminal
• executes perror function so that the error is written to a file
Notice that perror always writes to standard error, so to make it write to a file, one has to replace the file descriptor 2.

Blocking vs non-blocking operations

Most POSIX functions complete in a limited number of steps.

But when the user invokes certain functions, then in well-defined circumstances it makes sense to wait until a particular thing happens.

For instance, when the command php -S 0:8080 2>&1 | tee -a log is executed in a shell, it is desirable that whenever the tee program attempts to read the output of the php program, the read function waits until php has written something.

When a function that interfaces with the operating system does not return because the OS waits for something to happen before it may respond, then one says that the function blocks.
Functions that may ever block are called blocking. (Cf. definition of blocking in POSIX standard.)

Blocking may take indefinite time. When a blocking function is used, the programmer must always take into account that a call to the function may stop the thread that invoked it for an arbitrary time.

There is usually a way to invoke blocking functions in a non-blocking mode.
When a blocking function is used in non-blocking mode, it either does what it is supposed to do without waiting, or returns -1 and sets errno to EWOULDBLOCK or EAGAIN.
When one uses non-blocking mode, one must handle the case when the function failed to do (a part of) what it was supposed to do.

For functions that operate on file descriptors, the blocking / non-blocking mode is selected by the O_NONBLOCK flag of the file descriptor.
To set/clear the O_NONBLOCK flag, one shall first read the flags with int flags = fcntl(fd, F_GETFL);, then set/clear the flag (e.g., flags |= O_NONBLOCK;) and finally set the new flags with fcntl(fd, F_SETFL, flags);.
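
The three fcntl steps can be wrapped in a helper, as in the sketch below. The function names are hypothetical, and the demonstration uses a pipe (described in the next section) simply because an empty pipe is an easy way to provoke a read that would block.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Sets O_NONBLOCK on fd while keeping the other flags.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical demonstration: reads from an empty pipe in non-blocking mode.
 * Returns 1 if read reported that it would block, 0 if it did anything
 * else, and -1 if the setup failed. */
int read_from_empty_pipe_would_block(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    int result = -1;
    if (set_nonblocking(fds[0]) == 0) {
        char byte;
        ssize_t n = read(fds[0], &byte, 1);     /* returns at once */
        result = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Checking for both EAGAIN and EWOULDBLOCK is deliberate: POSIX allows the two names to have different values.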

Pipes & FIFOs

A pipe is a unidirectional communication channel – a pair of file descriptors such that any data written to the second descriptor can be read from the first descriptor. A pipe is created with the following function:
Needs header: unistd.h
int pipe(int fildes[2])
fildes[0] is opened for reading, and fildes[1] is opened for writing. (Compare with the standard input being 0 and the standard output being 1.)
Pipes can be used to send data from one process to another process, or from one thread to another thread of the same process³⁾.

By default, pipes are blocking, that is, reading from a pipe stalls the thread that invoked read until some data is written to the pipe. Likewise, writing to a pipe blocks when sufficiently many bytes have already been written and not yet read from the pipe.
In non-blocking mode the write function may write only a part of the data.

When all file descriptors that allowed writing to a pipe are closed, a read from the pipe will return 0.
When all file descriptors that allowed reading from a pipe are closed, a write to the pipe will first raise SIGPIPE, then return -1 and set errno to EPIPE (provided the process did not terminate upon SIGPIPE).

To share a pipe between two processes, one must create a pipe in one process and fork – file descriptors are copied upon forking.
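
Sharing a pipe through fork can be sketched as follows; here the parent sends data and the child consumes it. The function name is hypothetical, and the child reports its result through the exit code, so the text should be shorter than 256 bytes.

```c
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sends `text` to a child through a pipe. The child
 * counts the bytes it receives until end of file and exits with the count.
 * Returns the count reported by the child, or -1 on error. */
int count_bytes_in_child(const char *text)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();                 /* both descriptors are copied */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[1]);      /* otherwise the child would never see end of file */
        int count = 0;
        char byte;
        while (read(fds[0], &byte, 1) == 1)
            count++;
        _exit(count);
    }

    close(fds[0]);                      /* the parent only writes */
    size_t length = strlen(text);
    ssize_t written = write(fds[1], text, length);
    close(fds[1]);                      /* signals end of file to the child */

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    if (written != (ssize_t)length)
        return -1;
    return WEXITSTATUS(status);
}
```

Each process closes the end it does not use; leaving the write end open in the child is a common cause of a reader that never terminates.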
A FIFO file (or a named pipe) is a special file that allows opening either end of a pipe by providing a path to the file. A FIFO file can be created with the mkfifo shell utility or with the mkfifo function.
A call to open on a path to a FIFO file is blocking: open returns only once at least one process has invoked open with O_RDONLY and at least one process has invoked open with O_WRONLY⁴⁾. From that point on, the file descriptors act as those of an (anonymous) pipe.
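
A minimal sketch of two processes meeting at a FIFO file is given below. The function name, the message "ping" and the temporary directory under /tmp are assumptions; the two processes are related only for the convenience of the example, since any two processes that know the path could do the same.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: creates a FIFO in a fresh temporary directory,
 * lets a child write "ping" into it and reads the message into buf.
 * Returns the number of bytes read, or -1 on error. */
ssize_t fifo_demo(char *buf, size_t size)
{
    char dir[] = "/tmp/fifo_demo_XXXXXX";   /* hypothetical location */
    if (mkdtemp(dir) == NULL)
        return -1;

    char path[64];
    snprintf(path, sizeof path, "%s/channel", dir);
    if (mkfifo(path, 0600) == -1) {
        rmdir(dir);
        return -1;
    }

    ssize_t got = -1;
    pid_t pid = fork();
    if (pid == 0) {
        int out = open(path, O_WRONLY);     /* blocks until a reader opens */
        if (out == -1 || write(out, "ping", 4) != 4)
            _exit(1);
        _exit(0);                           /* exiting closes the descriptor */
    }
    if (pid > 0) {
        int in = open(path, O_RDONLY);      /* blocks until a writer opens */
        if (in != -1) {
            got = read(in, buf, size);
            close(in);
        }
        waitpid(pid, NULL, 0);
    }

    unlink(path);                           /* a FIFO file persists until removed */
    rmdir(dir);
    return got;
}
```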

A pipe is unidirectional. The Unix domain socket is its bidirectional equivalent. See man 7 unix for details.

Exercise 12 Write a program that:
• creates a pipe
• forks
• in the child process:
· calculates a computationally expensive mathematical equation (say, "2+2")
· writes the result to the pipe
· terminates
• in the parent process:
· reads from pipe
· writes result to the standard output

Exercise 13 Write a program that:
• creates a pipe
• creates three child processes
• in each child process:
· calculates a computationally expensive mathematical equation (say, "2+2")
· writes the result to the pipe as a four-byte integer
· terminates
• in the parent process:
· calculates a computationally expensive mathematical equation (say, "2+2")
· reads from the pipe all three results (and calls wait() to reap defunct children)
· writes the sum of all four results to the standard output

Exercise 14 Write a program that prints the result of ls -l in uppercase.
You may do this e.g., by: pipe, fork, in one process: dup2 and exec, in the other process: reading from pipe and changing case.
The toupper function (available from ctype.h) converts a single character to upper case.

Exercise 15 Write a program that does ps -eF | sort -nk6.

¹⁾ Nowadays forking is considered cheap, for it uses copy-on-write to reduce costs.

²⁾ With the exception of files on which the programmer explicitly set the close-on-exec flag for the file descriptor, for instance by passing O_CLOEXEC among the flags upon opening the file or by issuing a fcntl(…, F_SETFD, FD_CLOEXEC|…) for the file.

³⁾ While using a pipe to communicate between threads is inefficient, it has its use cases. For instance, it enables waking a thread that is waiting for I/O from multiple sources with select or poll.

⁴⁾ POSIX defines only what happens with FIFO files opened with O_RDONLY and O_WRONLY. Result of using O_RDWR with a FIFO file is explicitly undefined in POSIX.
