===== Fork =====

=== Forking ===

To create a new process, POSIX defines the ''fork'' function: \\
Needs header: unistd.h \\
''pid_t **fork**()'' \\
**''fork'' creates a copy((Nowadays forking is considered cheap, because it uses [[https://en.wikipedia.org/wiki/Copy-on-write|copy-on-write]] to reduce costs.)) of the running process.** \\
**Upon success, ''fork'' returns the pid of the child process in the parent process, and the value ''0'' in the child process.** \\
''fork'' may fail if resource limits are exhausted; in that case ''-1'' is returned and no child is created.

There is a list of things that ''fork'' does not clone or that are reset for the child process upon ''fork''. See the POSIX standard or the Linux manual page on ''fork'' for details. Notice that:
  * all stack and heap memory is copied
  * the list of open files is copied
  * memory mappings are copied (unless one provides an explicit flag not to copy a mapping upon creating it)
  * signal handlers are copied
  * pending signals and timers are **not** copied
  * threads are **not** copied \\ (but the state of mutexes and the values of semaphores are copied) \\ forking in a multithreaded application is dangerous

=== Learning one's own pid ===

To learn its own process identifier, a process can call: \\
Needs header: unistd.h \\
''pid_t **getpid**()'' \\
To learn the process identifier of its parent, a process can call: \\
Needs header: unistd.h \\
''pid_t **getppid**()''

=== Waiting for a child ===

The parent process must take care of its child processes once they terminate; a terminated child that nobody has waited for stays in the process table as a zombie. \\
The programmer may either:
  - wait for the child to terminate,
  - set up a signal handler for SIGCHLD (it suffices to ignore the signal),
  - force the child out of the parent process's session (by using ''setsid()'' or a double ''fork'').

To wait for the termination of any child, one can call ''wait''. To wait for the termination of a specific pid, one can call ''waitpid'': \\
Needs header:
sys/wait.h \\
''pid_t **wait**(int *//stat_loc//)'' \\
''pid_t **waitpid**(pid_t //pid//, int *//stat_loc//, int //options//)'' \\
These functions return the process identifier of the child process that terminated. They write the termination status to the memory pointed to by ''//stat_loc//'' (provided it is not ''NULL''). The status tells whether the process terminated normally (by calling ''exit'' or returning from ''main'') and, if it did, which value it returned. See the documentation for the list of macros that extract this information (for instance ''WIFEXITED'' and ''WEXITSTATUS'').

If the ''//pid//'' in ''waitpid'' is positive, it is taken as the pid of the child to wait for. Non-positive values have special meanings; among others, ''-1'' waits for any child.

The ''//options//'' argument of ''waitpid'' is a combination of flags. The value ''0'' simply waits until the child terminates. The flag ''WNOHANG'' makes ''waitpid'' return immediately; a return value of ''0'' then means that no matching child has terminated yet.

=== Exercises ===

~~Exercise.#~~ Write a program that: \\
1. sleeps five seconds, \\
2. prints a text with ''write'', \\
3. forks, \\
4. prints another text with ''write'', \\
5. sleeps another five seconds. \\
Run the program and observe it in a live process viewer. \\
''unsigned sleep(unsigned //seconds//)'' sleeps with one-second resolution. (The other POSIX sleep function is ''[[https://pubs.opengroup.org/onlinepubs/9799919799/functions/nanosleep.html|nanosleep]]''; the C standard now includes the equivalent ''[[https://en.cppreference.com/w/c/thread/thrd_sleep|thrd_sleep]]''.) \\
To keep an eye on the application, either use ''htop'' / ''top'', or try ''watch -n 0.1 ps -lHC //executable_name//''.

~~Exercise.#~~ Fork and print the results of ''fork'', ''getpid'' and ''getppid''.

~~Exercise.#~~ Fork. Print ''child'' in the child process and ''parent'' in the parent process.

~~Exercise.#~~ Fork. In the child process, exit immediately.
In the parent, wait for input (e.g., do a ''read'' from 0 or invoke ''getchar''). \\
Run the program and observe the zombie using ''htop'' / ''top'' / ''ps''.

===== Exec… =====

In order to run an executable file, an existing process has to 'exec' into it – that is, the process has to ask the kernel to replace its memory with the code (and data) of the executable file. \\
So, typically, to start a new process, one has to ''fork'' and then ''exec…'' in the child:

prog1              ,------.   prog1
-------------------| fork |--------------------------------------------------
pid: x (ppid: y)   `------\   pid: x (ppid: y)
                           \
                            \ prog1              ,-------. prog2
                             `-------------------| exec… |-------------------
                              pid: z (ppid: x)   `-------' pid: z (ppid: x)

To this end, a family of functions starting with ''exec'' is provided: \\
Needs header: unistd.h \\
''int **execlp**(const char *//file//, const char *arg0, ... /*, (char *)**0** */)'' \\
''int **execl**(const char *//path//, const char *arg0, ... /*, (char *)**0** */)'' \\
''int **execle**(const char *//path//, const char *arg0, ... /*, (char *)**0**, char *const //envp//[]*/)'' \\
''int **execvp**(const char *//file//, char *const //argv//[])'' \\
''int **execv**(const char *//path//, char *const //argv//[])'' \\
''int **execve**(const char *//path//, char *const //argv//[], char *const //envp//[])'' \\
(These functions are documented more cleanly in [[https://man7.org/linux/man-pages/man3/exec.3.html|the Linux manual]].)

After a successful call to an ''exec…'' function, the next instructions of the process are those of the new executable file. Hence, it is pointless to check the return value of ''exec…'' – it can only return ''-1'', and if the instructions following ''exec…'' do execute, then ''exec…'' must have failed (and ''errno'' tells why).

Importantly, upon executing ''exec…'' the list of open files is retained((With the exception of files on which the programmer explicitly set the //close-on-exec// flag for the file descriptor, for instance by passing ''O_CLOEXEC'' among the flags upon opening the file or by issuing a ''fcntl(…, F_SETFD, FD_CLOEXEC|…)'' for the file.)). \\
Almost all other resources are released. See the documentation for other exceptions.

''exec**l**…'' vs ''exec**v**…'' \\
The former takes a list of arguments (terminated by a NULL [[https://en.wikipedia.org/wiki/Sentinel_value|sentinel]]), the latter takes a pointer to the argument vector (whose last element must be a NULL sentinel as well).


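<code c>
char arg0[] = "ls";     char arg2[] = "-a";
char arg1[] = "-l";     char arg3[] = "/tmp";
char *argv[] = {arg0, arg1, arg2, arg3, NULL};
// Argument list (the sentinel is cast, because the function is variadic):
execlp("ls",    arg0, arg1, arg2, arg3, (char *)NULL);
// Argument vector (an alternative to the call above):
execvp("ls", argv);
</code>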

Notice that the 0th argument is the program name.

''exec…**p**'' vs ''exec…'' (without ''p''): \\
The latter requires a path to the executable, whereas the former searches for the executable in the directories listed in the ''PATH'' environment variable if a name rather than a path was provided (//ls// is a name, ///bin/ls// is a path, //./ls// is a path).


<code c>
// succeeds iff there is an executable file named 'ls' in the current working directory
execl ("ls", "ls", "-la", "/tmp", (char *)NULL);
// succeeds iff there is an executable file named 'ls' in one of the directories in the PATH list
execlp("ls", "ls", "-la", "/tmp", (char *)NULL);
// succeeds iff '/bin/ls' is an executable file
execl ("/bin/ls", "ls", "-la", "/tmp", (char *)NULL);
// as above - searching the PATH is abandoned if the argument is a path
execlp("/bin/ls", "ls", "-la", "/tmp", (char *)NULL);
</code>

UNIX-like and/or POSIX-compliant operating systems use [[https://en.wikipedia.org/wiki/Environment_variable|environment variables]]. By default the values of all such variables are inherited upon ''exec…''. \\
The ''exec…**e**'' functions take an extra argument that should point to an array of environment variables (strings of the form ''name=value''), which allows overriding them for the newly exec'ed program. \\
To access the unprocessed array of environment variables of the current process, one must have the following lines in the source code:


<code c>
#include <unistd.h>
extern char **environ;
</code>

The ''environ'' external variable and the ''//envp//'' argument of the ''exec…e'' functions are both terminated by a NULL sentinel. \\
(Normally, to access such variables of the running process one should use the ''[[https://en.cppreference.com/w/cpp/utility/program/getenv|getenv]]'' and ''[[https://pubs.opengroup.org/onlinepubs/9799919799/functions/setenv.html|setenv]]'' functions. \\
The ''environ'' variable is useful when one wants to pass a slightly modified set of environment variables.)

~~Exercise.#~~ Write a program that executes ''ps'' with the argument ''-F''. (Remember that the arguments include the command name.)

~~Exercise.#~~ Write a program ''//prog//'' that, when executed as ''//prog// [//arg//]...'', executes the ''ls -l -t -r [//arg//]...'' command.

~~Exercise.#~~ Write a program ''//prog//'' that, when executed as ''//prog// //cmd// [//arg//]...'', executes ''//cmd// [//arg//]...'' and measures its runtime (as wall clock time). \\
To measure time, you can use the following C11 code:


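<code c>
#include <time.h>
...
    struct timespec start, end;
    timespec_get(&start, TIME_UTC);   // timestamp before the measured part
    ...
    timespec_get(&end, TIME_UTC);     // timestamp after the measured part
    double elapsedSec = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;
</code>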

~~Exercise.#~~ Modify the code of the previous exercise: \\
• close the standard input and output of the parent process right after ''fork'' \\
• write output from the parent process to the standard error \\
• output whether the child process terminated normally and output its return value

===== Duplicating file descriptors =====

POSIX defines the following functions to duplicate file descriptors: \\
Needs header: unistd.h \\
''int **dup**(int //fildes//)'' \\
''int **dup2**(int //fildes//, int //target//)''

Duplicating a file descriptor is different from opening the same file twice. When opening the same file twice, one can choose a different set of flags (such as ''O_RDONLY'', ''O_RDWR'', ''O_APPEND'') for each descriptor, and each descriptor has its own position in the file (the byte that will be read/written upon the next read/write). Duplicated file descriptors refer to the same state of the file (flags, position etc.), so, for instance, reading through one of them advances the position seen through the other.

''dup'' returns the lowest-numbered unused descriptor. The ''dup2'' function **atomically** closes the descriptor //target// (if it is open), and then duplicates the descriptor //fildes// to the descriptor //target//. \\
This is commonly used to replace standard streams, for instance to redirect the standard output to a file.

Each file descriptor, including every duplicate, has to be closed separately.

~~Exercise.#~~ Write a program that executes ''ps'' with the argument ''-F'' and writes its output to a file ''//output//''.

~~Exercise.#~~ Write a program that, when executed as ''//prog// //fname// [//arg//]...'', will act as ''tr [//arg//]... < //fname//''.

~~Exercise.#~~ Write a program that: \\
• assigns ''ELOOP'' to ''errno'' \\
• executes the ''perror'' function so that the error is output on the terminal \\
• executes the ''perror'' function so that the error is written to a file \\
• assigns ''EMFILE'' to ''errno'' \\
• executes the ''perror'' function so that the error is output on the terminal \\
• executes the ''perror'' function so that the error is written to a file \\
Notice that ''perror'' always writes to standard error, so to make it write to a file, one has to replace the file descriptor ''2''.

===== Blocking vs non-blocking operations =====

Most POSIX functions complete in a limited number of steps. For certain functions, however, there are well-defined circumstances in which it makes sense to wait until a particular thing happens. For instance, when the command ''php -S 0:8080 2>&1 | tee -a log'' is executed in a shell, it is desired that when the ''tee'' program attempts to read from the output of the ''php'' program, the ''read'' function waits until ''php'' has written something.

When a function that interfaces with the operating system does not return because the OS waits for something to happen before it can respond, one says that the function **blocks**. \\
Functions that may ever block are called **blocking**. (Cf. the definition of [[https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap03.html#tag_03_48|blocking]] in the POSIX standard.)

**Blocking may take indefinite time.
When a blocking function is used, the programmer must always take into account that a call to the function may stop the thread that invoked it for an arbitrary time.**

There is usually a way to invoke blocking functions in a non-blocking mode. \\
**When a blocking function is used in non-blocking mode, it either does what it is supposed to do without waiting, or it returns ''-1'' and sets ''errno'' to ''EWOULDBLOCK'' or ''EAGAIN''**. \\
When one uses non-blocking mode, one must handle the case when the function failed to do (a part of) what it was supposed to do.

For functions related to file descriptors, the blocking / non-blocking mode is selected by the ''O_NONBLOCK'' flag of the file descriptor. \\
To set/clear the ''O_NONBLOCK'' flag, one should first read the flags with ''int flags = [[https://pubs.opengroup.org/onlinepubs/9799919799/functions/fcntl.html|fcntl]](fd, F_GETFL);'', then set/clear the flag (e.g., ''flags |= O_NONBLOCK;'') and finally set the new flags with ''fcntl(fd, F_SETFL, flags);''.

===== Pipes & FIFOs =====

A **pipe** is a unidirectional communication channel – a pair of file descriptors such that any data written to the second descriptor can be read from the first descriptor.

A pipe is created with the following function: \\
Needs header: unistd.h \\
''int **pipe**(int //fildes//[2])'' \\
The ''//fildes//[0]'' is opened for reading, and the ''//fildes//[1]'' is opened for writing. (Compare with the standard input being ''0'' and the standard output being ''1''.) \\
Pipes can be used to send data from one process to another process, or from one thread to another thread of the same process((While using a pipe to communicate between threads is inefficient, it has its use cases. For instance, it enables waking a thread that is waiting for I/O from multiple sources with ''select'' or ''poll''.)).

By default, pipes are **blocking**, that is, reading data from an empty pipe will stall the thread that invoked ''read'' until some data is written to the pipe.
Also, writing data to a pipe will block when sufficiently many bytes have already been written and have not yet been read from the pipe. \\
In non-blocking mode the ''write'' function may write only a part of the data.

When all file descriptors that allowed writing to a pipe are closed, a read from the pipe will return ''0'' (once the data still remaining in the pipe has been read). \\
When all file descriptors that allowed reading from a pipe are closed, a write to the pipe will first raise SIGPIPE, then return ''-1'' and set ''errno'' to ''EPIPE'' (provided the process did not terminate upon SIGPIPE).

To share a pipe between two processes, one must create a pipe in one process and ''fork'' – file descriptors are copied upon forking.

A **FIFO** file (or a named pipe) is a special file that allows opening either end of a pipe by providing a path to the file. A FIFO file can be created with the ''mkfifo'' shell utility or with the ''[[https://pubs.opengroup.org/onlinepubs/9799919799/functions/mkfifo.html|mkfifo]]'' function. \\
A call to ''open'' on a path to a FIFO file is blocking: ''open'' returns only once at least one process has invoked ''open'' with ''O_RDONLY'' and at least one process has invoked ''open'' with ''O_WRONLY''((POSIX defines only what happens with FIFO files opened with O_RDONLY and O_WRONLY. The result of using O_RDWR with a FIFO file is explicitly undefined in POSIX.)). From that point on the file descriptors act as those of an (anonymous) pipe.

A pipe is unidirectional. The unix //socket// is its bidirectional equivalent. See ''[[https://man7.org/linux/man-pages/man7/unix.7.html|man 7 unix]]'' for details.
~~Exercise.#~~ Write a program that: \\
• creates a pipe \\
• forks \\
• in the child process: \\
· calculates a computationally expensive mathematical expression (say, "2+2") \\
· writes the result to the pipe \\
· terminates \\
• in the parent process: \\
· reads from the pipe \\
· writes the result to the standard output

~~Exercise.#~~ Write a program that: \\
• creates a pipe \\
• creates three child processes \\
• in each child process: \\
· calculates a computationally expensive mathematical expression (say, "2+2") \\
· writes the result to the pipe as a four-byte integer \\
· terminates \\
• in the parent process: \\
· calculates a computationally expensive mathematical expression (say, "2+2") \\
· reads from the pipe all three results (and calls ''wait()'' to reap defunct children) \\
· writes the sum of all four results to the standard output

~~Exercise.#~~ Write a program that prints the result of ''ls -l'' in uppercase. \\
You may do this e.g. by: ''pipe'', ''fork'', in one process: ''dup2'' and ''exec'', in the other process: reading from the pipe and changing case. \\
The ''[[https://en.cppreference.com/w/c/string/byte/toupper|toupper]]'' function (available from ''ctype.h'') converts a single character to upper case.

~~Exercise.#~~ Write a program that does ''ps -eF | sort -nk6''.

~~META: language = en ~~