— |
os_cp:fork_exec_pipes [2024/05/08 13:31] (current) jkonczak utworzono |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ===== Fork ===== | ||
+ | |||
+ | === Forking === | ||
+ | |||
+ | To create a new process, POSIX defines the ''fork'' function: | ||
+ | <html><span style="float:right"><small>Needs header:<code>unistd.h</code></small></span> | ||
+ | <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html"></html> | ||
+ | ''pid_t **fork**()'' | ||
+ | <html></a></html> | ||
+ | \\ | ||
+ | **''fork'' creates a copy((Nowadays forking is considered cheap, for it uses | ||
+ | [[https://en.wikipedia.org/wiki/Copy-on-write|copy-on-write]] to reduce costs.)) | ||
+ | of the running process.** | ||
+ | \\ | ||
+ | **Upon success, ''fork'' returns the pid of the child process in the parent | ||
+ | process, and the value of ''0'' in the child process.** | ||
+ | \\ | ||
+ | ''fork'' may fail, for instance when resource limits are exhausted; in such a case it | ||
+ | returns ''-1'' and sets ''errno''. | ||
+ | |||
+ | There is a list of things that ''fork'' does not clone or that are reset for the | ||
+ | child process upon ''fork''. See POSIX standard or Linux manual on fork for details. | ||
+ | Notice that: | ||
+ | * all stack and heap memory is copied | ||
+ | * the list of open files is copied | ||
+ | * memory mappings are copied (unless one provides an explicit flag not to copy a mapping upon creating it) | ||
+ | * signal handlers are copied | ||
+ | * pending signals and timers are **not** copied | ||
+ | * threads are **not** copied \\ (but state of mutexes and values of semaphores are copied) \\ forking in a multithreaded application is dangerous | ||
+ | |||
+ | === Learning one's own pid === | ||
+ | |||
+ | To learn its own process identifier, the process can execute | ||
+ | <html><span style="float:right"><small>Needs header: <code>unistd.h</code></small></span> | ||
+ | <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/getpid.html"></html> | ||
+ | ''pid_t **getpid**()'' | ||
+ | <html></a></html> | ||
+ | \\ | ||
+ | To learn the process identifier of its parent, the process can execute | ||
+ | <html><span style="float:right"><small>Needs header: <code>unistd.h</code></small></span> | ||
+ | <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/getppid.html"></html> | ||
+ | ''pid_t **getppid**()'' | ||
+ | <html></a></html> | ||
+ | |||
+ | === Waiting for a child === | ||
+ | |||
+ | The parent process shall care for its child processes once they terminate. | ||
+ | \\ | ||
+ | The programmer may either: | ||
+ | 1) wait for the child to terminate, | ||
+ | 2) set up a signal handler for SIGCHLD (it suffices to ignore the signal), | ||
+ | <small>3) force the child out of the parent process's session (by using | ||
+ | ''setsid()'' or double ''fork''). </small> | ||
+ | |||
+ | To wait for the termination of any child, one can call ''wait''. To wait for the | ||
+ | termination of a specific pid, one can call ''waitpid'': | ||
+ | \\ | ||
+ | <html><span style="float:right"><small>Needs header:<br><code>sys/wait.h</code></small></span> | ||
+ | <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/wait.html"></html> | ||
+ | ''pid_t **wait**(int *//stat_loc//)'' \\ | ||
+ | ''pid_t **waitpid**(pid_t //pid//, int *//stat_loc//, int //options//)'' | ||
+ | <html></a></html> | ||
+ | \\ | ||
+ | These functions return the process identifier of the child process which | ||
+ | terminated. | ||
+ | |||
+ | The functions write the termination status to the memory pointed to by | ||
+ | ''//stat_loc//'' (provided it's not ''NULL''). | ||
+ | The status provides information whether the process terminated normally (by | ||
+ | calling ''exit'' or returning from ''main'') and if it did, then which value | ||
+ | it returned. | ||
+ | See the documentation for a list of macros that extract the information. | ||
+ | |||
+ | If the ''//pid//'' in ''waitpid'' is positive, then it is considered the pid | ||
+ | of the child to wait for. Non-positive values have special meanings. Among | ||
+ | others, ''-1'' waits for any child. | ||
+ | |||
+ | The ''//options//'' argument of ''waitpid'' is a combination of flags. | ||
+ | The value of ''0'' simply waits until the child terminates. | ||
+ | The flag ''WNOHANG'' makes ''waitpid'' return immediately (and the return value | ||
+ | indicates whether a child has terminated). | ||
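+ | |||
+ | The status macros can be sketched as follows. This is a minimal illustration rather than | ||
+ | a definitive implementation; the helper name ''child_exit_status'' is made up for this example. | ||
+ | It forks a child that exits with a given code and decodes the status in the parent | ||
+ | with ''WIFEXITED'' and ''WEXITSTATUS''. | ||
+ | <code c> | ||
+ | #include <sys/types.h> | ||
+ | #include <sys/wait.h> | ||
+ | #include <unistd.h> | ||
+ | | ||
+ | // Hypothetical helper: forks a child that exits with `code` and returns | ||
+ | // the exit status observed by the parent, or -1 on any failure. | ||
+ | int child_exit_status(int code) { | ||
+ |     pid_t pid = fork(); | ||
+ |     if (pid == -1) | ||
+ |         return -1;                  // fork failed | ||
+ |     if (pid == 0) | ||
+ |         _exit(code);                // child: terminate immediately | ||
+ |     int status; | ||
+ |     if (waitpid(pid, &status, 0) == -1) | ||
+ |         return -1;                  // waiting failed | ||
+ |     if (WIFEXITED(status))          // did the child terminate normally? | ||
+ |         return WEXITSTATUS(status); // the value passed to _exit | ||
+ |     return -1;                      // the child did not exit normally | ||
+ | } | ||
+ | </code> | ||
+ | For instance, ''child_exit_status(42)'' is expected to return ''42'', since only the low | ||
+ | eight bits of the exit value are reported to the parent. | ||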
+ | |||
+ | === Exercises === | ||
+ | |||
+ | ~~Exercise.#~~ Write a program that: | ||
+ | \\ 1. sleeps five seconds, | ||
+ | \\ 2. prints a text with ''write'', | ||
+ | \\ 3. forks, | ||
+ | \\ 4. prints another text with ''write'', | ||
+ | \\ 5. sleeps another five seconds. | ||
+ | \\ | ||
+ | Run the program and observe it in a live process viewer. \\ | ||
+ | <small> <html><a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/sleep.html"></html> | ||
+ | ''unsigned sleep(unsigned //sec//)''<html></a></html> sleeps with second resolution. (The other POSIX sleep function is | ||
+ | ''[[https://pubs.opengroup.org/onlinepubs/9699919799/functions/nanosleep.html|nanosleep]]''; | ||
+ | the C standard now includes the equivalent ''[[https://en.cppreference.com/w/c/thread/thrd_sleep|thrd_sleep]]''.) | ||
+ | \\ | ||
+ | To keep an eye on the application, either use ''htop'' / ''top'', or try ''watch -n 0.1 ps -lHC //executable_name//''. | ||
+ | </small> | ||
+ | |||
+ | ~~Exercise.#~~ Fork and print the result of ''fork'', ''getpid'' and ''getppid''. | ||
+ | |||
+ | ~~Exercise.#~~ Fork. Print ''child'' in the child process and ''parent'' in the | ||
+ | parent process. | ||
+ | |||
+ | ~~Exercise.#~~ Fork. In the child process, exit immediately. In the parent, | ||
+ | wait for input (e.g., do a ''read'' from 0 or invoke ''getchar''). \\ | ||
+ | Run the program and observe the zombie using ''htop'' / ''top'' / ''ps''. | ||
+ | |||
+ | |||
+ | ===== Exec… ===== | ||
+ | |||
+ | In order to run an executable file, an existing process has to 'exec' into it – | ||
+ | that is, the process has to ask the kernel to replace its memory with the code | ||
+ | (and data) of the executable file. | ||
+ | \\ | ||
+ | So, typically to start a new process, one has to ''fork'' and then ''exec…'' in the child: | ||
+ | <html><div style="width: fit-content"><pre style="line-height: 1"> | ||
+ | prog1 ,------. | ||
+ | -------------------| fork |-------------------------------------- | ||
+ | pid: x (ppid: y) `------\ | ||
+ | \ prog1 ,-------. prog2 | ||
+ | `-------------------| exec… |-------- | ||
+ | pid: z (ppid: x) `-------' | ||
+ | </pre></div></html> | ||
+ | |||
+ | To this end, a family of functions starting with ''exec'' is provided: \\ | ||
+ | <html><span style="float:right"><small>Needs header:<br><code>unistd.h</code></small></span> | ||
+ | <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/execve.html"></html> | ||
+ | ''int **execlp**(const char *//file//, const char *arg0, ... /*, (char *)**0** */)'' \\ | ||
+ | ''int **execl **(const char *//path//, const char *arg0, ... /*, (char *)**0** */)'' \\ | ||
+ | ''int **execle**(const char *//path//, const char *arg0, ... /*, (char *)**0**,*/ char *const //envp//[])'' \\ | ||
+ | ''int **execvp**(const char *//file//, char *const //argv//[])'' \\ | ||
+ | ''int **execv **(const char *//path//, char *const //argv//[])'' \\ | ||
+ | ''int **execve**(const char *//path//, char *const //argv//[], char *const //envp//[])'' | ||
+ | <html></a></html> | ||
+ | \\ | ||
+ | (Those functions are documented more cleanly in | ||
+ | [[https://man7.org/linux/man-pages/man3/exec.3.html|the Linux manual]]). | ||
+ | |||
+ | After a successful execution of the ''exec…'' function the next instructions of | ||
+ | the process are those of the new executable file. Hence, it is pointless to check | ||
+ | the return value of ''exec…'' – it may only return ''-1'', and if the instructions | ||
+ | following ''exec…'' do execute, then ''exec…'' must have failed. | ||
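+ | |||
+ | The fork-then-exec pattern can be sketched as below. This is a hedged example, not the | ||
+ | only way to do it: the helper name ''run_and_wait'' and the exit code ''127'' used when | ||
+ | ''execvp'' fails (borrowed from the shell convention) are assumptions of this sketch. | ||
+ | <code c> | ||
+ | #include <sys/types.h> | ||
+ | #include <sys/wait.h> | ||
+ | #include <unistd.h> | ||
+ | | ||
+ | // Hypothetical helper: runs argv[0] (searched for in PATH) with the given | ||
+ | // arguments and returns its exit status, or -1 if something went wrong. | ||
+ | int run_and_wait(char *const argv[]) { | ||
+ |     pid_t pid = fork(); | ||
+ |     if (pid == -1) | ||
+ |         return -1; | ||
+ |     if (pid == 0) { | ||
+ |         execvp(argv[0], argv); | ||
+ |         // Reached only if execvp failed, so no return value check is needed. | ||
+ |         _exit(127); | ||
+ |     } | ||
+ |     int status; | ||
+ |     if (waitpid(pid, &status, 0) == -1) | ||
+ |         return -1; | ||
+ |     return WIFEXITED(status) ? WEXITSTATUS(status) : -1; | ||
+ | } | ||
+ | </code> | ||
+ | The child calls ''_exit'' rather than returning, so that a failed ''exec…'' does not | ||
+ | continue running the parent's code in a second process. | ||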
+ | |||
+ | Importantly, the list of open files is retained across ''exec…'' (except for descriptors marked close-on-exec). | ||
+ | \\ | ||
+ | Almost all other resources are released. See the documentation for the remaining exceptions. | ||
+ | |||
+ | ''exec**l**…'' vs ''exec**v**…''\\ | ||
+ | The former takes a list of arguments (terminated by a NULL | ||
+ | [[https://en.wikipedia.org/wiki/Sentinel_value|sentinel]]), the latter takes | ||
+ | a pointer to the argument vector (the last element must be a NULL sentinel | ||
+ | as well). | ||
+ | <html><div style="margin-bottom:-1.4em"></div></html> | ||
+ | <code c> | ||
+ | char arg0[] = "ls"; char arg2[] = "-a"; | ||
+ | char arg1[] = "-l"; char arg3[] = "/tmp"; | ||
+ | // Argument list (the sentinel must be a null pointer of type char *): | ||
+ | execlp("ls", arg0, arg1, arg2, arg3, (char *)NULL); | ||
+ | // Argument vector: | ||
+ | char *argv[] = {arg0, arg1, arg2, arg3, NULL}; | ||
+ | execvp("ls", argv); | ||
+ | </code> | ||
+ | Notice that the 0th argument is the program name. | ||
+ | |||
+ | ''exec…**p**'' vs ''exec…'' (without ''p''): \\ | ||
+ | The latter requires a path to the | ||
+ | executable, and the former searches for the executable within directories | ||
+ | specified by the ''PATH'' environment variable if a name rather than a path was | ||
+ | provided (//ls// is a name, ///bin/ls// is a path, //./ls// is a path). | ||
+ | <html><div style="margin-bottom:-1.4em"></div></html> | ||
+ | <code c> | ||
+ | // succeeds iff there is an executable file named 'ls' in the current working directory | ||
+ | execl ("ls", "ls", "-la", "/tmp", (char *)NULL); | ||
+ | // succeeds iff there is an executable file named 'ls' in one of the directories in the PATH list | ||
+ | execlp("ls", "ls", "-la", "/tmp", (char *)NULL); | ||
+ | // succeeds iff '/bin/ls' is an executable file | ||
+ | execl ("/bin/ls", "ls", "-la", "/tmp", (char *)NULL); | ||
+ | // as above - searching the PATH is abandoned if the argument is a path | ||
+ | execlp("/bin/ls", "ls", "-la", "/tmp", (char *)NULL); | ||
+ | </code> | ||
+ | |||
+ | UNIX-like and/or POSIX-compliant operating systems use | ||
+ | [[https://en.wikipedia.org/wiki/Environment_variable|environment variables]]. | ||
+ | By default the values of all such variables are inherited upon ''exec…''. | ||
+ | \\ | ||
+ | ''exec…**e**'' functions have an extra argument that points to an array | ||
+ | of environment variables, which makes it possible to override them for the newly | ||
+ | exec'ed process. | ||
+ | \\ | ||
+ | <small>To access an unprocessed array of environment variables for the current process, | ||
+ | one must have in the source code the following lines: | ||
+ | <code c> | ||
+ | #include <unistd.h> | ||
+ | extern char **environ; | ||
+ | </code> | ||
+ | The ''environ'' external variable and </small> the ''//envp//'' argument of ''exec…e'' functions use a NULL sentinel. | ||
+ | \\ | ||
+ | (Normally, to access such variables of the running process one shall use the | ||
+ | ''[[https://en.cppreference.com/w/cpp/utility/program/getenv|getenv]]'' and | ||
+ | ''[[https://pubs.opengroup.org/onlinepubs/9699919799/functions/setenv.html|setenv]]'' | ||
+ | functions. \\ | ||
+ | <small> The ''environ'' variable is useful when one wants to pass a slightly | ||
+ | modified set of environment variables.</small>) \\ | ||
+ | |||
+ | ~~Exercise.#~~ Write a program that executes ''ps'' with the argument ''-F''. | ||
+ | (Remember that the arguments include the command name.) | ||
+ | |||
+ | ~~Exercise.#~~ Write a program ''//prog//'' that, when executed as | ||
+ | ''//prog// [//arg//]...'', executes the ''ls -l -t -r [//arg//]...'' command. | ||
+ | |||
+ | ~~Exercise.#~~ Write a program ''//prog//'' that, when executed as | ||
+ | ''//prog// //cmd// [//arg//]...'', executes and measures the runtime (as wall clock | ||
+ | time) of ''//cmd// [//arg//]...''. | ||
+ | \\ | ||
+ | <small> | ||
+ | To measure time, you can use the following C11 code: | ||
+ | <code c> | ||
+ | #include <time.h> | ||
+ | ... | ||
+ | struct timespec start, end; | ||
+ | timespec_get(&start, TIME_UTC); | ||
+ | ... | ||
+ | timespec_get(&end, TIME_UTC); | ||
+ | double elapsedSec = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9; | ||
+ | </code> | ||
+ | </small> | ||
+ | |||
+ | <small> | ||
+ | ~~Exercise.#~~ Modify the code of the previous exercise: | ||
+ | \\ • close the standard input and output of the parent process right after ''fork'' | ||
+ | \\ • write output from parent process to the standard error | ||
+ | \\ • output whether the child process executed normally and output its return value | ||
+ | </small> | ||
+ | |||
+ | ===== Duplicating file descriptors ===== | ||
+ | |||
+ | POSIX defines the following functions: \\ | ||
+ | <html><span style="float:right"><small>Needs header:<br><code>unistd.h</code></small></span> | ||
+ | <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/dup.html"></html> | ||
+ | ''int **dup**(int //fildes//)'' \\ | ||
+ | ''int **dup2**(int //fildes//, int //target//)'' | ||
+ | <html></a></html> \\ | ||
+ | to duplicate file descriptors. | ||
+ | |||
+ | Duplicating a file descriptor is different from opening the same file | ||
+ | twice. When opening the same file twice, one can choose a different set of flags | ||
+ | (such as O_RDONLY, O_RDWR, O_APPEND), and each descriptor has its own position | ||
+ | in the file (the byte that will be read/written upon the next read/write). Duplicated file | ||
+ | descriptors share the same state of the open file (flags, position etc.). | ||
+ | |||
+ | | Hover mouse over lines of code to see what happens in the OS upon ''open''/''dup'':|| | ||
+ | | Opening twice: | <html><object id="svg-object" data="/jkonczak/_media/fd_maint_open.svg" type="image/svg+xml"></object></html> | | ||
+ | || | | ||
+ | | Duplicating: | <html><object id="svg-object" data="/jkonczak/_media/fd_maint_dup.svg" type="image/svg+xml"></object></html> | | ||
+ | |||
+ | The ''dup2'' function **atomically** closes the descriptor //target// (if it is open), and then | ||
+ | duplicates the descriptor //fildes// to the descriptor //target//. | ||
+ | \\ | ||
+ | This is commonly used to replace standard streams, as in the following example | ||
+ | (that shows redirection of the standard output): | ||
+ | |||
+ | <html><p style="text-align: center"><object id="svg-object" data="/jkonczak/_media/fd_maint_dup2.svg" type="image/svg+xml"></object></p></html> | ||
+ | |||
+ | Each file descriptor has to be closed separately. | ||
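+ | |||
+ | A redirection of the standard output with ''dup'' and ''dup2'' can be sketched as follows. | ||
+ | The helper name ''write_to_file_via_stdout'' and the file mode ''0644'' are assumptions | ||
+ | made for this illustration only. | ||
+ | <code c> | ||
+ | #include <fcntl.h> | ||
+ | #include <string.h> | ||
+ | #include <unistd.h> | ||
+ | | ||
+ | // Hypothetical helper: temporarily redirects the standard output to the file | ||
+ | // `path`, writes `text` to descriptor 1, and restores the original output. | ||
+ | // Returns 0 on success and -1 on failure. | ||
+ | int write_to_file_via_stdout(const char *path, const char *text) { | ||
+ |     int file = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644); | ||
+ |     if (file == -1) | ||
+ |         return -1; | ||
+ |     int saved = dup(STDOUT_FILENO);          // remember the original stdout | ||
+ |     if (saved == -1) { | ||
+ |         close(file); | ||
+ |         return -1; | ||
+ |     } | ||
+ |     int result = -1; | ||
+ |     if (dup2(file, STDOUT_FILENO) != -1) {   // descriptor 1 now refers to the file | ||
+ |         size_t len = strlen(text); | ||
+ |         if (write(STDOUT_FILENO, text, len) == (ssize_t)len) | ||
+ |             result = 0; | ||
+ |         dup2(saved, STDOUT_FILENO);          // restore the original stdout | ||
+ |     } | ||
+ |     close(file);                             // each descriptor is closed separately | ||
+ |     close(saved); | ||
+ |     return result; | ||
+ | } | ||
+ | </code> | ||
+ | In a program that forks and then calls ''exec…'', the ''dup2'' call would typically be | ||
+ | placed in the child, and saving and restoring the old descriptor would not be needed. | ||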
+ | |||
+ | ~~Exercise.#~~ Write a program that executes ''ps'' with the argument ''-F'' | ||
+ | and writes its output to a file ''//output//''. | ||
+ | |||
+ | ~~Exercise.#~~ Write a program that, when executed as ''//prog// //fname// [//arg//]...'', | ||
+ | will act as ''tr [//arg//]... < //fname//''. | ||
+ | |||
+ | <small> | ||
+ | ~~Exercise.#~~ Write a program that: | ||
+ | \\ • assigns ''ELOOP'' to ''errno'' | ||
+ | \\ • executes ''perror'' function so that the error is output on the terminal | ||
+ | \\ • executes ''perror'' function so that the error is written to a file | ||
+ | \\ • assigns ''EMFILE'' to ''errno'' | ||
+ | \\ • executes ''perror'' function so that the error is output on the terminal | ||
+ | \\ • executes ''perror'' function so that the error is written to a file | ||
+ | \\ Notice that ''perror'' always writes to standard error, so to make it write to | ||
+ | a file, one has to replace the file descriptor ''2''. | ||
+ | </small> | ||
+ | |||
+ | |||
+ | ===== Blocking vs non-blocking operations ===== | ||
+ | |||
+ | Most POSIX functions complete in a limited number of steps. | ||
+ | |||
+ | But when the user invokes certain functions, then in well-defined circumstances | ||
+ | it makes sense to wait until a particular thing happens. | ||
+ | |||
+ | For instance, when the ''php -S 0:8080 2>&1 | tee -a log'' command is executed in | ||
+ | a shell, it is desired that when the ''tee'' program attempts to read | ||
+ | the output of the ''php'' program, the ''read'' function waits until ''php'' | ||
+ | has written something. | ||
+ | |||
+ | When a function stops not because it is not given the CPU time but because it | ||
+ | waits for something to happen, then it **blocks**. | ||
+ | \\ | ||
+ | Functions that may block are called **blocking**. (Cf. definition of | ||
+ | [[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_77|blocking]] | ||
+ | in POSIX standard.) | ||
+ | |||
+ | **Blocking may take indefinite time. When a blocking function is used, the | ||
+ | programmer must always take into account that a call to the function may stop the thread | ||
+ | that invoked it for an arbitrary time.** | ||
+ | |||
+ | There is usually a way to invoke blocking functions in a non-blocking mode. | ||
+ | \\ | ||
+ | **When a blocking function is used in non-blocking mode, then it either does | ||
+ | what it's supposed to do without waiting, or it returns ''-1'' and sets | ||
+ | ''errno'' to EWOULDBLOCK or EAGAIN**. | ||
+ | \\ | ||
+ | When one uses non-blocking mode, one must handle the case when the function | ||
+ | failed to do (a part of) what it was supposed to do. | ||
+ | |||
+ | For functions related to file descriptors, the blocking / non-blocking mode | ||
+ | is selected by the O_NONBLOCK flag of a file descriptor. | ||
+ | \\ | ||
+ | To set/clear the O_NONBLOCK flag, one shall first read the flags with | ||
+ | ''int flags = [[https://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html|fcntl]](fd, F_GETFL);'', | ||
+ | then set/clear the flag (e.g., ''flags |= O_NONBLOCK;'') and finally set the | ||
+ | new flags with ''fcntl(fd, F_SETFL, flags);''. | ||
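+ | |||
+ | The read-modify-write of the flags can be sketched as below. The helper names | ||
+ | ''set_nonblocking'' and ''empty_read_would_block'' are made up for this example, and the | ||
+ | demonstration relies on an empty pipe (see the ''pipe'' function in the Pipes & FIFOs section). | ||
+ | <code c> | ||
+ | #include <errno.h> | ||
+ | #include <fcntl.h> | ||
+ | #include <unistd.h> | ||
+ | | ||
+ | // Hypothetical helper: sets O_NONBLOCK on fd while keeping the other flags. | ||
+ | // Returns 0 on success and -1 on failure. | ||
+ | int set_nonblocking(int fd) { | ||
+ |     int flags = fcntl(fd, F_GETFL); | ||
+ |     if (flags == -1) | ||
+ |         return -1; | ||
+ |     return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0; | ||
+ | } | ||
+ | | ||
+ | // Hypothetical helper: reads from an empty non-blocking pipe and returns 1 | ||
+ | // if the read failed with EAGAIN or EWOULDBLOCK, and 0 otherwise. | ||
+ | int empty_read_would_block(void) { | ||
+ |     int fds[2]; | ||
+ |     char byte; | ||
+ |     int would_block = 0; | ||
+ |     if (pipe(fds) == -1) | ||
+ |         return 0; | ||
+ |     if (set_nonblocking(fds[0]) == 0 && read(fds[0], &byte, 1) == -1) | ||
+ |         would_block = (errno == EAGAIN || errno == EWOULDBLOCK); | ||
+ |     close(fds[0]); | ||
+ |     close(fds[1]); | ||
+ |     return would_block; | ||
+ | } | ||
+ | </code> | ||
+ | Without the call to ''set_nonblocking'', the same ''read'' would block indefinitely, | ||
+ | because nothing is ever written to the pipe. | ||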
+ | |||
+ | ===== Pipes & FIFOs ===== | ||
+ | |||
+ | A **pipe** is a unidirectional communication channel – a pair of file descriptors | ||
+ | such that any data written to the second descriptor can be read from the first | ||
+ | descriptor. A pipe is created with the following function: | ||
+ | \\ | ||
+ | <html><span style="float:right"><small>Needs header: <code>unistd.h</code></small></span> | ||
+ | <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/pipe.html"></html> | ||
+ | ''int **pipe**(int //fildes//[2])'' | ||
+ | <html></a></html> | ||
+ | \\ | ||
+ | The ''//fildes//[0]'' is opened for reading, and the ''//fildes//[1]'' is opened | ||
+ | for writing. | ||
+ | \\ | ||
+ | Pipes can be used to send data from one process to another process, or from | ||
+ | one thread to another thread of the same process((While using a pipe to communicate | ||
+ | between threads is inefficient, it has its use cases. For instance, it enables waking a | ||
+ | thread that is waiting for I/O from multiple sources with ''select'' or ''poll''.)). | ||
+ | |||
+ | By default, pipes are **blocking**, that is, reading data from a pipe will | ||
+ | stall the thread that invoked ''read'' until some data is written to the pipe. | ||
+ | Also, writing data to a pipe will block when sufficiently many bytes have | ||
+ | already been written and not yet read from the pipe. | ||
+ | \\ | ||
+ | <small> | ||
+ | In non-blocking mode the ''write'' function may write only a part of the data. | ||
+ | </small> | ||
+ | |||
+ | When all file descriptors that allowed writing to a pipe are closed, a read from | ||
+ | the pipe will return ''0'' (once all data remaining in the pipe has been read). | ||
+ | \\ | ||
+ | When all file descriptors that allowed reading from a pipe are closed, a write | ||
+ | to the pipe will first raise SIGPIPE, then return ''-1'' and set ''errno'' to | ||
+ | ''EPIPE'' (provided the process did not terminate upon SIGPIPE). | ||
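+ | |||
+ | Both end-of-pipe cases can be observed within a single process, as sketched below. | ||
+ | The helper names are made up for this example, and ignoring SIGPIPE for the duration | ||
+ | of the call is one possible choice, not the only one. | ||
+ | <code c> | ||
+ | #include <errno.h> | ||
+ | #include <signal.h> | ||
+ | #include <unistd.h> | ||
+ | | ||
+ | // Hypothetical helper: returns what read() reports once the only write end | ||
+ | // of a fresh, empty pipe has been closed, or -1 if the pipe cannot be created. | ||
+ | int read_after_writer_closed(void) { | ||
+ |     int fds[2]; | ||
+ |     char byte; | ||
+ |     if (pipe(fds) == -1) | ||
+ |         return -1; | ||
+ |     close(fds[1]);                        // no writers remain | ||
+ |     int result = (int)read(fds[0], &byte, 1); | ||
+ |     close(fds[0]); | ||
+ |     return result; | ||
+ | } | ||
+ | | ||
+ | // Hypothetical helper: returns the errno set by write() once the only read | ||
+ | // end of a fresh pipe has been closed, or -1 if the pipe cannot be created. | ||
+ | // SIGPIPE is ignored during the call so that the process is not terminated. | ||
+ | int errno_after_reader_closed(void) { | ||
+ |     int fds[2]; | ||
+ |     int err = 0; | ||
+ |     if (pipe(fds) == -1) | ||
+ |         return -1; | ||
+ |     void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN); | ||
+ |     close(fds[0]);                        // no readers remain | ||
+ |     if (write(fds[1], "x", 1) == -1) | ||
+ |         err = errno; | ||
+ |     close(fds[1]); | ||
+ |     signal(SIGPIPE, old_handler);         // restore the previous disposition | ||
+ |     return err; | ||
+ | } | ||
+ | </code> | ||
+ | The first helper is expected to return ''0'' (end of file) and the second one ''EPIPE''. | ||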
+ | |||
+ | To share a pipe between two processes, one must create a pipe in one process and | ||
+ | ''fork'' – file descriptors are copied upon forking. | ||
+ | \\ | ||
+ | A **FIFO** file (or a named pipe) is a special file that allows opening either | ||
+ | end of a pipe by providing a path to the file. A FIFO file can be created with | ||
+ | ''mkfifo'' shell utility or by the | ||
+ | ''[[https://pubs.opengroup.org/onlinepubs/9699919799/functions/mkfifo.html|mkfifo]]'' | ||
+ | function. | ||
+ | \\ | ||
+ | A call to ''open'' on a path to a FIFO file is blocking. ''open'' returns only | ||
+ | once at least one process invoked ''open'' with O_RDONLY and at least one | ||
+ | process invoked ''open'' with O_WRONLY((POSIX defines only what happens with FIFO | ||
+ | files opened with O_RDONLY and O_WRONLY. Result of using O_RDWR with a FIFO file | ||
+ | is explicitly undefined in POSIX.)). From that point on the file descriptors | ||
+ | act as those of an (anonymous) pipe. | ||
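+ | |||
+ | The blocking ''open'' on a FIFO can be sketched as follows. This is an illustration under | ||
+ | stated assumptions: the helper name ''fifo_roundtrip'' and the mode ''0600'' are made up, | ||
+ | and the caller is assumed to pass a path at which no file exists yet. | ||
+ | <code c> | ||
+ | #include <fcntl.h> | ||
+ | #include <sys/stat.h> | ||
+ | #include <sys/types.h> | ||
+ | #include <sys/wait.h> | ||
+ | #include <unistd.h> | ||
+ | | ||
+ | // Hypothetical helper: creates a FIFO at `path`, forks a child that writes | ||
+ | // one byte to it, reads that byte in the parent and returns it (-1 on error). | ||
+ | int fifo_roundtrip(const char *path, char byte) { | ||
+ |     if (mkfifo(path, 0600) == -1) | ||
+ |         return -1; | ||
+ |     pid_t pid = fork(); | ||
+ |     if (pid == -1) { | ||
+ |         unlink(path); | ||
+ |         return -1; | ||
+ |     } | ||
+ |     if (pid == 0) { | ||
+ |         int wr = open(path, O_WRONLY);    // blocks until a reader opens the FIFO | ||
+ |         if (wr != -1 && write(wr, &byte, 1) == 1) | ||
+ |             _exit(0); | ||
+ |         _exit(1); | ||
+ |     } | ||
+ |     char received; | ||
+ |     int result = -1; | ||
+ |     int rd = open(path, O_RDONLY);        // blocks until a writer opens the FIFO | ||
+ |     if (rd != -1) { | ||
+ |         if (read(rd, &received, 1) == 1) | ||
+ |             result = (unsigned char)received; | ||
+ |         close(rd); | ||
+ |     } | ||
+ |     waitpid(pid, NULL, 0);                // reap the child | ||
+ |     unlink(path);                         // remove the FIFO file | ||
+ |     return result; | ||
+ | } | ||
+ | </code> | ||
+ | Here the two processes are related, but the same calls work between unrelated processes | ||
+ | that merely agree on the path. | ||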
+ | |||
+ | <small>A pipe is unidirectional. The Unix //socket// is its bidirectional | ||
+ | equivalent. See ''[[https://man7.org/linux/man-pages/man7/unix.7.html|man 7 unix]]'' | ||
+ | for details.</small> | ||
+ | |||
+ | ~~Exercise.#~~ Write a program that: | ||
+ | \\ • creates a pipe | ||
+ | \\ • forks | ||
+ | \\ • in the child process: | ||
+ | \\ · calculates a computationally expensive mathematical equation (say, "2+2") | ||
+ | \\ · writes the result to the pipe | ||
+ | \\ · terminates | ||
+ | \\ • in the parent process: | ||
+ | \\ · reads from pipe | ||
+ | \\ · writes result to the standard output | ||
+ | |||
+ | ~~Exercise.#~~ Write a program that: | ||
+ | \\ • creates a pipe | ||
+ | \\ • creates three child processes | ||
+ | \\ • in each child process: | ||
+ | \\ · calculates a computationally expensive mathematical equation (say, "2+2") | ||
+ | \\ · writes the result to the pipe as a four-byte integer | ||
+ | \\ · terminates | ||
+ | \\ • in the parent process: | ||
+ | \\ · calculates a computationally expensive mathematical equation (say, "2+2") | ||
+ | \\ · reads from the pipe all three results <small>(and calls ''wait()'' to reap defunct children)</small> | ||
+ | \\ · writes the sum of all four results to the standard output | ||
+ | |||
+ | ~~Exercise.#~~ Write a program that prints the result of ''ls -l'' in uppercase. | ||
+ | \\ | ||
+ | <small> You may do this e.g., by: ''pipe'', ''fork'', in one process: ''dup2'' | ||
+ | and ''exec'', in the other process: reading from pipe and changing case. | ||
+ | \\ | ||
+ | The ''[[https://en.cppreference.com/w/c/string/byte/toupper|toupper]]'' function (available from ''ctype.h'') converts a single character to upper case. | ||
+ | |||
+ | ~~Exercise.#~~ Write a program that does ''ps -eF | sort -nk6''.</small> | ||
+ | |||
+ | |||
+ | |||
+ | ~~META: | ||
+ | language = en | ||
+ | ~~ | ||