===== POSIX =====

POSIX, the Portable Operating System Interface standard, standardises the shell and utilities and also defines functions, macros, and external variables to support application portability at the C-language source level.

POSIX [[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/contents.html|specifies]] what should be present in the [[https://en.cppreference.com/w/c/header|C standard libraries]] (e.g., ''stdio.h'' or ''stdlib.h''), and also specifies a user-level API to the operating system (the two do intersect). Libraries required by POSIX are neatly summarised on the [[https://en.wikipedia.org/wiki/C_POSIX_library|C POSIX library]] Wikipedia page.

The ''[[https://en.wikipedia.org/wiki/Unistd.h|unistd.h]]'' library defines a number of basic constants and functions for interfacing with the operating system. \\
The ''[[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/fcntl.h.html|fcntl.h]]'' library defines constants and functions for __f__ile __c__o__nt__ro__l__.

==== Data types ====

POSIX [[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html|defines]] certain data types that are usually fancy names for certain integer types. For instance, ''ssize_t'' shall be used to store the signed size of arbitrary data, ''pid_t'' shall be used to store process identifiers, ''uid_t'' shall be used for user identifiers, ''time_t'' shall be used for storing numbers of seconds, etc.

Programmers shall use these types, even if they know that ''time_t'' (as well as ''ssize_t'') is in fact a ''long int'' in the POSIX implementation they use. \\
This contributes to code portability and legibility.

The [[https://en.wikipedia.org/wiki/Strong_and_weak_typing|type system]] of C/C%%++%% checks types after resolving ''typedef''s, therefore variables of such 'types' won't even generate compiler warnings when used interchangeably.
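As a small illustration of the data types above, the following sketch stores the results of ''getpid'' and ''getuid'' in ''pid_t'' and ''uid_t'' variables. The helper name ''describe_process'' is made up for this example. Since ''printf''-style functions have no dedicated conversion for these types, the sketch casts them to ''long'' and ''unsigned long'', assuming these are wide enough on the target platform.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of POSIX): formats the process id and
   the user id of the calling process into buf.
   Returns what snprintf returns. */
int describe_process(char *buf, size_t size) {
    pid_t pid = getpid();   /* process identifier */
    uid_t uid = getuid();   /* real user identifier */
    /* Assumption: long / unsigned long can hold pid_t / uid_t values. */
    return snprintf(buf, size, "pid=%ld uid=%lu",
                    (long)pid, (unsigned long)uid);
}
```

A caller would pass a buffer of a few dozen bytes, e.g. ''char text[64]; describe_process(text, sizeof text);''.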
==== Return value conventions ====

**The majority of POSIX functions upon unsuccessful execution return ''-1'' and set the ''errno'' variable according to the failure reason.**

To access the ''errno'' (error number) variable, one shall ''#include <errno.h>''. \\
''errno'' is a [[https://en.cppreference.com/w/c/error/errno|part of C]] standard (since 1989). \\
Possible values of ''errno'' after executing some function are explained in the function's documentation. \\
All standard values of ''errno'' are documented [[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html|here]].

To get a human-readable error explanation of the number //errnum//, one can use:
  * ''char * strerror(int errnum)'' defined by [[https://en.cppreference.com/w/c/string/byte/strerror|C standard]], might not be thread-safe,
  * ''strerror_r'' that requires the programmer to provide a buffer for the message, and ''strerror_l'' that allows specifying the locale (i.e., the language); both are defined by [[https://pubs.opengroup.org/onlinepubs/9699919799/functions/strerror.html|POSIX]] and are thread-safe,
  * ''void perror(const char *//str//)'' a [[https://en.cppreference.com/w/c/io/perror|C standard]] function that always uses ''errno'' to obtain //errnum// and prints ''//str//: //explanation//'' to standard error (or just ''//explanation//'' if //str// is ''NULL'').

===== File descriptors =====

A program can (obviously) use multiple files concurrently. Upon each I/O operation (read, write, …) the programmer must indicate which file it should involve. To this end, the POSIX API assigns non-negative integers to each file used by the program. Such a number is called a **file descriptor**. \\
Notice that Unix introduced the idea that [[https://en.wikipedia.org/wiki/Everything_is_a_file|everything is a file]], and for POSIX a file might be just an ordinary file, but a directory is also a file, a pipe is also a file, a network connection is also a file, and so on.
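The return value convention and the notion of a file descriptor can be tried together. The sketch below calls ''write'' on a given descriptor and, on failure, reports the reason with ''strerror''. The helper name ''try_write'' is invented for this example; passing a number that is not an open descriptor (e.g. ''-1'') should make ''write'' fail with ''EBADF''.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of POSIX): writes text to fd.
   Returns 0 on success, or the errno value set by write on failure. */
int try_write(int fd, const char *text) {
    if (write(fd, text, strlen(text)) == -1) {
        int saved = errno;  /* copy errno at once: later calls may change it */
        fprintf(stderr, "write to descriptor %d failed: %s\n",
                fd, strerror(saved));
        return saved;
    }
    return 0;
}
```

Copying ''errno'' right after the failed call is a common precaution, because any subsequent library call is allowed to overwrite it.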
Internally, the operating system maintains for each process an array of open files. The file descriptor is an index in this array. \\
You can learn more [[https://en.wikipedia.org/wiki/File_descriptor|here]].

POSIX assumes that each newly started program already has three files open. These files are called the [[https://en.wikipedia.org/wiki/Standard_streams|standard streams]] and occupy the first three indices in the file array, that is the numbers **0**, **1** and **2**, or, more verbosely, the equivalent fancy constants ''STDIN_FILENO'', ''STDOUT_FILENO'' and ''STDERR_FILENO''.

===== Reading & writing data =====

To read or write data, POSIX defines: \\
Needs header:
unistd.h
''ssize_t **read** (int //fildes//,       void *//buf//, size_t //nbyte//)'' \\
''ssize_t **write**(int //fildes//, const void *//buf//, size_t //nbyte//)''

''//fildes//'' is the file descriptor. That is, ''//fildes//'' is a number that indicates which file should be read/written.

''//buf//'' is the location in memory where the data to be written is read from, or where the data read from the file should be written to.

''//nbyte//'' tells how many bytes shall be read/written. \\
Notice that ''//buf//'' must point to sufficient space.

**The functions return the number of bytes successfully read/written (unless an error occurred).** \\
**Both ''read'' and ''write'' may return fewer bytes than they were ordered to read/write.** \\
When the files are ordinary files, this usually means that either (upon read) the file has ended, or (upon write) that the disk is full.

The thread executing read/write blocks if the file is in the (default) blocking mode until the operation completes. \\
Two concurrent I/O function calls on an ordinary file are guaranteed to execute atomically with respect to each other.

Reading/writing advances the position in the file. \\
In some files (this includes ordinary files), one may change the position within the file with: \\
Needs header: unistd.h
''off_t **lseek**(int //fd//, off_t //offset//, int //whence//)''; \\
In this function, ''//fd//'' selects the file whose position should be changed, ''//whence//'' chooses whether the new position is given relative to the beginning of the file, the current position or the end of the file by, respectively, providing ''SEEK_SET'', ''SEEK_CUR'' and ''SEEK_END'', and finally ''//offset//'' chooses the offset from the chosen ''//whence//''.

**An attempt to read a file when the position is at (or beyond) the end of file returns 0.** \\
You must not confuse the return value 0 indicating the end of file in POSIX with the C file API constant called ''EOF'', which usually has the value of -1.

An example of a basic use of read/write on standard streams:
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main() {
    const char prompt[] = "Tell me your name: ";
    write(STDOUT_FILENO, prompt, strlen(prompt));

    char response[64];
    // read returns ssize_t: the number of bytes read, 0 at end of file, -1 on error
    ssize_t readBytes = read(STDIN_FILENO, response, sizeof(response));
    if (readBytes <= 0) {
        char msg[256];
        int len = snprintf(msg, sizeof(msg), "\nCould not read your name: %s\n",
                           readBytes == -1 ? strerror(errno) : "EOF reached");
        write(STDERR_FILENO, msg, len);
        return 1;
    }
    // 1 is the same descriptor as STDOUT_FILENO
    write(1, "Hello ", 6);
    write(1, response, readBytes);
    return 0;
}
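Because ''write'' may transfer fewer bytes than requested, code that has to output a whole buffer usually calls it in a loop. The following is a minimal sketch of such a loop; the helper name ''write_all'' is an assumption made for this example and is not a POSIX function. It also retries when ''write'' fails with ''EINTR'', that is, when the call was interrupted by a signal before any data was written.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of POSIX): keeps calling write until
   all nbyte bytes from buf have been written to fd.
   Returns 0 on success, -1 on error (errno is set by write). */
int write_all(int fd, const void *buf, size_t nbyte) {
    const char *position = buf;
    while (nbyte > 0) {
        ssize_t written = write(fd, position, nbyte);
        if (written == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;          /* a real error: give up */
        }
        position += written;    /* skip the part that is already written */
        nbyte -= (size_t)written;
    }
    return 0;
}
```

The same pattern applies to ''read'' when an exact number of bytes is expected, with the extra case that a return value of 0 means end of file.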
~~Exercise.#~~ Read standard input until the end of file, and write the read data to standard output. Test this both by using //Ctrl+d// to indicate end of file and by redirecting a file to the standard input. ~~Exercise.#~~ Read standard input until the end of file, and write the read data to standard output. When you reach end of file, use ''lseek'' to set position in the file to its beginning (that is, 0 bytes from ''SEEK_SET'') and repeat reading&writing. Test this by redirecting some file as standard input. ~~Exercise.#~~ Replace the standard output with a file descriptor of value ''4''. Test whether the program works if you tell the shell to open file number 4 for your program by doing a ''4>//file//'' redirection. ===== Creating, opening and closing files ===== To create or open a file, POSIX defines: \\ Needs header:
fcntl.h
''int **open**(const char *//pathname//, int //flags//)'' \\
''int **open**(const char *//pathname//, int //flags//, mode_t //mode//)'' \\
and a ''creat'' function that is a shorthand for ''open(//pathname//, O_WRONLY|O_CREAT|O_TRUNC, //mode//)''.

''//pathname//'' is a path.

''//mode//'' is used only if ''open'' creates a new file, and it defines its permissions. Either use an [[https://en.cppreference.com/w/c/language/integer_constant|octal]] number, or use the symbolic constants described in the manual.

''//flags//'' is a rat's nest.\\
**''//flags//'' must contain exactly one of the following** flags: ''**O_RDONLY**'', ''**O_WRONLY**'', or ''**O_RDWR**'' that choose whether the file is opened for reading, writing or both. \\
''//flags//'' may additionally contain other flags, including the following:
  * ''**O_APPEND**'' sets the file position to the end of the file before every write
  * ''**O_TRUNC**'' (shall be used only in conjunction with ''O_WRONLY'' or ''O_RDWR'') truncates the file (sets its size to 0)
  * ''**O_CREAT**'' tells ''open'' that if the file does not exist, it should be created
  * ''**O_EXCL**'' (shall be used only in conjunction with ''O_CREAT'') makes ''open'' fail when the file exists
  * and at least a dozen other flags

For example:
int fd1 = open("/tmp/foo", O_RDONLY);
if (fd1 == -1)
    perror("Opening /tmp/foo for reading failed");

int fd2 = open("/tmp/baz", O_WRONLY|O_APPEND);
if (fd2 == -1)
    perror("Opening /tmp/baz for appending (write-only) failed");

int fd3 = open("/tmp/bar", O_RDWR|O_CREAT|O_EXCL, 0600);
if (fd3 == -1)
    perror("Creating a new file /tmp/bar failed");
// if open succeeds, the file is open for reading and writing and has permissions of 0600

To close a file, POSIX defines: \\
Needs header: unistd.h
''int **close**(int //filedes//)'' \\
that closes the file number ''//filedes//''. \\
On Linux, invoking ''close(fd)'' always closes ''fd'', even if ''close'' returns ''-1''.

~~Exercise.#~~ Open a file with a hardcoded filename, read its contents and write them to standard output.

~~Exercise.#~~ Open a file specified as the first argument of your program, read its contents and write them to standard output.

~~Exercise.#~~ Open a file specified as the first argument of your program, read its contents and write them to standard output with line numbers (just like ''cat -n //file//''). \\
''memchr'' looks up a character (e.g., ''\n'') in memory.

~~Exercise.#~~ Implement a program that checks if two files have the same contents. \\
''memcmp'' compares two memory areas.

~~Exercise.#~~ Implement a program that works as ''paste [//file//]...''. \\
Hint: the dirty solution reads a single character at a time.

===== Signals =====

Disclaimer: these materials contain the very basics of signals. For comprehensive information, see the [[https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04|POSIX standard]].

==== Sending signals ====

To send a signal, one can use the function: \\
Needs header: signal.h
''int **kill**(pid_t //pid//, int //sig//)'' \\
This function works just like the shell ''kill'' utility.

==== Setting up signal handlers ====

To handle a signal delivered to the process, one can replace the default signal handler with a custom function.
\\ The function must return ''void'' and take an ''int'' argument, that is its prototype has to be as follows: \\ ''**void** //your_function_name//**(int** //the_number_of_the_signal_delivered//**)**;''
**The following paragraph explains C syntax for function pointers.** \\ Notice that in C, to represent the type of (pointer to) such function, one has to write ''void (*)(int)''. \\ To take (a pointer to) such function as a parameter (or declare a variable of this type), one has to write: \\ ''void (*//some_name//)(int)'' \\ So a function //some_function_1// that takes a ''void (*)(int)'' function as an argument and returns, say, ''float'', looks like this: \\ ''float //some_function_1//(void (*//some_name//)(int))'' \\ And a function ''//some_function_2(…)//'' that returns a function of ''void (*)(int)'' type is written as: \\ ''void (*//some_function_2//(…))(int)''
++++ Example code that uses function pointers |
int foo(char a) {return a;}
int baz(char a) {return a;}

int (* getFoo     (void)  )(char) {return foo;}
int (* getFooOrBaz(long w))(char) {if(w) return foo; return baz;}

int (*savedFunc)(char);              // a variable that stores (a pointer to) a function

int (*getFunc())(char){              // a function returning (a pointer to) a function
    return savedFunc;
}

void setFunc(int (*arg)(char)) {savedFunc = arg;}

int (*getOldSetNewFunc(int (*newFunc)(char)))(char) {
    int (*oldFunc)(char) = savedFunc;
    savedFunc = newFunc;
    return oldFunc;
}

void doSomething(){
    setFunc(&foo);                   // both lines are correct, a function name is an implicit
    setFunc(baz);                    // pointer to the function

    int result1 = savedFunc('a');    // calls the function

    int (*x)(char) = getFooOrBaz(0); // saves (a pointer to) a function returned by getFooOrBaz
    int result2 = x('a');            // calls the x (that points to baz) with the argument 'a'

    // getFooOrBaz(1) returns a function foo, then foo('a') is called and returns the result
    int result3 = getFooOrBaz(1)('a');

    // passes 'foo' (= result of getFooOrBaz) to getOldSetNewFunc
    // and calls the result of getOldSetNewFunc ('baz') with argument 'a'
    int result4 = getOldSetNewFunc(getFooOrBaz(1))('a');
}
++++
C%%++%% has MUCH better support for functional programming.
To replace a signal handler, one can use the following function: \\ Needs header: signal.h ''void (***signal**(int //sig//, void (*//func//)(int)))(int)''
That definition looks ugly, but hey, it's even a part of [[https://en.cppreference.com/w/c/program/signal|the C standard]]. \\ There's a way to make it look better: let's define a type that refers to functions taking an ''int'' and returning nothing:\\ ''typedef void (*sighandler_t)(int);''\\ Now, __the same__ ''signal'' function looks like this: \\ ''sighandler_t **signal**(int //signum//, sighandler_t //func//);'' \\ (This way of writing ''signal'' function is preferred in the [[https://man7.org/linux/man-pages/man2/signal.2.html|Linux manual]].)
The ''signal'' function replaces the handler for signal //signum// with the function //func//. A single function can be used for multiple signals, hence when the function is called, the signal number is passed as an argument. See the following example:
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

// Calling fprintf in a signal handler is undefined behaviour, hence this
// program is not correct. It is still probably going to work as expected
// in this simple case.
void handleSignals(int num) {
    if (num == SIGINT)
        fprintf(stderr, "You probably pressed Ctrl+c\n");
    else if (num == SIGTERM)
        fprintf(stderr, "You probably ran 'kill %ld'\n", (long)getpid());
}

int main() {
    signal(SIGINT, handleSignals);
    signal(SIGTERM, handleSignals);
    while (1)
        pause();
}

There are two special values for signal handlers: ''SIG_DFL'' and **''SIG_IGN''**. The former resets the signal handler to the default handler, and the latter ignores the signal. For instance, when one wants the program to ignore the HUP signal, one can simply write: \\
''signal(SIGHUP, SIG_IGN);''

==== Signal handlers ====

=== How they work ===

When a signal is delivered to a process by the operating system, the operating system stops one of the threads of the process, modifies the stack so that it looks as if the previous instruction called the signal handler, and sets the program counter (the CPU register that tells which instruction should be executed next) to the first instruction of the signal handler. \\
**This means that a signal handler can be executed at any, possibly completely inconvenient, moment.** \\
When the signal handler returns, the control flow returns to the place where it was interrupted by the signal. \\
By default (for handlers installed with ''signal'' on Linux with glibc), interrupted system calls restart: for instance, when a signal is received while waiting in ''read'' for input, then after returning from the signal handler the ''read'' is continued.

=== Restrictions ===

Since a signal handler can be called anytime, possibly in the middle of executing a complex action, only a subset of functions can be safely called from within the signal handler. Such functions are marked as **async-signal-safe** functions.
\\ The [[https://man7.org/linux/man-pages/man7/signal-safety.7.html|Linux manual]] and the [[https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03|POSIX standard]] provide a list of async-signal-safe functions.

Even worse, "anytime" means even in the middle of an assignment. Did you know that an innocent-looking ''i = 0;'' is split into two machine instructions if ''i'' is a ''long long int'' and you run this code on a 32-bit machine? \\
Thus, naturally, there are restrictions on accessing variables declared outside the signal handler from the handler code. \\
It is only guaranteed that variables of the ''volatile sig_atomic_t'' type (and [[https://en.cppreference.com/w/c/atomic/ATOMIC_LOCK_FREE_consts|lock-free atomic variables]]) can be safely accessed. \\
One may also use ''errno'' provided one saves it beforehand and restores its previous value after use.

**These restrictions are not enforced in any way.** If you don't follow these rules, you end up in the broad field of undefined behaviour.
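A handler that respects these restrictions could look like the sketch below. It only calls ''write'' (which is on the list of async-signal-safe functions), only touches a ''volatile sig_atomic_t'' variable, and saves and restores ''errno''. The names ''got_usr1'', ''on_usr1'', ''install_usr1_handler'' and ''usr1_seen'' are invented for this example, and the choice of ''SIGUSR1'' is arbitrary.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* The only kind of shared variable that is guaranteed to be safe here. */
static volatile sig_atomic_t got_usr1 = 0;

/* Hypothetical handler restricted to async-signal-safe operations. */
static void on_usr1(int signum) {
    (void)signum;                       /* unused in this sketch */
    int saved_errno = errno;            /* write may modify errno */
    static const char msg[] = "got SIGUSR1\n";
    /* write is async-signal-safe, unlike fprintf */
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;                      /* nothing safe to do on failure */
    got_usr1 = 1;                       /* the main code polls this flag */
    errno = saved_errno;                /* restore errno before returning */
}

/* Hypothetical helper: installs the handler, returns 0 on success. */
int install_usr1_handler(void) {
    return signal(SIGUSR1, on_usr1) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper: tells whether the handler has run already. */
int usr1_seen(void) {
    return got_usr1 != 0;
}
```

The design choice here is to do as little as possible inside the handler and leave the actual work to the main code, which checks the flag at a moment of its own choosing.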
==== [Extra] ====

To control actions done by the operating system on signal delivery more accurately, one can use the ''sigaction'' function. \\
''sigaction'' can also be used to set up a signal handler that gets more information on the incoming signal. \\
Delivery of signals can be blocked (and later unblocked), e.g., by calling ''sigprocmask''. \\
Signal handlers can have their own list of blocked signals (to prevent nesting signals). \\
There are a number of functions that wait until a signal occurs, e.g., ''sigwait'', ''sigsuspend'', ''pause''. \\
''kill'' delivers a signal to an arbitrary thread of a given process. \\
''raise'' delivers a signal to the current thread. ''pthread_kill'' delivers a signal to a specified thread of the current process. \\
While there is no POSIX function that delivers a signal to a specified thread of another process, there is a Linux-specific ''tgkill'' function that does that.
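The following sketch shows how ''sigaction'' and ''sigprocmask'' might be combined. It installs a handler with its own list of blocked signals, and then shows that a blocked signal stays pending until it is unblocked. All function and variable names are made up for this example, and blocking ''SIGTERM'' inside the handler is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t last_signal = 0;

/* Hypothetical handler: only remembers which signal arrived. */
static void remember_signal(int signum) {
    last_signal = signum;
}

/* Hypothetical helper: installs remember_signal for signum.
   Returns what sigaction returns (0 on success, -1 on error). */
int install_with_sigaction(int signum) {
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = remember_signal;
    sigemptyset(&action.sa_mask);
    sigaddset(&action.sa_mask, SIGTERM); /* SIGTERM is blocked while the handler runs */
    action.sa_flags = SA_RESTART;        /* ask for interrupted calls to be restarted */
    return sigaction(signum, &action, NULL);
}

/* Hypothetical helper: blocks signum, raises it and unblocks it again.
   Returns 1 if the handler ran while the signal was blocked (it should not),
   0 if it did not, -1 on error. */
int handler_ran_while_blocked(int signum) {
    sigset_t blocked, previous;
    sigemptyset(&blocked);
    sigaddset(&blocked, signum);
    last_signal = 0;
    if (sigprocmask(SIG_BLOCK, &blocked, &previous) == -1)
        return -1;
    raise(signum);                       /* the signal becomes pending */
    int ran_early = (last_signal != 0);
    /* restoring the old mask lets the pending signal be delivered */
    if (sigprocmask(SIG_SETMASK, &previous, NULL) == -1)
        return -1;
    return ran_early;
}
```

In a multi-threaded program ''pthread_sigmask'' would be the function to use instead of ''sigprocmask''; the sketch assumes a single thread.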
==== Exercises ====

~~Exercise.#~~ Write a program that prints ''Shutting down...'' and exits when it receives SIGINT (Ctrl+c). \\
The program should either sleep (''sleep''), or wait for an input (''read'' / ''getchar''), or explicitly wait for a signal (''pause'').

~~Exercise.#~~ Write a program that on getting a USR1 signal: \\
    • opens a file called ''signal_log'', \\
    • appends the current [[https://en.wikipedia.org/wiki/Unix_time|timestamp]] to it, \\
    • closes the file. \\
Use the following code to write the current timestamp:

#include <time.h>
#include <unistd.h>

void writeDateTo(int fd) {
    struct timespec now;
    char buf[22];
    clock_gettime(CLOCK_REALTIME, &now);
    buf[20] = '\n';
    // ten digits are written here, the leading one is then overwritten by the dot
    for (int i = 0; i < 10; ++i, now.tv_nsec /= 10)
        buf[19 - i] = '0' + now.tv_nsec % 10;
    buf[10] = '.';
    for (int i = 0; i < 10; ++i, now.tv_sec /= 10)
        buf[9 - i] = '0' + now.tv_sec % 10;
    write(fd, buf, 21);
}

~~META: language = en ~~