POSIX, the Portable Operating System Interface standard, not only standardises the shell and utilities but also defines functions, macros, and external variables to support application portability at the C-language source level.
POSIX specifies what should be present in the C standard libraries (e.g., stdio.h or stdlib.h), and also specifies the user-level API to the operating system (the two do intersect).
Libraries required by POSIX are neatly summarised on the C POSIX library Wikipedia page.
The unistd.h header declares a number of basic constants and functions for interfacing with the operating system.
The fcntl.h header declares constants and functions for file control.
POSIX defines certain data types that are usually fancy names for certain integer types.
For instance, ssize_t shall be used to store the signed size of arbitrary data, pid_t shall be used to store process identifiers, uid_t shall be used for user identifiers, time_t shall be used for storing numbers of seconds, etc.
Programmers shall use these types, even if they know that time_t (as well as ssize_t) is in fact a long int in the POSIX implementation they use.
This contributes to code portability and legibility.
The type system of C/C++ checks types after resolving typedefs, so variables of such 'types' won't even generate compiler warnings when used interchangeably.
The majority of POSIX functions return -1 upon unsuccessful execution and set the errno variable according to the failure reason.
To access the errno (error number) variable, one shall #include <errno.h>.
errno is part of the C standard (since 1989).
Possible values of errno after executing some function are explained in the function's documentation.
All standard values of errno are documented here.
To get a human-readable explanation of the error number errnum, one can use:
• char *strerror(int errnum), defined by the C standard; it might not be thread-safe,
• strerror_r, which requires the programmer to provide a buffer for the message, and strerror_l, which allows specifying the locale (= language); both are defined by POSIX and are thread-safe,
• void perror(const char *str), a C standard function that always uses errno to obtain errnum and prints str: explanation to standard error (or just explanation if str is NULL).
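The error-reporting functions listed above can be compared side by side in a small sketch. The function name report_failed_close is made up for this illustration, and the failing call (closing the invalid descriptor -1) is only a convenient way to provoke an error; any failing POSIX call would do.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: provokes a failure by closing an invalid descriptor,
 * reports it in two ways, and returns the errno value left by the call. */
int report_failed_close(void) {
    if (close(-1) == -1) {
        int saved = errno;   /* copy first: later library calls may change errno */

        /* strerror: we format the message ourselves */
        fprintf(stderr, "close failed: %s (errno %d)\n", strerror(saved), saved);

        /* perror: reads errno itself and prints "close failed: <explanation>" */
        errno = saved;
        perror("close failed");

        return saved;
    }
    return 0;
}
```

Copying errno into a local variable right after the failing call is the important habit here: fprintf (or any other library call) is allowed to overwrite it.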
A program can (obviously) use multiple files concurrently.
Upon each I/O operation (read, write, …) the programmer must indicate which
file it should involve. To this end, the POSIX API assigns a non-negative integer
to each file used by the program. Such a number is called a file descriptor.
Notice that Unix introduced the idea that everything is a file,
and for POSIX a file might be just an ordinary file, but a directory is also a file,
a pipe is also a file, a network connection is also a file, and so on.
Internally, the operating system maintains for each process an array of open
files. The file descriptor is an index in this array.
You can learn more here.
POSIX assumes that each newly started program already has three files open. These files are called the standard streams and occupy the first three indices in the file array, that is, the numbers 0, 1 and 2 or, more verbosely, the equivalent fancy constants STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO.
To read or write data, POSIX defines:
Needs header: unistd.h
ssize_t read (int fildes, void *buf, size_t nbyte)
ssize_t write(int fildes, const void *buf, size_t nbyte)
• fildes is the file descriptor. That is, fildes is a number that indicates which file should be read/written.
• buf is the location in memory where the data to be written is read from, or where the data read from the file should be written to.
• nbyte tells how many bytes shall be read/written. Notice that buf must point to sufficient space.
The functions return the number of bytes successfully read/written (unless an error occurred).
Both read and write may return fewer bytes than they were asked to read/write.
When the files are ordinary files, this usually means that either (upon read) the file has ended, or (upon write) that the disk is full.
The thread executing read/write blocks until the operation completes if the file is in the (default) blocking mode.
For regular files, POSIX requires concurrent I/O function calls to execute atomically with respect to each other.
Reading/writing advances the position in the file.
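Because write may transfer fewer bytes than requested, code that must output a whole buffer usually calls it in a loop. Below is a minimal sketch of such a loop; the name write_all is an assumption of this example (it is not a POSIX function), and retrying on EINTR is one common policy rather than the only correct one.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: writes exactly `count` bytes from `data` to `fd`,
 * retrying after short writes. Returns 0 on success and -1 on error
 * (errno is then the one set by write). */
int write_all(int fd, const void *data, size_t count) {
    const char *next = data;
    while (count > 0) {
        ssize_t written = write(fd, next, count);
        if (written == -1) {
            if (errno == EINTR)
                continue;          /* interrupted before writing anything: retry */
            return -1;
        }
        next += written;           /* skip the part that has already been written */
        count -= (size_t)written;
    }
    return 0;
}
```

For instance, write_all(STDOUT_FILENO, "Hello\n", 6) either outputs all six bytes or reports an error.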
In some files (this includes ordinary files), one may change the position within the file with:
Needs header: unistd.h
off_t lseek(int fd, off_t offset, int whence);
In this function, fd selects the file whose position should be changed, whence chooses whether the new position is given relative to the beginning of the file, the current position or the end of the file by, respectively, providing SEEK_SET, SEEK_CUR and SEEK_END, and finally offset chooses the offset from the chosen whence.
An attempt to read a file when the position is at (or beyond) the end of file returns 0.
You must not confuse the value 0 that indicates the end of file in POSIX with the C stdio constant called EOF, which is negative (typically -1).
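Since lseek returns the resulting position, it can also be used to query a file without reading it. The sketch below (the name file_size is made up for this example) asks for the current position, jumps to the end to learn the size, and then restores the position. It assumes that fd refers to a seekable file; for a pipe or a terminal lseek fails and the function returns -1.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the size of the file behind `fd` without
 * changing the current position, or -1 if the file cannot be seeked. */
off_t file_size(int fd) {
    off_t current = lseek(fd, 0, SEEK_CUR);        /* 0 bytes from here: "where am I?" */
    if (current == (off_t)-1)
        return -1;

    off_t end = lseek(fd, 0, SEEK_END);            /* position of the end = file size */
    if (end == (off_t)-1)
        return -1;

    if (lseek(fd, current, SEEK_SET) == (off_t)-1) /* go back to where we were */
        return -1;
    return end;
}
```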
An example of a basic use of read/write on standard streams:
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main() {
    const char prompt[] = "Tell me your name: ";
    write(STDOUT_FILENO, prompt, strlen(prompt));

    char response[64];
    // read returns ssize_t: the number of bytes read, 0 at end of file, -1 on error
    ssize_t readBytes = read(STDIN_FILENO, response, sizeof(response));
    if (readBytes <= 0) {
        char msg[256];
        int len = snprintf(msg, sizeof(msg), "\nCould not read your name: %s\n",
                           readBytes == -1 ? strerror(errno) : "EOF reached");
        write(STDERR_FILENO, msg, len);
        return 1;
    }

    write(STDOUT_FILENO, "Hello ", 6);
    write(STDOUT_FILENO, response, readBytes);
    return 0;
}
Exercise 1 Read standard input until the end of file, and write the read data to standard output. Test this both by using Ctrl+d to indicate end of file and by redirecting a file to the standard input.
Exercise 2 Read standard input until the end of file, and write the read data to standard output. When you reach the end of file, use lseek to set the position in the file to its beginning (that is, 0 bytes from SEEK_SET) and repeat reading and writing. Test this by redirecting some file as standard input.
Exercise 3 Replace the standard output with a file descriptor of value 4. Test whether the program works if you tell the shell to open file number 4 for your program by doing a 4>file redirection.
To create or open a file, POSIX defines:
Needs header: fcntl.h
int open(const char *pathname, int flags)
int open(const char *pathname, int flags, mode_t mode)
and a creat function that is a shorthand for open(pathname, O_WRONLY|O_CREAT|O_TRUNC, mode).
• pathname is a path.
• mode is used only if open creates a new file, and it defines its permissions. Either use an octal number, or use the symbolic constants described in the manual.
• flags is a rat's nest.
flags must contain exactly one of the following flags: O_RDONLY, O_WRONLY, or O_RDWR, which choose whether the file is opened for reading, writing or both.
flags may additionally contain other flags, including the following:
• O_APPEND sets the file position to the end of the file before every write,
• O_TRUNC (shall be used only in conjunction with O_WRONLY or O_RDWR) truncates the file (sets its size to 0),
• O_CREAT tells open that if the file does not exist, it should be created,
• O_EXCL (shall be used only in conjunction with O_CREAT) makes open fail when the file exists.
For example:
int fd1 = open("/tmp/foo", O_RDONLY);
if (fd1 == -1)
    perror("Opening /tmp/foo for reading failed");

int fd2 = open("/tmp/baz", O_WRONLY|O_APPEND);
if (fd2 == -1)
    perror("Opening /tmp/baz for appending (write-only) failed");

int fd3 = open("/tmp/bar", O_RDWR|O_CREAT|O_EXCL, 0600);
if (fd3 == -1)
    perror("Creating a new file /tmp/bar failed");
// if open succeeds, the file is open for reading and writing and has permissions of 0600
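The octal mode 0600 can equally be spelled with the symbolic constants from sys/stat.h. The following sketch combines them with O_CREAT|O_EXCL; the function name create_private_file is invented for this example, and the permissions actually granted may be further restricted by the process umask.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates `path` for writing, readable and writable by
 * its owner only (S_IRUSR|S_IWUSR is the symbolic spelling of 0600).
 * Because of O_EXCL, the call fails with errno == EEXIST when the file
 * already exists, so an existing file is never overwritten by accident. */
int create_private_file(const char *path) {
    return open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}
```

A caller checks the result as usual: a return value of -1 with errno equal to EEXIST means that somebody else created the file first.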
To close a file, POSIX defines:
Needs header: unistd.h
int close(int filedes)
that closes the file number filedes.
On Linux, invoking close(fd) always closes fd, even if close returns -1.
Exercise 4 Open a file with hardcoded filename, read its contents and write it to standard output.
Exercise 5 Open a file specified as the first argument of your program, read its contents and write it to standard output.
Exercise 6 Open a file specified as the first argument of your program, read its contents and write it to standard output with line numbers (just like cat -n file).
memchr looks up a character (e.g., \n) in memory.
Exercise 7 Implement a program that checks if two files have the same contents.
memcmp compares two memory areas.
Exercise 8 Implement a program that works as paste [file]...
Hint: the dirty solution reads a single character at a time.
Disclaimer: these materials contain the very basics of signals. For comprehensive information, see the POSIX standard.
To send a signal, one can use the function:
Needs header: signal.h
int kill(pid_t pid, int sig)
This function works just like the shell kill utility.
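Besides real signals, kill accepts the signal number 0: in that case nothing is sent and only the error checking is performed. The sketch below uses this to test whether a process exists. The name process_exists is made up for this example, and the answer is only a snapshot, since the process may terminate right after the check.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if a process with the given pid exists,
 * 0 if it does not, and -1 on any other error. */
int process_exists(pid_t pid) {
    if (kill(pid, 0) == 0)
        return 1;       /* we would be allowed to signal it, so it exists */
    if (errno == EPERM)
        return 1;       /* it exists, but we lack permission to signal it */
    if (errno == ESRCH)
        return 0;       /* no such process */
    return -1;
}
```

Sending an actual signal looks the same, e.g. kill(pid, SIGTERM), with the return value checked in the same way.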
To handle a signal delivered to the process, one can replace the default signal
handler with a custom function.
The function must return void and take an int argument; that is, its prototype has to be as follows:
void your_function_name(int the_number_of_the_signal_delivered);
Notice that in C, to represent the type of (a pointer to) such a function, one has to write void (*)(int).
To take (a pointer to) such a function as a parameter (or declare a variable of this type), one has to write:
void (*some_name)(int)
So a function some_function_1 that takes a void (*)(int) function as an argument and returns, say, float, looks like this:
float some_function_1(void (*some_name)(int))
And a function some_function_2(…) that returns a function of void (*)(int) type is written as:
void (*some_function_2(…))(int)
An example code that uses function pointers:
int foo(char a) {return a;}
int baz(char a) {return a;}

int (* getFoo     (void)  )(char) {return foo;}
int (* getFooOrBaz(long w))(char) {if(w) return foo; return baz;}

int (*savedFunc)(char);    // a variable that stores (a pointer to) a function

int (*getFunc())(char){    // a function returning (a pointer to) a function
    return savedFunc;
}

void setFunc(int (*arg)(char)) {savedFunc = arg;}

int (*getOldSetNewFunc(int (*newFunc)(char)))(char) {
    int (*oldFunc)(char) = savedFunc;
    savedFunc = newFunc;
    return oldFunc;
}

void doSomething(){
    setFunc(&foo);  // both lines are correct, a function name is an implicit
    setFunc(baz);   // pointer to the function

    int result1 = savedFunc('a');     // calls the function

    int (*x)(char) = getFooOrBaz(0);  // saves (a pointer to) a function returned by getFooOrBaz
    int result2 = x('a');             // calls the x (that points to baz) with the argument 'a'

    // getFooOrBaz(1) returns a function foo, then foo('a') is called and returns the result
    int result3 = getFooOrBaz(1)('a');

    // passes 'foo' (= result of getFooOrBaz) to getOldSetNewFunc
    // and calls the result of getOldSetNewFunc ('baz') with argument 'a'
    int result4 = getOldSetNewFunc(getFooOrBaz(1))('a');
}
To replace a signal handler, one can use the following function:
Needs header: signal.h
void (*signal(int sig, void (*func)(int)))(int)
There's a way to make it look better: let's define a type that refers to
functions taking an int
and returning nothing:
typedef void (*sighandler_t)(int);
Now, the same signal
function looks like this:
sighandler_t signal(int signum, sighandler_t func);
(This way of writing signal
function is preferred in the Linux manual.)
The signal
function replaces the handler for signal signum with the
function func. A single function can be used for multiple signals, hence
when the function is called, the signal number is passed as an argument.
See the following example:
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

// An fprintf in a signal handler is undefined behaviour, hence this program
// is not correct. It's still probably going to work as expected in this
// simple case.
void handleSignals(int num) {
    if (num == SIGINT)
        fprintf(stderr, "You probably pressed Ctrl+c\n");
    else if (num == SIGTERM)
        fprintf(stderr, "You probably ran 'kill %d'\n", (int)getpid());
}

int main() {
    signal(SIGINT, handleSignals);
    signal(SIGTERM, handleSignals);
    while (1)
        pause();
}
There are two special values for signal handlers: SIG_DFL and SIG_IGN.
The former resets the signal handler to the default handler, and the latter ignores the signal.
For instance, when one wants the program to ignore the HUP signal, one can simply write:
signal(SIGHUP, SIG_IGN);
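The value returned by signal is the previous handler, which makes it possible to ignore a signal only for a while and then put the old behaviour back. A minimal sketch of this pattern follows; the function name survive_signal_once is invented for this example, and raise (which sends a signal to the caller itself) is used only to make the effect visible.

```c
#include <signal.h>

/* Hypothetical helper: ignores `signum`, sends it to the current process,
 * then restores whatever handler was installed before.
 * Returns 0 on success and -1 if signal() refused the change
 * (as it does, e.g., for SIGKILL, which cannot be ignored). */
int survive_signal_once(int signum) {
    void (*previous)(int) = signal(signum, SIG_IGN);
    if (previous == SIG_ERR)
        return -1;

    raise(signum);                  /* ignored: the process keeps running */

    if (signal(signum, previous) == SIG_ERR)
        return -1;
    return 0;
}
```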
When a signal is delivered to a process by the operating system, the operating
system stops one of the threads of the process, modifies the stack so that it
looks as if the previous instruction called the signal handler, and sets the
program counter (the CPU register that tells which instruction should be
executed next) to the first instruction of the signal handler.
This means that a signal handler can be executed in any, possibly completely
inconvenient, moment.
When the signal handler returns, the control flow returns to the place where
it's been interrupted by the signal.
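By default, interrupted system calls restart: for instance, when a signal is received while waiting in read for input, then after returning from the signal handler the read is continued. (This is what typically happens for handlers installed with signal on Linux with glibc; a handler installed with sigaction without the SA_RESTART flag makes the interrupted call fail with errno set to EINTR instead.)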
Since a signal handler can be called anytime, possibly in the middle of executing
a complex action, only a subset of functions can be safely called from within
the signal handler. Such functions are marked as async-signal-safe functions.
The Linux manual and the POSIX standard
provide a list of async-signal-safe functions.
Even worse, "anytime" means even in the middle of an assignment: did you know that an innocent-looking i = 0; is split into two machine instructions if i is a long long int and you run this code on a 32-bit machine?
Thus, naturally, there are restrictions on accessing variables declared outside
the signal handler from the handler code.
It is only guaranteed that variables of the volatile sig_atomic_t
type
(and lock-free atomic variables)
can be safely accessed.
One may also use errno
provided one saves it beforehand and restores its
previous value after use.
These restrictions are not enforced in any way. If you don't follow these rules, you end up in the broad field of undefined behaviour.
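A handler that stays within these rules typically does very little: it sets a flag of type volatile sig_atomic_t, calls at most a few async-signal-safe functions such as write, and preserves errno. The sketch below shows this pattern; all names in it (signal_seen, note_signal, install_note_handler, take_signal_seen) are made up for this example.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* The only variable shared between the handler and the rest of the program. */
static volatile sig_atomic_t signal_seen = 0;

static void note_signal(int signum) {
    int saved_errno = errno;        /* write() below may overwrite errno */
    static const char msg[] = "signal received\n";
    (void)signum;

    /* write is async-signal-safe, unlike printf/fprintf */
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;

    signal_seen = 1;                /* a plain store to volatile sig_atomic_t */
    errno = saved_errno;            /* restore errno for the interrupted code */
}

/* Installs the handler; returns 0 on success, -1 on failure. */
int install_note_handler(int signum) {
    return signal(signum, note_signal) == SIG_ERR ? -1 : 0;
}

/* Called from ordinary code: reports whether a signal arrived and clears
 * the flag. */
int take_signal_seen(void) {
    int seen = signal_seen;
    signal_seen = 0;
    return seen;
}
```

The real work (logging, cleaning up, exiting) is then done in ordinary code that polls take_signal_seen(), where all library functions may be used freely.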
To control the actions done by the operating system on signal delivery more accurately, one can use the sigaction function.
sigaction can also be used to set up a signal handler that gets more information on the incoming signal.
Delivery of signals can be blocked (and later unblocked), e.g., by calling sigprocmask.
Signal handlers can have their own list of blocked signals (to prevent nesting signals).
There are a number of functions that wait until a signal occurs, e.g., sigwait, sigsuspend, pause.
kill delivers a signal to an arbitrary thread of a given process. raise delivers a signal to the current thread. pthread_kill delivers a signal to a specified thread of the current process.
While there is no POSIX function that delivers a signal to a specified thread of another process, there is a Linux-specific tgkill function that does that.
Exercise 9 Write a program that prints Shutting down... and exits when it receives SIGINT (Ctrl+c).
The program should either sleep (sleep), or wait for an input (read / getchar), or explicitly wait for a signal (pause).
Exercise 10 Write a program that on getting a USR1 signal:
• opens a file called signal_log,
• appends the current timestamp to it,
• closes the file.
Use the following code to write the current timestamp:
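#include <time.h>
#include <unistd.h>

// Writes the current time to fd as 10 digits of seconds, a dot,
// 9 digits of nanoseconds and a newline (21 bytes in total).
void writeDateTo(int fd) {
    struct timespec now;
    char buf[22];
    clock_gettime(CLOCK_REALTIME, &now);
    buf[20] = '\n';
    for (int i = 0; i < 10; ++i, now.tv_nsec /= 10)
        buf[19 - i] = '0' + now.tv_nsec % 10;
    buf[10] = '.';  // overwrites the (always zero) tenth nanosecond digit
    for (int i = 0; i < 10; ++i, now.tv_sec /= 10)
        buf[9 - i] = '0' + now.tv_sec % 10;
    write(fd, buf, 21);
}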