Differences between the selected revision and the current revision.
Both sides previous revision Previous revision Next revision | Previous revision | ||
os_cp:open [2023/04/05 12:34] jkonczak [Creating, opening and closing files] |
— (current) | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===== Warm-up ===== | ||
- | |||
- | ~~Exercise.#~~ Create a program, compile it under the name //prog//. | ||
- | |||
- | ~~Exercise.#~~ Print a static text with: | ||
- | - C standard I/O library (''#include <stdio.h>''): | ||
- | - ''printf'', | ||
- | - ''fprintf'', | ||
- | - POSIX ''write'' function (see ''man 3p write'') that is made available by ''#include <unistd.h>'' \\ For now, use the following syntax: ''write(STDOUT_FILENO, //pointer_to_data//, //number_of_bytes_to_write//);'' | ||
- | |||
- | ~~Exercise.#~~ Print the first argument (== the one that has number 1) using the ''write'' function. | ||
- | \\ <small>The ''strlen'' function from the ''string.h'' library calculates c-__str__ing (== [[https://en.wikipedia.org/wiki/Null-terminated_string|null-terminated string]]) __len__gth.</small> | ||
- | | ||
- | ~~Exercise.#~~ Print the first argument omitting its first three bytes. | ||
- | |||
- | ~~Exercise.#~~ Print all arguments (including the program name) using the ''write'' function. | ||
- | |||
- | ~~Exercise.#~~ Copy the first argument to an automatic (== on stack) array of pre-defined size, then display it. | ||
- | \\ <small>The ''strncpy'' function from the ''string.h'' library copies data up to the first null byte, but no more than //n// characters. </small> | ||
- | |||
- | ~~Exercise.#~~ Copy the first argument to a manually allocated (== on heap) array of a correct size, then display it. | ||
- | \\ <small>''malloc'' (and ''calloc'') from ''stdlib.h'' allocates memory, and ''free'' releases it. \\ ''memcpy'' (or ''memmove'') from ''string.h'' copies data.</small> | ||
- | |||
- | ~~Exercise.#~~ Output the first argument in uppercase. | ||
- | \\ | ||
- | <small>The ''toupper'' function from ''ctype.h'' converts a single character to upper case.</small> | ||
- | |||
- | ~~Exercise.#~~ Read some text from standard input (and then display it), using the following functions: | ||
- | - C standard I/O library: | ||
- | - ''scanf'', | ||
- | - ''fscanf'', | ||
- | - POSIX ''read'' function from ''unistd.h''. For now, use the following syntax: \\ ''int //actual_number_of_read_bytes// = read(STDIN_FILENO, //pointer_where_to_store_read_data//, //max_number_of_bytes_to_read//);'' | ||
- | |||
- | ===== POSIX ===== | ||
- | |||
- | POSIX (the Portable Operating System Interface standard), apart from standardising the shell and utilities, also defines functions, macros, and external variables to support application portability at the C-language source level. | ||
- | |||
- | POSIX [[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/contents.html|specifies]] what should be present in the [[https://en.cppreference.com/w/c/header|C standard libraries]] (e.g., ''stdio.h'' or ''stdlib.h''), and also specifies user-level API to the operating system (those two do intersect). | ||
- | |||
- | Libraries required by POSIX are cleanly summarised on the [[https://en.wikipedia.org/wiki/C_POSIX_library|C POSIX library]] Wikipedia page. | ||
- | |||
- | The ''[[https://en.wikipedia.org/wiki/Unistd.h|unistd.h]]'' library defines a number of basic constants and functions for interfacing with the operating system. | ||
- | \\ | ||
- | The ''[[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/fcntl.h.html|fcntl.h]]'' library defines constants and functions for __f__ile __c__o__nt__ro__l__. | ||
- | |||
- | ==== Data types ==== | ||
- | |||
- | POSIX [[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html|defines]] | ||
- | certain data types that are usually fancy names for certain integer types. | ||
- | |||
- | For instance, ''ssize_t'' shall be used to store signed size of arbitrary data, | ||
- | ''pid_t'' shall be used to store process identifiers, ''uid_t'' shall be used | ||
- | for user identifiers, ''time_t'' shall be used for storing numbers of seconds etc. | ||
- | |||
- | Programmers shall use these types, even if they know that ''time_t'' (as well as | ||
- | ''ssize_t'') is in fact a ''long int'' in the POSIX implementation they use. | ||
- | \\ | ||
- | This contributes to code portability and legibility. | ||
- | |||
- | The [[https://en.wikipedia.org/wiki/Strong_and_weak_typing|type system]] of C/C++ | ||
- | checks types after resolving ''typedef''s, therefore variables of such 'types' | ||
- | won't even generate compiler warnings when used interchangeably. | ||
- | |||
- | ==== Return value conventions ==== | ||
- | |||
- | **The majority of POSIX functions upon unsuccessful execution return ''-1'' and | ||
- | set the ''errno'' variable according to the failure reason.** | ||
- | |||
- | To access the ''errno'' (error number) variable, one shall ''#include <errno.h>''. | ||
- | \\ | ||
- | ''errno'' is [[https://en.cppreference.com/w/c/error/errno|part of the C standard]] (since 1989). | ||
- | \\ | ||
- | Possible values of ''errno'' after executing some function are explained in the | ||
- | function's documentation. | ||
- | \\ | ||
- | <small> | ||
- | All standard values of ''errno'' are documented [[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html|here]]. | ||
- | </small> | ||
- | |||
- | To get a human-readable error explanation of the number //errnum//, one can use: | ||
- | * ''char * strerror(int errnum)'' defined by [[https://en.cppreference.com/w/c/string/byte/strerror|C standard]], might not be thread-safe, | ||
- | * ''strerror_r'' that requires the programmer to provide a buffer for the message, and ''strerror_l'' that allows specifying the locale (= language); both are defined by [[https://pubs.opengroup.org/onlinepubs/9699919799/functions/strerror.html|POSIX]] and are thread-safe, | ||
- | * ''void perror(const char *//str//)'' a [[https://en.cppreference.com/w/c/io/perror|C standard]] function that always uses ''errno'' to obtain //errnum// and prints ''//str//: //explanation//'' to standard error (or just ''//explanation//'' if //str// is ''NULL''). | ||
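The error-reporting functions listed above can be combined as in the following minimal sketch. The helper name ''report_open_error'' and the paths passed to it are hypothetical and serve only as an illustration; the sketch saves ''errno'' right after the failing call because any later library call is allowed to change it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of POSIX): tries to open `path` read-only.
 * Returns 0 on success, or the errno value set by open() on failure. */
int report_open_error(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int saved = errno;   /* keep a copy: later calls may overwrite errno */

        /* Variant 1: build the message by hand with strerror(). */
        fprintf(stderr, "%s: %s\n", path, strerror(saved));

        /* Variant 2: let perror() read errno and print "<path>: <explanation>". */
        errno = saved;
        perror(path);

        return saved;
    }
    close(fd);
    return 0;
}
```

For a path that does not exist, both variants print the explanation that corresponds to ''ENOENT''.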
- | |||
- | ===== Reading & writing data ===== | ||
- | |||
- | To read or write data, POSIX defines: \\ | ||
- | <html><a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html"></html> | ||
- | ''ssize_t **read** (int //fildes//, void *//buf//, size_t //nbyte//)'' | ||
- | <html></a></html> | ||
- | \\ | ||
- | <html><a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html"></html> | ||
- | ''ssize_t **write**(int //fildes//, const void *//buf//, size_t //nbyte//)'' | ||
- | <html></a></html> | ||
- | |||
- | ''//fildes//'' is the file descriptor. The operating system maintains for each | ||
- | process an array of open files. A file descriptor is an index in this array. | ||
- | Functions that open files return this number. | ||
- | \\ | ||
- | For the standard streams (which are assumed to be open upon start) one may use | ||
- | numbers 0, 1 and 2, or, more verbosely, the equivalent fancy constants | ||
- | ''STDIN_FILENO'', ''STDOUT_FILENO'' and ''STDERR_FILENO''. | ||
- | |||
- | ''//buf//'' is the location in the memory where the data to be written is | ||
- | read from, or where the data read from file should be written to. | ||
- | |||
- | ''//nbyte//'' tells how many bytes shall be read/written. | ||
- | \\ | ||
- | Notice that ''//buf//'' must point to sufficient space. | ||
- | |||
- | The functions return the number of bytes successfully read/written (unless an error | ||
- | occurred). | ||
- | \\ | ||
- | **Both ''read'' and ''write'' may return fewer bytes than they were ordered to | ||
- | read/write.** | ||
- | \\ | ||
- | When the files are ordinary files, this usually means that either (upon | ||
- | read) the file has ended, or (upon write) that the disk is full. | ||
- | |||
- | The thread executing read/write blocks <small>if the file is in the (default) | ||
- | blocking mode</small> until the operation completes. | ||
- | \\ | ||
- | Two concurrent I/O function calls on the same regular file are guaranteed to execute atomically with respect to each other. | ||
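Because ''write'' may accept fewer bytes than requested, code that must write a whole buffer typically loops until everything is written. Below is a minimal sketch of such a loop; the name ''write_all'' is hypothetical, and the retry on ''EINTR'' (a call interrupted by a signal) is an addition that the text above does not discuss.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: calls write() repeatedly until all `count` bytes
 * of `buf` have been written to `fd`.
 * Returns 0 on success, or -1 on error (errno is left as set by write()). */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n == -1) {
            if (errno == EINTR)
                continue;        /* interrupted before writing anything: retry */
            return -1;           /* real error */
        }
        p += n;                  /* skip the part that was already written */
        count -= (size_t)n;
    }
    return 0;
}
```

A call such as ''write_all(STDOUT_FILENO, text, strlen(text))'' then either writes the whole text or reports an error.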
- | |||
- | Reading/writing advances the position in the file. | ||
- | \\ | ||
- | In some files (this includes ordinary files), one may change the position within the file with: | ||
- | \\ | ||
- | <html><a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/lseek.html"></html> | ||
- | ''off_t **lseek**(int //fd//, off_t //offset//, int //whence//)''; | ||
- | <html></a></html> | ||
- | \\ | ||
- | In this function, ''//fd//'' selects the file whose position should be changed, ''//whence//'' chooses whether the new position is given relative to the beginning of the file, the current position, or the end of the file (by providing, respectively, ''SEEK_SET'', ''SEEK_CUR'' or ''SEEK_END''), and ''//offset//'' is the offset from the chosen ''//whence//''. | ||
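As an illustration of the three ''//whence//'' values, the sketch below determines a file's size by seeking to its end and then restores the previous position. The helper names ''file_size'' and ''path_size'' are hypothetical; ''lseek'' returns the resulting position measured from the beginning of the file, or ''-1'' on failure (for instance when the file is a pipe).

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the size in bytes of the file behind `fd`
 * without changing its current position, or -1 if the file is not seekable. */
off_t file_size(int fd)
{
    off_t old = lseek(fd, 0, SEEK_CUR);   /* 0 bytes from "here": only queries the position */
    if (old == (off_t)-1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);   /* 0 bytes from the end: the result is the size */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, old, SEEK_SET) == (off_t)-1)   /* go back to where we were */
        return -1;
    return end;
}

/* Hypothetical convenience wrapper: same as above, but for a path. */
off_t path_size(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    off_t size = file_size(fd);
    close(fd);
    return size;
}
```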
- | |||
- | **An attempt to read a file when the position is at (or beyond) the end of file | ||
- | returns 0.** | ||
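The return values of ''read'' described above (''-1'' on error, ''0'' at end of file, otherwise the number of bytes actually read) lead to the usual loop shape sketched below. The helper name ''count_bytes'' and the buffer size of 4096 are arbitrary choices made for this illustration.

```c
#include <unistd.h>

/* Hypothetical helper: reads `fd` until end of file and returns the total
 * number of bytes read, or -1 if read() fails. */
ssize_t count_bytes(int fd)
{
    char buf[4096];
    ssize_t total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == -1)
            return -1;       /* error: errno tells why */
        if (n == 0)
            return total;    /* position is at the end of file */
        total += n;          /* n may be smaller than sizeof buf; that is not an error */
    }
}
```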
- | |||
- | ~~Exercise.#~~ Read standard input until the end of file, and write the read data | ||
- | to standard output. Test this both by using //Ctrl+d// to indicate end of file | ||
- | and by redirecting a file to the standard input. | ||
- | |||
- | ~~Exercise.#~~ Read standard input until the end of file, and write the read data | ||
- | to standard output. When you reach end of file, use ''lseek'' to set position | ||
- | in the file to its beginning (that is, 0 bytes from ''SEEK_SET''). Test this | ||
- | by redirecting some file as standard input. | ||
- | |||
- | <small> | ||
- | ~~Exercise.#~~ Replace the standard output with a file descriptor of value ''4''. | ||
- | Test whether the program works if you tell the shell to open file number 4 for | ||
- | your program by doing a ''4>//file//'' redirection. | ||
- | </small> | ||
- | |||
- | ===== Creating, opening and closing files ===== | ||
- | |||
- | To create or open a file, POSIX defines: \\ | ||
- | <html><a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html"></html> | ||
- | ''int **open**(const char *//pathname//, int //flags//)'' \\ | ||
- | ''int **open**(const char *//pathname//, int //flags//, mode_t //mode//)'' | ||
- | <html></a></html>\\ | ||
- | and a ''creat'' function that is shorthand for ''open(//pathname//, O_WRONLY|O_CREAT|O_TRUNC, //mode//)''. | ||
- | |||
- | ''//pathname//'' is a path. | ||
- | |||
- | ''//mode//'' is used only if ''open'' creates a new file, and it defines its | ||
- | permissions. Use either an [[https://en.cppreference.com/w/c/language/integer_constant|octal]] | ||
- | number or the symbolic constants described in the manual. | ||
- | |||
- | ''//flags//'' is a rat's nest.\\ | ||
- | ''//flags//'' must contain exactly one of the following flags: ''**O_RDONLY**'', ''**O_WRONLY**'', or ''**O_RDWR**'' that choose whether file is opened for reading, writing or both. \\ | ||
- | ''//flags//'' may additionally contain other flags, including the following: | ||
- | * ''**O_APPEND**'' sets file position to the end of a file before every write | ||
- | * ''**O_TRUNC**'' (shall be used only in conjunction with ''O_WRONLY'' or ''O_RDWR'') truncates the file (sets its size to 0) | ||
- | * ''**O_CREAT**'' tells ''open'' that if the file does not exist, it should be created | ||
- | * ''**O_EXCL**'' (shall be used only in conjunction with ''O_CREAT'') makes open fail when the file exists | ||
- | * and at least a dozen other flags | ||
- | |||
- | For example: | ||
- | <code c> | ||
- | // requires #include <fcntl.h> (open and the O_* flags) and #include <stdio.h> (perror) | ||
- | int fd1 = open("/tmp/foo", O_RDONLY); | ||
- | if (fd1 == -1) perror("Opening /tmp/foo for reading failed"); | ||
- | |||
- | int fd2 = open("/tmp/baz", O_WRONLY|O_APPEND); | ||
- | if (fd2 == -1) perror("Opening /tmp/baz for appending (write-only) failed"); | ||
- | |||
- | int fd3 = open("/tmp/bar", O_RDWR|O_CREAT|O_EXCL, 0600); | ||
- | if (fd3 == -1) perror("Creating a new file /tmp/bar failed"); | ||
- | // if open succeeds, the file is open for reading and writing and has permissions of 0600 | ||
- | </code> | ||
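The ''//mode//'' argument can also be spelled with the symbolic constants from ''sys/stat.h'' instead of an octal number. The sketch below is one possible illustration; the helper name ''create_private_file'' is hypothetical. Note that the permissions of the created file are additionally restricted by the process's umask, which this page does not cover.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: creates a new file that only its owner may read and
 * write, and opens it for writing.
 * Returns the file descriptor, or -1 on failure; because of O_EXCL the call
 * also fails when the file already exists. */
int create_private_file(const char *path)
{
    /* S_IRUSR | S_IWUSR is the symbolic spelling of the octal mode 0600 */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}
```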
- | |||
- | To close a file, POSIX defines: | ||
- | \\ | ||
- | <html><a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html"></html> | ||
- | ''int **close**(int //filedes//)'' | ||
- | <html></a></html> | ||
- | \\ | ||
- | that closes the file descriptor ''//filedes//''. | ||
- | \\ | ||
- | <small>On Linux, invoking ''close(fd)'' always closes ''fd'', even if ''close'' returns ''-1''.</small> | ||
- | |||
- | ~~Exercise.#~~ Open a file with hardcoded filename, read its contents and | ||
- | write it to standard output. | ||
- | |||
- | ~~Exercise.#~~ Open a file specified as the first argument of your program, read | ||
- | its contents and write it to standard output. | ||
- | |||
- | ~~Exercise.#~~ Open a file specified as the first argument of your program, read | ||
- | its contents and write it to standard output with line numbers (just like | ||
- | ''cat -n //file//''). | ||
- | \\ | ||
- | <small>''memchr'' looks up a character (e.g., ''\n'') in memory.</small> | ||
- | |||
- | ~~Exercise.#~~ Implement a program that checks if two files have the same contents. \\ | ||
- | <small>''memcmp'' compares two memory areas.</small> | ||
- | |||
- | <small> | ||
- | ~~Exercise.#~~ Implement a program that works as ''paste [//file//]...''. \\ | ||
- | Hint: the dirty solution reads a single character at a time. | ||
- | </small> | ||
- | |||
- | ~~META: | ||
- | language = en | ||
- | ~~ | ||