===== Outputting file / text / sequences ===== ==== Printing file contents ==== ''**cat** [//file//]...'' outputs input files one after another or outputs the standard input if no files are specified. \\ The name comes from //con__cat__enate//. \\ ''cat'' numbers lines with the ''-n'' switch and outputs non-printable characters as ''^//x//'', ''M-//x//'', … with the ''-v'' switch((This notation corresponds to keys one would have to press to input the byte. To see all bytes (''\t'' and ''\n'' are replaced by ''X''), try: \\ ''perl -e'for(0..15){printf"\t%x_",$_};print"\n";for$l(0..15){printf"_%x",$l;for$h(0..15){$c=$h<<4|$l;$c=88 if $c==9||$c==10;printf("\t%c",$c)}print"\n"}'|cat -v'')). ''**paste** //file1// [//file2//]...'' reads round-robin one line from each input file and outputs them separated by a tab character, then repeats until the end of the longest file. ''**join** [-t //separator//] [-1 //fieldNumberInFile1//] [-2 //fieldNumberInFile2//] //file1// //file2//'' parses two sorted files and prints their lines joined on specified fields. ''**fold** [-w //width//] [//file//]...'' outputs input files (or standard input) forcing a line break whenever a line would exceed //width// (which defaults to 80). \\ With the ''-s'' switch ''fold'' breaks lines on spaces (or at //width// if there are no spaces). ''**column** [-x] [//file//]...'' works just like ''cat'' if the longest line in the //file//s (or standard input) would not fit twice within the terminal width. \\ Else, it prints the input in as many columns as fit the terminal, filling columns first (or, with ''-x'', rows first). \\ ''**column -t** [//file//]...'' does a completely different thing: it detects columns in input (by a separator that defaults to whitespace) and outputs the input as a table. ''**od** [-t x1]'', ''**hexdump** [-C]'', and ''**xxd**'' display the contents of binary files. ++++ Examples | {{page>so:redirects:cat&inline}} ++++ ~~Exercise.#~~ Print any file with ''cat''. Print two files at once with ''cat''. 
\\ (You may use ''/etc/SUSE-brand'' and ''/etc/os-release'' if you cannot come up with any other file.) ~~Exercise.#~~ Run the ''cat'' command, input some text, then press //Return// followed by //Ctrl+d//. ~~Exercise.#~~ Cat ''/usr/share/doc/mpich/user.pdf'' with and without the ''-v'' switch. ~~Exercise.#~~ Paste a file with itself. ~~Exercise.#~~ Display a binary file ''/usr/share/themes/Breeze/assets/line-h.png''. ==== Printing text ==== ''**echo** //text//'' outputs //text// followed by a newline (unless ''-n'' is specified). \\ The ''-e'' switch turns backslash escapes into corresponding characters, e.g., ''\t'' becomes a tab and ''\n'' a newline (cf. manual). ''**printf** //format// [//arguments//]...'' works roughly the same as the ''printf'' function in C. ''**figlet** [//text//]'' outputs //text// or the standard input using an ASCII-art font. ''**cowsay** [//text//]'' makes a cow say the //text// (or the standard input). ++++ Examples | {{page>so:redirects:echo&inline}} ++++ ~~Exercise.#~~ Try ''echo -e 'foo\n\nbaz' '' \\ and ''echo -e '\n\n one \033[A \033[A two \033[B \033[B \n \033[1;31m red \033[0m' '' \\ [[https://en.wikipedia.org/wiki/ANSI_escape_code|ANSI escape codes]] are well summarized [[https://gist.github.com/fnky/458719343aabd01cfb17a3a4f7296797|here]]. ~~Exercise.#~~ Try ''printf "|%4.2f|%3s|%-20s|\n|%4.2f|%3s|%-20s|\n" 3.1428 pi circumference/radius 9.8 g gravity'' ~~Exercise.#~~ Install ''cowsay'' and ''figlet'' with ''sudo zypper -q in -y figlet cowsay''. \\ Try ''figlet wololo'' and ''cowsay moo''. ==== Generating number sequences ==== ''**seq** [//from// [//step//]] //to//'' generates a sequence of numbers starting from //from// and incrementing by //step// as long as the result does not exceed //to//. \\ //from// and //step// default to 1. \\ With the ''-w'' switch, ''seq'' pads all numbers with leading zeros to equal width (e.g., ''seq -w 8 11'' outputs 08, 09, 10 and 11). 
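A quick way to get a feel for the arguments of ''seq'' is to capture a few sequences in variables and print them on one line each. This is only a sketch: the variable names are made up for illustration, and only ''seq'' and its ''-w'' switch come from the description above.

```shell
# Variable names are hypothetical; only seq and -w come from the text above.
up_to_three=$(seq 3)       # from and step both default to 1
stepped=$(seq 10 5 25)     # from=10, step=5, to=25
padded=$(seq -w 8 11)      # equal width, padded with leading zeros
short=$(seq 1 4 10)        # stops at the last value that does not exceed 10

# Unquoted on purpose: the newlines between numbers collapse to spaces.
echo $up_to_three
echo $stepped
echo $padded
echo $short
```

The last sequence never reaches 10, because the next value after 9 would exceed //to//.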
++++ Examples | {{page>so:redirects:seq&inline}} ++++ ~~Exercise.#~~ Generate a sequence of numbers from 1 to 15. ~~Exercise.#~~ Generate a sequence of numbers from 64 to 1024 with step of 64. ===== Standard streams ===== | K&R C[[https://archive.org/details/TheCProgrammingLanguageFirstEdition|[1]]] [[https://en.wikipedia.org/wiki/C_(programming_language)#K&R_C|[2]]]

printf("Please type your name:\n");
scanf("%s", name);

| Python

print("Please type your name:")
name = input()

| Did you ever wonder how a program knows where to read input from and where the print-like functions should output? **In the UNIX world a program expects to have three files already open upon start – standard input, standard output and standard error. These are called the [[https://en.wikipedia.org/wiki/Standard_streams|standard streams]].** \\ The standard input/output library of the C programming language – ''stdio.h'' – is based on this concept. C was created by one of the authors of UNIX. Basic I/O functions in most programming languages by default read from the standard input, and output data to standard output. \\ The rationale for the standard error stream is to convey information on what went wrong while executing a program. Programming languages usually offer dedicated functions to output data to standard error. In Unix-like, as well as POSIX-compatible systems, the operating system is responsible for abstracting files away – the user should not worry about details of accessing a file. \\ When the user wants to open a file, the user provides the file name and gets an identifier – a **file descriptor** – in return. \\ (A file descriptor is in fact an index in an array of files maintained for the process by the OS.) \\ To do standard operations such as reading or writing data, the user just tells which operation shall be executed, on which file descriptor, and provides the details of the operation (such as where to put data read from the file and how many bytes shall be read). A child process inherits all file descriptors from its parent. **The three standard streams are the files represented by the first three file descriptors – 0 is always used for standard input, 1 for standard output and 2 for standard error.** The files do not need to be ordinary files – Unix-like systems abstract almost everything as a [[https://en.wikipedia.org/wiki/Everything_is_a_file|file]]. \\ For instance, a terminal device is a file (even if it were a real teletype). 
By default, a shell opens the terminal as file 0, 1 and 2. ===== Redirections ===== POSIX-compatible shells can replace standard streams with files specified by the user. ==== The commonly used redirections ==== === Output redirections === ''command **>** filename'' - opens file //filename// for writing, - truncates the file, - replaces standard output stream with the file. ''command **2>** filename'' - opens file //filename// for writing, - truncates the file, - replaces standard error stream with the file. ''command **&>** filename'' Warning: this is a Bash extension - opens file //filename// for writing, - truncates the file, - replaces standard output stream AND standard error stream with the file. ''command **>>** filename'' - opens file //filename// for writing in append mode, - replaces standard output stream with the file. **''/dev/null''** is a device that discards any data written to it. ~~Exercise.#~~ The ''date'' command outputs the current date. Redirect its output to a file. ~~Exercise.#~~ Append a new date to the file from the previous exercise. ~~Exercise.#~~ Try the ''cat /etc/motd /etc/shadow'' command. Redirect the standard error to a file. ~~Exercise.#~~ Try the ''find /var/spool/'' command (''find'' will be discussed later on). Redirect the standard error to ''/dev/null''. ~~Exercise.#~~ Redirect standard output of the ''find /var/spool/'' command to one file and the standard error to another file. ~~Exercise.#~~ Redirect standard output and the standard error of the ''find /var/spool/'' command to the same file. ++++ Examples | {{page>so:redirects:out-en&inline}} ++++ === Input redirections === ''command < filename'' - opens file //filename// for reading, - replaces standard input stream with the file. 
''command << delimiter'' (here documents) - before starting the command, shell creates a temporary file, - shell reads data line by line from its standard input and writes it to the temporary file, - until an input line contains only //delimiter//, - then, shell opens the temporary file for reading, - and replaces standard input stream with the temporary file. ''command <<< string'' (here string) Warning: this is a Bash extension - creates a temporary file with //string// followed by a newline as the contents, - replaces standard input stream with the file. ~~Exercise.#~~ Create a file containing ''print("hello " + __file__)''. Run the ''python'' command redirecting input from the file. ~~Exercise.#~~ Use ''hexdump -C'' to display a hex dump of arbitrary text passed as a here document. \\ Include a multi-byte character in the text. ~~Exercise.#~~ ''bc'' is a simple calculator. Use it to calculate ''sqrt(2.0000)''. \\ Then use it to calculate ''sqrt(2.0000)'' in non-interactive mode. ~~Exercise.#~~ Use ''bc'' to calculate ''sqrt(2.0000)'' in non-interactive mode and redirect its output to a file. ++++ Examples | {{page>so:redirects:in&inline}} ++++ ==== The details ==== [[https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07|POSIX documentation on redirection]] \\ [[https://www.gnu.org/software/bash/manual/html_node/Redirections.html|Bash documentation on redirection]] Every redirection consists of: **''[//file_number//]//operator// //word//''** **The //file_number// defaults to 0 if the //operator// contains ''<'', else it defaults to 1.** \\ So ''command < file'' is the same as ''command 0< file'', and ''command >> file'' is the same as ''command 1>> file''. \\ Stream numbers from 0 to 9 are always safe to use. Consult documentation of your shell for other numbers. 
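The default file numbers described above can be checked with a small experiment. The sketch below assumes a POSIX-like shell with ''mktemp'' available; the scratch directory and all file names in it are made up for illustration.

```shell
# Hypothetical scratch directory; every file name below is made up.
tmp=$(mktemp -d)

echo "first"  >   "$tmp/out"    # no number given: > means 1>
echo "second" 1>> "$tmp/out"    # explicit descriptor 1, append mode

# ls complains about the missing file on descriptor 2,
# so the message ends up in ls.err and ls.out stays empty.
ls "$tmp/no-such-file" > "$tmp/ls.out" 2> "$tmp/ls.err"

# 0< is the explicit spelling of <
lines=$(wc -l 0< "$tmp/out")

cat "$tmp/out"
```

Writing the number explicitly is rarely needed for descriptors 0 and 1, but it makes the symmetry with ''2>'' visible.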
The //operator// may be one of: | ''<'' | opens //word// for reading and replaces //file_number// with the file | | ''>'' | if a file //word// exists and the noclobber option is set((The noclobber option is unset by default; use ''set -C'' to enable it.)), then fails; else \\ opens //word// for writing, truncates it and replaces //file_number// with the file | | ''>|'' | opens //word// for writing, truncates it and replaces //file_number// with the file (even if //word// exists) | | ''>>'' | opens //word// for appending and replaces //file_number// with the file | | ''<>'' | opens //word// for reading and writing and replaces //file_number// with the file | | ''<<'' | 1) creates a temporary file \\ 2) reads an input line \\ 3) if the line is //word//, goes to step 7 \\ 4) if there are no quotes (a pair of ''"'' or ') enclosing the //word//, performs the expansion((E.g., ''$VAR'' is substituted with its value, ''`date`'' is replaced by output of the date command etc.)) on the line \\ 5) appends the line to the temporary file \\ 6) goes to step 2 \\ 7) opens the temporary file for reading \\ 8) replaces //file_number// with the file \\ The command is run once this is done | | ''<<-''| same as ''<<'', but after step 2 adds a step: \\ 2a) erases all leading tab characters (''\t'') \\ warning: spaces are not erased | | ''<<<''| warning: this is a Bash extension \\ 1) creates a temporary file \\ 2) writes //word// to it \\ 3) writes a newline to it \\ 4) opens the temporary file for reading \\ 5) replaces //file_number// with the file \\ The command is run once this is done | | ''<&'' | if //word// is a number: duplicates a readable descriptor number //word// to the number //file_number// \\ if //word// is ''-'': closes a descriptor number //file_number// | | ''>&'' | if //word// is a number: duplicates a writeable descriptor number //word// to the number //file_number// \\ if //word// is ''-'': closes a descriptor number //file_number// | | ''&>'' | warning: this is a Bash 
extension \\ warning: this does not allow providing //file_number// (in fact, ''&'' is the file number) \\ opens //word// for writing, truncates it and replaces streams 1 and 2 with the file | | ''&>>''| warning: this is a Bash extension \\ warning: this does not allow providing //file_number// (in fact, ''&'' is the file number) \\ opens //word// for appending and replaces streams 1 and 2 with the file | ~~Exercise.#~~ Run ''cat /etc/motd'' with closed standard output descriptor. \\ Run ''find /var/spool/'' with closed standard error descriptor. ~~Exercise.#~~ Copy a large text file (e.g., ''/etc/services'') to //file//. \\ Run ''hexdump'' redirecting input from //file// using the ''<>'' operator, and duplicate standard input to standard output. Check what happens then. \\ Warning: do not use ''<>'' twice with the same file for standard input and standard output (unless you dare to face the consequences). ~~Exercise.#~~ Swap standard error with standard output of the ''cat /etc/motd /etc/shadow'' command. Test if you did this correctly by adding ''|rev'' at the end (that will put standard output backwards). === Redirections of current shell standard stream === The ''exec'' command can be used to manipulate standard streams of the current shell.


exec 3>&1           # copies the current standard output to descriptor 3
exec 1>myFile       # replaces standard output with myFile
date                # writes  a date   to standard output (which is now myFile)
fortune             #   "    a fortune      "
exec 1>&3  3>&-     # restores the standard output from 3 and closes 3

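Redirections are processed from left to right, which matters as soon as the duplicating operators ''>&'' and ''<&'' from the table above are combined with ordinary ones. The following sketch illustrates this under the assumption of a POSIX-like shell; the helper function ''both'' and the file names are hypothetical.

```shell
tmp=$(mktemp -d)

# Hypothetical helper that writes one line to each output stream.
both() { echo "to stdout"; echo "to stderr" >&2; }

# 1 is pointed at the file first, then 2 becomes a copy of 1:
# both lines land in the file.
both > "$tmp/a" 2>&1

# 2 becomes a copy of the *current* 1 (here: the pipe of the $(...) capture),
# and only afterwards 1 is pointed at the file.
captured=$(both 2>&1 > "$tmp/b")

cat "$tmp/a"
echo "file b: $(cat "$tmp/b"), captured: $captured"
```

In the second call only the standard output reaches the file, because descriptor 2 was duplicated before descriptor 1 changed.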
=== Process substitution (Bash extension) === [[https://www.gnu.org/software/bash/manual/html_node/Process-Substitution.html|Bash documentation on process substitution]] The constructs ''command1 <(command2)'' and ''command1 >(command2)'' are __not__ redirections. Bash replaces ''<(command2)'' with the name of a special file (a named pipe or a ''/dev/fd'' entry, depending on the system), starts ''command2'' in the background and sets its standard output to that file. Bash replaces ''>(command2)'' with the name of such a file, starts ''command2'' in the background and sets its standard input to that file. ~~Exercise.#~~ Execute and understand the results of: * ''/bin/ls >(echo a) >(echo b) <(echo c)'' * ''stat >(echo c)'' * ''stat >(sleep 3; fortune)'' * ''cat <(echo a)'' * ''echo 'abc' > >(rev)'' * ''cat <(date) > >(rev)'' * ''cat < <(date) > >(rev)'' * ''cat <(date) <(date) <(date) > >(rev)'' * ''cat <(date) <(date) < <(date) > >(rev)'' ===== Pipes ===== First, recall: standard I/O functions read from the standard input, and output data to standard output. Unix-like systems favoured for years ((For some time now, "big programs that do everything at once and that nobody fully comprehends", such as [[https://en.wikipedia.org/wiki/Systemd|systemd]], have been imposed as the default in distributions.)) programs that [[https://en.wikipedia.org/wiki/Unix_philosophy|do one job well]]. \\ Complex tasks can be done easily by combining such programs. For example: say you want to learn how many processes are owned by each user in the system. - You know that ''ps -ef'' lists all processes. So you can do ''ps -ef > ps_output''. - But you need only the first column of the output – the username. \\ For this, you can use the command ''cut --delimiter ' ' --field 1 < ps_output > cut_output'' to cut the first space-delimited field in each line. - There is a program (''uniq'') that counts how many times a line repeats, but it only detects repetitions in adjacent lines. \\ So let's ''sort < cut_output > sort_output'' first, so that repeating usernames are one after another. 
- Now you can use the command ''uniq --count < sort_output'' to leave only non-repeating (unique) lines together with their count. Creating files with each intermediate result is usually a bad idea. UNIX came up with the idea of connecting standard output of one program to standard input of another program. \\ Instead of the commands above, one can run ''ps -ef | cut --delimiter ' ' --field 1 | sort | uniq --count'', which does the same without creating any files on disk.\\ This is technically done using a special kind of file called a **pipe** – the shell creates a pipe (in the main memory), replaces standard output of one program with the pipe, and replaces standard input of the other program with the same pipe. **To connect standard output of //cmd_a// with standard input of //cmd_b// one writes ''//cmd_a// | //cmd_b//''.** \\ This is called piping output of one program into another. The programs are run in parallel. \\ ''//cmd_a// | //cmd_b//'' is sometimes referred to as a [[https://en.wikipedia.org/wiki/Pipeline_(Unix)|pipeline]]. A common practice for programs following the Unix philosophy is to read from files specified by arguments __or from standard input when no file is specified__. Moreover, typically whenever ''-'' is encountered where a file name is required, standard input is used instead. Using ''//cmd_a// | //cmd_b//'' creates what is called **an anonymous pipe**. \\ One can create **a named pipe** using the command ''**mkfifo** //filename//''. \\ All that is written to a pipe is stored in the main memory (so it never occupies disk space, regardless of whether the pipe is named or anonymous) until some program reads it. ~~Exercise.#~~ Run ''echo '2+2*2' ''. Then, pipe it through ''bc''. ~~Exercise.#~~ ''echo'' some text. Then, echo the text and pipe it through ''xxd''. ~~Exercise.#~~ List files in your home directory. \\ List the files again, piping the output through ''cat''. \\ List the files yet another time, now piping the output through ''cat -n''. 
~~Exercise.#~~ Pipe results of ''ps -eF'' through ''fold''. ~~Exercise.#~~ Create a named pipe //p//. Redirect input of ''fold'' from //p// in one terminal, and redirect output of ''ps -eF'' to //p// in another terminal. \\ Then repeat the commands, running the ''ps'' before the ''fold''. ++++ Examples | {{page>so:redirects:pipes1-en&inline}} ++++ ===== Filters ===== There are a number of programs (stemming from UNIX) that are collectively called [[https://en.wikipedia.org/wiki/Filter_(software)|filters]]. \\ A filter is a program that processes input in a useful way and writes it to its output. ==== head, tail ==== The ''**head**'' and ''**tail**'' programs output a specified number of leading / trailing lines (or bytes). By default ''head'' and ''tail'' output 10 lines. \\ By providing ''-n //count//'' / ''-c //count//'', they output //count// lines / bytes. When ''head'' is given a number preceded by '**-**', e.g. ''head -n -//10//'', then it outputs all except the last //10// lines. \\ When ''tail'' is given a number preceded by '**+**', e.g. ''tail -n **+**//10//'', then it outputs all lines starting from the //10//th. ~~Exercise.#~~ Run ''paste <(seq 15) <(seq 15 -1 1)''. Then, pipe its output through head and/or tail to see: * first three lines * last three lines * all lines but three last * all lines but three first * lines from 6 up to 9 The ''tail'' command accepts a switch ''-f'' / ''--follow''. \\ ''**tail -f …**'' will first output as usual, and then wait and output any data appended to the file. ~~Exercise.#~~ Run ''seq 25 > //file//''. Then run ''tail -f //file//'' in one terminal and append (with output redirection) some data to //file//. ++++ Examples | {{page>so:pipes_filters:head_tail&inline}} ++++ ==== grep ==== The ''**grep** //regex// [//file//]...'' program prints lines of the input //file//s (or standard input) that match the //regex//. 
When the switch ''-r'' / ''-R'' is specified, //file// can be a directory and grep will match //regex// recursively for all files within the directory. The ''grep'' program accepts several regular expression grammars. POSIX [[https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html|specifies]] basic (default for ''grep'') and extended regular expressions (selectable with ''egrep'' or ''grep -E'' ). See manual for your implementation of ''grep'' for more on the grammars. By default regular expressions perform case-sensitive matching. To switch to case-insensitive mode, one can add ''-i'' switch to grep. With the switch ''-v'', ''grep'' outputs non-matching lines. ''grep'' can output all matching lines (default), first match or only indicate whether a file matches. \\ It can also prepend matching lines with line number and/or file name (the latter being default whenever multiple input files are given). Moreover, ''grep'' can output //N// lines before match (''-B //N//''), after match (''-A //N//'') or before and after (which is called //context//, hence ''-C //N//''). ~~Exercise.#~~ Filter ''seq 75'' with grep to see: * lines containing ''5'' * lines ending with ''5'' * lines ending with ''5'' or ''0'' * line containing ''33'' and 3 lines before it * line containing ''33'' and 4 lines around it ~~Exercise.#~~ Display all lines containing ''10'' in files ''/etc/passwd'' and ''/etc/group''. \\ List all files containing ''ecdsa'' in ''~/.ssh''. ++++ Examples | {{page>so:pipes_filters:grep&inline}} ++++ ==== cut ==== The ''**cut**'' program outputs only selected characters (''-c //spec//'') / bytes (''-b //spec//'') / fields (''-f //spec//'') in each line. \\ A field is any number of characters separated by a single-character delimiter (''-d //delim//'', defaults to tab). To specify fields/bytes/… one shall write ''//range//[,//range//]...'' where a range is ''//num//'', or ''//start//-//end//'', or ''//start//-'', or ''-//end//'' with intuitive meaning. 
For instance, ''echo 123456789abcdef | cut -c -3,6,9-11,14-'' outputs ''12369abef'' (characters 1 to 3, character 6, characters 9 to 11, and everything from character 14 on). ~~Exercise.#~~ Filter output of the ''mount'' command (or ''/etc/mtab'' file) to cut only the fifth (or third in case of ''/etc/mtab'') space-separated field (that contains the filesystem type). ~~Exercise.#~~ Remove from the output of ''egrep '^[Ee]{2}' /usr/share/myspell/en_US.dic'' a slash and all that follows it. ++++ Examples | {{page>so:pipes_filters:cut-en&inline}} ++++ ==== sort ==== The **''sort''** program by default sorts lines in alphabetical order. The option **''-k''** defines sort keys. * ''sort -k4'' uses columns 4,5,6,7,8,… * ''sort -k4,4'' uses only column 4 * ''sort -k4,6'' uses columns 4, 5 and 6 * ~~''sort -k5,4''~~ is invalid * ''sort -k5,5 -k4,4'' uses columns 5 and 4 Sort keys can have options, e.g., **''-n''** sorts numerically and **''-r''** reverses sort direction. \\ The options can be used for all sort keys, or for selected sort keys only: * ''sort -r -k5,5 -k4,4'' sorts both column 5 and column 4 descending * ''sort -k5,5r -k4,4'' sorts column 5 descending and column 4 ascending * ''sort -k5,5 -k4,4r'' sorts column 5 ascending and column 4 descending ''sort'' is not stable unless the ''-s'' (''--stable'') switch is used. The ''sort'' program has much more to offer. See the manual for details. ~~Exercise.#~~ Create a file with input data for the next exercises by copying & pasting the following command in your shell:


paste \
  <(perl -e 'printf "%d\n", rand(10) for(1..20)') \
  <(perl -e 'print((K,Q,J)[rand(3)]."\n") for(1..20);') \
  <(perl -e 'printf "%d\n", rand(1500) for(1..20)') \
  <(perl -e 'my @a=("a","b","c"); print $a[rand(@a)] . $a[rand(@a)] ."\n" for(1..20);') \
  <(perl -e 'printf "%d\n", rand(1500) for(1..20)') \
  <(perl -e 'my @a=("x","y","z"); print $a[rand(@a)] . $a[rand(@a)] ."\n" for(1..20);') \
  <(seq -w 20) \
  > random_data

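Before working with the random file, the effect of sort keys and per-key options can be observed on a tiny fixed input. This is a sketch under stated assumptions: the three-column sample data and the variable names are made up, and ''LC_ALL=C'' is set only to make textual comparison byte-wise and therefore predictable.

```shell
# Byte-wise comparison, so that the textual ordering below is predictable.
export LC_ALL=C

# Made-up sample: name, group, score (separated by single spaces).
data=$(printf '%s\n' 'ann b 10' 'bob a 9' 'cid b 2' 'dan a 30')

# First key: column 2 only. Second key: column 3, numeric and reversed.
by_group_then_score=$(printf '%s\n' "$data" | sort -k2,2 -k3,3nr)

# Without n, column 3 is compared as text, so "10" sorts before "2".
as_text=$(printf '%s\n' "$data" | sort -k3,3)
as_number=$(printf '%s\n' "$data" | sort -k3,3n)

printf '%s\n' "$by_group_then_score"
```

Comparing ''as_text'' with ''as_number'' shows why the ''n'' option is needed for numeric columns.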
~~Exercise.#~~ Display the file. Then sort it. ~~Exercise.#~~ Sort the file ignoring first two columns. Sort the file numerically ignoring first two columns. ~~Exercise.#~~ Sort the file by the column with K/Q/J and the column with xyz characters (in that order). ~~Exercise.#~~ Sort the file by the second column without and with the ''--stable'' option. ~~Exercise.#~~ Sort the file by the second column (alphabetically) and by the third column (numerically). ++++ Examples | {{page>so:pipes_filters:sort&inline}} ++++ ==== wc, uniq, nl ==== The **''wc''** (word count) program counts lines, words and bytes. \\ When multiple files are provided as arguments, ''wc'' displays information on each file as well as a line with totals. \\ The options ''-l'', ''-w'' and ''-c'' select lines, words and bytes. \\ The option ''-m'' counts all characters (including non-printable characters). \\ This matters for multi-byte characters: ''wc -mc <<< "‡∞♣"'' counts 10 bytes and 4 characters (the three visible and a newline). The **''uniq''** program by default removes adjacent repeating lines. With switches, it can among others: * ''-c'' – prefix all lines with repetition count * ''-d'' – print only the repeating lines * ''-u'' – print only lines that do not repeat ''**nl**'' numbers lines. It can also number lines in text files organized into sections and pages. ~~Exercise.#~~ Pipe ''man wc'' through ''cat''. Then pipe ''man wc'' through ''wc''. How many words are there? ~~Exercise.#~~ See the results of ''wc /etc/motd /etc/SUSE-brand''. ~~Exercise.#~~ The ''perl -e 'printf "%d\n", (int rand(6)+1)+(int rand(6)+1) for(1..100)''' command rolls 2d6 100 times. \\ Pipe it through ''uniq'' to see rolls with same results in a row. \\ Then pipe it through ''sort'' and ''uniq'' so that you see how many times each result was hit. ++++ Examples | {{page>so:pipes_filters:wc_uniq&inline}} ++++ ==== tac, rev ==== ''tac'' outputs lines in reverse order. ''rev'' outputs characters in each line in reverse order. 
~~Exercise.#~~ See the result of ''echo -e '1 2 3\n4 5 6\n7 8 9' ''. Then pipe it through ''tac'', and finally pipe it through ''rev''. ==== tr, sed ==== The ''**tr**'' program replaces or deletes characters. The command ''tr -d //LIST//'' deletes all characters that are in the //LIST//. The command ''tr //FROM// //TO//'' translates each n-th character from the list //FROM// to the n-th character in the list //TO//. \\ If //TO// is shorter than //FROM//, the last character from //TO// is used instead. With the switch ''-s'', whenever consecutive characters translate to the same character //x//, only a single character //x// is output. The switch ''-c'' translates all characters that are not in the //FROM// list. The lists may contain character ranges (e.g., ''[0-9]'', ''[a-f]'') and character classes (e.g., ''[:alnum:]'', ''[:space:]''). ~~Exercise.#~~ Pipe ''ls -l'' through ''tr'' to: * replace all digits with a dash, * make all ASCII letters uppercase, * squeeze all spaces, * remove all the letters ''rwx'' ++++ Examples | {{page>so:pipes_filters:tr-en&inline}} ++++ The ''**sed**'' (__s__tream __ed__itor) program reads input (standard input or files) line by line, executes a user-provided script that transforms the line, and by default outputs each line once the script has been fully executed for it. ''sed'' is Turing-complete. \\ ''sed'' is commonly used for regex-based search & replace. \\ The most basic command for this is: ''sed 's/regexp/replacement/' ''. \\ ''sed'' is out of scope of this course. ==== awk ==== [[https://en.wikipedia.org/wiki/AWK|awk]] is another text processing language. Roughly, it also reads input line by line and executes a user-provided script. An ''awk'' script consists of rules, and each rule has a condition (that matches against line contents or selects start/end of a file or whole execution) and a set of instructions run when the condition is satisfied. \\ ''awk'' is out of scope of this course. 
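Although ''sed'' and ''awk'' are out of scope, a minimal sketch may help to tell the three tools of this section apart. The input strings and variable names below are made up for illustration; the commands use only the features described above.

```shell
# tr works on single characters.
upper=$(echo 'hello world' | tr 'a-z' 'A-Z')       # translate a range
no_vowels=$(echo 'hello world' | tr -d 'aeiou')    # delete listed characters
squeezed=$(echo 'a    b   c' | tr -s ' ')          # squeeze runs of spaces

# sed works on lines; without a trailing g flag
# only the first match in each line is replaced.
replaced=$(echo 'foo bar foo' | sed 's/foo/baz/')

# awk runs a rule (condition + instructions) for every line:
# here, print field 2 whenever field 1 is at least 2.
selected=$(printf '1 one\n2 two\n3 three\n' | awk '$1 >= 2 { print $2 }')

printf '%s\n' "$upper" "$no_vowels" "$squeezed" "$replaced" "$selected"
```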
==== more, less ==== To display data that does not fit into the terminal one can use one of many programs that are collectively called //pagers//. \\ A pager displays at a time as much text as fits in the terminal and allows the user to go to the next portion of the text (typically by pressing a key such as //space//). Most operating systems (as well as the POSIX standard) include a program called **''[[https://en.wikipedia.org/wiki/More_(command)|more]]''** as a rudimentary pager. Unix-like systems usually come with a pager called **''less''** which is more than ''more''. ''less'' passes data through a program indicated by the ''$LESSOPEN'' environment variable. \\ Such a program usually outputs human-readable data upon detecting a known non-human-readable file format. \\ ''[[https://github.com/wofr06/lesspipe|lesspipe]]'' is the leading implementation of this feature. ''less'' can be used with the following switches: * ''-S'' – don't wrap lines * ''-R'' – output escape sequences that encode colors as themselves rather than erasing them * ''-L'' – disable processing input by whatever ''$LESSOPEN'' indicates * ''-N'' – number lines To display help from within ''less'', type ''h''. A choice of other useful key shortcuts: * //space// or //PgDown// / //PgUp// – go to the next / previous page * any integer – jump to this line * any integer followed by //%// – jump to this part of the document * //g// / //G// – jump to the beginning / end * // /pattern // / //?pattern// – searches for //pattern// forwards/backwards \\ //n// / //N// – repeat search in the same / reverse direction * //s// – saves data to a file (useful when less reads from a pipe) * //v// – opens the file in default editor (available whenever less displays a file) * //F// – waits for more data to be appended to the file (like ''tail -f'') The ''man'' command usually uses ''less'' as the pager. ~~Exercise.#~~ Type ''man less'' to see the manual page for ''less'' in ''less''. 
Test the key shortcuts mentioned above. ~~Exercise.#~~ Open a PDF file (e.g., ''/usr/share/doc/packages/apparmor-docs/techdoc.pdf'') with less, with and without -L option. \\ Open a ''tar'' archive with less (e.g., ''/usr/share/doc/packages/automake/amhello-1.0.tar.gz'').\\ View a directory (e.g., ''/usr/include'') with less. ==== tee ==== The ''**tee** [-a] //file//...'' command writes every byte read from standard input to standard output and to every //file// specified. \\ With the ''-a'' switch ''tee'' appends to the file instead of overwriting it. ''tee'' is commonly used when one wants both to see and to record an output of a long-running command. ~~Exercise.#~~ ''tee'' the output of a ''tree'' to a file. View the file with ''less''. ~~META: language = en ~~