FHS

To unify the standard locations of files in Linux, a document called the Filesystem Hierarchy Standard (FHS) has been created. Based on traditional and commonly employed directory structures in Unix-like operating systems, the document recommends a converged directory structure and specifies the intended use of each directory.

Typically in Linux the system manual has an entry hier (and, in case of systemd-based distributions, also file-hierarchy) that summarises the directory hierarchy used in the system.

Exercise 1 On the basis of FHS / man hier, describe the purpose of the following directories:
      • /bin, /usr/bin and /sbin, /usr/sbin
      • /lib and /usr/lib
      • /usr/share
      • /etc
      • /opt
      • /root and /home
      • /tmp
      • /var

Searching files

locate

The locate tool looks up words or wildcard expressions provided as arguments in a previously generated list of files.
The list (or database) of files is usually updated daily: most distributions ship, alongside locate, a scheduled task that runs the database-updating command (updatedb) every day.

Contemporary implementations of locate verify, right before displaying the results, whether each matching file still exists and whether the user has sufficient permissions to access it.
Unrestricted access to the complete list of files present in a system is considered a security vulnerability, hence systems restrict access to the list in various ways.

Exercise 2 Use locate --statistics to display the file database statistics.

With the --regex switch, locate treats all arguments as regular expressions.
Otherwise, each argument arg that contains no wildcards 1) is converted to the wildcard expression *arg* before being matched against the entries in the database. An argument that does contain wildcards is used as given and has to match the whole entry.
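
The effect of this conversion can be imitated with the shell's own pattern matching. The sketch below does not call locate at all: the list of paths and the helper function matches are made up for illustration, and the shell's case statement stands in for locate's matching, so it runs without a file database.

```shell
# Emulation only: the paths and the helper name `matches` are hypothetical.
paths='/usr/include/stdint.h
/usr/include/stdint
/usr/share/doc/stdint-notes.txt'

# Print those paths that, as a whole, match the given shell wildcard pattern.
matches() {
    pattern=$1
    printf '%s\n' "$paths" | while IFS= read -r p; do
        case $p in
            $pattern) printf '%s\n' "$p" ;;
        esac
    done
}

plain=$(matches '*stdint*')    # what the wildcard-free argument stdint turns into
anchored=$(matches '*stdint')  # an argument with a wildcard is used as given

printf 'stdint, converted to *stdint*:\n%s\n' "$plain"
printf '*stdint, used as given:\n%s\n' "$anchored"
```

The first pattern matches all three paths, while the second matches only the path that ends in stdint.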

Exercise 3 Compare the results of the following commands:
      • locate 'stdint'
      • locate '*stdint'
      • locate --regex '.*stdint'
      • locate --regex '.*stdint$'

locate matches each argument against the full file path (or only against the file name, if -b is given), and outputs a path if at least one argument matches. To require that all arguments match, one has to add the -A (--all) switch.
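
The difference between "at least one argument" and "all arguments" can be sketched without a file database. In the example below, grep on a made-up list of paths stands in for locate; it only imitates the any/all logic for fixed strings and is not how locate is implemented.

```shell
# Emulation only: grep on a hypothetical list stands in for locate and its database.
paths='/usr/bin/ps2pdf
/usr/bin/ps
/usr/share/man/man1/ps2pdf.1
/etc/pdf.conf'

# Like `locate bin pdf`: a path is printed if ANY of the arguments occurs in it.
any=$(printf '%s\n' "$paths" | grep -F -e bin -e pdf)

# Like `locate -A bin pdf`: a path is printed only if ALL arguments occur in it.
all=$(printf '%s\n' "$paths" | grep -F bin | grep -F pdf)

printf 'any argument:\n%s\n' "$any"
printf 'all arguments:\n%s\n' "$all"
```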

Exercise 4 Compare the following commands:
      • locate -A bin ps to pdf   vs   locate bin ps to pdf
      • locate -b netpbm   vs   locate netpbm

The -0 option makes locate separate the output file paths with a null byte ('\0') instead of a newline ('\n').
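
Null-separated output matters because a file name may itself contain a newline. The sketch below uses printf as a stand-in for locate -0 (the two file names are made up) and assumes an xargs that supports the widely available -0 option.

```shell
# printf stands in for `locate -0 ...`; the file names are hypothetical.
# The second name contains a newline, which breaks line-based processing.
count_lines=$(printf '%s\n' 'plain.txt' 'two
lines.txt' | wc -l)

# With null bytes as separators, xargs -0 receives exactly two names.
count_nul=$(printf '%s\0' 'plain.txt' 'two
lines.txt' | xargs -0 sh -c 'echo "$#"' sh)

echo "newline-separated: $count_lines entries, null-separated: $count_nul entries"
```

Counting lines reports three entries for two files, whereas the null-separated stream is split into the correct two names.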

Exercise 5 Locate files whose name is exactly bin.

Exercise 6 Locate files that contain pause in the filename and icons in the full path.

Exercise 7 Locate files that contain a [ character in their names.

find

The find utility recursively traverses directories, looking for files that match the provided filters, and executes the user-specified action on them (or just prints the paths of the matching files if no action is provided).

While the find utility is part of the POSIX standard, most implementations offer more functionality than the standard mandates.
These materials summarise the implementation of find from GNU findutils (which is commonly used in Linux).

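The find command has noteworthy switches that control how symbolic links are handled:
-P never follows symlinks (the default; only then can find reliably be used to look for the symlinks themselves),
-L follows symlinks,
-H does not follow symlinks, except for those given as paths on the command line.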

find has a fixed and counter-intuitive syntax – options come first, but then one must list paths to search, and only after all paths one may provide tests and actions.

If the list of paths is omitted, find searches the current directory.

If no action is given, find prints the paths of the matching files to the standard output.

The order of the arguments matters.
find -ls -name '*bash*' first lists each encountered file, and then checks if the filename contains 'bash',
find -name '*bash*' -ls first checks if the filename contains 'bash', and lists the file only if the test succeeded.
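
The effect of the ordering can be checked on a scratch directory. The sketch below uses made-up file names and -print instead of -ls, so that the output is easy to count; -type f is added only to keep the scratch directory itself out of the results.

```shell
# Scratch directory with hypothetical file names, so the result is predictable.
dir=$(mktemp -d)
touch "$dir/bashrc" "$dir/notes.txt"

# -print comes first: it runs for every file, before -name is even checked.
action_first=$(find "$dir" -type f -print -name '*bash*' | wc -l)

# -name comes first: -print runs only for files that passed the test.
test_first=$(find "$dir" -type f -name '*bash*' -print | wc -l)

echo "action first: $action_first files, test first: $test_first files"
rm -r "$dir"
```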

Tests

Basic tests:

find …                            matches:
-name pattern, -iname pattern     the file name against the pattern (-i performs a case-insensitive match)
-path pattern, -ipath pattern     the file path against the pattern (-i performs a case-insensitive match)
-type {f|d|l|p|s|c|b}             the file type: f for ordinary file, d for directory, l for symlink, etc.
-user name, -group name           the file owner / group
-perm mode                        permissions exactly the same as provided
-perm -mode                       permissions with all of the provided bits set
-perm /mode                       permissions with at least one of the provided bits set
                                  (mode can be either octal, e.g. 22, or symbolic, e.g. go+w)
-size s                           the size; 512c stands for 512 B and 512M for 512 MiB;
                                  -512c stands for less than 512 B, +512c for more than 512 B
                                  (sizes are rounded up to the given unit before comparing)
                                  Warning: the default unit, denoted with b, is 512-byte disk blocks, not bytes
-atime d / -amin min              the time of last access, in days / minutes
-ctime d / -cmin min              the time of last status change, in days / minutes
-mtime d / -mmin min              the time of last modification, in days / minutes
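
A few of these tests can be tried out on a scratch directory. The following is a minimal sketch with made-up file names and sizes; it assumes GNU find (for -iname) and creates everything it needs under a temporary directory.

```shell
# Scratch directory with hypothetical contents, so the results are predictable.
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'hello' > "$dir/small.txt"       # 5 bytes
printf '%2000s' ' ' > "$dir/big.bin"    # 2000 bytes (spaces)
chmod 600 "$dir/small.txt"              # read/write for the owner only
chmod 664 "$dir/big.bin"                # the group may write

dirs=$(find "$dir" -type d -name 'sub')      # directories named sub
by_name=$(find "$dir" -iname 'SMALL.*')      # case-insensitive name match
large=$(find "$dir" -type f -size +1000c)    # files of more than 1000 bytes
group_w=$(find "$dir" -type f -perm -g+w)    # files with the group write bit set

printf '%s\n' "$dirs" "$by_name" "$large" "$group_w"
rm -r "$dir"
```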

The tests can be:

  • joined with -a (logical and, used when no operator is specified) and -o (logical or)
    e.g., find -name baz -o -name bar
  • grouped in parentheses – ( and ), but beware – parentheses are treated specially by bash, hence one must escape them as \( and \)
    e.g., find -size -1M \( -name baz -o -name bar \)
  • negated by !
    e.g., find ! -type f ! -type d
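
The sketch below tries the operators on a scratch directory with made-up names. It also shows why grouping matters: the implicit -a binds more tightly than -o, so leaving out the parentheses changes the result.

```shell
# Scratch directory: bar is a directory, baz and qux are regular files.
dir=$(mktemp -d)
mkdir "$dir/bar"
touch "$dir/baz" "$dir/qux"

# -o: anything named baz OR bar (one file and one directory match).
either=$(find "$dir" -name baz -o -name bar | wc -l)

# Grouping plus the implicit -a: regular files named baz or bar.
grouped=$(find "$dir" -type f \( -name baz -o -name bar \) | wc -l)

# Without parentheses this reads as (-type f -a -name baz) -o (-name bar),
# so the directory bar matches as well.
ungrouped=$(find "$dir" -type f -name baz -o -name bar | wc -l)

# Negation: everything that is not a regular file (the start directory and bar).
not_files=$(find "$dir" ! -type f | wc -l)

echo "either=$either grouped=$grouped ungrouped=$ungrouped not_files=$not_files"
rm -r "$dir"
```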

Other important expressions (GNU find classifies them as global options rather than tests) include:
-mindepth n and -maxdepth n specify how deep find will descend into the directories.
-xdev forbids entering directories where other filesystems are mounted.
That is: if one looks for a file in /, then find will not descend into the /mnt/cdrom directory if a compact disc has been mounted there.
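
The depth limits can be illustrated on a small, made-up directory tree. In this sketch the starting point counts as depth 0 and its direct children as depth 1; the limits are placed before the tests, which is where GNU find expects them.

```shell
# Hypothetical tree: top (depth 1), a/one (2), a/b/two (3), a/b/c/three (4).
dir=$(mktemp -d)
mkdir -p "$dir/a/b/c"
touch "$dir/top" "$dir/a/one" "$dir/a/b/two" "$dir/a/b/c/three"

# -maxdepth 1 stops at the direct children of the starting point.
shallow=$(find "$dir" -maxdepth 1 -type f | wc -l)

# -mindepth 3 skips everything closer than three levels below the start.
deep=$(find "$dir" -mindepth 3 -type f | wc -l)

# Both together select exactly one level (here: a/one and a/b).
level2=$(find "$dir" -mindepth 2 -maxdepth 2 | wc -l)

echo "shallow=$shallow deep=$deep level2=$level2"
rm -r "$dir"
```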

Actions

Selected actions:

-print              the default action: prints file names separated by a newline
-print0             prints file names separated by a null byte
-delete             removes matching files
-exec … {} … ;      executes the provided program for each matching file separately (see below)
-exec … {} +        executes the provided program for all matching files at once (see below)

If find -exec program arg1 arg2 \{\} \; finds the files one, two and three, then it executes:
program arg1 arg2 one
program arg1 arg2 two
program arg1 arg2 three

If find -exec program arg1 arg2 \{\} + finds the files one, two and three, then it executes:
program arg1 arg2 one two three

In -exec, each occurrence of the {} expression gets replaced by the file name.
Warning: ; is treated specially by the shell (and in some shells so is {}), hence they should be escaped (that is, \; and \{\}) or enclosed in quotation marks (";" and "{}").
In the -exec … {} + syntax, the argument {} must appear exactly once, as the last argument before the +.
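
The difference between the two forms can be observed by letting find run echo. This is a minimal sketch on a scratch directory with three made-up files; echo stands in for any program.

```shell
# Scratch directory with three hypothetical files.
dir=$(mktemp -d)
touch "$dir/one" "$dir/two" "$dir/three"

# \; form: echo is started once per file, so there are three output lines.
per_file=$(find "$dir" -type f -exec echo found {} \; | wc -l)

# + form: echo is started once with all names appended, so there is one line.
batched=$(find "$dir" -type f -exec echo found {} + | wc -l)

echo "separately: $per_file runs, at once: $batched run"
rm -r "$dir"
```

The + form starts fewer processes, which makes a noticeable difference when many files match.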

Exercises

Exercise 8 Issue, from your home directory, the commands find (with no arguments), find .config/.. and find ~.
How do the results differ?

Exercise 9 Find all files in the /srv/ and /var/lib/zypp/ directories using one command.

Exercise 10 Find, in your home directory, files whose names end with .xml.

Exercise 11 Find empty files in your home directory.

Exercise 12 Find and display files in /usr/include that are no larger than 32 bytes.
Warning: for find, the unit b or no unit denotes disk blocks. Bytes are denoted by c.

Exercise 13 Find all files outside your home directory.

Exercise 14 Find, in your home directory, files that have no read permission for the group.

Exercise 15 Execute the command: find \( -type d -ls \) -o \( -print \). When is the action -ls executed, and when is the action -print executed? What would happen if one removed the parentheses, and why?

1) as understood by the shell – that is, * standing for any text, ? standing for any single character, and […] standing for a set or range of characters.
os_cp/locate_find.txt · Last modified: 2024/03/11 20:28 by jkonczak