os_cp:locate_find

os_cp:locate_find [2024/03/11 20:28] (current)
jkonczak created
 +<​small>​
 +===== FHS =====
 +
 +To unify the standard locations of files in Linux, a document called the
 +[[https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard|Filesystem Hierarchy Standard]]
 +has been created. Based on traditional and commonly employed directory structures
 +in Unix-like operating systems, the document recommends a unified directory structure
 +and specifies the intended use of each directory.
 +
 +In Linux the system manual typically has an entry ''hier'' (and, in the case of
 +systemd-based distributions, also ''file-hierarchy'') that summarises the
 +directory hierarchy used in the system.
 +
 +~~Exercise.#~~ Using the FHS or ''man hier'', describe the purpose of the following directories:
 +\\       • ''/​bin'',​ ''/​usr/​bin''​ and ''/​sbin'',​ ''/​usr/​sbin''​
 +\\       • ''/​lib''​ and ''/​usr/​lib''​
 +\\       • ''/​usr/​share''​
 +\\       • ''/​etc''​
 +\\       • ''/​opt''​
 +\\       • ''/​root''​ and ''/​home''​
 +\\       • ''/​tmp''​
 +\\       • ''/​var''​
 +</​small>​
 +
 +===== Searching files =====
 +
 +==== locate ====
 +
 +The ''​**[[https://​en.wikipedia.org/​wiki/​Locate_(Unix)|locate]]**''​ tool 
 +looks up words or wildcard expressions provided as arguments in a previously
 +generated list of files.
 +\\
 +The list (or //database//) of files is usually updated daily: most distributions
 +ship, alongside ''locate'', a scheduled task that runs the database update
 +command (''updatedb'') once a day.
 +
 +Contemporary implementations of ''locate'' verify, right before displaying the
 +results, whether each matching file still exists and whether the user has
 +sufficient permissions to access it.
 +\\
 +<small>Unrestricted access to the complete list of files present in a system
 +is considered a security vulnerability, hence systems deploy various access
 +restrictions to the list.</small>
 +
 +~~Exercise.#~~ Use ''locate --statistics'' to display the file database statistics.
 +
 +<​small>​
 +With the ''--regex'' switch, ''locate'' treats all arguments as regular
 +expressions.
 +\\
 +Otherwise, each argument ''//arg//'' that contains no wildcards ((as understood
 +by the [[https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13|shell]]
 +– that is, ''*'' standing for any text, ''?'' standing for any single character, and ''[…]'' standing for a set or range of characters.))
 +is converted to the wildcard expression ''*//arg//*'' before matching
 +against the entries in the database.
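
The conversion rule above can be mimicked with the shell's own pattern matching. The sketch below is only an illustration: ''matches'' is a hypothetical helper written for this example (it is not part of ''locate''), and the path is an arbitrary sample.

```shell
# Hypothetical helper that mimics how locate treats a non-regex argument:
# an argument without wildcards is matched as *arg*, otherwise it is used as is.
matches() {  # usage: matches ARGUMENT PATH
    case $1 in
        *\**|*\?*|*\[*) pat=$1 ;;      # contains a wildcard: keep unchanged
        *)              pat="*$1*" ;;  # plain word: wrap in asterisks
    esac
    case $2 in
        $pat) return 0 ;;              # unquoted, so $pat is used as a pattern
        *)    return 1 ;;
    esac
}

path=/usr/include/stdint.h   # sample path, assumed for the demo
matches 'stdint'    "$path" && echo "stdint matches"
matches '*stdint'   "$path" || echo "*stdint does not match"
matches '*stdint.h' "$path" && echo "*stdint.h matches"
```

The second call fails because an argument that already contains a wildcard is not wrapped, so ''*stdint'' must match up to the very end of the path.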
 +
 +~~Exercise.#​~~ Compare the results of the following commands:
 +\\       • ''​locate '​stdint'​%%%%''​
 +\\       • ''​locate '​*stdint'​%%%%''​
 +\\       • ''​locate --regex '​.*stdint'​%%%%''​
 +\\       • ''​locate --regex '​.*stdint$'​%%%%''​
 +</​small>​
 +
 +<​small>​
 +''locate'' matches each argument against the full file path (or only against the file name,
 +if ''-b'' is given), and outputs a path if at least one argument matches.
 +To require a match against all arguments, add the ''-A'' (''--all'')
 +switch.
 +
 +~~Exercise.#​~~ Compare the following commands:
 +\\       • ''​locate -A bin ps to pdf''​   vs   ''​locate bin ps to pdf''​
 +\\       • ''​locate -b netpbm''​   vs   ''​locate netpbm''​
 +
 +The ''-0'' option makes ''locate'' separate the output file paths with
 +a null byte ('\0') instead of a newline ('\n').
 +</​small>​
 +
 +~~Exercise.#​~~
 +Locate files whose name is exactly ''bin''.
 +
 +~~Exercise.#​~~
 +Locate files that contain ''​pause''​ in the filename and ''​icons''​ in the full path.
 +
 +<​small>​
 +~~Exercise.#​~~
 +Locate files that contain a ''​[''​ character in their names.
 +</​small>​
 +
 +==== find ====
 +
 +The ''**[[https://en.wikipedia.org/wiki/Find_(Unix)|find]]**'' utility recursively
 +traverses directories, looks for files matching the provided filters and
 +executes the user-specified action (or just prints the paths of matching files
 +if no action is provided).
 +
 +
 +While the ''find'' utility is part of the
 +[[https://pubs.opengroup.org/onlinepubs/9699919799/utilities/find.html|POSIX standard]],
 +most implementations offer more functionality than the standard mandates.
 +\\
 +These materials summarise the implementation of ''​find''​ from [[https://​www.gnu.org/​software/​findutils/​|GNU findutils]]
 +(which is commonly used in Linux).
 +
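 +The ''find'' command has three noteworthy switches that control symlink handling: \\
 +''-P'' never follows symlinks (default), \\
 +''-L'' follows symlinks, \\
 +''-H'' does not follow symlinks, except for those given as paths on the command line. \\
 +With ''-L'', a symlink is examined as the file it points to, so ''-type l'' then matches only broken symlinks; to look for symlinks, do not use ''-L''.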
 +
 +**''​find''​ has a fixed and counter-intuitive syntax – options come first, but
 +then one must list paths to search, and only after all paths one may provide
 +tests and actions.**
 +
 +When the list of paths is omitted, ''find'' searches the current directory.
 +
 +When no action is given, ''find'' prints the paths of the matching files to the standard output.
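
A minimal sketch of these defaults, assuming GNU ''find'' (some other implementations require at least one path). The directory and file names are made up for the demonstration.

```shell
# Build a throwaway directory tree (all names here are made up for the demo).
d=$(mktemp -d)
mkdir "$d/a" "$d/b"
touch "$d/a/x" "$d/b/y"

# No path and no action: GNU find searches "." and prints every match.
default_out=$(cd "$d" && find | sort)
explicit_out=$(cd "$d" && find . -print | sort)

# Several starting paths may be listed before the tests.
two_paths=$(cd "$d" && find a b -type f | sort)

[ "$default_out" = "$explicit_out" ] && echo "find behaves like find . -print"
echo "$two_paths"
rm -rf "$d"
```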
 +
 +**The order of the arguments matters**.\\
 +''​find -ls -name '​*bash*'​%%%%''​ first lists each encountered file, and then checks if the filename contains '​bash',​\\
 +''​find -name '​*bash*'​ -ls''​ first checks if the filename contains '​bash',​ and lists the file only if the test succeeded.
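
The effect of argument order can be observed in a scratch directory. This is a sketch only: the file names are invented, and ''-print'' stands in for ''-ls'' to keep the output easy to count.

```shell
d=$(mktemp -d)
mkdir "$d/tree"
touch "$d/tree/bashrc" "$d/tree/notes.txt"

# Action first: -print runs for every visited file (the directory and both
# files); the -name test that follows no longer affects the output.
action_first=$(find "$d/tree" -print -name '*bash*' | wc -l)

# Test first: -print runs only for files whose name contains "bash".
test_first=$(find "$d/tree" -name '*bash*' -print | wc -l)

echo "action first: $action_first lines, test first: $test_first lines"
rm -rf "$d"
```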
 +
 +=== Tests ===
 +
 +|Basic tests:||
 +^ ''​find …''​ ^ matches: ^
 +|'' -name //pattern//'' \\ ''-iname //pattern//''| file name against the //pattern// \\ (the ''-i'' variant performs a case-insensitive match)|
 +|'' -path //pattern//'' \\ ''-ipath //pattern//''| file path against the //pattern// \\ (the ''-i'' variant performs a case-insensitive match)|
 +|''​-type {f|d|l|p|s|c|b}''​|file type – ''​f''​ for ordinary file, ''​d''​ for directory, ''​l''​ for symlink, etc.|
 +|''​-user //​name//''​ \\ ''​-group //​name//''​|file owner / group|
 +|  \\ ''​-perm  //​mode//''​ \\ ''​-perm -//​mode//''​ \\ ''​-perm ​ /​%%%%//​mode//''​ \\  |permissions:​ \\ ''​ ''​ same as provided \\ ''​-''​ all provided bits set \\ ''/''​ at least one of the provided bits set \\ mode can be either octal (''​22''​) or symbolic (''​go+w''​) |
 +|''-size //s//''|size; ''512c'' stands for 512 bytes; ''512M'' stands for 512 MiB; \\ ''-512c'' stands for less than 512 bytes, ''+512c'' stands for more than 512 bytes \\ Warning: the default unit, denoted with ''b'', is 512-byte disk blocks, not bytes; sizes are rounded up to whole units before comparing|
 +|''​-atime //​d//''​ / ''​-amin //​min//''​ \\ ''​-ctime //​d//''​ / ''​-cmin //​min//''​ \\ ''​-mtime //​d//''​ / ''​-mmin //​min//''​ |times: access, change, modification in //d//ays or //​min//​utes|
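
A few of the tests from the table, tried on a scratch tree. This is a sketch under assumptions: the file names and sizes are invented, and the ''k'' size suffix is an extension offered by GNU ''find'' (POSIX only specifies ''c'').

```shell
d=$(mktemp -d)
mkdir "$d/tree"
: > "$d/tree/empty"                                           # 0 bytes
printf '%s' 'hello' > "$d/tree/small"                         # 5 bytes
dd if=/dev/zero of="$d/tree/big" bs=1024 count=2 2>/dev/null  # 2048 bytes
chmod 600 "$d/tree/empty" "$d/tree/small"
chmod 664 "$d/tree/big"

# Regular files smaller than 512 bytes (note the c suffix for bytes).
small=$(find "$d/tree" -type f -size -512c | sort)

# Regular files larger than 1 KiB (sizes are rounded up to whole KiB first).
large=$(find "$d/tree" -type f -size +1k)

# Files with the group-write bit set (other bits may be set as well).
group_writable=$(find "$d/tree" -type f -perm -g+w)

printf 'small:\n%s\nlarge:\n%s\ngroup-writable:\n%s\n' \
    "$small" "$large" "$group_writable"
rm -rf "$d"
```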
 +
 +The tests can be:
 +  * joined with ''-a'' (logical //and//, used when no operator is specified) and ''-o'' (logical //or//) \\ e.g., ''find -name //baz// -o -name //bar//''
 +  * grouped in parentheses – ''('' and '')'', but beware – parentheses are treated specially by bash, hence one must escape them as ''\('' and ''\)'' \\ e.g., ''find -size -1024k \( -name //baz// -o -name //bar// \)''
 +  * negated by ''!'' \\ e.g., ''find ! -type f ! -type d''
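
The three ways of combining tests can be tried on a small scratch tree. The names below are invented for this sketch.

```shell
d=$(mktemp -d)
mkdir -p "$d/tree/sub"
touch "$d/tree/baz" "$d/tree/bar" "$d/tree/qux" "$d/tree/sub/bar"

# -o: entries named baz or bar, at any depth.
either=$(find "$d/tree" -name baz -o -name bar | wc -l)

# Parentheses group the -o, and the implicit -a then combines it with -type f.
grouped=$(find "$d/tree" -type f \( -name baz -o -name bar \) | wc -l)

# !: everything that is not a regular file (here: the two directories).
not_files=$(find "$d/tree" ! -type f | wc -l)

echo "either: $either, grouped: $grouped, not regular files: $not_files"
rm -rf "$d"
```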
 +
 +<​small>​
 +Other important tests include:
 +\\
 +''-mindepth //n//'' and ''-maxdepth //n//'' specify how deep ''find'' will scout the directories.
 +\\
 +''​-xdev''​ forbids entering directories where other filesystems are mounted. \\ 
 +That is: if one looks for a file in ''/'',​ then ''​find''​ will not scout the ''/​mnt/​cdrom''​ directory if a compact disc has been mounted there.
 +</​small>​
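
A sketch of the depth limits, assuming an implementation that supports ''-mindepth'' and ''-maxdepth'' (GNU ''find'' does; they are not part of POSIX). The tree layout is invented for the example.

```shell
d=$(mktemp -d)
mkdir -p "$d/tree/one/two"
touch "$d/tree/top" "$d/tree/one/mid" "$d/tree/one/two/deep"

# Depth 0 is the starting path itself; -maxdepth 1 stops at its direct children.
shallow=$(find "$d/tree" -maxdepth 1 | wc -l)

# -mindepth 2 skips the starting path and its direct children.
deeper=$(find "$d/tree" -mindepth 2 | wc -l)

echo "up to depth 1: $shallow entries, depth 2 and below: $deeper entries"
rm -rf "$d"
```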
 +
 +<​small>​
 +=== Actions ===
 +
 +|Selected actions:||
 +|''​-print''​| the default action - prints filenames separated by a newline |
 +|''​-print0''​| prints filenames separated by a null byte |
 +|''​-delete''​| removes matching files |
 +|''​-exec … {} … ;''​| executes provided program, for each matching file separately (see below)|
 +|''​-exec … {} +''​| executes provided program, for all matching files at once (see below)|
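
Why ''-print0'' exists can be seen with a file name that contains a newline. This is a sketch with invented names; it assumes a filesystem that permits newlines in names and an ''xargs'' that supports ''-0''.

```shell
d=$(mktemp -d)
mkdir "$d/tree"
touch "$d/tree/plain" "$d/tree/$(printf 'new\nline')"

# Newline-separated output: the name containing a newline spans two lines.
lines=$(find "$d/tree" -type f -print | wc -l)

# Null-separated output: one terminator per file, whatever the name contains.
nuls=$(find "$d/tree" -type f -print0 | tr -cd '\000' | wc -c)

# xargs -0 consumes such a list safely; here it runs ls once per file.
find "$d/tree" -type f -print0 | xargs -0 -n 1 ls -d > /dev/null \
    && echo "xargs handled every file"

echo "lines: $lines, null bytes: $nuls"
rm -rf "$d"
```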
 +
 +<​html><​div></​html>​
 +If ''find -exec program arg1 arg2 \{\} \;'' finds the files ''one'', ''two'' and ''three'', then it executes:\\
 +<​html><​div style="​display:​inline-block;​line-height:​ 1em"></​html>​
 +''​program arg1 arg2 one''​\\
 +''​program arg1 arg2 two''​\\
 +''​program arg1 arg2 three''​
 +<​html></​div></​div></​html>​
 +
 +If ''find -exec program arg1 arg2 \{\} +'' finds the files ''one'', ''two'' and ''three'', then it executes:\\
 +''​program arg1 arg2 one two three''​
 +
 +In ''​-exec'',​ each occurrence of the ''​{}''​ expression gets replaced by the filename. \\
 +Warning: ''{}'' and '';'' may be treated specially by the shell, hence they should be
 +escaped (that is, ''\{\}'' and ''\;'') or enclosed in quotation marks (''"{}"'' and ''";"'').
 +\\
 +In the ''​-exec … {} +''​ syntax, the argument ''​{}''​ must appear once, as the last argument.
 +
 +</​small>​
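
The difference between the two ''-exec'' forms can be checked by counting how many times the program runs. In this sketch ''echo'' plays the role of the program and the file names are invented.

```shell
d=$(mktemp -d)
mkdir "$d/tree"
touch "$d/tree/one" "$d/tree/two" "$d/tree/three"

# ";" variant: echo is started once per file, so three lines are printed.
per_file=$(find "$d/tree" -type f -exec echo {} \; | wc -l)

# "+" variant: echo is started once with all three paths, so one line is printed.
batched=$(find "$d/tree" -type f -exec echo {} + | wc -l)

echo "per file: $per_file lines, batched: $batched line"
rm -rf "$d"
```

The ''+'' form is usually preferable when the program accepts many file arguments, since it starts far fewer processes.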
 +
 +=== Exercises ===
 +
 +~~Exercise.#~~ Issue, from your home directory, the commands ''find'' (with no
 +arguments), ''find .config/..'' and ''find ~''. \\ How do the results differ?
 +
 +~~Exercise.#​~~ Find all files in the ''/​srv/''​ and ''/​var/​lib/​zypp/''​ directories
 +using one command.
 +
 +~~Exercise.#~~ Find, in your home directory, files whose names end with ''.xml''.
 +
 +~~Exercise.#~~ Find empty files in your home directory.
 +
 +<​small>​
 +~~Exercise.#​~~ Find and display files in ''/​usr/​include''​ that are not greater
 +than 32 bytes. \\
 +Warning: for ''​find''​ the unit ''​b''​ or no unit denotes disk blocks.
 +Bytes are denoted ''​c''​.
 +</​small>​
 +
 +~~Exercise.#​~~ Find all files outside your home directory.
 +
 +~~Exercise.#​~~ Find files with no read permission for the group in your home
 +directory.
 +
 +<​small>​
 +~~Exercise.#​~~
 +Execute the command: ''find \( -type d -ls \) -o \( -print \)''.
 +When is the ''-ls'' action executed, and when is the ''-print'' action executed?
 +What would happen if the parentheses were removed, and why?
 +</​small>​
 +
 +
 +~~META:
 +language = en
 +~~
  
os_cp/locate_find.txt · Last modified: 2024/03/11 20:28 by jkonczak