Searching files in Linux

By | October 7, 2009

No doubt every user relies on the search function from time to time, because it’s almost impossible to remember where all the files are saved. Searching is even more useful when you need to find a system file, a library and so on.

Most of you are probably familiar with the Windows search function, which is pretty easy to use. All major Linux desktops offer a graphical search option as well, so you won’t lack any functionality. However, if you want to be able to find absolutely everything you need, it’s better to use the command line, which has some really powerful tools.

In this post we are going to review some of these tools in detail. There is no need to worry even if you don’t really like using the command line: no scripts or complex expressions will be discussed, just easy and understandable methods.

First of all we need to define common types of search requests. Here are the most important ones:

  • Search by file name or mask (this type also includes file searching according to a certain path and exclusion of a particular path from the search)
  • Search by file type (extension)
  • Search by file access/creation/modification date/time
  • Search by file size
  • Search by file owner and access permissions
  • Search of system and executable files

There are four commands for finding files that we are going to review:

1) find

This command is considered to be the most powerful one. It checks the file system in real time based on certain criteria. That means you will always get the most up-to-date results without having to update a database. You can also perform all sorts of operations on the files that were found. In addition, the find command lets you search temp folders as well (unlike the locate command, whose database often leaves them out).

However, because the find command walks through the file system hierarchy, it’s considerably slower than the other commands. That’s why it’s recommended to narrow your search, whenever possible, by searching only in certain directories.

2) locate

The locate command searches a database instead of the file system itself, so it’s significantly faster than the find command. However, using a database means that the database has to be kept up to date to give proper search results (new files have to be added and deleted ones removed). Once your database is updated you are ready to search. To update the database, run the following command (usually as root):

updatedb

Since it’s easy to forget to update the database every time, it’s recommended to set up a cron job for that purpose (many distributions already install one). Here you can learn more about cron.

3) whereis

This command is useful when you need to find the binary, source and manual page files that belong to a command. It returns the paths of whatever it finds.

Let’s check an example. Suppose that you need to find out where Firefox is installed. Here is what you enter and what you get:

whereis firefox

firefox: /usr/bin/firefox /etc/firefox.cfg

4) which

The which command is pretty simple and very similar to the previous one (whereis), but it shows only the full path of the executable that a shell command refers to.

It’s very useful for finding out “which” binary the system would execute if you typed the command. Since some programs have multiple versions installed, the which command comes in handy to tell you which version is being used and where it is located.

Now let’s take a closer look at how each of these commands is used.

Command Line – find

As mentioned before, the find command is one of the best ways to search for files in Linux because it has a great many parameters, which make it possible to find exactly the files you need. In practice most people use only the basic parameters. You can find almost everything with just those, but you miss out on a lot if you don’t learn what else the find tool can do.

That is not an easy task, because the syntax of this command is pretty difficult, so you have to choose between power and simplicity. If you really need the power of find, let’s check its potential.

The best way to learn how to use the find command is to go through a lot of different examples. We will start with the easiest ones and move on to more complicated ones.

File name

1. find -name 'form.html'

-name: means that the attribute to match is the file name
'form.html': the text to search for
(!) Always enclose the file name in single quotes, so that the shell does not expand any wildcards before find sees them (!)

In this case the system would search for a file named form.html in the current directory and all of its subdirectories.

The current directory can be denoted in two ways, either by leaving out the starting point (GNU find allows this) or by writing a dot:

find 
find .

2. find / -name 'form.jpg'

Here the system would search for any file named form.jpg, starting at the root directory and descending into all subdirectories.

(!) If you use root as the starting point for a find command, your system can slow down significantly. If you really must run such a command, it is better to run it during low-use time or overnight. To make the results easier to go through, you can redirect the output to a file using the following syntax:

find / -name '*.jpg' > allimages.out

3. find /home/fred -name 'index*'

The system would search the directory /home/fred and its subdirectories for files whose names begin with the letters index.

(!) If you are not sure about the case of the file name, you can ignore it by using -iname instead of -name. All files starting with any combination of upper and lower case letters, such as INDEX, indEX or index, would then be found. A case insensitive search may take slightly longer (!)
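To see the difference between -name and -iname without touching your real files, you can try them in a throwaway directory. The following is only a sketch: it assumes a find that supports -iname (GNU find does), and the directory and file names are made up for illustration.

```shell
#!/bin/sh
# Build a throwaway directory tree (all names here are illustrative).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/site/old"
touch "$sandbox/site/index.html" "$sandbox/site/old/INDEX.htm" "$sandbox/site/form.html"

# Case-sensitive: only names that start with lowercase "index" match.
exact=$(find "$sandbox" -name 'index*' | wc -l)

# Case-insensitive: index.html and INDEX.htm both match.
anycase=$(find "$sandbox" -iname 'index*' | wc -l)

echo "-name matched $exact file(s), -iname matched $anycase file(s)"

# Remove the throwaway tree again.
rm -rf "$sandbox"
```

The pattern is quoted in both calls, so find receives the asterisk itself instead of whatever the shell would expand it to.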

4. find /usr /home /tmp -name "*.png"

As you can see in this example, you can specify more than one starting directory in a search. Here the system would search for png files in the /usr, /home and /tmp directories and their subdirectories.

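5. find . -path './files' -prune -o -name "*.jpg" -print

-path: means that the attribute is the directory (path); the pattern is compared with the whole path as find prints it, so with . as the starting point it has to begin with ./
-prune -o: used to exclude files or directories from the search
-print: prints only the matches on the right-hand side of -o, so the excluded directory itself is not listed

In this case the system would search for jpg files in the current directory and its subdirectories, excluding the files directory.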
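The exclusion can be checked in a small sandbox. This is a minimal sketch with made-up directory and file names; it writes the pattern as './files' because -path is compared with the whole path that find builds from the starting point.

```shell
#!/bin/sh
# Throwaway tree; the directory and file names are illustrative.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/files" "$sandbox/photos"
touch "$sandbox/files/skip.jpg" "$sandbox/photos/keep.jpg" "$sandbox/top.jpg"

# Run find from inside the sandbox so that "." is the starting point.
# ./files is pruned (find does not descend into it); -print belongs to the
# right-hand side of -o, so the pruned directory itself is not listed.
found=$(cd "$sandbox" && find . -path './files' -prune -o -name '*.jpg' -print | sort)

echo "$found"

# Remove the throwaway tree again.
rm -rf "$sandbox"
```

Only ./photos/keep.jpg and ./top.jpg are listed; skip.jpg stays hidden because find never enters ./files.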

6. find /usr /home -name "*.jar" 2>/dev/null

When you search certain directories, you may get errors because you lack the permissions to read them. Such errors can make it hard to spot the search results you need. To keep them from appearing, add the redirection shown above.

2>/dev/null: this part is not related to the find tool as such; it is handled by the shell. “2” indicates the error stream in Linux, and /dev/null is the device where anything you send simply disappears. So 2>/dev/null sends all error messages to the null device, thereby providing cleaner output.

(!) There is one more thing you can do: replace 2>/dev/null with 2>error.txt. After the search you would then have a file named error.txt in the current directory with all the error messages in it (!)
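Both redirections can be tried safely in a sandbox. The sketch below provokes an error by giving find a starting point that does not exist (a stand-in for a permission problem, which would not occur when running as root); all paths are illustrative.

```shell
#!/bin/sh
# Throwaway tree plus a starting point that does not exist, which forces
# find to print an error message (both paths are illustrative).
sandbox=$(mktemp -d)
touch "$sandbox/app.jar"
missing="$sandbox/no-such-dir"

# Errors are discarded; only the matches reach standard output.
# find exits non-zero after an error, hence the "|| true".
quiet=$(find "$sandbox" "$missing" -name '*.jar' 2>/dev/null) || true

# Errors are collected in a log file instead of being thrown away.
find "$sandbox" "$missing" -name '*.jar' > "$sandbox/results.txt" 2> "$sandbox/error.txt" || true
errors=$(wc -l < "$sandbox/error.txt")

echo "match: $quiet"
echo "error lines logged: $errors"

# Remove the throwaway tree again.
rm -rf "$sandbox"
```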

File size

7. find /music -name '*.mp3' -size +3000k

The system would search the /music directory and its subdirectories for any mp3 files that are larger than 3000 kilobytes (roughly 3 MB).

(!) If you would like to search for files that are, for example, smaller than 3000 kilobytes, just use “-3000k” instead of “+3000k” (!)
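Here is a small sketch of the size test with two dummy files. It assumes GNU find, where the k suffix stands for units of 1024 bytes and sizes are rounded up to whole units; the file names are invented.

```shell
#!/bin/sh
# Two dummy "songs" of known size (names are illustrative).
sandbox=$(mktemp -d)
head -c 4000000 /dev/zero > "$sandbox/long.mp3"    # about 3907 units of 1024 bytes
head -c 1000000 /dev/zero > "$sandbox/short.mp3"   # about 977 units of 1024 bytes

# +3000k: larger than 3000 units; -3000k: smaller than 3000 units.
big=$(find "$sandbox" -name '*.mp3' -size +3000k)
small=$(find "$sandbox" -name '*.mp3' -size -3000k)

echo "larger than 3000k:  ${big##*/}"
echo "smaller than 3000k: ${small##*/}"

# Remove the throwaway tree again.
rm -rf "$sandbox"
```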

File type

8. find /music -type d

In this case the system searches by file type. The command would find the /music directory itself and all the subdirectories inside it.

Here are some other file types that the find command can locate:

– b – block (buffered) special
– c – character (unbuffered) special
– l – symbolic link
– p – named pipe (FIFO)
– s – socket
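The type letters can be tried on a tiny tree that contains one item of several kinds. This is only a sketch with made-up names; it also uses f (regular file), a type letter that is not in the list above.

```shell
#!/bin/sh
# A tiny tree with a directory, a regular file, a symbolic link and a
# named pipe (all names are illustrative).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/music/albums"
touch "$sandbox/music/track.mp3"
ln -s "$sandbox/music/track.mp3" "$sandbox/music/latest"
mkfifo "$sandbox/music/queue"

dirs=$(find "$sandbox/music" -type d | wc -l)    # music itself and music/albums
links=$(find "$sandbox/music" -type l | wc -l)   # latest
pipes=$(find "$sandbox/music" -type p | wc -l)   # queue
files=$(find "$sandbox/music" -type f | wc -l)   # track.mp3 (the link is not followed)

echo "directories: $dirs, links: $links, pipes: $pipes, regular files: $files"

# Remove the throwaway tree again.
rm -rf "$sandbox"
```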

File time

9. find . -mtime -1

The system would find all the files in the current directory and its subdirectories that were modified within the last 24 hours.

(!) It should be mentioned that there are three time stamps:

mtime – the time that the contents of a file were last modified
atime – the time that a file was read or accessed
ctime – the time that a file’s status was changed

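Each of these time options is used with a value n, which is specified as -n, n, or +n. For -mtime, -atime and -ctime, n counts 24-hour periods; the -mmin, -amin and -cmin variants count minutes instead.

• -n returns less than n
• +n returns greater than n
• n returns exactly n (!)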

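10. find . -mtime 1

In this case the system has to find all the files in the current directory that were modified between 24 and 48 hours ago: find discards fractions of a day, and the result has to be exactly 1. As you can imagine, such a command may not return any results.

11. find . -mtime +1

The system will search the current directory for files that were modified at least two days ago. Because fractions of a day are discarded, +1 means more than one full 24-hour period, i.e. 48 hours or more.

12. find . -amin -5

The system would find all the files in the current directory that were accessed within the last 5 minutes. To match on modification time instead, use -mmin.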
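The time tests count whole 24-hour periods for -mtime and minutes for -amin, which is easiest to see with files whose timestamps are set by hand. The sketch below assumes GNU touch, whose -d option accepts relative dates such as '36 hours ago'; the file names are invented.

```shell
#!/bin/sh
# Three files with hand-set timestamps (GNU touch -d is assumed).
sandbox=$(mktemp -d)
touch "$sandbox/fresh.txt"                          # modified just now
touch -d '36 hours ago' "$sandbox/yesterday.txt"    # 1 full day old
touch -d '3 days ago' "$sandbox/old.txt"            # 3 full days old

recent=$(find "$sandbox" -type f -mtime -1)   # less than 1 full day old
exact=$(find "$sandbox" -type f -mtime 1)     # between 24 and 48 hours old
older=$(find "$sandbox" -type f -mtime +1)    # 2 or more full days old
accessed=$(find "$sandbox" -type f -amin -5)  # accessed in the last 5 minutes

echo "-mtime -1: ${recent##*/}"
echo "-mtime 1:  ${exact##*/}"
echo "-mtime +1: ${older##*/}"
echo "-amin -5:  ${accessed##*/}"

# Remove the throwaway tree again.
rm -rf "$sandbox"
```

touch -d sets the access time together with the modification time, which is why only fresh.txt passes the -amin -5 test.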

Executing commands

13. find / -name 'screenshot*' -exec ls -l {} \;

-exec: a very important feature that allows you to execute a particular command on the results of the find command.
ls -l: the command you want to execute
{}: an indicator that the file names returned by the search should be substituted here
\;: the terminating string, which is required at the end of the command (the backslash keeps the shell from interpreting the semicolon itself)
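The sketch below runs -exec inside a throwaway directory with invented file names. Besides the \; terminator it also shows the + terminator, which hands many file names to a single invocation of the command.

```shell
#!/bin/sh
# Throwaway directory with two matching files and one that does not match.
sandbox=$(mktemp -d)
touch "$sandbox/screenshot1.png" "$sandbox/screenshot2.png" "$sandbox/notes.txt"

# \; ends the -exec command; ls -l runs once for every matching file.
find "$sandbox" -name 'screenshot*' -exec ls -l {} \;

# Ending with + instead passes all matching names to one ls call.
listed=$(find "$sandbox" -name 'screenshot*' -exec ls {} + | wc -l)
echo "files listed: $listed"

# Remove the throwaway tree again.
rm -rf "$sandbox"
```

With only a few matches the difference hardly matters, but for thousands of files the + form starts far fewer processes.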

Using Boolean operators

14. find /music -name 'Nirvana*' -and -size +5000k

In this case the system would search within the /music directory for files whose names begin with ‘Nirvana’ AND whose size is greater than 5000 kilobytes.

(!) In fact the AND operator is used by default, even if you don’t specify it (!)

15. find /music -size +5000k ! -name 'Nirvana*'

Here the command searches the /music directory only for files that are larger than 5000 kilobytes (about 5 MB) and whose names do NOT begin with ‘Nirvana’.

16. find /music -name 'Nirvana*' -or -size +5000k

The system would search within the /music directory for files whose names begin with ‘Nirvana’ OR that are larger than 5000 kilobytes.

Of course, there is much more to say about the find command, but the examples above will help you become familiar with it and understand how powerful it is. Here you can check more parameters for the find command.
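The three operators can be compared side by side on four dummy files. This is a sketch only: it assumes GNU find (which accepts -and and -or as spellings of the standard -a and -o), and the file names are made up. The escaped parentheses group the OR test so that -type f applies to both halves.

```shell
#!/bin/sh
# Four dummy files: two name prefixes, two sizes (names are illustrative).
sandbox=$(mktemp -d)
head -c 6000000 /dev/zero > "$sandbox/Nirvana - long.mp3"
head -c 1000 /dev/zero > "$sandbox/Nirvana - short.mp3"
head -c 6000000 /dev/zero > "$sandbox/Other - long.mp3"
head -c 1000 /dev/zero > "$sandbox/Other - short.mp3"

# AND: the name matches and the file is large.
both=$(find "$sandbox" -type f -name 'Nirvana*' -and -size +5000k | wc -l)

# NOT: the file is large, but the name does not match.
not_named=$(find "$sandbox" -type f -size +5000k ! -name 'Nirvana*' | wc -l)

# OR: the name matches or the file is large.
either=$(find "$sandbox" -type f \( -name 'Nirvana*' -or -size +5000k \) | wc -l)

echo "AND: $both, NOT: $not_named, OR: $either"

# Remove the throwaway tree again.
rm -rf "$sandbox"
```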

Command Line – locate

As you probably remember, this command is significantly faster than the previous one. It is also much simpler. However, you may encounter an error when running it for the first time: locate needs a database of all the files, so you have to build that database first with updatedb.

Here are some of the examples that will help you get familiar with locate command.

1. locate form.html

The system would produce a list of the locations of files whose paths contain ‘form.html’. The result may look like this:

/usr/documents/site/form.html

2. locate '*.jpg' -q

The -q option suppresses error messages, such as those about missing permission to access files.

3. locate '*.pdf' -n 10

The system would limit the number of returned results to 10.

4. locate Polly.mp3 -i

The system would perform a case insensitive search, i.e. the case of the filenames wouldn’t be considered.

5. locate form.html -l 0

-l: means security level. “0” turns security checks off, “1” turns them on.

With -l 0 the search is faster. If you replace -l 0 with -l 1, the process takes more time but the result is more secure; -l 1 is also the default if you don’t specify anything else. Note that this meaning of -l belongs to the slocate implementation. In mlocate, for example, -l limits the number of results, just like -n.

These were the most commonly used parameters.

Command Line – whereis

Since we already know what this command is used for, we can go straight to the parameters of the whereis command. Here they are:

-b – Search only for binaries.

-m – Search only for manual sections.

-s – Search only for sources.

-u – Search for unusual entries. A file is said to be unusual if it does not have one entry of each requested type. Thus 'whereis -m -u *' asks for those files in the current directory which have no documentation.

-B – Change or otherwise limit the places where whereis searches for binaries.

-M – Change or otherwise limit the places where whereis searches for manual sections.

-S – Change or otherwise limit the places where whereis searches for sources.

-f – Terminates the last directory list and signals the start of file names. It must be used whenever any of the -B, -M, or -S options are used.

1. whereis songbird

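The system locates the files that belong to Songbird (binary, source and manual page, where they exist).

2. whereis -u -M /usr/man/man1 -S /usr/src -f *

In this case the system would list the files in the current directory that are unusual, i.e. that lack a manual page in /usr/man/man1 or a source entry in /usr/src.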

Command Line – which

No doubt the which command is the simplest one. In everyday use it needs no parameters, so the only thing we should do is take a look at the following example:

which perl

The system would locate the executable that runs when you type the perl command. You get:

/usr/bin/perl
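When several versions of a program are installed, which reports the one that comes first in your PATH. The sketch below simulates that with a hypothetical program called mytool placed in two throwaway directories; it also uses the -a option, which common implementations of which provide to list every match.

```shell
#!/bin/sh
# Two throwaway directories, each holding its own "mytool"
# (a hypothetical program name used only for this illustration).
sandbox=$(mktemp -d)
mkdir "$sandbox/v1" "$sandbox/v2"
for d in v1 v2; do
    printf '#!/bin/sh\necho "%s"\n' "$d" > "$sandbox/$d/mytool"
    chmod +x "$sandbox/$d/mytool"
done

# Put both directories in front of the search path, v2 first.
old_path=$PATH
PATH="$sandbox/v2:$sandbox/v1:$PATH"

first=$(which mytool)              # the copy the shell would actually run
all=$(which -a mytool | wc -l)     # every copy found along PATH

echo "would run: $first"
echo "copies found: $all"

# Restore the search path and remove the throwaway tree.
PATH=$old_path
rm -rf "$sandbox"
```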

What to choose?

As you can see, using the command line to locate files is not as hard as it may have seemed. At the same time it’s significantly more powerful than the standard graphical search options. Of course, if you are afraid of the command line and don’t want to learn some really “cool stuff”, you can just add this post to your bookmarks and wait till “your time” comes. But if you have an endless number of files on your PC and need to find some of them really quickly from time to time, you probably know what to choose, don’t you?