In this article, we will go through some ways to find large files in Linux. This knowledge can help us remove large files that take up disk space on our system without serving any purpose.
Find Large Files in Linux using the Find command
As we are searching for files all over the system, we need root permission. Running 'sudo su' or 'sudo -s' and entering the password gives us superuser status. Read this article for a complete tutorial on sudo.
find / -xdev -type f -size +200M
Let us try to understand the command:
- find – find is a very powerful command that can be used to search for files and directories in Linux.
- ‘/’ – It denotes the path in which the find command operates. Here, the forward slash is the root directory, so the search covers the whole directory tree.
- -xdev – This option restricts find to the filesystem that contains the starting path, so it does not descend into other mounted filesystems.
- -type f – This option refers to the type of object we are searching for. In this case, it is regular files, so the argument used is f.
- -size – Denotes the size-related filter.
- +200M – It means that we are filtering all files greater than 200 MB.
We have a complete article dedicated to the find command.
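Before scanning the whole system, it can help to try the size filter on a throwaway directory. The following is a minimal sketch, assuming a find that accepts the M size suffix (as GNU find does); the directory and the file names big.bin and small.bin are hypothetical, and the threshold is lowered to 2M so the test files stay small.

```shell
#!/bin/sh
# Sketch: exercise the -size filter in a temporary directory instead of '/'.
# All names below are made up for illustration.
demo_dir=$(mktemp -d)

# Create one 3 MiB file and one 1 MiB file filled with zeros
dd if=/dev/zero of="$demo_dir/big.bin" bs=1048576 count=3 2>/dev/null
dd if=/dev/zero of="$demo_dir/small.bin" bs=1048576 count=1 2>/dev/null

# Same structure as the command above, with a 2M threshold
large_files=$(find "$demo_dir" -xdev -type f -size +2M)
echo "$large_files"

# Clean up the temporary directory
rm -rf "$demo_dir"
```

Only the 3 MiB file should be listed, since the 1 MiB file is below the threshold.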
How to sort the list of large files?
The first step of extracting files larger than 200 MB was a success. The next target is to get the files sorted according to their sizes. This can be done by:
find / -xdev -type f -size +200M | xargs du | sort -k 1 -rh
Piggybacking on the previous output, the upgraded command means:
- ‘|’ – The ‘pipe’ symbol is used to pass the output of the former command to the next one.
- xargs – The large files piped from the find command are passed as arguments to the following command using xargs.
- du – This command is used for finding out disk usage of files and directories. More on du.
- sort – As the name suggests, it sorts the given data.
- -k 1 – It tells the sort command to sort the input on the basis of the first column.
- -rh – Here -r sorts in reverse (descending) order, and -h compares human-readable sizes such as 2K or 1G by their numeric value.
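One caveat with the pipeline above is that xargs splits its input on whitespace, so a file name containing a space would be passed to du as two separate arguments. The sketch below shows a variant that is not part of the original command: it assumes GNU findutils and coreutils, uses -print0 with xargs -0 to separate names with NUL bytes, and adds du -h so the sizes are printed in human-readable form. The file names are hypothetical.

```shell
#!/bin/sh
# Sketch: whitespace-safe variant of the find | xargs du | sort pipeline.
# Assumes GNU find, xargs, du and sort; all names are made up for illustration.
demo_dir=$(mktemp -d)

# Random data is used so the files really occupy disk blocks
head -c 4194304 /dev/urandom > "$demo_dir/big file.bin"   # 4 MiB, name contains a space
head -c 2097152 /dev/urandom > "$demo_dir/medium.bin"     # 2 MiB

# -print0 and -0 delimit file names with NUL bytes instead of whitespace
sorted=$(find "$demo_dir" -xdev -type f -size +1M -print0 | xargs -0 du -h | sort -k 1 -rh)
echo "$sorted"

# The first line of the sorted output is the largest file
largest=$(printf '%s\n' "$sorted" | head -n 1)

rm -rf "$demo_dir"
```

With GNU xargs, adding -r also prevents du from running at all when find returns no files.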
Record of large directories
To display the 10 largest directories in the Linux system, we can use:
du / 2>/dev/null | sort -k 1 -rh | head -n 10
The explanation for the above command:
- du – The command for listing out disk usage of directories. More on du command.
- ‘/’ – It tells the command to start from the root directory and cover every directory under it.
- 2>/dev/null – If we run the command without this part, we still get the correct output, but along with it we get error messages about permission being denied for some directories. These messages are written to standard error, so to keep them off the terminal we redirect standard error (using 2>) to the /dev/null device.
- ‘sort -k 1 -rh’ – Sort the directories according to the first column (size) in reverse order, comparing the sizes numerically.
- ‘head -n 10’ – After sorting, we pick the top 10 entries using the head command.
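When reading the output, keep in mind that du reports a total for every directory, and a parent's total includes all of its children, so the starting directory itself always appears at the top. A minimal sketch on a temporary tree illustrates this; the directory names logs and cache and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: run the du | sort | head pipeline on a small temporary tree.
# All names are made up for illustration.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/logs" "$demo_dir/cache"

# Random data so the files really occupy disk blocks
head -c 3145728 /dev/urandom > "$demo_dir/logs/app.log"    # 3 MiB
head -c 1048576 /dev/urandom > "$demo_dir/cache/tmp.dat"   # 1 MiB

# Keep the two largest entries; errors on standard error are discarded
top=$(du "$demo_dir" 2>/dev/null | sort -k 1 -rh | head -n 2)
echo "$top"

# The first line is the parent directory, the second is its largest child
second=$(printf '%s\n' "$top" | sed -n '2p')

rm -rf "$demo_dir"
```

Here the second line should be the logs directory, because the first line is the temporary directory that contains everything else.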
List of large files in the current directory
Suppose we are in a directory that occupies a large amount of disk space, but we cannot tell which files are responsible. To get out of this trouble, we can use:
find . -xdev -type f -size +30M
The '.' following the find command restricts the search for large files to the current directory and its subdirectories.
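Since find descends into subdirectories by default, a search starting at '.' also reports nested files. If only the files directly inside the current directory are of interest, the depth can be limited. The sketch below assumes a find that supports -maxdepth (GNU and BSD find both do); the directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: compare a recursive search from '.' with one limited to the top level.
# All names are made up for illustration.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
dd if=/dev/zero of="$demo_dir/top.bin" bs=1048576 count=2 2>/dev/null
dd if=/dev/zero of="$demo_dir/sub/nested.bin" bs=1048576 count=2 2>/dev/null

# Recursive search: finds both files (run in a subshell so our own directory is unchanged)
recursive=$(cd "$demo_dir" && find . -xdev -type f -size +1M | sort)

# -maxdepth 1 stops find from entering subdirectories
top_only=$(cd "$demo_dir" && find . -maxdepth 1 -xdev -type f -size +1M)

echo "$recursive"
echo "$top_only"

rm -rf "$demo_dir"
```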
How to find useless large files?
In a computer system, a file can be regarded as useless if it has not been modified for a long time, even though the system is used daily. To extract such files, we can use:
find / -xdev -mtime +100 -type f -size +100M
The above command displays all files that are larger than 100 MB and were last modified more than 100 days ago.
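The age filter can be checked in the same way on a temporary directory. This sketch assumes GNU touch, whose -d option accepts relative dates such as "200 days ago"; the file names old.bin and recent.bin are hypothetical, and the size threshold is lowered so the test files stay small.

```shell
#!/bin/sh
# Sketch: combine -mtime and -size on two files of equal size but different age.
# Assumes GNU touch for the relative date; all names are made up for illustration.
demo_dir=$(mktemp -d)
dd if=/dev/zero of="$demo_dir/old.bin" bs=1048576 count=2 2>/dev/null
dd if=/dev/zero of="$demo_dir/recent.bin" bs=1048576 count=2 2>/dev/null

# Backdate the modification time of one file by 200 days
touch -d "200 days ago" "$demo_dir/old.bin"

# Same structure as the command above, with a lower size threshold
stale=$(find "$demo_dir" -xdev -mtime +100 -type f -size +1M)
echo "$stale"

rm -rf "$demo_dir"
```

Only the backdated file should be listed, even though both files pass the size filter.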
Conclusion
Using the above commands, we can extract large files according to our interests. The threshold values can always be changed to suit the system. Note that it is not always wise to remove seemingly useless files without complete knowledge of what they are.
We hope that this article provided enough information to build on these topics. In any case, we can always refer to the manual pages for any Linux utility using the man command.