The Linux tr command lets you delete a set of characters or translate them into another set of characters. This is especially useful when you know which characters have to be replaced but the file is too large to edit by hand.
The tr command can change the case of characters, delete specified characters, and squeeze runs of repeated characters into a single occurrence. It also gives the system admin a basic character-level find-and-replace function, and more complex transformations become possible when it is combined with Unix pipes.
Getting started
Now that we know what the Linux tr command does, it’s time to understand how to use it. The syntax of the tr command is:
tr [options] "char-set-1" "char-set-2"
Let’s break down the tr command and understand its functionality. The ‘tr’, as you may have guessed, stands for translate. Next, we come to the ‘option’ parameter. This parameter allows us to choose how we wish to translate a string.
The parameters labeled as “char-set-1” and “char-set-2” (set1 and set2 for short) are two strings given to the Linux tr command for translating. The tr command takes each character of set1 and replaces it with the character at the same position in set2.
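To make the position-by-position mapping concrete, here is a minimal sketch. The input word and the two sets are made up for illustration, and the variable name result is just a placeholder.

```shell
# 'e' is first in set1 and 'i' is first in set2, so e -> i.
# 'l' is second in set1 and 'p' is second in set2, so l -> p.
result=$(echo 'hello' | tr 'el' 'ip')
echo "$result"   # hippo
```

Characters that do not appear in set1, such as 'h' and 'o' here, pass through unchanged.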
Options in Linux tr command
The tr command offers a range of options that govern how a character set is translated. Here are some options you can use with the Linux tr command:
- -C: used to complement the characters in set1, so that the command acts on every character that is NOT in set1
- -c: used to complement the values in set1; GNU tr treats -c and -C as the same option
- -s: used to ‘squeeze’ each run of a repeated character listed in the last operand (set1 or set2) into a single occurrence
- -d: used to delete the characters in set1 from the input
- -u: used to make sure that the output isn’t buffered (a BSD tr option; GNU tr does not provide it)
- -t: used to truncate set1 to the length of set2 (a GNU tr option)
These options can be used individually or combined to achieve the desired result while using the Linux tr command.
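As one possible illustration of combining options, the sketch below uses -d and -s together. With both options, tr deletes the characters in set1 and then squeezes repeats of the characters in set2. The sample sentence and the variable name cleaned are invented for this example.

```shell
# -d deletes every digit (set1); -s then squeezes runs of spaces (set2).
cleaned=$(echo 'room  101   is    free' | tr -ds '[:digit:]' ' ')
echo "$cleaned"   # room is free
```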
Specifying character sets
When using the tr command, you often need to describe a character set with more than literal characters. In these cases, the following conventions apply.
\character: a backslash followed by certain characters has a special meaning. Here are some examples:
- \a: alert (bell) character
- \b: backspace
- \r: carriage return
- \t: horizontal tab
- \n: newline, which marks the end of a line
- \v: vertical tab
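A common use of these escape sequences is stripping carriage returns from text that uses Windows-style line endings. The following is a small sketch under that assumption; the sample text and the variable name unix_text are placeholders.

```shell
# printf emits two lines that end in carriage return + newline.
# tr -d '\r' deletes every carriage return and leaves the newlines alone.
unix_text=$(printf 'line one\r\nline two\r\n' | tr -d '\r')
printf '%s\n' "$unix_text"
```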
[:class:]: These represent the entire set of characters belonging to a defined ‘class’. Following is a sample list of class names:
- alpha: alphabetic characters (letters)
- digit: numeric characters
- alnum: alpha-numeric characters
- space: whitespace characters (space, tab, newline and similar)
- lower: lowercase characters
- upper: uppercase characters
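- special: special characters (listed in the BSD tr manual; GNU tr does not recognize this class and offers punct for punctuation characters instead)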
- xdigit: hexadecimal characters
[#*n]: Represents n repeated occurrences of the character written in place of ‘#’. It is only valid in set2.
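The class and repeat notations can be combined. The sketch below, using an invented input string and the placeholder variable masked, maps each of the ten digits in [:digit:] to the same character by repeating it ten times in set2.

```shell
# set1 expands to the ten digits; set2 is the letter X repeated ten times,
# so every digit is translated to X.
masked=$(echo 'PIN 4921' | tr '[:digit:]' '[X*10]')
echo "$masked"   # PIN XXXX
```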
Using the Linux tr command
Now that we understand the components of the tr command, it’s time to try it for ourselves.
Changing the case of a string
One of the most common uses of the Linux tr command is converting text to uppercase or lowercase. You can change the text in a file from lowercase to uppercase as follows.
# Passing a file as an input
tr "[:lower:]" "[:upper:]" < File_Name
# Passing the echo command output to tr
echo 'linuxforDeVices' | tr '[:lower:]' '[:upper:]'
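The reverse conversion works the same way with the two classes swapped. A minimal sketch, reusing the sample string from above and a placeholder variable named lowered:

```shell
# Swap the classes to fold uppercase letters to lowercase.
lowered=$(echo 'linuxforDeVices' | tr '[:upper:]' '[:lower:]')
echo "$lowered"   # linuxfordevices
```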
Removing specific characters from a string
Sometimes we need to remove a list of characters from a string of text present in a file. The -d option deletes the characters in set1, and adding -c deletes everything except them. For example, when we wish to isolate the numbers stored in a file, we enter the command:
# Taking input from a file
tr -cd '[:digit:]' < File_Name
# Passing echo output to Linux tr
echo "Contact helpline number 12345-1235" | tr -cd '[:digit:]'
Output:
123451235
As you may notice, even the newline character is eliminated from the command output, because a newline is not a digit either.
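If the line structure should survive, one option is to add \n to set1 so that newlines are excluded from deletion. The sketch below assumes a made-up two-line input; numbers is a placeholder variable.

```shell
# Keep digits and newlines; delete everything else.
numbers=$(printf 'Order 12 shipped\nOrder 345 pending\n' | tr -cd '[:digit:]\n')
printf '%s\n' "$numbers"
# 12
# 345
```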
Finding complement of a string
The -C/-c option of the Linux tr command complements set1, which means the command acts on every character that is NOT in set1. When set2 is given, each of those characters is replaced with the character from set2. This is how we can complement the text in a given file:
# input file
tr -c 's' 'z' < File_Name
# Pass Echo command output
echo 'mississippi' | tr -c 's' 'z'
In the above example, we’re replacing all the characters that are NOT the letter “s” with the letter “z”. Only the letter “s” stays while everything else, including the trailing newline added by echo, becomes z, so the output is zzsszsszzzzz.
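A more practical use of the complement is sanitizing text. As a rough sketch, the command below keeps letters, digits, dots and the newline, and replaces every other character with an underscore. The file name is invented and safe_name is a placeholder variable.

```shell
# set1 lists what to keep; -c flips it, so everything else becomes '_'.
safe_name=$(echo 'My Report (final).txt' | tr -c '[:alnum:].\n' '_')
echo "$safe_name"   # My_Report__final_.txt
```

Including \n in set1 keeps the trailing newline from being turned into an underscore.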
Listing all words in a file
To list all the words present in a file such that each line contains only one word, we use the following Linux tr command:
# input file
tr -cs '[:alnum:]' '\n' < File_Name
# Pass Echo command output
echo 'Linux For Devices' | tr -cs '[:alnum:]' '\n'
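In both cases, -c selects every character that is not alphanumeric, each of them is translated to a newline, and -s squeezes consecutive newlines into one, so every word ends up on its own line.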
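The word list becomes more useful further down a pipeline. The sketch below wraps the command in a hypothetical function, word_freq, that reads text from standard input (a file, another command, or text typed by the user) and counts how often each word occurs. The function name and the sample sentence are assumptions made for this example.

```shell
word_freq() {
    # 1. Put each word on its own line.
    # 2. Fold everything to lowercase so 'The' and 'the' count together.
    # 3. Sort, count duplicates, and list the most frequent words first.
    tr -cs '[:alnum:]' '\n' |
        tr '[:upper:]' '[:lower:]' |
        sort | uniq -c | sort -rn
}

echo 'The cat and the hat' | word_freq
```

For this input, the first line of output reports a count of 2 for the word "the"; the remaining words each appear once.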
Converting spaces to tabs
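Another important use of the Linux tr command is to adjust text formatting. Here is how we use the command to replace every whitespace character with a tab in a given file. Keep in mind that [:space:] also matches the newline character, so line breaks are converted as well: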
# Passing a file
tr '[:space:]' '\t' < File_Name
# Passing echo command output
echo 'Social distancing is important' | tr '[:space:]' '\t'
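When only spaces and tabs should be converted and line breaks should stay, the [:blank:] class is one alternative to [:space:]. The sketch below compares the two on an invented two-line input by counting the newlines that remain; lines_space and lines_blank are placeholder variables.

```shell
# [:space:] matches the newline too, so no line breaks survive.
lines_space=$(printf 'one two\nthree four\n' | tr '[:space:]' '\t' | wc -l)

# [:blank:] matches only space and tab, so both line breaks are kept.
lines_blank=$(printf 'one two\nthree four\n' | tr '[:blank:]' '\t' | wc -l)

echo "newlines left with [:space:]: $lines_space, with [:blank:]: $lines_blank"
```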
Squeezing repeating characters
When the same character appears several times in a row, we may wish to collapse each run into a single occurrence. This is done using the -s option of the Linux tr command. Here is how we use the command to remove excess spaces from a file:
tr -s '[:space:]' < File_Name
# Passing echo command output
echo 'This  sentence   has extra    spaces' | tr -s '[:space:]'
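The same option works for any single character, not just a class. As a small sketch, the command below squeezes consecutive newlines, which removes blank lines from the input. The sample text and the variable name collapsed are made up for illustration.

```shell
# Three newlines in a row are squeezed into one, dropping the blank lines.
collapsed=$(printf 'first\n\n\nsecond\n' | tr -s '\n')
printf '%s\n' "$collapsed"
# first
# second
```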
SPECIAL NOTE: When a Linux tr command executes successfully, it returns the exit status 0. Any error produces a non-zero exit status, so it is worth checking this value when you use the tr command in scripts.
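A script can read the status from the shell's $? variable right after tr runs. The following sketch triggers an error on purpose by calling tr without any operands; ok_status and err_status are placeholder variable names.

```shell
# A valid invocation: the pipeline's status is the status of tr.
printf 'abc\n' | tr 'abc' 'ABC' > /dev/null
ok_status=$?

# tr needs at least one set, so calling it with none is an error.
err_status=0
tr < /dev/null > /dev/null 2>&1 || err_status=$?

echo "valid call: $ok_status, invalid call: $err_status"
```

The first value is 0, while the second is some non-zero number whose exact value depends on the implementation.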
Conclusion
The Linux tr command is a powerful tool for manipulating character sets in a hassle-free manner. This tutorial covered the basics of the tr command, but its true potential is unlocked only when it is used with UNIX pipes. I hope you were able to understand how the Linux tr command works. If you have any doubts or queries, drop them in the comment section below.