The comm command is a Linux utility that compares two sorted files line by line and reports which lines are unique to each file and which are common to both.
Overview
The basic syntax of the comm command is:
comm [OPTION]... FILE1 FILE2
where FILE1 and FILE2 are the two sorted files to be compared. The output is divided into three tab-separated columns: lines unique to FILE1, lines unique to FILE2, and lines common to both files.
For example, suppose we have two files named file1.txt and file2.txt with the following content:
$ cat file1.txt
apple
banana
orange
peach
$ cat file2.txt
apple
cherry
orange
strawberry
To compare these two files, we can use the comm command as follows:
$ comm file1.txt file2.txt
The output will be:
		apple
banana
	cherry
		orange
peach
	strawberry
The first column (no indentation) contains lines unique to file1.txt, the second column (indented by one tab) contains lines unique to file2.txt, and the third column (indented by two tabs) contains lines common to both files.
Specific use cases
- Compare two configuration files to find the differences
- Compare two log files to find common errors
- Compare two lists of packages to find the missing or extra packages
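The package-list use case can be sketched as follows. This is only an illustration: the file names required.txt and installed.txt and the package names in them are hypothetical, and in practice the lists would come from your package manager.

```shell
#!/bin/sh
# Hypothetical package lists, written to a scratch directory for the demo.
workdir=$(mktemp -d)
printf '%s\n' vim curl git > "$workdir/required.txt"
printf '%s\n' git htop curl > "$workdir/installed.txt"

# comm needs sorted input, so sort each list into a new file first.
sort "$workdir/required.txt" > "$workdir/required.sorted"
sort "$workdir/installed.txt" > "$workdir/installed.sorted"

# -23 keeps only column 1: packages that are required but not installed.
missing=$(comm -23 "$workdir/required.sorted" "$workdir/installed.sorted")

# -13 keeps only column 2: packages that are installed but not required.
extra=$(comm -13 "$workdir/required.sorted" "$workdir/installed.sorted")

echo "missing: $missing"
echo "extra: $extra"

rm -rf "$workdir"
```

Sorting into separate files keeps the original lists untouched and works in any POSIX shell.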
Options
The comm command accepts a few options that modify its behavior:
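Option | Description |
---|---|
-1 | Suppress column 1 (lines unique to FILE1) |
-2 | Suppress column 2 (lines unique to FILE2) |
-3 | Suppress column 3 (lines common to both files) |
-i | Ignore case differences (BSD and macOS comm; not available in GNU coreutils) |
-u | Not documented for GNU coreutils or BSD comm, which print no column labels; GNU comm changes the column delimiter with --output-delimiter=STR |
-z | End lines with a NUL character instead of a newline (GNU coreutils) |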
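The column-suppression options are usually combined. As a minimal sketch, the following recreates the two sample files from the overview in a temporary directory (the directory and variable names are only illustrative) and captures the three most common combinations.

```shell
#!/bin/sh
# Recreate the sample files from the overview in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana orange peach > "$tmpdir/file1.txt"
printf '%s\n' apple cherry orange strawberry > "$tmpdir/file2.txt"

# -12 hides columns 1 and 2, leaving only lines common to both files.
common=$(comm -12 "$tmpdir/file1.txt" "$tmpdir/file2.txt")

# -23 hides columns 2 and 3, leaving lines found only in file1.txt.
only_first=$(comm -23 "$tmpdir/file1.txt" "$tmpdir/file2.txt")

# -13 hides columns 1 and 3, leaving lines found only in file2.txt.
only_second=$(comm -13 "$tmpdir/file1.txt" "$tmpdir/file2.txt")

printf 'common:\n%s\n' "$common"
printf 'only in file1.txt:\n%s\n' "$only_first"
printf 'only in file2.txt:\n%s\n' "$only_second"

rm -rf "$tmpdir"
```

When only one column remains, its lines are printed without leading tabs, so the result can be piped straight into other tools.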
Troubleshooting tips
- If the files are not sorted, comm will not produce correct output, and GNU comm usually prints a warning to standard error. Sort both files before comparing them.
- If the output is not what you expected, check that you have used the correct options and file names.
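One way to handle unsorted input is sketched below, assuming two hypothetical files, unsorted1.txt and unsorted2.txt. It uses sort -c to test whether a file is already sorted, then compares sorted copies.

```shell
#!/bin/sh
# Hypothetical unsorted inputs, written to a scratch directory for the demo.
scratch=$(mktemp -d)
printf '%s\n' peach apple orange banana > "$scratch/unsorted1.txt"
printf '%s\n' strawberry orange apple cherry > "$scratch/unsorted2.txt"

# sort -c exits with a non-zero status when a file is not sorted.
if sort -c "$scratch/unsorted1.txt" 2>/dev/null; then
    first_is_sorted=yes
else
    first_is_sorted=no
fi

# Sort both inputs into new files, then compare the sorted copies.
sort "$scratch/unsorted1.txt" > "$scratch/sorted1.txt"
sort "$scratch/unsorted2.txt" > "$scratch/sorted2.txt"
shared=$(comm -12 "$scratch/sorted1.txt" "$scratch/sorted2.txt")

echo "first file sorted: $first_is_sorted"
echo "$shared"

rm -rf "$scratch"
```

Run sort and comm in the same locale, since both rely on the same collating order to agree on what "sorted" means.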
Notes
- The comm command only works with sorted files. If the files are not sorted, use the sort command before using comm.
- The comm command assumes that the input files are ASCII or UTF-8 encoded.