comm – Compare two sorted files line by line

The comm command is a standard Unix utility, specified by POSIX and shipped with GNU coreutils on Linux, that compares two sorted files line by line. It reports which lines appear only in the first file, only in the second file, or in both, which makes it useful for finding the differences and similarities between two sorted lists.

Overview

The basic syntax of the comm command is:

comm [OPTION]... FILE1 FILE2

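where FILE1 and FILE2 are the two files to be compared; either one may be given as - to read from standard input. The output is divided into three tab-separated columns: lines unique to FILE1, lines unique to FILE2, and lines common to both files. Lines in column 2 are prefixed by one tab and lines in column 3 by two tabs.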

For example, let’s say we have two files named file1.txt and file2.txt with the following content:

$ cat file1.txt
apple
banana
orange
peach

$ cat file2.txt
apple
cherry
orange
strawberry

To compare these two files, we can use the comm command as follows:

$ comm file1.txt file2.txt

The output will be:

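                apple
banana
        cherry
                orange
peach
        strawberry

Lines with no indentation (banana, peach) are in the first column and are unique to file1.txt. Lines indented by one tab (cherry, strawberry) are in the second column and are unique to file2.txt. Lines indented by two tabs (apple, orange) are in the third column and are common to both files. The tabs are shown here as spaces.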

Specific use cases

  • Compare two configuration files to find the differences
  • Compare two log files to find common errors
  • Compare two lists of packages to find the missing or extra packages
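
As a rough illustration of the package-list use case, the sketch below compares two small lists. The file names installed.txt and required.txt and their contents are hypothetical, made up only for this example; both lists are written in sorted order because comm requires it.

```shell
# Hypothetical package lists, created in a temporary directory for illustration.
workdir=$(mktemp -d)
printf '%s\n' curl git vim > "$workdir/installed.txt"
printf '%s\n' curl git make > "$workdir/required.txt"

# -13 suppresses columns 1 and 3, leaving lines found only in required.txt.
missing=$(comm -13 "$workdir/installed.txt" "$workdir/required.txt")

# -23 suppresses columns 2 and 3, leaving lines found only in installed.txt.
extra=$(comm -23 "$workdir/installed.txt" "$workdir/required.txt")

echo "missing: $missing"
echo "extra: $extra"

# Clean up the temporary files.
rm -r "$workdir"
```

With these sample lists, make is reported as missing and vim as extra. Real package lists would usually need to be passed through sort first.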

Options

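The comm command has a few options that modify its behavior. The exact set depends on the implementation; the most commonly used ones are:

Option                  Description
-1                      Suppress column 1 (lines unique to FILE1)
-2                      Suppress column 2 (lines unique to FILE2)
-3                      Suppress column 3 (lines common to both files)
-i                      Ignore case differences (BSD and macOS comm only; not available in GNU comm)
--output-delimiter=STR  Separate columns with STR instead of a tab (GNU comm)
-z, --zero-terminated   Use NUL instead of newline as the line delimiter (GNU comm)

The column options can be combined: for example, -12 prints only the lines common to both files.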
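
The column-suppression options are easiest to understand when run against the sample files from the overview. The following is a minimal sketch that recreates file1.txt and file2.txt in a temporary directory (the directory and the variable names are assumptions of this example) and uses only the POSIX options -1, -2 and -3.

```shell
# Recreate the two sample files from the overview in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' apple banana orange peach > "$workdir/file1.txt"
printf '%s\n' apple cherry orange strawberry > "$workdir/file2.txt"

# -12 suppresses columns 1 and 2, so only the common lines remain.
common=$(comm -12 "$workdir/file1.txt" "$workdir/file2.txt")

# -3 suppresses column 3, so only the lines unique to either file remain.
# Lines unique to file2.txt are still prefixed by one tab.
unique=$(comm -3 "$workdir/file1.txt" "$workdir/file2.txt")

echo "Common lines:"
echo "$common"
echo "Unique lines:"
echo "$unique"

# Clean up the temporary files.
rm -r "$workdir"
```

Here the common lines are apple and orange, while the unique lines are banana and peach (from file1.txt) and cherry and strawberry (from file2.txt, each preceded by a tab).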

Troubleshooting tips

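  • If the files are not sorted, comm cannot match lines reliably and its output will be wrong. GNU comm typically also prints a warning such as "comm: file 1 is not in sorted order". Sort both files with the sort command before comparing them.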
  • If the output is not what you expected, check if you have used the correct options and file names.
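
One way to handle unsorted input is to sort each file into a temporary copy and compare the copies. The sketch below assumes hypothetical file names (unsorted1.txt, sorted1.txt, and so on) and sets LC_ALL=C for both sort and comm so that the two commands agree on the ordering; another locale works equally well as long as it is the same for both.

```shell
# Hypothetical unsorted input files, created for illustration only.
workdir=$(mktemp -d)
printf '%s\n' peach apple orange banana > "$workdir/unsorted1.txt"
printf '%s\n' strawberry orange cherry apple > "$workdir/unsorted2.txt"

# Sort each file under the same locale that comm will run in.
LC_ALL=C sort "$workdir/unsorted1.txt" > "$workdir/sorted1.txt"
LC_ALL=C sort "$workdir/unsorted2.txt" > "$workdir/sorted2.txt"

# Compare the sorted copies; -12 keeps only the lines common to both.
common=$(LC_ALL=C comm -12 "$workdir/sorted1.txt" "$workdir/sorted2.txt")
echo "$common"

# Clean up the temporary files.
rm -r "$workdir"
```

Keeping sorted copies avoids modifying the original files. In shells that support process substitution, such as bash, the temporary copies can be replaced by sorting each file inline.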

Notes

  • The comm command expects both files to be sorted in the collating order of the current locale, so sort and comm should be run under the same locale settings (for example, LC_ALL=C for both).
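  • The comm command interprets its input according to the current locale. Files in an encoding other than the locale's (typically ASCII or UTF-8) may be ordered or compared in unexpected ways.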