Introduction to comm (Combines the functionality of diff and cmp)


comm is a command-line utility in Linux that compares two sorted files line by line. Its output has three columns: lines found only in the first file, lines found only in the second file, and lines found in both. In that sense it sits between diff and cmp: like them it compares two files, but it reports which lines are shared or unique rather than edit instructions or byte offsets. comm is commonly used in scripting and automation tasks.

The comm command is written in C and is part of the GNU Core Utilities package. It is open-source software distributed under the GNU General Public License (GPL), version 3 or later.

Official documentation for comm: https://www.gnu.org/software/coreutils/manual/html_node/comm-invocation.html

Installation

comm is part of the GNU Core Utilities package, which is pre-installed on most Linux distributions. If it is missing, install the coreutils package with your distribution's package manager:

Ubuntu/Debian

sudo apt-get install coreutils

CentOS/RHEL

sudo yum install coreutils

Usage and Examples

The basic syntax of the comm command is:

comm [OPTION]... FILE1 FILE2

Here are some examples of how to use the comm command:

Example 1: Compare two sorted files and display all three columns

comm file1.txt file2.txt

This command compares the two sorted files file1.txt and file2.txt and prints three tab-separated columns: lines found only in file1.txt, lines found only in file2.txt, and lines common to both files.
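With no options, comm prints all three columns, and the indentation tells you which column a line belongs to. The sketch below shows this on two small sample files; the file contents and the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; both files are already in sorted order.
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/file1.txt"
printf 'banana\ncherry\ndate\n'  > "$dir/file2.txt"

# No options: all three columns are printed, separated by tabs.
output=$(comm "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$output"
# apple              column 1, no indent: only in file1.txt
# <TAB><TAB>banana   column 3, two tabs:  in both files
# <TAB><TAB>cherry   column 3, two tabs:  in both files
# <TAB>date          column 2, one tab:   only in file2.txt

rm -r "$dir"
```

The leading tabs are the only marker of the column, so output meant for further processing is usually produced with one of the suppression options instead.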

Example 2: Compare two sorted files and display unique lines

comm -23 file1.txt file2.txt

This command compares the two sorted files file1.txt and file2.txt and displays only the lines that are unique to file1.txt. The -23 option suppresses column 2 (lines unique to file2.txt, -2) and column 3 (lines common to both files, -3).
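The same idea works in the other direction: -13 keeps only column 2, the lines unique to the second file. Here is a minimal sketch of both directions, using made-up sample files in a temporary directory.

```shell
#!/bin/sh
# Hypothetical sample data; both files are already in sorted order.
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/file1.txt"
printf 'banana\ncherry\ndate\n'  > "$dir/file2.txt"

# -23 hides columns 2 and 3, leaving lines found only in file1.txt.
only_in_file1=$(comm -23 "$dir/file1.txt" "$dir/file2.txt")
# -13 hides columns 1 and 3, leaving lines found only in file2.txt.
only_in_file2=$(comm -13 "$dir/file1.txt" "$dir/file2.txt")

echo "Only in file1.txt: $only_in_file1"   # Only in file1.txt: apple
echo "Only in file2.txt: $only_in_file2"   # Only in file2.txt: date

rm -r "$dir"
```

When only one column remains, comm prints it without any leading tabs, so the result can be piped straight into other tools.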

Example 3: Compare two sorted files and display lines that are different

comm -3 file1.txt file2.txt

This command compares the two sorted files file1.txt and file2.txt and displays the lines that appear in only one of them. The -3 option suppresses column 3 (common lines), so the output keeps two columns: lines unique to file1.txt, and lines unique to file2.txt indented by a tab.
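With -3 the output still has two columns, so lines from the second file arrive with a leading tab. The sketch below shows the raw output and one possible way to flatten it into a single list; the sample files are made up, and stripping tabs is only safe when the data itself contains none.

```shell
#!/bin/sh
# Hypothetical sample data; both files are already in sorted order.
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/file1.txt"
printf 'banana\ncherry\ndate\n'  > "$dir/file2.txt"

# Two columns remain: file1-only lines, then file2-only lines after one tab.
two_columns=$(comm -3 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$two_columns"
# apple
# <TAB>date

# Flatten to one list by deleting the column tabs (assumes tab-free data).
flat=$(comm -3 "$dir/file1.txt" "$dir/file2.txt" | tr -d '\t')
printf '%s\n' "$flat"
# apple
# date

rm -r "$dir"
```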

Similar Commands and Benefits

There are several other commands and tools available in Linux that serve a similar purpose to comm. Some of them include:

diff

diff is a command-line utility that compares two files line by line and displays the differences between them. Unlike comm, diff does not require the files to be sorted and provides a more detailed output of the differences.
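Because comm expects sorted input, files in arbitrary order need a sort step first. A minimal sketch, assuming two made-up unsorted files and temporary copies for the sorted versions:

```shell
#!/bin/sh
# Hypothetical unsorted sample data.
dir=$(mktemp -d)
printf 'cherry\napple\nbanana\n' > "$dir/a.txt"
printf 'date\ncherry\nbanana\n'  > "$dir/b.txt"

# Sort each file into a temporary copy, then compare the sorted copies.
sort "$dir/a.txt" > "$dir/a.sorted"
sort "$dir/b.txt" > "$dir/b.sorted"

common=$(comm -12 "$dir/a.sorted" "$dir/b.sorted")
printf '%s\n' "$common"
# banana
# cherry

rm -r "$dir"
```

The originals stay untouched, which matters when their line order carries meaning that diff would preserve and sorting would destroy.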

cmp

cmp is a command-line utility that compares two files byte by byte and reports the byte offset and line number of the first difference. It is useful for comparing binary files or large files where line-by-line comparison is not practical.

The benefits of using comm over diff or cmp include:

  • Efficiency: because its inputs are already sorted, comm can compare them in a single sequential pass, which works well for large line-oriented files.
  • Simplicity: comm provides a simple and concise output that shows only the common, unique, or different lines between the files.
  • Flexibility: comm offers various options to customize the output and suppress specific lines, making it suitable for different use cases.
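As one illustration of the flexibility point, the GNU version of comm accepts an --output-delimiter option that replaces the tab between columns. This is a GNU extension, so the sketch below assumes GNU coreutils; other implementations may not support it. The sample files are made up.

```shell
#!/bin/sh
# Hypothetical sample data; both files are already in sorted order.
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/file1.txt"
printf 'banana\ncherry\ndate\n'  > "$dir/file2.txt"

# GNU-only: separate the three columns with a comma instead of a tab.
csv_like=$(comm --output-delimiter=, "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$csv_like"
# apple
# ,,banana
# ,,cherry
# ,date

rm -r "$dir"
```

A visible delimiter makes it easier to see which column a line is in than counting leading tabs.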

Script Examples

Here are three script examples that demonstrate the usage of comm in automation:

Script 1: Find common lines between two files

#!/bin/bash

file1="file1.txt"
file2="file2.txt"

common_lines=$(comm -12 <(sort "$file1") <(sort "$file2"))

echo "Common lines between $file1 and $file2:"
echo "$common_lines"

This script sorts file1.txt and file2.txt on the fly using bash process substitution, then displays the lines that are common to both files.

Script 2: Find unique lines in file1.txt

#!/bin/bash

file1="file1.txt"
file2="file2.txt"

unique_lines=$(comm -23 <(sort "$file1") <(sort "$file2"))

echo "Unique lines in $file1:"
echo "$unique_lines"

This script compares the two files file1.txt and file2.txt and displays the lines that are unique to file1.txt.

Script 3: Compare two files and output differences

#!/bin/bash

file1="file1.txt"
file2="file2.txt"

diff_lines=$(comm -3 <(sort "$file1") <(sort "$file2"))

echo "Lines that are different between $file1 and $file2:"
echo "$diff_lines"

This script compares the two files file1.txt and file2.txt and displays the lines that appear in only one of them. Lines unique to file2.txt are indented by a tab.
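comm expects its input in the same collating order that sort produced, so scripts that may run under different locale settings often pin the locale for both commands. The sketch below wraps that in a small helper; count_common is a hypothetical name chosen for this example, not part of comm, and it uses temporary files instead of process substitution so it also runs under plain sh.

```shell
#!/bin/sh
# count_common FILE1 FILE2: print how many distinct lines the two files share.
# (Hypothetical helper for illustration; not part of comm itself.)
count_common() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    # sort -u drops duplicate lines; LC_ALL=C gives sort and comm the same byte order.
    LC_ALL=C sort -u "$1" > "$tmp1"
    LC_ALL=C sort -u "$2" > "$tmp2"
    # wc -l pads its output on some systems, so strip any spaces.
    LC_ALL=C comm -12 "$tmp1" "$tmp2" | wc -l | tr -d ' '
    rm -f "$tmp1" "$tmp2"
}

# Usage with made-up, unsorted sample files.
dir=$(mktemp -d)
printf 'b\na\nc\n'    > "$dir/x.txt"
printf 'c\nd\nb\nb\n' > "$dir/y.txt"
shared=$(count_common "$dir/x.txt" "$dir/y.txt")
echo "Shared lines: $shared"   # Shared lines: 2
rm -r "$dir"
```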

List of comm Options

Command/Option   Description
comm             The main comm command, which compares two sorted files and prints three columns: lines unique to file1, lines unique to file2, and common lines.
-1               Suppress column 1 (lines unique to file1).
-2               Suppress column 2 (lines unique to file2).
-3               Suppress column 3 (common lines).
-12              Suppress columns 1 and 2, leaving only the common lines.
-23              Suppress columns 2 and 3, leaving only the lines unique to file1.

Conclusion

The comm command in Linux is a compact tool for comparing two sorted files and finding the lines they share and the lines unique to each. It is widely used in scripting and automation tasks. Its simplicity and flexibility make it a good choice when the question is which lines two files have in common, while diff and cmp remain better suited to edit-style and byte-level comparisons. Whether you are a developer, system administrator, or data analyst, comm can help you streamline your file comparison tasks.


