Unix shell script for removing duplicate files

2011-05-16 1 min read Bash Linux

The following shell script finds duplicate (2 or more identical) files and outputs a new shell script containing commented-out rm statements for deleting them (copy-paste from here):

::: updated on 02 May 20121, seems like wordpress did not like it so well so reformatting the code :::::::

#!/bin/bash -
#===============================================================================
#
#          FILE:  a.sh
#
#         USAGE:  ./a.sh
#
#   DESCRIPTION:
#
#       OPTIONS:  ---
#  REQUIREMENTS:  ---
#          BUGS:  ---
#         NOTES:  ---
#        AUTHOR: Amit Agarwal (aka), amit.agarwal@roamware.com
#       COMPANY: blog.amit-agarwal.co.in
#       CREATED: 02/05/12 06:52:08 IST
# Last modified: Wed May 02, 2012  07:03AM
#      REVISION:  ---
#===============================================================================

OUTF=rem-duplicates.sh;
echo "#!/bin/sh" >$OUTF;
find "$@" -type f -exec md5sum {} \; 2>/dev/null | sort --key=1,32 | uniq -w 32 -d |cut -b 1-32 --complement |sed 's/^/rm -f/' >>$OUTF

Pretty good one line, I must say 🙂