Linux - Checking for possible duplicate directories (probably needs RegEx)

I have a directory that contains several directories, as follows:

/Music/
/Music/JoeBlogs-Back_In_Black-1980
/Music/JoeBlogs-Back_In_Black-(Remastered)-2003
/Music/JoeBlogs-Back_In_Black-(Reissue)-1987
/Music/JoeBlogs-Thunder_Man-1947

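I need a script that goes through them and tells me when there are "possible" duplicates. In the example above, it would pick out the following possible duplicates from the list of directories:

/Music/JoeBlogs-Back_In_Black-1980
/Music/JoeBlogs-Back_In_Black-(Remastered)-2003
/Music/JoeBlogs-Back_In_Black-(Reissue)-1987

1) Is this possible?

2) If so, please help!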

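If the directory names follow a regular structure, such as:

foo-Name_of_Interest-bar
then you can use a simple regex to strip off "foo-" and "-bar" and do a direct comparison of what is left.

If that isn't possible, you will have to use a more expensive pattern-matching algorithm, such as an approximate (fuzzy) string comparison. There may be other techniques that are more appropriate.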

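A simple match in Bash (version 3.2 or later) might look like the following snippet:

dir='/Music/JoeBlogs-Back_In_Black-(Remastered)-2003'
regex='^([^-]*)-([^-]*)-(.*)$'
# Run the match first; it fills BASH_REMATCH with the captured fields.
# prev_dir is assumed to hold a copy of BASH_REMATCH from the previous entry.
[[ $dir =~ $regex ]]
if [[ ${BASH_REMATCH[1]} == "${prev_dir[1]}" &&    #  "/Music/JoeBlogs"
      ${BASH_REMATCH[2]} == "${prev_dir[2]}" ]]    #  "Back_In_Black"
then
    echo "we have a match"
fi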
This snippet doesn't show the find … | while read … loop, or how to keep track of the previous entry and the list of matches.
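
The missing loop could be sketched roughly as follows. This is only one possible way to fill in the gap, not the answerer's own code. The function name find_possible_duplicates is made up for illustration. It assumes that names arrive one per line on standard input and contain no newlines, and it treats the first two dash-separated fields as the comparison key, using the same regex as above.

```shell
#!/usr/bin/env bash
# Sketch: print runs of names whose first two dash-separated fields
# (e.g. "/Music/JoeBlogs" and "Back_In_Black") are identical.
# find_possible_duplicates is a hypothetical helper, not an existing tool.

find_possible_duplicates() {
    local regex='^([^-]*)-([^-]*)-(.*)$'
    local dir key prev_key='' prev_dir='' in_group=0

    # Sorting puts names with the same leading fields next to each other,
    # so each name only has to be compared with the one before it.
    LC_ALL=C sort | while IFS= read -r dir; do
        if [[ $dir =~ $regex ]]; then
            key="${BASH_REMATCH[1]}-${BASH_REMATCH[2]}"
        else
            key=''    # name does not follow the foo-Name_of_Interest-bar layout
        fi

        if [[ -n $key && $key == "$prev_key" ]]; then
            # Print the first member of a group only once.
            (( in_group )) || printf '%s\n' "$prev_dir"
            printf '%s\n' "$dir"
            in_group=1
        else
            in_group=0
        fi

        prev_key=$key
        prev_dir=$dir
    done
}
```

It could then be fed from find, for example: find /Music -mindepth 1 -maxdepth 1 -type d | find_possible_duplicates. Because only the leading fields are compared, names that differ in spelling or case would still be missed.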

Follow-up:

I got what I needed done by writing the Perl script below. It is my first ever Perl script (I had to learn Perl to write it, so don't be too hard on me :)

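You say you have several directories, but you only show one. What do the other directories look like?
Is the second text field between the dashes, as in JoeBlogs-Back_In_Black-1980, always the song name?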
#!/usr/bin/perl

# README
#
# Checks a folder for albums that are similar
# eg :
# Artist-Back_In_Black-(Remastered)-2001-XXX
# Artist-Back_In_Black-(Reissue)-2000-YYY
#
# Script prompts you for which one to "zz" (putting zz in front of the
# file name so you can delete it later)
#
# CONFIG
#
# Put your mp3 directory path in the $mp3dirpath variable
#

use strict;
use warnings;

my $mp3dirpath = '/data/downloads/MP3';

# END CONFIG

# Read the directory entries (skipping dot files, as ls does) and sort them
# so that similar names end up next to each other.
opendir(my $dh, $mp3dirpath) or die "Cannot open $mp3dirpath: $!";
my @txt = sort grep { !/^\./ } readdir($dh);
closedir($dh);

# Skip the first word (the artist) and capture the second word (the album
# title), e.g. "Back_In_Black" in "Artist-Back_In_Black-(Reissue)-2000-YYY".
my $re1 = '.*?';
my $re2 = '(?:[a-z][a-z0-9_]*)';
my $re3 = '.*?';
my $re4 = '((?:[a-z][a-z0-9_]*))';

my $re = $re1 . $re2 . $re3 . $re4;

my $foreach_count_before = 0;    # index of the current entry
my $foreach_count_after  = 1;    # index of the entry that follows it

my $number_in_arry = scalar(@txt);

# Stop once the "after" index would run past the end of the list.
while ($foreach_count_after < $number_in_arry) {
    my $before = $txt[$foreach_count_before];
    my $after  = $txt[$foreach_count_after];

    # Reset on every pass so a failed match cannot reuse a stale capture.
    my ($var1, $var2);
    $var1 = $1 if $before =~ m/$re/is;
    $var2 = $1 if $after  =~ m/$re/is;

    if (defined $var1 && defined $var2 && $var1 eq $var2) {
        print "-------------------------------------\n";
        print "$before \n";
        print "MATCHES \n";
        print "\n$after \n";
        print "Which Should I Remove? \n";
        print "[1] $before\n";
        print "[2] $after\n";
        print "[Any Other Key] Take No Action\n\n";

        my $answer = <STDIN>;    # Get user input
        $answer = defined $answer ? trim($answer) : '';

        if ($answer eq '1') {
            zz($before);
        }
        elsif ($answer eq '2') {
            zz($after);
        }
        else {
            print "Taking No Action\n";
        }
    }

    $foreach_count_before++;
    $foreach_count_after++;
}

# Prefix a directory name with "zz" so it can be reviewed and deleted later.
# rename() is used instead of shelling out to mv, because names such as
# "(Remastered)" contain characters the shell would try to interpret.
sub zz {
    my ($name) = @_;
    my $originalfilename = $mp3dirpath . '/' . $name;
    my $newfilename      = $mp3dirpath . '/' . 'zz' . $name;
    print "ZZing $name\n";
    rename($originalfilename, $newfilename)
        or warn "Could not rename $originalfilename: $!\n";
}

# SubRoutine For Trimming White Space from variables
sub trim {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}