Rename files with a specific extension in a directory to match another file in bash


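I have a specific set of downloaded files (all ending in .bam) in the /home/cmccabe/Desktop/NGS/API/2-15-2016 directory. What I want to do is rename each downloaded file using its match in $2 of name. To complicate things, the date of the folder is unique; the matching date is present as a header in name, and the matches are located under that header. I don't know how to do this, or whether it is even possible. Thank you :)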

Contents of folder
/home/cmccabe/Desktop/NGS/API/2-15-2016

IonXpress_001.bam
IonXpress_002.bam
IonXpress_003.bam
IonXpress_007.bam
file1.gz
file2.gz

name

2-15-2016
IonXpress_001.bam testname1_12345
IonXpress_002.bam testname2_45678
IonXpress_003.bam testname3_9012
IonXpress_007.bam testname1_12345-
2-19-2016
IonXpress_001.bam testname5_00000
IonXpress_002.bam testname6_11111
IonXpress_003.bam testname7_1213
IonXpress_007.bam testname8_78524

Desired result

testname1_12345.bam
testname2_45678.bam
testname3_9012.bam
testname1_12345.bam
file1.gz
file2.gz

bash so far

logfile=/home/cmccabe/Desktop/NGS/API/2-15-2016/process.log
for f in /home/cmccabe/Desktop/NGS/API/2-15-2016/*.bam ; do
echo "patient identifier creation: $(date) - File: $f"
bname=$(basename $f)
pref=${bname%%.bam}
while read from to ; do
for i in $f* ; do
if [ "$i" != "${i/$from/$to}" ] ; then
  mv $i ${i/$from/$to}
fi
done < names.txt
echo "End patient identifier creation: $(date) - File: $f"
done >> "$logfile"

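You could do it like this. I use your f variable in the sed (here it is assumed to hold the date to match, e.g. 2-15-2016):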

 cmd=$(sed -n "/$f/,/[0-9]\{1,2\}-[0-9]\{1,2\}-20[0-9]\{2\}/{s/\(.*\.bam\) \(.*\)/mv \1 \2/p}" names.txt)
 # For testing, use echo; this will also save what you just tried
 # to do to your log file :) just in case.
 echo "$cmd"
 # When it works the way you want,
 # uncomment the next line and it will execute your commands :)
 #eval "$cmd"

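The -n tells sed not to print the lines it reads.

Then, from the line matching the date ($f) to the next date pattern DD-DD-20DD, it executes the commands inside { }. In sed's basic regex syntax the interval braces must be escaped, so the pattern is written [0-9]\{1,2\}-[0-9]\{1,2\}-20[0-9]\{2\}.

The command inside the {} is a substitute "s" command, which matches a pattern and replaces it with another one.

I tell it to take the string up to .bam and put it between \( and \) to make it a group, then match the rest of the line and put it in another group.

The replacement pattern is the string mv, followed by group 1 and then group 2 as captured in the matching pattern. This effectively creates a list of mv file.bam new_filename commands.

These are then stored in the cmd variable.

The eval will execute those commands.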

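I took the sample content of your name.txt file and piped it through the same sed to illustrate (with f set to 2-12-2016, the first header in this sample):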

 ~$echo "2-12-2016
 IonXpress_001.bam testname1_12345
 IonXpress_002.bam testname2_45678
 IonXpress_003.bam testname3_9012
 IonXpress_007.bam testname1_12345-
 2-19-2016
 IonXpress_001.bam testname5_00000
 IonXpress_002.bam testname6_11111
 IonXpress_003.bam testname7_1213
 IonXpress_007.bam testname8_78524" |sed -n "/$f/,/[0-9]\{1,2\}-[0-9]\{1,2\}-20[0-9]\{2\}/{s/\(.*\.bam\) \(.*\)/mv \1 \2/p}"
 mv IonXpress_001.bam testname1_12345
 mv IonXpress_002.bam testname2_45678
 mv IonXpress_003.bam testname3_9012
 mv IonXpress_007.bam testname1_12345-
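One detail of the range's end pattern is worth checking on your own system: in basic regular expressions (what sed and grep use without -E), an interval is written \{m,n\}, while unescaped braces are taken as literal characters. The following is only a small sketch of that difference, using grep on a single header line; the variable names are made up for the illustration.

```shell
#!/usr/bin/env bash
# One header line like the ones in the name file.
header='2-19-2016'

# BRE with unescaped braces: the braces are literal characters, so nothing matches.
unescaped=$(printf '%s\n' "$header" | grep -c '^[0-9]{1,2}-[0-9]{1,2}-20[0-9]{2}$' || true)

# BRE with escaped braces: \{1,2\} is an interval, so the header matches.
escaped=$(printf '%s\n' "$header" | grep -c '^[0-9]\{1,2\}-[0-9]\{1,2\}-20[0-9]\{2\}$' || true)

# ERE (-E): here unescaped braces are intervals.
extended=$(printf '%s\n' "$header" | grep -Ec '^[0-9]{1,2}-[0-9]{1,2}-20[0-9]{2}$' || true)

echo "unescaped BRE: $unescaped, escaped BRE: $escaped, ERE: $extended"
```

If the end pattern of a sed range never matches, the range simply runs to the end of the file, so entries from later date sections would be picked up as well.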

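UPDATE: from your comments and edit, I see that I am not very good at explaining :) Here is an edited version of your script. Since it takes the date from the name of the current directory, I will assume that you are in the dated folder under /home/cmccabe/Desktop/NGS/API/ (for example 2-15-2016) when running it. If not, I am sure you will know how to change it, or make it an argument.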

 logfile=/home/cmccabe/Desktop/NGS/API/2-15-2016/process.log
 # No need to loop over each file ending in .bam, as the name file
 # will be our driver. After all, if an entry is not present in
 # the name file then we really cannot do anything.

 # First let's get the date from the folder name:
 #    pwd returns the current working directory (we are supposed
 #        to be in the directory to process)
 #    basename strips all but the last folder name, hence the date
 date_to_process=$(basename "$(pwd)")

 # Variable to store the name file path (hint: change this to where it really is, or pass it as an argument to the script).
 # Note: bash assignments take no spaces around the = sign.
 name_file_path="/home/cmccabe/Desktop/NGS/panels/names.txt"

 # From the name file, build the file move (mv) commands using sed
 # as described before and store them in the cmd variable.
 # Note that I added a couple of echo commands to produce the same output you
 # were trying to get. I also split the command over multiple lines
 # for clarity (well, I hope it makes it clearer at least).
 # The \n in the replacement is a GNU sed feature.
 cmd=$(sed -n "/$date_to_process/,/[0-9]\{1,2\}-[0-9]\{1,2\}-20[0-9]\{2\}/{
    s/\(.*\.bam\) \(.*\)/echo \"Start patient identifier creation: \$(date) - File: \1\"\n mv \1 \2\n echo \"End patient identifier creation: \$(date) - File: \1\"/p
 }" "$name_file_path")

 # Print the generated commands so you can see what it did.
 echo "about to execute this command: 
 $cmd" 

 # Execute the commands to perform the move operations and send the
 # output to the log file. Make sure to redirect stderr (errors) to the log file
 # too, so you will know if something failed: 2>&1 sends stderr to the same place as stdout.
 eval "$cmd" >> "$logfile" 2>&1
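If you would rather not eval generated text, one possible variation is to let sed print only the "old new" pairs and have a while loop call mv itself, with a switch that only prints what would happen. This is a rough sketch, not the script above: the function name rename_from_names, the --dry-run switch and the temporary demo directory are all made up for the illustration, and appending .bam to the new name is an assumption based on the desired result in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename_from_names NAMES_FILE DIR DATE [--dry-run]
rename_from_names() {
    local names_file=$1 dir=$2 date_hdr=$3 mode=${4:-}
    local old new
    # sed prints "old new" pairs from the section that starts at the date
    # header and ends at the next header line.
    sed -n "/^${date_hdr}\$/,/^[0-9]\{1,2\}-[0-9]\{1,2\}-20[0-9]\{2\}\$/{
        s/^\(.*\.bam\) \(.*\)\$/\1 \2/p
    }" "$names_file" |
    while read -r old new; do
        if [ "$mode" = "--dry-run" ]; then
            echo "would run: mv $dir/$old $dir/$new.bam"
        else
            # Assumption: keep the .bam extension on the new name.
            mv -- "$dir/$old" "$dir/$new.bam"
        fi
    done
}

# Demo on throwaway files, so nothing real is touched.
demo=$(mktemp -d)
mkdir "$demo/2-15-2016"
touch "$demo/2-15-2016/IonXpress_001.bam" "$demo/2-15-2016/IonXpress_002.bam" "$demo/2-15-2016/file1.gz"
printf '%s\n' '2-15-2016' \
              'IonXpress_001.bam testname1_12345' \
              'IonXpress_002.bam testname2_45678' \
              '2-19-2016' \
              'IonXpress_001.bam testname5_00000' > "$demo/names.txt"

rename_from_names "$demo/names.txt" "$demo/2-15-2016" 2-15-2016 --dry-run
rename_from_names "$demo/names.txt" "$demo/2-15-2016" 2-15-2016
ls "$demo/2-15-2016"
```

The dry run prints the planned renames first; the second call performs them. Names containing spaces would still need extra care, since read splits on whitespace.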

You can use this for loop with awk:
cd /home/cmccabe/Desktop/NGS/

for file in API/*/*.bam; do
   f="${file##*/}"
   path="${file%/*}"
   dt="${path##*/}"
   mv "$file" "$path/$(awk -v dt="$dt" -v f="$f" 'NF==1 {
               p=$0==dt ? 1 : 0; next} p && $1==f{print $2}' names.txt)"
done
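When awk finds no entry, the command substitution in the loop above is empty and the target becomes "$path/", the file's own directory, which can make mv complain that source and destination are the same file. A guarded variation might look like the sketch below. It is only an illustration under assumptions: the helper name lookup_name and the temporary demo tree are invented, and the .bam suffix on the new name is taken from the desired result in the question rather than from the loop above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the new name for file $2 under date header $1 in names file $3.
lookup_name() {
    awk -v dt="$1" -v f="$2" '
        NF == 1 { p = ($0 == dt); next }   # header line: is this the wanted section?
        p && $1 == f { print $2; exit }    # first matching entry wins
    ' "$3"
}

# Demo on throwaway files, so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/API/2-15-2016"
touch "$demo/API/2-15-2016/IonXpress_001.bam" "$demo/API/2-15-2016/IonXpress_009.bam"
printf '%s\n' '2-15-2016' \
              'IonXpress_001.bam testname1_12345' \
              '2-19-2016' \
              'IonXpress_009.bam testname9_99999' > "$demo/names.txt"

for file in "$demo"/API/*/*.bam; do
    f=${file##*/}        # file name without the directory
    path=${file%/*}      # directory part
    dt=${path##*/}       # last directory component, i.e. the date
    new=$(lookup_name "$dt" "$f" "$demo/names.txt")
    if [ -z "$new" ]; then
        echo "no entry for $f under $dt, skipping" >&2
        continue
    fi
    mv -- "$file" "$path/$new.bam"
done
ls "$demo/API/2-15-2016"
```

In the demo, IonXpress_009.bam is listed only under the 2-19-2016 header, so it is reported and left alone instead of being passed to mv with an empty name.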

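The date at the end of the folder path (/home/cmccabe/Desktop/NGS/API/2-15-2016) will match a header in name, and that is where the matches are located. Thank you very much :).

I apologize, those were typos; there should always be a match in the file. If there is no match, that should be an error. Thank you :).

I added an edit to the post, I am not doing something right.... Thank you very much :)

@Chris I updated my post with a simple script that should do what you need. I suggest commenting out the eval line at first to make sure the script does what you want (it prints the commands it generates, so you can check them).

@Chris glad you got another answer.. I had thought about using awk to process all the directories at once, but I was not sure that was what you wanted. There are a thousand ways to solve the same problem.. I tend to create a cmd variable or generate a temp script because I can set up a parameter (usually -n) that will not actually do anything but will let me know what it would do.. That way I can make sure I have it right before running it, instead of running something that only partly works and then having to figure out what worked and what failed.

The for loop with awk seems to work, but I get mv: 'API/2-12-2016/IonXpress_001_newheader.bam' and 'API/2-12-2016/IonXpress_001_newheader.bam' are the same file, rather than the files being renamed. Thank you very much :).

I have tested this command and it works fine on my system. Make sure your names.txt does not have DOS line endings. If it does, run dos2unix on that file first. To test awk alone, use: cd /home/cmccabe/Desktop/NGS/; awk -v dt='2-15-2016' -v f='IonXpress_001_newheader.bam' 'NF==1 {p=$0==dt ? 1 : 0; next} p && $1==f {print $2}' names.txt and see what output it gives.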
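To check for DOS line endings without guessing, something like the following could be used. It is a sketch on a throwaway file: the demo directory and the variable names are invented, and tr is used here only as a stand-in for dos2unix in case that tool is not installed.

```shell
#!/usr/bin/env bash
# Build a small names file with DOS (CRLF) line endings in a temp directory.
demo=$(mktemp -d)
printf '2-15-2016\r\nIonXpress_001.bam testname1_12345\r\n' > "$demo/names.txt"

# Count the lines that contain a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$demo/names.txt" || true)
echo "lines with a carriage return: $crlf_lines"

# Strip the carriage returns into a cleaned copy (stand-in for dos2unix).
tr -d '\r' < "$demo/names.txt" > "$demo/names.unix.txt"
clean_lines=$(grep -c "$cr" "$demo/names.unix.txt" || true)
echo "lines with a carriage return after cleaning: $clean_lines"
```

With a trailing carriage return, a header line reads as the date plus an invisible extra character, so the $0==dt comparison in awk never succeeds and no new name is printed.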