Linux: How to compare files with different columns in Unix?

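I want to compare the file names in Today.txt and Main.txt. If there is a match, print all 6 columns of the matching file from Main.txt to a new file, such as matched.txt.

For the files that have no match in Main.txt, list the file name and time from Today.txt in a new file, such as unmatched.txt.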

Main.txt

 date      filename          timestamp space  count   status
Nov 4    +CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time
Today.txt

 filename       time
CHCK01_20161104.txt 06:03
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
AR01_20161104.txt   09:36
AR02_20161104.txt   09:36
ifs01_20161104.txt  21:16
TRIPS11_20161104.txt 09:16
Desired output: matched.txt

Nov 4    +CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time
unmatched.txt

CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
ifs01_20161104.txt  21:16
Could you please help me with this?

Thanks a lot in advance.

With awk, one command each for matched and unmatched:

$ awk 'NR==FNR{a[$1]; next} $3 in a{print > "matched.txt"}' Today.txt Main.txt 
$ cat matched.txt 
Nov 4    CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time

$ awk 'NR==FNR{a[$3]; next} !($1 in a) && FNR>1{print > "unmatched.txt"}' Main.txt Today.txt 
$ cat unmatched.txt 
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
ifs01_20161104.txt  21:16
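  • The logic is similar in both commands: awk first fills the array a with the required column of the first file argument (the file name column)
  • Then, depending on whether the file name from the second file should or should not be present in a, the line is printed to the required output file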

Using a combination of grep and awk:

$ grep -Ff <(awk 'NR>1{print $1}' Today.txt) Main.txt 
Nov 4    CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time

$ grep -vFf <(awk 'NR>1{print $3}' Main.txt) Today.txt | tail -n+2
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
ifs01_20161104.txt  21:16
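
If your shell has no process substitution (the <(...) part), the same idea should also work by piping the pattern list into grep. This is only a minimal sketch: it assumes a grep that reads patterns from standard input when given -f - (GNU grep does), it uses sample data without the leading + so that it mirrors the output shown above, and it writes to the output file names from the question.

```shell
# Sample data, written to the current directory (no leading "+" here).
cat > Main.txt <<'EOF'
 date      filename          timestamp space  count   status
Nov 4    CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time
EOF

cat > Today.txt <<'EOF'
 filename       time
CHCK01_20161104.txt 06:03
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
AR01_20161104.txt   09:36
AR02_20161104.txt   09:36
ifs01_20161104.txt  21:16
TRIPS11_20161104.txt 09:16
EOF

# Matched: Main.txt lines containing any file name listed in Today.txt.
awk 'NR>1{print $1}' Today.txt | grep -Ff - Main.txt > matched.txt

# Unmatched: Today.txt lines containing none of the names from Main.txt;
# tail drops the header line, which grep -v lets through.
awk 'NR>1{print $3}' Main.txt | grep -vFf - Today.txt | tail -n +2 > unmatched.txt
```

Keep in mind that -F matches fixed substrings anywhere on the line, so a file name that is contained in a longer file name would also count as a match.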

awk to the rescue!

$ awk 'FNR==1{next} 
      NR==FNR{a[$1]=$2; next} 
      $3 in a{print; delete a[$3]} 
          END{for(k in a) print k,a[k] > "unmatched"}' today main > matched

$ head *matched

==> matched <==
Nov 4    CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time

==> unmatched <==
ifs01_20161104.txt 21:16
CHCK09_20161104.txt 21:46
CHCK05_20161104.txt 11:10
Here is an answer that uses the power of pipes:

tail -n +2 /tmp/today | while read a b; do \
    if ! grep $a /tmp/main >> /tmp/matched; then \
        echo $a $b; \
    fi; \
done > /tmp/unmatched
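Explanation

Print /tmp/today, except for the first line:

tail -n +2 /tmp/today

Read each line into two variables:

while read a b

grep for $a in /tmp/main and store the result in a file:

grep $a /tmp/main >> /tmp/matched

If grep returns non-zero, echo $a and $b:

echo $a $b

Output: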

root@do:~# cat /tmp/matched
Nov 4    CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
root@do:~# cat /tmp/unmatched
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
ifs01_20161104.txt 21:16
root@do:~#
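
A slightly more defensive variant of the same loop, as a sketch only. It assumes the sample files live in the current directory under the names today and main (instead of /tmp), quotes the variables, and uses grep -F so that the dots in the file names are taken literally rather than as regex wildcards.

```shell
# Sample input (same layout as in the question), written to the current directory.
cat > main <<'EOF'
 date      filename          timestamp space  count   status
Nov 4    +CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time
EOF

cat > today <<'EOF'
 filename       time
CHCK01_20161104.txt 06:03
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
AR01_20161104.txt   09:36
AR02_20161104.txt   09:36
ifs01_20161104.txt  21:16
TRIPS11_20161104.txt 09:16
EOF

: > matched   # start with an empty file so that reruns do not accumulate lines

tail -n +2 today | while read -r name time; do
    # -F: fixed-string match; --: end of options, in case a name starts with "-"
    if ! grep -F -- "$name" main >> matched; then
        printf '%s %s\n' "$name" "$time"
    fi
done > unmatched
```

Because grep looks for the name anywhere on the line, the +CHCK01_20161104.txt entry in main still counts as a match for CHCK01_20161104.txt. The matched lines come out in the order of today, and grep is started once per line, so this is best kept for small files.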


For tab-delimited output, you can set
-v OFS='\t'

I have a question for you. I am printing the files from the inprogress directory with a plus sign (+), as shown in the example: Nov 4 +CHCK01_20161104.txt 06:39 2.15M 17153 on_time. Files that are in progress get a plus sign (+) prepended; the other files keep the same name in Main.txt. I want both the files with the + symbol and the other (matched) files in my desired output. Please suggest how to compare Main.txt and Today.txt to get matched.txt and unmatched.txt. Thanks!
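
For the follow-up about the leading +, one possible approach is to strip the + from a copy of the file name before the lookup, while printing the Main.txt line unchanged. The following is a minimal sketch under that assumption, not a tested answer from the thread; the array name seen and the variable name key are made up for illustration, and the sample files are written to the current directory.

```shell
# Sample data modelled on the question; the leading "+" marks an in-progress file.
cat > Main.txt <<'EOF'
 date      filename          timestamp space  count   status
Nov 4    +CHCK01_20161104.txt  06:39   2.15M  17153    on_time
Nov 4    TRIPS11_20161104.txt 09:03   0.00M  24       On_Time
Nov 4    AR02_20161104.txt    09:31   0.00M  7        On_Time
Nov 4    AR01_20161104.txt    09:31   0.04M  433      On_Time
EOF

cat > Today.txt <<'EOF'
 filename       time
CHCK01_20161104.txt 06:03
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
AR01_20161104.txt   09:36
AR02_20161104.txt   09:36
ifs01_20161104.txt  21:16
TRIPS11_20161104.txt 09:16
EOF

awk '
    FNR == 1  { next }                  # skip the header line of each file
    NR == FNR { seen[$1] = $2; next }   # first file (Today.txt): name -> time
    {
        key = $3
        sub(/^\+/, "", key)             # compare without the leading "+"
        if (key in seen) {
            print > "matched.txt"       # Main.txt line as is, "+" included
            delete seen[key]            # whatever remains was never matched
        }
    }
    END {
        for (name in seen) print name, seen[name] > "unmatched.txt"
    }
' Today.txt Main.txt
```

The order of the for (name in seen) loop is unspecified in awk, so pipe unmatched.txt through sort if the order matters.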