Merging two files using the UNIX join command

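I am trying to merge multiple files whose layout is similar to the examples below. For now I am only working with two files. The files will always have the same number of rows, the same dates and the same times, and will be sorted in the same order. The only difference should be in the value field.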

File1.csv

date,time,value,status  
2014/09/10,22:47:25,-0.0000000003542,9  
2014/09/10,23:14:25,-0.0000000002892,9  
2014/09/10,23:23:46,0.0000000005406,9  
2014/09/10,23:41:48,-0.0000000000142,9  
2014/09/11,00:18:40,-0.0000000009977,9  
File2.csv

date,time,value,status  
2014/09/10,22:47:25,0.0000000725578,9  
2014/09/10,23:14:25,-0.0000000283722,9  
2014/09/10,23:23:46,-0.0000000368988,9  
2014/09/10,23:41:48,-0.0000000675033,9  
2014/09/11,00:18:40,-0.0000000774759,9  
Desired output

date,time,value,value  
2014/09/10,22:47:25,-0.0000000003542,0.0000000725578
2014/09/10,23:14:25,-0.0000000002892,-0.0000000283722
2014/09/10,23:23:46,0.0000000005406,-0.0000000368988
2014/09/10,23:41:48,-0.0000000000142,-0.0000000675033
2014/09/11,00:18:40,-0.0000000009977,-0.0000000774759
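I am not interested in keeping the status value in the merged result. I have tried several variations of the join command, the most recent being: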

join -t, -a 1 -a 2 -o 1.1 1.2 1.3 2.3 File1.csv File2.csv
Unfortunately, the output I get looks like the following and does not show the data from File1.csv at all

Current output

date,time,value,value  
,,,0.0000000725578  
,,,-0.0000000283722  
,,,-0.0000000368988  
,,,-0.0000000675033  
,,,-0.0000000774759  
,,,0.0000001042118  
Does anyone have any suggestions?

Thanks


Update

As a follow-up, I went back and updated the input files to combine the date and time into a single field, as shown below

File1.csv

date time,value,status
2014/09/10 22:47:25,-0.0000000003542,9
2014/09/10 23:14:25,-0.0000000002892,9
2014/09/10 23:23:46,0.0000000005406,9
2014/09/10 23:41:48,-0.0000000000142,9
2014/09/11 00:18:40,-0.0000000009977,9

File2.csv

date time,value,status
2014/09/10 22:47:25,0.0000000725578,9
2014/09/10 23:14:25,-0.0000000283722,9
2014/09/10 23:23:46,-0.0000000368988,9
2014/09/10 23:41:48,-0.0000000675033,9
2014/09/11 00:18:40,-0.0000000774759,9

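I have therefore updated the join command to the following:

join -t, -a 1 -a 2 -o "1.1 1.2 2.2" File1.csv File2.csv

Unfortunately, I still get output that seems to ignore the contents of File1.csv

Current output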

date time,value,value
,0.0000000725578
,-0.0000000283722
,-0.0000000368988
,-0.0000000675033
,-0.0000000774759


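Update

The problem appears to be related to the header in each file. If I remove the header from the files and then try the following join command:

join -t, -a 1 -a 2 -o "1.1 1.2 2.2" File1.csv File2.csv

it gives the following desired output: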

2014/09/10 22:47:25,-0.0000000003542,0.0000000725578
2014/09/10 23:14:25,-0.0000000002892,-0.0000000283722
2014/09/10 23:23:46,0.0000000005406,-0.0000000368988
2014/09/10 23:41:48,-0.0000000000142,-0.0000000675033
2014/09/11 00:18:40,-0.0000000009977,-0.0000000774759

Does anyone know a way to make join ignore the header line in the input files?


Thanks,

An awk one-liner, untested:

awk -F, -v OFS="," '{k=$1 FS $2}NR==FNR{a[k]=$3;next}
                                k in a{print k,a[k],$3}' file1 file2
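
To try the one-liner above end to end, the following sketch writes a shortened copy of the question's sample data into a temporary directory and runs the same awk program on it. The temporary directory and the variable name merged are illustrative only. It assumes a POSIX-style awk that accepts -v and the in operator; where the default awk is an older implementation, a different binary (for example nawk) may be needed, which is an assumption about the environment rather than something verified here.

```shell
#!/bin/sh
# Sketch: run the awk merge on a shortened copy of the question's sample data.
dir=$(mktemp -d) || exit 1

cat > "$dir/File1.csv" <<'EOF'
date,time,value,status
2014/09/10,22:47:25,-0.0000000003542,9
2014/09/10,23:14:25,-0.0000000002892,9
2014/09/11,00:18:40,-0.0000000009977,9
EOF

cat > "$dir/File2.csv" <<'EOF'
date,time,value,status
2014/09/10,22:47:25,0.0000000725578,9
2014/09/10,23:14:25,-0.0000000283722,9
2014/09/11,00:18:40,-0.0000000774759,9
EOF

# Key every row on "date,time". While reading the first file (NR == FNR),
# remember its value; while reading the second, print the key plus both
# values for each key that was also present in the first file.
merged=$(awk -F, -v OFS="," '
    { k = $1 FS $2 }
    NR == FNR { a[k] = $3; next }
    k in a { print k, a[k], $3 }
' "$dir/File1.csv" "$dir/File2.csv")

printf '%s\n' "$merged"
rm -rf "$dir"
```

Because the header row is keyed like any other row ("date,time"), it comes out as date,time,value,value without special handling.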

You need to put all of the output field specifications in a single argument, so the list must be quoted:

join -t, -a 1 -a 2 -o "1.1 1.2 1.3 2.3" File1.csv File2.csv
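However, this does not produce the output you want. join joins on a single key field, which defaults to the first field. Since the same date appears on multiple lines, those lines are all joined with each other, and the result is: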

date,time,value,value
2014/09/10,22:47:25,-0.0000000003542,0.0000000725578
2014/09/10,22:47:25,-0.0000000003542,-0.0000000283722
2014/09/10,22:47:25,-0.0000000003542,-0.0000000368988
2014/09/10,22:47:25,-0.0000000003542,-0.0000000675033
2014/09/10,23:14:25,-0.0000000002892,0.0000000725578
2014/09/10,23:14:25,-0.0000000002892,-0.0000000283722
2014/09/10,23:14:25,-0.0000000002892,-0.0000000368988
2014/09/10,23:14:25,-0.0000000002892,-0.0000000675033
2014/09/10,23:23:46,0.0000000005406,0.0000000725578
2014/09/10,23:23:46,0.0000000005406,-0.0000000283722
2014/09/10,23:23:46,0.0000000005406,-0.0000000368988
2014/09/10,23:23:46,0.0000000005406,-0.0000000675033
2014/09/10,23:41:48,-0.0000000000142,0.0000000725578
2014/09/10,23:41:48,-0.0000000000142,-0.0000000283722
2014/09/10,23:41:48,-0.0000000000142,-0.0000000368988
2014/09/10,23:41:48,-0.0000000000142,-0.0000000675033
2014/09/11,00:18:40,-0.0000000009977,-0.0000000774759
Instead, you could join on the time field:

join -1 2 -2 2 -t, -a 1 -a 2 -o "1.1 1.2 1.3 2.3" File1.csv File2.csv

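But this is not reliable either, because join requires its input to be sorted on the join field. If a time is repeated, or times from a later day sort before those of an earlier day, the input is out of order and rows will not be matched with those from the previous day.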
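
One way around the single-key and sort-order limits is to make date and time one field before joining, and to keep the header away from join altogether. The sketch below is one possible approach, not a tested fix for the asker's system: it assumes the timestamps already sort chronologically as plain text (true for the YYYY/MM/DD HH:MM:SS form shown), and the file names f1.keyed and f2.keyed and the variable merged are made up for illustration. GNU join also offers a --header option, but it may not exist in other implementations, so it is not used here.

```shell
#!/bin/sh
# Sketch: join on a combined "date time" key, handling the header separately.
LC_ALL=C; export LC_ALL   # byte-wise collation, so join's idea of order is predictable
dir=$(mktemp -d) || exit 1

cat > "$dir/File1.csv" <<'EOF'
date,time,value,status
2014/09/10,22:47:25,-0.0000000003542,9
2014/09/10,23:14:25,-0.0000000002892,9
2014/09/11,00:18:40,-0.0000000009977,9
EOF

cat > "$dir/File2.csv" <<'EOF'
date,time,value,status
2014/09/10,22:47:25,0.0000000725578,9
2014/09/10,23:14:25,-0.0000000283722,9
2014/09/11,00:18:40,-0.0000000774759,9
EOF

# Drop the header line (1d), then replace the first comma with a space so
# that "date time" becomes field 1, value field 2 and status field 3.
sed -e 1d -e 's/,/ /' "$dir/File1.csv" > "$dir/f1.keyed"
sed -e 1d -e 's/,/ /' "$dir/File2.csv" > "$dir/f2.keyed"

# Print a header by hand, then join on field 1 (the default key) and emit
# the key, the value from file 1 and the value from file 2.
merged=$(
    printf 'date time,value,value\n'
    join -t, -o 1.1,1.2,2.2 "$dir/f1.keyed" "$dir/f2.keyed"
)

printf '%s\n' "$merged"
rm -rf "$dir"
```

The header has to be removed because the text "date" sorts after every line starting with "2014", which makes the input look unsorted to join.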

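It seems you would be better off doing this in a script, so that you can check that the first three columns really are equal and then do the join.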
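
A minimal sketch of such a script, using paste and awk rather than join: since the question states that both files have the same rows in the same order, paste can put row N of each file side by side, and awk can then refuse to continue if the key columns disagree. Checking only date and time (fields 1 and 2 against 5 and 6) is a choice made for this sketch, because the value column is expected to differ; the sample data and the variable merged are illustrative.

```shell
#!/bin/sh
# Sketch: merge row by row and verify that the keys agree.
dir=$(mktemp -d) || exit 1

cat > "$dir/File1.csv" <<'EOF'
date,time,value,status
2014/09/10,22:47:25,-0.0000000003542,9
2014/09/10,23:14:25,-0.0000000002892,9
2014/09/11,00:18:40,-0.0000000009977,9
EOF

cat > "$dir/File2.csv" <<'EOF'
date,time,value,status
2014/09/10,22:47:25,0.0000000725578,9
2014/09/10,23:14:25,-0.0000000283722,9
2014/09/11,00:18:40,-0.0000000774759,9
EOF

# paste -d, glues line N of both files together, giving 8 fields:
#   1-4 from File1.csv (date,time,value,status), 5-8 from File2.csv.
# awk stops with an error if date or time differ, otherwise prints
# date, time, value from file 1 and value from file 2.
merged=$(paste -d, "$dir/File1.csv" "$dir/File2.csv" |
awk -F, -v OFS="," '
    $1 != $5 || $2 != $6 {
        print "key mismatch on line " NR | "cat 1>&2"
        exit 1
    }
    { print $1, $2, $3, $7 }
')

printf '%s\n' "$merged"
rm -rf "$dir"
```

This needs no sorting and treats the header like any other row, but it depends entirely on the two files being aligned line for line.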
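
join can only join on one field, and it defaults to the first field. You have multiple lines with the same date, so that cannot work.

Thanks. I had considered combining the date and time into a single field value, but I was going to do that after I had figured out the merge. It looks like I should consider that option now as well; I believe I can do it with sed.

Hmm. When I try the join based on the time field, I still get the same output as above.

awk: syntax error near line 1

@TrinityEllis I just ran a test with the one-liner above, and it gave exactly the expected output.

What is the separator between the "}" and the "k" in the one-liner? The code above displays as two lines for me. When I try running it as a single line with a space as the separator, that is when I get the awk syntax error.

@TrinityEllis There is no separator between } and k; you can also make it a space. It was tested with GNU awk.

I am running SunOS 5.10, so I am curious whether its awk interprets the same statement differently.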