Bash: remove duplicate usernames and merge the duplicated columns

I have a few different lists at the moment, and I'll do my best to explain them clearly.

List 1 looks like this:

user1,host1:port1
user2,host2:port2
user1,host3:port3
I look up the usernames in a database, which returns the following:

user1   email1
user2   email2
user1   email1
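In my example, both files contain duplicate users and emails. The host and port, however, may differ from entry to entry. What is the most efficient way to get output like this: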

user1   email1    host1:port1, host3:port3
user2   email2    host2:port2
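I assume this calls for some advanced use of awk, but frankly I have no idea how to do something like this. Any help or pointers in the right direction would be appreciated.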

$ cat file1
user1,host1:port1
user2,host2:port2
user1,host3:port3

$ cat file2
user1   email1
user2   email2
user1   email1

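$ cat tst.awk
# Split fields on runs of whitespace and/or commas.
BEGIN{ FS="[[:space:],]+" }
# First file: record each host:port as a subarray key under its user.
NR==FNR { user2hosts[$1][$2]; next }
# Second file: map each user to an email (duplicate lines just overwrite).
{ user2email[$1] = $2 }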
END {
   for (user in user2email) {
       printf "%s\t%s\t", user, user2email[user]
       sep = ""
       for (host in user2hosts[user]) {
           printf "%s%s", sep, host
           sep = ", "
       }
       print ""
   }
}

$ gawk -f tst.awk file1 file2
user1   email1  host1:port1, host3:port3
user2   email2  host2:port2
The above uses GNU awk 4.* for true two-dimensional arrays (arrays of arrays).

You can use this awk:

awk -F '[, ]+' 'FNR==NR {a[$1]=$0; next}
$1 in a {
   if (!seen[a[$1]])
      seen[a[$1]] = $2;
   else
      seen[a[$1]] = seen[a[$1]] ", " $2
}
END { for (i in seen) print i, seen[i]}' list2 list1
user2   email2 host2:port2
user1   email1 host1:port1, host3:port3