How do I sum the values of duplicate rows using awk?


I have a csv file with 11 rows, like this:

Order Date,Username,Order Number,No Resi,Quantity,Title,Update Date,Status,Price Per Item,Status Tracking,Alamat
05 Jun 2018,Mildred@email.com,205583995140400,,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Syahrul Address
05 Jun 2018,Mildred@email.com,205583995140400,,1,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Syahrul Address
05 Jun 2018,Martha@email.com,205486016644400,,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Faishal  Address
05 Jun 2018,Martha@email.com,205486016644400,,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Faishal  Address
05 Jun 2018,Misty@email.com,205588935534900,,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Rutwan Address
05 Jun 2018,Misty@email.com,205588935534900,,1,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Rutwan Address

I want to remove the duplicates in this file and sum the values in the Quantity column. I want the result to look like this:

Order Date,Username,Order Number,No Resi,Quantity,Title,Update Date,Status,Price Per Item,Status Tracking,Alamat
05 Jun 2018,Mildred@email.com,205583995140400,,3,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Syahrul Address
05 Jun 2018,Martha@email.com,205486016644400,,4,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Faishal  Address
05 Jun 2018,Misty@email.com,205588935534900,,3,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Rutwan Address

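I only want to sum the values in the Quantity column and leave everything else unchanged. I tried the solution from another question, but that answer only works when the file has 2 rows; mine has 11, so it does not work for me. How can I do this with awk?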


$ awk 'BEGIN{FS=OFS=","} 
       NR==1{print; next} 
            {q=$5; $5="~"; a[$0]+=q} 
       END  {for(k in a) {sub("~",a[k],k); print k}}' file

Order Date,Username,Order Number,No Resi,Quantity,Title,Update Date,Status,Price Per Item,Status Tracking,Alamat
05 Jun 2018,Misty@email.com,205588935534900,,3,Gold,05 Jun 2018 - 10:01,In Process,Rp3.000.000,Done,Rutwan Address
05 Jun 2018,Martha@email.com,205486016644400,,4,Gold,05 Jun 2018 - 10:01,In Process,Rp3.000.000,Done,Faishal  Address
05 Jun 2018,Mildred@email.com,205583995140400,,3,Gold,05 Jun 2018 - 10:01,In Process,Rp3.000.000,Done,Syahrul Address
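Note that the order of the records is not guaranteed, but the input does not need to be sorted beforehand either. There are several ways to preserve the order.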

Also, I am using ~ as a placeholder. If your data contains this character, replace it with one that does not occur in the data.
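
If no safe placeholder character is available, one alternative is to blank out the Quantity field and write the sum back by field number instead of by substitution. The following is only a minimal sketch under that assumption, not the method used in the answer above; the sample data, the temporary file and the function name sum_quantity are made up for illustration.

```shell
# Hypothetical sample file; the e-mail address deliberately contains "~".
sample=$(mktemp)
cat > "$sample" <<'EOF'
Date,User,Order,Resi,Quantity,Title
05 Jun 2018,a~b@email.com,1001,,2,Gold
05 Jun 2018,a~b@email.com,1001,,1,Gold
05 Jun 2018,c@email.com,1002,,4,Gold
EOF

# sum_quantity FILE: merge duplicate rows and sum field 5, keeping input order.
sum_quantity() {
    awk 'BEGIN { FS = OFS = "," }
         NR == 1 { print; next }                 # pass the header through
         {
             q = $5
             $5 = ""                             # blank Quantity: duplicates now share one key
             if (!($0 in sum)) order[++n] = $0   # remember first-seen order
             sum[$0] += q
         }
         END {
             for (i = 1; i <= n; i++) {
                 m = split(order[i], f, FS)      # split the key back into fields
                 f[5] = sum[order[i]]            # put the total into field 5
                 line = f[1]
                 for (j = 2; j <= m; j++) line = line OFS f[j]
                 print line
             }
         }' "$1"
}

sum_quantity "$sample"
```

With the sample above this prints the header followed by one row per order, with quantities 3 and 4, and the ~ inside the e-mail address is left alone.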

Update

To preserve the order (based on the first occurrence of each row):

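$ awk 'BEGIN{FS=OFS=","} 
       NR==1{print; next} 
            {q=$5; $5="~"; if(!($0 in a)) b[++c]=$0; a[$0]+=q} 
       END  {for(k=1;k<=c;k++) {sub("~",a[b[k]],b[k]); print b[k]}}' file

Adapted directly from Karafka's solution at the OP's request, with some code added to print the rows in the order in which they appear in Input_file.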

awk -F, '
FNR==1{
  print;
  next}
{
  val=$5;
  $5="~";
  a[$0]+=val
}
!b[$0]++{
  c[++count]=$0}
END{
  for(i=1;i<=count;i++){
     sub("~",a[c[i]],c[i]);
     print c[i]}
}' OFS=,   Input_file
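
Two idioms carry most of the work in the commands above: assigning to a field makes awk rebuild $0 with OFS, and !b[$0]++ is true only the first time a given line is seen. A small sketch with made-up input may help to see each one in isolation (the array name seen and the sample strings are just for illustration):

```shell
# 1) Assigning to a field rebuilds $0 with OFS, so OFS must also be a comma.
printf 'x,y,2,z\n' | awk 'BEGIN{FS=OFS=","} {$3="~"; print}'   # prints x,y,~,z

# 2) Without OFS the rebuilt line is joined with spaces instead.
printf 'x,y,2,z\n' | awk -F, '{$3="~"; print}'                 # prints x y ~ z

# 3) !seen[$0]++ is true only for the first occurrence of a line.
printf 'a\nb\na\n' | awk '!seen[$0]++'                         # prints a, then b
```

This is why both answers set OFS to a comma, and why the second answer can record each distinct row exactly once in c[].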
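
Joe, you have confused rows with columns; your file has 11 columns...
Are you sure? 123456789AB
Yes. If you don't mind, could you explain each part of the awk command so that I don't have to ask this question again? Thank you.
This should be self-explanatory (at least to me). Give it a shot, and if you have specific questions I will explain...
I like the placeholder trick (I wrote a more complicated answer...)
How can I preserve the order? The command scrambles the order of my csv file.
Mainly for the explanation, thank you.
BTW, you need to add NR==1{print;next} to keep the first line unchanged. Please edit your answer again.
@Joe, at your request I have now also added the FNR==1 condition to my solution.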