Awk: sum the values in the third column and split the lines accordingly

awk sed

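I have a file like the one below with n lines. I want to total the third column and distribute the lines across 3 different files, based on the sum each file should hold.

For example, adding up all the values in the third column gives 516, and dividing that by 3 gives 172.

So I want to add lines to the first file as long as its total does not exceed the 172 mark, do the same for the second file, and move all remaining lines to the third file.

Input file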

a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76 
l ai 12 
q al 09
d pl 34
e ik 30
f ll 10
g dl 15 
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
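
The arithmetic in the question can be checked with a short awk one-liner. This is only a sketch: the file name Input_file is an assumption (borrowed from the answers), and the sample data is recreated with a here-document so the snippet runs on its own.

```shell
# Recreate the sample input (the file name Input_file is an assumption).
cat > Input_file <<'EOF'
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
EOF

# Sum the third column, then print the total and the per-file target (total / 3).
awk '{ sum += $3 } END { printf "total=%d target=%d\n", sum, sum / 3 }' Input_file
# prints: total=516 target=172
```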

Expected output

file1 (total: 163)

a   aa  10
b   ab  15
c   ac  17
a   dy  30
y   ae  12
a   dl  34
a   fk  45

file2 (total: 153)

file3 (total: 200)

Could you please try the following, written and tested with the shown samples in GNU awk.

awk '
FNR==NR{
  sum+=$NF
  next
}
FNR==1{
  count=sum/3
}
{
  curr_sum+=$NF
}
(curr_sum>=count || FNR==1) && fileCnt<=2{
  close(out_file)
  out_file="file" ++fileCnt
  curr_sum=$NF
}
{
  print > (out_file)
}'   Input_file  Input_file

Explanation: Adding a detailed explanation for the above.

awk '                                               ##Starting awk program from here.
FNR==NR{                                            ##FNR==NR is TRUE only while Input_file is being read for the first time.
  sum+=$NF                                          ##Adding the last field of every line to sum to get the total for the whole Input_file.
  next                                              ##next skips all further statements for this line.
}
FNR==1{                                             ##Checking if this is the first line of the 2nd reading of Input_file.
  count=sum/3                                       ##Creating count with the value sum/3 here.
}
{
  curr_sum+=$NF                                     ##Keep adding the last field to curr_sum here.
}
(curr_sum>=count || FNR==1) && fileCnt<=2{          ##Checking if the current sum is >= count OR it is the first line (of the 2nd reading) AND the output file count is <=2.
  close(out_file)                                   ##Closing the previous output file; may NOT be needed since there are only 3 output files.
  out_file="file" ++fileCnt                         ##Creating the next output file name here.
  curr_sum=$NF                                      ##Resetting curr_sum to the last field of the current line.
}
{
  print > (out_file)                                ##Printing the current line into the output file here.
}'   Input_file  Input_file                         ##Passing Input_file twice so that it is read two times.
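
To compare the result with the expected totals, the script can be run on the sample and the last column of each output file summed. The following is a sketch under a few assumptions: the sample is saved as Input_file, the output names file1 to file3 are the ones the script generates, and the closing for loop is only an illustrative check that is not part of the answer. The increment is written as (++fileCnt) here purely to make the concatenation explicit.

```shell
# Recreate the sample input (the file name Input_file is an assumption).
cat > Input_file <<'EOF'
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
EOF

# Two-pass split: the first pass totals the last field, the second pass writes the files.
awk '
FNR==NR { sum += $NF; next }
FNR==1  { count = sum / 3 }
{ curr_sum += $NF }
(curr_sum >= count || FNR==1) && fileCnt <= 2 {
  close(out_file)
  out_file = "file" (++fileCnt)
  curr_sum = $NF
}
{ print > (out_file) }' Input_file Input_file

# Illustrative check: print the total of the last column for each output file.
for f in file1 file2 file3; do
  awk -v name="$f" '{ s += $NF } END { print name, s }' "$f"
done
# prints:
# file1 163
# file2 153
# file3 200
```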

awk '{ L[nr++]=$0; sum+=$3 }
     END{ sumpf=sum/3; sum=0; file=1;
          # Note: for (i in L) visits the indices in an unspecified order.
          for(i in L) { split(L[i],a);
          if ((sum+a[3])>sumpf && file<3) { file+=1; sum=0; };
          print i, L[i] > ("file" file);
          sum+=a[3];
        }
    }'  input


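This script reads all input into the array L and calculates sum.
In the END block, the sum per file, sumpf, is calculated and the output is written.

Compared with the other solution, this one needs the input file to be given only once.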
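
awk does not define the order in which for (i in L) visits array elements, so the stored lines may not be processed in input order. The sketch below walks them with a numeric index instead. It assumes that input order should be preserved and that only the original line (without the array index) should be written; the file name input follows the script, and the rest is illustrative rather than the answer author's code.

```shell
# Recreate the sample input (the file name "input" follows the script).
cat > input <<'EOF'
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
EOF

awk '{ L[nr++] = $0; sum += $3 }          # store every line, total column 3
     END {
       sumpf = sum / 3; sum = 0; file = 1
       for (i = 0; i < nr; i++) {          # numeric index keeps input order
         split(L[i], a)
         # move to the next file when this line would push the total past sumpf
         if ((sum + a[3]) > sumpf && file < 3) { file += 1; sum = 0 }
         print L[i] > ("file" file)
         sum += a[3]
       }
     }' input
```

On the sample data this writes 7 lines to file1, 4 lines to file2, and 10 lines to file3.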
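Are there any other rules that you did not mention here? I ask because the line l ah 56 is in file2, while the total of the preceding lines plus 56 is still below 172, so it should also fit in file1?

@Luuk - there are no other rules; this is just an example that I calculated by hand.

@TrueEntertainer, could you please add the efforts you made to solve this, as Sundeep requested (there is no right or wrong here, since we are all here to learn). My answer is ready and I will post it once you add them, cheers.

a=$(awk '{s+=$3}END{print s}' test7.txt) computes the total sum, and from c=$(($a/3)) we can get the value for each node. I tried, but could not work out the logic for the for loop (sorry, I am still learning Bash).

@TrueEntertainer, thanks for showing your efforts; please add them to your question, since comments are not meant for code.

Thanks a lot @RavinderSingh13, that solved my problem.

@TrueEntertainer, you are welcome; keep sharing and learning on this great site SO, cheers :) One more thing: please add your efforts to your question, as that is highly encouraged here; you can copy the efforts from your comment into the question (as I mentioned in my earlier comment), cheers.

@RavinderSingh13 - can we get almost equal values in all 3 files? With the above code I can see some difference between the totals of the 3 files: file1 is 163, file2 is 153, and file3 is 200.

@TrueEntertainer, sorry, I did not get that; the output files follow the samples you showed. Could you tell me what kind of output files you are looking for?

Yes, the output is correct... I was just wondering whether the values could somehow be distributed evenly. At the moment I can see some difference between the totals of the first/second and third files, e.g. file1 is 163, file2 is 153, and file3 is 200.

@Luuk - with the above code, if I sum the output of all three files, file1 is 143, file2 is 172, and file3 is 201. I do see a big difference between the totals of the first and third files. Can they be adjusted so that the totals of the three files are almost the same?

That is why I asked whether there were any other rules, and you answered "no other rules".

@Luuk - I have also posted another question about the above, please help - ""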
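
The comments ask whether the three totals can be brought closer together. One possible heuristic, which is not taken from either answer, is to move on to the next file whenever stopping short of the target leaves a smaller gap than overshooting it would. The sketch below assumes that lines must stay in input order, that each file receives one contiguous run of lines, and that the per-file target stays fixed at total / 3; the names target, cur, under, and over are illustrative.

```shell
# Recreate the sample input (the file name Input_file is an assumption).
cat > Input_file <<'EOF'
a aa 10
b ab 15
c ac 17
a dy 30
y ae 12
a dl 34
a fk 45
l ah 56
o aj 76
l ai 12
q al 09
d pl 34
e ik 30
f ll 10
g dl 15
h fr 17
i dd 23
j we 27
k rt 12
l yt 13
m tt 19
EOF

awk '
FNR == NR { sum += $NF; next }              # first pass: grand total
FNR == 1  { target = sum / 3; file = 1 }    # second pass starts here
{
  under = target - cur                      # gap left if we stop before this line
  over  = cur + $NF - target                # overshoot if we take this line
  # switch files only when taking the line overshoots by more than stopping falls short
  if (file < 3 && cur > 0 && over > 0 && over > under) { file++; cur = 0 }
  print > ("file" file)
  cur += $NF
}' Input_file Input_file

# Illustrative check: print the total of the last column for each output file.
for f in file1 file2 file3; do
  awk -v name="$f" '{ s += $NF } END { print name, s }' "$f"
done
# prints:
# file1 163
# file2 187
# file3 166
```

On the sample this narrows the spread between the smallest and largest total from 47 (153 vs 200) to 24 (163 vs 187). It is a greedy rule, so it does not guarantee the most even split possible.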