Linux awk: select rows based on the first 3 distinct values of a given column

Tags: linux, unix, awk, gawk

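I want to read fileIn.txt (comma-delimited) and write fileOut.txt containing only the rows that match the first 3 distinct values of a given column. For example, my input file looks like this: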

fileIn.txt
#location,day,time
home,mon,01:00
office,mon,06:00
home,mon,10:00
office,tues,03:00
home,wed,08:00
home,wed,11:00
home,thurs,02:00
home,fri,01:00
diner,fri,07:00
party,fri,09:00
home,sat,02:00
mall,sat,06:00
home,sat,09:00
beach,sun,01:00

I only want to select the rows for the first 3 distinct days, so my output file looks like this:

fileOut.txt
#location,day,time
home,mon,01:00
office,mon,06:00
home,mon,10:00
office,tues,03:00
home,wed,08:00
home,wed,11:00

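Your question is a bit confusing. But if I understand correctly, you want to print every line whose day of the week matches one of the first 3 distinct values the script finds in the file. You can do that with awk like this: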

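BEGIN { FS="," }

# Print the header as-is so that "day" is not counted as one of the 3 values.
NR == 1 { print; next }

{
    if (dayCount < 3 && !($2 in days)) { days[$2] = 1; ++dayCount }
    if ($2 in days) { print }
}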
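
The question asks about "a given column", so the same idea can be written with the column and the number of distinct values passed in from the command line. This is only a sketch: `col` and `n` are hypothetical variable names that do not appear in the answer, and it assumes the first line of the file is the `#location,day,time` header, which is passed through without being counted.

```shell
# Recreate the sample input from the question.
cat > fileIn.txt <<'EOF'
#location,day,time
home,mon,01:00
office,mon,06:00
home,mon,10:00
office,tues,03:00
home,wed,08:00
home,wed,11:00
home,thurs,02:00
home,fri,01:00
diner,fri,07:00
party,fri,09:00
home,sat,02:00
mall,sat,06:00
home,sat,09:00
beach,sun,01:00
EOF

# col = column to inspect, n = how many distinct values to keep (both hypothetical names).
awk -F, -v col=2 -v n=3 '
    NR == 1 { print; next }           # pass the header through without counting it
    count < n && !($col in keep) {    # a new value, and fewer than n collected so far
        keep[$col] = 1
        count++
    }
    $col in keep                      # pattern with no action: print the matching line
' fileIn.txt > fileOut.txt

cat fileOut.txt
```

Changing `col=2` to `col=1` would select by location instead of day. Because membership is checked on every row, the input does not have to be sorted for this version.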

awk to the rescue! To include the header, in a more idiomatic form:

$ awk -F, 'NR==1{c[$2]} length(c)<4{c[$2]} $2 in c' file

#location,day,time
home,mon,01:00
office,mon,06:00
home,mon,10:00
office,tues,03:00
home,wed,08:00
home,wed,11:00

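This relies on evaluation order: length(c) is not evaluated until c has been initialized as an array by the NR==1 rule.

Another approach, which stops reading as soon as a fourth distinct value appears (this relies on rows with the same value being adjacent):
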
awk -F, '
    /^#/            {print; next}   # keep comments
    ++seen[$2] == 1 {count++}       # incr counter the first time value is seen
    count > 3       {exit}          # quit if we have seen 4 values
                    {print}         # otherwise print this line
' file
Can the input file be assumed to be sorted?
Yes, you can assume that.
You wizard... can you help us newbies decipher this?
Wow, yes please.
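
For anyone trying to decipher the `length(c)` one-liner, here is a rough unrolled sketch of the same idea. It is not the answerer's exact code: the `length(c)` call is swapped for an explicit counter `k` (a hypothetical name), and `sample.txt` and `kept.txt` are made-up file names holding a shortened copy of the question's data.

```shell
# A shortened copy of the question's input (hypothetical file name).
cat > sample.txt <<'EOF'
#location,day,time
home,mon,01:00
office,tues,03:00
home,wed,08:00
home,wed,11:00
home,thurs,02:00
home,fri,01:00
EOF

awk -F, '
    # c is used as a set: merely referencing c[$2] creates that key.
    # The header field "day" is stored too, so the limit is 4 keys, not 3.
    !($2 in c) && k < 4 { c[$2]; k++ }

    # A pattern with no action prints the line when the pattern is true.
    $2 in c
' sample.txt > kept.txt

cat kept.txt
```

The header line is printed because its own second field, `day`, is the first key stored in `c`. The `thurs` and `fri` rows are dropped because the set is already full by the time they are read.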