Unix: using a value from an array, with the key taken from a substring of a column, in awk

I have a file named order.csv with the following data:

"Company","New Add Date"
"ELECTRICAL INSULATION SUPPLIES","200212"
"AVIS BUDGET GROUP","201110"
"HONEYWELL AEROSPACE","201307"
"AVIS BUDGET GROUP","201110"
"MERCK SHARP & DOHME","199608"
"PHARMA-BIO SERV INC","200803"
"UPS STORE","200407"
"PROCTER & GAMBLE","200403"
"W HOLDING CO INC","200712"
"AVIS BUDGET GROUP","201110"
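I want to get a full date (the last day of the month) based on the last 2 characters of the second column. For that, I am using the following command: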

awk -F, 'BEGIN{A[01]="31";A[02]="28";A[03]="31";A[04]="30";A[05]="31";A[06]="30";A[07]="31";A[08]="31";A[09]="30";A[10]="31";A[11]="30";A[12]="31";}{ print $1, substr($2,2,6)A[substr($2,6,2)] }' order.txt 
This gives the output:

"Company" New Ad
"ELECTRICAL INSULATION SUPPLIES" 20021231
"AVIS BUDGET GROUP" 20111031
"HONEYWELL AEROSPACE" 201307
"AVIS BUDGET GROUP" 20111031
"MERCK SHARP & DOHME" 199608
"PHARMA-BIO SERV INC" 200803
"UPS STORE" 200407
"PROCTER & GAMBLE" 200403
"W HOLDING CO INC" 20071231

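This does not produce the result I want. What am I doing wrong?

Sorry guys, I just made a mistake, and I have now corrected it. I think the leading 0 was being ignored, so I now write the keys as strings: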

 awk -F, 'BEGIN{A["01"]="31";A["02"]="28";A["03"]="31";A["04"]="30";A["05"]="31";A["06"]="30";A["07"]="31";A["08"]="31";A["09"]="30";A["10"]="31";A["11"]="30";A["12"]="31";}{ print $1, substr($2,2,6)A[substr($2,6,2)] }' order.txt
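A small check of why the quoted keys matter, as a sketch rather than anything from the original post: an unquoted 01 is a numeric constant, so the array key becomes "1", while substr() always returns a string such as "01". The wrapper name check_keys and the sample value "201301" are made up for this illustration.

```shell
# check_keys is a hypothetical wrapper name used only for this illustration.
check_keys() {
    awk 'BEGIN {
        A[01]   = "31"    # numeric constant 01 is the number 1, so the key is "1"
        B["01"] = "31"    # string constant, so the key is "01"

        k = substr("201301", 5, 2)   # substr() returns the string "01"

        a1 = ("1" in A)   # 1: the numeric key was stored as "1"
        a2 = ("01" in A)  # 0: there is no "01" key in A
        b1 = (k in B)     # 1: the string key matches what substr() returns
        print a1, a2, b1
    }'
}
check_keys
```

This prints 1 0 1, which matches the original output: only months 10, 11 and 12 found a value, because those keys look the same as numbers and as strings.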

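Because the number of days in February depends on whether the year is a leap year, the number of days in a month depends on both the month and the year.

You can use the following gawk (GNU awk) script to do this:

last_day.awk:

It can be called like this: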

gawk -F, -f last_day.awk order.csv


By the way, it is gawk-specific because it uses mktime() and strftime().

Please try the following:

awk -F'[",]' '
BEGIN{
    split("31,28,31,30,31,30,31,31,30,31,30,31", month,",")
}
{
    month[2]=((substr($5,1,4)%4+0)==0 && (substr($5,1,4)%100+0!=0)) || (substr($5,1,4)%400+0==0)?29:28;
    val=substr($5,5,2)~/^0/?1:2;
    print substr($0,1,length($0)-1)\
          month[substr($0,length($0)-val,val)]\
          substr($0,length($0))
}
'  Input_file

This also handles February in leap years.
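The leap-year rule can also be pulled out into a function, which makes the intent easier to read. The following is only a sketch that should work with any POSIX awk. The name is_leap_year() comes from a suggestion in the comments; days_in_month() and the shell wrapper last_day are names assumed for this example. It also assumes every field is double-quoted, as in the sample input.

```shell
# last_day is a hypothetical wrapper; pass it the CSV file or pipe data into it.
last_day() {
    awk '
    # A year is a leap year if divisible by 4 but not by 100, or divisible by 400.
    function is_leap_year(y) {
        return (y % 4 == 0 && y % 100 != 0) || (y % 400 == 0)
    }
    # Number of days in month m (1-12) of year y.
    function days_in_month(y, m) {
        if (m == 2)
            return is_leap_year(y) ? 29 : 28
        return (m == 4 || m == 6 || m == 9 || m == 11) ? 30 : 31
    }
    # Split on double quotes so that $4 is the bare YYYYMM value.
    BEGIN { FS = OFS = "\"" }
    NR > 1 {
        ym = $4
        $4 = ym days_in_month(substr(ym, 1, 4) + 0, substr(ym, 5, 2) + 0)
    }
    { print }
    ' "$@"
}
```

It could be called as last_day order.csv. The header line is skipped by NR > 1 and printed unchanged.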

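Oh, I got it, I need to write it like A["03"]="31". Thanks.

Your sample input would be much more useful if you hadn't picked only dates that have 31 days. Someone could post a solution that always sets the number of days to 31 (or ignores leap years, or ...) and it would appear to work with your sample input.

What about February in a leap year?

Good idea, but not 100% correct. If a year is divisible by 100, it is not a leap year, unless it is also divisible by 400.

Sorry, hek2mgl, could you explain that? I don't understand. I would appreciate it.

Just take a look.

I just found that out too.

OK, now it looks good, +1. It should work with any POSIX version of awk. I might also turn it into a function is_leap_year(). What you are doing would be more obvious.

@EdMorton: I was actually printing the value of a variable before producing the actual code; I have removed it now.

When you set a date such as April 31, you get May 1. For that day (1), you can use the modulo operator to get the last day of the previous month.

I didn't know mktime() would do that. I would have expected it to just return -1 for an invalid date. OK, good to know, thx. :)

This is the first time I have shown EdMorton something new! Now off on vacation for three days. (Need to hurry.) What a moment! :)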
$ cat tst.awk
BEGIN { FS=OFS="\"" }
NR>1 {
    # Get the secs since epoch for the 1st of next month then subtract
    # 1 days worth of seconds to get the last day of this month
    nextMth = substr($4,5) % 12 + 1
    year = substr($4,1,4) + (nextMth == 1 ? 1 : 0)
    secs = mktime(year" "nextMth" 1 0 0 0") - 24*60*60
    $4 = strftime("%Y%m%d",secs)
}
{ print }

$ awk -f tst.awk file
"Company","New Add Date"
"ELECTRICAL INSULATION SUPPLIES","20021231"
"AVIS BUDGET GROUP","20111031"
"HONEYWELL AEROSPACE","20130731"
"AVIS BUDGET GROUP","20111031"
"MERCK SHARP & DOHME","19960831"
"PHARMA-BIO SERV INC","20080331"
"UPS STORE","20040731"
"PROCTER & GAMBLE","20040331"
"W HOLDING CO INC","20071231"
"AVIS BUDGET GROUP","20111031"