Shell: Append delimiters for implied blank fields


shell awk


I am looking for a simple way to make every line of a file (a CSV file) contain the same number of commas, e.g.

Example file:

1,1
A,B,C,D,E,F
2,2,
3,3,3,
4,4,4,4

Expected:

1,1,,,,
A,B,C,D,E,F
2,2,,,,
3,3,3,,,
4,4,4,4,,

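In this example, the line with the most commas has 5 of them (line 2). So I want to append commas to all the other lines until every line has the same number (i.e. 5 commas).

Using awk: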

$ awk 'BEGIN{FS=OFS=","} {$6=$6} 1' file
1,1,,,,
A,B,C,D,E,F
2,2,,,,
3,3,3,,,
4,4,4,4,,

As shown above, this approach requires hard-coding the maximum number of fields in the command.
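If the field count is known up front, it can at least be passed in as a variable instead of being written into the program text. The following is a minimal sketch of that idea, not part of the original answer: the function name pad_to, the variable n and the file name sample.csv are all made up for illustration.

```shell
# Recreate the sample file from the question (sample.csv is an illustrative name).
printf '%s\n' '1,1' 'A,B,C,D,E,F' '2,2,' '3,3,3,' '4,4,4,4' > sample.csv

# pad_to N FILE: pad every line of FILE to N comma-separated fields.
# Assigning to field n when n > NF makes awk create the missing empty
# fields and rebuild the line with OFS between them.
pad_to() {
    awk -v n="$1" 'BEGIN { FS = OFS = "," } NF < n { $n = "" } { print }' "$2"
}

pad_to 6 sample.csv
```

On the sample file this should print the same five lines as the hard-coded $6=$6 command. Lines that already have n or more fields are printed unchanged.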

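Please try the following, more generic approach. It works even when the number of fields in Input_file differs from line to line and is not known in advance: it first reads the whole file to find the maximum number of fields, and then, on the second read, it reassigns field nf on every line (since OFS is set to a comma, any line with fewer than nf fields gets the missing commas appended). This is an enhanced version of @oguz ismail's answer.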

awk '
BEGIN{
 FS=OFS=","
}
FNR==NR{
 nf=nf>NF?nf:NF
 next
}
{
 $nf=$nf
}
1
'  Input_file  Input_file

Explanation: a detailed explanation of the code above.

awk '                ##Start of the awk program.
BEGIN{               ##Start of the BEGIN section.
 FS=OFS=","          ##Set FS and OFS to a comma for all lines.
}
FNR==NR{             ##FNR==NR is TRUE only while Input_file is being read the first time.
 nf=nf>NF?nf:NF      ##Track the maximum field count: keep nf if it is greater than NF, otherwise set nf to NF.
 next                ##next skips all further statements for this line.
}
{
 $nf=$nf             ##Assigning $nf to itself rebuilds the current line, appending comma(s) at its end if NF is less than nf.
}
1                    ##1 prints the edited/unedited line.
' Input_file Input_file      ##Input_file is named twice so that it is read twice.
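Because the program reads its input twice, it needs a real file rather than a pipe. A small wrapper can make that requirement explicit. This is only a sketch under that assumption: the function name pad_csv and the file name sample.csv are hypothetical, and the awk body is a lightly restructured form of the two-pass program above.

```shell
# Recreate the sample file from the question (sample.csv is an illustrative name).
printf '%s\n' '1,1' 'A,B,C,D,E,F' '2,2,' '3,3,3,' '4,4,4,4' > sample.csv

# pad_csv FILE: two-pass padding of FILE to its own maximum field count.
# Refuses anything that is not a regular file, since a pipe cannot be read twice.
pad_csv() {
    if [ ! -f "$1" ]; then
        echo "pad_csv: '$1' is not a regular file" >&2
        return 1
    fi
    awk '
    BEGIN     { FS = OFS = "," }
    FNR == NR { if (NF > nf) nf = NF; next }   # pass 1: remember the widest line
    NF < nf   { $nf = "" }                     # pass 2: extend short lines only
              { print }
    ' "$1" "$1"
}

pad_csv sample.csv
```

The only behavioural difference from $nf=$nf is that lines already at the maximum width are not rebuilt at all, which should not change the output for comma-separated input.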

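Another way to make all lines in the CSV file have the same number of fields does not require knowing the number of fields in advance. The max number of fields is computed, and a substring of the required commas is appended to each record, e.g.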

awk -F, -v max=0 '{
    lines[n++] = $0             # store lines indexed by line number
    fields[lines[n-1]] = NF     # store number of field indexed by $0
    if (NF > max)               # find max NF value
        max = NF
}
END {
    for(i=0;i<max;i++)          # form string with max commas
        commastr=commastr","
    for(i=0;i<n;i++)            # loop appended substring of commas 
        printf "%s%s\n", lines[i], substr(commastr,1,max-fields[lines[i]])
}' file
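A possible variation on the same idea, shown here only as a sketch: the field count can be stored under the line number instead of under the line text, and the program can read standard input so it also works at the end of a pipeline. The function name pad_stdin and the array names line and cnt are illustrative, not taken from the answer.

```shell
# pad_stdin: read CSV lines on stdin, buffer them, then pad each one to the
# widest field count seen. The whole input is held in memory until END.
pad_stdin() {
    awk -F, '
    {
        line[NR] = $0                 # original text of this line
        cnt[NR]  = NF                 # how many fields it had
        if (NF > max) max = NF        # widest line so far
    }
    END {
        for (i = 1; i <= NR; i++) {
            out = line[i]
            for (j = cnt[i]; j < max; j++)   # one comma per missing field
                out = out ","
            print out
        }
    }'
}

printf '%s\n' '1,1' 'A,B,C,D,E,F' '2,2,' '3,3,3,' '4,4,4,4' | pad_stdin
```

The trade-off against the two-pass version is memory: every line is kept until the end of input, which is fine for small files but may matter for very large ones.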


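@DavidC.Rankin, I am actually running the code with 2 input files (the same file named twice). The first pass scans the file and gets the maximum field count, and the second simply does $nf=$nf. It is oguz ismail's idea; I just made it more generic :) When we do $nf=$nf, it rebuilds the line and adds commas if the current line has fewer fields than the maximum. Feel free to ask if you have any questions, and you are better at awk than I am :)

I see, that's a good approach too! I saw the two input files but didn't catch that it was simply a way to run over the same file twice, collecting the maximum on the first pass :)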

Example use/output of the array-based command:

$ awk -F, -v max=0 '{
>     lines[n++] = $0             # store lines indexed by line number
>     fields[lines[n-1]] = NF     # store number of field indexed by $0
>     if (NF > max)               # find max NF value
>         max = NF
> }
> END {
>     for(i=0;i<max;i++)          # form string with max commas
>         commastr=commastr","
>     for(i=0;i<n;i++)            # loop appended substring of commas
>         printf "%s%s\n", lines[i], substr(commastr,1,max-fields[lines[i]])
> }' file
1,1,,,,
A,B,C,D,E,F
2,2,,,,
3,3,3,,,
4,4,4,4,,