Shell: how to ignore spaces and commas when reading a file

Tags: shell, unix, awk, ksh


I have a comma-separated file and need to extract the third field from each line. The file test.txt contains the following:

6,STRING TO DECIMAL WITHOUT DEFAULT,cast($src_fld as DECIMAL(15,2) $tgt_fld 
7,STRING TO INTERGER WITHOUT DEFAULT,cast($src_fld as integer) $tgt_fld                
10,DEFAULT NO RULE,'$default' $tgt_fld

cat test.txt | awk -F, '{print $3}'

If I run the command above, I get incorrect output, as shown below:

> cast($src_fld as DECIMAL(15
> cast($src_fld as integer) $tgt_fld
> '$default' $tgt_fld

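Can anyone tell me how to achieve this? I need to write it in a loop so that I can do further processing later. Note that the third field may contain spaces and commas (,).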

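If there are commas in the first two fields, your task is impossible:

1,second,field,with,commas,third,field,with,commas

There is no way to know where the second field ends and the third field begins.

You would have to use actual CSV syntax and parse the file with a CSV parser:

1,"second,field,with,commas","third,field,with,commas"

If you can be certain there are no commas in the first two fields, you can do this: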

sed 's/^[^,]\+,[^,]\+,//' file
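To unpack that command: `^` anchors the match at the start of the line, and each `[^,]\+,` consumes one run of non-comma characters plus the comma that ends it, so two of them cover the first two fields. Replacing the match with nothing leaves the remainder of the line, embedded commas included. Note that `\+` is a GNU extension in basic regular expressions. Below is a minimal sketch of a more portable spelling, assuming the first two fields are non-empty and comma-free; the function name `third_field` is made up for illustration.

```shell
# third_field: strip the first two comma-separated fields from each input line.
# [^,][^,]* is the POSIX BRE way to say "one or more non-comma characters".
third_field() {
    sed 's/^[^,][^,]*,[^,][^,]*,//'
}

# Sample input taken from the question; the quoted 'EOF' prevents $ expansion.
third_field <<'EOF'
6,STRING TO DECIMAL WITHOUT DEFAULT,cast($src_fld as DECIMAL(15,2) $tgt_fld
10,DEFAULT NO RULE,'$default' $tgt_fld
EOF
```

Lines that do not start with two comma-terminated fields are passed through unchanged, because the substitution simply fails to match.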

awk to the rescue!

Not a general solution, but it works for your format:

$ awk -F, '{for(i=4;i<=NF;i++) $3 = $3 FS $i} {print $3}' badcsv

cast($src_fld as DECIMAL(15,2) $tgt_fld
cast($src_fld as integer) $tgt_fld
'$default' $tgt_fld
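The loop above glues fields 4 through NF back onto $3, re-inserting the separator each time. An alternative sketch that never rebuilds fields is to cut each line just after its second comma using `index` and `substr`, both of which are available in POSIX awk. It relies on the same assumption (no commas in the first two fields), and the function name `after_second_comma` is hypothetical.

```shell
# after_second_comma: print everything after the second comma of each line.
after_second_comma() {
    awk '{
        rest = $0
        # Drop everything up to and including the next comma, twice.
        for (n = 0; n < 2; n++)
            rest = substr(rest, index(rest, ",") + 1)
        print rest
    }'
}

# Sample input taken from the question.
after_second_comma <<'EOF'
6,STRING TO DECIMAL WITHOUT DEFAULT,cast($src_fld as DECIMAL(15,2) $tgt_fld
7,STRING TO INTERGER WITHOUT DEFAULT,cast($src_fld as integer) $tgt_fld
EOF
```

If a line has no comma, `index` returns 0 and `substr(rest, 1)` leaves the line untouched.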

As you said, if the first two fields do not contain commas, you can use cut with a comma as the field delimiter:
$ cut -d ',' -f 3- test.txt 
cast($src_fld as DECIMAL(15,2) $tgt_fld 
cast($src_fld as integer) $tgt_fld                
'$default' $tgt_fld

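You didn't tell us what the correct output is, only what it isn't, so this is just a guess at what you might want. If it isn't quite right, you should be able to work out what you need from it. It uses gensub(), which requires GNU awk: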
$ cat tst.awk
BEGIN { FS="," }
{
    $0 = gensub(/([(][^()]+),([^()]+[)])/,"\\1"RS"\\2","g",$0)
    for (i=1; i<=NF; i++) {
        gsub(RS,FS,$i)
        print NR, NF, i, $i
    }
    print "----"
}

$ awk -f tst.awk file
1 3 1 6
1 3 2 STRING TO DECIMAL WITHOUT DEFAULT
1 3 3 cast($src_fld as DECIMAL(15,2) $tgt_fld
----
2 3 1 7
2 3 2 STRING TO INTERGER WITHOUT DEFAULT
2 3 3 cast($src_fld as integer) $tgt_fld
----
3 3 1 10
3 3 2 DEFAULT NO RULE
3 3 3 '$default' $tgt_fld
----

If you want to use a loop, you can use:
while IFS=, read -r field1 field2 rest_of_line; do
   echo "Field 3: ${rest_of_line}" 
done < test.txt
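This works because `read` assigns whatever is left of the line to the last variable named, so commas inside the third field are kept; and since `IFS` is set to a comma only, spaces are not treated as separators. Here is a minimal sketch of doing some processing inside the loop, assuming you want to trim the trailing whitespace that some lines of test.txt carry. The function name `process_rules`, the variable names, and the `|`-separated output format are made up for illustration.

```shell
# process_rules: read "id,name,expression" lines from standard input.
process_rules() {
    while IFS=, read -r id name expr; do
        # Strip trailing whitespace: the inner expansion isolates the
        # trailing blanks, the outer one removes them from the end.
        expr=${expr%"${expr##*[![:space:]]}"}
        printf '%s|%s|%s\n' "$id" "$name" "$expr"
    done
}

# Sample input taken from the question.
process_rules <<'EOF'
6,STRING TO DECIMAL WITHOUT DEFAULT,cast($src_fld as DECIMAL(15,2) $tgt_fld
10,DEFAULT NO RULE,'$default' $tgt_fld
EOF
```

Reading from standard input rather than a hard-coded file name means the same loop can be fed with `process_rules < test.txt` or from a pipe.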

I don't understand. Doesn't awk already loop over the file line by line?

I have now edited my question with the correct output.

No, you haven't. All the question shows is the output you don't want, not the output you do want.

Thanks for the quick reply. I am sure there will not be any commas in the first two fields. Could you explain how the sed command above works? Can I extract only the third field with it?

It removes the first two comma-separated fields, leaving only the third. [^,]\+ is one or more non-comma characters.

Thanks for your reply. Could you tell me how to use this command in a for loop? After extracting the third field I need to process it. If I write it as for i in `cut -d ',' -f 3- test.txt`; do echo $i; done; it does not give the correct output.

@user1768029 You can do while IFS= read -r line; do echo $line; done < ...

@user1768029 Don't use shell loops to manipulate text.
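The `for i in` attempt with cut goes wrong because an unquoted command substitution is split on whitespace, so the loop runs once per word rather than once per line. The sketch below contrasts the two loop styles on a temporary copy of the sample data; the function names `loop_with_for` and `loop_with_while` are hypothetical, and the pipe is used instead of process substitution only to stay portable across shells.

```shell
# loop_with_for: iterates over whitespace-separated words (usually not wanted).
loop_with_for() {
    for i in $(cut -d, -f3- "$1"); do
        echo "[$i]"
    done
}

# loop_with_while: iterates over whole lines, preserving embedded spaces.
loop_with_while() {
    cut -d, -f3- "$1" | while IFS= read -r line; do
        echo "[$line]"
    done
}

# Build a small sample file from the data in the question.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
7,STRING TO INTERGER WITHOUT DEFAULT,cast($src_fld as integer) $tgt_fld
10,DEFAULT NO RULE,'$default' $tgt_fld
EOF

echo "for loop:";   loop_with_for "$tmp"     # one bracketed item per word
echo "while loop:"; loop_with_while "$tmp"   # one bracketed item per line
rm -f "$tmp"
```

With this sample the for loop produces six items and the while loop two, which matches the two lines of input.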