CSV: Using awk to get the correct columns and define the delimiter and separator to use?


I have a CSV with 20,000 columns. Here is a subset of it:

"eid","20216-2.0","20216-3.0","20220-2.0","20220-3.0"
"1548197","1","hello","","2020-03-05"
"2101984","2","string","","2020-03-04"
"2986696","3","no","","2020-04-05"
"1543304","3","ge","","2020-02-10"
"3207207","3","no","","2020-03-20"
"2373538","4","yesterday","","2020-03-01"
"4930973","5","today","","2020-03-06"
"6012673","54","tomorrow","","2020-05-05"
"4978627","1","yes","","2020-03-10"
I want to use awk to get two columns:

awk -F "," '{ print $1, $3 }' input.csv  > output.csv
When I check the output.csv file, the result is messed up, as shown below:

"eid","20216-3.0"
"1548197","2020-03-05"
"2","string"
"no",""
"1543304",""
"","2020-03-20"
"yesterday",""
"4930973","2020-03-06"
"tomorrow","2020-05-05"
"4978627","2020-03-10"
Can anyone help me?

awk -v FPAT="([^,]+)|(\"[^\"]+\")" '{ print $1, $6713 }' input.csv  > output.csv

That worked! Thanks, everyone.

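Messed up how? The sample you show certainly does not have 6713 columns, so perhaps that is the problem, but we really can't tell what result you expected or how to fix it. I'm afraid you need to clarify what you are getting now and what you want to get instead. See also the linked help pages, in particular the guidance on providing an example.

Is it possible that some of your columns contain commas inside double quotes?

As noted above, quoted string fields in a CSV file can contain embedded commas, e.g. "Smith,John" or "City,State". To handle these correctly, try one of the linked solutions.

Pandas knows how to handle quoted fields; Awk simply splits on every comma, as you asked it to. Another possible source of erratic behavior is DOS carriage returns, where Python/Pandas is more resilient (and less Unix-y). But as posted, I think this is not reproducible.

You have to escape the double quotes in your FPAT because you are using the wrong type of quotes around it. Use FPAT='([^,]+)|("[^"]+")' instead. You don't need the parens either, but they can improve clarity and do no harm. That FPAT will fail if you have unquoted empty fields, so you would be better off using * rather than + in the first part, and it will fail if you have nested double quotes. All in all, I suggest you use
FPAT='[^,]*|("[^"]*")+'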
Thanks @EdMorton