Using awk to add double quotes to the fields of a comma-separated .CSV file

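Hi, I need to process a large CSV file (20 million rows) and wrap every comma-separated field in double quotes. Each row has 8 comma-separated fields, as shown below: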

'2016-03-12','12393659','134',,'35533605',189348,9798,gmail.com;live_com.com
'2016-03-12','12390103','138',,'35438006',5133,1897,google.com
'2016-03-12','45616164','139',,'01318800',10945593,596633,facebook.com;tumblr.com;t.co
'2016-03-12','45673436','38',,'86441702',4350985,150327,serving-sys.com;chartboost.com;admarvel.com;mydas.mobi;adap.tv;cloudfront.net
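As you can see, the first 3 fields are enclosed in single quotes, the 4th field is empty, the 5th field is enclosed in single quotes, and fields 6 to 8 are separated only by commas. I would like every field to end up in double quotes, including the 4th field even though it is empty.

I got partway there by combining sed and awk: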

sed -e s/\'//g inpu.csv > output.csv                              # eliminate quotes
awk '{gsub(/[^,]+/,"\"&\"")}1' output.csv > output1.csv           # add double quotes
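But the fourth field is not double-quoted, because the pattern [^,]+ only matches non-empty fields, and I need to keep the processing time as low as possible. Is there a way to do all of this in awk alone, with better performance and with the fourth field double-quoted?
Thanks for your help. M.Tave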

Try this awk one-liner:

 awk -F, -v OFS="," -v re="^'?|'?$" -v q='"' \
                  '{for(i=1;i<=NF;i++)if($i)gsub(re,q,$i);else $i=q$i q}7' file
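
For readability, the same idea can be spelled out as a small shell function. This is only a sketch, not the answerer's exact code: the function name quote_fields and the sq variable are made up for illustration, and the loop strips the single quotes and wraps every field unconditionally instead of branching on whether the field is empty.

```shell
# quote_fields: hypothetical helper name, not part of the original answer.
# Reads CSV rows from the given files (or stdin) and double-quotes every field.
quote_fields() {
  awk -F, -v OFS=, -v q='"' -v sq="'" '
  {
    for (i = 1; i <= NF; i++) {
      gsub("^" sq "|" sq "$", "", $i)  # drop a leading and/or trailing single quote
      $i = q $i q                      # wrap the field, empty or not, in double quotes
    }
    print
  }' "$@"
}
```

Usage would be something like quote_fields input.csv > output.csv. Like the one-liner, it assumes no field contains a comma, a double quote, or a newline.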

If your data really is that simple, with no embedded quotes, newlines, or anything else, then all you need is:

$ awk -F"'?,'?" -v OFS='","' '{$1=$1; gsub(/^.|$/,"\"")} 1' file
"2016-03-12","12393659","134","","35533605","189348","9798","gmail.com;live_com.com"
"2016-03-12","12390103","138","","35438006","5133","1897","google.com"
"2016-03-12","45616164","139","","01318800","10945593","596633","facebook.com;tumblr.com;t.co"
"2016-03-12","45673436","38","","86441702","4350985","150327","serving-sys.com;chartboost.com;admarvel.com;mydas.mobi;adap.tv;cloudfront.net"
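
Since that command depends on the data being "that simple", it may be worth checking the file first. The following is a rough sketch under assumptions of my own: the name check_simple is hypothetical, and the conditions (exactly 8 fields, no embedded double quote, a leading single quote, no trailing single quote) are inferred from the sample rows and from the fact that gsub(/^.|$/,...) overwrites the first character of each line.

```shell
# check_simple: hypothetical helper name. Prints "line number: row" for every
# row that breaks the assumptions, and exits non-zero if any row was flagged.
check_simple() {
  awk -F, -v sq="'" '
    # Flag rows that do not have 8 fields, contain a double quote,
    # do not start with a single quote, or end with a single quote.
    NF != 8 || index($0, "\"") || substr($0, 1, 1) != sq || substr($0, length($0)) == sq {
      print FNR ": " $0
      bad = 1
    }
    END { exit bad + 0 }
  ' "$@"
}
```

Running check_simple file && echo OK should print only OK when every row fits the pattern shown in the question.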

Thanks again, this works great, as usual. I will use this more compact code.