Ruby on Rails: parsing malformed CSV lines

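I am parsing the following CSV lines. I need to rescue malformed lines that look like the "Malformed" one below. What regular expression could I use to do this? What do I need to consider?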

body = %(
"Sensitive",2416,159,"Test "Malformed" Failure",2789,111,7-24-11,1800,0600,"R2","12323","",""
"Sensitive",2742,107,"Test",2791,112,7-24-11,1800,0600,"R1","","",""
"Sensitive",2700,135,"Test",2792,113,7-24-11,1800,0600,"R1","12110","","")

rows = []
body.each_line do |line|
  begin
    rows << FasterCSV.parse_line(line)
  rescue FasterCSV::MalformedCSVError => e
    rows << line if rescue_from_malformed_line(line)
  rescue => e
    Rails.logger.error(e.to_s)
    Rails.logger.info(line)
  end
end

I'm not sure how malformed your data is, but here is one way to handle that line.

> puts line
"Sensitive",2416,159,"Test "Malformed" Failure",2789,111,7-24-11,1800,0600,"R2","12323","",""
>
> puts line.scan /[\d.-]+|(?:"[^"]*"[^",]*)+/
"Sensitive"
2416
159
"Test "Malformed" Failure"
2789
111
7-24-11
1800
0600
"R2"
"12323"
""
""

Note: tested on Ruby 1.9.2p290.
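As a rough sketch, the scan pattern above could back a fallback helper for lines the parser rejects. The helper name `fields_from_malformed_line` and the constant `FIELD_PATTERN` are hypothetical, and the sketch assumes unquoted fields contain only digits, dots, and hyphens, as in the sample data; an unquoted text field would not be matched.

```ruby
# Pattern from the answer above: either a run of digits, dots, and hyphens,
# or a quoted field that may contain stray inner quotes.
FIELD_PATTERN = /[\d.-]+|(?:"[^"]*"[^",]*)+/

# Hypothetical helper (not part of FasterCSV): split a malformed line into
# fields and strip only the outermost pair of quotes from quoted fields.
def fields_from_malformed_line(line)
  # chomp first, otherwise a trailing newline would be absorbed into the
  # last quoted field by the [^",]* part of the pattern.
  line.chomp.scan(FIELD_PATTERN).map do |field|
    field.start_with?('"') && field.end_with?('"') ? field[1..-2] : field
  end
end

line = %("Sensitive",2416,159,"Test "Malformed" Failure",2789,111,7-24-11,1800,0600,"R2","12323","","")
fields = fields_from_malformed_line(line)
puts fields[3]      # Test "Malformed" Failure
puts fields.length  # 13
```

This keeps the inner double quotes in the resulting array, and all values come back as strings.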

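Before passing the line to the parser, you could use a regular expression to replace the nested double quotes with single quotes.

Something like: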

.gsub(/(?<!^|,)"(?!,|$)/,"'")
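A minimal sketch of how that substitution might be wired in front of the parser. It assumes Ruby 1.9 or later, where the standard-library `CSV` class is the former FasterCSV; the helper name `parse_with_inner_quotes_replaced` and the constant `INNER_QUOTE` are made up for illustration.

```ruby
require "csv" # assumption: Ruby 1.9+, where stdlib CSV is the former FasterCSV

# A double quote counts as "inner" when it neither opens a field (preceded by
# line start or a comma) nor closes one (followed by a comma or line end).
INNER_QUOTE = /(?<!^|,)"(?!,|$)/

# Hypothetical helper: replace inner quotes with single quotes, then parse.
def parse_with_inner_quotes_replaced(line)
  CSV.parse_line(line.gsub(INNER_QUOTE, "'"))
end

line = %("Sensitive",2416,159,"Test "Malformed" Failure",2789,111,7-24-11,1800,0600,"R2","12323","","")
row = parse_with_inner_quotes_replaced(line)
puts row[3]  # Test 'Malformed' Failure
```

Two caveats: a stray quote that sits directly next to a comma is not caught by this pattern, and correctly escaped quotes (`""` inside a field) would also be rewritten to single quotes.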

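What do you want to do with a line like this? I'd suggest splitting at "," and removing the quotes, then rebuilding proper CSV and continuing with the next line.

I would like to have the double quotes included in the array of arrays.

Could the wrongly quoted text contain a comma?

Why not preprocess the CSV file offline before feeding it to your app, instead of fixing it every time you have to read it?

Wouldn't it be more expensive to preprocess and then pass it to FasterCSV?

Hey, this will split the dates as well; try using line.split ","

@Devin, I didn't realize there were dates. Fixed, thanks a lot.

Np, just happened to notice it.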