Ruby on Rails / Ruby: CSV parser trips over double quotes in my data

Tags: ruby-on-rails, ruby, csv, rake-task

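I have a rake task that runs daily. It downloads a CSV that is automatically sent to Dropbox every day, parses it, and saves the rows to the database. I have no control over how the data is entered into the program that generates the CSV report, so I cannot prevent double quotes from appearing in some of the data. Is there a way, inside the rake task, to strip the double quotes or replace them with single quotes, or to tell the parser somehow so that it does not raise this error?

Rake task code: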

require 'net/http'
require 'csv'
require 'open-uri'

namespace :fp_import do
  desc "download abc_relations from dropbox, save as csv, create or update record in db"
  task :fp => :environment do
    data = URI.parse("<<file's dropbox link>>").read

    File.open(Rails.root.join('lib/assets', 'fp_relation.csv'), 'w') do |file|
      file.write(data)
    end

    file = Rails.root.join('lib/assets', 'fp_relation.csv')

    CSV.foreach(file) do |row|
      div, fg_style, fg_color, factory, part_style, part_color, comp_code, vendor, design_no, comp_type = row
      fg_sku = fg_style + "-" + fg_color
      part_sku = part_style + "-" + part_color

      relation = FgPart.where('part_sku LIKE ? AND fg_sku LIKE ?', "%#{part_sku}%", "%#{fg_sku}%").exists?
      if relation == false
        FgPart.create(fg_style: fg_style, fg_color: fg_color, fg_sku: fg_sku, factory: factory, part_style: part_style, part_color: part_color, part_sku: part_sku, comp_code: comp_code, comp_type: comp_type, design_no: design_no)
      end
    end
  end
end

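The CSV is invalid; the quotes should have been escaped. If you need no other special handling, you can read the file line by line, split each line on commas (,), and remove the leading and trailing double quote (") from each value.

Updated version: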

File.foreach(File.join(__dir__, 'input.almost_csv')) do |line|
  # chomp first, otherwise the newline on the last value is removed instead of its closing quote
  values = line.chomp.split(',')
  values = values.map do |value|
    value[1...-1] # Remove leading and trailing double-quote
  end

  div, fg_style, fg_color, factory, part_style, part_color, comp_code, vendor, design_no, comp_type = values
  fg_sku = fg_style + "-" + fg_color
  part_sku = part_style + "-" + part_color

  if !FgPart.where('part_sku LIKE ? AND fg_sku LIKE ?', "%#{part_sku}%", "%#{fg_sku}%").exists?
    FgPart.create(fg_style: fg_style, fg_color: fg_color, fg_sku: fg_sku, factory: factory, part_style: part_style, part_color: part_color, part_sku: part_sku, comp_code: comp_code, comp_type: comp_type, design_no: design_no)
  end

end
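Please note:

  • You don't need @ variables; locally scoped variables are sufficient.
  • If you also want to remove the quotes inside the strings, you can manipulate the value in the map block.
  • This only works if the values contain no column separator (,).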
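
To make the last caveat concrete, here is a minimal sketch of the split-and-strip approach applied to in-memory strings. The helper name `parse_loose_line` and the sample rows are invented for illustration; they are not taken from the asker's data.

```ruby
# Hypothetical helper: split a line on commas and strip the outer quote
# characters from each value, as described in the answer above.
def parse_loose_line(line)
  line.chomp.split(',').map { |value| value[1...-1] }
end

# An embedded double quote survives, because only the outer quotes are dropped.
ok_row = parse_loose_line(%("A1","12" pipe","RED"\n))

# A comma inside a value is treated as a column separator,
# so this row falls apart into four pieces instead of three.
broken_row = parse_loose_line(%("A1","pipe, large","RED"\n))

p ok_row
p broken_row.length
```

If the real data can contain commas inside values, this approach is not enough on its own and the rows would need a different splitting rule.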

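The source CSV is malformed; the quotes should have been escaped first.

I would edit the file before parsing it with CSV: remove the quotes around the commas and replace the remaining double quotes with single quotes. If you don't want to edit the original file, you can create a new one.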

# file is expected to be a bare file name (a String), since "fixed_" is prepended to it
def fix_csv(file)
  fixed_name = "fixed_" + file
  File.open(fixed_name, 'w') do |out|
    File.readlines(file).each do |line|
      line = line.chomp.delete_prefix('"').delete_suffix('"') # remove beginning and end quotes
      line.gsub!(/","/, ",") # remove all quotes around the commas
      line.gsub!(/"/, "'")   # replace remaining double quotes with single quotes
      out << line + "\n"     # add the line plus a newline to the output
    end
  end

  fixed_name
end
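
As an illustration of what this clean-up does to a single row, the sketch below applies the same three steps to a string in memory. The helper name `fix_line` and the sample row are assumptions made for the example, not part of the original answer.

```ruby
# Hypothetical helper: the per-line transformation from fix_csv, applied to a string.
def fix_line(line)
  fixed = line.chomp.delete_prefix('"').delete_suffix('"') # drop newline and outer quotes
  fixed = fixed.gsub('","', ',')                           # drop the quotes around each separator
  fixed.gsub('"', "'")                                     # remaining double quotes become single quotes
end

puts fix_line(%("A1","12" pipe","RED"\n))
```

Note that once the outer quotes are gone, a value that itself contains a comma can no longer be told apart from two separate columns, so this sketch assumes commas only ever appear as separators.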
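Comments:

@Flip I'm not sure your correction is right. @Tatiane: does your CSV data really contain the "**" parts, or did you use them to mark the critical code? If all of the data looks like this excerpt, you could remove every " before handing it to CSV.

@knut: I see, you may be right. I'll undo that part. Thanks for pointing it out.

@knut You are right, I used "**" to highlight the critical code; it is not actually part of the data.

You can't guarantee anything about a malformed CSV line, so you are better off rejecting the bad lines and cleaning them up later.

Why fix it and then parse it again? Once it is fixed, it is already parsed and ready to import.

@pascalbetz In case you don't want to modify the original CSV.

I see. Unless you need the cleaned file for another process, you can leave the file as it is and import the rows into AR right after cleaning them. Then there is no need to read, clean, write, read, and import.

@pascalbetz Yes, thanks, I know. The point is that this way the existing code does not need to be modified, and the clean-up logic stays separate from the processing logic.

Hi @agush, I would like to keep just one correctly formatted CSV rather than create two. I'm new to the gsub! method. If I only use line.gsub!(/"/,"'"), how do I save the changes to the existing file?

Thanks pascal! Sorry, I'm not sure what should go into my code. Could you explain a bit more which parts belong in the code I posted?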
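
One of the comments suggests rejecting malformed lines and cleaning them up later. A minimal sketch of that idea follows, assuming every record sits on a single line (no newlines inside quoted fields). The helper name `partition_csv_lines` and the sample rows are hypothetical; `CSV.parse_line` and `CSV::MalformedCSVError` come from Ruby's standard csv library.

```ruby
require 'csv'

# Hypothetical helper: parse each line on its own and separate the rows
# that parse from the lines the CSV library rejects.
def partition_csv_lines(lines)
  good_rows = []
  bad_lines = []
  lines.each do |line|
    begin
      good_rows << CSV.parse_line(line)
    rescue CSV::MalformedCSVError
      bad_lines << line # keep the raw line so it can be reviewed or cleaned later
    end
  end
  [good_rows, bad_lines]
end

sample = [
  %("A1","pipe","RED"\n),
  %("A2","12" pipe","BLUE"\n) # unescaped quote inside a quoted field
]
good_rows, bad_lines = partition_csv_lines(sample)
p good_rows
p bad_lines
```

The good rows could be imported right away, while the rejected lines are logged or written to a separate file for manual review.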
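To keep a single file instead of creating a second one, write the cleaned lines to a temporary file and then move it over the original:

require 'tempfile'
require 'fileutils'

def modify_csv(file)
  temp_file = Tempfile.new('temp')
  begin
    File.readlines(file).each do |line|
      line = line.chomp.delete_prefix('"').delete_suffix('"') # remove the newline and the outer quotes
      line.gsub!(/","/, ",") # remove the quotes around each comma
      line.gsub!(/"/, "'")   # replace remaining double quotes with single quotes
      temp_file << line + "\n"
    end
    temp_file.close
    FileUtils.mv(temp_file.path, file) # overwrite the original file
  ensure
    temp_file.close
    temp_file.unlink
  end
end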