Ruby on Rails / Ruby: CSV parser trips over double quotes in my data
I'm working on a rake task that runs every day: it downloads a CSV that is automatically sent to Dropbox daily, parses it, and saves the rows to the database. I have no control over how data is entered into the program that generates the CSV report, so I can't avoid double quotes in some of the data. I'd like to know whether there is a way to strip them or replace them with single quotes inside the rake task, or to tell the parser about them somehow so that it doesn't throw this error.

Rake task code:
require 'net/http'
require 'csv'
require 'open-uri'

namespace :fp_import do
  desc "download abc_relations from dropbox, save as csv, create or update record in db"
  task :fp => :environment do
    data = URI.parse("<<file's dropbox link>>").read
    File.open(Rails.root.join('lib/assets', 'fp_relation.csv'), 'w') do |file|
      file.write(data)
    end
    file = Rails.root.join('lib/assets', 'fp_relation.csv')
    CSV.foreach(file) do |row|
      div, fg_style, fg_color, factory, part_style, part_color, comp_code, vendor, design_no, comp_type = row
      fg_sku = fg_style + "-" + fg_color
      part_sku = part_style + "-" + part_color
      relation = FgPart.where('part_sku LIKE ? AND fg_sku LIKE?', "%#{part_sku}%", "%#{fg_sku}%").exists?
      if relation == false
        FgPart.create(fg_style: fg_style, fg_color: fg_color, fg_sku: fg_sku, factory: factory, part_style: part_style, part_color: part_color, part_sku: part_sku, comp_code: comp_code, comp_type: comp_type, design_no: design_no)
      end
    end
  end
end
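The CSV is invalid: quotes inside a field should be escaped. If you don't need any other special handling, you can read the file line by line, split each line on commas, and strip the leading and trailing double quote from each value:

Updated version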
# Read the file line by line instead of handing it to the CSV parser.
File.foreach(File.join(__dir__, 'input.almost_csv')) do |line|
  # chomp first; otherwise the last value loses its newline but keeps its closing quote
  values = line.chomp.split(',').map do |value|
    value[1...-1] # Remove leading and trailing double-quote
  end
  div, fg_style, fg_color, factory, part_style, part_color, comp_code, vendor, design_no, comp_type = values
  fg_sku = fg_style + "-" + fg_color
  part_sku = part_style + "-" + part_color
  unless FgPart.where('part_sku LIKE ? AND fg_sku LIKE ?', "%#{part_sku}%", "%#{fg_sku}%").exists?
    FgPart.create(fg_style: fg_style, fg_color: fg_color, fg_sku: fg_sku, factory: factory, part_style: part_style, part_color: part_color, part_sku: part_sku, comp_code: comp_code, comp_type: comp_type, design_no: design_no)
  end
end
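Note:
- You don't need @; locally scoped variables are enough.
- If you also want the quotes inside the strings removed, you can manipulate the value in the map block.
- This only works if the values do not contain the column separator (,).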
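The second and third notes can be sketched as a small helper. This is only an illustration, not part of the original answer: the name parse_quoted_line is hypothetical, and the code assumes every field is wrapped in double quotes and no value contains a comma.

```ruby
# Hypothetical helper: split one line of the "almost CSV" file into values.
# Assumes every field is double-quoted and no value contains a comma.
def parse_quoted_line(line)
  line.chomp.split(',').map do |value|
    # Drop the outer double quotes, then turn inner double quotes into single quotes.
    value[1...-1].tr('"', "'")
  end
end

p parse_quoted_line(%("a","b "x" c","d"\n))
# => ["a", "b 'x' c", "d"]
```

A value with an embedded comma, such as "a, b", would be cut into two pieces by split(','), which is the limitation described in the last note.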
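def fix_csv(file)
  # Build the output path next to the input; this also works when file is a Pathname
  fixed = File.join(File.dirname(file), "fixed_" + File.basename(file))
  File.open(fixed, 'w') do |out|
    File.foreach(file) do |line|
      next if line.strip.empty?     # skip blank lines
      line = line.chomp[1...-1]     # remove the beginning and end quotes
      line.gsub!(/","/, ",")        # remove all quotes between commas
      line.gsub!(/"/, "'")          # replace remaining double quotes with single quotes
      out << line + "\n"            # add the line plus a newline to the output
    end
  end
  fixed
end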
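As a variation on the same idea, the line can be split on the three-character sequence "," rather than rewritten and parsed again, which keeps commas inside values intact. This is a sketch under assumptions, not the answerer's method: split_quoted_line is a hypothetical name, every field is assumed to be double-quoted, and no value may itself contain the sequence ",".

```ruby
# Hypothetical alternative: split on the sequence "," instead of on bare commas.
# Assumes every field is double-quoted and no value contains the sequence ",".
def split_quoted_line(line)
  inner = line.chomp[1...-1]   # drop the first and last quote of the line
  return [] if inner.nil?      # blank line: nothing to split
  # The -1 limit keeps trailing empty fields instead of discarding them.
  inner.split('","', -1).map { |value| value.tr('"', "'") }
end

p split_quoted_line(%("a","b "x" c","d, e"\n))
# => ["a", "b 'x' c", "d, e"]
```

The returned array can be destructured directly, so no intermediate file is needed when the cleaned rows are only going to be imported.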
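@Flip I'm not sure your correction is right. @Tatiane: does your CSV data really contain the "**" parts, or did you use them to mark the critical code? If all your data looks like this excerpt, you can simply remove every " before you use CSV.

@knut: I see, you may be right. I'll undo that part. Thanks for pointing it out.

@knut You're right, I used "**" to highlight the critical code; it isn't actually part of the data.

You can't make any guarantees about a malformed CSV line, so you are better off rejecting the bad lines and cleaning them up later.

Why fix it and then parse it again? Once it is fixed, it has already been parsed and is ready to import.

@pascalbetz In case you don't want to modify the original CSV.

I see. Unless you need the cleaned file for another process, you can leave the file as it is and import into AR right after cleaning. Then there is no need to read, clean, write, read, and import.

@pascalbetz Yes, thanks, I know. The thing is, this way the existing code doesn't need to be modified, and the cleaning logic stays separate from the processing logic.

Hi @agush, I'd like to keep just one correctly formatted CSV instead of creating two. I'm new to the gsub! method. If I only use line.gsub!(/"/, "'"), how do I save the change to the existing file?

Thanks pascal! Sorry, I'm not sure what I should include in my code. Could you explain a bit more about what should go into the code I posted?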
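require 'tempfile'
require 'fileutils'

def modify_csv(file)
  temp_file = Tempfile.new('temp')
  begin
    File.foreach(file) do |line|
      next if line.strip.empty?   # skip blank lines
      line = line.chomp[1...-1]   # remove the beginning and end quotes
      line.gsub!(/","/, ",")      # remove all quotes between commas
      line.gsub!(/"/, "'")        # replace remaining double quotes with single quotes
      temp_file << line + "\n"
    end
    temp_file.close
    FileUtils.mv(temp_file.path, file) # replace the original file with the cleaned copy
  ensure
    temp_file.close
    temp_file.unlink
  end
end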
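One of the comments suggests rejecting malformed lines and cleaning them up later rather than repairing everything up front. A minimal sketch of that idea follows. The helper name partition_csv_lines is hypothetical, and the code assumes no field spans more than one line, so each line can be parsed on its own.

```ruby
require 'csv'

# Hypothetical helper: parse each line separately and set aside the lines
# that the strict CSV parser refuses. Assumes no field spans multiple lines.
def partition_csv_lines(lines)
  rows = []
  rejected = []
  lines.each do |line|
    begin
      row = CSV.parse_line(line)
      rows << row unless row.nil? || row.empty?   # skip blank lines
    rescue CSV::MalformedCSVError
      rejected << line                            # keep the raw line for later cleanup
    end
  end
  [rows, rejected]
end

rows, rejected = partition_csv_lines([%("a","b"\n), %("a","b "x" c"\n)])
p rows      # well-formed rows, ready to import
p rejected  # raw lines that still need cleaning
```

In the rake task, the lines could come from File.foreach(file), with the rejected lines written to a log or a separate file for review.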