Ruby: How to grab certain rows of a CSV file and save each one in its own file


I have a CSV file with many rows:

Username,Year,Month,Match (0-60%),Match (60-65%),Match (65-70%),Match (70-75%),Match (75-80%),Match (80-85%),Match (85-90%),Match (90-95%),Match (95-100%),Match (100%),New_total,Edit_total,Review_total
Joe,2020,3,52,0,5,2,3,2,0,5,0,0,69,142,337
Engineering,2020,3,6469,0,0,0,0,0,0,0,0,0,6469,82,0
User_TR1_ES_ES,2020,3,112,3,0,0,0,14,10,0,0,2,141,3,0
User_TR1_FR_FR,2020,3,66,3,0,0,0,0,0,0,0,2,71,82,0
User_TR1_JA_JP,2020,3,35,49,56,114,0,21,22,66,62,0,425,630,0
User_TR1_KO_KR,2020,3,60,0,0,0,0,0,10,0,0,0,70,0,0
User_TR1_NL_NL,2020,3,61,2,41,59,15,31,11,13,2,0,235,0,0
User_TR1_PL_PL,2020,3,134,17,41,57,15,31,21,13,0,0,329,15,0
User_TR1_PT_BR,2020,3,37,0,2,0,0,12,0,0,0,22,73,53,0
Marie,2020,3,37,15,8,2,0,2,0,5,0,0,69,40,0
Charly,2020,3,224,0,0,0,0,0,0,0,0,0,224,28,0
Joseph,2020,3,56,0,0,0,0,0,0,0,0,0,56,0,0

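I only want to create separate CSV files for the rows whose first column contains a User_XXX_XX_XX string. All other rows should be ignored. Each generated file should be named after the first element of its row.

For example: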

User_TR1_ES_ES.csv
User_TR1_PT_BR.csv

So far, my code is:

#!/usr/bin/env ruby

require 'csv'
require 'fileutils'

this_dir = File.expand_path(File.dirname(__FILE__))
original_dir = File.join(this_dir, '_Original')

#working with the .CSV file, there should be only one, and always be .CSV....
puts "Finding a .CSV file..."
full_path = Dir.glob('**/*.csv')
full_path.each do |csv|
  puts "CSV file found: #{File.basename(csv)}"
end

new_path = File.join(this_dir, full_path[0])

#I start reading the CSV file found in the folder
parsed_data = CSV.read(new_path)

#I grab the header in a separate variable
header = parsed_data.shift

#I created a constant to look for lines where the first elements meets the desired string, not sure about it...
USER_NAME = 'User' + '_' + 'TR' + 1..10 + ([a-z].upcase * 2) + '_' + ([a-z].upcase * 2)

#I want loop through each line and look for the those that includes the constant in the first element
CSV.foreach(new_path) do |row|
  row.first[0].include?(USER_NAME)

  #create inviduals files in a same location
  new_dir = File.join(this_dir, '_result')
  FileUtils.mkdir(new_dir)
  newfiles = File.join(new_dir, CONSTANT, '.csv')

  CSV.open(newfiles, 'w+') do |csv|
    csv << header
    csv << captured_row
  end

Personally, I wouldn't even bother treating the file as CSV, since that doesn't matter when you can easily grab the User… lines:
header = nil

DATA.each_line { |l|
  if header.nil?
    header = l
    next
  end

  fn = l[/^User_TR1_[^,]+/]
  next unless fn

  File.write(fn + '.csv', header + l)
}

__END__
Username,Year,Month,Match (0-60%),Match (60-65%),Match (65-70%),Match (70-75%),Match (75-80%),Match (80-85%),Match (85-90%),Match (90-95%),Match (95-100%),Match (100%),New_total,Edit_total,Review_total
Joe,2020,3,52,0,5,2,3,2,0,5,0,0,69,142,337
Engineering,2020,3,6469,0,0,0,0,0,0,0,0,0,6469,82,0
User_TR1_ES_ES,2020,3,112,3,0,0,0,14,10,0,0,2,141,3,0
User_TR1_FR_FR,2020,3,66,3,0,0,0,0,0,0,0,2,71,82,0
User_TR1_JA_JP,2020,3,35,49,56,114,0,21,22,66,62,0,425,630,0
User_TR1_KO_KR,2020,3,60,0,0,0,0,0,10,0,0,0,70,0,0
User_TR1_NL_NL,2020,3,61,2,41,59,15,31,11,13,2,0,235,0,0
User_TR1_PL_PL,2020,3,134,17,41,57,15,31,21,13,0,0,329,15,0
User_TR1_PT_BR,2020,3,37,0,2,0,0,12,0,0,0,22,73,53,0
Marie,2020,3,37,15,8,2,0,2,0,5,0,0,69,40,0
Charly,2020,3,224,0,0,0,0,0,0,0,0,0,224,28,0
Joseph,2020,3,56,0,0,0,0,0,0,0,0,0,56,0,0

It creates:
-rw-r--r--@  1 TTM  staff   256B May  3 17:05 User_TR1_ES_ES.csv
-rw-r--r--@  1 TTM  staff   253B May  3 17:05 User_TR1_FR_FR.csv
-rw-r--r--@  1 TTM  staff   263B May  3 17:05 User_TR1_JA_JP.csv
-rw-r--r--@  1 TTM  staff   253B May  3 17:05 User_TR1_KO_KR.csv
-rw-r--r--@  1 TTM  staff   259B May  3 17:05 User_TR1_NL_NL.csv
-rw-r--r--@  1 TTM  staff   262B May  3 17:05 User_TR1_PL_PL.csv
-rw-r--r--@  1 TTM  staff   255B May  3 17:05 User_TR1_PT_BR.csv

Each file looks like this:
cat User_TR1_ES_ES.csv

Username,Year,Month,Match (0-60%),Match (60-65%),Match (65-70%),Match (70-75%),Match (75-80%),Match (80-85%),Match (85-90%),Match (90-95%),Match (95-100%),Match (100%),New_total,Edit_total,Review_total
User_TR1_ES_ES,2020,3,112,3,0,0,0,14,10,0,0,2,141,3,0

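In this example I'm taking advantage of Ruby's ability to store data after the __END__ marker in the code. DATA is created by Ruby as a file handle to the content that follows __END__, so pay no attention to that man behind the curtain.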

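Simply use File.foreach to read the input file, grab the first line as the header, then loop back and read the next line. From that point on, just look for lines that match the /^User_TR1_[^,]+/ pattern.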

Here's what is returned:

'Engineering,2020,3,6469,0,0,0,0,0,0,0,0,0,6469,82,0'[/^User_TR1_[^,]+/] # => nil
'User_TR1_ES_ES,2020,3,112,3,0,0,0,14,10,0,0,2,141,3,0'[/^User_TR1_[^,]+/] # => "User_TR1_ES_ES"

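So, if the line isn't a User… line, nil is returned and the code skips to the next line. If the line is a User… line, the matched string is returned and the code writes the header and the line to a file with File.write.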
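
The File.foreach variant described above could be sketched roughly as follows. The method name split_user_rows, the output_dir parameter, and the shortened sample data are made up for illustration; only the pattern and the header-plus-row logic come from the code above.

```ruby
require 'tmpdir'

# Matches a leading User_TR1_... field, up to (not including) the first comma.
USER_PATTERN = /^User_TR1_[^,]+/

# Reads input_path line by line and writes one file per User_TR1_ row into
# output_dir. Each file holds the header line plus that single row.
# Returns the list of paths written.
# (split_user_rows and output_dir are hypothetical names for this sketch.)
def split_user_rows(input_path, output_dir)
  header = nil
  written = []

  File.foreach(input_path) do |line|
    if header.nil?
      header = line # the first line is the header
      next
    end

    name = line[USER_PATTERN]
    next unless name # not a User_TR1_ row, so skip it

    path = File.join(output_dir, "#{name}.csv")
    File.write(path, header + line)
    written << path
  end

  written
end

# Demo in a throwaway directory, using a shortened, made-up sample.
Dir.mktmpdir do |dir|
  input = File.join(dir, 'input.csv')
  File.write(input, <<~SAMPLE)
    Username,Year,Month,New_total
    Joe,2020,3,69
    User_TR1_ES_ES,2020,3,141
    User_TR1_PT_BR,2020,3,73
  SAMPLE

  puts split_user_rows(input, dir).map { |f| File.basename(f) }
end
```

File.foreach reads one line at a time, so the whole input never has to sit in memory.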

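Also, if it were my system, I'd downcase the file names when creating them. As a sysadmin I learned to avoid uppercase or mixed case in file names, because they make it too easy to mistype a name.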

'User_TR1_ES_ES'.downcase + '.csv' # => "user_tr1_es_es.csv"

Also, see my comment above about using the Ruby class. It's easy to implement and very efficient.
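
If you'd rather stay with the CSV library, as the question's code does, something along these lines might work. Treat it as a sketch under assumptions: the USER_NAME pattern is my guess at what User_XXX_XX_XX is meant to cover, export_user_rows is a made-up name, and the file names are downcased as suggested above. Note that File.join puts a separator between each of its arguments, so the extension is added with string interpolation instead.

```ruby
require 'csv'
require 'fileutils'
require 'tmpdir'

# Assumed pattern: "User_", an alphanumeric group, then two two-letter codes.
USER_NAME = /\AUser_[A-Z0-9]+_[A-Z]{2}_[A-Z]{2}\z/

# Writes one CSV file per matching row into result_dir and returns the paths.
# (export_user_rows is a hypothetical name for this sketch.)
def export_user_rows(input_path, result_dir)
  FileUtils.mkdir_p(result_dir) # no error if the directory already exists
  rows = CSV.read(input_path)
  header = rows.shift
  written = []

  rows.each do |row|
    name = row.first
    next unless name&.match?(USER_NAME) # ignore every non-user row

    path = File.join(result_dir, "#{name.downcase}.csv")
    CSV.open(path, 'w') do |csv|
      csv << header
      csv << row
    end
    written << path
  end

  written
end

# Demo in a throwaway directory, using a shortened, made-up sample.
Dir.mktmpdir do |dir|
  input = File.join(dir, 'input.csv')
  File.write(input, <<~SAMPLE)
    Username,Year,Month,New_total
    Joe,2020,3,69
    User_TR1_ES_ES,2020,3,141
    User_TR1_PT_BR,2020,3,73
  SAMPLE

  export_user_rows(input, File.join(dir, '_result'))
  puts Dir.children(File.join(dir, '_result')).sort
end
```

Going through CSV costs a little more than plain line matching, but it keeps working if a field ever contains a quoted comma.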
