Ruby: how to group items by a specific column in a tab-delimited file
I have the following records in a tab-delimited text file:
sku title Product Type
19686940 This is test Title1 toys
19686941 This is test Title2 toys
19686942 This is test Title3 toys
20519300 This is test Title1 toys2
20519301 This is test Title2 toys2
20580987 This is test Title1 toys3
20580988 This is test Title2 toys3
20582176 This is test Title1 toys4
How can I group the items by product type and find all the unique words in the titles?
Output format:
Product Type Unique_words
------------ ------------
toys This is test Title1 Title2 Title3
toys2 This is test Title1 Title2
toys3 This is test Title1 Title2
toys4 This is test Title1
Update
So far I have written code that reads the file and stores the records in an array:
class Product
  attr_reader :sku, :title, :productType

  def initialize(sku,title,productType)
    @sku = sku
    @title = title
    @productType = productType
  end

  def sku
    @sku
  end

  def title
    @title
  end

  def productType
    @productType
  end
end

class FileReader
  def ReadFile(m_FilePath)
    array = Array.new
    lines = IO.readlines(m_FilePath)
    lines.each_with_index do |line, i|
      current_row = line.split("\t")
      product = Product.new(current_row[0],current_row[1],current_row[2])
      array.push product
    end
  end
end

filereader_method = FileReader.new.method("ReadFile")
Reading = filereader_method.to_proc
puts Reading.call("Input.txt")
To get the grouping, you can use:
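A minimal sketch of one way to group, assuming `Enumerable#group_by` is an acceptable approach here. `ProductRecord` is a hypothetical stand-in for the question's `Product` class, and the rows are copied from the sample data:

```ruby
# Hypothetical stand-in for the Product class from the question.
ProductRecord = Struct.new(:sku, :title, :product_type)

products = [
  ProductRecord.new("19686940", "This is test Title1", "toys"),
  ProductRecord.new("19686941", "This is test Title2", "toys"),
  ProductRecord.new("20519300", "This is test Title1", "toys2")
]

# group_by returns a Hash: product type => array of records with that type.
grouped = products.group_by(&:product_type)

grouped.each do |type, items|
  puts "#{type}: #{items.map(&:sku).join(', ')}"
end
```

The hash keeps the types in the order they first appear, so the output follows the order of the input file.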
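The beauty of Ruby is that you have plenty of options. You could also look at the CSV library and OpenStruct, since each record here is just a data object: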
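require 'csv'
require 'ostruct'

# Returns a Hash mapping each product type to an array of OpenStruct records.
def products_by_type(file_path)
  csv_opts = { col_sep: "\t",
               headers: true,
               header_converters: [:downcase, :symbol] }
  # ** passes the hash as keyword options (required on Ruby 3 and later)
  CSV.foreach(file_path, **csv_opts)
     .map { |row| OpenStruct.new(row.to_hash) }
     .group_by { |product| product.product_type }
end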
Alternatively, use the idiom of initializing an object from hash keys, which removes the call to #to_hash on the row above:
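class Product
  attr_accessor :sku, :title, :product_type

  # data is any collection of key/value pairs, such as a CSV::Row or a Hash
  def initialize(data)
    # call the matching setter (sku=, title=, product_type=) for each key
    data.each { |key, value| public_send("#{key}=", value) }
  end
end

def products_by_type(file_path)
  csv_opts = {
    # ...
  }
  CSV.foreach(file_path, **csv_opts)
     .map { |row| Product.new(row) }
     .group_by { |product| product.product_type }
end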
Then, working from that hash, format the output as needed:
# Takes an array of products and returns the distinct words in their titles.
def unique_title_words(products)
  products.flat_map { |product| product.title.scan(/\w+/) }
          .uniq
end

puts "Product Type\tUnique Words"
products_by_type("./file.txt").each do |type, products|
  puts "#{type}\t#{unique_title_words(products).join(' ')}"
end
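To try the whole pipeline without a file on disk, here is a self-contained sketch that parses the sample rows from the question out of a string. `SAMPLE_TSV` and `unique_words_by_type` are names invented for this illustration, and splitting titles on whitespace is an assumption about what counts as a word:

```ruby
require 'csv'

# Sample rows from the question, inlined so the sketch runs without a file.
SAMPLE_TSV = <<~TSV
  sku\ttitle\tProduct Type
  19686940\tThis is test Title1\ttoys
  19686941\tThis is test Title2\ttoys
  19686942\tThis is test Title3\ttoys
  20519300\tThis is test Title1\ttoys2
  20519301\tThis is test Title2\ttoys2
  20580987\tThis is test Title1\ttoys3
  20580988\tThis is test Title2\ttoys3
  20582176\tThis is test Title1\ttoys4
TSV

# Returns a Hash: product type => array of unique title words, in input order.
def unique_words_by_type(tsv_text)
  # The :symbol converter turns the "Product Type" header into :product_type.
  rows = CSV.parse(tsv_text, col_sep: "\t", headers: true,
                             header_converters: :symbol)
  rows.group_by { |row| row[:product_type] }
      .transform_values { |group| group.flat_map { |row| row[:title].split }.uniq }
end

puts "Product Type\tUnique_words"
unique_words_by_type(SAMPLE_TSV).each do |type, words|
  puts "#{type}\t#{words.join(' ')}"
end
```

Working on `CSV::Row` objects directly skips the intermediate data object; swap in `OpenStruct` or a `Product` class if the records are needed elsewhere.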
Could you also give some sample output? Show us what you have tried so far and what isn't working.
Please see the updated question.