Ruby on Rails: how to group_by this array of hashes


I have read CSV-formatted data from a file into the following array:

arr = [
["company", "location", "region", "service", "price", "duration", "disabled"],
["Google", "Berlin", "EU", "Design with HTML/CSS", "120", "30", "false"],
["Google", "San Francisco", "US", "Design with HTML/CSS", "120", "30", "false"],
["Google", "San Francisco", "US", "Restful API design", "1500", "120", "false"],
["IBM", "San Francisco", "US", "Design with HTML/CSS", "120", "30", "true"],
["Google<script>alert('hi')<script>", "Berlin", "EU", "Practical TDD", "300", "60", "false"],
["Œoogle", "San Francisco", "US", "Restful API design", "1500", "120", "false"],
["Apple", "Berlin", "EU", "Practical TDD", "300", "60", "true"],
["Apple", "London", "EU", "Advanced AngularJS", "1200", "180", "false"],
["Apple", "New York", "US", "Restful API design", "1500", "120", "false"]
]
Perhaps the hash mentioned below could be used (I am not sure; please suggest a good design if possible).

My attempt is:

companies = []
CSV.foreach(csv_file, headers: true) do |row|
  company = {}
  company[:name]   = row['company']
  company[:regions_attributes] = {}
  company[:regions_attributes][:name] = row['region']
  company[:regions_attributes][:branches_attributes] = {}
  company[:regions_attributes][:branches_attributes][:name] = row['location']
  company[:services_attributes] = {}
  company[:services_attributes][:name] = row['service']
  company[:services_attributes][:price] = row['price']
  company[:services_attributes][:duration] = row['duration']
  company[:services_attributes][:disabled] = row['disabled']
  companies << company
end

companies.uniq! { |c| c.values }
companies = companies.group_by { |c| c[:name] }
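Consider changing the structure of the hash and building it with the code below. The file 'tmp.csv' contains roughly the first 20 lines of the CSV file that the OP linked to. I have included its contents at the end.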

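require 'csv'

# Group rows by company, then region, then location, and keep the
# unique service names at the innermost level.
CSV.read('tmp.csv', headers: true).group_by { |row| row["company"] }.
    transform_values do |by_company|
      by_company.group_by { |row| row["region"] }.
           transform_values do |by_region|
             by_region.group_by { |row| row["location"] }.
                  transform_values do |by_location|
                    by_location.map { |row| row["service"] }.uniq
                  end
           end
    end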

If this hash format is not suitable (but its content is what is needed), it can easily be converted to another format.

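See the documentation for the methods used above, in particular CSV.read, Enumerable#group_by and Hash#transform_values.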
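As an illustration of converting the nested hash to another format, here is a minimal sketch that reshapes it into an array of nested hashes. The helper name `to_nested_attributes` and the key names (`:regions_attributes`, `:branches_attributes`, `:services`) are assumptions loosely borrowed from the question's attempt, not a format the OP confirmed.

```ruby
# Hypothetical helper: turn company => region => location => services
# into an array of nested hashes. Key names are illustrative assumptions.
def to_nested_attributes(grouped)
  grouped.map do |company, regions|
    regions_attributes = regions.map do |region, locations|
      branches = locations.map do |location, services|
        { name: location, services: services }
      end
      { name: region, branches_attributes: branches }
    end
    { name: company, regions_attributes: regions_attributes }
  end
end

# A small hand-written input in the shape produced by the grouping code.
grouped = {
  "Google" => {
    "EU" => { "Berlin" => ["Design with HTML/CSS"],
              "London" => ["Restful API design"] },
    "US" => { "San Francisco" => ["Design with HTML/CSS", "Restful API design"] }
  }
}

result = to_nested_attributes(grouped)
```

Because the input is already grouped, the reshaping is just three nested `map` calls; price, duration and disabled would have to be carried through the grouping step if they are needed in the final structure.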

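I had to do some preprocessing of the linked CSV file. The problem was that company names were preceded by a "byte order mark" for a UTF-8 file (search for "Ok, found it"). I removed those characters with the code given by Nathan Long. The OP will have to either write the CSV file without those marks or strip them out when reading the file.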
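For readers who want to strip the marks while reading, the following is a rough sketch and not the code by Nathan Long referred to above. It assumes the text has already been read as UTF-8 and that the character U+FEFF never occurs as legitimate data; the helper name `parse_csv_without_bom` is made up for this example.

```ruby
require 'csv'

# The byte order mark as it appears in a decoded UTF-8 string.
BOM = "\uFEFF"

# Hypothetical helper: remove every BOM character, then parse with headers.
def parse_csv_without_bom(text)
  CSV.parse(text.delete(BOM), headers: true)
end

# Sample text with a BOM before the header and before one company name.
sample = "\uFEFFcompany,location,region\n\uFEFFGoogle,Berlin,EU\n"

table = parse_csv_without_bom(sample)
headers = table.headers
first_company = table[0]["company"]
```

Without the `delete` call the first header would be "\uFEFFcompany", so `row["company"]` would return nil for every row.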

The content of my reduced CSV test file is the following.

arr = ["company,location,region,service,price,duration,disabled\n",
       "Google,Berlin,EU,Design with HTML/CSS,120,30,FALSE\n",
       "Google,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
       "Google,San Francisco,US,Restful API design,1500,120,FALSE\n",
       "Apple,London,EU,Design with HTML/CSS,120,30,FALSE\n",
       "Google,Berlin,EU,Design with HTML/CSS,120,30,FALSE\n",
       "Apple,Berlin,EU,Restful API design,1500,120,FALSE\n",
       "IBM,San Francisco,US,Design with HTML/CSS,120,30,TRUE\n",
       "Google,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
       "IBM,Berlin,EU,Restful API design,1500,120,TRUE\n",
       "IBM,London,EU,Restful API design,1500,120,TRUE\n",
       "IBM,Berlin,EU,Restful API design,1500,120,TRUE\n",
       "IBM,London,EU,Restful API design,1500,120,TRUE\n",
       "IBM,San Francisco,US,Design with HTML/CSS,120,30,TRUE\n",
       "Google,Berlin,EU,Advanced AngularJS,1200,180,FALSE\n",
       "Google,Berlin,EU,Restful API design,1500,120,FALSE\n", 
       "Google,London,EU,Restful API design,1500,120,FALSE\n",
       "Apple,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
       "Google,San Francisco,US,Restful API design,1500,120,FALSE\n",
       "IBM,Berlin,EU,Restful API design,1500,120,TRUE\n"]

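To experiment without creating 'tmp.csv', the lines can be joined and parsed in memory. The sketch below uses a shortened subset of the test data and a hypothetical recursive helper, `nest`, which is a variant of the answer's nested group_by and not the answer's own code.

```ruby
require 'csv'

# Shortened subset of the test data, as raw CSV lines.
lines = [
  "company,location,region,service,price,duration,disabled\n",
  "Google,Berlin,EU,Design with HTML/CSS,120,30,FALSE\n",
  "Google,Berlin,EU,Design with HTML/CSS,120,30,FALSE\n",
  "Google,London,EU,Restful API design,1500,120,FALSE\n",
  "Apple,London,EU,Design with HTML/CSS,120,30,FALSE\n"
]

# Hypothetical helper: group rows by each key in turn; once the keys are
# used up, return the unique service names of the remaining rows.
def nest(rows, keys)
  key, *rest = keys
  return rows.map { |row| row["service"] }.uniq if key.nil?

  rows.group_by { |row| row[key] }
      .transform_values { |group| nest(group, rest) }
end

rows = CSV.parse(lines.join, headers: true)
result = nest(rows, %w[company region location])
```

Changing the order of the keys passed to `nest` changes the nesting order, which makes it easy to try alternative designs.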
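Is Ruby the main language? I can help in JS.

Could you provide the sample data as Ruby rather than as a spreadsheet screenshot? Having a piece of code that can serve as a basis for experiments is a huge help. Uploading the file helps, but links tend to break, and readers may come across your question in the future. The first step is to read the file into an array with CSV.read, correct? It would be best to begin with "I have read CSV-formatted data from a file into the following array: arr = [...]". That way readers can cut and paste. Be sure to include the variable name (e.g., arr) so that readers can refer to it in answers and comments without having to define it. I suggest you delete the question, edit it to make the changes, then undelete it, and make the array as small as possible (while still keeping all the essential elements needed to illustrate the problem).

The string "<script>alert('hi')<script>" should be removed from arr (and, cosmetically, arr[2] should be on its own line). I don't fully understand the structure of the hash you want returned. You could clarify that by adding an element to arr for "Google"/"EU"/"London" and showing the complete value of the key "Google" in the hash produced from arr.

Thanks for the suggestions; I have revised the question. Also, I need to stick to the format I mentioned in the question.

OK, I will edit my answer to construct the hash in the desired format. First, however, I need to understand more fully the structure of the hash you wish to produce. See my new comment on the question.

You can now look at the associations and update sections and my progress.

I have updated the associations. Please see the latest version of the question.

I'm still confused. In the question you say, "Perhaps the hash mentioned below could be used (I am not sure; please suggest a good design if possible)". In my answer I proposed what I think is a good design. You then commented, "I need to stick to the format I mentioned in the question", which seems contradictory.
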
For the test file, the grouping code in the answer returns:

  #=> {"Google"=>{
  #      "EU"=>{
  #        "Berlin"=>["Design with HTML/CSS", "Advanced AngularJS", "Restful API design"],
  #        "London"=>["Restful API design"]
  #      },
  #      "US"=>{
  #        "San Francisco"=>["Design with HTML/CSS", "Restful API design"]
  #      }
  #    },
  #    "Apple"=>{
  #      "EU"=>{
  #        "London"=>["Design with HTML/CSS"],
  #        "Berlin"=>["Restful API design"]
  #      },
  #      "US"=>{
  #        "San Francisco"=>["Design with HTML/CSS"]
  #      }
  #    },
  #    "IBM"=>{
  #      "US"=>{
  #        "San Francisco"=>["Design with HTML/CSS"]
  #      },
  #      "EU"=>{
  #        "Berlin"=>["Restful API design"],
  #        "London"=>["Restful API design"]
  #      }
  #    }
  #   }