Ruby: How to index duplicates in an array?


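Starting with the following array of hashes:

array = [
  {:name=>"site a", :url=>"http://example.org/site/1/"},
  {:name=>"site b", :url=>"http://example.org/site/2/"},
  {:name=>"site c", :url=>"http://example.org/site/3/"},
  {:name=>"site d", :url=>"http://example.org/site/1/"},
  {:name=>"site e", :url=>"http://example.org/site/2/"},
  {:name=>"site f", :url=>"http://example.org/site/6/"},
  {:name=>"site g", :url=>"http://example.org/site/1/"}
]

How can I add an index for duplicate URLs, like so: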

[
  {:name=>"site a", :url=>"http://example.org/site/1/", :index => 1}, 
  {:name=>"site b", :url=>"http://example.org/site/2/", :index => 1}, 
  {:name=>"site c", :url=>"http://example.org/site/3/", :index => 1}, 
  {:name=>"site d", :url=>"http://example.org/site/1/", :index => 2}, 
  {:name=>"site e", :url=>"http://example.org/site/2/", :index => 2}, 
  {:name=>"site f", :url=>"http://example.org/site/6/", :index => 1},
  {:name=>"site g", :url=>"http://example.org/site/1/", :index => 3}
]

array = [
  {:name=>"site a", :url=>"http://example.org/site/1/"},
  {:name=>"site b", :url=>"http://example.org/site/2/"},
  {:name=>"site c", :url=>"http://example.org/site/3/"},
  {:name=>"site d", :url=>"http://example.org/site/1/"},
  {:name=>"site e", :url=>"http://example.org/site/2/"},
  {:name=>"site f", :url=>"http://example.org/site/6/"},
  {:name=>"site g", :url=>"http://example.org/site/1/"}
]

array.inject([]) { |ar, it|
  # ar holds the entries processed so far; count those sharing this URL
  count_so_far = ar.count { |i| i[:url] == it[:url] }
  it[:index] = count_so_far + 1
  ar << it
}
#=> [
#  {:name=>"site a", :url=>"http://example.org/site/1/", :index=>1},
#  {:name=>"site b", :url=>"http://example.org/site/2/", :index=>1},
#  {:name=>"site c", :url=>"http://example.org/site/3/", :index=>1},
#  {:name=>"site d", :url=>"http://example.org/site/1/", :index=>2},
#  {:name=>"site e", :url=>"http://example.org/site/2/", :index=>2},
#  {:name=>"site f", :url=>"http://example.org/site/6/", :index=>1},
#  {:name=>"site g", :url=>"http://example.org/site/1/", :index=>3}
#]

I would use a hash to keep track of the indexes. Scanning the previous entries over and over again seems inefficient:

counts = Hash.new(0)
array.each { | hash | 
  hash[:index] = counts[hash[:url]] = counts[hash[:url]] + 1
}

Or, a bit cleaner:

array.each_with_object(Hash.new(0)) { | hash, counts | 
  hash[:index] = counts[hash[:url]] = counts[hash[:url]] + 1
}
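To try the hash-counter idea end to end, here is a minimal self-contained sketch using the sample data from the question. It relies on `Hash.new(0)` returning 0 for URLs not seen yet, so the first occurrence of each URL gets index 1. The `+=` form and the final `puts` loop are my own choices for readability, not part of the answer above.

```ruby
# Sample data copied from the question.
array = [
  { :name => "site a", :url => "http://example.org/site/1/" },
  { :name => "site b", :url => "http://example.org/site/2/" },
  { :name => "site c", :url => "http://example.org/site/3/" },
  { :name => "site d", :url => "http://example.org/site/1/" },
  { :name => "site e", :url => "http://example.org/site/2/" },
  { :name => "site f", :url => "http://example.org/site/6/" },
  { :name => "site g", :url => "http://example.org/site/1/" }
]

# counts[url] defaults to 0, so no nil check is needed.
counts = Hash.new(0)
array.each do |hash|
  counts[hash[:url]] += 1            # one hash lookup per element
  hash[:index] = counts[hash[:url]]  # mutates the hashes in place
end

array.each { |h| puts "#{h[:name]} -> #{h[:index]}" }
```

Each element costs one hash lookup, whereas the `ar.count` version rescans every earlier element. Note that this mutates the hashes inside `array`.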

If I wanted it to be more efficient, I would write:

items_with_index = items.inject([[], {}]) do |(output, counts), h|
  new_count = (counts[h[:url]] || 0) + 1
  [output << h.merge(:index => new_count), counts.update(h[:url] => new_count)]
end[0]
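One property of the `inject` version above is that `h.merge` returns a new hash, so the input hashes are left untouched. The sketch below shows the same non-mutating behaviour with `map` and a counter hash. It is a variation for illustration, not the answer's code, and the method name `with_duplicate_index` is hypothetical.

```ruby
# Hypothetical helper: returns new hashes with :index added,
# leaving the hashes in `items` unmodified.
def with_duplicate_index(items)
  counts = Hash.new(0)
  items.map do |h|
    counts[h[:url]] += 1
    h.merge(:index => counts[h[:url]])  # merge builds a copy
  end
end

items = [
  { :name => "site a", :url => "http://example.org/site/1/" },
  { :name => "site b", :url => "http://example.org/site/2/" },
  { :name => "site d", :url => "http://example.org/site/1/" }
]

items_with_index = with_duplicate_index(items)
puts items_with_index.inspect
puts items.inspect  # the originals still have no :index key
```

This is useful when other code still holds references to the original hashes and should not see the added key.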


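Very nice, now I'm just trying to figure out how it works... :)

I reformatted the inject call, hopefully to make it clearer. inject loops over the receiver array. On each call to the inject block, ar contains the entries seen "so far" (with their running counts), because they are appended at the end of the block. So at the start of each call you count how many times the "current" URL has been seen so far, then add one. It is a bit awkward to explain because this is really a recursive operation in disguise. (Thanks to @fl00r for generously letting me try to make his code understandable.)

I would replace

count_so_far = ar.count { |i| i[:url] == it[:url] }

with

count_so_far = ar[ar.rindex { |i| i[:url] == it[:url] }][:index]

Performance will be better if there are many elements.

@Sii Why? At G, rindex will return 3, as I would expect.

@Serabe: Actually, no, you're right; I didn't notice you were retrieving the previous running count. I just spent some time handling the case where rindex returns nil, which I think is inefficient, though that depends on the array size.

You can use each_with_object here, @mu:

array.each_with_object(Hash.new(0)) { |hash, counts| hash[:index] = counts[hash[:url]] = counts[hash[:url]] + 1 }

Both @fl00r's answer and this one work well; I just find this one easier to understand.

@floor: Thanks, updated. That is because I'm using 1.8.6 here, which doesn't have each_with_object.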
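The rindex suggestion in the comments needs a guard for a URL's first occurrence: rindex returns nil when no earlier entry matches, and indexing an array with nil raises a TypeError. A minimal sketch of one way to handle that, assuming a first occurrence should count as zero previous entries (the block parameter is renamed to `item` and the `last` variable is my own addition):

```ruby
array = [
  { :name => "site a", :url => "http://example.org/site/1/" },
  { :name => "site b", :url => "http://example.org/site/2/" },
  { :name => "site d", :url => "http://example.org/site/1/" },
  { :name => "site g", :url => "http://example.org/site/1/" }
]

result = array.inject([]) do |ar, item|
  # Position of the most recent earlier entry with the same URL, or nil.
  last = ar.rindex { |i| i[:url] == item[:url] }
  # Reuse that entry's running count; a first occurrence starts from 0.
  count_so_far = last ? ar[last][:index] : 0
  item[:index] = count_so_far + 1
  ar << item
end

result.each { |h| puts "#{h[:name]} -> #{h[:index]}" }
```

This stops scanning at the nearest match instead of counting every earlier entry, but a URL that has not appeared before still costs a full backward scan, so the counter-hash versions remain the cheaper option for large arrays.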
