Ruby: Group an array of strings by their first common letters


Is there a way to group an array of strings by their first common letters?

For example:

 array = [ 'hello', 'hello you', 'people', 'finally', 'finland' ]

So when I do

array.group_by{ |string| some_logic_with_string }

the result should be:

{ 
   'hello' => ['hello', 'hello you'],
   'people' => ['people'],
   'fin' => ['finally', 'finland']
}

I'm not sure whether you can group by all of the common letters. But if you only want to group by the first letter, it looks like this:

array = [ 'hello', 'hello you', 'people', 'finally', 'finland' ]    
result = {}
array.each { |st| result[st[0]] = result.fetch(st[0], []) + [st] }

pp result
{"h"=>["hello", "hello you"], "p"=>["people"], "f"=>["finally", "finland"]}

Now result contains the hash you want.
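The same first-letter grouping can probably be written more compactly with Enumerable#group_by, which is the method the question already uses. This is only a sketch of that variant; the variable name by_first_letter is made up for the illustration.

```ruby
# Sketch: group strings by their first character with Enumerable#group_by.
# `by_first_letter` is an illustrative name, not something from the answer.
array = ['hello', 'hello you', 'people', 'finally', 'finland']

# The block returns the grouping key; here, the first character of each string.
by_first_letter = array.group_by { |string| string[0] }

pp by_first_letter
```

This should build the same hash as the each/fetch loop above, without the explicit accumulator.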

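Well, you are trying to do something quite custom. I can think of two classic approaches that get close to what you need: 1) stemming and 2) Levenshtein distance.

With stemming, you can find the root of a longer word. There is an existing answer about this.

Levenshtein is a well-known algorithm for computing the difference between two strings. There is a gem for it that runs very fast thanks to a native C extension.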
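To make the Levenshtein idea concrete, here is a minimal pure-Ruby sketch of the standard dynamic-programming formulation. It is not the gem's API: the method name levenshtein is hypothetical, and a native extension would be much faster on large inputs.

```ruby
# Minimal pure-Ruby Levenshtein distance using two rows of the DP table.
# `levenshtein` is a hypothetical helper name, not the API of any gem.
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  # previous[j] is the distance between the processed prefix of `a`
  # and the first j characters of `b`.
  previous = (0..b.length).to_a

  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,        # deletion
        current[j - 1] + 1,     # insertion
        previous[j - 1] + cost  # substitution (or match when cost is 0)
      ].min
    end
    previous = current
  end

  previous.last
end

puts levenshtein('kitten', 'sitting')
```

A small distance suggests two strings are similar, but turning distances into groups with a shared prefix key would still need extra logic that is not shown here.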

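Note: some of the test cases are ambiguous, and their expected values conflict with other tests. You will need to fix them.

I guess a plain group_by may not work, and further processing is needed.
I came up with the following code, which seems to handle all the given test cases in a consistent way.
I have left comments in the code to explain the logic. The only way to fully understand it is to inspect the value of h and follow the flow for a simple test case.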

def group_by_common_chars(array)
  # We will iteratively group by as many time as there are characters
  # in a largest possible key, which is max length of all strings
  max_len = array.max_by {|i| i.size}.size

  # First group by first character.
  h = array.group_by{|i| i[0]}

  # Now iterate remaining (max_len - 1) times
  (1...max_len).each do |c|
    # Let's perform a group by next set of starting characters.
    t = h.map do |k,v|
      h1 = v.group_by {|i| i[0..c]}
    end.reduce(&:merge)

    # We need to merge the previously generated hash
    # with the hash generated in this iteration. Here things get tricky.
    # If previously, we had
    #   {"a" => ["a"], "ab" => ["ab", "abc"]},
    # and now, we have
    #   {"a"=>["a"], "ab"=>["ab"], "abc"=>["abc"]},
    # We need to merge the two hashes such that we have
    #   {"a"=>["a"], "ab"=>["ab", "abc"], "abc"=>["abc"]}.
    # Note that `Hash#merge`'s block is called only for common keys, so, "abc"
    # will get merged, we can't do much about it now. We will process
    # it later in the loop
    h = h.merge(t) do |k, o, n|
      if (o.size != n.size)
        diff = [o,n].max - [o,n].min
        if diff.size == 1 && t.value?(diff)
          [o,n].max
        else
          [o,n].min
        end
      else
        o
      end
    end
  end

  # Sort by key length, smallest in the beginning.
  h = h.sort {|i,j| i.first.size <=> j.first.size }.to_h

  # Get rid of those key-value pairs, where value is single element array
  # and that single element is already part of another key-value pair, and
  # that value array has more than one element. This step will allow us
  # to get rid of key-value like "abc"=>["abc"] in the example discussed
  # above.
  h = h.tap do |h|
    keys = h.keys
    keys.each do |k|
      v = h[k]
      if (v.size == 1 && h.key?(v.first) && h.values.flatten.count(v.first) > 1) then
        h.delete(k)
      end
    end
  end

  # Get rid of those keys whose value array consist of only elements that
  # already part of some other key. Since, hash is ordered by key's string
  # size, this process allows us to get rid of those keys which are smaller
  # in length but consists of only elements that are present somewhere else
  # with a key of larger length. For example, it lets us to get rid of
  # "a"=>["aba", "abb", "aaa", "aab"] from a hash like
  # {"a"=>["aba", "abb", "aaa", "aab"], "ab"=>["aba", "abb"], "aa"=>["aaa", "aab"]}
  h.tap do |h|
    keys = h.keys
    keys.each do |k|
      values = h[k]
      other_values = h.values_at(*(h.keys-[k])).flatten
      already_present = values.all? do |v|
        other_values.include?(v)
      end
      h.delete(k) if already_present
    end
  end
end
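To follow the flow on a simple case, the first two grouping passes can be run on their own. The sketch below is not the answer's code; it only mirrors its first steps. The names pass0, pass1, conflicts and merged are made up for illustration, and the merge block simply keeps the newer value instead of applying the size-based rule used above.

```ruby
# Sketch of the first two passes for a small input.
# All variable names here are illustrative, not taken from the answer.
array = ['a', 'ab', 'abc']

# Pass 0: group by the first character.
pass0 = array.group_by { |s| s[0] }

# Pass 1: regroup every bucket by its first two characters, then merge the
# per-bucket hashes into a single hash.
pass1 = pass0.map { |_key, values| values.group_by { |s| s[0..1] } }
             .reduce(&:merge)

# Hash#merge calls its block only for keys that exist in both hashes,
# so we record which keys actually reach the block.
conflicts = []
merged = pass0.merge(pass1) do |key, _old_values, new_values|
  conflicts << key
  new_values # simplification: always keep the newer, narrower bucket
end

pp pass0
pp pass1
pp conflicts
pp merged
```

Here only the key "a" exists in both hashes, so it is the only one passed to the block; "ab" is copied over from pass1 untouched.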

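Your logic is unclear. What is the expected result for array = ["a", "ab", "abc"]? What about ["aba", "abb", "aaa", "aab"]?
@Drenmi Clearly that is not the case. Look at the keys in the OP's expected hash. Do they all have the same length?
What about when the array is ["Why", "haven't", "you", "answered", "the", "above", "questions?", "Please", "do", "so."]? And for array = ['a', 'ab', 'abc'], why isn't it {'a'=>['a', 'ab', 'abc']} or {'a'=>['a', 'ab'], 'abc'=>['abc']}, and so on?
That is not what the OP wants.
Yes, I know. I wrote that in the first line.
@Wand-Maker That's awesome. This is exactly what I wanted. Thanks a lot.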
p group_by_common_chars ['hello', 'hello you', 'people', 'finally', 'finland']
#=> {"fin"=>["finally", "finland"], "hello"=>["hello", "hello you"], "people"=>["people"]}

p group_by_common_chars ['a', 'ab', 'abc']
#=> {"a"=>["a"], "ab"=>["ab", "abc"]}

p group_by_common_chars ['aba', 'abb', 'aaa', 'aab']
#=> {"ab"=>["aba", "abb"], "aa"=>["aaa", "aab"]}

p group_by_common_chars ["Why", "haven't", "you", "answered", "the", "above", "questions?", "Please", "do", "so."]
#=> {"a"=>["answered", "above"], "do"=>["do"], "Why"=>["Why"], "you"=>["you"], "so."=>["so."], "the"=>["the"], "Please"=>["Please"], "haven't"=>["haven't"], "questions?"=>["questions?"]}