Finding dictionary words within a source text using Ruby
Using Ruby, I need to output a list of dictionary words that can be formed by deleting letters from a source text.

For example, if I input the source text crazed, I want to get not only words like craze and razed, whose letters are in the same order and adjacent to each other in the source text, but also words like rad and red, because those are real words that can be found by deleting select letters from crazed while the remaining letters keep their order. However, words like dare or race should not appear in the output list, because the letters of dare or race are not in the same order as in crazed. (If raed or crae were words in the dictionary, they would be part of the output.)

My idea is to go through the source text in a binary fashion
(for "crazed", we'd get:
000001 = "d";
000010 = "e";
000011 = "ed";
000100 = "z";
000101 = "zd";
000110 = "ze";
000111 = "zed";
001000 = "a";
001001 = "ad"; etc.)
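and compare each result against the words in the dictionary, though I don't know how to code that, nor whether it is the most efficient approach. This is where I would benefit greatly from your help.
Also, the length of the source text is variable; it won't necessarily be six letters long like crazed. Inputs could be much larger: 20-30 characters, possibly more.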
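The binary idea described above can be sketched roughly as follows. This is only an illustration under assumptions: the method name each_subsequence and the small words set are made up for the example and stand in for a real dictionary.

```ruby
require 'set'

# Yield every non-empty subsequence of text by counting from 1 to 2**n - 1.
# The most significant bit of the mask corresponds to the first character.
def each_subsequence(text)
  chars = text.chars
  n = chars.size
  (1...(1 << n)).each do |mask|
    kept = (0...n).select { |i| mask[n - 1 - i] == 1 }
    yield kept.map { |i| chars[i] }.join
  end
end

# Hypothetical stand-in for a real dictionary.
words = %w[craze crazed rad red zed dare race].to_set

found = []
each_subsequence("crazed") { |s| found << s if words.include?(s) }
found.uniq.sort
# => ["craze", "crazed", "rad", "red", "zed"]
```

Note that the number of masks doubles with every extra character, so a 30-character input would mean roughly a billion candidates; that is worth keeping in mind when judging whether this approach is efficient enough.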
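I searched here and found some questions about anagrams and about words with letters in any order, but nothing specific to what I'm looking for. Is this possible in Ruby? Thanks.

First, let's read the words of the dictionary into an array, after chomping, downcasing and removing duplicates (needed if, for example, the dictionary contains both "A" and "a", as does the dictionary on my Mac that I use below).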
DICTIONARY = File.readlines("/usr/share/dict/words").map { |w| w.chomp.downcase }.uniq
#=> ["a", "aa", "aal", "aalii",..., "zyzomys", "zyzzogeton"]
DICTIONARY.size
#=> 234371
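The following method generates all combinations of one or more characters of the given word, preserving their order. For each combination it joins the characters into a string, checks whether that string is in the dictionary and, if so, saves the string to an array.
To check whether a string matches a word in the dictionary, I perform a binary search with bsearch. This makes use of the fact that the dictionary is already sorted in alphabetical order.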
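As a small illustration of the lookup before the full method, here is how bsearch behaves in find-any mode, where the block returns the result of <=>. The array sorted is a made-up sample, not the real dictionary.

```ruby
# bsearch only works on an array sorted in the same order that <=> uses.
sorted = %w[ad cad craze rad red zed]

sorted.bsearch { |dw| "rad" <=> dw }   # => "rad"
sorted.bsearch { |dw| "dare" <=> dw }  # => nil

# If a word list is not already in that order (for instance, a file that
# lists capitalized words first and is then downcased), sorting once up
# front keeps the binary search reliable.
unsorted = %w[zed ad rad]
unsorted.sort.bsearch { |dw| "rad" <=> dw }  # => "rad"
```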
def subwords(word)
  arr = word.chars
  (1..word.size).each.with_object([]) do |n,a|
    arr.combination(n).each do |comb|
      w = comb.join
      a << w if DICTIONARY.bsearch { |dw| w <=> dw }
    end
  end
end
subwords "crazed"
# => ["c", "r", "a", "z", "e", "d",
# "ca", "ce", "ra", "re", "ae", "ad", "ed",
# "cad", "rad", "red", "zed",
# "raze", "craze", "crazed"]
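For source texts of 20-30 characters the number of combinations becomes very large, so it may be worth turning the loop around: walk the dictionary once and keep each word whose letters appear in order in the source text. The following is a sketch of that alternative, not the method used above; the names subsequence? and subwords_by_scan and the sample word list are assumptions made for the example.

```ruby
# True if the characters of word occur in text in the same order,
# not necessarily adjacent.
def subsequence?(word, text)
  pos = 0
  word.each_char do |ch|
    idx = text.index(ch, pos)   # next occurrence of ch at or after pos
    return false if idx.nil?
    pos = idx + 1
  end
  true
end

# Keep every dictionary word that is a subsequence of text.
def subwords_by_scan(text, dictionary)
  dictionary.select { |w| subsequence?(w, text) }
end

# Hypothetical sample standing in for a real dictionary.
sample = %w[craze crazed dare race rad razed red zed]
subwords_by_scan("crazed", sample)
# => ["craze", "crazed", "rad", "razed", "red", "zed"]
```

The work here grows with the size of the dictionary times the length of the source text, rather than doubling with each extra character of the source.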
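Below is a broader solution set, which also includes words that can be obtained by using the letters in any order. The drawback of using combination alone to find possible subwords is that the permutations of each combination are missed. For example, from "importance", at some point the combination "mpa" will come up. Since that is not a dictionary word, it is skipped, and so we lose the permutation "map", which is a dictionary subword of "importance". The solution below finds more of the possible dictionary words. I agree that my approach could be optimized for speed.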
#steps
#split string at ''
#find combinations for n=2 all the way to n=word.size
#for each combination
#find the permutations of all the arrangements
#then
#join the array
#check to see if word is in dictionary
#and it's not already collected
#if it is, add to collecting array
require 'set'
Dictionary=File.readlines('dictionary.txt').map(&:chomp).to_set
Dictionary.size #39501
def subwords(word)
  #split string at ''
  arr=word.split('')
  #excluding single letter words
  #you can change 2 to 1 in line below to select for single letter words too
  (2..word.size).each_with_object([]) do |n,a|
    #find combinations for n=2 all the way to n=word.size
    arr.combination(n).each do |comb|
      #for each combination
      #find the permutations of all the arrangements
      comb.permutation(n).each do |perm|
        #join the array
        w=perm.join
        #check to see if word is in dictionary and it's not already collected
        if Dictionary.include?(w) && !a.include?(w)
          #if it is, add to collecting array
          a<<w
        end
      end
    end
  end
end
p subwords('crazed')
#["car", "arc", "rec", "ace", "cad", "are", "era", "ear", "rad", "red", "adz", "zed", "czar", "care", "race", "acre", "card", "dace", "raze", "read", "dare", "dear", "adze", "daze", "craze", "cadre", "cedar", "crazed"]
p subwords('battle')
#["bat", "tab", "alb", "lab", "bet", "tat", "ate", "tea", "eat", "eta", "ale", "lea", "let", "bate", "beat", "beta", "abet", "bale", "able", "belt", "teat", "tale", "teal", "late", "bleat", "table", "latte", "battle", "tablet"]
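One possible way to speed up the any-order search is to avoid generating permutations at all: index the dictionary by each word's sorted letters, then look up every combination of the source's sorted letters once. This is a sketch of a different technique rather than a tuning of the code above; build_index, subwords_any_order and the sample list are hypothetical names introduced for the example.

```ruby
# Map each sorted-letter signature to the words that share it,
# e.g. "acr" => ["arc", "car"].
def build_index(words)
  words.group_by { |w| w.chars.sort.join }
end

# Look up each distinct combination of the word's sorted letters once,
# instead of testing every permutation of every combination.
def subwords_any_order(word, index, min_size = 2)
  letters = word.chars.sort
  (min_size..letters.size).flat_map do |n|
    keys = letters.combination(n).map(&:join).uniq
    keys.flat_map { |key| index.fetch(key, []) }
  end.uniq
end

# Hypothetical sample standing in for dictionary.txt.
sample = %w[arc car care race dare zed map]
index = build_index(sample)
subwords_any_order("crazed", index).sort
# => ["arc", "car", "care", "dare", "race", "zed"]
```

Because the letters are sorted before combination is called, every combination is already in signature form, so one hash lookup covers all of its permutations.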
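The first question is: whose dictionary are you using?
I don't see why you would convert these to binary, unless you have a handy binary-to-English dictionary. With a Ruby string word, you can split it into an array with one element per character.
@whodini9, word.chars is sufficient.
I just noticed that I said I used bsearch in my answer, but it was nowhere to be found in my code. I fixed it.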