Algorithm: how to find the two sentences with the most words in common?
Given a list of sentences, find the two sentences that have the most words in common. The common words do not need to be at the same position in the sentences (order does not matter).
Thanks.
Update:
Is there a non-pairwise algorithm for this problem? The pairwise approach is very simple.
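My idea is to use an inverted index that stores where each word occurs. Building it requires going through every word of every sentence. Then I would create an n*n 2D array that counts how many times two sentences appear in the same bucket of the inverted index.

First, you need a way to take two of the sentences and determine how many words they have in common. You can do this by taking the two sentences as input and creating two arrays from them that hold their words in alphabetical order. You then walk through the two arrays, advancing whichever one holds the alphabetically earlier word (so if you are currently comparing "abacus" and "book", you advance the array holding "abacus" to its next word). If there is a match ("book" and "book"), you increment the count of matching words and advance both arrays to the next word. Continue until you reach the end of one of the arrays (because the remaining words in the other array cannot have a match).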
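The sorted-array walk described above could be sketched roughly as follows. This is only an illustration in Python, not the answerer's own code: the function name `count_common_words` is hypothetical, and it assumes words are separated by whitespace, compared case-insensitively, and counted once per sentence even if repeated.

```python
def count_common_words(sentence_a: str, sentence_b: str) -> int:
    """Count distinct words shared by two sentences using a sorted merge."""
    # Assumption: whitespace-separated words, case-insensitive comparison.
    # set() collapses repeated words so each word is counted at most once.
    words_a = sorted(set(sentence_a.lower().split()))
    words_b = sorted(set(sentence_b.lower().split()))

    i = j = count = 0
    # Stop as soon as either array is exhausted: the words left in the
    # other array cannot match anything.
    while i < len(words_a) and j < len(words_b):
        if words_a[i] == words_b[j]:
            count += 1
            i += 1
            j += 1
        elif words_a[i] < words_b[j]:
            i += 1  # advance the array holding the alphabetically earlier word
        else:
            j += 1
    return count
```

Sorting dominates the cost here, so one comparison takes roughly O(m log m) time for sentences of m words, compared with O(m^2) for comparing every word against every other word.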
Once you have implemented this algorithm, you will need a loop like this:
for (i = 0; i < sentenceCount - 1; i++) {
    for (j = i+1; j < sentenceCount; j++) {
    }
}
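Inside the loop, you call the function that counts the common words for the sentences at indices i and j. You keep track of the highest number of common words seen so far and the two sentences that produced it. Whenever a new pair has more words in common, you store that count and the two sentences that produced it. At the end, you have the two sentences you want.

Suppose you have an array of sentences: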
String[] sentences
Create some variables with default values to keep track of the two sentences with the most common words:
sentence1Index = -1
sentence2Index = -1
maxCount = -1
Run a nested loop over the sentences array:
for i : 0 -> sentences.length
    for j : 0 -> sentences.length
Make sure you are not comparing a sentence with itself:
if i != j
Split the strings on spaces (this counts some symbols as words, but it usually gives you each word):
String[] words1 = sentences[i].splitAt(" ")
String[] words2 = sentences[j].splitAt(" ")
Create a temporary count for this run:
tempCount = 0
Loop over the two word arrays (obtained from the two sentences being compared):
for a : 0 -> words1.length
    for b : 0 -> words2.length
If the words are the same, increment the temporary count:
if words1[a] equal-to-ignore-case words2[b]
    tempCount++
After comparing the words, if tempCount is greater than the current maxCount, update all the values you are tracking:
if tempCount > maxCount
    sentence1Index = i
    sentence2Index = j
    maxCount = tempCount
Return a newly created array containing the two sentences:
if sentence1Index != -1 and sentence2Index != -1
    String[] retArray = { sentences[sentence1Index], sentences[sentence2Index] }
    return retArray
return null
All the pseudocode:
String[] sentences
sentence1Index = -1
sentence2Index = -1
maxCount = -1
for i : 0 -> sentences.length
    for j : 0 -> sentences.length
        if i != j
            String[] words1 = sentences[i].splitAt(" ")
            String[] words2 = sentences[j].splitAt(" ")
            tempCount = 0
            for a : 0 -> words1.length
                for b : 0 -> words2.length
                    if words1[a] equal-to-ignore-case words2[b]
                        tempCount++
            if tempCount > maxCount
                sentence1Index = i
                sentence2Index = j
                maxCount = tempCount
if sentence1Index != -1 and sentence2Index != -1
    String[] retArray = { sentences[sentence1Index], sentences[sentence2Index] }
    return retArray
return null
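For anyone who wants to try the brute-force idea, here is one possible runnable rendering in Python. It is a sketch rather than a line-by-line translation: the name `most_similar_pair` is made up for illustration, each unordered pair is visited once instead of twice, and words are compared as sets, so a word repeated inside a sentence is counted once (the pseudocode above would count every matching pair of positions).

```python
from itertools import combinations
from typing import List, Optional, Tuple


def most_similar_pair(sentences: List[str]) -> Optional[Tuple[str, str]]:
    """Return the two sentences sharing the most distinct words, or None."""
    # Assumption: split on whitespace and lowercase, mirroring the
    # split-on-space and ignore-case comparison in the pseudocode.
    word_sets = [set(s.lower().split()) for s in sentences]

    best_pair = None
    max_count = -1
    # combinations() yields each unordered index pair (i, j), i < j, once.
    for i, j in combinations(range(len(sentences)), 2):
        count = len(word_sets[i] & word_sets[j])  # size of the intersection
        if count > max_count:
            max_count = count
            best_pair = (sentences[i], sentences[j])
    return best_pair
```

Like the pseudocode, this returns nothing useful for fewer than two sentences (here `None`), and it still examines all n(n-1)/2 pairs.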
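You might get better answers if you show what you have tried.
@axiom I have added my idea. I did not mention it at first because I do not think it is efficient enough.
Your approach is brute force. What I am looking for is a more efficient method.
@city You did not mention what you had tried before this post... nor did you say what you had done.
Sorry about that. I was a bit lazy :)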
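The inverted-index idea from the question could look something like the sketch below. It is an assumption-laden illustration, not a confirmed answer from this thread: `most_similar_pair_indexed` is a hypothetical name, a dictionary keyed by index pairs stands in for the n*n array, and words are split on whitespace and lowercased.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple


def most_similar_pair_indexed(sentences: List[str]) -> Optional[Tuple[int, int]]:
    """Return indices (i, j), i < j, of the pair sharing the most distinct words."""
    # Inverted index: word -> set of indices of the sentences containing it.
    index: Dict[str, Set[int]] = defaultdict(set)
    for sentence_id, sentence in enumerate(sentences):
        for word in sentence.lower().split():
            index[word].add(sentence_id)

    # Sparse stand-in for the n*n array: only pairs that share at least
    # one word ever get an entry.
    pair_counts: Dict[Tuple[int, int], int] = defaultdict(int)
    for sentence_ids in index.values():
        for pair in combinations(sorted(sentence_ids), 2):
            pair_counts[pair] += 1

    if not pair_counts:
        return None  # no two sentences share any word
    return max(pair_counts, key=pair_counts.get)
```

This only does work for pairs that actually share a word, which helps when most words are rare. It is still quadratic in the worst case: a word that appears in every sentence (such as "the") puts every pair into the same bucket, so filtering out very frequent words first may be worth considering.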