Comparing strings in arrays to identify duplicates
Tags: arrays, excel, vba, string, duplicates
I have to write an isDup function that compares two tweets by the number of words they have in common and decides, against a chosen decimal threshold (0-1), whether the tweets are duplicates. My approach was to first write a Sub using the two hard-coded tweets my professor provided (just to understand the logic before converting it to a function). I am getting run-time error 5.
Option Explicit
Sub isDup()
    Dim tweet1 As String
    Dim tweet2 As String
    Dim threshold As Double
    threshold = 0.7
    tweet1 = "Hours of planning can save weeks of coding"
    tweet2 = "Weeks of programming can save you hours of planning"
    Dim tweet1Split() As String
    tweet1Split = Split(tweet1, " ")
    Dim tweet2Split() As String
    tweet2Split = Split(tweet2, " ")
    Dim i As Integer
    Dim j As Integer
    Dim sameCount As Integer
    'my thought process below was to compare strings i and j to see if equal, and if true add 1 to sameCount,
    'but the If StrComp line is where the error is
    For i = LBound(tweet1Split) To UBound(tweet1Split) Step 1
        For j = LBound(tweet2Split) To UBound(tweet2Split) Step 1
            If StrComp(i, j, vbDatabaseCompare) = 0 Then
                sameCount = sameCount + 1
                Exit For
            End If
        Next j
    Next i
End Sub
'here i wanted to get a total count of the first tweet to compare, the duplicate tweet is true based on the number of
'similar words
Function totalWords(tweet1 As String) As Integer
    totalWords = 0
    Dim stringLength As Integer
    Dim currentCharacter As Integer
    stringLength = Len(tweet1)
    For currentCharacter = 1 To stringLength
        If (Mid(tweet1, currentCharacter, 1)) = " " Then
            totalWords = totalWords + 1
        End If
    Next currentCharacter
End Function
'this is where i compute an "isDup score" based on similar words compared to total words in tweet1, in this
'example the threshold was stated above at 0.7
Dim score As Double
score = sameCount / totalWords
If score > threshold Then
    MsgBox "isDup Score: " & score & " ...This is a duplicate"
Else
    MsgBox "isDup Score: " & score & " ...This is not a duplicate"
End If
End Sub
First issue:
i and j are just indices. You want to compare the strings stored at those indices, so:
If StrComp(tweet1Split(i), tweet2Split(j), vbDatabaseCompare) = 0 Then
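To make the index/element distinction concrete, here is a minimal sketch (the Sub name ShowIndexVersusElement and the sample text are made up for illustration) that prints each loop index next to the string stored at that position:

```vba
' Illustrative only: shows that the loop counter is a position,
' while words(i) is the actual word at that position.
Sub ShowIndexVersusElement()
    Dim words() As String
    Dim i As Long

    ' Split returns a zero-based array of the space-separated pieces
    words = Split("Hours of planning", " ")

    For i = LBound(words) To UBound(words)
        ' i is a number (0, 1, 2); words(i) is the string to compare
        Debug.Print i, words(i)
    Next i
End Sub
```

Running it writes one index/word pair per line to the Immediate window, which is why the comparison has to be made on tweet1Split(i) and tweet2Split(j) rather than on i and j.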
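Second issue:
As stated in the Microsoft documentation, vbDatabaseCompare is reserved for Access, which you are not using, so it is the source of the second error. You need to switch to a different comparison option.

Comments:
Which line does the error appear on?
I run the sub, get the error and press Debug, and the line If StrComp(i, j, vbDatabaseCompare) = 0 Then is highlighted as the error.
Then i and j are indices, i.e. you are comparing indices, not strings.
OK, thanks, that makes more sense. I updated the code but I still get the same error (run-time error 5) on the same line. Could there be something else wrong with the code as a whole?
The problem now is vbDatabaseCompare. You are not using Access. You need to use a different comparison option. See the documentation.
Wow, such a silly little mistake. I fixed that line, thank you so much! If you happen to notice any other errors, please let me know, I would really appreciate it!
Thanks, I was just looking at another issue. My professor has a different solution for the totalWords function (the one above is code I found online). He uploaded the following as a template, but I am still confused by it: "Dim score As Double, If UBound(tweet1Split) > UBound(tweet2Split) Then 'compute isDup score Else 'compute isDup score End If"
That is an entirely different problem and deserves a new question. Stack Overflow is for solving specific problems. If questions were allowed to evolve and change forever, there would never be a final accepted solution.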
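For reference, the two fixes from the answer can be combined into a function along the following lines. This is only a sketch under stated assumptions, not the professor's solution: the name IsDupSketch is hypothetical, vbTextCompare (case-insensitive) is just one of the valid non-Access comparison options, and dividing by the word count of the longer tweet is a guess at what the template's UBound comparison is for.

```vba
' Hypothetical sketch: returns True when the share of tweet1's words that
' also occur in tweet2 exceeds the threshold.
Function IsDupSketch(tweet1 As String, tweet2 As String, threshold As Double) As Boolean
    Dim words1() As String
    Dim words2() As String
    Dim i As Long, j As Long
    Dim sameCount As Long
    Dim count1 As Long, count2 As Long, total As Long

    words1 = Split(Trim$(tweet1), " ")
    words2 = Split(Trim$(tweet2), " ")

    ' Word counts come from the array bounds, not from counting spaces
    count1 = UBound(words1) - LBound(words1) + 1
    count2 = UBound(words2) - LBound(words2) + 1

    ' An empty tweet cannot be a duplicate; the function stays False
    If count1 = 0 Or count2 = 0 Then Exit Function

    For i = LBound(words1) To UBound(words1)
        For j = LBound(words2) To UBound(words2)
            ' Compare the words themselves (not i and j), ignoring case
            If StrComp(words1(i), words2(j), vbTextCompare) = 0 Then
                sameCount = sameCount + 1
                Exit For
            End If
        Next j
    Next i

    ' Assumption: score against the longer tweet's word count
    If count1 > count2 Then total = count1 Else total = count2

    IsDupSketch = (sameCount / total) > threshold
End Function
```

With the two tweets from the question and a threshold of 0.7, seven of the eight words in the first tweet are found in the second, which has nine words, so the score is 7 / 9 (about 0.78) and a call such as Debug.Print IsDupSketch(tweet1, tweet2, 0.7) should print True. Note that consecutive spaces in a tweet would produce empty "words"; the sketch does not handle that case.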