Arrays 使用VBA筛选器函数时的性能注意事项
我搞不懂过滤函数是怎么工作得这么快的。我对所有类型的数据都使用过过滤器,无论数据类型如何,过滤器都会删除我使用的任何替代方法。我经常使用由Stephen Bullen编写的二进制搜索算法和QuickArraySort算法(在中找到)。二进制搜索速度极快(与过滤函数一样快,假设数组已排序),快速排序算法是已知的最快排序算法之一 我在下面编写了一些测试代码,比较在一个非常大的数组(size=2000000)中查找随机元素的速度。我故意以非有序的方式填充数组(应该注意,我尝试过各种非有序的赋值方法,无论赋值方法如何,结果都是相似的) 这两种方法返回相同的结果,但是Filter方法的返回时间为0毫秒,而QuickSort/BinarySearch方法的返回时间接近20秒。这是一个巨大的差异!!如前所述,如果对数组进行排序,则二进制搜索方法也会返回0 ms(众所周知,二进制搜索要求首先对数组进行排序) 那么,Filter函数如何查看2000000个未排序的条目并立即找到正确的结果呢?它不能简单地遍历每个条目并将其与filtervalue进行比较(这是迄今为止最慢的方法),但根据这些初步测试,它也不能使用二进制搜索,因为它必须首先对数组进行排序。即使已经编译了一个很棒的排序算法,我也很难相信它能在瞬间对大小超过一百万的数组进行排序 顺便说一下,下面是快速排序算法和二进制搜索算法Arrays 使用VBA筛选器函数时的性能注意事项,arrays,excel,algorithm,vba,sorting,Arrays,Excel,Algorithm,Vba,Sorting,我搞不懂过滤函数是怎么工作得这么快的。我对所有类型的数据都使用过过滤器,无论数据类型如何,过滤器都会删除我使用的任何替代方法。我经常使用由Stephen Bullen编写的二进制搜索算法和QuickArraySort算法(在中找到)。二进制搜索速度极快(与过滤函数一样快,假设数组已排序),快速排序算法是已知的最快排序算法之一 我在下面编写了一些测试代码,比较在一个非常大的数组(size=2000000)中查找随机元素的速度。我故意以非有序的方式填充数组(应该注意,我尝试过各种非有序的赋值方法,无
Sub QuickSortString1D(ByRef saArray() As String, _
Optional ByVal bSortAscending As Boolean = True, _
Optional ByVal lLow1 As Variant, _
Optional ByVal lHigh1 As Variant)
'Dimension variables
Dim lLow2 As Long
Dim lHigh2 As Long
Dim sKey As String
Dim sSwap As String
On Error GoTo ErrorExit
'If not provided, sort the entire array
If IsMissing(lLow1) Then lLow1 = LBound(saArray)
If IsMissing(lHigh1) Then lHigh1 = UBound(saArray)
'Set new extremes to old extremes
lLow2 = lLow1
lHigh2 = lHigh1
'Get value of array item in middle of new extremes
sKey = saArray((lLow1 + lHigh1) \ 2)
'Loop for all the items in the array between the extremes
Do While lLow2 < lHigh2
If bSortAscending Then
'Find the first item that is greater than the mid-point item
Do While saArray(lLow2) < sKey And lLow2 < lHigh1
lLow2 = lLow2 + 1
Loop
'Find the last item that is less than the mid-point item
Do While saArray(lHigh2) > sKey And lHigh2 > lLow1
lHigh2 = lHigh2 - 1
Loop
Else
'Find the first item that is less than the mid-point item
Do While saArray(lLow2) > sKey And lLow2 < lHigh1
lLow2 = lLow2 + 1
Loop
'Find the last item that is greater than the mid-point item
Do While saArray(lHigh2) < sKey And lHigh2 > lLow1
lHigh2 = lHigh2 - 1
Loop
End If
'If the two items are in the wrong order, swap the rows
If lLow2 < lHigh2 Then
sSwap = saArray(lLow2)
saArray(lLow2) = saArray(lHigh2)
saArray(lHigh2) = sSwap
End If
'If the pointers are not together, advance to the next item
If lLow2 <= lHigh2 Then
lLow2 = lLow2 + 1
lHigh2 = lHigh2 - 1
End If
Loop
'Recurse to sort the lower half of the extremes
If lHigh2 > lLow1 Then
QuickSortString1D saArray, bSortAscending, lLow1, lHigh2
End If
'Recurse to sort the upper half of the extremes
If lLow2 < lHigh1 Then
QuickSortString1D saArray, bSortAscending, lLow2, lHigh1
End If
ErrorExit:
End Sub
'***********************************************************
' Comments: Uses a binary search algorithm to quickly locate
' a string within a sorted array of strings
'
' Arguments: sLookFor The string to search for in the array
' saArray An array of strings, sorted ascending
' lMethod Either vbBinaryCompare or vbTextCompare
' Defaults to vbTextCompare
' lNotFound The value to return if the text isn’t
' found. Defaults to -1
'
' Returns: Long The located position in the array,
' or lNotFound if not found
'
' Date Developer Action
' ———————————————————————————————-
' 02 Jun 04 Stephen Bullen Created
'
Function BinarySearchString(ByRef sLookFor As String, _
ByRef saArray() As String, _
Optional ByVal lMethod As VbCompareMethod = vbTextCompare, _
Optional ByVal lNotFound As Long = -1) As Long
Dim lLow As Long
Dim lMid As Long
Dim lHigh As Long
Dim lComp As Long
On Error GoTo ErrorExit
'Assume we didn’t find it
BinarySearchString = lNotFound
'Get the starting positions
lLow = LBound(saArray)
lHigh = UBound(saArray)
Do
'Find the midpoint of the array
lMid = (lLow + lHigh) \ 2
'Compare the mid-point element to the string being searched for
lComp = StrComp(saArray(lMid), sLookFor, lMethod)
If lComp = 0 Then
'We found it, so return the location and quit
BinarySearchString = lMid
Exit Do
ElseIf lComp = 1 Then
'The midpoint item is bigger than us - throw away the top half
lHigh = lMid - 1
Else
'The midpoint item is smaller than us - throw away the bottom half
lLow = lMid + 1
End If
'Continue until our pointers cross
Loop Until lLow > lHigh
ErrorExit:
End Function
Filter
确实使用了线性搜索——它只是快速执行,因为它是在高度优化的C/C++代码中实现的。要查看此信息,请运行以下代码:
Function RandString(n As Long) As String
'returns a random string in B-Z
Dim i As Long
Dim s As String
For i = 1 To n
s = s & Chr(66 + Int(25 * Rnd()))
Next i
RandString = s
End Function
Sub test()
Dim times(1 To 20) As Double
Dim i As Long, n As Long
Dim A() As String
Dim start As Double
Dim s As String
Randomize
s = RandString(99)
ReDim A(1 To 2000000)
For i = 1 To 2000000
A(i) = s + RandString(1)
Next i
s = s & "A"
For i = 20 To 1 Step -1
n = i * 100000
ReDim Preserve A(1 To n)
start = Timer
Debug.Print UBound(Filter(A, s)) 'should be -1
times(i) = Timer - start
Next i
For i = 1 To 20
Cells(i, 1) = i
Cells(i, 2) = times(i)
Next i
End Sub
此代码创建一个200万个长度为100的随机字符串数组,每个字符串都不同于最后位置的目标字符串。然后,它将大小为100000的倍数的子阵列送入过滤器
,计时所需的时间。输出如下所示:
清晰的线性趋势并不能确切地证明,但它有力地证明了VBA的
过滤器正在执行一个简单的线性搜索。我相信你在这里比较的是苹果和桔子。在测试Filter
函数时,您将无序数组作为输入,然后使用Filter
查找与测试值的匹配项。直觉告诉我们O(N)=200万次运算——每个数组元素测试一次。那你就完了
使用自定义VBA函数进行过滤时,首先要进行排序,这相当昂贵,O(N*Log2(N))=2900万。一旦对数组进行排序,您就可以获得搜索有序数组的好处,即O(Log2(N))=14。即使你大大加快了搜索速度,排序的惩罚也会让你丧命
<>希望有帮助。你正在观察编译的C++代码和解释的VBA之间的区别。编译后的代码要快几个数量级。此外,与VBA相比,筛选器可能会“作弊”,因为筛选器代码可以直接访问工作表值,而VBA不能。当然,这不是真正的欺骗,但它确实给了过滤器一个巨大的优势。最后,过滤器是高度优化的,我怀疑它采用了一种自己的幕后操作。@ExcelHero关于编译和解释的优点很好——但这似乎是关于比较VBA的搜索时间,而VBA不涉及任何对工作表值的访问。@JohnColeman说得对。我没有仔细查看代码,错误地假设正在测试哪个过滤器。现在我看到它是VBA的数组过滤器。我确实注意到OP使用了“条目”这个词,并且正在键入我的答案。哦,这个过滤器也是高度优化的,毫无疑问,它利用了幕后的排序。非常聪明的分析!我添加了一列,将运行时间(即B列中的值)除以乘因子(即a列中的值),每次,结果值都是常量,这进一步支持了您的说法。好的,对不起,我误解了。我认为解释为什么排序优先比线性搜索慢可能会有所帮助。
Sub Test3()
Dim i As Long, strMyArray() As String, lngSize As Long, strTest As String
Dim lngResultBrute As Long, TimeBruteSearch As Long
lngSize = 2000000
strTest = CStr(936740 * 97)
ReDim strMyArray(lngSize)
For i = 1 To UBound(strMyArray)
If i Mod 2 = 0 Then
strMyArray(i) = CStr((i - 1) * 97)
Else
strMyArray(i) = CStr((i + 1) * 97)
End If
Next i
StartTime = Timer
' Brute force search
For i = 1 To UBound(strMyArray)
If strMyArray(i) = strTest Then
lngResultBrute = CLng(strTest)
Exit For
End If
Next i
EndTime = Timer
TimeBruteSearch = EndTime - StartTime
MsgBox TimeBruteSearch
End Sub
Function RandString(n As Long) As String
'returns a random string in B-Z
Dim i As Long
Dim s As String
For i = 1 To n
s = s & Chr(66 + Int(25 * Rnd()))
Next i
RandString = s
End Function
Sub test()
Dim times(1 To 20) As Double
Dim i As Long, n As Long
Dim A() As String
Dim start As Double
Dim s As String
Randomize
s = RandString(99)
ReDim A(1 To 2000000)
For i = 1 To 2000000
A(i) = s + RandString(1)
Next i
s = s & "A"
For i = 20 To 1 Step -1
n = i * 100000
ReDim Preserve A(1 To n)
start = Timer
Debug.Print UBound(Filter(A, s)) 'should be -1
times(i) = Timer - start
Next i
For i = 1 To 20
Cells(i, 1) = i
Cells(i, 2) = times(i)
Next i
End Sub