Vba 过滤列表的算法
我已经实现了一个我认为是非常垃圾的方法,用于在VBA中过滤Vba 过滤列表的算法,vba,algorithm,arraylist,filter,time-complexity,Vba,Algorithm,Arraylist,Filter,Time Complexity,我已经实现了一个我认为是非常垃圾的方法,用于在VBA中过滤System.Collections.ArrayList。代码获取一个列表和一个要筛选的项/比较值。它在列表中循环并删除匹配项。然后它会重新启动循环(因为不能同时为每个和删除) 为什么是垃圾 假设列表的长度为X元素,包含Y项,这些项与筛选条件匹配,且X>Y。据我所知,最好的案例性能是O(X),当所有Y都集中在一开始时。最坏的情况是当所有的Ys都聚集在末尾时。在这种情况下,算法需要(X-Y)*Y查找操作,最大值为Y=X/2,因此O(X^2)
System.Collections.ArrayList
。代码获取一个列表和一个要筛选的项/比较值。它在列表中循环并删除匹配项。然后它会重新启动循环(因为不能同时为每个和删除
)
为什么是垃圾
假设列表的长度为X
元素,包含Y
项,这些项与筛选条件匹配,且X>Y
。据我所知,最好的案例性能是O(X)
,当所有Y
都集中在一开始时。最坏的情况是当所有的Y
s都聚集在末尾时。在这种情况下,算法需要(X-Y)*Y
查找操作,最大值为Y=X/2
,因此O(X^2)
与简单的
O(X)
算法相比,这是一个很差的算法,该算法在找到匹配项时进行步进和删除,但不打破循环。但我不知道如何实施它有什么方法可以提高此过滤器的性能吗?您不能执行以下操作,我认为这是O(n):
Option Explicit
Public Sub RemItems()
Const TARGET_VALUE As String = "dd"
Dim myList As Object
Set myList = CreateObject("System.Collections.ArrayList")
myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"
myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"
Dim i As Long
For i = myList.Count - 1 To 0 Step -1
If myList(i) = TARGET_VALUE Then myList.Remove myList(i)
Next i
End Sub
有关复杂性信息,请参阅以下讨论:
如果可信(.NET-Big-O-Algorithm-Complexity-Cheat-Sheet):
注意:我用
编辑:
警告-我不是CS毕业生。这是在胡闹。我确信,对于正在处理的数据类型、分发等问题,会有争论……欢迎改进
上面的.Net表显示,从哈希表中删除,对于删除,平均是O(1)
,而对于ArrayList,则是O(n)
,因此我从{a”,“b”,“c”}
的值中随机生成了100000行。然后,我将其用作以下结果的固定测试集
测试运行代码(请温和!)
这是一个很棒的问题,尽管它确实感觉像是一个代码审查式的问题。另外,arrayList中的数据类型是什么?这不是VB.NET吗?@QHarr该列表包含
实现IComparable
的对象,这是mscorlib.dll
中的一些接口。但是,它们也可以是字符串或数字,然后检查将是=
,而不是。CompareTo()=0
。PS谢谢,虽然这只是我代码的一个浓缩片段,并且觉得它比CR更适合这里-我认为CR只适用于长代码?@DisplayName不,我在excelvba
中做这件事,参考mscorlib.dll
CR需要整个代码(不是关于大小,而是关于上下文。原位代码),而不是MCVE。所以,是的…也许这里的片段更好。我猜你是故意避免使用分类列表?是的,这似乎是一个更好的方法。这是假设枚举数组列表(For…Each
)和索引(myList(i)
)具有相同的算法复杂性,您知道吗?我的直觉是它们都是O(n^0)
,即独立于数组的大小,索引的开销稍大一些(如果ArrayLists
像Collections
)。当n->large时,开销可以忽略不计。为什么您认为我可能要使用SortedList。不显示任何类型的内置过滤器方法?对于排序列表
,我能做什么?对于数组列表
,我不能做什么?忘了,我相信排序列表不允许重复键。哦,这很有趣;拆卸是O(n)
。但我还是不确定;是否为myList中的每个项目设置:下一个项目
,以及为i=myList设置。计数-1到0步骤-1:设置项目=myList(i):下一个i
两者都设置O(n)
,或者枚举中是否存在一些我不知道的隐藏复杂性?根据您的源代码,索引绝对是O(1)
,因此后一种方法肯定是O(n)
Option Explicit
Public Sub RemItems()
Const TARGET_VALUE As String = "dd"
Dim myList As Object
Set myList = CreateObject("System.Collections.ArrayList")
myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"
myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"
myList.Add "dd"
myList.Add "a"
myList.Add "a"
Dim i As Long
For i = myList.Count - 1 To 0 Step -1
If myList(i) = TARGET_VALUE Then myList.Remove myList(i)
Next i
End Sub
Option Explicit
Private Declare PtrSafe Function getFrequency Lib "kernel32" _
Alias "QueryPerformanceFrequency" (cyFrequency As Currency) As Long
Private Declare PtrSafe Function getTickCount Lib "kernel32" _
Alias "QueryPerformanceCounter" (cyTickCount As Currency) As Long
Public Sub TestingArrayList()
Const TARGET_VALUE = "a"
Dim aList As Object
Set aList = CreateObject("System.Collections.ArrayList")
Dim arr()
arr = ThisWorkbook.Worksheets("Sheet1").Range("A1").CurrentRegion.Value '<== Reads in 100000 value
Dim i As Long
For i = 1 To UBound(arr, 1) '50000
aList.Add arr(i, 2)
Next i
Debug.Print aList.Contains(TARGET_VALUE)
Dim StartTime As Double
StartTime = MicroTimer()
For i = aList.Count - 1 To 0 Step -1
If aList(i) = TARGET_VALUE Then aList.Remove aList(i)
Next i
Debug.Print "Removal from array list took: " & Round(MicroTimer - StartTime, 3) & " seconds"
Debug.Print aList.Contains(TARGET_VALUE)
End Sub
Public Sub TestingHashTable()
Const TARGET_VALUE = "a"
Dim hTable As Object
Set hTable = CreateObject("System.Collections.HashTable")
Dim arr()
arr = ThisWorkbook.Worksheets("Sheet1").Range("A1").CurrentRegion.Value '<== Reads in 100000 value
Dim i As Long
For i = 1 To UBound(arr, 1) '50000
hTable.Add i, arr(i, 2)
Next i
Debug.Print hTable.ContainsValue(TARGET_VALUE)
Dim StartTime As Double
StartTime = MicroTimer()
For i = hTable.Count To 1 Step -1
If hTable(i) = TARGET_VALUE Then hTable.Remove i
Next i
Debug.Print "Removal from hash table took: " & Round(MicroTimer - StartTime, 3) & " seconds"
Debug.Print hTable.ContainsValue(TARGET_VALUE)
End Sub
Public Function MicroTimer() As Double
Dim cyTicks1 As Currency
Static cyFrequency As Currency
MicroTimer = 0
If cyFrequency = 0 Then getFrequency cyFrequency
getTickCount cyTicks1
If cyFrequency Then MicroTimer = cyTicks1 / cyFrequency
End Function
Option Explicit
Public Sub TestingComparison()
Const RUN_COUNT As Long = 4
Dim hTable As Object
Dim aList As Object
Dim i As Long, j As Long, k As Long, rowCount As Long
Dim results() As Double
Set hTable = CreateObject("System.Collections.HashTable")
Set aList = CreateObject("System.Collections.ArrayList")
Dim testSizes()
testSizes = Array(100, 1000, 10000, 100000) ', 1000000)
ReDim results(0 To RUN_COUNT * (UBound(testSizes) + 1) - 1, 0 To 4)
Application.ScreenUpdating = False
With ThisWorkbook.Worksheets("Sheet5")
For i = LBound(testSizes) To UBound(testSizes)
For k = 1 To RUN_COUNT
For j = 1 To testSizes(i)
hTable.Add j, 1
aList.Add 1
Next j
Dim StartTime As Double, completionTime As Double
StartTime = MicroTimer()
For j = hTable.Count To 1 Step -1
hTable.Remove j
Next j
results(rowCount, 3) = Round(MicroTimer - StartTime, 3)
results(rowCount, 0) = testSizes(i)
results(rowCount, 1) = k
StartTime = MicroTimer()
For j = aList.Count - 1 To 0 Step -1
aList.Remove aList(j)
Next j
results(rowCount, 2) = Round(MicroTimer - StartTime, 3)
hTable.Clear
aList.Clear
rowCount = rowCount + 1
Next k
Next i
.Range("A2").Resize(UBound(results, 1) + 1, UBound(results, 2)) = results
End With
Application.ScreenUpdating = True
End Sub