Vba 过滤列表的算法

Vba 过滤列表的算法,vba,algorithm,arraylist,filter,time-complexity,Vba,Algorithm,Arraylist,Filter,Time Complexity,我已经实现了一个我认为是非常垃圾的方法,用于在VBA中过滤System.Collections.ArrayList。代码获取一个列表和一个要筛选的项/比较值。它在列表中循环并删除匹配项。然后它会重新启动循环(因为不能同时为每个和删除) 为什么是垃圾 假设列表的长度为X元素,包含Y项,这些项与筛选条件匹配,且X>Y。据我所知,最好的案例性能是O(X),当所有Y都集中在一开始时。最坏的情况是当所有的Ys都聚集在末尾时。在这种情况下,算法需要(X-Y)*Y查找操作,最大值为Y=X/2,因此O(X^2)

我已经实现了一个我认为是非常垃圾的方法,用于在VBA中过滤
System.Collections.ArrayList
。代码获取一个列表和一个要筛选的项/比较值。它在列表中循环并删除匹配项。然后它会重新启动循环(因为不能同时为每个和
删除

为什么是垃圾

假设列表的长度为
X
元素,包含
Y
项,这些项与筛选条件匹配,且
X>Y
。据我所知,最好的案例性能是
O(X)
,当所有
Y
都集中在一开始时。最坏的情况是当所有的
Y
s都聚集在末尾时。在这种情况下,算法需要
(X-Y)*Y
查找操作,最大值为
Y=X/2
,因此
O(X^2)


与简单的
O(X)
算法相比,这是一个很差的算法,该算法在找到匹配项时进行步进和删除,但不打破循环。但我不知道如何实施它有什么方法可以提高此过滤器的性能吗?

您不能执行以下操作,我认为这是O(n):

Option Explicit

Public Sub RemItems()

    Const TARGET_VALUE As String = "dd"
    Dim myList As Object
    Set myList = CreateObject("System.Collections.ArrayList")

    myList.Add "a"
    myList.Add "dd"
    myList.Add "a"
    myList.Add "a"
    myList.Add "a"
    myList.Add "dd"
    myList.Add "a"
    myList.Add "a"
    myList.Add "dd"
    myList.Add "a"
    myList.Add "a"

    Dim i As Long
    For i = myList.Count - 1 To 0 Step -1
        If myList(i) = TARGET_VALUE Then myList.Remove myList(i)
    Next i

End Sub
有关复杂性信息,请参阅以下讨论:

如果可信(.NET-Big-O-Algorithm-Complexity-Cheat-Sheet):

注意:我用

编辑: 警告-我不是CS毕业生。这是在胡闹。我确信,对于正在处理的数据类型、分发等问题,会有争论……欢迎改进

上面的.Net表显示,从哈希表中删除,对于删除,平均是
O(1)
,而对于ArrayList,则是
O(n)
,因此我从
{a”,“b”,“c”}
的值中随机生成了100000行。然后,我将其用作以下结果的固定测试集

测试运行代码(请温和!)


这是一个很棒的问题,尽管它确实感觉像是一个代码审查式的问题。另外,arrayList中的数据类型是什么?这不是VB.NET吗?@QHarr该列表包含
实现IComparable
的对象,这是
mscorlib.dll
中的一些接口。但是,它们也可以是字符串或数字,然后检查将是
=
,而不是
。CompareTo()=0
。PS谢谢,虽然这只是我代码的一个浓缩片段,并且觉得它比CR更适合这里-我认为CR只适用于长代码?@DisplayName不,我在
excelvba
中做这件事,参考
mscorlib.dll
CR需要整个代码(不是关于大小,而是关于上下文。原位代码),而不是MCVE。所以,是的…也许这里的片段更好。我猜你是故意避免使用分类列表?是的,这似乎是一个更好的方法。这是假设枚举数组列表(
For…Each
)和索引(
myList(i)
)具有相同的算法复杂性,您知道吗?我的直觉是它们都是
O(n^0)
,即独立于数组的大小,索引的开销稍大一些(如果
ArrayLists
Collections
)。当n->large时,开销可以忽略不计。为什么您认为我可能要使用SortedList。不显示任何类型的内置
过滤器
方法?对于
排序列表
,我能做什么?对于
数组列表
,我不能做什么?忘了,我相信排序列表不允许重复键。哦,这很有趣;拆卸是
O(n)
。但我还是不确定;是否为myList中的每个项目设置
:下一个项目
,以及为i=myList设置
。计数-1到0步骤-1:设置项目=myList(i):下一个i
两者都设置
O(n)
,或者枚举中是否存在一些我不知道的隐藏复杂性?根据您的源代码,索引绝对是
O(1)
,因此后一种方法肯定是
O(n)
Option Explicit

Public Sub RemItems()

    Const TARGET_VALUE As String = "dd"
    Dim myList As Object
    Set myList = CreateObject("System.Collections.ArrayList")

    myList.Add "a"
    myList.Add "dd"
    myList.Add "a"
    myList.Add "a"
    myList.Add "a"
    myList.Add "dd"
    myList.Add "a"
    myList.Add "a"
    myList.Add "dd"
    myList.Add "a"
    myList.Add "a"

    Dim i As Long
    For i = myList.Count - 1 To 0 Step -1
        If myList(i) = TARGET_VALUE Then myList.Remove myList(i)
    Next i

End Sub
Option Explicit

Private Declare PtrSafe Function getFrequency Lib "kernel32" _
Alias "QueryPerformanceFrequency" (cyFrequency As Currency) As Long
Private Declare PtrSafe Function getTickCount Lib "kernel32" _
Alias "QueryPerformanceCounter" (cyTickCount As Currency) As Long

Public Sub TestingArrayList()
    Const TARGET_VALUE = "a"
    Dim aList As Object
    Set aList = CreateObject("System.Collections.ArrayList")

    Dim arr()
    arr = ThisWorkbook.Worksheets("Sheet1").Range("A1").CurrentRegion.Value '<== Reads in 100000 value

    Dim i As Long
    For i = 1 To UBound(arr, 1) '50000
        aList.Add arr(i, 2)
    Next i

    Debug.Print aList.Contains(TARGET_VALUE)

    Dim StartTime As Double

    StartTime = MicroTimer()

    For i = aList.Count - 1 To 0 Step -1
       If aList(i) = TARGET_VALUE Then aList.Remove aList(i)
    Next i

    Debug.Print "Removal from array list took: " & Round(MicroTimer - StartTime, 3) & " seconds"
    Debug.Print aList.Contains(TARGET_VALUE)

End Sub

Public Sub TestingHashTable()
    Const TARGET_VALUE = "a"
    Dim hTable As Object
    Set hTable = CreateObject("System.Collections.HashTable")

    Dim arr()
    arr = ThisWorkbook.Worksheets("Sheet1").Range("A1").CurrentRegion.Value '<== Reads in 100000 value

    Dim i As Long
    For i = 1 To UBound(arr, 1) '50000
        hTable.Add i, arr(i, 2)
    Next i

    Debug.Print hTable.ContainsValue(TARGET_VALUE)

    Dim StartTime As Double

    StartTime = MicroTimer()

    For i = hTable.Count To 1 Step -1
       If hTable(i) = TARGET_VALUE Then hTable.Remove i
    Next i

    Debug.Print "Removal from hash table took: " & Round(MicroTimer - StartTime, 3) & " seconds"
    Debug.Print hTable.ContainsValue(TARGET_VALUE)

End Sub

Public Function MicroTimer() As Double

    Dim cyTicks1 As Currency
    Static cyFrequency As Currency

    MicroTimer = 0

    If cyFrequency = 0 Then getFrequency cyFrequency

    getTickCount cyTicks1

    If cyFrequency Then MicroTimer = cyTicks1 / cyFrequency
End Function
Option Explicit

Public Sub TestingComparison()

    Const RUN_COUNT As Long = 4

    Dim hTable As Object
    Dim aList As Object
    Dim i As Long, j As Long, k As Long, rowCount As Long
    Dim results() As Double

    Set hTable = CreateObject("System.Collections.HashTable")
    Set aList = CreateObject("System.Collections.ArrayList")

    Dim testSizes()
    testSizes = Array(100, 1000, 10000, 100000)  ', 1000000)
    ReDim results(0 To RUN_COUNT * (UBound(testSizes) + 1) - 1, 0 To 4)

    Application.ScreenUpdating = False

    With ThisWorkbook.Worksheets("Sheet5")

        For i = LBound(testSizes) To UBound(testSizes)

            For k = 1 To RUN_COUNT

                For j = 1 To testSizes(i)
                    hTable.Add j, 1
                    aList.Add 1
                Next j

                Dim StartTime As Double, completionTime As Double

                StartTime = MicroTimer()

                For j = hTable.Count To 1 Step -1
                    hTable.Remove j
                Next j

                results(rowCount, 3) = Round(MicroTimer - StartTime, 3)
                results(rowCount, 0) = testSizes(i)
                results(rowCount, 1) = k

                StartTime = MicroTimer()

                For j = aList.Count - 1 To 0 Step -1
                    aList.Remove aList(j)
                Next j

                results(rowCount, 2) = Round(MicroTimer - StartTime, 3)

                hTable.Clear
                aList.Clear
                rowCount = rowCount + 1
            Next k

        Next i

        .Range("A2").Resize(UBound(results, 1) + 1, UBound(results, 2)) = results

    End With

    Application.ScreenUpdating = True
End Sub