Arrays 使用Excel VBA在多维数组中查找(而不是删除)重复值(行)
在我过去的一段时间里,我正在努力实现自己的目标: 我希望根据多个条件使用VBA代码查找并突出显示重复的向上收费:Arrays 使用Excel VBA在多维数组中查找(而不是删除)重复值(行),arrays,excel,vba,multidimensional-array,conditional-formatting,Arrays,Excel,Vba,Multidimensional Array,Conditional Formatting,在我过去的一段时间里,我正在努力实现自己的目标: 我希望根据多个条件使用VBA代码查找并突出显示重复的向上收费: 产品的XID(A列) 加价标准1(CT列) 上调收费标准2(CU列) 向上充电类型(CV列)和 上升电荷水平(CW列) 如果电子表格中有多个实例/行共享/匹配所有这些条件,则表示追加费用是重复的。正如我在上面链接的前一篇文章中所看到的: 我所尝试的: 创建了一个通用公式(请参见下文),该公式插入到“帮助器”列中,并一直复制到电子表格中,指出哪些升级费用是重复的。此方法资源太多,耗时太
脚本.dictionary
,但我不确定如何(或是否)使用多维数组Sub populateArray()
Dim arrXID() As Variant, arrUpchargeOne() As Variant, arrUpchargeTwo() As Variant, arrUpchargeType() As Variant, arrUpchargeLevel() As Variant
Dim arrAllData() As Variant
Dim i As Long, lrow As Long
lrow = ActiveSheet.Cells(Rows.Count, 1).End(xlUp).Row
arrXID = Range("A2:A" & lrow) 'amend column number
arrUpchargeOne = Range("CT2:CT" & lrow)
arrUpchargeTwo = Range("CU2:CU" & lrow)
arrUpchargeType = Range("CV2:CV" & lrow)
arrUpchargeLevel = Range("CW2:CW" & lrow)
ReDim arrAllData(1 To UBound(arrXID, 1), 4) As Variant
For i = 1 To UBound(arrXID, 1)
arrAllData(i, 0) = arrXID(i, 1)
arrAllData(i, 1) = arrUpchargeOne(i, 1)
arrAllData(i, 2) = arrUpchargeTwo(i, 1)
arrAllData(i, 3) = arrUpchargeType(i, 1)
arrAllData(i, 4) = arrUpchargeLevel(i, 1)
Next i
End Sub
我可以将列放入数组中,但我从那里被卡住了。我不知道如何检查数组中的重复“行”
我的问题:
=和(SUMPRODUCT($A$2:$A$”&lastRow&“=$A2)*($CT$2:$CT$”&lastRow&“=$CT2)*($CU$2:$CU$”&lastRow&“=$CU2)*($CV$2:$CV$”&lastRow&“=$CV2)*($CW 2:$CW
如果Upcharge是重复的,则返回TRUE
考虑SQL解决方案,因为这是一种典型的过滤计数大于1的情况。要执行路由,需要在循环中跨数组的所有元素使用许多条件逻辑
虽然我建议您只需将数据导入数据库(如Excel的同级MS Access),但Excel可以在自己的工作簿上使用SQL语句(不是为了详细说明,而是Excel和Access使用相同的Jet/ACE引擎)。一个好消息是,您似乎已设置为使用命名列的类似于表的结构运行这样的查询
下面的示例引用了名为Data(Data$
)的工作表中的字段,并将查询结果输出到名为Results(带标题)的工作表中。根据需要更改名称。包括两个连接字符串(其中一个已注释掉)。希望它能在您的端运行
Sub RunSQL()
Dim conn As Object, rst As Object
Dim i As Integer, fld As Object
Dim strConnection As String, strSQL As String
Set conn = CreateObject("ADODB.Connection")
Set rst = CreateObject("ADODB.Recordset")
' Connection and SQL Strings
' strConnection = "DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
' & "DBQ=C:\Path\To\Workbook.xlsm;"
strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;" _
& "Data Source='C:\Path\To\Workbook.xlsm';" _
& "Extended Properties=""Excel 8.0;HDR=YES;"";"
strSQL = " SELECT [Data$].[Product's XID], [Data$].[Upcharge Criteria 1]," _
& " [Data$].[Upcharge Criteria 2], [Data$].[Upcharge Type]," _
& " [Data$].[Upcharge Type], [Data$].[Upcharge Level]" _
& " FROM [Data$]" _
& " GROUP BY [Data$].[Product's XID], [Data$].[Upcharge Criteria 1]," _
& " [Data$].[Upcharge Criteria 2], [Data$].[Upcharge Type]," _
& " [Data$].[Upcharge Type], [Data$].[Upcharge Level]," _
& " [Data$].[Product's XID]" _
& " HAVING COUNT(*) > 1;"
' Open the db connection
conn.Open strConnection
rst.Open strSQL, conn
' Column headers
i = 0
Worksheets("Results").Range("A1").Activate
For Each fld In rst.Fields
ActiveCell.Offset(0, i) = fld.Name
i = i + 1
Next fld
' Data rows
Worksheets("Results").Range("A2").CopyFromRecordset rst
rst.Close
conn.Close
End Sub
为什么不删除
Indirect()
您可以直接使用一些稳定的行引用,如$A$2:$A$50000
,这可能会显示性能上的一些重大变化
或
使用“为数据创建表”。在公式中使用表引用,这将比Indirect()
reference更快
编辑
你的实际公式
=AND(SUMPRODUCT(($A$2:$A$500=$A2)*($CU$2:$CU$500=$CU2)*($CV$2:$CV$500=$CV2)*($CW$2:$CW$500=$CW2)*($CX$2:$CX$500=$CX2))>1,$CU2 <> "")
你说识别重复的;我听到了对象
Public Sub-lminyDupes()
尺寸d为长,str为字符串,vAs为变量,vCTCWs为变量
Dim dDUPEs As Object'这可能会像魔术一样起作用,但不确定是否会起作用
您能否创建另一个支持(临时)列,将所有四个条件连接起来
ZZ_Temp=连接(CS;CV;CZ;等)
我想,通过这种方式,您可以更快地显示/突出显示重复项。条件格式和筛选
首先,您选择的函数不适用于如此多的行和多个条件。a可以执行与a相同的许多多条件操作,但通常需要25-35%的计算负载和时间。此外,可以在CountIf中使用全列引用,而不会影响列引用差异在内部被截断在的限制处
您的标准公式可以用COUNTIFS写成
=AND(COUNTIFS(A:A, A2, CT:CT, CT2, CU:CU, CU2, CV:CV, CV2, CW:CW, CW2)>1, CT2<>"")
'... or,
=COUNTIFS(A:A, A2, CT:CT, CT2, CT:CT, "<>", CU:CU, CU2, CV:CV, CV2, CW:CW, CW2)>1
虽然不是很快,但这很容易完成。“最佳猜测”比“最终开始和结束”更快,但您可能无法完全覆盖A列中重复的范围。当然,偏移量(例如,100向上和向下)可以调整范围的控制。应使用COUNTIFS函数替换SUMPRODUCT函数;后者通常为计算负载(和时间)的25-35%。也可以使用完整列引用,而不会造成损害。例如,=和(COUNTIFS(a:a,A2,CT:CT,CT2,CU:CU,CU2,CV:CV,CV2,CW:CW,CW2)>1,符号(LEN(CT2))
大约需要80秒。这不太可以接受,但需要在其中插入一个pin。只需选择完整的数据并按Ctrl+T键,即可将数据转换为表格。然后将该步骤转换为公式,选择一个范围参考,并将光标移到相关位置
=AND(COUNTIFS($A$2:$A$500,$A2,$CU$2:$CU$500,$CU2,$CV$2:$CV$500,$CV2,$CW$2:$CW**$500,$CW2,$CX$2:$CX$500,$CX2)>1,$CU12<>"")
Public Sub lminyDupes()
Dim d As Long, str As String, vAs As Variant, vCTCWs As Variant
Dim dDUPEs As Object '<~~ Late Binding
'Dim dDUPEs As New Scripting.Dictionary '<~~ Early Binding
Debug.Print Timer
Application.ScreenUpdating = False '<~~ uncomment this once you are no longer debugging
'Remove the next line with Early Binding¹
Set dDUPEs = CreateObject("Scripting.Dictionary")
dDUPEs.comparemode = vbTextCompare
With Worksheets("Upcharge") '<~~ you know what worksheet you are supposed to be on
With .Cells(1, 1).CurrentRegion
With .Resize(.Rows.Count - 1, .Columns.Count).Offset(1, 0)
.Columns(97).Interior.Pattern = xlNone '<~~ reset column CS
'the following is intended to mimic a CF rule using this formula
'=AND(COUNTIFS(A:A, A2, CT:CT, CT2, CU:CU, CU2, CV:CV, CV2, CW:CW, CW2)>1, SIGN(LEN(CT2)))
vAs = .Columns(1).Value2
vCTCWs = Union(.Columns(98), .Columns(99), .Columns(100), .Columns(101)).Value2
For d = LBound(vAs, 1) To UBound(vAs, 1)
If CBool(Len(vCTCWs(d, 1))) Then
'make a key of the criteria values
str = Join(Array(vAs(d, 1), vCTCWs(d, 1), vCTCWs(d, 2), vCTCWs(d, 3), vCTCWs(d, 4)), ChrW(8203))
If dDUPEs.exists(str) Then
'the comboned key exists in the dictionary; append the current row
dDUPEs.Item(str) = dDUPEs.Item(str) & Chr(44) & "CS" & d
Else
'the combined key does not exist in the dictionary; store the current row
dDUPEs.Add Key:=str, Item:="CS" & d
End If
End If
Next d
'reuse a variant var to provide row highlighting
Erase vAs
For Each vAs In dDUPEs.keys
'if there is more than a single cell address, highlight all
If CBool(InStr(1, dDUPEs.Item(vAs), Chr(44))) Then _
.Range(dDUPEs.Item(vAs)).Interior.Color = vbRed
Next vAs
End With
End With
End With
dDUPEs.RemoveAll: Set dDUPEs = Nothing
Erase vCTCWs
Application.ScreenUpdating = True
Debug.Print Timer
End Sub
=AND(COUNTIFS(A:A, A2, CT:CT, CT2, CU:CU, CU2, CV:CV, CV2, CW:CW, CW2)>1, CT2<>"")
'... or,
=COUNTIFS(A:A, A2, CT:CT, CT2, CT:CT, "<>", CU:CU, CU2, CV:CV, CV2, CW:CW, CW2)>1
=IF(CT2<>"", COUNTIFS(A:A, A2, CT:CT, CT2, CU:CU, CU2, CV:CV, CV2, CW:CW, CW2)>1)
Sub lminyCFrule()
Debug.Print Timer
'Application.ScreenUpdating = False '<~~ uncomment this once you are no longer debugging
On Error Resume Next '<~~ needed for deleting objects without checking to see if they exist
With Worksheets("Upcharge") '<~~ you know what worksheet you are supposed to be on
If .AutoFilterMode Then .AutoFilterMode = False
'delete any existing defined name called 'localXID' or 'local200'
With .Parent
.Names("localXID").Delete
.Names("local200").Delete
End With
'create a new defined name called 'localXID' for CF rule method 1
.Names.Add Name:="localXID", RefersToR1C1:= _
"=INDEX('" & .Name & "'!C1:C104, MATCH('" & .Name & "'!RC1, '" & .Name & "'!C1, 0), 0):" & _
"INDEX('" & .Name & "'!C1:C104, MATCH('" & .Name & "'!RC1, '" & .Name & "'!C1 ), 0)"
'create a new defined name called 'local200' for CF rule method 2
.Names.Add Name:="local200", RefersToR1C1:= _
"=INDEX(Upcharge!C1:C104, MAX(2, ROW()-100), 0):INDEX(Upcharge!C1:C101, ROW()+100, 0)"
With .Cells(1, 1).CurrentRegion
'sort on column A in ascending order
.Cells.Sort Key1:=.Columns(1), Order1:=xlAscending, _
Orientation:=xlTopToBottom, Header:=xlYes
'create a CF rule on column CS
With .Resize(.Rows.Count - 1, 1).Offset(1, 96)
With .FormatConditions
.Delete
' method 1 and method 2. Only use ONE of these!
' method 1 - definitively start and end of XIDs in column A (slower, no mistakes)
'.Add Type:=xlExpression, Formula1:= _
"=IF(CT2<>"""", COUNTIFS(INDEX(localXID, 0, 1), A2, INDEX(localXID, 0, 98), CT2," & _
"INDEX(localXID, 0, 99), CU2, INDEX(localXID, 0, 100), CV2," & _
"INDEX(localXID, 0, 101), CW2)-1)"
' method 2 - best guess at start and end of XIDs in column A (faster, guesswork at true scope)
.Add Type:=xlExpression, Formula1:= _
"=IF(CT2<>"""", COUNTIFS(INDEX(local200, 0, 1), A2, INDEX(local200, 0, 98), CT2," & _
"INDEX(local200, 0, 99), CU2, INDEX(local200, 0, 100), CV2," & _
"INDEX(local200, 0, 101), CW2)-1)"
End With
.FormatConditions(.FormatConditions.Count).Interior.ColorIndex = 3
End With
'Filter based on column CS is red
.Columns(97).AutoFilter Field:=1, Criteria1:=vbRed, Operator:=xlFilterCellColor
End With
End With
Application.ScreenUpdating = True
Debug.Print Timer
End Sub