Excel 使用VBA复制粘贴脚本比较两个CSV文档
一个星期以来,我一直试图在我的话题上取得成功,但我找不到任何可以接受的解决办法。我是说。。。我有一个有效的解决方案,但比较只需要大约半天:-S 前提条件: 两个csv文件都已复制粘贴到本地工作簿中。他们在场,准备和他们一起玩。每个文件约有6000行和4列。 A列:文档名称/版本 B栏:主题1 C栏:主题2 D列:布尔人工制品 两个csv文件具有相同的结构。 列A包括documentname及其最新版本。 每行包含以下内容的组合:documentname/version、SubC1、SubC2和boolean CSV\u新旧示例,包括E列中CSV\u新的注释/更改Excel 使用VBA复制粘贴脚本比较两个CSV文档,excel,vba,csv,compare,Excel,Vba,Csv,Compare,一个星期以来,我一直试图在我的话题上取得成功,但我找不到任何可以接受的解决办法。我是说。。。我有一个有效的解决方案,但比较只需要大约半天:-S 前提条件: 两个csv文件都已复制粘贴到本地工作簿中。他们在场,准备和他们一起玩。每个文件约有6000行和4列。 A列:文档名称/版本 B栏:主题1 C栏:主题2 D列:布尔人工制品 两个csv文件具有相同的结构。 列A包括documentname及其最新版本。 每行包含以下内容的组合:documentname/version、SubC1、SubC2和b
Document/Version Subj1 Subj2 BOOLEAN
DOC_1/Vers1 FUN GERMANY FALSE
DOC_2/Vers3 FUN GERMANY TRUE
DOC_2/Vers3 FUN UK TRUE <- to be deleted in CSV_new
DOC_2/Vers3 FUN FRANCE TRUE
DOC_3/Vers7 ACTION GERMANY FALSE <- Version Update in CSV_new
DOC_4/Vers4 MOVIE UK TRUE
DOC_6/Vers1 HELP SPAIN FALSE
DOC_7/Vers2 FUN GERMANY FALSE <- boolean: true in CSV_new
DOC_8/Vers5 FUN FRANCE TRUE <- Subj1: ACTION instead of FUN
非常非常感谢您的努力o) 此代码将生成两组结果:一组用于
Sheet1
(旧),另一组用于Sheet2
(新)
- 集合1-
将显示Sheet2中缺少的记录Sheet1
- 集合2-
将显示Sheet1中缺少的记录Sheet2
- 两组都将显示更新的记录
词典,当准备好进行比较时 以前 之后 注意:您提供的示例数据与描述中的不同
CSV_old
Document/Version Subj1 Subj2 BOOLEAN
DOC_1/Vers1 FUN GERMANY FALSE <- Correct
DOC_2/Vers3 FUN GERMANY TRUE <- Correct
DOC_2/Vers3 FUN UK TRUE <- to be deleted in CSV_new <- Exists in new
DOC_2/Vers3 FUN FRANCE TRUE <- Correct
DOC_3/Vers7 ACTION GERMANY FALSE <- Version Update in CSV_new <- This not in new
DOC_4/Vers4 MOVIE UK TRUE <- Correct
DOC_6/Vers1 HELP SPAIN FALSE <- Correct
DOC_7/Vers2 FUN GERMANY FALSE <- boolean: true in CSV_new <- FALSE in new
DOC_8/Vers5 FUN FRANCE TRUE <- Subj1: ACTION instead of FUN <- Correct
CSV_new
Document/Version Subj1 Subj2 BOOLEAN
DOC_1/Vers1 FUN GERMANY FALSE <- Correct
DOC_2/Vers3 FUN GERMANY TRUE <- Correct
DOC_2/Vers3 FUN UK TRUE <- Exists in new
DOC_2/Vers3 FUN FRANCE TRUE <- Correct
DOC_3/Vers9 ACTION GERMANY FALSE <- Version Updated <- New record
DOC_4/Vers4 MOVIE UK TRUE <- Correct
DOC_5/Vers5 DANGER UK FALSE <- new/added Row in CSV_new <- Correct
DOC_6/Vers1 HELP SPAIN FALSE <- Correct
DOC_7/Vers2 FUN GERMANY FALSE <- boolean updated to true <- FALSE in new
DOC_8/Vers5 ACTION FRANCE TRUE <- Subj1: ACTION instead of FUN <- Correct
CSV\u旧
文档/版本SubC1 SubC2布尔值
DOC_1/Vers1 FUN GERMANY FALSE Maaaan谢谢,太好了!:o) 太快了!不幸的是,关于使用“CreateObject”(“Scripting.Dictionary”)“我一点也不懂。结果“Missing:”以相同的方式/含义总结添加和删除的行。是否有机会在新的/添加的行和删除的行之间做出区别?我很高兴这有帮助!但我不确定您所说的在新行和已删除行之间进行区分是什么意思:Sheet1
将显示Sheet2
中所有缺失的行-这将显示CSV\u new中所有已删除的
记录。然后,Sheet2
将显示Sheet1
中所有缺失的行-这将显示CSV\u new中的所有new
记录。您不能在一张表中显示新记录和已删除记录,因为您必须插入丢失的行,以显示丢失的记录(可能我没有正确理解)@stoeven-BTW-您能告诉我现在需要多长时间吗?(旧版总共有多少行,新版总共有多少行?)。我没有测试数千行的正确数据。谢谢亲爱的,你是对的——明白了:哦,所以。。。我会把它作为一个解决方案。现在我有大约5900行在这两个和它需要大约2秒,直到完成。使用这些词典的绝妙方法。但老实说,我需要再次了解Bing/Google的结构,以便自己能更进一步。再次感谢您为本主题所做的努力。欢迎您!如果使用大量数据,您将能够使用数组
、字典
、和集合
显著提高性能。要记住的关键是尽量减少与范围的交互:将整个工作表读入一个数组:Dim arr as Variant:arr=Sheet1。使用drange
,以与工作表相同的方式对数组进行更改:Sheet1。单元格(1,1)=“Test”
,与arr(1,1)=“Test”
相同,然后将整个阵列放回UsedRange:Sheet1.UsedRange=arr
Set findSameDocumentNumberInColumnA = Sheets(givenActiveWS).Cells.Find(Sheets("_ws_oldCSV").Range("A" & rowInOldCSV & ":D" & rowInOldCSV).Value, LookIn:=xlValues)
Set findSameDocumentNumberInColumnA_withoutVers = Sheets(givenActiveWS).Cells.Find(Left(Sheets("_ws_oldCSV").Cells(rowInOldCSV, 1).Value, Len(Sheets("_ws_oldCSV").Cells(rowInOldCSV, 1).Value) - 5), LookIn:=xlValues)
If Not findSameDocumentNumberInColumnA Is Nothing Then
'document/version found!
firstAddress = findSameDocumentNumberInColumnA.Address
Do
'if subj1+subj2 are same
If (Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 2).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 2).Value) And _
(Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 3).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 3).Value) Then '....and boolean-value the same
'Sheets("_ws_oldCSV").Range("A" & rowInOldCSV & ":D" & rowInOldCSV).Copy 'takes even longer
'Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 6).PasteSpecial Paste:=xlPasteValues
Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 6).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 1).Value
Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 7).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 2).Value
Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 8).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 3).Value
Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 9).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 4).Value
'leave loop
Exit Do
End If
Set findSameDocumentNumberInColumnA = Sheets(givenActiveWS).Cells.FindNext(findSameDocumentNumberInColumnA)
Loop While Not findSameDocumentNumberInColumnA Is Nothing And findSameDocumentNumberInColumnA.Address <> firstAddress
Else
'document/version not found
If Not findSameDocumentNumberInColumnA_withoutVers Is Nothing Then
'document found, looks like new version
'mark it with yellow to show updated version
Else
'unkown document, means new introduced since csv_old
'copy it under last item in RowF
'
End If
End If
next rowInOldCSV
Doc/Vers Subj1 Subj2 BOOLEAN Doc Subj1 Subj1 Boolean
DOC_1/Vers1 FUN GERMANY FALSE - - - -
DOC_2/Vers3 FUN GERMANY TRUE - - - -
DOC_2/Vers3 FUN UK TRUE Deleted - - -
DOC_2/Vers3 FUN FRANCE TRUE - - - -
DOC_3/Vers9 ACTION GERMANY FALSE Updated - - -
DOC_4/Vers4 MOVIE UK TRUE - - - -
DOC_5/Vers5 DANGER UK FALSE New - - -
DOC_6/Vers1 HELP SPAIN FALSE - - - -
DOC_7/Vers2 FUN GERMANY TRUE - - - X
DOC_8/Vers5 ACTION FRANCE TRUE - X - -
Option Explicit
Public Sub CompareCSVs() '1 = Old, 2 = New; UsedRange starts at A1
Const LC1 = 4 'D - LastCol in Old
Const LC2 = 4 'D - LastCol in New
Dim ur1 As Range, arr1 As Variant, dv1 As Object
Dim ur2 As Range, arr2 As Variant, dv2 As Object
Set ur1 = Sheet1.UsedRange 'Or: ThisWorkbook.Worksheets("csv_old").UsedRange
Set ur2 = Sheet2.UsedRange 'Or: ThisWorkbook.Worksheets("csv_new").UsedRange
arr1 = ur1
arr2 = ur2
Set dv1 = CreateObject("Scripting.Dictionary")
Set dv2 = CreateObject("Scripting.Dictionary")
Dim urRes1 As Range, urRes2 As Range, arrRes1 As Variant, arrRes2 As Variant
Set urRes1 = ur1.Offset(1, LC1).Resize(ur1.Rows.Count - 1, LC1 + 1) 'Exclude Headers
Set urRes2 = ur2.Offset(1, LC2).Resize(ur2.Rows.Count - 1, LC2 + 1) 'Exclude Headers
urRes1.ClearContents 'Clear results
urRes2.ClearContents
arrRes1 = urRes1
arrRes2 = urRes2
SetDictionaries dv1, arr1, LC1
SetDictionaries dv2, arr2, LC2: 'ShowAllItems dv1: ShowAllItems dv2
CompareData dv1, dv2, arrRes2
CompareData dv2, dv1, arrRes1
urRes1 = arrRes1
urRes2 = arrRes2
End Sub
Private Sub SetDictionaries(ByRef d As Object, ByRef arr As Variant, ByVal ubC As Long)
Dim r As Long, c As Long, k As String
For r = 2 To UBound(arr)
For c = 1 To ubC
k = k & arr(r, c) & "|"
d(Left(k, Len(k) - 1)) = 0
Next
k = vbNullString
Next
End Sub
Private Sub CompareData(ByRef d1 As Variant, ByRef d2 As Variant, ByRef res As Variant)
Dim r As Long, c As Long, itm As Variant, sp As Variant, k As Variant
r = 1
For Each itm In d2
sp = Split(itm, "|")
c = UBound(sp) + 1
If Not d1.Exists(itm) Then
If Len(res(r, 1)) = 0 Then
res(r, 1) = IIf(c = 1, "Missing: ", "Updated: ")
res(r, c + 1) = sp(c - 1)
Else
If res(r, 1) = "Updated: " Then res(r, c + 1) = sp(c - 1)
End If
End If
If c = 4 Then r = r + 1
Next
End Sub
Private Sub ShowAllItems(ByRef d As Object)
Dim x As Variant
For Each x In d
Debug.Print x 'Space$(5), String$(5, "-")
Next
Debug.Print
End Sub
CSV_old
Document/Version Subj1 Subj2 BOOLEAN
DOC_1/Vers1 FUN GERMANY FALSE <- Correct
DOC_2/Vers3 FUN GERMANY TRUE <- Correct
DOC_2/Vers3 FUN UK TRUE <- to be deleted in CSV_new <- Exists in new
DOC_2/Vers3 FUN FRANCE TRUE <- Correct
DOC_3/Vers7 ACTION GERMANY FALSE <- Version Update in CSV_new <- This not in new
DOC_4/Vers4 MOVIE UK TRUE <- Correct
DOC_6/Vers1 HELP SPAIN FALSE <- Correct
DOC_7/Vers2 FUN GERMANY FALSE <- boolean: true in CSV_new <- FALSE in new
DOC_8/Vers5 FUN FRANCE TRUE <- Subj1: ACTION instead of FUN <- Correct
CSV_new
Document/Version Subj1 Subj2 BOOLEAN
DOC_1/Vers1 FUN GERMANY FALSE <- Correct
DOC_2/Vers3 FUN GERMANY TRUE <- Correct
DOC_2/Vers3 FUN UK TRUE <- Exists in new
DOC_2/Vers3 FUN FRANCE TRUE <- Correct
DOC_3/Vers9 ACTION GERMANY FALSE <- Version Updated <- New record
DOC_4/Vers4 MOVIE UK TRUE <- Correct
DOC_5/Vers5 DANGER UK FALSE <- new/added Row in CSV_new <- Correct
DOC_6/Vers1 HELP SPAIN FALSE <- Correct
DOC_7/Vers2 FUN GERMANY FALSE <- boolean updated to true <- FALSE in new
DOC_8/Vers5 ACTION FRANCE TRUE <- Subj1: ACTION instead of FUN <- Correct