Warning: file_get_contents(/data/phpspider/zhask/data//catemap/5/excel/27.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Excel 使用VBA复制粘贴脚本比较两个CSV文档_Excel_Vba_Csv_Compare - Fatal编程技术网

Excel 使用VBA复制粘贴脚本比较两个CSV文档

Excel 使用VBA复制粘贴脚本比较两个CSV文档,excel,vba,csv,compare,Excel,Vba,Csv,Compare,一个星期以来,我一直试图在我的话题上取得成功,但我找不到任何可以接受的解决办法。我是说。。。我有一个有效的解决方案,但比较只需要大约半天:-S 前提条件: 两个csv文件都已复制粘贴到本地工作簿中。他们在场,准备和他们一起玩。每个文件约有6000行和4列。 A列:文档名称/版本 B栏:主题1 C栏:主题2 D列:布尔人工制品 两个csv文件具有相同的结构。 列A包括documentname及其最新版本。 每行包含以下内容的组合:documentname/version、SubC1、SubC2和b

一个星期以来,我一直试图在我的话题上取得成功,但我找不到任何可以接受的解决办法。我是说。。。我有一个有效的解决方案,但比较只需要大约半天:-S

前提条件: 两个csv文件都已复制粘贴到本地工作簿中。他们在场,准备和他们一起玩。每个文件约有6000行和4列。 A列:文档名称/版本 B栏:主题1 C栏:主题2 D列:布尔人工制品 两个csv文件具有相同的结构。 列A包括documentname及其最新版本。 每行包含以下内容的组合:documentname/version、SubC1、SubC2和boolean

CSV\u新旧示例,包括E列中CSV\u新的注释/更改

Document/Version    Subj1   Subj2   BOOLEAN 
DOC_1/Vers1         FUN     GERMANY FALSE   
DOC_2/Vers3         FUN     GERMANY TRUE    
DOC_2/Vers3         FUN     UK      TRUE    <- to be deleted in CSV_new
DOC_2/Vers3         FUN     FRANCE  TRUE    
DOC_3/Vers7         ACTION  GERMANY FALSE   <- Version Update in CSV_new
DOC_4/Vers4         MOVIE   UK      TRUE    
DOC_6/Vers1         HELP    SPAIN   FALSE   
DOC_7/Vers2         FUN     GERMANY FALSE   <- boolean: true in CSV_new
DOC_8/Vers5         FUN FRANCE  TRUE    <- Subj1: ACTION instead of FUN

非常非常感谢您的努力o)

此代码将生成两组结果:一组用于
Sheet1
(旧),另一组用于
Sheet2
(新)

  • 集合1-
    Sheet1
    将显示Sheet2中缺少的记录
  • 集合2-
    Sheet2
    将显示Sheet1中缺少的记录
  • 两组都将显示更新的记录
它使用嵌套字典(详细信息如下)





词典,当准备好进行比较时

以前

之后

注意:您提供的示例数据与描述中的不同

CSV_old

Document/Version Subj1  Subj2   BOOLEAN 
DOC_1/Vers1      FUN    GERMANY FALSE                                 <- Correct
DOC_2/Vers3      FUN    GERMANY TRUE                                  <- Correct
DOC_2/Vers3      FUN    UK      TRUE  <- to be deleted in CSV_new     <- Exists in new
DOC_2/Vers3      FUN    FRANCE  TRUE                                  <- Correct
DOC_3/Vers7      ACTION GERMANY FALSE <- Version Update in CSV_new    <- This not in new
DOC_4/Vers4      MOVIE  UK      TRUE                                  <- Correct
DOC_6/Vers1      HELP   SPAIN   FALSE                                 <- Correct  
DOC_7/Vers2      FUN    GERMANY FALSE <- boolean: true in CSV_new     <- FALSE in new
DOC_8/Vers5      FUN    FRANCE  TRUE  <- Subj1: ACTION instead of FUN <- Correct

CSV_new

Document/Version Subj1  Subj2   BOOLEAN 
DOC_1/Vers1      FUN    GERMANY FALSE                                 <- Correct
DOC_2/Vers3      FUN    GERMANY TRUE                                  <- Correct  
DOC_2/Vers3      FUN    UK      TRUE                                  <- Exists in new
DOC_2/Vers3      FUN    FRANCE  TRUE                                  <- Correct   
DOC_3/Vers9      ACTION GERMANY FALSE <- Version Updated              <- New record
DOC_4/Vers4      MOVIE  UK      TRUE                                  <- Correct
DOC_5/Vers5      DANGER UK      FALSE <- new/added Row in CSV_new     <- Correct
DOC_6/Vers1      HELP   SPAIN   FALSE                                 <- Correct
DOC_7/Vers2      FUN    GERMANY FALSE <- boolean updated to true      <- FALSE in new
DOC_8/Vers5      ACTION FRANCE  TRUE  <- Subj1: ACTION instead of FUN <- Correct
CSV\u旧
文档/版本SubC1 SubC2布尔值

DOC_1/Vers1 FUN GERMANY FALSE Maaaan谢谢,太好了!:o) 太快了!不幸的是,关于使用“CreateObject”(“Scripting.Dictionary”)“我一点也不懂。结果“Missing:”以相同的方式/含义总结添加和删除的行。是否有机会在新的/添加的行和删除的行之间做出区别?我很高兴这有帮助!但我不确定您所说的在新行和已删除行之间进行区分是什么意思:
Sheet1
将显示
Sheet2
中所有缺失的行-这将显示CSV\u new中所有
已删除的
记录。然后,
Sheet2
将显示
Sheet1
中所有缺失的行-这将显示CSV\u new中的所有
new
记录。您不能在一张表中显示新记录和已删除记录,因为您必须插入丢失的行,以显示丢失的记录(可能我没有正确理解)@stoeven-BTW-您能告诉我现在需要多长时间吗?(旧版总共有多少行,新版总共有多少行?)。我没有测试数千行的正确数据。谢谢亲爱的,你是对的——明白了:哦,所以。。。我会把它作为一个解决方案。现在我有大约5900行在这两个和它需要大约2秒,直到完成。使用这些词典的绝妙方法。但老实说,我需要再次了解Bing/Google的结构,以便自己能更进一步。再次感谢您为本主题所做的努力。欢迎您!如果使用大量数据,您将能够使用
数组
字典
、和
集合
显著提高性能。要记住的关键是尽量减少与范围的交互:将整个工作表读入一个数组:
Dim arr as Variant:arr=Sheet1。使用drange
,以与工作表相同的方式对数组进行更改:
Sheet1。单元格(1,1)=“Test”
,与
arr(1,1)=“Test”
相同,然后将整个阵列放回UsedRange:
Sheet1.UsedRange=arr
Set findSameDocumentNumberInColumnA = Sheets(givenActiveWS).Cells.Find(Sheets("_ws_oldCSV").Range("A" & rowInOldCSV & ":D" & rowInOldCSV).Value, LookIn:=xlValues)
Set findSameDocumentNumberInColumnA_withoutVers = Sheets(givenActiveWS).Cells.Find(Left(Sheets("_ws_oldCSV").Cells(rowInOldCSV, 1).Value, Len(Sheets("_ws_oldCSV").Cells(rowInOldCSV, 1).Value) - 5), LookIn:=xlValues)


If Not findSameDocumentNumberInColumnA Is Nothing Then
    'document/version found!

    firstAddress = findSameDocumentNumberInColumnA.Address
    Do
         'if subj1+subj2 are same
        If (Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 2).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 2).Value) And _
           (Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 3).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 3).Value) Then '....and boolean-value the same

            'Sheets("_ws_oldCSV").Range("A" & rowInOldCSV & ":D" & rowInOldCSV).Copy 'takes even longer
            'Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 6).PasteSpecial Paste:=xlPasteValues
            Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 6).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 1).Value
            Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 7).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 2).Value
            Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 8).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 3).Value
            Sheets(givenActiveWS).Cells(findSameDocumentNumberInColumnA.Row, 9).Value = Sheets("_ws_oldCSV").Cells(rowInOldCSV, 4).Value

            'leave loop
            Exit Do
        End If
        Set findSameDocumentNumberInColumnA = Sheets(givenActiveWS).Cells.FindNext(findSameDocumentNumberInColumnA)
    Loop While Not findSameDocumentNumberInColumnA Is Nothing And findSameDocumentNumberInColumnA.Address <> firstAddress

Else
    'document/version not found
    If Not findSameDocumentNumberInColumnA_withoutVers Is Nothing Then
        'document found, looks like new version
        'mark it with yellow to show updated version
    Else
        'unkown document, means  new introduced since csv_old
        'copy it under last item in RowF
        '
    End If
End If
next rowInOldCSV
Doc/Vers       Subj1    Subj2    BOOLEAN    Doc    Subj1    Subj1    Boolean
DOC_1/Vers1    FUN      GERMANY  FALSE      -       -       -           -
DOC_2/Vers3    FUN      GERMANY  TRUE       -       -       -           -
DOC_2/Vers3    FUN      UK       TRUE      Deleted  -       -           -
DOC_2/Vers3    FUN      FRANCE   TRUE      -        -       -           -
DOC_3/Vers9    ACTION   GERMANY  FALSE     Updated  -       -           -
DOC_4/Vers4    MOVIE    UK       TRUE      -        -       -           -
DOC_5/Vers5    DANGER   UK       FALSE     New      -       -           -
DOC_6/Vers1    HELP     SPAIN    FALSE     -        -       -           -
DOC_7/Vers2    FUN      GERMANY  TRUE      -        -       -           X
DOC_8/Vers5    ACTION   FRANCE   TRUE      -        X       -           -
Option Explicit

Public Sub CompareCSVs()    '1 = Old, 2 = New; UsedRange starts at A1

    Const LC1 = 4           'D - LastCol in Old
    Const LC2 = 4           'D - LastCol in New

    Dim ur1 As Range, arr1 As Variant, dv1 As Object
    Dim ur2 As Range, arr2 As Variant, dv2 As Object

    Set ur1 = Sheet1.UsedRange  'Or: ThisWorkbook.Worksheets("csv_old").UsedRange
    Set ur2 = Sheet2.UsedRange  'Or: ThisWorkbook.Worksheets("csv_new").UsedRange
    arr1 = ur1
    arr2 = ur2
    Set dv1 = CreateObject("Scripting.Dictionary")
    Set dv2 = CreateObject("Scripting.Dictionary")

    Dim urRes1 As Range, urRes2 As Range, arrRes1 As Variant, arrRes2 As Variant

    Set urRes1 = ur1.Offset(1, LC1).Resize(ur1.Rows.Count - 1, LC1 + 1) 'Exclude Headers
    Set urRes2 = ur2.Offset(1, LC2).Resize(ur2.Rows.Count - 1, LC2 + 1) 'Exclude Headers
    urRes1.ClearContents        'Clear results
    urRes2.ClearContents
    arrRes1 = urRes1
    arrRes2 = urRes2

    SetDictionaries dv1, arr1, LC1
    SetDictionaries dv2, arr2, LC2:     'ShowAllItems dv1:   ShowAllItems dv2

    CompareData dv1, dv2, arrRes2
    CompareData dv2, dv1, arrRes1

    urRes1 = arrRes1
    urRes2 = arrRes2
End Sub
Private Sub SetDictionaries(ByRef d As Object, ByRef arr As Variant, ByVal ubC As Long)

    Dim r As Long, c As Long, k As String

    For r = 2 To UBound(arr)
        For c = 1 To ubC
            k = k & arr(r, c) & "|"
            d(Left(k, Len(k) - 1)) = 0
        Next
        k = vbNullString
    Next
End Sub

Private Sub CompareData(ByRef d1 As Variant, ByRef d2 As Variant, ByRef res As Variant)

    Dim r As Long, c As Long, itm As Variant, sp As Variant, k As Variant

    r = 1
    For Each itm In d2
        sp = Split(itm, "|")
        c = UBound(sp) + 1
        If Not d1.Exists(itm) Then
            If Len(res(r, 1)) = 0 Then
                res(r, 1) = IIf(c = 1, "Missing: ", "Updated: ")
                res(r, c + 1) = sp(c - 1)
            Else
                If res(r, 1) = "Updated: " Then res(r, c + 1) = sp(c - 1)
            End If
        End If
        If c = 4 Then r = r + 1
    Next
End Sub
Private Sub ShowAllItems(ByRef d As Object)

    Dim x As Variant

    For Each x In d
        Debug.Print x   'Space$(5), String$(5, "-")
    Next
    Debug.Print
End Sub
CSV_old

Document/Version Subj1  Subj2   BOOLEAN 
DOC_1/Vers1      FUN    GERMANY FALSE                                 <- Correct
DOC_2/Vers3      FUN    GERMANY TRUE                                  <- Correct
DOC_2/Vers3      FUN    UK      TRUE  <- to be deleted in CSV_new     <- Exists in new
DOC_2/Vers3      FUN    FRANCE  TRUE                                  <- Correct
DOC_3/Vers7      ACTION GERMANY FALSE <- Version Update in CSV_new    <- This not in new
DOC_4/Vers4      MOVIE  UK      TRUE                                  <- Correct
DOC_6/Vers1      HELP   SPAIN   FALSE                                 <- Correct  
DOC_7/Vers2      FUN    GERMANY FALSE <- boolean: true in CSV_new     <- FALSE in new
DOC_8/Vers5      FUN    FRANCE  TRUE  <- Subj1: ACTION instead of FUN <- Correct

CSV_new

Document/Version Subj1  Subj2   BOOLEAN 
DOC_1/Vers1      FUN    GERMANY FALSE                                 <- Correct
DOC_2/Vers3      FUN    GERMANY TRUE                                  <- Correct  
DOC_2/Vers3      FUN    UK      TRUE                                  <- Exists in new
DOC_2/Vers3      FUN    FRANCE  TRUE                                  <- Correct   
DOC_3/Vers9      ACTION GERMANY FALSE <- Version Updated              <- New record
DOC_4/Vers4      MOVIE  UK      TRUE                                  <- Correct
DOC_5/Vers5      DANGER UK      FALSE <- new/added Row in CSV_new     <- Correct
DOC_6/Vers1      HELP   SPAIN   FALSE                                 <- Correct
DOC_7/Vers2      FUN    GERMANY FALSE <- boolean updated to true      <- FALSE in new
DOC_8/Vers5      ACTION FRANCE  TRUE  <- Subj1: ACTION instead of FUN <- Correct