Vba 将下载的数据解析为更简单的结构
每个月我都会从我们的一家供应商那里下载数据,虽然数据量很小,但其格式不便于使用查找公式。然后我读了一大堆单元格参考资料,希望他们找对了地方。在下图中,什么是读取数据和构造数据的最佳方式。我需要在一个月内阅读A:G列,下个月将是A:H,但最多只有12个月,然后在我的报告中构建它,如I2:K10所示 “位置”可能没有从供应商处下载的数据。所以位置在改变。此外,我还需要从中下载大约30个小数据范围,以便将它们合并到一个更大的报告中。此外,数据将粘贴到其自己的工作表上,而拉取的数据将粘贴到另一个工作表上。 我愿意接受VBA的建议以及单元格公式 不同的颜色显示了我想读的内容和我需要写的地方 谢谢 -沙伯尔Vba 将下载的数据解析为更简单的结构,vba,excel,excel-formula,Vba,Excel,Excel Formula,每个月我都会从我们的一家供应商那里下载数据,虽然数据量很小,但其格式不便于使用查找公式。然后我读了一大堆单元格参考资料,希望他们找对了地方。在下图中,什么是读取数据和构造数据的最佳方式。我需要在一个月内阅读A:G列,下个月将是A:H,但最多只有12个月,然后在我的报告中构建它,如I2:K10所示 “位置”可能没有从供应商处下载的数据。所以位置在改变。此外,我还需要从中下载大约30个小数据范围,以便将它们合并到一个更大的报告中。此外,数据将粘贴到其自己的工作表上,而拉取的数据将粘贴到另一个工作表上
这是答案的第1部分,介绍了解决方案所需的技术,但VBA新手可能不熟悉这些技术。解决方案的主例程在中,其子例程在中 这个问题没有完全描述这个问题。第一步是从远程站点下载40个CSV文件。直到稍后,才会尝试自动化该步骤。第二步是识别已下载到包含工作簿(数据将写入其中)的文件夹中的CSV文件 创建新的Excel工作簿,打开Visual Basic编辑器,创建模块并将此代码复制到该模块。宏
Demo01
列出与工作簿位于同一文件夹中的文件名
' Option Explicit means every variable must be defined
' If omitted a misspelt variable become a declaration. For example:
' Dim Count As Long
' Cuont = Count + 1
' declares a new variable Cuont and sets its value to Count+1. Such
' errors can be very difficult to spot. With Option Explicit, the
' compiler reports that Cuont is undefined.
Option Explicit
Sub Demo01()
Dim Fl As Object
Dim FlSysObj As Object
Dim FldObj As Object
' When assigning a value to an object, you must have "Set"
' at the beginning of the statement
' This creates a file system object which gives you access to the file system
Set FlSysObj = CreateObject("Scripting.FileSystemObject")
' This creates a folder object which gives access to all properties of the folder
' includes details of the files within it.
Set FldObj = FlSysObj.GetFolder(Application.ActiveWorkbook.Path)
' Loop for each file in the folder and output its name to the Immediate Window
For Each Fl In FldObj.Files
Debug.Print Fl.Name
Next
End Sub
我们现在需要忽略非CSV文件并打开CSV文件,以便访问其内容
将以下宏Demo02
复制并粘贴到与Demo01
相同的模块中,然后运行它。宏Demo02
打开每个CSV文件,并将识别信息输出到即时窗口,以证明它已这样做
Sub Demo02()
Dim Fl As Object
Dim FlSysObj As Object
Dim FldObj As Object
Dim PathName As String
Dim WkBkSrc As Workbook
PathName = Application.ThisWorkbook.Path
Set FlSysObj = CreateObject("Scripting.FileSystemObject")
Set FldObj = FlSysObj.GetFolder(PathName)
For Each Fl In FldObj.Files
' I only want to load the CSV files so check the extension
If LCase(Right(Fl.Name, 4)) = ".csv" Then
' There should be some error handling here to ensure the macro does not
' stop if a file fails to open. However, Excel's ability to open any old
' junk as a workbook never ceases to amaze me so for the sake of
' simplicity I have omitted it. If you do experience errors, consider
' something like this:
' Err.Clear ' Clear any record of previous error
' On Error Resume Next ' Continue if error
' Set WkBkSrc = Workbooks.Open(PathName & "\" & Fl.Name)
' On Error GoTo 0 ' Revert to normal error processing
' If Err.Number = 0 Then
' Debug.Print Fl.Name & " loaded successfully"
' Else
' Debug.Print Fl.Name & " failed to load."
' Debug.Print "Error Number = "; Err.Number & _
' " Description = " & Err.Description
' End If
Set WkBkSrc = Workbooks.Open(PathName & "\" & Fl.Name)
' The CSV file will be in the active worksheet of the active workbook
' If I understand your screen shot, columns 1 of rows 2 and 3 are the
' hospital name and the date range. The sole purpose of this debug
' print statement is to output something unique from each CSV file
' to prove each has been loaded. Change this statement as necessary
' if I have misunderstood the arrangement of the CSV files.
Debug.Print Fl.Name & ": " & Cells(2, 1).Value & " " & Cells(3, 1).Value
WkBkSrc.Close ' Close the CSV file
' The original workbook is again the active workbook.
End If
Next
End Sub
成功地将CSV文件读入内存后,下一步似乎是找到并提取准备传输到目标的所需数据。然而,还有一个优先步骤
在定位和提取数据之前,必须检查文件是否与您认为的相同。当假定为数字的值实际上是字符串时,试图处理错误的文件可能会导致崩溃。如果其他人看到这一点,这将是令人尴尬的,但不会是一场灾难。你必须检查的是作者忘记告诉你的格式的细微变化。导致提取错误数据的额外列或序列更改可能会在数月内被忽略,并且很难纠正。存档所有中间工作簿和所有CSV文件可能允许您重新编码和重复更新,以创建正确的当前工作簿,但您能否撤消基于错误数据的任何决策
所有这些检查、定位、提取和存储都需要考虑如何最好地访问数据。您可以直接在单个单元格或区域级别访问加载的工作表中的数据。然而,这可能是缓慢的。您正在处理的卷可能意味着这并不太重要。然而,即使性能不是问题,从一个工作表中移动数据也会变得混乱。我已经决定引进更先进的技术,因为如果不是这样的话,未来的项目将很有用
变体可以保存任何内容,并且其内容的性质可以更改。如果VBA不太合理,则以下内容有效:
Dim MyValue as Variant
MyValue = 5
MyValue = "String"
MyValue = Array("abc", 10, "def", 12)
更有趣的是,您可以执行以下操作:
MyValue = Range1.Value
这将把MyValue设置为一个2D数组,大小刚好足以容纳Range1中的每个值,并将这些值复制到MyValue。从内存中访问单元格值比从工作表中访问要快得多。相反的情况也可能发生:
Range2.Value = DestValue
这意味着,您可以构建希望保存在内存中的数据,然后在一条语句中将其写入所需的范围
二维阵列的传统做法是将列作为第一维,将行作为第二维。但是,从一个范围创建的数组则相反,第一个维度是行。这可能看起来很奇怪,但与访问工作表的语法相匹配:
Cells(Row, Col).Value = 5 ' worksheet cell
MyValue(Row, Col) = 5 ' array cell
将一个工作表加载到一个变量很容易,也很方便,但我确实需要在处理之前将所有CSV文件加载到内存中
Dim CellValueSrc() As Variant ' Define an array of variants
ReDim CellValue(1 to CountCSVFile) ' Define the size of CellValueSrc
' UsedRange is a property of a worksheet so this loads the used part of
' the active worksheet (the CSV file) to CellValue(InxFile)
CellValue(InxFile) = WkBkSrc.ActiveSheet.UsedRange
以上语句取自宏Demo03
。我没有将工作表加载到变量,而是定义了一个变量数组,然后将每个工作表加载到不同的元素
这被称为锯齿阵列。并不是所有的语言都有这种功能,因此很难理解这个想法。我设计了宏Demo03
来演示它们的使用。我首先将所有工作表加载到锯齿数组的元素中,然后将单元格值复制到新数组中,并将该数组加载到新工作表中。运行Demo03
查看它的功能,然后通过代码查看它是如何实现此效果的警告:宏覆盖工作表“Sheet1”。宏底部附近的注释告诉您如果无法接受,应更改什么
Sub Demo03()
Dim CellValueDest() As Variant
Dim CellValueSrc() As Variant
Dim ColCrntSrc As Long
Dim CountCell As Long
Dim CountCSVFile As Long
Dim Fl As Object
Dim FlName() As String
Dim FlSysObj As Object
Dim FldObj As Object
Dim InxFile As Long
Dim PathName As String
Dim RowCrntDest As Long
Dim RowCrntSrc As Long
Dim WkBkSrc As Workbook
PathName = Application.ThisWorkbook.Path
Set FlSysObj = CreateObject("Scripting.FileSystemObject")
Set FldObj = FlSysObj.GetFolder(PathName)
CountCSVFile = 0
' Loop through files to count number of CSv files
For Each Fl In FldObj.Files
If LCase(Right(Fl.Name, 4)) = ".csv" Then
CountCSVFile = CountCSVFile + 1
End If
Next
' It is possible to use ReDim Preserve to enlarge an array.
' However, there is a lot of work behind a ReDim Preserve so
' I avoid them if I can.
' You can omit the lower bound but that means the lower bound depends of
' the Option Base statement. I prefer to be explicit. I also prefer
' lower bounds of 1 for most purposes. Many do not agree and most languages
' do not give the programmer a choice. My code so my choice.
ReDim CellValueSrc(1 To CountCSVFile)
ReDim FlName(1 To CountCSVFile)
InxFile = 0
CountCell = 0
' Loop through files to save names and cell values.
' Count number of cells at same time
For Each Fl In FldObj.Files
If LCase(Right(Fl.Name, 4)) = ".csv" Then
InxFile = InxFile + 1
FlName(InxFile) = Fl.Name
Set WkBkSrc = Workbooks.Open(PathName & "\" & Fl.Name)
CellValueSrc(InxFile) = WkBkSrc.ActiveSheet.UsedRange
If IsEmpty(CellValueSrc(InxFile)) Then
' The worksheet is empty
' Count as one cell
CountCell = CountCell + 1
Else
' UBound(A,N) returns the upper bound of the Nth dimension of array A.
' An array loaded from a worksheet will always have lower bounds of 1.
CountCell = CountCell + UBound(CellValueSrc(InxFile), 1) * _
UBound(CellValueSrc(InxFile), 2)
End If
WkBkSrc.Close ' Close the CSV file
End If
Next
' Release resources
Set FlSysObj = Nothing
Set FldObj = Nothing
' Prepare to create an output worksheet containing all the data loaded
ReDim CellValueDest(1 To CountCell + 1, 1 To 4)
CellValueDest(1, 1) = "File"
CellValueDest(1, 2) = "Row"
CellValueDest(1, 3) = "Column"
CellValueDest(1, 4) = "Value"
RowCrntDest = 1
For InxFile = 1 To UBound(FlName)
If IsEmpty(CellValueSrc(InxFile)) Then
RowCrntDest = RowCrntDest + 1
CellValueDest(RowCrntDest, 1) = FlName(InxFile)
CellValueDest(RowCrntDest, 4) = "Empty CSV file"
Else
For RowCrntSrc = 1 To UBound(CellValueSrc(InxFile), 1)
For ColCrntSrc = 1 To UBound(CellValueSrc(InxFile), 2)
RowCrntDest = RowCrntDest + 1
CellValueDest(RowCrntDest, 1) = FlName(InxFile)
CellValueDest(RowCrntDest, 2) = RowCrntSrc
CellValueDest(RowCrntDest, 3) = ColCrntSrc
' Note the syntax for accessing cell value.
' CellValueSrc is a 1D array so CellValueSrc(InxFile) accessing
' an element within it. CellValueSrc(InxFile) is a 2D array so
' CellValueSrc(InxFile)(RowCrntSrc, ColCrntSrc) accessing an element
' within it.
CellValueDest(RowCrntDest, 4) = _
CellValueSrc(InxFile)(RowCrntSrc, ColCrntSrc)
Next
Next
End If
Next
' #### This assumes that the workbook contains a worksheet "Sheet1" and that
' #### I can overwrite that worksheet. Change as necessary.
With Worksheets("Sheet1")
.Cells.EntireRow.Delete ' Delete any existing data
' Note that you have to specify the size of the output range.
' If the output range is not the same size as the array, the array will
' be truncated or repeated.
.Range(.Cells(1, 1), .Cells(CountCell + 1, 4)).Value = CellValueDest
.Columns(4).AutoFit
End With
End Sub
这是答案的第2部分,介绍解决方案并包含主要例程。包含子例程。介绍我在解决方案中使用的技术 我的解决方案要求宏的工作簿包含两个工作表:一个用于错误,另一个用于合并数据。这些工作簿的名称定义为常量,因此可以根据需要进行更改 我创建了许多CSV文件,我相信这些文件的格式与您下载的文件相匹配。一个典型的例子是:
1 Caution: Rates Have Not Been Adjusted For Patient Mix
2 St Anthony's Hospital
3 Jan 2013 - April 2013 Location Comparison Based on 6 Locations
4 CMS Qualified HCAHPS Data from All Service Lines
5 Communications about Medications Composite Results
6 Location,Jan 2013,Feb 2013,Mar 2013,Apr 2013,Composite Rate,Percentile
7 2E,70,72.22,64.62,81.82,72.17,49th
8 2S,60,62.22,54.62,71.82,62.17,39th
9 3N,78.57,83.33,66.67,NR,76.19,74th
10 3S,50,90,50,100,72.5,56th
11 4N,88.89,75,77.27,100,85.29,85th
12 ICU/PCU,72.73,50,80,100,75.68,54th
13
14 St Anthony's Hospital,73.5,73.28,67.89,84.21,74.72,59th
15 Vendor DB % Top Box,72.29,72.86,73.58,75.17,73.48
医院的名字是真实的,尽管是虚构的
Option Explicit
' Constants are a convenient way of defining values that will not change
' during a run of the macro. They are particular suitable for:
' (1) Replacing numbers by meaningful name. If column 5 is used for
' names, say, using ColName instead of 5 helps document the macro.
' (2) Values that are used in several places and might change. When they
' do change, one amendment is sufficient to fully update the macro.
Const ColConsolHosp As Long = 1 '\
Const ColConsolLocn As Long = 2 '| If the columns of the consolidate
Const ColConsolQuestFirst As Long = 3 '| worksheet are rearranged, these
Const ColConsolQuestLast As Long = 12 '/ valuesmust be ajusted to match.
Const ColErrorTime As Long = 1
Const ColErrorFile As Long = 2
Const ColErrorRow As Long = 3
Const ColErrorCol As Long = 4
Const ColErrorMsg As Long = 5
Const FmtDate As String = "dmmmyy"
Const FmtDateTime As String = "dmmmyy hh:mm"
Const WkShtNameConsol As String = "Consolidate" '\ Change if require output to
Const WkShtNameError As String = "Error" '/ different worksheets.
Sub Consolidate()
Dim CellValueConsol() As Variant ' Cell values from used range
' of consoldate worksheet
Dim ColSrcCompositeRate As Long ' Column hold composite rate
Dim ColConsolCrnt As Long
Dim DateStartAll As Date
Dim DateStartCrnt As Date
Dim DateEndAll As Date
Dim DateEndCrnt As Date
Dim ErrMsg As String
Dim FileCellValueSrc() As Variant ' Value of UsedRange for each CSV file
Dim FileError() As Boolean ' Error state for each file
Dim FileInxHosp() As Long ' Hospital for each CSV file
Dim FileInxQuest() As Long ' Question for each CSV file
Dim FileName() As String ' Name for each CSV file
Dim FileSysObj As Object
Dim FileObj As Object
Dim FolderObj As Object
Dim Found As Boolean
Dim HospName() As Variant ' Names of hospitals
Dim HospNameCrnt As String
Dim InxFileCrnt As Long
Dim InxFileDate As Long
Dim InxHospCrnt As Long
Dim InxLocnCrnt As Long
Dim InxQuestCrnt As Long
Dim Locn() As Variant ' Locations for each hosital
Dim NumCSVFile As Long ' Number of CSV files
Dim NumHosps As Long
Dim NumMonthsData As Long
Dim PathName As String
Dim Quest As Variant ' Array of questions
Dim RowConsolCrnt As Long
Dim RowConsolHospFirst() As Long ' First row for each hospital
' within consolidate worksheet
Dim RowConsolTemp As Long
Dim RowErrorCrnt As Long
Dim RowSrcCrnt As Long
Dim WkBkSrc As Workbook
Application.ScreenUpdating = False ' Reduces screen flash and increases speed
' Load CSV files
' ==============
PathName = Application.ThisWorkbook.Path
Set FileSysObj = CreateObject("Scripting.FileSystemObject")
Set FolderObj = FileSysObj.GetFolder(PathName)
NumCSVFile = 0
' Loop through files to count number of CSV files
For Each FileObj In FolderObj.Files
If LCase(Right(FileObj.Name, 4)) = ".csv" Then
NumCSVFile = NumCSVFile + 1
End If
Next
' Size arrays holding data per file
ReDim FileCellValueSrc(1 To NumCSVFile)
ReDim FileError(1 To NumCSVFile)
ReDim FileInxHosp(1 To NumCSVFile)
ReDim FileInxQuest(1 To NumCSVFile)
ReDim FileName(1 To NumCSVFile)
InxFileCrnt = 0
' Loop through files to save names and cell values.
For Each FileObj In FolderObj.Files
If LCase(Right(FileObj.Name, 4)) = ".csv" Then
InxFileCrnt = InxFileCrnt + 1
FileName(InxFileCrnt) = FileObj.Name
Set WkBkSrc = Workbooks.Open(PathName & "\" & FileObj.Name)
FileCellValueSrc(InxFileCrnt) = WkBkSrc.ActiveSheet.UsedRange
WkBkSrc.Close ' Close the CSV file
End If
Next
' Release resources
Set FileSysObj = Nothing
Set FolderObj = Nothing
' Extract controlling values from consolidate worksheet
' =====================================================
With Worksheets(WkShtNameConsol)
CellValueConsol = .UsedRange.Value
End With
'Debug.Print UBound(CellValueConsol, 1)
'Debug.Print UBound(CellValueConsol, 2)
' This code assumes a single header row consisting of:
' Hospital Location Question1 Question2 ...
' with appropriate names in the first two columns. The cells under the
' questions will all be overwritten.
' These columns are accessed using constants. Limited variation could
' be achieved within amending the code by changing constants.
' Execution will stop at a Debug.assert statement if the expression has a
' value of False. This is an easy way of confirming the worksheet is as
' expected. If a user might change the format of the output worksheet,
' this should be replaced by a MsgBox statement.
Debug.Assert CellValueConsol(1, ColConsolHosp) = "Hospital"
Debug.Assert CellValueConsol(1, ColConsolLocn) = "Location"
' Count number of hospitals.
' This code assumes all locations for a hospital are together and start at
' row 2. The hospital name may be repeated or may be blank on the second and
' subsequent rows for a hospital. That is, the following is acceptable:
' HospitalA X
' HospitalA Y
' HospitalA Z
' HospitalB X
' Y
' Z
' Count number of hospitals
HospNameCrnt = CellValueConsol(2, ColConsolHosp)
NumHosps = 1
For RowConsolCrnt = 3 To UBound(CellValueConsol, 1)
If CellValueConsol(RowConsolCrnt, ColConsolHosp) <> HospNameCrnt And _
CellValueConsol(RowConsolCrnt, ColConsolHosp) <> "" Then
NumHosps = NumHosps + 1
HospNameCrnt = CellValueConsol(RowConsolCrnt, ColConsolHosp)
End If
Next
'Debug.Print NumHosps
' Size HospName, Locn and RowConsolHospFirst for the number of hospitals
ReDim HospName(1 To NumHosps)
ReDim Locn(1 To NumHosps)
ReDim RowConsolHospFirst(1 To NumHosps)
' Load Hospital and Location arrays
InxHospCrnt = 1
HospNameCrnt = CellValueConsol(2, ColConsolHosp)
HospName(InxHospCrnt) = HospNameCrnt
RowConsolHospFirst(InxHospCrnt) = 2
For RowConsolCrnt = 3 To UBound(CellValueConsol, 1)
If CellValueConsol(RowConsolCrnt, ColConsolHosp) <> HospNameCrnt And _
CellValueConsol(RowConsolCrnt, ColConsolHosp) <> "" Then
' Load locations from worksheet to Location array
Call ExtractSubArray(CellValueConsol, Locn(InxHospCrnt), _
RowConsolHospFirst(InxHospCrnt), ColConsolLocn, _
RowConsolCrnt - 1, ColConsolLocn)
HospNameCrnt = CellValueConsol(RowConsolCrnt, ColConsolHosp)
InxHospCrnt = InxHospCrnt + 1
HospName(InxHospCrnt) = HospNameCrnt
RowConsolHospFirst(InxHospCrnt) = RowConsolCrnt
End If
Next
' Load locations for final hospital from worksheet to Location array
Call ExtractSubArray(CellValueConsol, Locn(InxHospCrnt), _
RowConsolHospFirst(InxHospCrnt), ColConsolLocn, _
UBound(CellValueConsol, 1), ColConsolLocn)
' Load questions
Call ExtractSubArray(CellValueConsol, Quest, _
1, ColConsolQuestFirst, _
1, ColConsolQuestLast)
' Clear data area of Consolidate worksheet
' =======================================
For RowConsolCrnt = 2 To UBound(CellValueConsol, 1)
For ColConsolCrnt = ColConsolQuestFirst To ColConsolQuestLast
CellValueConsol(RowConsolCrnt, ColConsolCrnt) = ""
Next
Next
' Prepare error worksheet
'========================
With Worksheets(WkShtNameError)
.Cells.EntireRow.Delete
.Cells(1, ColErrorTime).Value = "Time"
With .Cells(1, ColErrorFile)
.Value = "File"
.ColumnWidth = 71.71
End With
With .Cells(1, ColErrorRow)
.Value = "Row"
.HorizontalAlignment = xlRight
.ColumnWidth = 4
End With
With .Cells(1, ColErrorCol)
.Value = "Col"
.HorizontalAlignment = xlRight
.ColumnWidth = 4
End With
With .Cells(1, ColErrorMsg)
.Value = "Error"
.ColumnWidth = 71.71
End With
End With
RowErrorCrnt = 1
' Validate the CSV files and extract key information
' ==================================================
InxFileDate = -1 'Date range not yet found
NumMonthsData = 0
For InxFileCrnt = 1 To UBound(FileName)
FileError(InxFileCrnt) = False ' No error found for this file
If IsEmpty(FileCellValueSrc(InxFileCrnt)) Then
' The CSV file was empty
Call RecordError(FileName(InxFileCrnt), 0, 0, _
"Empty CSV file", RowErrorCrnt)
FileError(InxFileCrnt) = True ' This CSV file to be ignored
ElseIf VarType(FileCellValueSrc(InxFileCrnt)) = vbString Then
' The CSV file contained a single value
Call RecordError(FileName(InxFileCrnt), 0, 0, _
"CSV file contains a single string", RowErrorCrnt)
FileError(InxFileCrnt) = True ' This CSV file to be ignored
Else
' The only remaining format that could be returned from a range
' is an array
' Check that cells contain the values expected.
' Most checking code has been placed in subroutines. This keeps the code
' in the main routine clean and simple and allows the subroutines to be
' copied easily to new workbooks with macros performing similar tasks.
' Check Cell A1 = "Caution: Rates Have Not Been Adjusted For Patient Mix"
Call CheckCellValueSingle(FileName(InxFileCrnt), _
FileCellValueSrc(InxFileCrnt), FileError(InxFileCrnt), _
1, 1, _
"Caution: Rates Have Not Been Adjusted For Patient Mix", _
RowErrorCrnt)
' Check Cell A2 is a known hospital. Save InxHosp against file
Call CheckCellValueMultiple(FileName(InxFileCrnt), _
FileCellValueSrc(InxFileCrnt), _
FileError(InxFileCrnt), 2, 1, HospName, _
FileInxHosp(InxFileCrnt), RowErrorCrnt)
' Check Cell A3 is: Date - Date Location Comparison Based on N Locations
Call CheckDateRangeLocn(FileName(InxFileCrnt), _
FileCellValueSrc(InxFileCrnt), _
FileError(InxFileCrnt), 3, 1, _
DateStartCrnt, DateEndCrnt, RowErrorCrnt)
' Save DateStartCrnt and DatEndCrnt or check they are the same as the
' previously saved values
If InxFileDate = -1 Then
' First set of dates
DateStartAll = DateStartCrnt
DateEndAll = DateEndCrnt
InxFileDate = InxFileCrnt ' The first file found with these dates
Else
If DateStartAll = DateStartCrnt And DateEndAll = DateEndCrnt Then
' The date range for this CSV file matches those of previous files
Else
Call RecordError(FileName(InxFileCrnt), 3, 1, _
"**FATAL ERROR**: Date ranges do not match:" & vbLf & _
Format(DateStartAll, FmtDate) & " - " & _
Format(DateEndAll, FmtDate) & " " & _
FileName(InxFileDate) & vbLf & _
Format(DateStartCrnt, FmtDate) & " - " & _
Format(DateEndCrnt, FmtDate) & " " & _
FileName(InxFileCrnt), RowErrorCrnt)
' There are incompatible CSV files. This is a fatal error. Give up.
Exit Sub
End If
End If
' Check Cell A4 = "CMS Qualified HCAHPS Data from All Service Lines"
Call CheckCellValueSingle(FileName(InxFileCrnt), _
FileCellValueSrc(InxFileCrnt), _
FileError(InxFileCrnt), 4, 1, _
"CMS Qualified HCAHPS Data from All Service Lines", _
RowErrorCrnt)
' Check Cell A5 = Question " Composite Results"
If Not CheckBound(FileCellValueSrc(InxFileCrnt), 5, 1, ErrMsg) Then
Call RecordError(FileName(InxFileCrnt), 5, 1, ErrMsg, RowErrorCrnt)
FileError(InxFileCrnt) = True
Else
FileInxQuest(InxFileCrnt) = -1 ' No match against question
For InxQuestCrnt = 1 To UBound(Quest)
If FileCellValueSrc(InxFileCrnt)(5, 1) = _
Quest(InxQuestCrnt) & " Composite Results" Then
FileInxQuest(InxFileCrnt) = InxQuestCrnt
Exit For
End If
Next
If FileInxQuest(InxFileCrnt) = -1 Then
' No match found
FileError(InxFileCrnt) = True
Call RecordError(FileName(InxFileCrnt), 5, 1, """" & _
FileCellValueSrc(InxFileCrnt)(5, 1) & _
""" does not match a known question", RowErrorCrnt)
End If
End If
' Check cell A6 is: "Location"
Call CheckCellValueSingle(FileName(InxFileCrnt), _
FileCellValueSrc(InxFileCrnt), _
FileError(InxFileCrnt), 6, 1, "Location", _
RowErrorCrnt)
' Check cells B6 to X6 are the 1st day of month
' from DateStartAll to DateEndAll
Call CheckDateSequence(FileName(InxFileCrnt), _
FileCellValueSrc(InxFileCrnt), _
FileError(InxFileCrnt), 6, 2, DateStartAll, _
DateEndAll, "a", "m", RowErrorCrnt)
' Check cells Y6 is "Composite Rate"
If Not FileError(InxFileCrnt) Then
' The data range is not guaranteed until the file is error free
NumMonthsData = DateDiff("m", DateStartAll, DateEndAll) + 1
ColSrcCompositeRate = NumMonthsData + 2
Call CheckCellValueSingle(FileName(InxFileCrnt), _
FileCellValueSrc(InxFileCrnt), _
FileError(InxFileCrnt), 6, ColSrcCompositeRate, _
"Composite Rate", RowErrorCrnt)
End If
If Not FileError(InxFileCrnt) Then
' For row 7 down to the first empty column A, check column A contains
' a known location and ColSrcCompositeRate is numeric.
RowSrcCrnt = 7
InxHospCrnt = FileInxHosp(InxFileCrnt)
Do While True
If Not CheckBound(FileCellValueSrc(InxFileCrnt), _
RowSrcCrnt, 1, ErrMsg) Then
' Row not present
Call RecordError(FileName(InxFileCrnt), RowSrcCrnt, 1, _
ErrMsg, RowErrorCrnt)
FileError(InxFileCrnt) = True
Exit Do
End If
If Not CheckBound(FileCellValueSrc(InxFileCrnt), _
RowSrcCrnt, ColSrcCompositeRate, ErrMsg) Then
' Composite rate missing
Call RecordError(FileName(InxFileCrnt), RowSrcCrnt, _
ColSrcCompositeRate, ErrMsg, RowErrorCrnt)
FileError(InxFileCrnt) = True
Exit Do
ElseIf Not IsNumeric(FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, _
ColSrcCompositeRate)) Then
' Composite rate is not numeric
Call RecordError(FileName(InxFileCrnt), RowSrcCrnt, _
ColSrcCompositeRate, "Composite rate """ & _
FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, _
ColSrcCompositeRate) & """ is not numeric", _
RowErrorCrnt)
End If
If FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, 1) = "" Then
' End of location list within file
Exit Do
End If
Found = False
For InxLocnCrnt = 1 To UBound(Locn(InxHospCrnt))
If FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, 1) = _
Locn(InxHospCrnt)(InxLocnCrnt) Then
' Location from CSV file found in list from consolidate worksheet
Found = True
Exit For
End If
Next
If Not Found Then
Call RecordError(FileName(InxFileCrnt), RowSrcCrnt, 1, _
"Location """ & _
FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, 1) & _
""" not found in list from worksheet """ & _
WkShtNameConsol & """", RowErrorCrnt)
FileError(InxFileCrnt) = True
End If
RowSrcCrnt = RowSrcCrnt + 1
Loop
End If
If Not FileError(InxFileCrnt) Then
' Row RowSrcCrnt will have a blank column 1
RowSrcCrnt = RowSrcCrnt + 1
' Check column A is the total line for the hospital
Call CheckCellValueSingle(FileName(InxFileCrnt), _
FileCellValueSrc(InxFileCrnt), _
FileError(InxFileCrnt), RowSrcCrnt, 1, _
HospName(FileInxHosp(InxFileCrnt)), _
RowErrorCrnt)
If Not CheckBound(FileCellValueSrc(InxFileCrnt), _
RowSrcCrnt, ColSrcCompositeRate, ErrMsg) Then
' Composite rate missing
Call RecordError(FileName(InxFileCrnt), RowSrcCrnt, _
ColSrcCompositeRate, ErrMsg, RowErrorCrnt)
FileError(InxFileCrnt) = True
ElseIf Not IsNumeric(FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, _
ColSrcCompositeRate)) Then
' Composite rate is not numeric
Call RecordError(FileName(InxFileCrnt), RowSrcCrnt, _
ColSrcCompositeRate, "Composite rate """ & _
FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, _
ColSrcCompositeRate) & """ is not numeric", _
RowErrorCrnt)
End If
End If
End If
Next InxFileCrnt
' If get here there has not been a fatal error although one or more
' individual files may have been rejected.
For InxFileCrnt = 1 To UBound(FileName)
If Not FileError(InxFileCrnt) Then
' No error has been found in this file
InxHospCrnt = FileInxHosp(InxFileCrnt)
InxQuestCrnt = FileInxQuest(InxFileCrnt)
ColConsolCrnt = 2 + InxQuestCrnt
RowSrcCrnt = 7 ' First location row
Do While True
If FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, 1) = "" Then
' End of location list within file
Exit Do
End If
For InxLocnCrnt = 1 To UBound(Locn(InxHospCrnt))
If FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, 1) = _
Locn(InxHospCrnt)(InxLocnCrnt) Then
' Location from CSV file found in list from consolidate worksheet
RowConsolCrnt = RowConsolHospFirst(InxHospCrnt) + InxLocnCrnt - 1
CellValueConsol(RowConsolCrnt, ColConsolCrnt) = _
FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, ColSrcCompositeRate)
Exit For
End If
Next
RowSrcCrnt = RowSrcCrnt + 1
Loop
RowSrcCrnt = RowSrcCrnt + 1 ' Advance to hospital total line
' Assume last location row is for total
RowConsolCrnt = RowConsolHospFirst(InxHospCrnt) + _
UBound(Locn(InxHospCrnt)) - 1
CellValueConsol(RowConsolCrnt, ColConsolCrnt) = _
FileCellValueSrc(InxFileCrnt)(RowSrcCrnt, ColSrcCompositeRate)
End If
Next
' Write new values back to consolidate worksheet
' ==============================================
With Worksheets(WkShtNameConsol)
.UsedRange.Value = CellValueConsol
End With
End Sub
Function CheckBound(ByRef CellValue As Variant, _
ByVal RowFile As Long, ByVal ColFile As Long, _
ByRef Msg As String)
' Return True if CellValue(RowFile, ColFile) exists
If RowFile > UBound(CellValue, 1) Then
' Row not present in file
CheckBound = False
Msg = "No such row within file"
Exit Function
End If
If ColFile > UBound(CellValue, 2) Then
' Column not present in file
CheckBound = False
Msg = "No such column within file"
Exit Function
End If
CheckBound = True
End Function
Sub CheckCellValueMultiple(ByRef FileNameCrnt As String, _
ByRef CellValue As Variant, _
ByRef CellError As Boolean, _
ByVal RowFile As Long, ByVal ColFile As Long, _
ByRef ValueReq() As Variant, _
ByRef InxValue As Long, _
ByRef RowErrorCrnt As Long)
' Check that a specified cell of a CSV file has one of a number of permitted
' values.
' Set CellError is True if the cell does not have any of the permitted
' required value.
' CellError is unchanged if the cell does have the required value. This means
' that several calls can be made to perform different checks and any failure
' will result in CellValue ending with a value of True.
' FileNameCrnt The name of the current file. Used in error message if any.
' CellValue The array of cell contents from the current file.
' CellError Set to True if an error is found.
' RowFile The row to be checked
' ColFile The column to be checked
' ValueReq An array containing all permitted values for the cell.
' InxValue If the cell value is matched against one of the permitted
' values, the index into ValueReq of that permitted value.
' RowErrorCrnt The last used row of the error worksheet. Any error message
' will be written to the next row.
Dim CellValueCrnt As Variant
Dim ErrMsg As String
If Not CheckBound(CellValue, RowFile, ColFile, ErrMsg) Then
Call RecordError(FileNameCrnt, RowFile, ColFile, ErrMsg, RowErrorCrnt)
CellError = True
Exit Sub
End If
CellValueCrnt = CellValue(RowFile, ColFile)
For InxValue = LBound(ValueReq) To UBound(ValueReq)
If CellValueCrnt = ValueReq(InxValue) Then
' Cell value matched against a permitted value
Exit Sub
End If
Next
Call RecordError(FileNameCrnt, RowFile, ColFile, _
"""" & CellValue(RowFile, ColFile) & _
""" not matched against any of the permitted values", _
RowErrorCrnt)
CellError = True
End Sub
Sub CheckCellValueSingle(ByRef FileNameCrnt As String, _
ByRef CellValue As Variant, _
ByRef CellError As Boolean, _
ByVal RowFile As Long, ByVal ColFile As Long, _
ByVal ValueReq As String, ByRef RowErrorCrnt As Long)
' Check that a specified cell of a CSV file has a required value.
' Set CellError is True if the cell does not have the required value.
' CellError is unchanged if the cell does have the required value. This means
' that several calls can be made to perform different checks and any failure
' will result in CellValue ending with a value of True.
' FileNameCrnt The name of the current file. Used in error message if any.
' CellValue The array of cell contents from the current file.
' CellError Set to True if an error is found.
' RowFile The row to be checked
' ColFile The column to be checked
' ValueReq The required value for the cell
' RowErrorCrnt The last used row of the error worksheet. Any error message
' will be written to the next row.
Dim ErrMsg As String
If Not CheckBound(CellValue, RowFile, ColFile, ErrMsg) Then
Call RecordError(FileNameCrnt, RowFile, ColFile, ErrMsg, RowErrorCrnt)
CellError = True
Exit Sub
End If
If CellValue(RowFile, ColFile) = ValueReq Then
' Required value found
Exit Sub
End If
Call RecordError(FileNameCrnt, RowFile, ColFile, """" & ValueReq & _
""" expected but """ & CellValue(RowFile, ColFile) _
& """ found", RowErrorCrnt)
CellError = True
End Sub
Sub CheckDateRangeLocn(ByVal FileNameCrnt As String, _
ByRef CellValue As Variant, _
ByRef CellError As Boolean, ByVal RowFile As Long, _
ByVal ColFile As Long, ByRef DateStart As Date, _
ByRef DateEnd As Date, ByRef RowErrorCrnt As Long)
' Check a specified cell of a CSV file has the format:
' Date "-" Date "Location Comparison Based on" N "Locations"
' Set CellError = True if the cell does not have this value.
' The values of DateStartCrnt and DateEndCrnt are not defined
' if CellError is set to True,
' Note: the value of N is not returned
' FileNameCrnt The name of the current file. Used in error message if any.
' CellValue The array of cell contents from the current file.
' CellError Set to True if an error is found.
' RowFile The row to be checked
' ColFile The column to be checked
' DateStartCrnt The value of the first date. Only guaranteed if CellError
' not set to True
' DateEndCrnt The value of the last date. Only guaranteed if CellError
' not set to True
' RowErrorCrnt The last used row of the error worksheet. Any error message
' will be written to the next row.
Dim ErrMsg As String
Dim Pos As Long
Dim Stg As String
If Not CheckBound(CellValue, RowFile, ColFile, ErrMsg) Then
Call RecordError(FileNameCrnt, RowFile, ColFile, ErrMsg, RowErrorCrnt)
CellError = True
Exit Sub
End If
Stg = CellValue(3, 1)
Pos = InStr(1, Stg, "-")
If Pos = 0 Then
' No hypen in string.
CellError = True
Exit Sub
End If
If Not IsDate(Mid(Stg, 1, Pos - 1)) Then
' Value before hyphen is not a date
CellError = True
Exit Sub
End If
DateStart = DateValue(Mid(Stg, 1, Pos - 1))
Stg = Mid(Stg, Pos + 1)
Pos = InStr(1, Stg, "Location Comparison Based on")
If Pos = 0 Then
' Important sub-string missing
CellError = True
Exit Sub
End If
If Not IsDate(Mid(Stg, 1, Pos - 1)) Then
' Value after hyphen is not a date
CellError = True
Exit Sub
End If
DateEnd = DateValue(Mid(Stg, 1, Pos - 1))
Stg = Mid(Stg, Pos + Len("Location Comparison Based on"))
If Not Right(Stg, Len("Locations")) = "Locations" Then
' Important sub-string missing
CellError = True
Exit Sub
End If
Stg = Mid(Stg, 1, Len(Stg) - Len("Locations"))
If Not IsNumeric(Stg) Then
' N is not numeric
CellError = True
Exit Sub
End If
' CellError unchanged. DateStart and DateEnd set
End Sub
Sub CheckDateSequence(ByVal FileNameCrnt As String, _
ByRef CellValue As Variant, ByRef RangeError As Boolean, _
ByVal RowFileStart As Long, ByVal ColFileStart As Long, _
ByVal DateStart As Date, ByVal DateEnd As Date, _
ByVal Direction As String, ByVal Interval As String, _
ByRef RowErrorCrnt As Long)
' Check a sequence of cells to hold a sequence of dates.
' FileNameCrnt The name of the current file. Used in error message if any.
' CellValue An array of cell contents from the current file.
' RangeError Set to True if an error is found.
' RowFileStart \ Identify the first cell of the sequence
' ColFileStart /
' DateStart The value of the first date in the sequence.
' DateEnd The value of the last date in the sequence.
' Direction Permitted values are "a" for across and "d" for down.
' Interval Permitted values are as for the Interval parameter of the
' function DateAdd. Each cell in the sequence must be one
' date interval more than the previous cell until DateEnd is
' reached.
' RowErrorCrnt The last used row of the error worksheet. Any error message
' will be written to the next row.
Dim ColFileCrnt As Long
Dim DateCrnt As Date
Dim DateTemp As Date
Dim ErrMsg As String
Dim RowFileCrnt As Long
DateCrnt = DateStart
RowFileCrnt = RowFileStart
ColFileCrnt = ColFileStart
Do While True
If Not CheckBound(CellValue, RowFileCrnt, ColFileCrnt, ErrMsg) Then
Call RecordError(FileNameCrnt, RowFileCrnt, _
ColFileCrnt, ErrMsg, RowErrorCrnt)
RangeError = True
Exit Sub
End If
If Not IsDate(CellValue(RowFileCrnt, ColFileCrnt)) Then
' Value is not a date nor is it a string that can be converted to a date
Call RecordError(FileNameCrnt, RowFileCrnt, ColFileCrnt, _
"Value should be """ & Format(DateCrnt, FmtDate) & _
""" but found """ & CellValue(RowFileCrnt, ColFileCrnt) _
& """", RowErrorCrnt)
RangeError = True
Exit Sub
End If
DateTemp = DateValue(CellValue(RowFileCrnt, ColFileCrnt))
If DateTemp = DateCrnt Then
' Cell has expected value
Else
' Cell does not have the expected value
' Excel corrupts "mmm-yy" to Day=yy, Month=mmm, Year=Current year
DateTemp = DateSerial(Day(DateTemp), Month(DateTemp), 1)
If DateTemp = DateCrnt Then
' Decorrupted value is the expected value
' Correct worksheet
CellValue(RowFileCrnt, ColFileCrnt) = DateTemp
Else
Call RecordError(FileNameCrnt, RowFileCrnt, ColFileCrnt, _
"Value should be """ & Format(DateCrnt, FmtDate) & _
""" but found """ & CellValue(RowFileCrnt, ColFileCrnt) _
& """", RowErrorCrnt)
RangeError = True
Exit Sub
End If
End If
If DateCrnt = DateEnd Then
' Successful check. Leave RangeError unchanged.
Exit Sub
End If
DateCrnt = DateAdd(Interval, 1, DateCrnt)
If Direction = "a" Then
ColFileCrnt = ColFileCrnt + 1
ElseIf Direction = "d" Then
RowFileCrnt = RowFileCrnt + 1
Else
Debug.Assert False ' Invalid value. Only "a" or "d" allowed
End If
Loop
End Sub
Sub ExtractSubArray(ByRef ArraySrc() As Variant, ByRef ArrayDest As Variant, _
ByVal RowSrcTop As Long, ByVal ColSrcLeft As Long, _
ByVal RowSrcBot As Long, ByVal ColSrcRight As Long)
' ArraySrc An array loaded from a worksheet
' ArrayDest A variant which will be set to an array to which selected
' entries from ArraySrc are to be copied. If either
' RowTop = RowBot or Colleft = ColRight it will be a 1D array.
' Otherwise it will be a 2D array with rows as the first
' dimension.
' RowSrcTop \ Specify the rectangle to be extracted from ArraySrc.
' ColSrcLeft |
' RowSrcBot | It is the callers responsibility to ensure the
' ColSrcRight / these values are valid indices for ArraySrc.
Dim ArrayDestLocal() As Variant
Dim ColDestCrnt As Long
Dim ColSrcCrnt As Long
Dim NumColsDest As Long
Dim NumRowsDest As Long
Dim RowDestCrnt As Long
Dim RowSrcCrnt As Long
NumColsDest = ColSrcRight - ColSrcLeft + 1
NumRowsDest = RowSrcBot - RowSrcTop + 1
If NumColsDest = 1 Then
' The selected rectangle is a column
ReDim ArrayDestLocal(1 To NumRowsDest)
RowDestCrnt = 1
For RowSrcCrnt = RowSrcTop To RowSrcBot
ArrayDestLocal(RowDestCrnt) = ArraySrc(RowSrcCrnt, ColSrcLeft)
RowDestCrnt = RowDestCrnt + 1
Next
ArrayDest = ArrayDestLocal
ElseIf NumRowsDest = 1 Then
' The selected rectangle is a row
ReDim ArrayDestLocal(1 To NumColsDest)
ColDestCrnt = 1
For ColSrcCrnt = ColSrcLeft To ColSrcRight
ArrayDestLocal(ColDestCrnt) = ArraySrc(RowSrcTop, ColSrcCrnt)
ColDestCrnt = ColDestCrnt + 1
Next
ArrayDest = ArrayDestLocal
Else
' The selected rectangle is a rectangle
ReDim ArrayDestLocal(1 To NumRowsDest, 1 To NumColsDest)
RowDestCrnt = 1
For RowSrcCrnt = RowSrcTop To RowSrcBot
ColDestCrnt = 1
For ColSrcCrnt = ColSrcLeft To ColSrcRight
ArrayDestLocal(RowDestCrnt, ColDestCrnt) = _
ArraySrc(RowSrcCrnt, ColSrcCrnt)
ColDestCrnt = ColDestCrnt + 1
Next
RowDestCrnt = RowDestCrnt + 1
Next
ArrayDest = ArrayDestLocal
End If
End Sub
Sub RecordError(ByRef FileName As String, ByRef RowFile As Long, _
ByRef ColFile As Long, ByRef Msg As String, _
ByRef RowError As Long)
' Outputs an error to the error worksheet
Debug.Assert Not IsNumeric(FileName)
With Worksheets(WkShtNameError)
RowError = RowError + 1
With .Cells(RowError, ColErrorTime)
.Value = Now()
.NumberFormat = FmtDateTime
End With
.Cells(RowError, ColErrorFile).Value = FileName
If RowFile <> 0 Then .Cells(RowError, ColErrorRow).Value = RowFile
If ColFile <> 0 Then .Cells(RowError, ColErrorCol).Value = ColFile
With .Cells(RowError, ColErrorMsg)
.Value = Msg
.WrapText = True
End With
End With
End Sub