Remove whitespace from a target CSV under a specific column
I want to remove whitespace from a CSV file, ideally from one specific column only, so that runs of spaces are reduced to a single space. I have the following snippet, which does this for a string, but I need help turning it into a script that checks the target CSV under a specific column and removes the extra whitespace. Here is the snippet:
'Start by trimming leading/trailing spaces
str = Trim(str)
'Now, while we have 2 consecutive spaces, replace them
'with a single space...
Do While InStr(1, str, "  ")
    str = Replace(str, "  ", " ")
Loop
Ideally, I would like to call the script like this:
Cscript whitespaceremover.vbs target.csv 'column_name'
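I think my example below can be improved, but I hope it is good enough to get started.
My demo CSV file "target.csv":
Example "whitespaceremover.vbs":
A few notes I forgot to post: to simplify testing, I put double spaces in the second column. In short, to see the script in action, pass
column_name2
as the command-line argument, i.e.:
Cscript whitespaceremover.vbs target.csv column_name2
Edit
After reading Ansgar Wiechers' comment about the Replace function, I finally decided to run some tests. The code above may be slower than a regular expression, but it works. Here is my example: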
str1 = "1" & Space(2) & "2" & Space(4) & "3" _
    & Space(1) & "4" & Space(6) & "5"
WScript.Echo "Original string: ", str1
' Keep replacing double spaces until none remain
Do While InStr(1, str1, "  ")
    str1 = Replace(str1, "  ", " ")
Loop
WScript.Echo "New string: ", str1
'Result>>
'Original string:  1  2    3 4      5
'New string:  1 2 3 4 5
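As an addendum to the answers provided: a single call that replaces two spaces with one may give undesired results when there are sequences of 3 or more consecutive spaces. A line like this:
foo    bar   baz
would become this:
foo  bar  baz
not this:
foo bar baz
The reason is that Replace continues scanning after the replacement string. For example, running Replace("aaaa", "aa", "a") first replaces the first 2 a characters:
aaaa → aaa
then replaces the next 2 a characters after the replacement string:
aaa → aa
and then terminates.
A more robust way to collapse spaces (or character sequences in general) is a regular-expression replacement: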
Set re = New RegExp
re.Pattern = " +" '<-- means "a sequence of one or more spaces"
re.Global = True
text = "foo    bar   baz"
WScript.Echo re.Replace(text, " ")
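To compare the approaches discussed so far on one input, they can be run side by side. This is only a minimal sketch: the helper name CollapseSpaces and the sample string are illustrative assumptions, not part of the original answers. Space(n) is used instead of literal runs of spaces so the counts stay visible.

```vbscript
Option Explicit

' Hypothetical helper: collapse every run of spaces into a single space.
Function CollapseSpaces(text)
    Dim re
    Set re = New RegExp
    re.Pattern = " +"      ' one or more consecutive spaces
    re.Global = True       ' replace every match, not only the first
    CollapseSpaces = re.Replace(text, " ")
End Function

Dim sample, onePass, looped
sample = "foo" & Space(4) & "bar" & Space(3) & "baz"

' A single Replace call does not rescan its own output,
' so two spaces are left in each gap.
onePass = Replace(sample, Space(2), " ")

' Repeating the replacement until no double space remains does converge.
looped = sample
Do While InStr(1, looped, Space(2)) > 0
    looped = Replace(looped, Space(2), " ")
Loop

' Brackets make the remaining spaces easier to see in the console.
WScript.Echo "[" & onePass & "]"
WScript.Echo "[" & looped & "]"
WScript.Echo "[" & CollapseSpaces(sample) & "]"
```

The first line printed still contains double spaces, while the loop and the regular expression both yield single spaces. The regular expression does it in one pass, which is why it tends to be the simpler choice.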
This does not follow all the guidelines, but it may help someone.
' USAGE: CScript WhiteSpaceRemover.vbs Target_File.csv Column_Number
Set oArgs = WScript.Arguments
If oArgs.Count = 2 Then
    strInputFileName = oArgs(0)
    intColumn = oArgs(1) - 1 ' 1-based column number -> 0-based array index
    strOutputFileName = PrepareOutputPath(strInputFileName, "_new")
    WriteTextFile strOutputFileName, TrimCsv(ReadTextFile(strInputFileName), intColumn)
End If
Set oArgs = Nothing

Function TrimCsv (strFileContent, intColumn)
    ' removes unnecessary spaces in the fields of the CSV table
    strFileContent = Replace(strFileContent, vbCrLf, vbLf)
    arrFileContent = Split(strFileContent, vbLf)
    strFileContent = ""
    For Each strLine in arrFileContent
        If Not Len(strLine) = 0 Then
            arrRecord = Split(strLine, ";")
            ' for specified column number
            arrRecord(intColumn) = Trim(arrRecord(intColumn))
            ' for all columns
            ' For iCount = LBound(arrRecord) To UBound(arrRecord)
            '     arrRecord(iCount) = Trim(arrRecord(iCount))
            ' Next
            AddToList strFileContent, Join(arrRecord, ";"), vbCrLf
            Erase arrRecord
        End If
    Next
    TrimCsv = strFileContent
    Erase arrFileContent
End Function

Function PrepareOutputPath(strFileName, strSuffix)
    ' builds "<folder>\<name><suffix>.<ext>" from the input file name
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    With objFSO
        strPath = .GetParentFolderName(strFileName)
        strName = .GetBaseName(strFileName)
        strExt = .GetExtensionName(strFileName)
    End With
    PrepareOutputPath = AddToList(strPath, strName & strSuffix, "\")
    PrepareOutputPath = AddToList(PrepareOutputPath, strExt, ".")
    Set objFSO = Nothing
End Function

Function AddToList(strList, strValue, strDelim)
    ' add delimiter between values
    If strList = "" Then
        AddToList = strValue
    Else
        AddToList = strList & strDelim & strValue
    End If
    ' strList is passed ByRef (the VBScript default), so the caller's variable is updated too
    strList = AddToList
End Function

Function ReadTextFile(strFileName)
    Set objStream = CreateObject("ADODB.Stream")
    objStream.CharSet = "utf-8"
    objStream.Open
    objStream.LoadFromFile(strFileName)
    ReadTextFile = objStream.ReadText()
    objStream.Close
    Set objStream = Nothing
End Function

Sub WriteTextFile (strFileName, strFileContent)
    adSaveCreateNotExist = 1
    adSaveCreateOverWrite = 2
    adWriteChar = 0
    adWriteLine = 1
    Set objStream = CreateObject("ADODB.Stream")
    objStream.CharSet = "utf-8"
    objStream.Open
    objStream.WriteText strFileContent, adWriteChar
    objStream.SaveToFile strFileName, adSaveCreateOverWrite
    objStream.Close
    Set objStream = Nothing
End Sub
Regards,
--
Pawel L.
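The question asks for selection by column name rather than by number, and for collapsing inner runs of spaces rather than only trimming the ends. One possible way to combine the ideas from the answers is sketched below. The function name CleanColumn, its parameters, and the demo data are assumptions made for illustration. The sketch also assumes that the first line is a header row and that no field contains the delimiter inside quotes.

```vbscript
Option Explicit

' Hypothetical helper: returns csvText with runs of spaces collapsed
' (and leading/trailing spaces trimmed) in the column named columnName.
Function CleanColumn(csvText, columnName, delim)
    Dim re, lines, header, fields, colIndex, i, j
    Set re = New RegExp
    re.Pattern = " +"      ' one or more consecutive spaces
    re.Global = True

    CleanColumn = csvText  ' default: return the input unchanged
    lines = Split(Replace(csvText, vbCrLf, vbLf), vbLf)
    If UBound(lines) < 0 Then Exit Function

    ' Locate the column by its header text (case-insensitive).
    header = Split(lines(0), delim)
    colIndex = -1
    For j = 0 To UBound(header)
        If StrComp(Trim(header(j)), columnName, vbTextCompare) = 0 Then
            colIndex = j
            Exit For
        End If
    Next
    If colIndex < 0 Then Exit Function   ' unknown column name

    ' Clean only the chosen field of each data row; short or empty rows are skipped.
    For i = 1 To UBound(lines)
        fields = Split(lines(i), delim)
        If UBound(fields) >= colIndex Then
            fields(colIndex) = Trim(re.Replace(fields(colIndex), " "))
            lines(i) = Join(fields, delim)
        End If
    Next
    CleanColumn = Join(lines, vbCrLf)
End Function

' Demo with made-up data: three spaces inside the second column of the data row.
Dim demo, cleaned
demo = "column_name1;column_name2" & vbCrLf & "x" & Space(2) & "y;a" & Space(3) & "b"
cleaned = CleanColumn(demo, "column_name2", ";")
WScript.Echo cleaned
```

In the demo only the second column is changed; the double space in the first column is left alone. To process a file, the result of ReadTextFile from the script above could be passed in and the return value handed to WriteTextFile.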
It is worth noting that Replace(str, "  ", " ") does not produce the desired result when there are 3 or more consecutive spaces. Replacing with a regular expression (" +") is probably a more robust approach.
@AnsgarWiechers - You are right, I am just not very good with regular expressions. If you like, you can provide your own answer to the OP.
Your comment about the Replace function caught my attention because I had a feeling I was missing something, so I ran some tests; please see my edited answer.