Remove whitespace from a specific column in a target CSV

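I want to remove whitespace from a CSV file, ideally from one specific column, leaving single spaces in place and dropping only the extra ones. I have the following snippet that collapses the spaces in a string, but I need help turning it into a script that looks up a given column in the target CSV and removes the whitespace there.

Here is the script: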

'Start by trimming leading/trailing spaces
str = Trim(str)

'Now, while we have 2 consecutive spaces, replace them
'with a single space...
Do While InStr(1, str, "  ") > 0
    str = Replace(str, "  ", " ")
Loop
Ideally, I would like to call the script like this:

Cscript whitespaceremover.vbs target.csv 'column_name'

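I think my example below could be improved, but I hope it is good enough to get started.

My demo CSV file "target.csv":

"whitespaceremover.vbs" example:

And some notes I forgot to post. To simplify testing, I added double spaces in the second column. In short, to see the script in action, use column_name2 as the command-line argument, i.e.:

Cscript whitespaceremover.vbs target.csv column_name2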
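
As a rough illustration of the kind of script the command above could run, here is a minimal sketch. It is not the answer's original listing. It assumes a comma-separated file whose first line holds the column names, no quoted fields containing the delimiter, and plain ANSI text (the default for OpenTextFile). The helper name CollapseSpaces and the ".new" output suffix are made up for this example.

```vbscript
' Sketch: CScript whitespaceremover.vbs target.csv column_name
Option Explicit

Const ForReading = 1
Const DELIM = ","            ' assumption: comma-separated, no quoted fields

Dim fso, inFile, outFile, fields, colIndex, i, isHeader

If WScript.Arguments.Count <> 2 Then
    WScript.Echo "Usage: CScript whitespaceremover.vbs target.csv column_name"
    WScript.Quit 1
End If

Set fso = CreateObject("Scripting.FileSystemObject")
Set inFile = fso.OpenTextFile(WScript.Arguments(0), ForReading)
' Hypothetical output name: the input path with ".new" appended
Set outFile = fso.CreateTextFile(WScript.Arguments(0) & ".new", True)

colIndex = -1
isHeader = True
Do Until inFile.AtEndOfStream
    fields = Split(inFile.ReadLine, DELIM)
    If isHeader Then
        ' Locate the requested column by its trimmed, case-insensitive header name
        For i = 0 To UBound(fields)
            If StrComp(Trim(fields(i)), WScript.Arguments(1), vbTextCompare) = 0 Then colIndex = i
        Next
        isHeader = False
    ElseIf colIndex >= 0 And colIndex <= UBound(fields) Then
        ' Only the requested column is touched; short rows are left as they are
        fields(colIndex) = CollapseSpaces(fields(colIndex))
    End If
    outFile.WriteLine Join(fields, DELIM)
Loop

inFile.Close
outFile.Close
If colIndex < 0 Then WScript.Echo "Column not found: " & WScript.Arguments(1)

Function CollapseSpaces(ByVal s)
    ' Trim the ends, then shrink runs of spaces until no double space remains
    s = Trim(s)
    Do While InStr(1, s, "  ") > 0
        s = Replace(s, "  ", " ")
    Loop
    CollapseSpaces = s
End Function
```

If the column name is not found in the header, every row is copied unchanged and a message is printed, so a typo in the argument does not silently alter the file.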
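Edit

After reading Ansgar Wiechers's comment about the Replace function, I finally decided to run some tests. The code above may be slower than a regular expression, but it works. Here is my example: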

str1 = "1" & Space(2) & "2" & Space(4) & "3" _
    & Space(1) & "4" & Space(6) & "5"
WScript.Echo "Original string: ", str1
Do While InStr(1, str1, "  ")
    str1 = Replace(str1, "  ", " ")
Loop
WScript.Echo "New string: ", str1
'Result>>
'Original string:  1  2    3 4      5
'New string:  1 2 3 4 5

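As a supplement to the answer already provided: when there is a run of 3 or more consecutive spaces, simply replacing two spaces with a single space may produce undesired results. A line like this:

foo   bar      baz
would be replaced with:

foo  bar   baz
not with this:

foo bar baz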
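The reason is that Replace continues scanning after the replacement string. For instance, running Replace("aaaa", "aa", "a") will first replace the first 2 a characters:

aaaa → aaa

then replace the next 2 a characters after the replacement string:

aaa → aa

and then terminate.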
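
To make the single-pass behaviour concrete, the following small sketch runs Replace once on the sample line and then repeats it in a loop until no double space is left. The square brackets in the output are only there to make the spaces visible.

```vbscript
Option Explicit
Dim s

s = "foo   bar      baz"            ' 3 spaces, then 6 spaces

' One call: each run is scanned once, so 3 spaces become 2 and 6 become 3
WScript.Echo "[" & Replace(s, "  ", " ") & "]"
' prints: [foo  bar   baz]

' Repeating the call until no double space is left finishes the job
Do While InStr(s, "  ") > 0
    s = Replace(s, "  ", " ")
Loop
WScript.Echo "[" & s & "]"
' prints: [foo bar baz]
```

So the limitation applies to a single Replace call. Wrapped in a loop, as in the question's snippet, the result is the same as a regular-expression replacement, at the cost of several passes over the string.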

A more robust way to collapse spaces (or runs of characters in general) is a regular expression replacement:

Set re = New RegExp
re.Pattern = " +"  '<-- means "a sequence of one or more spaces"
re.Global = True

text = "foo   bar      baz"

WScript.Echo re.Replace(text, " ")
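
The same idea could be wrapped in a reusable function so that it can be applied to one CSV field at a time. This is only a sketch: the name CollapseSpaces is made up here, and the pattern " {2,}" (two or more spaces) is a small variation on the " +" pattern above that leaves single spaces untouched.

```vbscript
Option Explicit

Function CollapseSpaces(ByVal text)
    Dim re
    Set re = New RegExp
    re.Pattern = " {2,}"      ' two or more spaces; single spaces are left alone
    re.Global = True          ' replace every run, not just the first one
    ' Collapse inner runs first, then drop leading and trailing spaces
    CollapseSpaces = Trim(re.Replace(text, " "))
End Function

WScript.Echo "[" & CollapseSpaces("  foo   bar      baz ") & "]"
' prints: [foo bar baz]
```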

This does not follow all the guidelines, but it may be helpful to someone.

' USAGE: CScript WhiteSpaceRemover.vbs Target_File.csv Column_Number

Set oArgs = WScript.Arguments
If oArgs.Count = 2 Then
    strInputFileName = oArgs(0)
    intColumn = oArgs(1) - 1
    strOutputFileName = PrepareOutputPath(strInputFileName, "_new")

    WriteTextFile strOutputFileName, TrimCsv(ReadTextFile(strInputFileName), intColumn)

End If
Set oArgs = Nothing

Function TrimCsv (strFileContent, intColumn)
    ' removes unnecessary spaces in the fields of a CSV table
    strFileContent = Replace(strFileContent, vbCrLf, vbLf)
    arrFileContent = Split(strFileContent, vbLf)
    strFileContent = ""
    For Each strLine in arrFileContent
        If Not Len(strLine) = 0 Then
            arrRecord = Split(strLine, ";")

'           for specified column number
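            ' skip rows that have fewer fields than the requested column
            If intColumn >= 0 And intColumn <= UBound(arrRecord) Then
                arrRecord(intColumn) = Trim(arrRecord(intColumn))
            End If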

'           for all columns
'           For iCount = LBound(arrRecord) To UBound(arrRecord)
'               arrRecord(iCount) = Trim(arrRecord(iCount))
'           Next

            AddToList strFileContent, Join(arrRecord, ";"), vbCrLf
            Erase arrRecord
        End If
    Next
    TrimCsv = strFileContent
    Erase arrFileContent
End Function


Function PrepareOutputPath(strFileName, strSuffix)
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    With objFSO
        strPath = .GetParentFolderName(strFileName)
        strName = .GetBaseName(strFileName)
        strExt = .GetExtensionName(strFileName)
    End With
    PrepareOutputPath = AddToList(strPath, strName & strSuffix, "\")
    PrepareOutputPath = AddToList(PrepareOutputPath, strExt, ".")
    Set objFSO = Nothing
End Function


Function AddToList(strList, strValue, strDelim)
    ' add delimiter between values
    If strList = "" Then
        AddToList = strValue
    Else
        AddToList = strList & strDelim & strValue
    End If
    strList = AddToList
End Function 


Function ReadTextFile(strFileName)
    Set objStream = CreateObject("ADODB.Stream")
    objStream.CharSet = "utf-8"

    objStream.Open
    objStream.LoadFromFile(strFileName)
    ReadTextFile = objStream.ReadText()
    objStream.Close

    Set objStream = Nothing
End Function


Sub WriteTextFile (strFileName, strFileContent)
    adSaveCreateNotExist = 1
    adSaveCreateOverWrite = 2
    adWriteChar = 0
    adWriteLine = 1

    Set objStream = CreateObject("ADODB.Stream")
    objStream.CharSet = "utf-8"

    objStream.Open
    objStream.WriteText strFileContent, adWriteChar
    objStream.SaveToFile strFileName, adSaveCreateOverwrite
    objStream.Close

    Set objStream = Nothing
End Sub
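
The script above takes a column number, while the question asks for a column name. One way to bridge the two, sketched here under the assumption that the first line of the file is a semicolon-delimited header, is a small lookup helper. FindColumn is a hypothetical name and is not part of the script above.

```vbscript
Option Explicit

' Hypothetical helper: 1-based position of strName in a delimited header line, 0 if absent
Function FindColumn(strHeaderLine, strName, strDelim)
    Dim arrNames, i
    FindColumn = 0
    arrNames = Split(strHeaderLine, strDelim)
    For i = LBound(arrNames) To UBound(arrNames)
        ' Compare trimmed names without regard to case
        If StrComp(Trim(arrNames(i)), strName, vbTextCompare) = 0 Then
            FindColumn = i + 1
            Exit For
        End If
    Next
End Function

WScript.Echo FindColumn("column_name1;column_name2;column_name3", "column_name2", ";")
' prints: 2
```

The returned number matches the 1-based Column_Number argument the script expects, and a result of 0 signals that the name was not found.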
Regards,

--
Pawel L.

It is worth noting that Replace(str, "  ", " ") will not produce the desired result when there are 3 or more consecutive spaces. Replacing with a regular expression (" +") is probably a more robust approach.

@AnsgarWiechers - You are right, I am just not very good with regular expressions. If you like, you can give the OP your own answer.

Your comment about the Replace function caught my attention because I suspected I had missed something, so I ran some tests; please see my edited answer.