Processing a comma-delimited CSV file into multiple files using ASP.NET


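Can anyone point me in the right direction? Thanks in advance.

I am looking for a small application that processes a CSV file and writes its rows out to multiple CSV files, based on the list of distinct values found in one column (say, Header1), but I don't know where to start. FYI: the list of values in Header1 will always change.

I have been able to read the file into an array using the following code:

[Read From Comma-Delimited Text Files in Visual Basic][1]
Now I want to process the data based on the first column. For example,

Input: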

input.csv

"Header1","Header2","Header3","Header4"
"apple","pie","soda","beer"
"apple","cake","milk","wine"
"pear","pie","soda","beer"
"pear","pie","soda","beer"
"orange","pie","soda","beer"
"orange","pie","soda","beer"
Output:

output1.csv

"Header1","Header2","Header3","Header4"
"apple","pie","soda","beer"
"apple","cake","milk","wine"

output2.csv

"Header1","Header2","Header3","Header4"
"pear","pie","soda","beer"
"pear","pie","soda","beer"

output3.csv

"Header1","Header2","Header3","Header4"
"orange","pie","soda","beer"
"orange","pie","soda","beer"
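What you can do is:

Read the key column into a list, q.
Create a list of the distinct keys, dist.
Compare each value in q against dist and write the row to the file given by its index in dist.

Example: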

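' Requires Imports System.IO (and System.Linq) at the top of the file.
' Note: a plain Split breaks on quoted fields that themselves contain commas.
Dim lines As String() = System.IO.File.ReadAllLines("input.csv")

' Take the first field of every line (header included) as that line's key.
Dim q = (From line In lines
         Let x = line.Split(","c)
         Select x(0)).ToList()

' dist(0) is the header's key, so the data keys start at index 1.
Dim dist = q.Distinct().ToList()

' Create (or overwrite) one output file per distinct key and write the header line.
For j As Integer = 1 To dist.Count - 1
    Using sw As New StreamWriter(File.Open("output" & j & ".csv", FileMode.Create))
        sw.WriteLine(lines(0))
    End Using
Next

' Append each data line to the file numbered by its key's index in dist.
For i As Integer = 1 To q.Count - 1
    Console.WriteLine(q(i))
    Console.WriteLine(dist.IndexOf(q(i)))

    Using sw As New StreamWriter(File.Open("output" & dist.IndexOf(q(i)) & ".csv", FileMode.Append))
        sw.WriteLine(lines(i))
    End Using
Next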

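If the key column is not the first column, change its index in x(0).

A more suitable data structure for holding the data (rather than an array) would be a Dictionary. It makes it easy to check whether you already have an entry for a particular category, such as apple or pear. You then either add a new entry to the dictionary or add the row to the existing entry.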

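To create the output files, iterate over the entries in the dictionary (one entry per file), then iterate over the items in each entry's value to get the lines for that file.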

Option Infer On

Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module Module1

    Sub SeparateCsvToFiles(srcFile As String)

        Dim d As New Dictionary(Of String, List(Of String))
        Dim headers As String()

        Using tfp As New TextFieldParser(srcFile)
            tfp.HasFieldsEnclosedInQuotes = True
            tfp.SetDelimiters(",")
            Dim currentRow As String()

            ' Get the headers
            Try
                headers = tfp.ReadFields()
            Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                Throw New FormatException(String.Format("Could not read header line in ""{0}"".", srcFile))
            End Try

            ' Read the data
            Dim lineNumber As Integer = 1

            While Not tfp.EndOfData
                Try
                    currentRow = tfp.ReadFields()

                    'TODO: Possibly handle the wrong number of entries more gracefully.
                    If currentRow.Count = headers.Count Then
                        ' assume column to sort on is the zeroth one
                        Dim category = currentRow(0)
                        ' Re-quote the remaining fields, doubling any embedded quotes.
                        Dim values = String.Join(",", currentRow.Skip(1).Select(Function(s) """" & s.Replace("""", """""") & """"))

                        If d.ContainsKey(category) Then
                            d(category).Add(values)
                        Else
                            Dim valuesList As New List(Of String)
                            valuesList.Add(values)
                            d.Add(category, valuesList)
                        End If

                    Else
                        Throw New FormatException(String.Format("Wrong number of entries in line {0} in ""{1}"".", lineNumber, srcFile))
                    End If

                Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                    Throw New FormatException(String.Format("Could not read data line {0} in ""{1}"".", lineNumber, srcFile))
                End Try

                lineNumber += 1

            End While
        End Using

        ' Output the data
        'TODO: Write code to output files to a different directory.
        Dim destDir = Path.GetDirectoryName(srcFile)

        Dim fileNumber As Integer = 1
        Dim headerLine = String.Join(",", headers.Select(Function(s) """" & s.Replace("""", """""") & """"))

        'TODO: think up more meaningful names instead of x and y.    
        For Each x In d
            Dim destFile = Path.Combine(destDir, "output" & fileNumber.ToString() & ".csv")

            ' One file per category: the header first, then every stored row.
            Using sw As New StreamWriter(destFile)
                sw.WriteLine(headerLine)
                For Each y In x.Value
                    sw.WriteLine(String.Format("""{0}"",{1}", x.Key.Replace("""", """"""), y))
                Next
            End Using

            fileNumber += 1

        Next

    End Sub

    Sub Main()
        SeparateCsvToFiles("C:\temp\input.csv")
        Console.WriteLine("Done.")
        Console.ReadLine()

    End Sub

End Module
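
The grouping step described above can also be tried on its own, without reading or writing any files. The following is only a minimal sketch: `GroupByColumn`, `GroupingSketch` and the `keyIndex` parameter are hypothetical names that do not appear in the answer, and the rows are assumed to have been parsed into string arrays already (for example by `TextFieldParser.ReadFields`). Passing `keyIndex` also covers the case where the key column is not the first one.

```vbnet
Option Strict On
Option Infer On

Imports System.Collections.Generic

Module GroupingSketch

    ' Hypothetical helper: groups already-parsed rows by the value in column keyIndex.
    Function GroupByColumn(rows As IEnumerable(Of String()), keyIndex As Integer) As Dictionary(Of String, List(Of String()))
        Dim groups As New Dictionary(Of String, List(Of String()))

        For Each row In rows
            Dim key = row(keyIndex)
            Dim bucket As List(Of String()) = Nothing

            ' TryGetValue looks the key up once; a new bucket is created only for unseen keys.
            If Not groups.TryGetValue(key, bucket) Then
                bucket = New List(Of String())()
                groups.Add(key, bucket)
            End If

            bucket.Add(row)
        Next

        Return groups
    End Function

End Module
```

With the sample input from the question and `keyIndex` set to 0, this would yield three entries (apple, pear, orange) holding two rows each.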

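Thank you very much for the quick reply! This works great so far. I was wondering whether there is a way to modify the sr.WriteLine part in case I want to selectively write only certain columns, so that the output contains only Header1, Header3 and/or Header4.

The line to modify would be Dim values = String.Join(",", currentRow.Skip(1).Select(Function(s) """" & s & """")), where it builds the part of the output line that follows the category. I'm sure you can do that yourself with a For..Next loop.

@user3599851 Sorry, I forgot to add the @ in my previous comment to make sure you got notified.

@user3599851 If my suggested answer solved the problem, you may want to mark it as the accepted answer, or at least as useful, so that others can easily see that it helped if they come to this thread in the future.

Yes, you have been very helpful.. I have just been struggling with it.. I couldn't get the For..Next loop to work, but I did get it working the way I want by adding something like this.. [Dim category = currentRow(1) Dim values = & currentRow(3) & , & currentRow(4) & , & currentRow(6) &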
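
For the selective-column output discussed in the comments, one possible For..Next approach is sketched below. It is an illustration only, not the code either commenter ended up with: `BuildLine`, `ColumnSelectionSketch` and `columnIndexes` are hypothetical names, and the row is assumed to be a string array as returned by `TextFieldParser.ReadFields`.

```vbnet
Option Strict On
Option Infer On

Imports System.Collections.Generic

Module ColumnSelectionSketch

    ' Hypothetical helper: builds one quoted CSV line from the chosen columns of a parsed row.
    Function BuildLine(row As String(), columnIndexes As Integer()) As String
        Dim parts As New List(Of String)

        For i As Integer = 0 To columnIndexes.Length - 1
            Dim field = row(columnIndexes(i))
            ' Double any embedded quotes, then wrap the field in quotes.
            parts.Add("""" & field.Replace("""", """""") & """")
        Next

        Return String.Join(",", parts)
    End Function

End Module
```

Called as `BuildLine(currentRow, {0, 2, 3})`, this keeps Header1, Header3 and Header4. In the dictionary-based answer the category is written separately from `values`, so there the call would presumably be `BuildLine(currentRow, {2, 3})` for `values`, with `BuildLine(headers, {0, 2, 3})` used for the header line so that the columns still match.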