C#: How do I format and read a CSV file?

c# csv

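Here is just a sample of the data I need to format.

The first column is simple; the problem is the second column.

What is the best way to format multiple data fields within a single column?

How do I parse this data?

Important: the second column needs to contain multiple values, as in the example below.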

Name       Details

Alex       Age:25
           Height:6
           Hair:Brown
           Eyes:Hazel

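CSV files are typically defined with a comma as the field separator and a line break (CR, or CR LF) as the row separator. You are using line breaks inside your second column, which will cause problems. You need to reformat the second column so that it uses some other separator between its multiple values. A common alternative separator is the | (pipe) character.

Your format would then look like this: Alex,Age:25|Height:6|Hair:Brown|Eyes:Hazel

When parsing, first split on the comma (which returns two values), then parse the second field as pipe-separated values.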
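The two-pass parse described above could be sketched as follows. This is a minimal illustration, not a full CSV parser: the class name `PipeDetailsParser` and the method `ParseLine` are made up for this example, and it assumes the name contains no comma and that each detail has the form Key:Value.

```csharp
using System;
using System.Collections.Generic;

public static class PipeDetailsParser
{
    // Parses one line such as "Alex,Age:25|Height:6|Hair:Brown|Eyes:Hazel".
    // Returns the name as the key and the Key:Value details as the value.
    public static KeyValuePair<string, Dictionary<string, string>> ParseLine(string line)
    {
        // First pass: split on the first comma into exactly two fields.
        string[] fields = line.Split(new[] { ',' }, 2);
        if (fields.Length != 2)
        {
            throw new FormatException("Expected two comma-separated fields: " + line);
        }

        var details = new Dictionary<string, string>();

        // Second pass: split the second field on the pipe character.
        foreach (string item in fields[1].Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Each item is "Key:Value"; split on the first colon only.
            string[] pair = item.Split(new[] { ':' }, 2);
            if (pair.Length == 2)
            {
                details[pair[0].Trim()] = pair[1].Trim();
            }
        }

        return new KeyValuePair<string, Dictionary<string, string>>(fields[0].Trim(), details);
    }
}
```

Calling `PipeDetailsParser.ParseLine("Alex,Age:25|Height:6|Hair:Brown|Eyes:Hazel")` gives the name in `.Key` and the four details in `.Value`. Items without a colon are skipped silently here; you may prefer to throw instead.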

The CSV could look like this:

Name,Age,Height,Hair,Eyes
Alex,25,6,Brown,Hazel

Each cell should be separated from its neighbour by a single comma only.
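Reading a flat file like this back in is then straightforward. The sketch below is only an illustration under a strong assumption: no field contains a comma, a quote, or a line break. The class name `FlatCsvReader` is made up for this example; for real-world CSV a dedicated parser is the safer choice.

```csharp
using System;
using System.Collections.Generic;

public static class FlatCsvReader
{
    // Reads CSV text whose first line is a header row and returns one
    // dictionary (column name -> cell value) per data row.
    public static List<Dictionary<string, string>> Read(string csvText)
    {
        var rows = new List<Dictionary<string, string>>();
        string[] lines = csvText.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length == 0) { return rows; }

        string[] header = lines[0].Split(',');

        for (int i = 1; i < lines.Length; i++)
        {
            string[] cells = lines[i].Split(',');
            var row = new Dictionary<string, string>();
            for (int c = 0; c < header.Length; c++)
            {
                // Missing trailing cells become empty strings.
                row[header[c].Trim()] = c < cells.Length ? cells[c].Trim() : String.Empty;
            }
            rows.Add(row);
        }
        return rows;
    }
}
```

To read from disk you could pass in `System.IO.File.ReadAllText(path)`.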

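You can reformat the data with a simple regular expression that replaces certain newline and non-newline whitespace with commas (each block is easy to find, because its first line has a value in both columns).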
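One way the regular-expression idea could be sketched is shown below. The class `RegexReformatter` and its two methods are made up for this example. It assumes continuation lines are indented, values contain no whitespace, and (for `StripLabels`) that every record lists the same labels in the same order.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexReformatter
{
    // Joins each indented continuation line onto the record it belongs to,
    // then turns the remaining column gaps into commas.
    public static string ToCsvRecords(string text)
    {
        // 1. A line break followed by indentation continues the current record.
        string joined = Regex.Replace(text, @"\r?\n[ \t]+(?=\S)", ",");
        // 2. Any other run of spaces or tabs between two values separates columns.
        return Regex.Replace(joined, @"(?<=\S)[ \t]+(?=\S)", ",");
    }

    // Removes the "Label:" prefix from every field after the first,
    // e.g. "Alex,Age:25,Height:6" becomes "Alex,25,6".
    public static string StripLabels(string record)
    {
        return Regex.Replace(record, @"(?<=,)\w+:", "");
    }
}
```

Applied to the sample block, `ToCsvRecords` yields the record `Alex,Age:25,Height:6,Hair:Brown,Eyes:Hazel`, and `StripLabels` reduces that to `Alex,25,6,Brown,Hazel`. The header row still has to be written separately.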

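This is an interesting one. Parsing files in an ad hoc format can be quite difficult, which is why people often write dedicated classes to handle them. More conventional formats, such as CSV or other delimited formats, are easier to read because every row is laid out the same way.

The problem above can be tackled as follows:

1) What should the output look like?

In your example, and this is only a guess, I believe you are aiming for: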

Name, Age, Height, Hair, Eyes
Alex, 25, 6, Brown, Hazel

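2) In that case, you have to parse the information out based on the structure shown above. If the input consists of repeating blocks of text like the one above, we can say that: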

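a. Each person is in a block that starts with a Name Details line.

b. The name value is the first text after the Name Details line, and the other columns are given in the format Column:Value.

However, sections may also have additional attributes, or be missing attributes if the original input was optional, so it is useful to keep track of the columns and their ordinal positions as well.

So one approach might look like this: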

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public class DetailsFileParser
{
    // Reads the block-formatted file and returns its contents as CSV text.
    public string ParseFile(string path = "D:\\temp\\test.txt")
    {
        bool newSection = false;

        // Column names in output order; "Name" is always column 0.
        List<string> columns = new List<string> { "Name" };

        // One dictionary (column name -> value) per person, in file order.
        // Keying by column name keeps values aligned when a section omits an attribute.
        List<Dictionary<string, string>> rows = new List<Dictionary<string, string>>();

        using (TextReader reader = File.OpenText(path))
        {
            string currentLine;

            // Read the file one line at a time until there are no more lines.
            while ((currentLine = reader.ReadLine()) != null)
            {
                // Split on spaces and tabs; assumes the values themselves contain no whitespace.
                string[] lineSegments = currentLine.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

                // A "Name  Details" line marks the start of a new section.
                if (lineSegments.Length == 2
                    && String.Equals(lineSegments[0], "Name", StringComparison.OrdinalIgnoreCase)
                    && String.Equals(lineSegments[1], "Details", StringComparison.OrdinalIgnoreCase))
                {
                    newSection = true;
                    continue;
                }

                // The first line of a section holds the name and the first Column:Value item.
                if (newSection && lineSegments.Length > 1)
                {
                    rows.Add(new Dictionary<string, string> { { "Name", lineSegments[0] } });
                    ParseColonSeparatedItem(lineSegments[1], columns, rows[rows.Count - 1]);
                    newSection = false;
                    continue;
                }

                // Any other non-empty line is a further Column:Value item for the current person.
                if (lineSegments.Length > 0 && rows.Count > 0)
                {
                    ParseColonSeparatedItem(lineSegments[0], columns, rows[rows.Count - 1]);
                }
            }
        }

        // Everything has been collected, so write out the CSV. A StringBuilder is fine
        // for now, although very large source files may call for streaming instead.
        StringBuilder builder = new StringBuilder();

        // Header row.
        builder.AppendLine(String.Join(",", columns));

        // One row per person; a missing attribute becomes an empty cell.
        foreach (Dictionary<string, string> row in rows)
        {
            List<string> values = new List<string>();
            foreach (string column in columns)
            {
                string value;
                values.Add(row.TryGetValue(column, out value) ? value : String.Empty);
            }
            builder.AppendLine(String.Join(",", values));
        }

        // The caller can write this CSV text to a file or similar.
        return builder.ToString();
    }

    private void ParseColonSeparatedItem(string textToSeparate, List<string> columns, Dictionary<string, string> row)
    {
        if (String.IsNullOrWhiteSpace(textToSeparate)) { return; }

        // Split on the first colon only, so a value may itself contain a colon.
        string[] colVals = textToSeparate.Split(new[] { ':' }, 2);
        if (colVals.Length < 2) { return; } // Not a Column:Value item; ignore it.

        // Register the column the first time it is seen. Its position in the
        // list determines its position in the output.
        if (!columns.Contains(colVals[0]))
        {
            columns.Add(colVals[0]);
        }

        row[colVals[0]] = colVals[1];
    }
}

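The output matches the above (\r\n is simply the Windows newline marker).

This approach demonstrates how a custom parser can work. It is deliberately verbose, since there is plenty of refactoring that could be done here; it is only an example.

Possible improvements include:

1) This function assumes there are no spaces within the actual text items themselves. That is a fairly big assumption and, if wrong, would require a different approach to parsing the line segments. However, this only needs to change in one place: as you read one line at a time you could apply a regex, or just read in characters and assume that everything after the first "Column:" part is a value.

2) There is no exception handling.

3) The text output is written without quotes. You could test each value to see whether it is a date or a number and, if not, wrap it in quotes, since other programs (such as Excel) will then make a better attempt at preserving the underlying data types.

4) It assumes there are no duplicate column names. If there are, you would have to check whether a column item has already been added and then create a ColName2 column in the parsing section.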
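As a rough illustration of improvement 3, the quoting rule could be sketched like this. The class `CsvQuoting` and the method `Quote` are made up for this example, and the date and number checks use the invariant culture, which is an assumption; your data may need a specific culture instead. Embedded double quotes are doubled, which is the usual CSV escaping convention.

```csharp
using System;
using System.Globalization;

public static class CsvQuoting
{
    // Returns the value unchanged if it parses as a number or a date and
    // contains no CSV special characters; otherwise wraps it in double
    // quotes, doubling any embedded quotes.
    public static string Quote(string value)
    {
        if (String.IsNullOrEmpty(value)) { return String.Empty; }

        double number;
        DateTime date;
        bool isNumber = Double.TryParse(value, NumberStyles.Float, CultureInfo.InvariantCulture, out number);
        bool isDate = DateTime.TryParse(value, CultureInfo.InvariantCulture, DateTimeStyles.None, out date);

        // Commas, quotes and line breaks always force quoting, even in dates.
        bool hasSpecial = value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;

        if (!hasSpecial && (isNumber || isDate)) { return value; }

        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }
}
```

Each value would be passed through `Quote` before it is appended to the output row.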

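This format isn't CSV. Do you want to format it as CSV, or are you only interested in how to read it?

Yes, I want to format it as CSV and then read it.

Is there always the same number of Column:Value variables? And, crucially, are they always in the same order?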