C#: split comma-separated values

c# .net regex csv

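How can I split a comma-separated string that may also contain quoted strings with commas inside them?

Example input:

John, Doe, "Sid, Nency", Smith

Expected output:

John
Doe
Sid, Nency
Smith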

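Splitting on commas alone would be fine, but I have a requirement to allow strings like "Sid, Nency". I tried to split the values with a regular expression. The regex

",(?=([^\"]*\"[^\"]*\")*[^\"]*$)"

comes from a Java question, and it does not work in my .NET code. It duplicates some strings, returns extra results, and so on.

So what is the best way to split such strings?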

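Just walk through the string. As you go, keep track of whether you are
inside a quoted "block". If you are, do not treat a comma as a
separator; otherwise, do. It is a simple algorithm, and I would write it
myself. When you encounter the first " you enter a block. When you
encounter the next " you leave that block, and so on.
So a single pass through the string is enough.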

import java.util.ArrayList;


public class Test003 {

    public static void main(String[] args) {
        String s = "  John, , , , \" Barry, John  \" , , , , , Doe, \"Sid ,  Nency\", Smith  ";

        StringBuilder term = new StringBuilder();
        boolean inQuote = false;
        boolean inTerm = false;
        ArrayList<String> terms = new ArrayList<String>();
        for (int i=0; i<s.length(); i++){
            char ch = s.charAt(i);
            if (ch == ' '){
                if (inQuote){
                    if (!inTerm) { 
                        inTerm = true;
                    }
                    term.append(ch);
                }
                else {
                    if (inTerm){
                        terms.add(term.toString());
                        term.setLength(0);
                        inTerm = false;
                    }
                }
            }else if (ch== '"'){
                term.append(ch); // comment this out if you don't need it
                if (!inTerm){
                    inTerm = true;
                }
                inQuote = !inQuote;
            }else if (ch == ','){
                if (inQuote){
                    if (!inTerm){
                        inTerm = true;
                    }
                    term.append(ch);
                }else{
                    if (inTerm){
                        terms.add(term.toString());
                        term.setLength(0);
                        inTerm = false;
                    }
                }
            }else{
                if (!inTerm){
                    inTerm = true;
                }
                term.append(ch);
            }
        }

        if (inTerm){
            terms.add(term.toString());
        }

        for (String t : terms){
            System.out.println("|" + t + "|");
        }

    }



}

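This is because of the capturing group. Just turn it into a non-capturing group (shown as a C# verbatim string, where "" stands for a single quote character):
@",(?=(?:[^""]*""[^""]*"")*[^""]*$)"
       ^^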

The capturing group causes the captured text to be included in the results.

Then just trim the results.
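As a minimal sketch of how the non-capturing pattern and the trimming step might fit together, the following helper splits one line and cleans up each field. The class name `QuotedCsvSplitter` and its `Split` method are hypothetical names chosen for this illustration. It also assumes that quotes are balanced and that fields never contain escaped quotes.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
public static class QuotedCsvSplitter
{
    // Matches a comma that is followed by an even number of double quotes
    // up to the end of the input, i.e. a comma outside any quoted section.
    // The inner group is non-capturing (?: ... ), so Regex.Split does not
    // add the group's text to the result array.
    private static readonly Regex Separator =
        new Regex(@",(?=(?:[^""]*""[^""]*"")*[^""]*$)");

    public static string[] Split(string line)
    {
        return Separator.Split(line)
            // Remove surrounding whitespace first, then the enclosing quotes.
            .Select(field => field.Trim().Trim('"'))
            .ToArray();
    }
}
```

For the sample input `John, Doe, "Sid, Nency", Smith`, `QuotedCsvSplitter.Split` yields the four fields `John`, `Doe`, `Sid, Nency` and `Smith`. Because the lookahead rescans the rest of the line for every comma, a single-pass state machine may be preferable for very long lines.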
I use the following code in my Csv parser class to achieve this:
    private string[] ParseLine(string line)
    {
        List<string> results = new List<string>();
        bool inQuotes = false;
        int index = 0;
        StringBuilder currentValue = new StringBuilder(line.Length);
        while (index < line.Length)
        {
            char c = line[index];
            switch (c)
            {
                case '\"':
                    {
                        inQuotes = !inQuotes;
                        break;
                    }

                default:
                    {
                        if (c == ',' && !inQuotes)
                        {
                            results.Add(currentValue.ToString());
                            currentValue.Clear();
                        }
                        else
                            currentValue.Append(c);
                        break;
                    }
            }
            ++index;
        }

        results.Add(currentValue.ToString());
        return results.ToArray();
    }   // eo ParseLine

If you find the regular expression too complex, you can do it like this:
string initialString = "John, Doe, \"Sid, Nency\", Smith";

IEnumerable<string> splitted = initialString.Split('"');
splitted = splitted.SelectMany((str, index) => index % 2 == 0 ? str.Split(',') : new[] { str });
splitted = splitted.Where(str => !string.IsNullOrWhiteSpace(str)).Select(str => str.Trim());

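It looks like you are dealing with CSV input? If so, use a CSV library. There are plenty of good ones, and it will save you a lot of pain!! If not, please clarify your question and explain why a CSV library is not suitable…

No, it is not a CSV document. It is just a string.

RB, I would be glad if you showed me how to handle this with a CSV library.

One hacky way is to first split on " and then split the alternate strings (in the resulting array) on ,.

Perl solution here (since you put the tag back):

@AndreiMikhalevich OK, I posted some code as an illustration.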