C#: Using a regular expression to extract objects and their attributes


Hello, I am working with a List result that contains the following strings (to simplify, I will use these sentences, but the scheme is the same):

01:01 A car consists of : wheels, engine, seats, 2 screws, a cotton lamp
01:02 A bike consists of : wheels
01:03 A car consists of : wheels, engine, seats, speakers, 5 screws, an indicator light
01:04 A small truck consists of : wheels, engine, seats, bed

So the pseudo-matcher and the desired output would be:

00-99:0-99(space)A|An(space){get the car/bike or any other as object}(space)consists(space)of(space):{get the elements in here exploding the commas as attributes}

Right now I use a foreach loop that iterates over my list and writes each line to a text box:

foreach (Message _msg in _objects.Messages)
{
    richTextBox1.AppendText(_msg.Text);
}

That appends the whole sentence to my text box. In pseudocode, this is what I would like instead:

foreach (Message _msg in _objects.Messages)
{
    richTextBox1.AppendText(ParseFunction(_msg.Text));
}

// Pseudocode for the function I am looking for
ParseFunction(text)
{
    // count the exploded elements and list them
    // remove the unwanted parts of the text
}

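After extracting the objects and attributes, I would like to sum them, taking into account whether they carry a count, and strip the leading a/an from them. This is the part where I am stuck.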

The desired output sums up any duplicates and the number of occurrences:

2x Car
4x Wheels
3x Engine
3x Seats
7x Screws
1x Cotton Lamp
1x Bike
1x Speakers
1x Indicator Light
1x Small Truck
1x Bed

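Could you show me at least the Regex? Maybe I can work out the counting myself and share it once it is done. I assume it has to be a function that is called inside the loop.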

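Here is what I came up with (I am sure it can be improved). Do whatever you want with all of this information.

This makes the following assumptions: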

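The data always starts with two digits ([\d]{2}), a colon (:), two more digits ([\d]{2}), a space, an A, an optional n ([n]?) (for A or An), and another space; all of this sits at the very beginning of the line (^).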

The name of the object (([a-zA-Z\s]+)) can include letters (a-z, A-Z) and whitespace (\s): at least one of these characters, and as many as possible.

Next come a space, the literal words consists of, another space, a colon (:), and one more space.


The attributes (([a-zA-Z,\s0-9]+)) can include letters (a-z, A-Z), commas (,), whitespace (\s), and digits (0-9): at least one of these characters, and as many as possible.

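The attributes are followed by the end of the string ($).

Finally, this assumes that the attributes are not empty or null: there is at least one character in the attributes.

Also, there is no error checking here. You should add it as needed.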
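
To make the assumptions above concrete, here is a minimal sketch that applies the pattern to a single line. The class name PatternDemo, the method TryParseLine, and the named groups name and attrs are illustrative choices, not part of the original answer; the pattern itself is the one described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Same pattern as described above; the named groups are an assumption added for readability.
    private static readonly Regex LinePattern = new Regex(
        @"^[\d]{2}:[\d]{2} A[n]? (?<name>[a-zA-Z\s]+) consists of : (?<attrs>[a-zA-Z,\s0-9]+)$");

    // Returns true and fills the out parameters when the line matches the expected format.
    public static bool TryParseLine(string line, out string name, out string[] attributes)
    {
        Match match = LinePattern.Match(line);
        name = match.Success ? match.Groups["name"].Value : null;
        attributes = match.Success
            ? match.Groups["attrs"].Value.Split(new[] { ", " }, StringSplitOptions.RemoveEmptyEntries)
            : new string[0];
        return match.Success;
    }
}
```

The name group is greedy, so it first swallows as much as it can and then backtracks until the literal " consists of : " can match. This is why a multi-word name such as small truck is still captured whole.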
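Everything seems to work. I added a few twists to make it a WinForms app, but where should the counting of the elements go? In the for loop over the attributes? I think I could put the elements into two lists, one with the types and a second one with the individual comma-separated elements, but isn't that the longer way?

What exactly do you mean? You can achieve this easily in several ways. I will add a couple of examples.

What do you mean by error checking? I have never done that before.

@Kavvson I rewrote it as a function. You should be able to use it more easily. By error checking, I mean that I have not tried to catch or handle any errors in the input.

You should be able to work out the problem from that error. You cannot pass a string object, you must pass a List<string> object: split the input on each newline, then add each line to a list to send to the function.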
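
The two points from the comments (splitting the input into a list, and basic error checking) could be sketched as follows. The class InputPreparation and the methods SplitLines and FindInvalidLines are hypothetical helpers introduced for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class InputPreparation
{
    // Splits raw text on Windows or Unix newlines and drops empty lines,
    // producing the List<string> that the parsing function expects.
    public static List<string> SplitLines(string rawText)
    {
        string[] lines = rawText.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        return new List<string>(lines);
    }

    // Very basic error checking: returns every line that does not match the expected format,
    // so the caller can log or display the lines instead of silently skipping them.
    public static List<string> FindInvalidLines(List<string> lines, Regex pattern)
    {
        var invalid = new List<string>();
        foreach (string line in lines)
        {
            if (!pattern.IsMatch(line))
                invalid.Add(line);
        }
        return invalid;
    }
}
```

A caller would pass the result of SplitLines to the parsing function and could use FindInvalidLines with the same pattern to decide whether to warn the user about malformed input.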
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static List<KeyValuePair<string, string[]>> ParseData(List<string> data)
{
    // Group 1 is the object name, group 2 is the comma-separated attribute list.
    Regex regex = new Regex(@"^[\d]{2}:[\d]{2} A[n]? ([a-zA-Z\s]+) consists of : ([a-zA-Z,\s0-9]+)$");
    var elementMap = new List<KeyValuePair<string, string[]>>();

    for (int i = 0; i < data.Count; i++)
    {
        var match = regex.Match(data[i]);

        // Skip lines that do not match the expected format.
        if (!match.Success)
            continue;

        var attributes = match.Groups[2].Value.Split(new string[] { ", " }, StringSplitOptions.RemoveEmptyEntries);
        elementMap.Add(new KeyValuePair<string, string[]>(match.Groups[1].Value, attributes));
    }

    return elementMap;
}

public static Dictionary<string, int> GetIndexedData(List<KeyValuePair<string, string[]>> data)
{
    Dictionary<string, int> displayObjects = new Dictionary<string, int>();

    foreach (KeyValuePair<string, string[]> item in data)
    {
        if (displayObjects.ContainsKey(item.Key))
            displayObjects[item.Key]++;
        else
            displayObjects.Add(item.Key, 1);

        foreach (string key2 in item.Value)
        {
            string[] attributeValues = key2.Split(' ');
            int add = 1;
            string addValue = key2;
            int c = 0;

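            // A leading number ("2 screws") is the count; everything after it is the name.
            if (attributeValues.Length > 1 && int.TryParse(attributeValues[0], out c))
            {
                add = c;
                addValue = key2.Substring(attributeValues[0].Length + 1);
            }

            // Strip a leading article. StartsWith is safe on short strings,
            // whereas Substring(0, n) throws when the string is shorter than n.
            if (addValue.StartsWith("a ", StringComparison.Ordinal))
                addValue = addValue.Substring(2);
            else if (addValue.StartsWith("an ", StringComparison.Ordinal))
                addValue = addValue.Substring(3);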

            if (displayObjects.ContainsKey(addValue))
                displayObjects[addValue] += add;
            else
                displayObjects.Add(addValue, add);
        }
    }

    return displayObjects;
}

List<string> data = new List<string>();
data.Add("01:01 A car consists of : wheels, engine, seats, 2 screws, a cotton lamp");
data.Add("01:02 A bike consists of : wheels");
data.Add("01:03 A car consists of : wheels, engine, seats, speakers, 5 screws, an indicator light");
data.Add("01:04 A small truck consists of : wheels, engine, seats, bed");
var elementMap = ParseData(data);

var displayObjects = GetIndexedData(elementMap);

foreach (string key in displayObjects.Keys)
{
    Console.WriteLine(key + ": " + displayObjects[key]);
}
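
The loop above prints entries such as car: 2, while the question asks for the form 2x Car. A minimal sketch of that formatting step, assuming the dictionary returned by GetIndexedData as input; the class OutputFormatting and the method FormatCounts are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class OutputFormatting
{
    // Turns each (name, count) pair into a line like "7x Screws".
    // ToTitleCase capitalizes the first letter of every word, so "cotton lamp" becomes "Cotton Lamp".
    public static List<string> FormatCounts(Dictionary<string, int> counts)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        var lines = new List<string>();

        foreach (KeyValuePair<string, int> entry in counts)
        {
            lines.Add(entry.Value + "x " + textInfo.ToTitleCase(entry.Key));
        }

        return lines;
    }
}
```

Each resulting line could then be passed to richTextBox1.AppendText together with Environment.NewLine, or written with Console.WriteLine as in the example above.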

// Inside the loop in ParseData:
var match = regex.Match(data[i]);
// 'match.Groups[1].Value' is the name of the item
// 'match.Groups[2].Value' is the comma-separated list

// The following line will split all the attributes on ', ' therefore leaving them as just the words. (`wheels`, `engine`, `seats`)
var attributes = match.Groups[2].Value.Split(new string[] { ", " }, StringSplitOptions.RemoveEmptyEntries);