C# regex for detecting chat messages


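I'm currently trying to develop a piece of software to properly view WhatsApp messages saved in .txt format (sent by email), and I'm trying to write a parser for them. I've been trying to use Regex for the past 3 hours without finding a solution, since I've barely used Regex before.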

The messages look like this:

16.08.2015, 18:30 - Person 1: Some multiline text here
still in the message
16.08.2015, 18:31 - Person 2: some other message which could be multiline
16.08.2015, 18:33 - Person 1: once again
I'm trying to split them correctly by matching them with a regex (like this)


I've been trying with a very messy regex that looks like \d\d\\.\d\d\\.[…]

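I wouldn't use a single regex for this. Instead, I'd just use a StreamReader: for each line you read, check (with a regex) whether it is a "chat start" line. If it is, a new message begins; if it isn't, the line belongs to the message that is already being collected. That way you keep track of whether to append to the current message or start a new one. I wrote a quick extension method to demonstrate this: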

using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class ChatReader
{
    // Matches the header of a "chat start" line, e.g. "16.08.2015, 18:30 - Person 1:".
    // Anchored with ^ so the same text inside a message body does not count.
    private static readonly Regex ChatStart =
        new Regex(@"^\d\d\.\d\d\.\d\d\d\d, \d\d:\d\d - .*?:");

    public static IEnumerable<string> ReadChatMessages(this TextReader reader)
    {
        // Locals instead of static fields, so two enumerations cannot interfere.
        string message = null;
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            if (ChatStart.IsMatch(line))
            {
                // A new message starts: emit the one collected so far.
                if (message != null)
                    yield return message;
                message = line;
            }
            else if (message != null)
            {
                // Continuation line of a multiline message.
                message += '\n' + line;
            }
            // Lines before the first "chat start" line are skipped.
        }

        // Emit the last message, if there was one.
        if (message != null)
            yield return message;
    }
}
It can be used like this:

List<string> chatMessages = reader.ReadChatMessages().ToList();
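Since the goal is a parser, each message returned above can then be broken into its parts. The following is only a minimal sketch: the ChatMessage and ChatMessageParser names are hypothetical (they are not part of the extension method above), and it assumes that every message has the "dd.MM.yyyy, HH:mm - Sender: text" layout from the sample and that the sender name does not contain ": ".

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical container for one parsed message.
public sealed class ChatMessage
{
    public DateTime Timestamp { get; set; }
    public string Sender { get; set; }
    public string Text { get; set; }
}

public static class ChatMessageParser
{
    // Same header layout as the "chat start" pattern, with named groups.
    // RegexOptions.Singleline lets '.' match '\n', so a multiline body is captured whole.
    private static readonly Regex Header = new Regex(
        @"^(?<stamp>\d\d\.\d\d\.\d\d\d\d, \d\d:\d\d) - (?<sender>.*?): (?<text>.*)$",
        RegexOptions.Singleline);

    // Returns null when the input does not start with a message header.
    public static ChatMessage Parse(string message)
    {
        Match m = Header.Match(message);
        if (!m.Success)
            return null;

        return new ChatMessage
        {
            // Assumed format: day.month.year, 24-hour time, as in the sample.
            Timestamp = DateTime.ParseExact(
                m.Groups["stamp"].Value, "dd.MM.yyyy, HH:mm",
                CultureInfo.InvariantCulture),
            Sender = m.Groups["sender"].Value,
            Text = m.Groups["text"].Value
        };
    }
}
```

Calling ChatMessageParser.Parse on each string from the list gives the timestamp, sender and text separately; anything that does not match the assumed layout comes back as null and has to be handled by the caller.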
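If the whole export is small enough to hold in memory as one string, splitting with a single regex is also possible. This is a different technique from the line-by-line reader, shown only as a sketch: ChatSplitter is a hypothetical name, and the pattern assumes the same header layout as the sample messages.

```csharp
using System.Text.RegularExpressions;

public static class ChatSplitter
{
    // Splits at every line break that is directly followed by a message header.
    // The lookahead (?=...) consumes nothing, so the header stays in the next piece.
    private static readonly Regex Boundary = new Regex(
        @"\r?\n(?=\d\d\.\d\d\.\d\d\d\d, \d\d:\d\d - .*?:)");

    public static string[] SplitMessages(string chatText)
    {
        // Trim trailing line breaks so the last message does not end with one.
        return Boundary.Split(chatText.TrimEnd('\r', '\n'));
    }
}
```

Line breaks inside a message body are left untouched, because they are not followed by a header. For large files the streaming reader is likely the better fit, since it never loads the full text at once.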

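What is your regex? Please post it. Do you only need to extract 16.08.2015, 18:30, 16.08.2015, 18:31 and 16.08.2015, 18:33?
Please edit your question, because it isn't clear how you want your messages parsed and where you are stuck. What is wrong with the output you are getting? What output do you want?
Oops, do you want the messages to look like your last box, or do they look like that now and that's not what you want?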