Regex: extracting specific text from content


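I am working with email content that was dumped into the rows of a data frame as values, and I need to split it into separate messages, each one running from a From: line up to the next From: line.

Sample data: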

"______________________________________________ 
From:   Kumar M,  
Sent:   Tuesday, 21 October 2014 7:30 AM
To: Deo, Ravinesh; G S, Venkatesh;
Cc: Monteleone, Elif; Kabyanga, Isaac
Subject:    FW: Please Approve the Qlik  Access.


Hi Ravi,

We will work on the providing David access to Ql and an email will be sent out once the access is set up.   

Regards,
Santhosh

______________________________________________ 
From:   Deo, Ravinesh  
Sent:   Tuesday, 21 October 2014 7:20 AM
To: Kabyanga, Isaac; Kumar M, Santhosh
Cc: Monteleone, Elif
Subject:    FW: Please Approve the Qlikview Access.

Hi Isaac/Santhosh,

Appreciate if you can grant access to David Dennis for GPA – Timor.

David is CEO Timor Leste.

Thanks
Ravi

_____________________________________________
From: Dennis, David (Timor) 
Sent: Tuesday, 21 October 2014 11:34 AM
To: Deo, Ravinesh
Subject: FW: Please Approve the Q GPA Access.

Here you go - appreciate your help Rgds

______________________________________________ 
From:   Dennis, David (Timor)  
Sent:   Thursday, 9 October 2014 11:33 AM
To: Buchanan, Geoffrey (Solomon Islands)
Subject:    Please Approve the Qlikview Access.

Hello,

Can you please review the attached form and click ' Manager Approval' to approve.

Thanks"
I have looked at other references and tried the code below:

ex <- gsub("^[from:](.*?)[from:]$", "",impordata$Problem.Description[i] )

I also tried regmatches, but that did not work either.

Can anyone correct it, or offer some help?

You can use the strsplit function:

> strsplit(gsub("(?s)^_+\\s+", "", x, perl=T) , "_+\\s*(?=From:)", perl=T)[[1]]
[1] "From:   Kumar M,  \nSent:   Tuesday, 21 October 2014 7:30 AM\nTo: Deo, Ravinesh; G S, Venkatesh;\nCc: Monteleone, Elif; Kabyanga, Isaac\nSubject:    FW: Please Approve the Qlik  Access.\n\n\nHi Ravi,\n\nWe will work on the providing David access to Ql and an email will be sent out once the access is set up.   \n\nRegards,\nSanthosh\n\n"
[2] "From:   Deo, Ravinesh  \nSent:   Tuesday, 21 October 2014 7:20 AM\nTo: Kabyanga, Isaac; Kumar M, Santhosh\nCc: Monteleone, Elif\nSubject:    FW: Please Approve the Qlikview Access.\n\nHi Isaac/Santhosh,\n\nAppreciate if you can grant access to David Dennis for GPA – Timor.\n\nDavid is CEO Timor Leste.\n\nThanks\nRavi\n\n"               
[3] "From: Dennis, David (Timor) \nSent: Tuesday, 21 October 2014 11:34 AM\nTo: Deo, Ravinesh\nSubject: FW: Please Approve the Q GPA Access.\n\nHere you go - appreciate your help Rgds\n\n"                                                                                                                                                           
[4] "From:   Dennis, David (Timor)  \nSent:   Thursday, 9 October 2014 11:33 AM\nTo: Buchanan, Geoffrey (Solomon Islands)\nSubject:    Please Approve the Qlikview Access.\n\nHello,\n\nCan you please review the attached form and click ' Manager Approval' to approve.\n\nThanks"
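
The call above works in two steps, which can be sketched on a smaller input. The string x below is a hypothetical miniature of the email dump, not the real data. The gsub-style step removes the separator at the very start so the first piece is not empty; the split pattern then consumes each remaining run of underscores, and the lookahead (?=From:) checks for "From:" without consuming it, so every piece still begins with its header.

```r
# Hypothetical miniature of the email dump: three messages separated by
# runs of underscores, with the first run at the very start of the string.
x <- paste0(
  "______\nFrom: A\nSent: Mon\n\nHi\n\n",
  "______\nFrom: B\nSent: Tue\n\nHello\n\n",
  "_____\nFrom: C\nSent: Wed\n\nThanks"
)

# Step 1: drop the leading separator so the first piece is not empty.
trimmed <- sub("^_+\\s+", "", x, perl = TRUE)

# Step 2: split on every remaining separator that is followed by "From:".
# The lookahead keeps "From:" at the start of each resulting piece.
msgs <- strsplit(trimmed, "_+\\s*(?=From:)", perl = TRUE)[[1]]

length(msgs)
cat(msgs[2])
```

Lookahead syntax is only available with perl = TRUE, which is why both calls set it.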
Take a look at strsplit:


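What is the desired output?

@karthikmanchala I have already given the desired output above!

I used this, as shown in my edited question, but it did not work; I got [[1]] character(0)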
From:   Deo, Ravinesh  
    Sent:   Tuesday, 21 October 2014 7:20 AM
    To: Kabyanga, Isaac; Kumar M, Santhosh
    Cc: Monteleone, Elif
    Subject:    FW: Please Approve the Qlikview Access.

    Hi Isaac/Santhosh,

    Appreciate if you can grant access to David Dennis for GPA – Timor.

    David is CEO Timor Leste.

    Thanks
    Ravi
 From: Dennis, David (Timor) 
    Sent: Tuesday, 21 October 2014 11:34 AM
    To: Deo, Ravinesh
    Subject: FW: Please Approve the Q GPA Access.

    Here you go - appreciate your help Rgds
From:   Dennis, David (Timor)  
    Sent:   Thursday, 9 October 2014 11:33 AM
    To: Buchanan, Geoffrey (Solomon Islands)
    Subject:    Please Approve the Qlikview Access.

    Hello,

    Can you please review the attached form and click ' Manager Approval' to approve.

    Thanks"
# Convert one row to a vector so that regmatches can be applied
vec <- as.vector(impordata$Problem.Description[1])

matc <- regmatches(vec, gregexpr("(^[from:]).*?($[from:])", vec, perl = TRUE))
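
The regmatches attempt likely returns character(0) for two reasons. First, [from:] is a character class that matches a single character out of f, r, o, m and ":", not the literal word "From:". Second, $[from:] asks for a character after the end of the string, which cannot match. A minimal sketch of a regmatches variant that does extract each message is shown below; the input x and the pattern are illustrative assumptions, not code from the thread.

```r
# Hypothetical two-message input in the same layout as the sample data.
x <- paste0(
  "______\nFrom: A\nSent: Mon\n\nHi\n\n",
  "______\nFrom: B\nSent: Tue\n\nHello"
)

# (?s) lets "." match newlines. The lazy ".*?" grows each match only until
# the lookahead sees the next separator plus "From:", or the end of the string.
pattern <- "(?s)From:.*?(?=_+\\s*From:|$)"

msgs <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]

length(msgs)
cat(msgs[1])
```

Compared with strsplit, this form needs no separate step to drop the leading separator, because matching simply starts at the first literal "From:".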
splits <- strsplit(paste0(vec, collapse = "\n"), "_{45}")[[1]][-1]
cat(splits) 
cat(splits[1])
# _ 
# From:   Kumar M,  
# Sent:   Tuesday, 21 October 2014 7:30 AM
# To: Deo, Ravinesh; G S, Venkatesh;
# Cc: Monteleone, Elif; Kabyanga, Isaac
# Subject:    FW: Please Approve the Qlik  Access.
# 
# 
# Hi Ravi,
# 
# We will work on the providing David access to Ql and an email will be sent out once the access is set up.   
# 
# Regards,
# Santhosh
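
The stray "_" at the top of the output above appears because some separators in the sample are 46 underscores long while the pattern consumes exactly 45. One possible adjustment, sketched here on a hypothetical input rather than the real data frame, is to match 45 or more underscores together with the whitespace that follows, and then discard empty pieces.

```r
# Hypothetical input with separators of 46 and 45 underscores,
# mirroring the mixed lengths in the sample data.
sep46 <- strrep("_", 46)
sep45 <- strrep("_", 45)
vec <- paste0(
  sep46, " \nFrom: A\nBody one\n\n",
  sep45, "\nFrom: B\nBody two"
)

# "_{45,}" matches 45 or more underscores; "\\s*" also removes the
# whitespace and line break that follow the separator.
pieces <- strsplit(vec, "_{45,}\\s*", perl = TRUE)[[1]]

# Drop the empty first piece and trim surrounding whitespace.
pieces <- trimws(pieces[nzchar(pieces)])

cat(pieces[1])
```

The threshold of 45 is taken from the answer above; a shorter minimum would also work as long as the message bodies never contain a run of underscores that long.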