
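Find the row ID that contains the string of one column for every row of another column in R/RStudio

Tags: r, dataframe

If a row in my data frame has text in the "ReferenceText" column, then the corresponding text in the "Text" column is a reply comment. If "ReferenceText" is NA, the corresponding text in the "Text" column is an original post.

I would like to use regular expressions if possible (I was thinking of gregexpr and regmatches), but I am open to any other form of pattern matching in R/RStudio that does the following: for each "ReferenceText" observation, find the matching text in the "Text" column and put the "ID" of that "Text" observation in my "PostID" column.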




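For example, suppose an original post (PostID=5) has been replied to, so that its "Text" appears in "ReferenceText" on rows 6 and 9. The "PostID" observation on row 6 would then be labeled "PostID=5", and its "Sequence" observation would be labeled "PostID=5 Sequence=1". When the original post is replied to or repeated again in "ReferenceText" (row 9), the "Sequence" observation would be labeled "PostID=5 Sequence=2". I have a fairly large dataset (more than 160,000 observations), so I would greatly appreciate a function that does this. Any ideas?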
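To make the desired result concrete, here is a minimal sketch of one possible approach, not a definitive answer. It runs on a small made-up data frame, since the real `sampleDF` is only given as a `dput` dump. The function name `link_replies` and the toy data are hypothetical. It also assumes that "NA" is stored as a literal string, that trailing whitespace can be trimmed, and that a reference may be a truncated prefix of the original post (as with the facebook.com URL in the sample). It uses literal matching (`match` and `startsWith`) rather than regular expressions, because the posts contain characters such as `?` and `.` that a regex would treat as metacharacters.

```r
# Toy stand-in for sampleDF (hypothetical data, same column names).
df <- data.frame(
  ID = 1:6,
  Text = c("How do I sign up?", "Hope I win!", "Here is the link.",
           "When is it out? See http://example.com/abc", "Thanks!",
           "In April."),
  ReferenceText = c("NA", "NA", "How do I sign up?", "NA",
                    "How do I sign up? ",  # trailing space on purpose
                    "When is it out? See http://example.com/a"),
  stringsAsFactors = FALSE
)

link_replies <- function(df) {
  ref <- trimws(df$ReferenceText)
  txt <- trimws(df$Text)
  # A row is a reply when ReferenceText is neither missing nor the string "NA".
  is_reply <- !is.na(ref) & ref != "NA" & nzchar(ref)

  # Step 1: vectorised exact lookup of each reference among the texts.
  idx <- rep(NA_integer_, nrow(df))
  idx[is_reply] <- match(ref[is_reply], txt)

  # Step 2: for references with no exact hit, fall back to a literal
  # prefix match and take the first text that starts with the reference.
  todo <- which(is_reply & is.na(idx))
  for (r in unique(ref[todo])) {
    hit <- which(startsWith(txt, r))[1]
    idx[todo[ref[todo] == r]] <- hit
  }

  df$PostID <- df$ID[idx]

  # Sequence: running count of replies within each PostID, in row order.
  df$Sequence <- NA_integer_
  ok <- !is.na(df$PostID)
  if (any(ok)) {
    df$Sequence[ok] <- ave(seq_along(df$PostID[ok]), df$PostID[ok],
                           FUN = seq_along)
  }
  df
}

result <- link_replies(df)
print(result[, c("ID", "PostID", "Sequence")])
```

On the toy data, rows 3 and 5 are linked to ID 1 with Sequence 1 and 2, and row 6 is linked to ID 4 through the prefix fallback. The exact `match` step is cheap even at 160,000 rows. The fallback loop scans the whole "Text" column once per distinct unmatched reference, so its cost depends on how many references are truncated.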



## dput output assigned to my sampleDF data frame
> dput(sampleDF)
structure(list(ID = 1:30, Screen.Name = c("User 1", "User 2", 
"User 3", "User 4", "User 5", "User 6", "User 7", "User 8", "User 5", 
"User 9", "User 9", "User 1", "User 1", "User 10", "User 8", 
"User 11", "Company", "User 12", "User 13", "User 14", "User 15", 
"User 16", "User 17", "User 18", "User 19", "User 13", "User 20", 
"User 21", "Uer 21", "User 21"), Text = c("Can anyone tell me where in the bloody world this TROLL came from.  Is he the troll of the week at the national trolling academy?  https://www.facebook.com/joseph.barnhorst", 
"company's \"You're Kinda a Big Deal\" promotion is kinda lame and insulting.  How about a service that actually is up to speed as advertised?  Now THATwould be a big deal for company.", 
"Hope I win sumthing!", "Im paying 90 dollars for a reason, so fix whatever is broken so I can actually use my phone!", 
"How do you sign up for the Your A Big Deal Sweepstakes?", "http://company.promo.eprize.com/sweepstakes/:b=chrome/?INTCID=TSC:MyS:MyA:Skn:013113:EngagementSweeps#", 
"Thanks for your giant mess up. I'm down 370 dollars.", "When will the blackberry 10 be available ?", 
"Thank you for the link but should the email get you there also?  I clicked on the mobile ad in and e-bill options to finish the sign up for those and received an error message for both.  Perhaps a link isn't working properly there either.", 
"Wait, Could it be Joseph is upset cause his Milkshake didn't bring the boys to his yard? Must see, look at this > http://www.youtube.com/watch?v=gFK8zYYoMtQ", 
"What a putz.... lol", "LMBO!", "The Blackberry Q10 will be available to US carriers in April.", 
"My mobile hot spot just shut off Randomly and now It tell me to set it up again I already have it on my plan", 
"I'm really interested in seeing this phone, I hope it's as good if not better than iPhones cause nothing new has challenged apple really", 
"Turn off the LTE in Carlisle it does work like at all. Or please fix. Won't load anything under LTE", 
"Need a hand? Check out this redesigned umbrella handle that lets you keep texting even during a downpour. http://bit.ly/12fniVl", 
"why we are still having service issues,  ", "unlimited data isn't worth anything when you can't get service with or without a femtocell and tech support has been next to useless over the past few months.", 
"Because everyone should be texting while walking in the rain... face palm.", 
"When is the LTE going to be available in NYC?? I was told end of last year... but it's Feb now.....", 
"Get us 4g already", "So silly. But people will buy it I am sure", 
"rubbish", "Free umbrella with company phones?! I bet it helps with the sewage internet connection you guys have.", 
"oh and more loveliness.. just had a company rep hang up on me..this is twice...nice job.", 
"Oh wow", "The pressure is getting to them. The CEO has put them in a no win situation.", 
"Never", "LTE means Lying To Everyone."), ReferenceText = c("NA", 
"NA", "NA", "NA", "NA", "How do you sign up for the Your A Big Deal Sweepstakes?", 
"NA", "NA", "How do you sign up for the Your A Big Deal Sweepstakes?", 
"Can anyone tell me where in the bloody world this TROLL came from.  Is he the troll of the week at the national trolling academy?  https://www.facebook.com/joseph", 
"Can anyone tell me where in the bloody world this TROLL came from.  Is he the troll of the week at the national trolling academy?  https://www.facebook.com/joseph", 
"Can anyone tell me where in the bloody world this TROLL came from.  Is he the troll of the week at the national trolling academy?  https://www.facebook.com/joseph", 
"When will the blackberry 10 be available ?", "NA", "When will the blackberry 10 be available ?", 
"NA", "NA", "NA", "NA", "Need a hand? Check out this redesigned umbrella handle that lets you keep texting even during a downpour. http://bit.ly/12fniVl", 
"NA", "Need a hand? Check out this redesigned umbrella handle that lets you keep texting even during a downpour. http://bit.ly/12fniVl", 
"Need a hand? Check out this redesigned umbrella handle that lets you keep texting even during a downpour. http://bit.ly/12fniVl", 
"Need a hand? Check out this redesigned umbrella handle that lets you keep texting even during a downpour. http://bit.ly/12fniVl", 
"Need a hand? Check out this redesigned umbrella handle that lets you keep texting even during a downpour. http://bit.ly/12fniVl", 
"NA", "Need a hand? Check out this redesigned umbrella handle that lets you keep texting even during a downpour. http://bit.ly/12fniVl", 
"oh and more loveliness.. just had a company rep hang up on me..this is twice...nice job. ", 
"When is the LTE going to be available in NYC?? I was told end of last year... but it's Feb now.....", 
"NA"), PostID = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA), Sequence = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA), DATE = c("2/1/2013", "2/1/2013", "2/1/2013", 
"2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", 
"2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", 
"2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", 
"2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", "2/1/2013", 
"2/1/2013", "2/1/2013", "2/1/2013"), X_M__millitary_time_ = c("16:46:20", 
"16:52:07", "16:55:54", "17:08:41", "17:10:08", "17:13:01", "17:13:17", 
"17:15:17", "17:19:01", "17:36:39", "17:41:08", "17:42:44", "17:45:42", 
"17:50:08", "17:50:53", "17:53:25", "18:00:01", "18:01:18", "18:03:37", 
"18:04:26", "18:05:41", "18:10:58", "18:11:17", "18:11:20", "18:11:41", 
"18:11:58", "18:13:19", "18:17:13", "18:18:34", "18:19:53"), 
timestampM = c("2/1/2013 16:46", "2/1/2013 16:52", "2/1/2013 16:55", 
"2/1/2013 17:08", "2/1/2013 17:10", "2/1/2013 17:13", "2/1/2013 17:13", 
"2/1/2013 17:15", "2/1/2013 17:19", "2/1/2013 17:36", "2/1/2013 17:41", 
"2/1/2013 17:42", "2/1/2013 17:45", "2/1/2013 17:50", "2/1/2013 17:50", 
"2/1/2013 17:53", "2/1/2013 18:00", "2/1/2013 18:01", "2/1/2013 18:03", 
"2/1/2013 18:04", "2/1/2013 18:05", "2/1/2013 18:10", "2/1/2013 18:11", 
"2/1/2013 18:11", "2/1/2013 18:11", "2/1/2013 18:11", "2/1/2013 18:13", 
"2/1/2013 18:17", "2/1/2013 18:18", "2/1/2013 18:19"), timestampN = c("2/1/2013 16:46", 
"2/1/2013 16:52", "2/1/2013 16:55", "2/1/2013 17:08", "2/1/2013 17:10", 
"2/1/2013 17:13", "2/1/2013 17:13", "2/1/2013 17:15", "2/1/2013 17:19", 
"2/1/2013 17:36", "2/1/2013 17:41", "2/1/2013 17:42", "2/1/2013 17:45", 
"2/1/2013 17:50", "2/1/2013 17:50", "2/1/2013 17:53", "2/1/2013 18:00", 
"2/1/2013 18:01", "2/1/2013 18:03", "2/1/2013 18:04", "2/1/2013 18:05", 
"2/1/2013 18:10", "2/1/2013 18:11", "2/1/2013 18:11", "2/1/2013 18:11", 
"2/1/2013 18:11", "2/1/2013 18:13", "2/1/2013 18:17", "2/1/2013 18:18", 
"2/1/2013 18:19")), .Names = c("ID", "Screen.Name", "Text", 
"ReferenceText", "PostID", "Sequence", "DATE", "X_M__millitary_time_", 
"timestampM", "timestampN"), class = "data.frame", row.names = c(NA, 
