R: How to remove blocks of text of varying length from different texts in a character vector?

r regex

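I have a character vector holding 231 documents (231 rows by one column). Each document begins with a block of text that I want to remove from all 231 documents. The problem is that the length of this block varies from document to document.

I tried several options, but without result.

Let's take an example in which every text begins with a block of text that I wish to remove: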

x <- c("Text that I wish to remove because I don't like it. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.", 
  "Text that I wish to remove. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.", 
  "Text that I wish to remove and I will remove it because some great data analyst will help me solve it. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.", 
  "Text that I wish to remove and who know whether I manage to make it work, it could be and it could not be. I really want to remove the text but I cannot do it. I hope that stackoverflow will sort it out.")

Can anyone help me?

Thanks a lot.

You can use the following code:

  gsub("^.+\\. ", "", x)

[1] "I hope that stackoverflow will sort it out."
[2] "I hope that stackoverflow will sort it out."
[3] "I hope that stackoverflow will sort it out."
[4] "I hope that stackoverflow will sort it out."
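The call above leaves only the final sentence because `.+` is greedy: it runs up to the last period followed by a space. The following is a minimal sketch of the difference between the greedy and the lazy quantifier, assuming the goal is to drop only the opening sentence. The vector `docs` and the names `greedy` and `lazy` are illustrative and do not come from the question; `perl = TRUE` is used here only to make the lazy quantifier explicit.

```r
# Two short documents; the opening sentence differs in length.
docs <- c("Remove me. Keep this. And this.",
          "Remove this much longer opening sentence. Keep this. And this.")

# Greedy: ".+" runs to the LAST ". ", so only the final sentence is left.
greedy <- sub("^.+\\. ", "", docs, perl = TRUE)

# Lazy: ".+?" stops at the FIRST ". ", so only the opening sentence is dropped.
lazy <- sub("^.+?\\. ", "", docs, perl = TRUE)

print(greedy)
print(lazy)
```

Which variant is appropriate depends on how the block to remove is defined; the lazy form is only safe if the opening block never contains a period followed by a space.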

Split on ". " (a period followed by a space), then take the last sentence:

sapply(strsplit(x, ". ", fixed = TRUE), tail, n = 1)
# [1] "I hope that stackoverflow will sort it out."
# [2] "I hope that stackoverflow will sort it out."
# [3] "I hope that stackoverflow will sort it out."
# [4] "I hope that stackoverflow will sort it out."
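The same splitting idea can be extended to keep the last two sentences rather than only the last one. This is a rough sketch under the assumption that sentences are always separated by ". "; the vector `docs` and the name `last_two` are hypothetical and not part of the original answer.

```r
# Two short documents; the opening sentence differs in length.
docs <- c("Remove me. Keep this. And this.",
          "Remove this much longer opening sentence. Keep this. And this.")

# Split each document on the literal separator ". ".
pieces <- strsplit(docs, ". ", fixed = TRUE)

# Keep the last two pieces of each document and rejoin them with the separator.
last_two <- vapply(pieces,
                   function(s) paste(tail(s, n = 2), collapse = ". "),
                   character(1))

print(last_two)
```

`vapply` is used instead of `sapply` so that the result is guaranteed to be a character vector with one element per document. A document with fewer than two sentences is returned unchanged.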

Comments:

How should the text to be removed be identified? You need to define some rule that a computer can understand. Is it everything up to and including the first period?

The only logic I can see at the moment is to keep the last two sentences. Is that correct?

@zx8754 Yes, that is correct. The question MrFlick asked is exactly what I need. How can I translate "keep only the last two sentences" into code?

Yes, I tested it. The code I wrote before the edit works as well. It was gsub(".+\\.", "", x), but it is not safe.

@Arma_91 Glad it works, it just does not match your expected output.

@zx8754 True. However, I now understand the reasoning behind it, so I can adapt it to my texts.

@Arma_91 To avoid confusing future readers, I suggest you edit the expected output in the question.

Thanks @zx8754. I just noticed that my output differs from the expected output. To get the expected output given by @Arma_91, the correct code should be gsub("^.+?\\. ", "", x) or stringi::stri_replace_first(x, "", regex = ".+?\\. ").