How to remove text patterns ending with a colon in R?

r regex

I have the following sentence:

review <- c("1a. How long did it take for you to receive a personalized response to an internet or email inquiry made to THIS dealership?: Approx. It was very prompt however. 2f. Consideration of your time and responsiveness to your requests.: Were a little bit pushy but excellent otherwise 2g. Your satisfaction with the process of coming to an agreement on pricing.: Were willing to try to bring the price to a level that was acceptable to me. Please provide any additional comments regarding your recent sales experience.: Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)! ")

However, my attempt only removes the first sentence that ends with a colon.

Expected result:

Approx. It was very prompt however. Were a little bit pushy but excellent otherwise Were willing to try to bring the price to a level that was acceptable to me. Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)!

Any help or suggestions would be much appreciated. Thanks.

If the sentences are not complex and contain no abbreviations, you can use

gsub("(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*", "", review)

Note that you can generalize it further by changing

\\d+[a-zA-Z]

to

[0-9a-zA-Z]+

or

[[:alnum:]]+

so that it matches 1+ digits or letters.
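As a small illustration of that generalization, the sketch below compares the two prefixes on a made-up string whose labels (`12.` and `b.`) do not fit the digits-then-letter shape. The input and the variable names are assumptions for demonstration only.

```r
# Hypothetical input: labels such as "12." and "b." that \\d+[a-zA-Z] does not cover.
x <- "12. First question?: Yes. b. Second question.: No"

strict  <- "(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"
general <- "(?:[[:alnum:]]+\\.)?[^.?!:]*[?!.]:\\s*"

strict_out  <- gsub(strict, "", x)   # the labels themselves are left behind
general_out <- gsub(general, "", x)  # labels are removed together with the questions

print(strict_out)
print(general_out)
```

One thing to check on your own data: `[[:alnum:]]+\.` also matches an ordinary word followed by a dot, so when an unlabeled question directly follows an answer sentence, the last word of that answer can be removed as if it were a label.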

Details

- `(?:\d+[a-zA-Z]\.)?` - an optional sequence of
  - `\d+` - 1+ digits
  - `[a-zA-Z]` - an ASCII letter
  - `\.` - a dot
- `[^.?!:]*` - 0 or more characters other than `.`, `?`, `!` and `:`
- `[?!.]` - a `?`, `!` or `.`
- `:` - a colon
- `\s*` - 0+ whitespace characters
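To see the pieces at work, the following sketch assembles the same pattern from its parts and uses `gregexpr()` with `regmatches()` to list exactly what would be removed. The sample string is made up for illustration; only the pattern comes from the answer.

```r
# Build the pattern piece by piece, mirroring the breakdown of its parts.
pattern <- paste0(
  "(?:\\d+[a-zA-Z]\\.)?",  # optional label: 1+ digits, one ASCII letter, a dot
  "[^.?!:]*",              # question text: anything except . ? ! :
  "[?!.]",                 # punctuation that closes the question
  ":",                     # the colon
  "\\s*"                   # whitespace after the colon
)

# Hypothetical sample with two question/answer pairs.
x <- "1a. Was it fast?: Yes, very. 2b. Any comments.: None"

# Every substring the pattern matches, i.e. what gsub() would delete.
removed <- regmatches(x, gregexpr(pattern, x))[[1]]
print(removed)

# What remains after deleting those matches.
kept <- gsub(pattern, "", x)
print(kept)
```

Listing the matches first is a cheap way to confirm that only the questions are caught before deleting anything.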


R test:
> gsub("(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*", "", review)
[1] "Approx. It was very prompt however. Were a little bit pushy but excellent otherwise Were willing to try to bring the price to a level that was acceptable to me.Abel is awesome! Took care of everything from welcoming me into the dealership to making sure I got the car I wanted (even the color)! "
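In the output above, the last removal leaves `me.Abel` without a space, and the string keeps a trailing space. One possible tweak, which is not part of the original answer, is to also consume the whitespace before each question, replace every match with a single space, and trim the ends with `trimws()`. The sample string and variable names below are assumptions for illustration.

```r
# Hypothetical sample: the second question has no "2f."-style label.
x <- "1a. Was it fast?: Yes. Any other comments.: Great service! "

basic   <- "(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"
# Variant (an assumption, not the answer's pattern): leading \\s* so the
# whitespace before each question is part of the match as well.
spaced  <- "\\s*(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"

basic_out  <- gsub(basic, "", x)            # the two answers end up glued together
spaced_out <- trimws(gsub(spaced, " ", x))  # one space between answers, ends trimmed

print(basic_out)
print(spaced_out)
```

Replacing with a single space instead of an empty string keeps adjacent answers separated, and `trimws()` removes the extra space that this introduces at the start and end.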

Extending to handle abbreviations
You can enumerate the exceptions if you add an alternation:
gsub("(?:\\d+[a-zA-Z]\\.)?(?:i\\.?e\\.|[^.?!:])*[?!.]:\\s*", "", review)     
                          ^^^^^^^^^^^^^^^^^^^^^^ 

Here, `(?:i\.?e\.|[^.?!:])*` matches 0 or more `i.e.` or `ie.` substrings, or any character other than `.`, `?`, `!` or `:`.
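A minimal sketch of how the exception list might be kept in one place: the abbreviations live in a character vector and are joined into the alternation with `paste()`. The input string, the `abbrevs` vector, and the extra `e.g.` entry are assumptions added for illustration; only the `i.e.` case comes from the answer.

```r
# Made-up input with two abbreviations inside the question.
x <- "4c. Rate the condition (i.e. cleanliness, e.g. no stains) of the car.: Thanks for the wash!"

basic <- "(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*"

# Abbreviations to treat as ordinary text; "e\\.?g\\." is a hypothetical addition.
abbrevs <- c("i\\.?e\\.", "e\\.?g\\.")
extended <- paste0(
  "(?:\\d+[a-zA-Z]\\.)?",
  "(?:", paste(abbrevs, collapse = "|"), "|[^.?!:])*",
  "[?!.]:\\s*"
)

basic_out    <- gsub(basic, "", x)     # stops at the dots inside the abbreviations
extended_out <- gsub(extended, "", x)  # removes the whole question

print(basic_out)
print(extended_out)
```

The abbreviation alternatives are listed before the negated character class so that a full `i.e.` or `e.g.` can be taken as one unit rather than ending the match at its first dot.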

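Your question is unclear. Everything before the colon can contain any character. Is this one sentence? So you only want to remove `1a.`, `2f.`, `2g.` and `:`? Are the characters the same on every line?

Sorry for the confusion. Basically, I want to remove all the questions from the sentence and keep only the answers. In my example the questions end with a colon, which is why I mentioned everything before the colon.

Try `gsub("(?:\\d+[a-zA-Z]\\.)?[^.?!:]*[?!.]:\\s*", "", review)`.

It would be great if you could explain the regex to me.

For "4c. Please rate the condition of your car on return (i.e. cleanliness, undamaged): Thank you very much for the wash!", the regex does not return the expected result. What should I do?

@gamyanaidu I added this at the very beginning: if there are no abbreviations. If there are, you can add them manually, like `(?:\d+[a-zA-Z]\.)?(?:i\.?e\.|[^.?!:])*[?!.]:\s*`.

Perfect answer. Thank you very much.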