Extract all numbers from a string that are surrounded by a specific pattern in R


r regex


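I want to extract all the numbers in a string that are flanked by two markers/patterns. However, regular expressions in R are my bane.

I have something like this: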

string  <- "<img src='images/stimuli/32.png' style='width:341.38790035587186px;height: 265px;'><img src='images/stimuli/36.png' style='width:341.38790035587186px;height: 265px;'>"
marker1 <- "images/stimuli/"
marker2 <- ".png"

But what I get is:

[1] "32"

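PS: If anyone has a good guide to understanding how regular expressions work, please let me know. I'm pretty sure the answer is simple, but I just don't understand regex :(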

You can use

string  <- "<img src='images/stimuli/32.png' style='width:341.38790035587186px;height: 265px;'><img src='images/stimuli/36.png' style='width:341.38790035587186px;height: 265px;'>"
regmatches(string, gregexpr("images/stimuli/\\K\\d+(?=\\.png)", string, perl=TRUE))[[1]]
# => [1] "32" "36"


Note: if there can be anything between the markers, not just digits, you can replace `\\d+` with `.*?`.
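A minimal sketch of the `.*?` variant: the lazy quantifier captures any text between the two markers and stops at the first `.png`. The input `s`, with a non-numeric file name, is a made-up example and not part of the question.

```r
# Hypothetical input: the second file name is not purely numeric.
s <- "<img src='images/stimuli/32.png'><img src='images/stimuli/cat_7.png'>"

# .*? matches as few characters as possible, so each match ends
# right before the nearest ".png" that follows the left marker.
res <- regmatches(s, gregexpr("images/stimuli/\\K.*?(?=\\.png)", s, perl = TRUE))[[1]]
print(res)
```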

`regmatches` with `gregexpr` extracts all matches found in the input.

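The regex matches:

- `images/stimuli/` - a literal string
- `\K` - a match reset operator that discards all text matched so far
- `\d+` - one or more digits
- `(?=\.png)` - a positive lookahead that requires a `.png` substring immediately to the right (`.` is a special character and needs to be escaped)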
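The same pattern can also be assembled from the `marker1` and `marker2` variables in the question instead of being hard-coded. The following is only a sketch, assuming the markers should be matched literally: it wraps each marker in PCRE's `\Q...\E` quoting so that characters such as `.` need no manual escaping. The helper name `extract_between` is hypothetical.

```r
# Hypothetical helper: extract digit runs located between two literal markers
# in a single string.
extract_between <- function(x, left, right) {
  # \Q...\E makes PCRE treat the marker text literally, so the "." in ".png"
  # is not a wildcard. Assumption: the markers do not contain "\E" themselves.
  pattern <- paste0("\\Q", left, "\\E\\K\\d+(?=\\Q", right, "\\E)")
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

string  <- "<img src='images/stimuli/32.png' style='width:341.38790035587186px;height: 265px;'><img src='images/stimuli/36.png' style='width:341.38790035587186px;height: 265px;'>"
marker1 <- "images/stimuli/"
marker2 <- ".png"

nums <- extract_between(string, marker1, marker2)
print(nums)
```

The result is a character vector; `as.integer(nums)` would convert it to numbers if that is what is needed.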

You can use `str_extract_all` from the `stringr` package:

library(stringr)
str_extract_all(string, "(?<=images/stimuli/)\\d+(?=\\.png)")
[[1]]
[1] "32" "36"

Yes, I want all the numbers surrounded by the markers as a vector of strings. I never knew about the `\K` operator. Neat!