R: How to filter rows based on a string, but only when the string occurs multiple times

Tags: r, stringr

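I'm cleaning a dataset. A sample is below. As you can see, the first entry contains "vs." several times. I want to filter out any row where it appears more than once, because I only want matches between two wrestlers.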

2008-03-29 KO-D Openweight Title Number 1 Contendership Four Way Dance: Great Yago (Yoshiaki Yago) vs. I Am Chono Sanshiro (Sanshiro Takagi) vs. Koo vs. Seiya Morohashi - No Contest
1935-04-17  Lou Thesz Vs. Otto Kuss Ime Limit Draw 
1976-05-09  Harley Race Vs. The Destroyer Ime Limit Draw 
I tried the following, but it doesn't work. I'm not sure what else to try.

dataset_final <- dataset %>%
filter(
!str_detect(match, "( vs. | Vs. ){2,}")
)

Any ideas on how to get this filter working would be greatly appreciated. Thanks!

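This splits each string on "vs." and then counts the resulting pieces. The subset below returns the first row, dated 2008-03-29. Note that the count is the number of pieces, which is one more than the number of separators, and that the split is case-sensitive, so the rows containing "Vs." are not split and get a count of 1.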

df1$count <- lengths(strsplit(df1$V2, 'vs.'))
df1[df1$count > 1, ]
Test data

df1 <- structure(list(V1 = c("2008-03-29", "1935-04-17", "1976-05-09"
), V2 = c("KO-D Openweight Title Number 1 Contendership Four Way Dance: Great Yago 
(Yoshiaki Yago) vs. I Am Chono Sanshiro (Sanshiro Takagi) vs. Koo vs. Seiya Morohashi 
- No Contest", 
"Lou Thesz Vs. Otto Kuss Ime Limit Draw ", "Harley Race Vs. The Destroyer Ime Limit Draw "
), count = c(4L, 1L, 1L)), row.names = c(NA, -3L), class = "data.frame")
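
If the goal is to keep only the two-wrestler matches rather than find the multi-way ones, the same idea can be turned around. The following is a minimal base R sketch, not part of the original answer: it lowercases the text so that "Vs." and "vs." are treated alike, counts literal occurrences of "vs." (instead of pieces), and keeps the rows with exactly one. The names `matches`, `count_vs`, and `two_way` are made up for illustration, and the strings are the three sample rows from the question.

```r
# Sample rows from the question (hypothetical data frame name: matches)
matches <- data.frame(
  date = c("2008-03-29", "1935-04-17", "1976-05-09"),
  str = c(
    "KO-D Openweight Title Number 1 Contendership Four Way Dance: Great Yago (Yoshiaki Yago) vs. I Am Chono Sanshiro (Sanshiro Takagi) vs. Koo vs. Seiya Morohashi - No Contest",
    "Lou Thesz Vs. Otto Kuss Ime Limit Draw",
    "Harley Race Vs. The Destroyer Ime Limit Draw"
  ),
  stringsAsFactors = FALSE
)

# Lowercase first so "Vs." and "vs." are counted the same way
lowered <- tolower(matches$str)

# fixed = TRUE treats "vs." as a literal string, so the dot is not a wildcard
hits <- gregexpr("vs.", lowered, fixed = TRUE)

# regmatches() returns the matched substrings per row; lengths() counts them
count_vs <- lengths(regmatches(lowered, hits))

# Keep only rows with exactly one "vs.", i.e. matches between two wrestlers
two_way <- matches[count_vs == 1, ]
two_way
```

Counting occurrences directly avoids the off-by-one that comes with counting the pieces produced by `strsplit()`.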

The stringr package has a function called str_count that you can use for this:

dataset <- structure(list(date = c("2008-03-29", "1935-04-17", "1976-05-09"
), str = c("KO-D Openweight Title Number 1 Contendership Four Way Dance: Great Yago (Yoshiaki Yago) vs. I Am Chono Sanshiro (Sanshiro Takagi) vs. Koo vs. Seiya Morohashi - No Contest",
"Lou Thesz Vs. Otto Kuss Ime Limit Draw", "Harley Race Vs. The Destroyer Ime Limit Draw"
)), class = "data.frame", row.names = c(NA, -3L))


library(stringr)
library(tidyverse)

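dataset %>%
  # Lowercase a copy of the text so "Vs." and "vs." are counted alike
  mutate(str_low = tolower(str)) %>%
  # fixed() matches the dot literally; keep rows with fewer than two "vs."
  filter(str_count(str_low, fixed("vs.")) < 2) %>%
  select(date, str)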
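
For anyone wondering why the attempt in the question did not filter anything: the quantifier in "( vs. | Vs. ){2,}" applies to the group as a whole, so the pattern only matches when two separators appear back to back, with nothing between them. It does not count occurrences spread across the string. The sketch below illustrates this on the sample rows and shows a variant of the str_count approach that skips the helper column by using stringr's `regex()` with `ignore_case = TRUE`. The names `sample_rows` and `two_way` are made up for illustration.

```r
library(dplyr)
library(stringr)

# Sample rows from the question (hypothetical data frame name: sample_rows)
sample_rows <- data.frame(
  date = c("2008-03-29", "1935-04-17", "1976-05-09"),
  str = c(
    "KO-D Openweight Title Number 1 Contendership Four Way Dance: Great Yago (Yoshiaki Yago) vs. I Am Chono Sanshiro (Sanshiro Takagi) vs. Koo vs. Seiya Morohashi - No Contest",
    "Lou Thesz Vs. Otto Kuss Ime Limit Draw",
    "Harley Race Vs. The Destroyer Ime Limit Draw"
  ),
  stringsAsFactors = FALSE
)

# The original pattern needs two separators in a row, so it matches no row
str_detect(sample_rows$str, "( vs. | Vs. ){2,}")

# Count "vs." anywhere in the string, ignoring case; "\\." is a literal dot
two_way <- sample_rows %>%
  filter(str_count(str, regex("vs\\.", ignore_case = TRUE)) < 2)

two_way
```

Using `< 2` keeps rows with zero or one separator, as in the answer above; switch to `== 1` if rows without any "vs." should be dropped as well.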

Hey, thanks for the answer! str_count is the way to go. Thanks again!