R: how to filter rows based on a string, but only when the string appears more than once
I am cleaning a dataset. Below is an example. As you can see, "vs." appears several times in the first entry. I want to filter out any row where it appears more than once, because I only want matches between two wrestlers.
2008-03-29 KO-D Openweight Title Number 1 Contendership Four Way Dance: Great Yago (Yoshiaki Yago) vs. I Am Chono Sanshiro (Sanshiro Takagi) vs. Koo vs. Seiya Morohashi - No Contest
1935-04-17 Lou Thesz Vs. Otto Kuss Ime Limit Draw
1976-05-09 Harley Race Vs. The Destroyer Ime Limit Draw
I am trying the following, but it does not work. I am not sure what else I should try.
dataset_final <- dataset %>%
  filter(
    !str_detect(match, "( vs. | Vs. ){2,}")
  )
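Any ideas on how to get this filter working would be greatly appreciated. Thanks!
This splits each string on "vs." and then counts the resulting pieces. The subset below returns the first row, dated 2008-03-29. That is the multi-way match, so to keep only the two-wrestler matches, use df1$count <= 2 as the condition instead.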
df1$count <- lengths(strsplit(df1$V2, 'vs.'))
df1[df1$count > 1, ]
Test data
df1 <- structure(list(V1 = c("2008-03-29", "1935-04-17", "1976-05-09"
), V2 = c("KO-D Openweight Title Number 1 Contendership Four Way Dance: Great Yago
(Yoshiaki Yago) vs. I Am Chono Sanshiro (Sanshiro Takagi) vs. Koo vs. Seiya Morohashi
- No Contest",
"Lou Thesz Vs. Otto Kuss Ime Limit Draw ", "Harley Race Vs. The Destroyer Ime Limit Draw "
), count = c(4L, 1L, 1L)), row.names = c(NA, -3L), class = "data.frame")
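A base R sketch of the same counting idea, in case it helps. It also checks the pattern from the question: "{2,}" only counts repetitions that sit directly next to each other, so a pattern like "( vs. | Vs. ){2,}" cannot match separators that have wrestler names between them. The names matches, count_vs, n_vs and two_way are made up for illustration, and the sample strings are shortened versions of the ones in the question.

```r
# Illustrative sample data, shortened from the question (names are hypothetical)
matches <- data.frame(
  date  = c("2008-03-29", "1935-04-17", "1976-05-09"),
  match = c(
    "Great Yago vs. I Am Chono Sanshiro vs. Koo vs. Seiya Morohashi - No Contest",
    "Lou Thesz Vs. Otto Kuss Ime Limit Draw",
    "Harley Race Vs. The Destroyer Ime Limit Draw"
  ),
  stringsAsFactors = FALSE
)

# The question's pattern: "{2,}" needs adjacent repeats, so no row matches
adjacent <- grepl("( vs. | Vs. ){2,}", matches$match)

# Count literal occurrences of "vs." per string, ignoring case
count_vs <- function(x) {
  # fixed = TRUE treats the dot literally; tolower() handles "Vs." vs "vs."
  hits <- gregexpr("vs.", tolower(x), fixed = TRUE)
  # gregexpr() reports -1 when nothing matches, so count positive positions only
  vapply(hits, function(h) sum(h > 0), integer(1))
}

matches$n_vs <- count_vs(matches$match)

# Exactly one separator means exactly two sides
two_way <- matches[matches$n_vs == 1, ]
```

Using == 1 is a choice on my part: it also drops rows that contain no "vs." at all. Use <= 1 if those rows should be kept.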
The stringr package has a function called str_count that can be used for this.
dataset <- structure(list(date = c("2008-03-29", "1935-04-17", "1976-05-09"
), str = c("KO-D Openweight Title Number 1 Contendership Four Way Dance: Great Yago (Yoshiaki Yago) vs. I Am Chono Sanshiro (Sanshiro Takagi) vs. Koo vs. Seiya Morohashi - No Contest",
"Lou Thesz Vs. Otto Kuss Ime Limit Draw", "Harley Race Vs. The Destroyer Ime Limit Draw"
)), class = "data.frame", row.names = c(NA, -3L))
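library(tidyverse)  # attaches dplyr and stringr

dataset %>%
  # lower-case first so that "Vs." and "vs." are counted alike
  mutate(str_low = tolower(str)) %>%
  # fixed() matches the dot literally; keep rows with fewer than two separators
  filter(str_count(str_low, fixed('vs.')) < 2) %>%
  select(date, str)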
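A possible variant that skips the helper column: stringr's regex() modifier accepts ignore_case = TRUE, so the count can be done on the original column directly. This is only a sketch and assumes dplyr and stringr are installed. The sample strings are shortened from the question, and two_way is a made-up name.

```r
library(dplyr)
library(stringr)

# Illustrative sample data, shortened from the question
dataset <- data.frame(
  date = c("2008-03-29", "1935-04-17", "1976-05-09"),
  str  = c(
    "Great Yago vs. I Am Chono Sanshiro vs. Koo vs. Seiya Morohashi - No Contest",
    "Lou Thesz Vs. Otto Kuss Ime Limit Draw",
    "Harley Race Vs. The Destroyer Ime Limit Draw"
  ),
  stringsAsFactors = FALSE
)

two_way <- dataset %>%
  # "\\." escapes the dot; ignore_case = TRUE counts "vs." and "Vs." alike
  filter(str_count(str, regex("vs\\.", ignore_case = TRUE)) == 1)
```

With == 1 only rows with exactly one separator survive. The < 2 condition would also keep rows that have no "vs." at all.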
Hey, thanks for your answer! str_count is the way to go.