R: remove rows with fewer than a given number of strings in a specific column

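How can I remove rows in which a specific column contains fewer than 4 strings? The strings are separated by single spaces.

Example input

dd <- data.frame(id = c(1,2,3),
                 text = c("remove this", "also remove this", "you can keep it however"))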
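If you want to use tidyverse packages, you can use:

library(dplyr)
library(stringr)

dd %>% filter(str_count(text, " ") >= 3)

Here we assume that fewer than 4 strings means fewer than 3 spaces. Counting characters gives you a solution that is more efficient than actually splitting the strings and allocating memory for individual pieces you don't need.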
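
The same space-counting idea can also be sketched in base R, without stringr. This is only an illustration, not part of the original answer: the names n_spaces and kept are made up for the example, and stringsAsFactors = FALSE is set explicitly on the assumption that the code might run on an R version older than 4.0, where text would otherwise become a factor.

```r
dd <- data.frame(id = c(1, 2, 3),
                 text = c("remove this", "also remove this", "you can keep it however"),
                 stringsAsFactors = FALSE)

# Number of spaces = total length minus length with all spaces removed
n_spaces <- nchar(dd$text) - nchar(gsub(" ", "", dd$text, fixed = TRUE))

# At least 4 strings means at least 3 single-space separators
kept <- dd[n_spaces >= 3, ]
kept
```

With the sample data, only the row with id 3 has three or more spaces, so it is the only row left in kept.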

Using base R:

df
  id                    text
1  1             remove this
2  2        also remove this
3  3 you can keep it however
# Count the space-separated strings in each row
df$str_count <- sapply(strsplit(df$text, split = ' '), length)
# Keep rows with at least 4 strings; subset the whole data frame, not just the text column
df <- df[df$str_count >= 4, ]
df$str_count <- NULL
df
  id                    text
3  3 you can keep it however

Split on spaces, then check the length:

# Keep rows with at least 4 strings, i.e. drop rows with fewer than 4
dd[ lengths(strsplit(dd$text, " ")) >= 4, ]
#   id                    text
# 3  3 you can keep it however
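
The question states that strings are separated by single spaces. If that assumption did not hold (repeated spaces, tabs, or leading/trailing whitespace), splitting on a literal " " would produce empty pieces and inflate the count. A minimal sketch of a more tolerant counter follows; count_words is a hypothetical helper introduced for this example, not something from the answers above, and it does not handle NA values.

```r
# Hypothetical helper: count whitespace-separated strings in each element of x
count_words <- function(x) {
  x <- trimws(x)                 # drop leading/trailing whitespace
  lengths(strsplit(x, "\\s+"))   # split on runs of whitespace
}

dd <- data.frame(id = c(1, 2, 3),
                 text = c("remove  this", " also remove this ", "you can keep it however"),
                 stringsAsFactors = FALSE)

# Keep rows with at least 4 strings
result <- dd[count_words(dd$text) >= 4, ]
result
```

Splitting on the regular expression "\\s+" treats any run of whitespace as one separator, so the messy rows above are still counted as 2 and 3 strings and get dropped.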

The expected output should contain only id 3, right?
@zx8754 Right. I updated it.
I like the inverse logic of counting spaces instead of words.