R: check which columns in a list contain an exact string value, and extract the columns and rows
I have this data frame:
df <- data.frame (
A = c("ABC11234","ABC11"),
B = c(11,1),
C = c("11",11),
D = c(11.1,"11.1"))
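df
I suggest using lapply(). In the code below, a is the resulting list, with one integer vector of row indices per column.
If you want to extract the values from df, you can do the following: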
df[,rownames(do.call(rbind,a[lapply(a,length)==0]))]
Output:
$A
integer(0)
$D
integer(0)
A D
1 ABC11234 11.1
2 ABC11 11.1
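The call that builds a is not shown above. A minimal sketch, assuming a holds the row indices of exact "11" matches for each column (the which() test and the names no_match and has_match are illustrative assumptions, not the original code):

```r
df <- data.frame(
  A = c("ABC11234", "ABC11"),
  B = c(11, 1),
  C = c("11", 11),
  D = c(11.1, "11.1")
)

# Assumption: one integer vector per column with the rows equal to "11".
# as.character() makes the comparison explicit for the numeric column B.
a <- lapply(df, function(x) which(as.character(x) == "11"))

# Columns with no match at all, and columns with at least one match
no_match  <- names(a)[lengths(a) == 0]
has_match <- names(a)[lengths(a) > 0]

# Same columns as the rbind() expression selects, without the detour
df[, no_match, drop = FALSE]
```

Under that assumption no_match is c("A", "D"), the columns shown in the output above, and has_match is c("B", "C").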
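It seems to me that the output you want is just the column names and row numbers of the cells that contain the desired "string". It is not clear whether you also want to exclude the numeric value 11, because string functions coerce numbers to strings. In any case, my solution uses the stringr package rather than base R. I first extract all elements that match the regular expression "^11$" (that is, the whole "string" is "11") by calling str_extract_all() on every column with lapply(), which gives a list d of length 4.
Applying which(x > 0) to each element of d produces another list of length 4. Each element is a vector holding the rows of that column that satisfy the condition.
Let's unlist this list: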
e <- unlist(lapply(d, function (x) which(x>0)))
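To get rid of the numbered column names (C1, C2, etc.) that unlist() creates, we apply one more string function, which replaces any capital letter followed by a digit with just the letter: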
names(e) <- str_replace_all(names(e),"([A-Z])\\d","\\1")
All the code together:
library(stringr)
d <- lapply(df, function(x) str_extract_all(x,"^11$"))
lapply(d, function (x) which(x>0))
e<- unlist(lapply(d, function (x) which(x>0)))
names(e) <- str_replace_all(names(e),"([A-Z])\\d","\\1")
e
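The regular expression applied to names(e) can also alter a column whose real name contains a capital letter followed by a digit (for example X1). A sketch of one way to avoid that, assuming base R grep() is acceptable in place of stringr, is to build the names directly from the column names (hits is a hypothetical name):

```r
df <- data.frame(
  A = c("ABC11234", "ABC11"),
  B = c(11, 1),
  C = c("11", 11),
  D = c(11.1, "11.1")
)

# grep() coerces each column to character and returns the matching rows
hits <- lapply(df, function(x) grep("^11$", x))

# Repeat each column name once per hit instead of cleaning up C1, C2, ...
e <- setNames(
  unlist(hits, use.names = FALSE),
  rep(names(hits), lengths(hits))
)
e
```

The result is a named integer vector with the same shape as the final e above: the names are the column names and the values are the row numbers.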
I'm not sure whether you want something like the following:
a <- transform(
as.data.frame(
which(matrix(grepl("^11$", as.matrix(df)), nrow = nrow(df)),
arr.ind = TRUE
)),
col = names(df)[col]
)
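One thing to watch with as.matrix(df): when a data frame mixes numeric and character columns, the numeric columns are formatted to a common width, so a 1 next to an 11 becomes " 1". With the data here that does no harm, but a column like c(11, 111) would turn 11 into " 11", which "^11$" no longer matches. A hedged sketch that converts each column with as.character() instead (m and idx are illustrative names):

```r
df <- data.frame(
  A = c("ABC11234", "ABC11"),
  B = c(11, 1),
  C = c("11", 11),
  D = c(11.1, "11.1")
)

# Character matrix built column by column, so numbers are not padded
m <- vapply(df, as.character, character(nrow(df)))

# Matrix of (row, col) positions where the cell is exactly "11"
idx <- which(m == "11", arr.ind = TRUE)

# Replace the column index with the column name
a <- data.frame(
  row = idx[, "row"],
  col = colnames(m)[idx[, "col"]]
)
a
```

This sketch assumes df has more than one row; with a single row vapply() returns a plain vector rather than a matrix.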
Another option is to reshape the data to "long" format and then get the corresponding column names:
library(dplyr)
library(tidyr)
library(stringr)
df %>%
mutate(across(everything(), as.character), row = row_number()) %>%
pivot_longer(cols = -row, names_to = 'col') %>%
group_by(row) %>%
summarise(col = unique(col[str_detect(value, '^11$')]), .groups = 'drop')
# A tibble: 3 x 2
# row col
# <int> <chr>
#1 1 B
#2 1 C
#3 2 C
For the stringr answer above, e before the names are cleaned up:
B C1 C2
1  1  2
names(e) <- str_replace_all(names(e),"([A-Z])\\d","\\1")
B C C
1 1 2
Output of a from the which(..., arr.ind = TRUE) answer above:
> a
row col
1 1 B
2 1 C
3 2 C