R: Check which columns in a list have an exact string value, and extract the columns and rows


I have this data frame:

df <- data.frame (
  A = c("ABC11234","ABC11"),
  B = c(11,1),
  C = c("11",11),
  D =  c(11.1,"11.1"))

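I suggest using lapply().

If you want to extract the values from df, you can do the following (a is the list returned by the lapply() call):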

df[,rownames(do.call(rbind,a[lapply(a,length)==0]))]
Output:

$A
integer(0)

$D
integer(0)
         A    D
1 ABC11234 11.1
2    ABC11 11.1
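
The lapply() call that builds a did not survive in this answer. As a minimal sketch of what such a step could look like, assuming an exact comparison against "11" after coercing each column to character (the names hits, matched and no_match are illustrative, not the answerer's code):

```r
df <- data.frame(
  A = c("ABC11234", "ABC11"),
  B = c(11, 1),
  C = c("11", 11),
  D = c(11.1, "11.1")
)

# For each column, return the row numbers whose value is exactly "11".
# as.character() makes the numeric column B comparable with the string.
hits <- lapply(df, function(x) which(as.character(x) == "11"))

# Columns with at least one exact match, with their row numbers.
matched <- hits[lengths(hits) > 0]
matched

# Columns with no exact match (empty index vectors), i.e. the ones the
# length == 0 filter in the extraction above picks out.
no_match <- names(hits)[lengths(hits) == 0]
df[, no_match]
```

Under these assumptions matched holds B and C, while no_match holds A and D, which lines up with the columns printed in the output above.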

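As I understand it, the output you want is just the column names and row numbers for the columns that contain the target string. It is not clear whether you also want to exclude the numeric value 11, since string functions coerce numbers to strings. In any case, my solution uses the stringr package rather than base R. I first extract every element that matches the regular expression "^11$", i.e. every element whose entire string is "11" (the str_extract_all() call in the complete code).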

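Applying which() to each element then produces another list of length 4. Each element is a vector holding the row numbers in that column that satisfy the condition.

Let's unlist this list:

    e <- unlist(lapply(d, function (x) which(x>0)))
To get rid of the numbered duplicate column names (C1, C2, etc.), we apply one more string function, which replaces any column name followed by a digit with the bare column name:

    names(e) <- str_replace_all(names(e),"([A-Z])\\d","\\1")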
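
To see where names such as C1 and C2 come from, here is a small base R sketch. It assumes single uppercase-letter column names, as in this data frame; rows_by_col is a hypothetical stand-in for the per-column row numbers, and sub() is used in place of str_replace_all() only to keep the example free of dependencies:

```r
# Hypothetical stand-in for the per-column row numbers found earlier.
rows_by_col <- list(B = 1L, C = c(1L, 2L))

# unlist() numbers the names of multi-element entries: "B", "C1", "C2".
e <- unlist(rows_by_col)
names(e)

# Strip the trailing digits so each name is the original column name again.
names(e) <- sub("^([A-Z])\\d+$", "\\1", names(e))
e
```

Note that this kind of clean-up would also strip digits from column names that legitimately end in a number, so it only suits data frames like this one.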
All the code together:

    library(stringr)
    d <- lapply(df, function(x) str_extract_all(x,"^11$"))
    lapply(d, function (x) which(x>0))
    e<- unlist(lapply(d, function (x) which(x>0)))
    names(e) <- str_replace_all(names(e),"([A-Z])\\d","\\1")
    e

I'm not sure whether you want something like the following:

a <- transform(
  as.data.frame(
    which(matrix(grepl("^11$", as.matrix(df)), nrow = nrow(df)),
    arr.ind = TRUE
  )),
  col = names(df)[col]
)
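
One thing to watch with this approach: as far as I know, as.matrix() on a data frame with mixed column types converts numeric columns with format(), which can pad values with spaces (for example " 1"), so an anchored pattern may miss numbers narrower than the widest value in their column. A hedged variant that coerces each column with as.character() instead (m, idx and res are illustrative names):

```r
df <- data.frame(
  A = c("ABC11234", "ABC11"),
  B = c(11, 1),
  C = c("11", 11),
  D = c(11.1, "11.1")
)

# Character matrix built column by column, without format() padding.
m <- vapply(df, as.character, character(nrow(df)))

# (row, col) index pairs of the cells that are exactly "11".
idx <- which(m == "11", arr.ind = TRUE)

# Replace the column index with the column name.
res <- data.frame(row = idx[, "row"], col = colnames(m)[idx[, "col"]])
res
```

For this data frame the result should contain the same three row/column pairs as the transform() version.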

Another option is to reshape the data to "long" format and then pick out the corresponding column names:

library(dplyr)
library(tidyr)
library(stringr)
df %>% 
    mutate(across(everything(), as.character), row = row_number()) %>%
    pivot_longer(cols = -row, names_to = 'col') %>%
    group_by(row) %>% 
    summarise(col = unique(col[str_detect(value, '^11$')]), .groups = 'drop')
# A tibble: 3 x 2
#    row col  
#  <int> <chr>
#1     1 B    
#2     1 C    
#3     2 C    
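
Recent dplyr releases (1.1.0 and later, as far as I know) deprecate summarise() calls that return more than one row per group. A sketch of the same long-format idea using filter() instead, assuming dplyr and tidyr are installed and that an exact comparison with "11" is what is wanted (res is an illustrative name):

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  A = c("ABC11234", "ABC11"),
  B = c(11, 1),
  C = c("11", 11),
  D = c(11.1, "11.1")
)

res <- df %>%
  # Make every column character and remember the original row number.
  mutate(across(everything(), as.character), row = row_number()) %>%
  # One line per cell: row, column name, value.
  pivot_longer(cols = -row, names_to = "col") %>%
  # Keep only the cells whose value is exactly "11".
  filter(value == "11") %>%
  select(row, col)
res
```

This avoids the grouping step entirely, since each long-format line already carries its own row and column name.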
For reference, e from the stringr answer prints as follows after unlist(), before the names are cleaned:

    B C1 C2 
    1  1  2 

and as follows after the str_replace_all() step:

    B C C 
    1 1 2 

For reference, a from the transform()/which() answer prints as:

> a
  row col
1   1   B
2   1   C
3   2   C