dplyr::mutate() returns: Error in Ops.factor(gene_id$symbol, x) : level sets of factors are different


My data:

gene_list <- data.frame(mouse_gene = c("Ccnb1", "Cdk1", "Cdh3", "Cdkn1c"),
                    human_gene = c("SLCO2B1", "PPP1R3C", "MMP13", "CLEC6A"))

gene_id <- data.frame(gene_id = c("23334", "100001", "12341236", "34553433", "22998", "123121213"),
                  symbol = c("SLCO2B1", "PPP1R3C", "FX-232", "MMP13", "CLEC6A", "CSCCD"))

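I know I could use a join in this case. However, I would like to know what I should do if I need to use a function inside dplyr::mutate().

Also, sometimes when I take a value from one column, pass it to a function, and put the result in a new column, I get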

Column `new_column` must be length 568 (the number of rows) or one, not 2

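Can anyone tell me the reason? Thanks.

Instead of ==, use match to get the index. == does an elementwise comparison, so if the two datasets have different numbers of rows it creates a problem with length: it compares row 1 of the first with row 1 of the second, row 2 with row 2, row 3 with row 3, whereas the matching value can be anywhere in the column.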

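library(dplyr)

# Look up the gene_id for every value in column `x` of gene_list.
# match() returns one index per row (NA where a symbol is not found).
find_geneID <- function(x) {
  gene_id$gene_id[match(gene_list[[x]], gene_id$symbol)]
}
gene_list %>%
  mutate(gene_id = find_geneID('human_gene'))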
#  mouse_gene human_gene  gene_id
#1      Ccnb1    SLCO2B1    23334
#2       Cdk1    PPP1R3C   100001
#3       Cdh3      MMP13 34553433
#4     Cdkn1c     CLEC6A    22998
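
To make the difference between == and match concrete, here is a minimal base R sketch. It assumes the columns are plain character vectors holding the values from the question; the names `positional`, `idx` and `looked_up` are only illustrative.

```r
# Values from the question, as plain character vectors
human_gene <- c("SLCO2B1", "PPP1R3C", "MMP13", "CLEC6A")
symbol     <- c("SLCO2B1", "PPP1R3C", "FX-232", "MMP13", "CLEC6A", "CSCCD")
ids        <- c("23334", "100001", "12341236", "34553433", "22998", "123121213")

# == compares position by position and recycles the shorter vector,
# so "MMP13" (position 3 on the left) is compared with "FX-232".
# R also warns here because 6 is not a multiple of 4.
positional <- suppressWarnings(human_gene == symbol)

# match() searches the whole of `symbol` for each value of human_gene
# and returns one index per element of human_gene.
idx <- match(human_gene, symbol)
looked_up <- ids[idx]

print(positional)
print(idx)
print(looked_up)
```

Because match returns exactly one index per input element, the result always has as many elements as the column it was built from, which is what mutate needs.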

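I get an error: Error in `[.data.frame`(gene_id, gene_id$symbol %in% x) : undefined columns selected

@pill45 One thing I noticed in your function is that you are returning the matched elements, but their length will differ from the number of rows of gene_list. What is your expected output? Should this be a list output?

@pill45 Or a string column? gene_list %>% mutate(new = toString(human_gene[human_gene %in% gene_id$symbol]))

I just need the gene_id corresponding to a symbol, e.g. SLCO2B1 returns 23334; MMP13 returns 34553433.

@pill45 Then do a join.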
# The same lookup expressed as a join on human_gene = symbol
left_join(gene_list, gene_id, by = c('human_gene' = 'symbol'))
#  mouse_gene human_gene  gene_id
#1      Ccnb1    SLCO2B1    23334
#2       Cdk1    PPP1R3C   100001
#3       Cdh3      MMP13 34553433
#4     Cdkn1c     CLEC6A    22998
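
The "must be length 568 (the number of rows) or one, not 2" error says that the value assigned to a new column had neither one element per row nor a single element. A likely cause, as the comments suggest, is a helper that returns only the matched elements. The sketch below assumes a hypothetical query vector containing a symbol that is not in the lookup table (`NOT_A_GENE` is made up for illustration) and compares the two approaches in base R.

```r
symbol <- c("SLCO2B1", "PPP1R3C", "FX-232", "MMP13", "CLEC6A", "CSCCD")
ids    <- c("23334", "100001", "12341236", "34553433", "22998", "123121213")

# Hypothetical column values; "NOT_A_GENE" has no entry in `symbol`
query <- c("SLCO2B1", "NOT_A_GENE", "MMP13")

# Filtering keeps only the hits, so the result can be shorter than `query`
filtered <- ids[symbol %in% query]

# match() keeps one slot per element of `query`, with NA for misses
aligned <- ids[match(query, symbol)]

print(length(filtered))
print(length(aligned))
print(aligned)
```

Only the second form is safe to assign as a column, because its length always equals the length of the input.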

Data (created with stringsAsFactors = FALSE, so the columns are character vectors rather than factors):

gene_list <- data.frame(mouse_gene = c("Ccnb1", "Cdk1", "Cdh3", "Cdkn1c"),
                        human_gene = c("SLCO2B1", "PPP1R3C", "MMP13", "CLEC6A"),
                        stringsAsFactors = FALSE)

gene_id <- data.frame(gene_id = c("23334", "100001", "12341236",
                                  "34553433", "22998", "123121213"),
                      symbol = c("SLCO2B1", "PPP1R3C", "FX-232",
                                 "MMP13", "CLEC6A", "CSCCD"),
                      stringsAsFactors = FALSE)
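
The error in the title can be reproduced without dplyr. In R versions before 4.0.0, data.frame() converted strings to factors by default, and == refuses to compare two factors whose level sets differ. The sketch below assumes the original columns were factors; the names `a`, `b` and `msg` are illustrative.

```r
# The two columns as factors, as data.frame() would have created them
# when stringsAsFactors defaulted to TRUE
a <- factor(c("SLCO2B1", "PPP1R3C", "MMP13", "CLEC6A"))
b <- factor(c("SLCO2B1", "PPP1R3C", "FX-232", "MMP13", "CLEC6A", "CSCCD"))

# Comparing factors with different level sets raises an error;
# capture its message instead of stopping the script
msg <- tryCatch({
  a == b
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# match() compares factors by their labels, so the lookup still works
idx <- match(a, b)
print(idx)
```

Creating the data frames with stringsAsFactors = FALSE, or converting with as.character() first, avoids the factor comparison altogether.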