R: Match a column between two separate tables and add the corresponding value from another column


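I want to pair mouse genes with their human counterparts. I have two separate tables, and I want to compare the mouse gene column in both and add a new column holding the matching human gene name.

I tried this with the dplyr package but could not get it to work. merge did not help either.

Table 1

dput(d)

Table 2

dput(mouse_to_human_genes)

I am trying to merge them with: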

added_to_list <- d %>%
                  mutate(mouse_to_human=if_else("d$SYMBOL" == "mouse_to_human_genes$MGI.symbol", c("mouse_to_human_genes$HGNC.symbol"), as.character(NA))
This only gives me a column filled with NA. Thanks in advance for your help!
I think you can do the following:

a$human = apply(a, 1, function(x) b$HGNC.symbol[tolower(b$HGNC.symbol) == tolower(x[3])])

where a is table 1 and b is table 2.
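As a vectorized alternative to the row-wise apply, the lookup can also be sketched with base R's match(). This is a different technique from the one in the answer: it looks the mouse symbol up in MGI.symbol rather than comparing lower-cased names against HGNC.symbol. The contents of table 1 are not shown in the question, so the data frame `a` below is a hypothetical stand-in (only the SYMBOL column name comes from the question), and `b` holds a few rows copied from table 2.

```r
# Hypothetical stand-in for table 1; only the column name SYMBOL is from the question.
a <- data.frame(SYMBOL = c("Pemt", "Notagene", "Ins2", "Cdh1"),
                stringsAsFactors = FALSE)

# A few rows copied from table 2 (mouse_to_human_genes).
b <- data.frame(MGI.symbol  = c("Pemt", "Cdh1", "Ins2", "Th"),
                HGNC.symbol = c("PEMT", "CDH1", "INS", "TH"),
                stringsAsFactors = FALSE)

# match() returns, for each a$SYMBOL, the position of its first occurrence
# in b$MGI.symbol, or NA when the symbol is absent.
idx <- match(a$SYMBOL, b$MGI.symbol)

# Indexing with NA yields NA, so genes without a match get NA automatically.
a$human <- b$HGNC.symbol[idx]

print(a)
```

Because match() only reports the first hit, a mouse gene that maps to several human symbols in table 2 (Slfn4, for example) would receive only the first of them. The row order of `a` is left unchanged.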


If you want to join on a common column, you can use dplyr's left_join. Alternatively, you could take a look at biomaRt. Check this.
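A minimal sketch of the left_join approach might look like the following. The attempt in the question returns only NA because the column references are wrapped in quotes, so if_else compares two literal strings that are never equal. Table 1 is not shown in the question, so `d` here is a hypothetical stand-in with a SYMBOL column; the rows of `mouse_to_human_genes` are copied from table 2. Dropping empty human symbols and keeping only the first human symbol per mouse gene are assumptions about the desired result, not something the question specifies.

```r
library(dplyr)

# Hypothetical stand-in for table 1; only the column name SYMBOL is from the question.
d <- data.frame(SYMBOL = c("Ndufa9", "Notagene", "Pemt", "Slfn4"),
                stringsAsFactors = FALSE)

# A few rows copied from table 2, including duplicated and empty entries.
mouse_to_human_genes <- data.frame(
  MGI.symbol  = c("Pemt", "Ndufa9", "Ndufa9", "Slfn4", "Slfn4"),
  HGNC.symbol = c("PEMT", "", "NDUFA9", "SLFN12", "SLFN12L"),
  stringsAsFactors = FALSE
)

# Build a one-row-per-mouse-gene lookup table.
lookup <- mouse_to_human_genes %>%
  filter(HGNC.symbol != "") %>%               # drop rows without a human symbol
  distinct(MGI.symbol, .keep_all = TRUE) %>%  # assumption: keep the first human symbol only
  select(MGI.symbol, HGNC.symbol)

# left_join keeps every row of d in its original order; unmatched genes get NA.
added_to_list <- d %>%
  left_join(lookup, by = c("SYMBOL" = "MGI.symbol")) %>%
  rename(mouse_to_human = HGNC.symbol)

print(added_to_list)
```

Without the distinct() step, a mouse gene with several human symbols would produce several rows in the result, which may or may not be what is wanted.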

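I got an example from their website. The problem is that the order changes, and it does not give me an NA when there is no homolog, so this is difficult. I could rename the columns, but one column has 50,000 genes and the human one has 22,000 genes.

Edited my answer. See if it helps.

It only gives me FALSE. I want to add the matching value from the HGNC.symbol column, not FALSE or TRUE.

Try a$human = apply(a, 1, function(x) b$HGNC.symbol[tolower(b$HGNC.symbol) == tolower(x[3])]), where a is table 1 and b is table 2. I have also updated the answer above for future reference.

Thanks again. But this is the error: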
$
structure(list(MGI.symbol = c("Pemt", "Mid2", "Ndufa9", "Ndufa9", 
"Cttnbp2", "Cdh1", "Brat1", "Ccm2", "Cdh4", "Itgb2l", "Tbrg4", 
"Slc22a18", "Itgb2", "Tfe3", "Alox12", "Gna12", "Galnt1", "Rnf17", 
"Igsf5", "Ccnd2", "Rtca", "Dbt", "Fgf23", "Fgf6", "Bcl6b", "Klf6", 
"Myf5", "Fap", "Cav2", "Pparg", "Slfn4", "Slfn4", "Gcg", "Dgke", 
"Apoh", "Raf1", "Cdc45", "Nalcn", "Ckmt1", "Mkrn2", "Tbx2", "Lck", 
"Xpo6", "Lhx2", "Gmpr", "Axin2", "Trim25", "Hddc2", "Trappc10", 
"Trappc10", "Mx1", "Cox5a", "Scml2", "Egfl6", "Comt", "Scpep1", 
"Tmprss2", "Dazap2", "Arvcf", "Tbx4", "Rem1", "Drp2", "Tpd52l1", 
"Tssk3", "Btbd17", "Gpr107", "Ins2", "Wnt9a", "Glra1", "Th", 
"Mnt", "Pih1d2", "Scmh1", "Scnn1g", "Tspan32", "Dlat", "Wnt3", 
"Fer", "Sdhd", "Sdhd", "Ckmt1", "Narf", "Ngfr"), HGNC.symbol = c("PEMT", 
"MID2", "", "NDUFA9", "CTTNBP2", "CDH1", "BRAT1", "CCM2", "CDH4", 
"ITGB2", "TBRG4", "SLC22A18", "ITGB2", "TFE3", "ALOX12", "GNA12", 
"GALNT1", "RNF17", "IGSF5", "CCND2", "RTCA", "DBT", "FGF23", 
"FGF6", "BCL6B", "KLF6", "MYF5", "FAP", "CAV2", "PPARG", "SLFN12", 
"SLFN12L", "GCG", "DGKE", "APOH", "RAF1", "CDC45", "NALCN", "CKMT1B", 
"MKRN2", "TBX2", "LCK", "XPO6", "LHX2", "GMPR", "AXIN2", "TRIM25", 
"HDDC2", "TRAPPC10", "", "MX1", "COX5A", "SCML2", "EGFL6", "COMT", 
"SCPEP1", "TMPRSS2", "DAZAP2", "ARVCF", "TBX4", "REM1", "DRP2", 
"TPD52L1", "TSSK3", "BTBD17", "GPR107", "INS", "WNT9A", "GLRA1", 
"TH", "MNT", "PIH1D2", "SCMH1", "SCNN1G", "TSPAN32", "DLAT", 
"WNT3", "FER", "", "SDHD", "CKMT1A", "NARF", "NGFR"), Chromosome.scaffold.name = c("17", 
"X", "12", "12", "7", "16", "7", "7", "20", "21", "7", "11", 
"21", "X", "17", "7", "18", "13", "21", "12", "1", "1", "12", 
"12", "17", "10", "12", "2", "7", "3", "17", "17", "2", "17", 
"17", "3", "22", "13", "15", "3", "17", "1", "16", "9", "6", 
"17", "17", "6", "21", "21", "21", "15", "X", "X", "22", "17", 
"21", "12", "22", "17", "20", "X", "6", "1", "17", "9", "11", 
"1", "5", "11", "17", "11", "1", "16", "11", "11", "17", "5", 
"11", "11", "15", "17", "17"), Gene.start..bp. = c(17505563L, 
107825755L, 4657634L, 4649095L, 117710651L, 68737225L, 2537877L, 
44999475L, 61252426L, 44885953L, 45100100L, 2899721L, 44885953L, 
49028726L, 6996065L, 2728112L, 35581117L, 24764152L, 39745407L, 
4273772L, 100266207L, 100186919L, 4368227L, 4428155L, 7023020L, 
3775996L, 80716912L, 162170684L, 116287380L, 12287368L, 35411060L, 
35464249L, 162142873L, 56834099L, 66212033L, 12583601L, 19479459L, 
101053776L, 43593054L, 12557014L, 61399896L, 32251239L, 28097979L, 
124001670L, 16238580L, 65528563L, 56887909L, 125219962L, 44012319L, 
5155499L, 41420304L, 74919791L, 18239314L, 13569605L, 19941607L, 
56978105L, 41464551L, 51238292L, 19969896L, 61452404L, 31475293L, 
101219769L, 125119049L, 32351521L, 74356416L, 130053426L, 2159779L, 
227918656L, 151822513L, 2163929L, 2384060L, 112064010L, 41027200L, 
23182715L, 2301997L, 112024814L, 46762506L, 108747822L, 112086824L, 
112086773L, 43692886L, 82458180L, 49495293L)), .Names = c("MGI.symbol", 
"HGNC.symbol", "Chromosome.scaffold.name", "Gene.start..bp."), class = "data.frame", row.names = c(NA, 
-83L))