Joining files on one column when the rows are in a different order, in R or Python
I have 3 tab-delimited files, like the following 3 examples:
File 1:
AB 45.2 4.56 0.21
FG 78.1 54.1 36.1
HG 98.1 25.0 12.6
TR 1.2 3.25 65.1
File 2:
TR 5.2 41.6 10.21
HG 8.1 23.1 56.1
FG 9 32.0 32.6
AB 12.2 31.25 5.1
File 3:
HG 15.2 21.6 20.21
TR 31.1 32.1 66.1
AB 12.1 12.0 62.6
FG 11.3 31.25 54.1
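The first column contains the same items in every file, but in a different order. I want to join the files on column 1 and create a new file like the expected output:
Expected output: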
AB 45.2 4.56 0.21 12.2 31.25 5.1 12.1 12.0 62.6
FG 78.1 54.1 36.1 9 32.0 32.6 11.3 31.25 54.1
HG 98.1 25.0 12.6 8.1 23.1 56.1 15.2 21.6 20.21
TR 1.2 3.25 65.1 5.2 41.6 10.21 31.1 32.1 66.1
I tried the join functions in R and pandas, but they did not return the expected output. Do you know how I can do this in Python or R?
In R, you can use Reduce with merge:
Reduce(function(x, y) merge(x, y, by = 'V1'), list(df1, df2, df3))
#If there are lot of dataframes use `mget` and `ls`
#Reduce(function(x, y) merge(x, y, by = 'V1'), mget(ls(pattern = "df\\d+")))
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
#1 AB 45.2 4.56 0.21 12.2 31.2 5.1 12.1 12.0 62.6
#2 FG 78.1 54.10 36.10 9.0 32.0 32.6 11.3 31.2 54.1
#3 HG 98.1 25.00 12.60 8.1 23.1 56.1 15.2 21.6 20.2
#4 TR 1.2 3.25 65.10 5.2 41.6 10.2 31.1 32.1 66.1
Data
Here, one column (V1) is common to all the data frames, while the remaining columns have different names.
df1 <- structure(list(V1 = structure(1:4, .Label = c("AB", "FG", "HG",
"TR"), class = "factor"), V2 = c(45.2, 78.1, 98.1, 1.2), V3 = c(4.56,
54.1, 25, 3.25), V4 = c(0.21, 36.1, 12.6, 65.1)),
class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(V1 = structure(4:1, .Label = c("AB", "FG", "HG",
"TR"), class = "factor"), V5 = c(5.2, 8.1, 9, 12.2), V6 = c(41.6,
23.1, 32, 31.25), V7 = c(10.21, 56.1, 32.6, 5.1)), class = "data.frame",
row.names = c(NA, -4L))
df3 <- structure(list(V1 = structure(c(3L, 4L, 1L, 2L), .Label = c("AB",
"FG", "HG", "TR"), class = "factor"), V8 = c(15.2, 31.1, 12.1,
11.3), V9 = c(21.6, 32.1, 12, 31.25), V10 = c(20.21, 66.1, 62.6,
54.1)), class = "data.frame", row.names = c(NA, -4L))
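For the Python side of the question, the Reduce and merge approach above can be sketched with functools.reduce and pandas.merge. This is a minimal sketch, not a definitive implementation: the helper read_table, the column name key, and the V2 to V10 column names are assumptions made for illustration, and StringIO stands in for the three tab-delimited files.

```python
from functools import reduce
from io import StringIO

import pandas as pd

# Sample data from the question. StringIO stands in for the real files here.
file1 = "AB\t45.2\t4.56\t0.21\nFG\t78.1\t54.1\t36.1\nHG\t98.1\t25.0\t12.6\nTR\t1.2\t3.25\t65.1\n"
file2 = "TR\t5.2\t41.6\t10.21\nHG\t8.1\t23.1\t56.1\nFG\t9\t32.0\t32.6\nAB\t12.2\t31.25\t5.1\n"
file3 = "HG\t15.2\t21.6\t20.21\nTR\t31.1\t32.1\t66.1\nAB\t12.1\t12.0\t62.6\nFG\t11.3\t31.25\t54.1\n"


def read_table(text, first_value_col):
    """Hypothetical helper: read one headerless tab-delimited table.

    The first column is named "key"; the rest are named V<n>, V<n+1>, ...
    so that value columns never collide across files.
    """
    df = pd.read_csv(StringIO(text), sep="\t", header=None)
    n_values = df.shape[1] - 1
    df.columns = ["key"] + [f"V{first_value_col + i}" for i in range(n_values)]
    return df


frames = [read_table(file1, 2), read_table(file2, 5), read_table(file3, 8)]

# Fold the list with an inner merge on the key column, so rows are matched
# by key value rather than by their position in each file.
merged = reduce(lambda left, right: pd.merge(left, right, on="key"), frames)
merged = merged.sort_values("key").reset_index(drop=True)

# Write tab-delimited text without header or index, as in the expected output.
print(merged.to_csv(sep="\t", header=False, index=False))
```

With real files, each path can be passed to pd.read_csv in place of StringIO(text), and merged.to_csv can be given an output path as its first argument.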
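It would be better to show your code. It looks like you need pd.concat([df1, df2, df3], axis=1).
Does it sort the column before joining? It will join along the index.
We can use reduce and inner_join from the tidyverse (in R):
library(dplyr)
library(purrr)
# With no `by` argument, inner_join matches on the shared column V1
mget(paste0('df', 1:3)) %>%
  reduce(inner_join)
Data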
df1 <- structure(list(V1 = structure(1:4, .Label = c("AB", "FG", "HG",
"TR"), class = "factor"), V2 = c(45.2, 78.1, 98.1, 1.2), V3 = c(4.56,
54.1, 25, 3.25), V4 = c(0.21, 36.1, 12.6, 65.1)),
class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(V1 = structure(4:1, .Label = c("AB", "FG", "HG",
"TR"), class = "factor"), V5 = c(5.2, 8.1, 9, 12.2), V6 = c(41.6,
23.1, 32, 31.25), V7 = c(10.21, 56.1, 32.6, 5.1)), class = "data.frame",
row.names = c(NA, -4L))
df3 <- structure(list(V1 = structure(c(3L, 4L, 1L, 2L), .Label = c("AB",
"FG", "HG", "TR"), class = "factor"), V8 = c(15.2, 31.1, 12.1,
11.3), V9 = c(21.6, 32.1, 12, 31.25), V10 = c(20.21, 66.1, 62.6,
54.1)), class = "data.frame", row.names = c(NA, -4L))
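The pd.concat suggestion from the comments only lines rows up by key if the key column is the index, because concat with axis=1 joins along the index. A minimal pandas sketch of that point, with the data frames rebuilt by hand from the R data above (the names aligned and by_position are illustrative assumptions):

```python
import pandas as pd

# The same three tables as the R data above, rows in their original order.
df1 = pd.DataFrame({"V1": ["AB", "FG", "HG", "TR"],
                    "V2": [45.2, 78.1, 98.1, 1.2],
                    "V3": [4.56, 54.1, 25.0, 3.25],
                    "V4": [0.21, 36.1, 12.6, 65.1]})
df2 = pd.DataFrame({"V1": ["TR", "HG", "FG", "AB"],
                    "V5": [5.2, 8.1, 9.0, 12.2],
                    "V6": [41.6, 23.1, 32.0, 31.25],
                    "V7": [10.21, 56.1, 32.6, 5.1]})
df3 = pd.DataFrame({"V1": ["HG", "TR", "AB", "FG"],
                    "V8": [15.2, 31.1, 12.1, 11.3],
                    "V9": [21.6, 32.1, 12.0, 31.25],
                    "V10": [20.21, 66.1, 62.6, 54.1]})

# Make the key column the index so that concat aligns rows by key.
frames = [df.set_index("V1") for df in (df1, df2, df3)]
aligned = pd.concat(frames, axis=1, join="inner").sort_index()

# For contrast: with the default integer index, rows are paired by position,
# so the first row mixes values from AB, TR and HG.
by_position = pd.concat([df1, df2, df3], axis=1)

print(aligned)
print(by_position["V1"].iloc[0].tolist())
```

Setting the index first is what makes the row order of each file irrelevant; join="inner" keeps only the keys present in all three tables.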