Python: merge three dataframes based on row values
Suppose I have a dataframe with this structure:
T1P1_T0 Count T1P1_T1 Count.1 T1P1_T3 Count.2
0 one 1207.0 four 1936 one 644.0
1 two 816.0 two 1601 seven 414.0
2 three 712.0 five 1457 NaN NaN
3 NaN NaN six 4564 NaN NaN
My desired output is as follows:
Element T1P1_T0 T1P1_T1 T1P1_T3
0 one 1207 NaN 644.0
1 two 816 1601.0 NaN
2 three 712 NaN NaN
3 four NaN 1936.0 NaN
4 five NaN 1457.0 NaN
5 six NaN 4564.0 NaN
6 seven NaN NaN 414.0
I tried splitting the initial dataframe into three parts:
df1 = df.iloc[:,:2]
df2 = df.iloc[:,2:4]
df3 = df.iloc[:,4:]
and then tried different ways of calling pd.merge to merge the first two, and then the third. For example:
result = pd.merge(df1, df2, right_on=df.iloc[:,0], left_on=df.iloc[:,0])
But the result is not what I want:
key_0 T1P1_T0 Count T1P1_T1 Count.1
0 one one 1207.0 four 1936
1 two two 816.0 two 1601
2 three three 712.0 five 1457
3 NaN NaN NaN six 4564
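I don't know how to specify the columns holding the element names as the key for the merge operation.
Any suggestions?
Thanks.
Starting from your data, you can do a bit more wrangling to get it into the desired shape. Also, rather than merging, try concatenating. As a side note, I wonder whether you could receive the data in a better format, so that you don't have to do wrangling where errors can creep in.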
df1 = df.iloc[:, :2].dropna()
df1 = (
df1.set_index(df1.iloc[:, 0].rename("Element"))
.iloc[:, -1]
.rename(df1.iloc[:, 0].name)
)
df2 = df.iloc[:, 2:4].dropna()
df2 = (
df2.set_index(df2.iloc[:, 0].rename("Element"))
.iloc[:, -1]
.rename(df2.iloc[:, 0].name)
)
df3 = df.iloc[:, 4:].dropna()
df3 = (
df3.set_index(df3.iloc[:, 0].rename("Element"))
.iloc[:, -1]
.rename(df3.iloc[:, 0].name)
)
df1
Element
one 1207.0
two 816.0
three 712.0
Name: T1P1_T0, dtype: float64
df2
Element
four 1936
two 1601
five 1457
six 4564
Name: T1P1_T1, dtype: int64
df3
Element
one 644.0
seven 414.0
Name: T1P1_T3, dtype: float64
Now, concatenate:
pd.concat([df1, df2, df3], axis="columns")
T1P1_T0 T1P1_T1 T1P1_T3
Element
one 1207.0 NaN 644.0
two 816.0 1601.0 NaN
three 712.0 NaN NaN
four NaN 1936.0 NaN
five NaN 1457.0 NaN
six NaN 4564.0 NaN
seven NaN NaN 414.0
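The three near-identical blocks above can be folded into a loop. The following is a minimal sketch, not the answerer's code: the sample dataframe is rebuilt by hand from the question's table, and the helper name `pair_to_series` is hypothetical. It assumes the columns always come in (name, count) pairs.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (an assumption about how
# the original dataframe was constructed).
df = pd.DataFrame(
    {
        "T1P1_T0": ["one", "two", "three", np.nan],
        "Count": [1207.0, 816.0, 712.0, np.nan],
        "T1P1_T1": ["four", "two", "five", "six"],
        "Count.1": [1936, 1601, 1457, 4564],
        "T1P1_T3": ["one", "seven", np.nan, np.nan],
        "Count.2": [644.0, 414.0, np.nan, np.nan],
    }
)


def pair_to_series(frame, start):
    """Turn the (name, count) column pair starting at `start` into a Series."""
    pair = frame.iloc[:, start:start + 2].dropna()
    names, counts = pair.iloc[:, 0], pair.iloc[:, 1]
    # Index the counts by element name and label the Series with the
    # name column's header (e.g. "T1P1_T0").
    return pd.Series(
        counts.to_numpy(),
        index=pd.Index(names, name="Element"),
        name=names.name,
    )


# One Series per column pair, then align them on the element names.
pieces = [pair_to_series(df, start) for start in range(0, df.shape[1], 2)]
result = pd.concat(pieces, axis="columns")
print(result)
```

Because `pd.concat` aligns on the index, elements missing from a given pair simply come out as NaN in that column.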
Let's use concat, where df1, df2 and df3 are the two-column slices from the question:
out = pd.concat([x.set_index(x.columns[0]).iloc[:, 0].dropna() for x in [df1, df2, df3]], keys=df.columns[::2], axis=1)
T1P1_T0 T1P1_T1 T1P1_T3
one 1207.0 NaN 644.0
two 816.0 1601.0 NaN
three 712.0 NaN NaN
four NaN 1936.0 NaN
five NaN 1457.0 NaN
six NaN 4564.0 NaN
seven NaN NaN 414.0
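For completeness, the question's original idea of merging on a key column can also work. The sketch below is an alternative to the concat answers, not something either answer proposed: it renames each pair's name column to a shared `Element` key and chains outer merges with `functools.reduce`. The sample dataframe is again rebuilt by hand from the question's table.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (an assumption).
df = pd.DataFrame(
    {
        "T1P1_T0": ["one", "two", "three", np.nan],
        "Count": [1207.0, 816.0, 712.0, np.nan],
        "T1P1_T1": ["four", "two", "five", "six"],
        "Count.1": [1936, 1601, 1457, 4564],
        "T1P1_T3": ["one", "seven", np.nan, np.nan],
        "Count.2": [644.0, 414.0, np.nan, np.nan],
    }
)

frames = []
for start in range(0, df.shape[1], 2):
    pair = df.iloc[:, start:start + 2].dropna()
    name_col, count_col = pair.columns
    # Give every frame the same key column ("Element") and move the group
    # label (e.g. "T1P1_T0") onto the count column.
    frames.append(pair.rename(columns={name_col: "Element", count_col: name_col}))

# An outer merge keeps elements that appear in only some of the frames.
merged = reduce(
    lambda left, right: pd.merge(left, right, on="Element", how="outer"),
    frames,
)
print(merged)
```

The row order of an outer merge may differ from the concat result, so sort or reindex afterwards if a particular order matters.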