Python: pandas merge or join with a smaller dataframe

python pandas dataframe

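I have a long dataframe and a short dataframe, and I want to merge them so that the shorter dataframe repeats itself to fill the length of the longer (left) df.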

df1:
| Index  | Wafer | Chip | Value |
---------------------------------
| 0      | 1     | 32   | 0.99  |
| 1      | 1     | 33   | 0.89  |
| 2      | 1     | 39   | 0.96  |
| 3      | 2     | 32   | 0.81  |
| 4      | 2     | 33   | 0.87  |
df2:
| Index  |   x   |   y  |
-------------------------
| 0      |   1   |   3  |
| 1      |   2   |   2  |
| 2      |   1   |   6  |
df_combined:
| Index  | Wafer | Chip | Value |   x   |   y   |
-------------------------------------------------
| 0      | 1     | 32   | 0.99  |   1   |   3   |
| 1      | 1     | 33   | 0.89  |   2   |   2   |
| 2      | 1     | 39   | 0.96  |   1   |   6   |
| 3      | 2     | 32   | 0.81  |   1   |   3   |
| 4      | 2     | 33   | 0.87  |   2   |   2   |

You can repeat df2 until it is at least as long as df1, then reset_index, trim to the length of df1, and merge:
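import math
import pandas as pd

# Round up so the repeated frame is never shorter than df1
new_len = math.ceil(len(df1) / len(df2))
repeated = (pd.concat([df2] * new_len)
              .reset_index(drop=True)   # fresh 0..n-1 index, old one discarded
              .iloc[:len(df1)])         # trim to exactly len(df1) rows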

repeated
   x  y
0  1  3
1  2  2
2  1  6
3  1  3
4  2  2

df1.merge(repeated, how="outer", left_index=True, right_index=True)
   Wafer  Chip  Value   x  y
0      1    32    0.99  1  3
1      1    33    0.89  2  2
2      1    39    0.96  1  6
3      2    32    0.81  1  3
4      2    33    0.87  2  2
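
For reference, the steps above can be run end to end on the sample data from the question. This is only a sketch: the helper name `repeat_to_length` is made up for illustration, and `DataFrame.join` is used in place of `merge(..., left_index=True, right_index=True)`, on the assumption that both frames carry a default 0..n-1 index.

```python
import math
import pandas as pd

# Sample frames copied from the question (Index is treated as the row index)
df1 = pd.DataFrame({
    "Wafer": [1, 1, 1, 2, 2],
    "Chip": [32, 33, 39, 32, 33],
    "Value": [0.99, 0.89, 0.96, 0.81, 0.87],
})
df2 = pd.DataFrame({"x": [1, 2, 1], "y": [3, 2, 6]})


def repeat_to_length(short_df, target_len):
    """Stack copies of short_df and trim to target_len rows (hypothetical helper)."""
    # Round up so there are always enough rows; at least one copy for concat
    n_copies = max(1, math.ceil(target_len / len(short_df)))
    stacked = pd.concat([short_df] * n_copies, ignore_index=True)
    return stacked.iloc[:target_len]


repeated = repeat_to_length(df2, len(df1))
df_combined = df1.join(repeated)  # index-on-index left join
print(df_combined)
```

Rounding up matters when the lengths do not divide evenly: with 4 rows in df1 and 3 in df2, rounding 4/3 to the nearest integer gives a single copy, which is one row short.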

It's a bit hacky, but it should work.

Note: I'm assuming your Index column isn't actually a column, but is meant to represent the dataframe's index. I made that assumption because you reference the left_index/right_index parameters in your merge() code. If Index really is its own column, this code will still basically work; you can simply drop Index if you don't want it in the final df.

That worked, thanks! I'd be curious to know whether there's a more built-in way, but this is fine for now, cheers. Regarding your edit: correct. In my actual code I used left_index=True and right_index=True.

Great, glad I could help.

A left join on df1["Index"] modulo the length of df2["Index"] would also achieve this:

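# Build a key on df1 that cycles through df2's rows: 0, 1, ..., n-1, 0, 1, ...
# (assumes "Index" is a regular integer column holding 0, 1, 2, ... in both frames)
n = df2.shape[0]
df1["Modular Index"] = df1["Index"].astype(int) % n

# Left join keeps every row of df1 and attaches the matching df2 row
df_combined = df1.merge(df2, how="left", left_on="Modular Index", right_on="Index")

# Drop the helper key and df2's copy of the Index column
df_combined = df_combined.drop(["Modular Index", "Index_y"], axis=1)

print(df_combined)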

   Index_x  Wafer  Chip  Value  x  y
0        0      1    32   0.99  1  3
1        1      1    33   0.89  2  2
2        2      1    39   0.96  1  6
3        3      2    32   0.81  1  3
4        4      2    33   0.87  2  2
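
If the frames have no explicit Index column, the same modulo idea can be applied to row positions instead of a key column. The following is a minimal sketch under that assumption, using the sample data from the question; the names `positions` and `cycled` are illustrative and do not come from the answers above.

```python
import numpy as np
import pandas as pd

# Sample frames copied from the question, without an explicit Index column
df1 = pd.DataFrame({
    "Wafer": [1, 1, 1, 2, 2],
    "Chip": [32, 33, 39, 32, 33],
    "Value": [0.99, 0.89, 0.96, 0.81, 0.87],
})
df2 = pd.DataFrame({"x": [1, 2, 1], "y": [3, 2, 6]})

# Row positions into df2 that wrap around: 0, 1, 2, 0, 1, ...
positions = np.arange(len(df1)) % len(df2)

# Pick df2's rows by position, then give them df1's index so they line up
cycled = df2.iloc[positions].copy()
cycled.index = df1.index

# Place the two frames side by side
df_combined = pd.concat([df1, cycled], axis=1)
print(df_combined)
```

Because the rows are selected by position, this variant needs no helper column and no columns to drop afterwards, and it does not depend on what labels df1's index happens to hold.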