Python 3.x: Extending dataframe columns using another dataframe's column values
I have two dataframes (df1 and df2), as shown below:
In [4]:df1
Year Annual Counts
0 1979 45345
1 1980 15381
2 1981 32171
3 1982 30288
4 1983 50573
In [5]:df2
Year CanESM2 GFDL-ESM2M HadGEM2-ES365 IPSL-CM5A-MR NorESM1-M
0 1984 10645 48143 57366 26979 37603
1 1985 15918 17178 34617 21304 31956
2 1986 51790 44111 50017 29233 61203
3 1987 34039 14504 23136 35848 34688
4 1988 68641 67681 24322 39591 34553
I want to combine these two dataframes as follows:
Year CanESM2 GFDL-ESM2M HadGEM2-ES365 IPSL-CM5A-MR NorESM1-M
0 1979 45345 45345 45345 45345 45345
1 1980 15381 15381 15381 15381 15381
2 1981 32171 32171 32171 32171 32171
3 1982 30288 30288 30288 30288 30288
4 1983 50573 50573 50573 50573 50573
5 1984 10645 48143 57366 26979 37603
6 1985 15918 17178 34617 21304 31956
7 1986 51790 44111 50017 29233 61203
8 1987 34039 14504 23136 35848 34688
9 1988 68641 67681 24322 39591 34553
I have a simple solution:
import pandas as pd
# file1 and file2 are the already-loaded source data
df1 = pd.DataFrame(file1)
df1_list = df1['Annual Counts'].tolist()
df2 = pd.DataFrame(file2)
models = ['CanESM2', 'GFDL-ESM2M', 'HadGEM2-ES365', 'IPSL-CM5A-MR', 'NorESM1-M']
# one extended list per model: df1's counts followed by that model's df2 values
ext = {}
for m in models:
    ext[m] = df1_list + df2[m].tolist()
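Does pandas have a function for this task that avoids creating multiple lists and extending them? Any suggestions?
Here is one way: rename the column Annual Counts to CanESM2, then use combine_first after setting Year and CanESM2 as the index, and finally ffill on axis=1: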
(df1.rename(columns={'Annual Counts':'CanESM2'})
.set_index(['Year','CanESM2']).combine_first(df2.set_index(['Year','CanESM2']))
.reset_index().ffill(axis=1))
Another way: the same approach as anky_91's, renaming the column, but using pd.concat here and forward filling (ffill) on axis=1:
pd.concat([df1.rename(columns={'Annual Counts':'CanESM2'}), df2],
ignore_index=True,
sort=False).ffill(axis=1)
Output:
Year CanESM2 GFDL-ESM2M HadGEM2-ES365 IPSL-CM5A-MR NorESM1-M
0 1979.0 45345.0 45345.0 45345.0 45345.0 45345.0
1 1980.0 15381.0 15381.0 15381.0 15381.0 15381.0
2 1981.0 32171.0 32171.0 32171.0 32171.0 32171.0
3 1982.0 30288.0 30288.0 30288.0 30288.0 30288.0
4 1983.0 50573.0 50573.0 50573.0 50573.0 50573.0
5 1984.0 10645.0 48143.0 57366.0 26979.0 37603.0
6 1985.0 15918.0 17178.0 34617.0 21304.0 31956.0
7 1986.0 51790.0 44111.0 50017.0 29233.0 61203.0
8 1987.0 34039.0 14504.0 23136.0 35848.0 34688.0
9 1988.0 68641.0 67681.0 24322.0 39591.0 34553.0
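The values print as floats because the model columns hold NaN for the df1 rows before the fill, which forces a float dtype. Below is a minimal, self-contained sketch of the pd.concat approach using the sample data from the question. The names stacked and combined are only illustrative, and the final astype(int) is an optional extra step that assumes no NaN is left after the fill (true for this data, since Year is never missing).

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Year": [1979, 1980, 1981, 1982, 1983],
    "Annual Counts": [45345, 15381, 32171, 30288, 50573],
})
df2 = pd.DataFrame({
    "Year": [1984, 1985, 1986, 1987, 1988],
    "CanESM2": [10645, 15918, 51790, 34039, 68641],
    "GFDL-ESM2M": [48143, 17178, 44111, 14504, 67681],
    "HadGEM2-ES365": [57366, 34617, 50017, 23136, 24322],
    "IPSL-CM5A-MR": [26979, 21304, 29233, 35848, 39591],
    "NorESM1-M": [37603, 31956, 61203, 34688, 34553],
})

# Stack the frames; df1's rows get NaN in every model column except CanESM2.
stacked = pd.concat(
    [df1.rename(columns={"Annual Counts": "CanESM2"}), df2],
    ignore_index=True,
    sort=False,
)

# Fill each row from left to right, then cast back to integers
# (assumes no NaN remains after the fill).
combined = stacked.ffill(axis=1).astype(int)
print(combined)
```

If a row could have a missing Year, the forward fill would leave NaN in place and the integer cast would fail, so it is worth checking for missing values first on real data.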