Python: merging rows in a DataFrame
I have financial performance indicators for different companies, one row per year. Now I would like to have all indicators for each company, over a specific range of consecutive years, in a single row. Right now my data looks similar to this:
import numpy as np
import pandas as pd
startyear = 2014
endyear = 2015
df = pd.DataFrame(np.array([
    ['AAPL', 2014, 0.2, 0.4, 1.5],
    ['AAPL', 2015, 0.3, 0.4, 2.0],
    ['AAPL', 2016, 0.2, 0.3, 1.5],
    ['GOGL', 2014, 0.4, 0.5, 0.5],
    ['GOGL', 2015, 0.6, 0.8, 1.0],
    ['GOGL', 2016, 0.3, 0.5, 2.0]]),
    columns=['Name', 'Year', 'ROE', 'ROA', 'DE'])
newcolumns = (df.columns + [str(startyear)]).append(df.columns + [str(endyear)])
dfnew=pd.DataFrame(columns=newcolumns)
What I would like to get is (e.g. for 2014 and 2015 only):
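So far I have only managed to build the new column names, but somehow I can't figure out how to fill this new DataFrame.

It is probably easier to create the new DataFrame first and then adjust the column names: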
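# limit to the years you want (Year holds strings here, because np.array
# converts the mixed rows to a single string dtype)
dfnew = df[df.Year.isin(['2014', '2015'])]
# set index to 'Name' and 'Year', then pivot the 'Year' values into the columns
dfnew = dfnew.set_index(['Name', 'Year']).unstack()
# sort the columns by year (sort_index replaces sortlevel, which has been
# removed from pandas)
dfnew = dfnew.sort_index(axis=1, level=1)
# flatten the (metric, year) column pairs into names such as 'ROE2014'
dfnew.columns = ["".join(a) for a in dfnew.columns.values]
# put 'Name' back into the columns; reset_index returns a new frame,
# so the result has to be assigned
dfnew = dfnew.reset_index()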
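For reference, here is a self-contained sketch of the same set_index / unstack approach that should run on a current pandas version. It makes two assumptions that are not in the original post: the frame is built from a plain list of rows, so Year stays an integer and the ratios stay floats, and the year filter uses `between` with the question's `startyear` and `endyear` instead of a hard-coded list. The names `subset` and `wide` are just illustrative.

```python
import pandas as pd

startyear, endyear = 2014, 2015

# Same sample data as in the question, but built from a list of rows
# (assumption) so that Year is an integer and the ratios are floats.
df = pd.DataFrame(
    [
        ["AAPL", 2014, 0.2, 0.4, 1.5],
        ["AAPL", 2015, 0.3, 0.4, 2.0],
        ["AAPL", 2016, 0.2, 0.3, 1.5],
        ["GOGL", 2014, 0.4, 0.5, 0.5],
        ["GOGL", 2015, 0.6, 0.8, 1.0],
        ["GOGL", 2016, 0.3, 0.5, 2.0],
    ],
    columns=["Name", "Year", "ROE", "ROA", "DE"],
)

# Keep only the years in the requested range (both ends inclusive).
subset = df[df["Year"].between(startyear, endyear)]

# One row per company; the columns become (metric, year) pairs.
wide = subset.set_index(["Name", "Year"]).unstack("Year")

# Group the columns by year.
wide = wide.sort_index(axis=1, level=1)

# Flatten the (metric, year) pairs into names such as "ROE2014".
wide.columns = [f"{metric}{year}" for metric, year in wide.columns]

# Move 'Name' from the index back into a regular column.
wide = wide.reset_index()
print(wide)
```

The result has one row per company and one column per metric and year. If a particular order of the metrics within each year matters, selecting the columns explicitly (for example `wide[["Name", "ROE2014", "ROA2014", "DE2014"]]`) is more predictable than relying on the sort.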