Python: create new columns based on the number of rows matching values in another DataFrame
I want to add new columns to df1 based on the number of rows in df2 for each fruit.
Expected Output of df1
No | Fruit_Name | 2018 | 2019 | 2020
1 | Apple | 2 | 1 | 0
2 | Banana | 0 | 0 | 1
3 | Cherries | 0 | 0 | 1
Code that does not work:
i=0
for i in range(3):
    df1['2018'] = len(df2.loc[df2['fruit_farmed'] == df1['Fruit_Name'][i]])
    df1['2019'] = len(df2.loc[df2['fruit_farmed'] == df1['Fruit_Name'][i]])
    df1['2020'] = len(df2.loc[df2['fruit_farmed'] == df1['Fruit_Name'][i]])
    i=i+1
Output:
   No Fruit_Name  2018  2019  2020
0   1      Apple     1     1     1
1   2     Banana     1     1     1
2   3   Cherries     1     1     1
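The loop above assigns one scalar to the whole column on every pass and never filters by year, so each column ends up holding whatever count the last iteration produced. A minimal sketch of a loop that counts per fruit and per year follows. It assumes df2 has the columns year and fruit_farmed; the sample df1 is reconstructed from the expected output and the sample df2 is illustrative, not the asker's actual data.

```python
import pandas as pd

# Assumed sample data: df1 mirrors the expected output, df2 is illustrative
df1 = pd.DataFrame({'No': [1, 2, 3],
                    'Fruit_Name': ['Apple', 'Banana', 'Cherries']})
df2 = pd.DataFrame([[2018, 'John', 'Apple'], [2019, 'Timo', 'Apple'],
                    [2020, 'Eva', 'Cherries'], [2020, 'Frey', 'Banana'],
                    [2018, 'Ali', 'Apple']],
                   columns=['year', 'farmer', 'fruit_farmed'])

for year in (2018, 2019, 2020):
    # Keep only the rows of df2 that belong to this year
    rows_in_year = df2[df2['year'] == year]
    # Build one count per row of df1 instead of one scalar for the whole column
    df1[str(year)] = [
        int((rows_in_year['fruit_farmed'] == fruit).sum())
        for fruit in df1['Fruit_Name']
    ]

print(df1)
```

This keeps the shape of the original loop, but it scans df2 once per fruit and year, so a vectorized count is usually preferable on larger data.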
You can try using crosstab and then join:
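import pandas as pd
# count the rows of df2 for every (fruit, year) pair
s = pd.crosstab(df2.fruit_farmed, df2.year)
# follow the fruit order of df1; fruits missing from df2 get 0 instead of NaN
s = s.reindex(df1.Fruit_Name, fill_value=0)
# reuse df1's index so that join lines up row by row
s.index = df1.index
df1 = df1.join(s)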
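To try the crosstab approach end to end, here is a small self-contained sketch. The sample frames are assumptions: df1 is reconstructed from the expected output in the question, and df2 is illustrative data with the columns year, farmer and fruit_farmed. The names counts and result are only used for this illustration.

```python
import pandas as pd

# Assumed sample data (not the asker's real frames)
df1 = pd.DataFrame({'No': [1, 2, 3],
                    'Fruit_Name': ['Apple', 'Banana', 'Cherries']})
df2 = pd.DataFrame([[2018, 'John', 'Apple'], [2019, 'Timo', 'Apple'],
                    [2020, 'Eva', 'Cherries'], [2020, 'Frey', 'Banana'],
                    [2018, 'Ali', 'Apple']],
                   columns=['year', 'farmer', 'fruit_farmed'])

# Frequency table: one row per fruit, one column per year
counts = pd.crosstab(df2['fruit_farmed'], df2['year'])
# Order the rows like df1; a fruit that never appears in df2 gets zeros
counts = counts.reindex(df1['Fruit_Name'], fill_value=0)
# Turn the integer year labels into strings such as '2018'
counts.columns = [str(year) for year in counts.columns]
# Give the counts the same index as df1 so the join aligns row by row
counts.index = df1.index

result = df1.join(counts)
print(result)
```

Converting the year labels to strings is optional; it is done here only so that the columns can be addressed as '2018', '2019' and '2020', as in the question.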
Another way is to group by fruit_farmed and year, count the rows, and then unstack year:
import pandas as pd
df2 = pd.DataFrame([[2018,'John','Apple'],[2019,'Timo','Apple'],
                    [2020,'Eva','Cherries'],[2020,'Frey','Banana'],
                    [2018,'Ali','Apple']],
                   columns=['year','farmer','fruit_farmed'])
# count rows per (fruit, year), pivot the years into columns, fill missing pairs with 0
df1 = df2.groupby(['fruit_farmed','year']).count().unstack('year').reset_index().fillna(0)
#rename the columns
df1.columns = ['fruit_farmed','2018','2019','2020']
print(df1)
Output:
   fruit_farmed  2018  2019  2020
0         Apple   2.0   1.0   0.0
1        Banana   0.0   0.0   1.0
2      Cherries   0.0   0.0   1.0
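The counts above come out as floats because unstack first produces NaN for missing fruit and year pairs, and fillna(0) only replaces them afterwards. A possible variant, sketched here with the same sample data (fruit name spelled Banana, as in the question's expected output), uses size() together with unstack(fill_value=0) so the counts stay integers. The name counts is only used for this illustration.

```python
import pandas as pd

df2 = pd.DataFrame([[2018, 'John', 'Apple'], [2019, 'Timo', 'Apple'],
                    [2020, 'Eva', 'Cherries'], [2020, 'Frey', 'Banana'],
                    [2018, 'Ali', 'Apple']],
                   columns=['year', 'farmer', 'fruit_farmed'])

counts = (
    df2.groupby(['fruit_farmed', 'year'])
       .size()                          # number of rows per (fruit, year) pair
       .unstack('year', fill_value=0)   # years become columns, missing pairs become 0
)
# Replace the integer year labels with plain strings
counts.columns = [str(year) for year in counts.columns]
# Move fruit_farmed from the index back into a regular column
counts = counts.reset_index()
print(counts)
```

Because the missing pairs are filled during the reshape, no NaN is ever introduced and the year columns keep an integer dtype.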