Python: multiplying DataFrames to get the product of the values in a column


python pandas recursion dataframe


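I need help creating a Python function that does the following:

1) Takes 3 DataFrames as input (each has an index column and an associated integer or float value in a second column). They are defined as follows: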

import pandas as pd

df1=pd.DataFrame([['placementA',2],['placementB',4]],columns=
['placement','value'])
df1.set_index('placement',inplace=True)

df2=pd.DataFrame([['strategyA',1],['strategyB',5],['strategyC',6]],columns=
['strategy','value'])
df2.set_index('strategy',inplace=True)

df3=pd.DataFrame([['categoryA',1.5],['categoryB',2.5]],columns=
['category','value'])
df3.set_index('category',inplace=True)

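2) Using these three DataFrames, creates a new DataFrame ("df4") whose first 3 columns hold every possible combination of the 3 indexes.

3) In the 4th column, appends the mathematical product of all the associated "value" entries from the three source DataFrames. So the function should output a DataFrame with those three combination columns followed by the product column.

Many thanks for your help,

Colin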

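Use product over all the indexes and over all the value columns, create a DataFrame from each with the constructor, and multiply across all the columns.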
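A minimal sketch of this approach, rebuilding the three frames from the question so that it runs on its own. It is a reconstruction of the steps described, not necessarily the answer's original code; `combos` and `values` are names chosen here for illustration.

```python
from itertools import product

import pandas as pd

# The three input frames from the question, indexed by their label column.
df1 = pd.DataFrame({'value': [2, 4]},
                   index=pd.Index(['placementA', 'placementB'], name='placement'))
df2 = pd.DataFrame({'value': [1, 5, 6]},
                   index=pd.Index(['strategyA', 'strategyB', 'strategyC'], name='strategy'))
df3 = pd.DataFrame({'value': [1.5, 2.5]},
                   index=pd.Index(['categoryA', 'categoryB'], name='category'))

names = ['placement', 'strategy', 'category']

# One row per combination of index labels.
combos = pd.DataFrame(list(product(df1.index, df2.index, df3.index)), columns=names)

# One row per combination of values, generated in the same order as the labels.
values = pd.DataFrame(list(product(df1['value'], df2['value'], df3['value'])))

# Multiply across the columns of each row to get the product.
combos['mult'] = values.prod(axis=1)
print(combos)
```

Because both calls to `product` iterate the frames in the same order, row i of `values` belongs to row i of `combos`.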

An alternative is to multiply all the values in a list comprehension:

import functools
import operator
from itertools import product

names = ['placement', 'strategy', 'category']
# every combination of the three index labels
a = list(product(df1.index, df2.index, df3.index))

# the matching combinations of values, multiplied together per tuple
b = product(df1['value'], df2['value'], df3['value'])
data = [functools.reduce(operator.mul, x, 1) for x in b]

df = pd.DataFrame(a, columns=names).assign(mult=data)
print(df)
     placement   strategy   category  mult
0   placementA  strategyA  categoryA   3.0
1   placementA  strategyA  categoryB   5.0
2   placementA  strategyB  categoryA  15.0
3   placementA  strategyB  categoryB  25.0
4   placementA  strategyC  categoryA  18.0
5   placementA  strategyC  categoryB  30.0
6   placementB  strategyA  categoryA   6.0
7   placementB  strategyA  categoryB  10.0
8   placementB  strategyB  categoryA  30.0
9   placementB  strategyB  categoryB  50.0
10  placementB  strategyC  categoryA  36.0
11  placementB  strategyC  categoryB  60.0
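On Python 3.8 or later, `math.prod` from the standard library can stand in for the `functools.reduce(operator.mul, x, 1)` expression. A small sketch, assuming plain lists in place of the `value` columns (the numbers are the ones from the question; the list names are illustrative):

```python
import math
from itertools import product

# Stand-ins for df1['value'], df2['value'] and df3['value'].
placement_values = [2, 4]
strategy_values = [1, 5, 6]
category_values = [1.5, 2.5]

# math.prod multiplies all items of each combination tuple.
data = [
    math.prod(combo)
    for combo in product(placement_values, strategy_values, category_values)
]
print(data)
```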

A dynamic solution for a list of DataFrames; it only requires the same column name, value, in each DataFrame:

dfs = [df1, df2, df3]

names = ['placement','strategy','category']
a = list(product(*[x.index for x in dfs]))
b = list(product(*[x['value'] for x in dfs]))
data = pd.DataFrame(b).product(1)

df = pd.DataFrame(a, columns=names).assign(mult=data)
print (df)
     placement   strategy   category  mult
0   placementA  strategyA  categoryA   3.0
1   placementA  strategyA  categoryB   5.0
2   placementA  strategyB  categoryA  15.0
3   placementA  strategyB  categoryB  25.0
4   placementA  strategyC  categoryA  18.0
5   placementA  strategyC  categoryB  30.0
6   placementB  strategyA  categoryA   6.0
7   placementB  strategyA  categoryB  10.0
8   placementB  strategyB  categoryA  30.0
9   placementB  strategyB  categoryB  50.0
10  placementB  strategyC  categoryA  36.0
11  placementB  strategyC  categoryB  60.0
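Since the question asks for a function, the dynamic snippet can be wrapped in one. This is a sketch under assumptions: the function name `combine_products` and its `value_col` and `out_col` parameters are hypothetical, and every frame is assumed to carry its values in the same column.

```python
from itertools import product

import pandas as pd


def combine_products(dfs, names, value_col='value', out_col='mult'):
    """Return all combinations of the frames' index labels plus the product of their values."""
    # Label combinations and value combinations are generated in the same order.
    labels = list(product(*[df.index for df in dfs]))
    values = list(product(*[df[value_col] for df in dfs]))

    out = pd.DataFrame(labels, columns=names)
    # Row-wise product across one column per input frame.
    out[out_col] = pd.DataFrame(values).prod(axis=1)
    return out


# Usage with two frames, to show that the number of inputs is not fixed at three.
a = pd.DataFrame({'value': [2, 4]}, index=['placementA', 'placementB'])
b = pd.DataFrame({'value': [1.5, 2.5]}, index=['categoryA', 'categoryB'])
result = combine_products([a, b], ['placement', 'category'])
print(result)
```

The length of `names` has to match the number of frames passed in, otherwise the `DataFrame` constructor raises an error.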

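@Bharath I guess itertools is the better choice, to be sure that itertools and pd.MultiIndex generate the same product; otherwise the values would be matched to the wrong labels.

Sir, the MultiIndex from product is very nice. The OP added category C in the output but not in the question.

Thanks jezrael - I noticed this doesn't exactly match my expected output. It looks as though my "category C" is missing from the output combinations. I guess this is a simple change? Also, I should mention that I would like to extend this solution to a variable number of input DataFrames (it won't always be 3 frames as in this example).

@ColinBlyth This is a very good solution. You can vote now, so please upvote and thank the answerer.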
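The ordering concern raised in the comments can be checked directly. The sketch below, using the index labels from the question as plain lists, compares the tuples of `pd.MultiIndex.from_product` with those of `itertools.product`; `levels` and `same_order` are illustrative names.

```python
from itertools import product

import pandas as pd

# Index labels of the three frames in the question.
levels = [
    ['placementA', 'placementB'],
    ['strategyA', 'strategyB', 'strategyC'],
    ['categoryA', 'categoryB'],
]

mux = pd.MultiIndex.from_product(levels, names=['placement', 'strategy', 'category'])

# Iterating a MultiIndex yields one tuple per row, so the two can be compared as lists.
same_order = list(mux) == list(product(*levels))
print(same_order)
```

When the two orders agree, values produced by `itertools.product` can be attached to a `MultiIndex` built this way without mismatching rows.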
