Python: How to create new columns from the products of multiple columns


I have a pandas DataFrame:

 import pandas as pd

 df = pd.DataFrame({'dummy_1' : [0, 0, 0, 1, 1, 0],
                    'dummy_2' : [1, 1, 0, 0, 1, 1],
                    'dummy_3' : [1, 1, 1, 0, 0, 0]})
I want to add new columns (to the same DataFrame) that hold the pairwise products: each column multiplied by each of the other two.

So the resulting DataFrame would look like this:

df = pd.DataFrame({     'dummy_1' : [0, 0, 0, 1, 1, 0],
                        'dummy_2' : [1, 1, 0, 0, 1, 1],
                        'dummy_3' : [1, 1, 1, 0, 0, 0],
                        'dummy_12' :[0, 0, 0, 0, 1, 0],
                        'dummy_13' :[0, 0, 0, 0, 0, 0],
                        'dummy_23' :[1, 1, 0, 0, 0, 0]})
Is there an efficient way to do this? By efficient, I mean an approach that would still work with 50 columns.

You need:

import pandas as pd

df = pd.DataFrame({'dummy_1' : [0, 0, 0, 1, 1, 0],
                    'dummy_2' : [1, 1, 0, 0, 1, 1],
                    'dummy_3' : [1, 1, 1, 0, 0, 0]})

df['dummy_12'] = df['dummy_1']*df['dummy_2']
df['dummy_13'] = df['dummy_1']*df['dummy_3']
df['dummy_23'] = df['dummy_2']*df['dummy_3']

print(df)
Output:

   dummy_1  dummy_2  dummy_3  dummy_12  dummy_13  dummy_23
0        0        1        1         0         0         1
1        0        1        1         0         0         1
2        0        0        1         0         0         0
3        1        0        0         0         0         0
4        1        1        0         1         0         0
5        0        1        0         0         0         0

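Use itertools.combinations to get every pair of columns, then iterate over those pairs, computing the vectorized product and assigning it to a new column.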

import pandas as pd
from itertools import combinations
df = pd.DataFrame({'dummy_1' : [0, 0, 0, 1, 1, 0],
                'dummy_2' : [1, 1, 0, 0, 1, 1],
                'dummy_3' : [1, 1, 1, 0, 0, 0]})
for i in combinations(df.columns, 2):
    col_name = i[0] + i[1].split('_')[-1]
    df[col_name] = df[i[0]] * df[i[1]]
Output:

dummy_1 dummy_2 dummy_3 dummy_12    dummy_13    dummy_23
0       1       1       0           0           1
0       1       1       0           0           1
0       0       1       0           0           0
1       0       0       0           0           0
1       1       0       1           0           0
0       1       0       0           0           0
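For the 50-column case raised in the question, two details of the loop above may need attention. Concatenating the numeric suffixes can become ambiguous once suffixes have two digits (dummy_1 with dummy_12 and dummy_11 with dummy_2 would both be named dummy_112), and inserting more than a thousand columns one at a time can be slow. The following is only a sketch under assumptions: the 50-column frame, the random data, and the `_x_` naming scheme are hypothetical and are not part of the original answer.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Hypothetical 50-column frame of 0/1 dummies named dummy_1 .. dummy_50.
rng = np.random.default_rng(0)
base_cols = [f"dummy_{k}" for k in range(1, 51)]
df = pd.DataFrame(rng.integers(0, 2, size=(6, 50)), columns=base_cols)

# Compute every pairwise product first, keyed by an unambiguous name
# such as "dummy_1_x_12" (the "_x_" separator is an assumed convention).
products = {
    f"{a}_x_{b.split('_')[-1]}": df[a] * df[b]
    for a, b in combinations(base_cols, 2)
}

# Attach all new columns in one step instead of inserting them one by one.
df = pd.concat([df, pd.DataFrame(products)], axis=1)
```

With 50 base columns this yields 1225 product columns, so the naming scheme matters more than it does in the three-column example.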

This should do what you need without any extra imports. To use it on a larger DataFrame (e.g. 50 columns), just change the maximum range of i and j.
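A nested loop over i and j of the kind described might look like the sketch below. It is an illustration rather than the answerer's exact code, and it assumes the columns are named dummy_<n>; the variable `n_cols` is a hypothetical name introduced here.

```python
import pandas as pd

df = pd.DataFrame({'dummy_1': [0, 0, 0, 1, 1, 0],
                   'dummy_2': [1, 1, 0, 0, 1, 1],
                   'dummy_3': [1, 1, 1, 0, 0, 0]})

# Assumed number of original dummy_<n> columns; raise it for a wider frame.
n_cols = 3

# Visit every pair (i, j) with i < j and multiply the two columns element-wise.
for i in range(1, n_cols + 1):
    for j in range(i + 1, n_cols + 1):
        df[f'dummy_{i}{j}'] = df[f'dummy_{i}'] * df[f'dummy_{j}']

print(df)
```

Note that names built as dummy_{i}{j} are only unambiguous while i and j are single digits; for 50 columns a separator between the two numbers would likely be needed.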

Output:

   dummy_1  dummy_2  dummy_3  dummy_12  dummy_13  dummy_23
0        0        1        1         0         0         1
1        0        1        1         0         0         1
2        0        0        1         0         0         0
3        1        0        0         0         0         0
4        1        1        0         1         0         0
5        0        1        0         0         0         0

I'm looking for an approach that works for, e.g., 50 columns.

See @mad_'s solution. It is more general.