Python: Finding the unique possible combinations of 3 columns in pandas

I am trying to find all possible combinations of 3 variable columns in pandas. A sample df looks like this:

          Variable_Name Variable1 Variable2 Variable3
0                  X      6.0%      8.0%     10.0%
1                  Y      3.0%      4.0%      5.0%
2                  Z      1.0%      3.0%      5.0%
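Each combination may only take a variable's value from that variable's own options; values cannot be moved between variables. For example, using 4.0% for "X" would be incorrect.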

I tried using itertools.combinations, itertools.product, and itertools.permutations, but these give all possible combinations rather than only the ones restricted as described above.

I would like the result to look like the following, giving 27 possible combinations:

     Y      X     Z
0   3.0%   6.0%  1.0%
1   3.0%   6.0%  3.0%
2   3.0%   6.0%  5.0%
3   3.0%   8.0%  1.0%
4   3.0%   8.0%  3.0%
5   3.0%   8.0%  5.0%
6   3.0%  10.0%  1.0%
7   3.0%  10.0%  3.0%
8   3.0%  10.0%  5.0%
9   4.0%   8.0%  3.0%
10  4.0%   8.0%  1.0%
11  4.0%   8.0%  5.0%
12  4.0%   6.0%  1.0%
13  4.0%   6.0%  3.0%
14  4.0%   6.0%  5.0%
15  4.0%  10.0%  1.0%
16  4.0%  10.0%  3.0%
17  4.0%  10.0%  5.0%
18  5.0%  10.0%  5.0%
19  5.0%  10.0%  1.0%
20  5.0%  10.0%  3.0%
21  5.0%   8.0%  1.0%
22  5.0%   8.0%  3.0%
23  5.0%   8.0%  5.0%
24  5.0%   6.0%  1.0%
25  5.0%   6.0%  3.0%
26  5.0%   6.0%  5.0%

Any help would be greatly appreciated.

Let's try successively cross merging the values of each variable:

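from functools import reduce
import pandas as pd

df = pd.DataFrame({'Variable_Name': {0: 'X', 1: 'Y', 2: 'Z'},
                   'Variable1': {0: '6.0%', 1: '3.0%', 2: '1.0%'},
                   'Variable2': {0: '8.0%', 1: '4.0%', 2: '3.0%'},
                   'Variable3': {0: '10.0%', 1: '5.0%', 2: '5.0%'}})

# Save the variable names for later use
variable_names = df['Variable_Name']
# Get the variable options in their own rows
new_df = df.set_index('Variable_Name').stack() \
    .droplevel(1, 0) \
    .reset_index()
# Get a collection of DataFrames, each holding one variable's options
dfs = tuple(new_df[new_df['Variable_Name'].eq(v)]
            .drop(columns=['Variable_Name']) for v in variable_names)
# Successively cross merge
new_df = reduce(lambda left, right: pd.merge(left, right, how='cross'), dfs)
# Fix the column names
new_df.columns = variable_names
# Fix the axis name
new_df = new_df.rename_axis(None, axis=1)
# Display
print(new_df.to_string())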
Output:

        X     Y     Z
0    6.0%  3.0%  1.0%
1    6.0%  3.0%  3.0%
2    6.0%  3.0%  5.0%
3    6.0%  4.0%  1.0%
4    6.0%  4.0%  3.0%
5    6.0%  4.0%  5.0%
6    6.0%  5.0%  1.0%
7    6.0%  5.0%  3.0%
8    6.0%  5.0%  5.0%
9    8.0%  3.0%  1.0%
10   8.0%  3.0%  3.0%
11   8.0%  3.0%  5.0%
12   8.0%  4.0%  1.0%
13   8.0%  4.0%  3.0%
14   8.0%  4.0%  5.0%
15   8.0%  5.0%  1.0%
16   8.0%  5.0%  3.0%
17   8.0%  5.0%  5.0%
18  10.0%  3.0%  1.0%
19  10.0%  3.0%  3.0%
20  10.0%  3.0%  5.0%
21  10.0%  4.0%  1.0%
22  10.0%  4.0%  3.0%
23  10.0%  4.0%  5.0%
24  10.0%  5.0%  1.0%
25  10.0%  5.0%  3.0%
26  10.0%  5.0%  5.0%

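Since the question mentions itertools.product, here is a possible cross-check that is not part of the answer above. It is a minimal sketch that assumes each row of the sample df holds one variable's options: passing every row as a separate argument to product restricts each position to that variable's own values. The names options and combos are made up for illustration.

```python
from itertools import product

import pandas as pd

df = pd.DataFrame({
    'Variable_Name': ['X', 'Y', 'Z'],
    'Variable1': ['6.0%', '3.0%', '1.0%'],
    'Variable2': ['8.0%', '4.0%', '3.0%'],
    'Variable3': ['10.0%', '5.0%', '5.0%'],
})

# One row of options per variable, indexed by the variable's name
options = df.set_index('Variable_Name')

# Unpack the rows so product() draws exactly one value from each variable
combos = pd.DataFrame(
    list(product(*options.to_numpy().tolist())),
    columns=list(options.index),
)

print(combos.shape)
```

With three options for each of three variables this yields 3 * 3 * 3 = 27 rows, with the last variable changing fastest, as in the cross-merge output.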
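You can use a cross join. In pandas, you can do this by passing the parameter how='cross' to pd.merge() or pd.DataFrame.join(). Before cross joining, however, you need to put each variable into its own DataFrame in long (unpivoted) format; your table is in wide (pivoted) format.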
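As an illustration of the wide-to-long step followed by a single cross join, here is a minimal sketch using DataFrame.melt on the sample df from the question. The names long_df, x, y, and xy are made up for this example, and only two of the three variables are joined to keep it short.

```python
import pandas as pd

df = pd.DataFrame({
    'Variable_Name': ['X', 'Y', 'Z'],
    'Variable1': ['6.0%', '3.0%', '1.0%'],
    'Variable2': ['8.0%', '4.0%', '3.0%'],
    'Variable3': ['10.0%', '5.0%', '5.0%'],
})

# Wide -> long: one row per (variable, option) pair
long_df = df.melt(id_vars='Variable_Name', value_name='value')

# One single-column frame per variable, named after that variable
x = long_df.loc[long_df['Variable_Name'] == 'X', ['value']].rename(columns={'value': 'X'})
y = long_df.loc[long_df['Variable_Name'] == 'Y', ['value']].rename(columns={'value': 'Y'})

# Cross join: every X option paired with every Y option
xy = pd.merge(x, y, how='cross')

print(xy.shape)
```

Cross joining the result with a third frame built the same way for Z would give the full set of 27 rows.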

If you need the code in a loop, it would look something like this:

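# Assumes `import pandas as pd` and the df from the question
variables = df['Variable_Name'].unique()
columns_to_cross = ['Variable1', 'Variable2', 'Variable3']
# Transpose the first variable's row into a single column of its options
cross_join_df = df.loc[df['Variable_Name'] == variables[0], columns_to_cross].T
for var in variables[1:]:
    # Do the same for each remaining variable, then cross join it onto the running result
    to_join_df = df.loc[df['Variable_Name'] == var, columns_to_cross].T
    cross_join_df = pd.merge(cross_join_df, to_join_df, how='cross')
# Name the columns after the variables
cross_join_df.columns = variables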

@HenryEcker That was a mistake; it has been changed.