Python 3 pandas: how to create all combinations of 4 columns and write them as rows in a CSV?


I have 4 CSV files, each with a single column. Each column represents one part of a name (4 parts):

CSV 1:

first_name
michael
madonna
steve
albert

CSV 2:

second_name
luke
han
kurt

CSV 3:

first_last_name
jackson
jobs
skywalker

CSV 4:

second_last_name
solo
cobain
einstein

The final result I want is every possible combination across the 4 columns (the 4 CSVs):


Using pandas, I converted each CSV into a DataFrame, but I don't know how to combine the four. How can I achieve this?

Use itertools.product to do the heavy lifting:

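import pandas as pd
from itertools import product

# Read the single column of data1.csv ... data4.csv into four plain lists
lists = [pd.read_csv('data{}.csv'.format(i), header=0).iloc[:, 0].tolist() for i in range(1, 5)]
# Cartesian product of the four lists; each combination is joined into one string
combined = [','.join(items) for items in product(*lists)]
# Write the raw tuples, one combination per row. Writing the joined strings
# instead would quote each line as a single field because it contains commas.
pd.DataFrame(list(product(*lists))).to_csv('combined.csv', index=False, header=False)
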
If you only need the list form, use combined. It looks like:

['michael,luke,jackson,solo',
 'michael,luke,jackson,cobain',
 'michael,luke,jackson,einstein',
 'michael,luke,jobs,solo',
 'michael,luke,jobs,cobain',
 'michael,luke,jobs,einstein',
 'michael,luke,skywalker,solo',
 'michael,luke,skywalker,cobain',
 'michael,luke,skywalker,einstein',
 ...

Otherwise, the last line writes the combined values to a CSV file.
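
To make the approach above easy to try without any files on disk, here is a minimal sketch that uses only the standard library. The four lists copy the sample values from the question; the helper name all_combinations and the in-memory buffer are assumptions made for illustration, not part of the original answer.

```python
import csv
import io
from itertools import product

# Sample values copied from the question; in practice these would be
# read from the four CSV files.
first_names = ["michael", "madonna", "steve", "albert"]
second_names = ["luke", "han", "kurt"]
first_last_names = ["jackson", "jobs", "skywalker"]
second_last_names = ["solo", "cobain", "einstein"]


def all_combinations(*columns):
    """Return every tuple that takes one value from each column, in order."""
    return list(product(*columns))


combos = all_combinations(
    first_names, second_names, first_last_names, second_last_names
)

# Write one combination per row. A StringIO buffer stands in for a real
# file here; open("combined.csv", "w", newline="") would work the same way.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(combos)
csv_text = buffer.getvalue()
```

The number of rows is the product of the column lengths (4 x 3 x 3 x 3 = 108 for the sample data), so the output grows quickly as the input columns get longer.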

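Alternatively, build the Cartesian product of the DataFrames themselves:

import pandas as pd
import itertools
import functools

def cartesian(df1, df2):
    # Pair every row of df1 with every row of df2
    rows = itertools.product(df1.iterrows(), df2.iterrows())
    # Series.append was removed in pandas 2.0, so join each pair with pd.concat
    df = pd.DataFrame([pd.concat([left, right]) for (_, left), (_, right) in rows])
    return df.reset_index(drop=True)

df1 = pd.read_csv('first_name.csv')
df2 = pd.read_csv('second_name.csv')
df3 = pd.read_csv('first_last_name.csv')
df4 = pd.read_csv('second_last_name.csv')

# Fold the four frames together pairwise, left to right
combined = functools.reduce(cartesian, [df1, df2, df3, df4])
# index=False leaves the row numbers out of the output file
combined.to_csv('combined.csv', index=False)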
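
As a possible alternative to iterating over rows, pandas 1.2 and later can compute the same Cartesian product with a cross merge. This is a different technique from the one in the answer above, shown only as a sketch: the in-memory DataFrames stand in for the four CSV files, and the helper name cross_join is hypothetical.

```python
from functools import reduce

import pandas as pd

# Hypothetical in-memory stand-ins for the four single-column CSV files.
df1 = pd.DataFrame({"first_name": ["michael", "madonna", "steve", "albert"]})
df2 = pd.DataFrame({"second_name": ["luke", "han", "kurt"]})
df3 = pd.DataFrame({"first_last_name": ["jackson", "jobs", "skywalker"]})
df4 = pd.DataFrame({"second_last_name": ["solo", "cobain", "einstein"]})


def cross_join(left, right):
    """Pair every row of `left` with every row of `right` (pandas >= 1.2)."""
    return left.merge(right, how="cross")


# Fold the four frames together pairwise, left to right.
combined = reduce(cross_join, [df1, df2, df3, df4])

# combined.to_csv("combined.csv", index=False) would then write one
# combination per row, with the original column names as the header.
```

A cross merge keeps the work inside pandas instead of building one Series per row, which tends to matter once the input columns get long.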