How to form a DataFrame from columns of different lengths and conditions in Python?

python pandas dataframe

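I have two columns from different DataFrames with different lengths (60 and 14). I want to compare each item of the 60-row column with every item of the 14-row column, then put the results in another column next to the columns that were compared. The comparison results are stored in a list of lists:

[outer list for col_60: [inner list for result(1,1)], [inner list for result(1,2)], ..., [inner list for result(60,14)]]

My question is how to form a DataFrame in the following format (col_60 = 60 rows, col_14 = 14*60 rows, col_result = 14*60 rows). Note: the items in the columns are themselves lists.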

col_60     col_14        col_result
              1          result_of(1,1)
              2          result_of(1,2)
              3             ..
  1           4
              ..
              ..            ..
              ..            ..
              13        result_of(1,13)
              14        result_of(1,14)
____________________________________________
             1          result_of(2,1)
             2          result_of(2,2)
             3             ..
  2          4
             ..
             ..            ..
             ..            ..
             13        result_of(2,13)
             14        result_of(2,14)
____________________________________________
            1          result_of(3,1)
            2          result_of(3,2)
            3             ..
  3         4
            ..
            ..            ..
            ..            ..
            13        result_of(3,13)
            14        result_of(3,14)
____________________________________________
              ..
              ..
              ..

I tried the accepted answer from another question, but it stacks the result column without the related columns, and the result is NaN.

You can use a hierarchical index to solve this. Here is an example showing how it works for the first two groups of 14 combinations:

import pandas as pd

# Put all the results in one flat list instead of a list of lists.
# Placeholder strings stand in for the real comparison results.
results = [f"result({i},{j})" for i in (1, 2) for j in range(1, 15)]

data = pd.Series(
    results,
    index=[
        ["1"] * 14 + ["2"] * 14,  # outer level: 14 ones, then 14 twos
        list(range(1, 15)) * 2,   # inner level: 1..14, twice
    ],
)

print(data)



1  1      result(1,1)
   2      result(1,2)
   3      result(1,3)
   4      result(1,4)
   5      result(1,5)
   6      result(1,6)
   7      result(1,7)
   8      result(1,8)
   9      result(1,9)
   10    result(1,10)
   11    result(1,11)
   12    result(1,12)
   13    result(1,13)
   14    result(1,14)
2  1      result(2,1)
   2      result(2,2)
   3      result(2,3)
   4      result(2,4)
   5      result(2,5)
   6      result(2,6)
   7      result(2,7)
   8      result(2,8)
   9      result(2,9)
   10    result(2,10)
   11    result(2,11)
   12    result(2,12)
   13    result(2,13)
   14    result(2,14)
dtype: object

If you put all 60*14 results in a single flat list, here is code that prepares the two index levels:

# Outer level: each label "1".."60" repeated 14 times.
# range(1, 61) is needed so that 60 is included.
first_index_raw = [[str(i)] * 14 for i in range(1, 61)]
first_index_final = [label for group in first_index_raw for label in group]
# A long flat list: ['1', '1', ..., '1', '2', '2', ..., '60', '60', ..., '60']

# Inner level: 1..14 repeated 60 times.
second_index = list(range(1, 15)) * 60
# [1, 2, ..., 14, 1, 2, ..., 14, ...]

# results must be the flat list of all 60*14 = 840 results
data = pd.Series(results, index=[first_index_final, second_index])
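The same two-level index can probably be built more compactly with `pd.MultiIndex.from_product`, which generates every (outer, inner) pair in order. This is only a sketch: the placeholder strings in `results` and the level names `col_60` and `col_14` are assumptions made for illustration, and integer labels are used for the outer level instead of strings.

```python
import pandas as pd

n_outer, n_inner = 60, 14

# Placeholder results (an assumption); replace with the real flat list
# of 60*14 comparison results, ordered row by row.
results = [
    f"result({i},{j})"
    for i in range(1, n_outer + 1)
    for j in range(1, n_inner + 1)
]

# Cartesian product of the two label ranges: (1,1), (1,2), ..., (60,14)
index = pd.MultiIndex.from_product(
    [range(1, n_outer + 1), range(1, n_inner + 1)],
    names=["col_60", "col_14"],
)

data = pd.Series(results, index=index, name="col_result")
print(data.loc[(60, 14)])
```

Because the product is generated in the same order as the flat `results` list, each result lines up with its pair of labels.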

Admittedly, what you get is a Series rather than a DataFrame, but I hope this helps!
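If a DataFrame is needed, one option is to turn the index levels of such a Series into ordinary columns with `reset_index`. The sketch below rebuilds the small two-group example with placeholder results; the column names are taken from the layout in the question.

```python
import pandas as pd

# Small two-group example with placeholder results (an assumption)
results = [f"result({i},{j})" for i in (1, 2) for j in range(1, 15)]
index = [["1"] * 14 + ["2"] * 14, list(range(1, 15)) * 2]
data = pd.Series(results, index=index)

# Name the two index levels, then move them into regular columns
df = data.rename_axis(["col_60", "col_14"]).reset_index(name="col_result")
print(df.head())
```

Each row of `df` then holds one (col_60, col_14, col_result) combination, so the columns can be processed like any other DataFrame columns.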

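Thanks for the reply, but I need to form the df and get the values without spelling them out directly, as in the question I mentioned earlier. I am in fact working with huge datasets, and I will have to process the df columns later.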
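One way to avoid writing any labels by hand might be to build the rows directly from the two source columns and the nested result list. This is a rough sketch, not a tested solution for the real data: `col_60`, `col_14`, and `compare` below are hypothetical stand-ins, and `nested_results` is assumed to have the shape described in the question (60 outer lists, each with 14 inner lists).

```python
import pandas as pd

# Hypothetical stand-ins for the two source columns; every item is a list
col_60 = [[i, i + 1] for i in range(60)]
col_14 = [[j] for j in range(14)]

def compare(a, b):
    """Hypothetical comparison: the items of a that also appear in b."""
    return [x for x in a if x in b]

# List of lists: nested_results[i][j] is the result for (col_60[i], col_14[j])
nested_results = [[compare(a, b) for b in col_14] for a in col_60]

# One tuple per (i, j) pair; each col_60 item is repeated 14 times
rows = [
    (a, b, nested_results[i][j])
    for i, a in enumerate(col_60)
    for j, b in enumerate(col_14)
]

df = pd.DataFrame(rows, columns=["col_60", "col_14", "col_result"])
print(df.shape)
```

Here the col_60 value is repeated on every row of its group rather than shown once, which keeps all three columns the same length (840 rows) and avoids NaN gaps.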