Python: Concat two DataFrames and reorder columns


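I have two DataFrames (df1 and df2, shown below) whose columns differ in both order and count. I need to append both DataFrames to an Excel file in which the column order must follow Col_list below.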

df1 is:

 durable_medical_equipment    pcp  specialist  diagnostic  imaging  generic  formulary_brand  non_preferred_generic  emergency_room  inpatient_facility  medical_deductible_single  medical_deductible_family  maximum_out_of_pocket_limit_single  maximum_out_of_pocket_limit_family plan_name      pdf_name
0                      False  False       False       False    False    False            False                  False           False               False                      False                      False                               False                               False   ABCBCBC  adjnajdn.pdf
... and df2 is:

   pcp  specialist  generic  formulary_brand  emergency_room  urgent_care  inpatient_facility  durable_medical_equipment  medical_deductible_single  medical_deductible_family  maximum_out_of_pocket_limit_single  maximum_out_of_pocket_limit_family plan_name      pdf_name
0  True        True    False            False            True         True                True                       True                       True                       True                                True                                True   ABCBCBC  adjnajdn.pdf
I am creating a column list that has the same column order as the Excel file:

Col_list = ['durable_medical_equipment', 'pcp', 'specialist', 'diagnostic',
            'imaging', 'generic', 'formulary_brand', 'non_preferred_generic',
            'emergency_room', 'inpatient_facility', 'medical_deductible_single',
            'medical_deductible_family', 'maximum_out_of_pocket_limit_single', 'maximum_out_of_pocket_limit_family',
            'urgent_care', 'plan_name', 'pdf_name']
I am trying to use concat() to reorder the DataFrame according to the column list. For columns that do not exist in a DataFrame, the value can be NaN.

result = pd.concat([df, pd.DataFrame(columns=list(Col_list))])
This does not work as expected. How can I achieve this reordering?

I tried the following:

result = pd.concat([df_repo, pd.DataFrame(columns=list(Col_list))], sort=False, ignore_index=True)
print(result.to_string())
The output I get is:

 durable_medical_equipment    pcp specialist diagnostic imaging generic formulary_brand non_preferred_generic emergency_room inpatient_facility medical_deductible_single medical_deductible_family maximum_out_of_pocket_limit_single maximum_out_of_pocket_limit_family plan_name      pdf_name urgent_care
0                     False  False      False      False   False   False           False                 False          False              False                     False                     False                              False                              False   ABCBCBC  adjnajdn.pdf         NaN
    pcp specialist generic formulary_brand emergency_room urgent_care inpatient_facility durable_medical_equipment medical_deductible_single medical_deductible_family maximum_out_of_pocket_limit_single maximum_out_of_pocket_limit_family plan_name      pdf_name diagnostic imaging non_preferred_generic
0  True       True   False           False           True        True               True                      True                      True                      True                               True                               True   ABCBCBC  adjnajdn.pdf        NaN     NaN                   NaN

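If necessary, use reindex with the values from the list to change the column order, then pass the reindexed DataFrames to concat: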

df = pd.concat([df1.reindex(Col_list, axis=1), 
                df2.reindex(Col_list, axis=1)], sort=False, ignore_index=True)
print(df)
   durable_medical_equipment    pcp  specialist  diagnostic  imaging  generic  \
0                      False  False       False         0.0      0.0    False   
1                       True   True        True         NaN      NaN    False   

   formulary_brand  non_preferred_generic  emergency_room  inpatient_facility  \
0            False                    0.0           False               False   
1            False                    NaN            True                True   

   medical_deductible_single  medical_deductible_family  \
0                      False                      False   
1                       True                       True   

   maximum_out_of_pocket_limit_single  maximum_out_of_pocket_limit_family  \
0                               False                               False   
1                                True                                True   

   urgent_care plan_name      pdf_name  
0          NaN   ABCBCBC  adjnajdn.pdf  
1          1.0   ABCBCBC  adjnajdn.pdf  
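The reindex-then-concat idea above can be tried on a small scale. The following is a minimal sketch, assuming shortened, hypothetical stand-ins for Col_list, df1 and df2 (the names col_list, aligned and combined are not from the question):

```python
import pandas as pd

# Hypothetical, shortened stand-ins for Col_list, df1 and df2.
col_list = ["pcp", "specialist", "imaging", "urgent_care", "plan_name"]

df1 = pd.DataFrame({"specialist": [False], "pcp": [False],
                    "imaging": [False], "plan_name": ["ABCBCBC"]})
df2 = pd.DataFrame({"plan_name": ["ABCBCBC"], "urgent_care": [True],
                    "pcp": [True], "specialist": [True]})

# reindex(..., axis=1) puts the columns in col_list order and adds
# any column that is missing from a frame, filled with NaN.
aligned = [frame.reindex(col_list, axis=1) for frame in (df1, df2)]

# Every frame now has identical columns, so concat only stacks rows.
combined = pd.concat(aligned, ignore_index=True)

print(list(combined.columns))
```

Because each frame is reindexed before concatenation, the column order of the result should not depend on which frame comes first. Depending on the pandas version, concatenating frames that contain all-NaN columns may emit a FutureWarning.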

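I tried result = pd.concat([df_repo, pd.DataFrame(columns=list(Col_list))], sort=False, ignore_index=True). It did not give me the correct output. I have updated my question with the output I get; the order is still not the same. In fact, I need to change the order of the DataFrame according to the list I defined, because after this I will do it in a loop.
@user1896796 - This is my second solution; the first one has now been deleted.
Yes, I was able to get this done using reindex. I did something like the following: result = pd.concat([df_repo, pd.DataFrame(columns=list(Col_list))], sort=False, ignore_index=True) followed by result = result.reindex(Col_list, axis=1)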
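The concat-then-reindex variant from the last comment can also be wrapped in a small helper for use in a loop. This is only a sketch under assumptions: combine_in_order, the shortened col_list and the sample frames are hypothetical and not part of the original thread.

```python
import pandas as pd

# Hypothetical, shortened stand-in for Col_list.
col_list = ["pcp", "specialist", "urgent_care", "plan_name"]


def combine_in_order(frames, columns):
    """Stack frames row-wise, then force the column order once at the end."""
    stacked = pd.concat(frames, sort=False, ignore_index=True)
    # Columns missing from every frame are added as NaN;
    # columns that are not listed in `columns` are dropped.
    return stacked.reindex(columns, axis=1)


frames = []
for flag in (False, True):  # stands in for the loop mentioned in the comment
    frames.append(pd.DataFrame({"plan_name": ["ABCBCBC"], "pcp": [flag]}))

result = combine_in_order(frames, col_list)
print(result.to_string())
```

Collecting the frames in a list and calling concat once is usually cheaper than concatenating inside the loop, and a single reindex at the end is enough to fix the order.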
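Using concat rather than merge for joining seems to be a mistake, since your DataFrames share many common columns (pcp, specialist, generic). Do you really want those columns to show up twice in the output?
With concat, I do not get duplicates.
If you wish to combine more than 2 DataFrames with shared columns, use merge rather than concat.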
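To make the difference discussed above concrete, here is a small sketch comparing the two calls on hypothetical frames (left, right, stacked and joined are illustrative names, and joining on plan_name is an assumption): row-wise concat aligns shared columns by name, while merge on a single key keeps the other shared columns from both sides with suffixes.

```python
import pandas as pd

# Hypothetical frames that share the columns plan_name and pcp.
left = pd.DataFrame({"plan_name": ["ABCBCBC"], "pcp": [False], "imaging": [False]})
right = pd.DataFrame({"plan_name": ["ABCBCBC"], "pcp": [True], "urgent_care": [True]})

# Row-wise concat: shared columns are aligned by name, so each appears once
# and the result has one row per input row.
stacked = pd.concat([left, right], sort=False, ignore_index=True)

# Merge on plan_name: the other shared column (pcp) is kept from both sides
# and disambiguated with the default suffixes _x and _y.
joined = left.merge(right, on="plan_name")

print(list(stacked.columns))
print(list(joined.columns))
```

For appending rows to a sheet with a fixed column order, as in the question, the stacked form is the one that matches the desired output.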