Concatenating a dictionary of DataFrames in Python

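I looped through a folder of Excel files, converted each one to a DataFrame, and put those DataFrames into a dictionary whose keys are the file names. What I want is for the file names not to matter in the combined DataFrame, because the column names of the data I need are already unique. I want to merge the "Genes" columns, since they repeat, fill the NaN scores with zeros, and then drop the "Ratio" columns.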

import numpy as np
import pandas as pd
import math
import os

folder = r'C:\Users\camer\Desktop\Stack Overflow' # Folder path
files = os.listdir(folder) 

dict1 = {}
for file in files:
    if file.endswith('.xlsx'):
        df1 = pd.read_excel(os.path.join(folder,file))
        dict1[file] = df1

# The loop above reads every Excel file in the folder into a DataFrame and stores
# it in dict1, keyed by file name

df1 = pd.concat(dict1, axis=1)
df1

If I try to group on the Genes column while the DataFrame is still separated by file name, I get the following:

df1 = pd.concat(dict1, axis=1)
df1 = df1.groupby(df1.columns, axis=1).sum()
df1
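
A note on why the grouping above does not collapse the repeated columns: passing a dict to `pd.concat` with `axis=1` uses the dict keys as an outer column level, so each file gets its own `("file name", "Genes")` label and grouping on `df1.columns` treats them as distinct. The sketch below shows this with two tiny in-memory frames; the file names and the `score_a` / `score_b` column names are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical stand-ins for two of the Excel files (names are assumptions).
frames = {
    "file_a.xlsx": pd.DataFrame({"Genes": ["g1"], "score_a": [1.0]}),
    "file_b.xlsx": pd.DataFrame({"Genes": ["g1"], "score_b": [2.0]}),
}

# Concatenating a dict side by side puts the keys on an outer column level.
wide = pd.concat(frames, axis=1)

# Two column levels: (file name, original column name).
print(wide.columns.nlevels)
print(wide.columns.tolist())
```

Because every label is a different tuple, the two "Genes" columns never end up in the same group.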

I think this should work for you:

pd.concat(dict1.values())
pd.concat(dict1.values(),sort=False).groupby('Genes').sum()
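
For completeness, here is a minimal end-to-end sketch of the same idea that also handles the other two requests in the question (zeros instead of NaN, and no "Ratio" columns). It uses small in-memory frames instead of Excel files; the file names, the `score_a` / `score_b` columns, and the assumption that the unwanted columns are literally named `Ratio` are all illustrative guesses, so adjust them to the real data.

```python
import pandas as pd

# Hypothetical stand-ins for the Excel files; all column names are assumptions.
dict1 = {
    "file_a.xlsx": pd.DataFrame(
        {"Genes": ["g1", "g2"], "score_a": [1.0, 2.0], "Ratio": [0.5, 0.7]}
    ),
    "file_b.xlsx": pd.DataFrame(
        {"Genes": ["g2", "g3"], "score_b": [3.0, 4.0], "Ratio": [0.1, 0.9]}
    ),
}

# Stack the frames row-wise; the dictionary keys (file names) are not used.
stacked = pd.concat(list(dict1.values()), sort=False)

# Drop every column whose name starts with "Ratio" before aggregating.
ratio_cols = [c for c in stacked.columns if str(c).startswith("Ratio")]
stacked = stacked.drop(columns=ratio_cols)

# One row per gene. sum() skips NaN, and fillna(0) covers any that remain.
combined = stacked.groupby("Genes").sum().fillna(0)
print(combined)
```

Stacking row-wise and then grouping keeps a single `Genes` key per row, which avoids the per-file column level that `axis=1` introduces.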