Pandas: how to merge multiple CSV files by joining on a column inside a loop


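Here is my problem. I have 100 files, each with two columns: "time_slopes" and "slope". I want to create a single file that contains all of them. Here is an example: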

-----file 1----
2001.1     10
2001.2     20
2001.3     12
2001.4      4
2001.5      1
2001.6     13

-----file 2----
2001.3     20
2001.4     15
2001.5     6

-----file 3----
2001.6     15
2001.7     15
2001.8     15
2001.9     20
2002.0     23

**The expected result is:**
------- output file ---------
date    file1 file2 file3
2001.1    10   NAN  NAN
2001.2    20   NAN  NAN
2001.3    12    20  NAN
2001.4     4    15  NAN
2001.5     1     6  NAN
2001.6    13   NAN   15
2001.7   NAN   NAN   15
2001.8   NAN   NAN   15
2001.9   NAN   NAN   20
2002.0   NAN   NAN   23
Here is what I tried:

import pandas as pd
import os, glob
import numpy as np

filename_list = []

file_path = r"C:\Users\Path"
for file in glob.glob(file_path + "/*.csv"):
    filename_list.append(file)

from numpy import genfromtxt
df_ini = pd.read_csv('output.csv')         #IN FILE OUTPUT THERE ARE ALREADY TWO COLUMNS WITH VALUES
df_ini.columns=['time_slopes','slope']      
for filename in filename_list:
    with open(filename, 'r') as f:
        # convert numpy array into DataFrame
        numpyarray = genfromtxt(f, delimiter=',')
    df = pd.DataFrame({'time_slopes':numpyarray[:, 0],'slope':numpyarray[:, 1]})
    # remove NaN values:
    df = df.dropna(how='all')
    # re-index file:
    df.reset_index(drop=True, inplace=True)
    # merge file:
    dfmerge = df_ini.merge(df,on='time_slopes',how='left')
    dfmerge.to_csv("output.csv", sep=',', index=False)
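This code returns only two columns: the first one (from df_ini) and the last one (from file number 100). On each iteration the last column is overwritten instead of being added.

date    file1  file100
2001.1     10      NaN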

Does anyone know how to solve this? Thanks.

This might help you:

import pandas as pd

file_1 = pd.DataFrame({'date': [2001.1, 2001.2, 2001.3], 'slope': [10, 20, 12]})
file_2 = pd.DataFrame({'date': [2001.4, 2001.5, 2001.6], 'slope': [20, 15, 6]})
file_3 = pd.DataFrame({'date': [2001.6, 2001.7, 2001.8], 'slope': [30, 40, 90]})

df_list = [file_1, file_2, file_3]
for df in df_list:
    df.index = df['date']
    df.drop(['date'], axis=1, inplace=True)

final_df = pd.concat(df_list, axis=1, ignore_index=True)
final_df = final_df.reset_index()

print(final_df)
Output:

     date     0     1     2
0  2001.1  10.0   NaN   NaN
1  2001.2  20.0   NaN   NaN
2  2001.3  12.0   NaN   NaN
3  2001.4   NaN  20.0   NaN
4  2001.5   NaN  15.0   NaN
5  2001.6   NaN   6.0  30.0
6  2001.7   NaN   NaN  40.0
7  2001.8   NaN   NaN  90.0
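
The same idea can be applied to files on disk. The following is only a sketch under some assumptions: each CSV is comma-separated with no header row, the first column is the date and the second is the slope, and each date appears at most once per file. The helper name `combine_slope_files` and the temporary demo files are made up for illustration; the sample values are the ones from files 1 to 3 in the question.

```python
import tempfile
from pathlib import Path

import pandas as pd


def combine_slope_files(paths):
    """Outer-join two-column CSV files on their first column (hypothetical helper)."""
    columns = []
    for path in paths:
        # Assumption: no header row; columns are time_slopes, then slope.
        df = pd.read_csv(path, header=None, names=["time_slopes", "slope"])
        # Name each value column after its file so that nothing is overwritten.
        series = df.set_index("time_slopes")["slope"].rename(Path(path).stem)
        columns.append(series)
    # axis=1 aligns the Series on their index; dates missing from a file become NaN.
    combined = pd.concat(columns, axis=1).sort_index()
    combined.index.name = "date"
    return combined.reset_index()


# Demo data copied from files 1 to 3 in the question.
samples = {
    "file1": "2001.1,10\n2001.2,20\n2001.3,12\n2001.4,4\n2001.5,1\n2001.6,13\n",
    "file2": "2001.3,20\n2001.4,15\n2001.5,6\n",
    "file3": "2001.6,15\n2001.7,15\n2001.8,15\n2001.9,20\n2002.0,23\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for name, text in samples.items():
        Path(tmp, f"{name}.csv").write_text(text)
    result = combine_slope_files(sorted(Path(tmp).glob("*.csv")))

print(result)
```

Calling `result.to_csv("output.csv", index=False)` once, after all files have been combined, writes the final table. The loop in the question lost columns because every iteration merged the unchanged `df_ini` with a single file and then overwrote `output.csv`, so only the last file survived; collecting all columns first and concatenating them once avoids that. Note that joining on floating-point dates relies on identical text parsing to identical values in every file.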