How to combine CSV files with pandas (and add an identifying column)

python csv pandas

How can I append multiple CSV files together and add an extra column that indicates which file each row came from?

So far I have:

import os
import pandas as pd
import glob

os.chdir('C:\...')  # path to folder where all CSVs are stored
dfs = []
for f, i in zip(glob.glob('*.csv'), short_list):
    df = pd.read_csv(f, header=None)
    df.index = i * len(df)  # problem line: repeats the string, then raises the TypeError
    dfs.append(df)

all_data = pd.concat(dfs, ignore_index=True)

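It all works fine except for the identifying column.

short_list is a list of strings, which I would like to put into a column of all_data, with one string on every row that came from the corresponding file. Instead it returns lots of numbers and raises a TypeError: Index(...) must be called with a collection of some kind

Expected output: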
str1 file1entry1
str1 file1entry2
str1 file1entry3
str2 file2entry1
str2 file2entry2
str2 file2entry3

where short_list = ['str1', 'str2', 'str3']
and file1entry1, file2entry2, etc. come from the CSV files I already have.

Solution:
I couldn't get everything into one line as the suggested solution does, but it pointed me in the right direction:
dfs = []
for f in glob.glob('*.csv'):
    df = pd.read_csv(f, header=None)
    # Simpler than pulling from the list: adds the file name to every row.
    df = df.assign(id=os.path.basename(f))
    dfs.append(df)

all_data = pd.concat(dfs)

You can use the DataFrame.assign method, which adds an id column to each parsed CSV and fills it with the value of i:
df = pd.concat([pd.read_csv(f, header=None).assign(id=i)
                for f, i in zip(glob.glob('*.csv'), short_list)],
               ignore_index=True)
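
For anyone who wants to try the approach above without touching real data, here is a minimal, self-contained sketch. The function name combine_with_labels, the temporary folder, and the sample file contents are all made up for illustration. The call to sorted() is an addition that is not in the answer: glob makes no promise about file order, so sorting keeps the pairing between files and labels predictable.

```python
import glob
import os
import tempfile

import pandas as pd


def combine_with_labels(folder, labels):
    """Concatenate every CSV in `folder`, tagging each file's rows with one label."""
    # sorted() is an assumption added here so files pair with labels deterministically.
    files = sorted(glob.glob(os.path.join(folder, '*.csv')))
    frames = [pd.read_csv(f, header=None).assign(id=label)  # scalar fills every row
              for f, label in zip(files, labels)]
    return pd.concat(frames, ignore_index=True)


# Hypothetical demo data written to a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    samples = [('file1.csv', 'file1entry1\nfile1entry2\n'),
               ('file2.csv', 'file2entry1\nfile2entry2\n')]
    for name, text in samples:
        with open(os.path.join(folder, name), 'w') as fh:
            fh.write(text)
    all_data = combine_with_labels(folder, ['str1', 'str2'])

print(all_data)
```

Because zip stops at the shorter of its inputs, a label list that is shorter than the file list would silently drop files, so it may be worth checking the two lengths before combining.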

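In response to your comment: str1, str2, str3 are stored in short_list. That was a typo.

There is no need to use * len(df). When you assign a scalar to a new column, the value is applied to every row.

Note that you don't actually need pandas here. You could simply use the csv module.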
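
As a rough sketch of what the csv-module route mentioned in the last comment might look like (the commenter gave no code, so the function name combine_csvs, the output file name, and the sample data are all assumptions), each row can be copied to one output file with the source file name prepended:

```python
import csv
import glob
import os
import tempfile


def combine_csvs(pattern, out_path):
    """Copy rows from all files matching `pattern` into `out_path`,
    prefixing each row with the name of the file it came from."""
    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for path in sorted(glob.glob(pattern)):
            # Skip the output file itself in case it also matches the pattern.
            if os.path.abspath(path) == os.path.abspath(out_path):
                continue
            with open(path, newline='') as src:
                for row in csv.reader(src):
                    writer.writerow([os.path.basename(path)] + row)


# Hypothetical demo data written to a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    for name, text in [('file1.csv', 'a,1\nb,2\n'), ('file2.csv', 'c,3\n')]:
        with open(os.path.join(folder, name), 'w', newline='') as fh:
            fh.write(text)
    out_path = os.path.join(folder, 'combined.csv')
    combine_csvs(os.path.join(folder, '*.csv'), out_path)
    with open(out_path, newline='') as fh:
        combined_rows = list(csv.reader(fh))

print(combined_rows)
```

This streams one row at a time, so it never holds a whole file in memory, but it gives up the type inference and later analysis conveniences that pandas provides.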