How to concatenate crosstabs when reading files in a for loop in pandas (Python 3.x)

python-3.x pandas

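I am using the Pandas module in Python 3.5 to read data files recursively from subdirectories and build a crosstab from each one. I want to combine the crosstabs inside the for loop, after each call to pd.crosstab(), and write the combined result to an Excel file once the loop finishes. I tried copying table1 into table3 after calling pd.crosstab() (see my code below), but if some values are missing from later data files, table3 shows NaN for those entries. I looked at pd.concat but could not find an example of how to use it in a for loop.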

The data files look like this (there are 100 files with many columns, but only the columns I am interested in are shown here):

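My Python program looks like this (imports removed from the top of the file; the listing appears further down, after the answer and comments):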

I tried to translate your code:

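import glob

import pandas as pd

fields = ['StudentID', 'Grade']
path = 'C:/script_testing/'

# Read only the columns of interest from a single file
parse = lambda f: pd.read_csv(f, usecols=fields)

# Stack every file into one frame, then build a single crosstab from it
table3 = pd.concat(
    [parse(f) for f in glob.glob(path + '**/*.txt', recursive=True)]
).pipe(lambda d: pd.crosstab(d.StudentID, d.Grade))

# writer.save() was removed in pandas 2.0; the context manager saves on exit
with pd.ExcelWriter('Report.xlsx', engine='xlsxwriter') as writer:
    table3.to_excel(writer, sheet_name='StudentID_vs_Grade')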

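Thanks! That worked. One follow-up question: is there a way to exclude certain files when reading from glob.glob()? For example, I want to read all files named Data*.txt but exclude those named Data*Old.txt.

You're welcome. That is a glob question. Alternatively, you could do something like [parse(f) for f in glob.glob('C:/script_testing/**/Data*.txt', recursive=True) if 'DataOld' not in f].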
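A substring test such as 'DataOld' only catches names that contain that exact text, so a file like Data1Old.txt would still be read. One possible alternative, sketched below, is to test each file name against an include pattern and an exclude pattern with the standard-library fnmatch module. The helper name keep_file and the sample paths are hypothetical and only serve the illustration.

```python
import fnmatch
import os


def keep_file(path, include='Data*.txt', exclude='Data*Old.txt'):
    """Return True if the file name matches `include` but not `exclude`."""
    name = os.path.basename(path)
    # fnmatchcase compares case-sensitively on every platform
    return (fnmatch.fnmatchcase(name, include)
            and not fnmatch.fnmatchcase(name, exclude))


# Hypothetical paths standing in for the result of glob.glob(...)
candidates = [
    'C:/script_testing/a/Data1.txt',
    'C:/script_testing/a/Data1Old.txt',
    'C:/script_testing/b/DataOld.txt',
    'C:/script_testing/b/Notes.txt',
]

selected = [p for p in candidates if keep_file(p)]
print(selected)
```

In the answer's list comprehension, the same check would replace the substring test: [parse(f) for f in glob.glob(...) if keep_file(f)].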

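Original loop from the question:

fields = ['StudentID', 'Grade']
path = 'C:/script_testing/'
i = 0

for filename in glob.glob('C:/script_testing/**/*.txt', recursive=True):
    temp = pd.read_csv(filename, sep=',', usecols=fields)
    table1 = pd.crosstab(temp.StudentID, temp.Grade)
    # The if branch runs only once, to initialize table3
    if i == 0:
        table3 = table1
        i = i + 1
    # '+' aligns on row and column labels, so any StudentID or Grade missing
    # from either table becomes NaN in the sum. On the first pass this line
    # also adds table1 to itself, which double-counts the first file.
    table3 = table3 + table1

# writer.save() was removed in pandas 2.0; the context manager saves on exit
with pd.ExcelWriter('Report.xlsx', engine='xlsxwriter') as writer:
    table3.to_excel(writer, sheet_name='StudentID_vs_Grade')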
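If the running total has to stay inside the loop, one option is DataFrame.add with fill_value=0, which treats a label that is missing from one of the two tables as a zero count. This is a minimal sketch and not the approach the answer takes. The two small frames are hypothetical stand-ins for data files, and the variable total plays the role of table3.

```python
import pandas as pd

# Hypothetical stand-ins for two data files with different grades present
file_a = pd.DataFrame({'StudentID': [1, 2], 'Grade': ['A', 'B']})
file_b = pd.DataFrame({'StudentID': [2, 3], 'Grade': ['B', 'C']})

total = None
for temp in (file_a, file_b):
    table1 = pd.crosstab(temp.StudentID, temp.Grade)
    if total is None:
        # First file: start the running total
        total = table1
    else:
        # A label missing from one table counts as 0 instead of NaN
        total = total.add(table1, fill_value=0)

# Cells missing from both operands of an add are still NaN;
# set them to 0 and restore integer counts
total = total.fillna(0).astype(int)
print(total)
```

The final fillna(0) is needed because fill_value only helps when one of the two operands has a value at that position.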

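Given several frames df1, df2, df3 with the same columns, concatenate them first and then build one crosstab:

pd.concat([df1, df2, df3]).pipe(lambda d: pd.crosstab(d.StudentID, d.Grade))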

Grade      A  B  C
StudentID         
1          1  2  0
2          1  1  1
3          3  0  0
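The frames df1, df2, and df3 are not shown in the answer. As a minimal sketch, the hypothetical frames below were chosen so that the one-liner reproduces the table above. The ignore_index=True argument is an addition that gives the combined frame a clean 0..n-1 index and does not change the counts.

```python
import pandas as pd

# Hypothetical per-file data: one grade per student in each file
df1 = pd.DataFrame({'StudentID': [1, 2, 3], 'Grade': ['A', 'B', 'A']})
df2 = pd.DataFrame({'StudentID': [1, 2, 3], 'Grade': ['B', 'A', 'A']})
df3 = pd.DataFrame({'StudentID': [1, 2, 3], 'Grade': ['B', 'C', 'A']})

# Stack the rows first, then count (StudentID, Grade) pairs once
table = (
    pd.concat([df1, df2, df3], ignore_index=True)
    .pipe(lambda d: pd.crosstab(d.StudentID, d.Grade))
)
print(table)
```

Because the crosstab is computed once over all rows, a grade that is absent from one file simply contributes nothing, and no NaN appears.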
