Python: iterating over worksheets and dropping rows

python excel pandas


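I need to read an Excel file and run some calculations on each sheet. Basically, a row should be deleted if its date column is not "today".

So far, I have the following code:

import datetime
import pandas as pd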

'''
Parsing main excel sheet to save transactions != today's date
'''

mainSource = pd.ExcelFile('path/to/file.xlsx')
dfs = {sheet_name: mainSource.parse(sheet_name)
        for sheet_name in mainSource.sheet_names }

for i in dfs:
    now = datetime.date.today();
    dfs = dfs.drop(dfs.columns[6].dt.year != now, axis = 1);    # It is the 6th column
    if datetime.time()<datetime.time(11,0,0,0):
        dfs.to_excel(r'path\to\outpt\test\'+str(i)+now+'H12.xlsx', index=False); #Save as sheetname+timestamp+textstring
    else:
        dfs.to_excel(r'path\to\output\'+str(i)+now+'H16.xlsx', index=False)

Any suggestions?

Thanks

I think you need to replace dfs with dfs[i], because dfs is a dict of DataFrames:

import datetime
import pandas as pd

df1 = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':['10-05-2011','10-05-2012','10-10-2016']})

df1.C = pd.to_datetime(df1.C)
print (df1)
   A  B          C
0  1  4 2011-10-05
1  2  5 2012-10-05
2  3  6 2016-10-10

df2 = pd.DataFrame({'A':[3,5,7],
                   'B':[9,3,4],
                   'C':['08-05-2013','08-05-2012','10-10-2016']})

df2.C = pd.to_datetime(df2.C)
print (df2)
   A  B          C
0  3  9 2013-08-05
1  5  3 2012-08-05
2  7  4 2016-10-10

names = ['a','b']

dfs = {names[i]:x for i, x in enumerate([df1,df2])}
print (dfs)
{'a':    A  B          C
0  1  4 2011-10-05
1  2  5 2012-10-05
2  3  6 2016-10-10, 'b':    A  B          C
0  3  9 2013-08-05
1  5  3 2012-08-05
2  7  4 2016-10-10}
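
The reason for indexing with dfs[i] is that looping over a dict yields its keys (here the sheet names, which are strings), not the DataFrames stored under them. A minimal sketch, using a small stand-in dict rather than the real workbook, shows the difference; iterating with .items() is an alternative that gives the name and the DataFrame at once:

```python
import pandas as pd

# Stand-in for the dict built from the workbook: sheet name -> DataFrame
dfs = {
    'a': pd.DataFrame({'A': [1, 2, 3]}),
    'b': pd.DataFrame({'A': [3, 5, 7]}),
}

# Iterating over a dict yields its keys, so i is a str here
key_types = [type(i).__name__ for i in dfs]
print(key_types)

# dfs[i] looks up the DataFrame stored under that key
frame_types = [type(dfs[i]).__name__ for i in dfs]
print(frame_types)

# .items() yields (key, value) pairs, avoiding the extra lookup
row_counts = {name: len(df) for name, df in dfs.items()}
print(row_counts)
```

This is also why calling .drop on the loop variable itself raises an AttributeError: the loop variable is a string, and str has no drop method.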

Then remove all rows whose date is not today with a boolean mask:

for i in dfs:
    now = datetime.date.today()  # 2016-10-10 when this answer was run
    print (now)
    # select the 3rd column (position 2); in the real data replace 2 with 5
    mask = dfs[i].iloc[:,2].dt.date == now
    print (mask)
    df = dfs[i][mask]
    print (df)

2016-10-10
0    False
1    False
2     True
Name: C, dtype: bool
   A  B          C
2  3  6 2016-10-10
2016-10-10
0    False
1    False
2     True
Name: C, dtype: bool
   A  B          C
2  7  4 2016-10-10

    # still inside the loop: choose the suffix from the current time of day
    # (forward slashes, because a raw string cannot end with a backslash)
    stamp = str(i) + str(now)
    if datetime.datetime.now().time() < datetime.time(11,0,0,0):
        df.to_excel('path/to/outpt/test/' + stamp + 'H12.xlsx', index=False)
    else:
        df.to_excel('path/to/output/' + stamp + 'H16.xlsx', index=False)

I think you need to replace dfs with i: i = i.drop(i.columns[6].dt.year != now, axis=1) and then i.to_excel(...)

OK, dfs is a dict of DataFrames, so you cannot use DataFrame operations such as .drop on it.

Thanks for the reply. However, after trying that I got an error: AttributeError: 'str' object has no attribute 'drop'

This solved the iteration problem, so the question is answered. Thank you very much. Now I have to deal with the date format, because it messes up the rest of the script.
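
Pulling the thread together, the filter-and-name step could be sketched roughly as below. The helper names keep_today and output_name, the date_col argument, and the injectable today/now parameters are illustrative assumptions and not part of the original post; only the 11:00 cutoff and the H12/H16 suffixes come from the question. The sample data is a small stand-in for the real workbook.

```python
import datetime
import pandas as pd


def keep_today(dfs, date_col, today=None):
    """Return a new dict keeping only rows whose date column equals today.

    date_col is a 0-based column position (hypothetical parameter).
    """
    if today is None:
        today = datetime.date.today()
    result = {}
    for name, df in dfs.items():
        # Convert to datetime in case the column was read as text
        dates = pd.to_datetime(df.iloc[:, date_col]).dt.date
        result[name] = df[dates == today]
    return result


def output_name(sheet, now=None):
    """Build '<sheet><date>H12.xlsx' before 11:00, otherwise '...H16.xlsx'."""
    if now is None:
        now = datetime.datetime.now()
    suffix = 'H12' if now.time() < datetime.time(11, 0) else 'H16'
    return f'{sheet}{now.date()}{suffix}.xlsx'


# Small stand-in data; a real run would build dfs from pd.ExcelFile
dfs = {
    'a': pd.DataFrame({'A': [1, 2], 'C': ['2011-10-05', '2016-10-10']}),
    'b': pd.DataFrame({'A': [3, 5], 'C': ['2016-10-10', '2012-08-05']}),
}
filtered = keep_today(dfs, date_col=1, today=datetime.date(2016, 10, 10))
for name, df in filtered.items():
    print(name, df['A'].tolist())

print(output_name('a', now=datetime.datetime(2016, 10, 10, 9, 30)))
```

Passing today and now explicitly keeps the logic testable on any day; in a real run both can be left out, and each filtered DataFrame can then be written with df.to_excel(..., index=False) to whatever output folder applies.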