Python 使用日期时间索引进行过滤
通过数据帧的日期时间索引,可以轻松获取日期范围:Python 使用日期时间索引进行过滤,python,pandas,Python,Pandas,通过数据帧的日期时间索引,可以轻松获取日期范围: df[datetime(2018,1,1):datetime(2018,1,10)] 过滤也很简单: df[ (df['column A'] = 'Done') & (df['column B'] < 3.14 )] df[(df['column A']='Done')&(df['column B']“2018-01-01” m2=df.index
df[datetime(2018,1,1):datetime(2018,1,10)]
过滤也很简单:
df[ (df['column A'] = 'Done') & (df['column B'] < 3.14 )]
df[(df['column A']='Done')&(df['column B']<3.14)]
但是,同时按日期范围和任何其他非日期标准过滤的最佳方法是什么
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((200,3)))
df['date'] = pd.date_range('2018-1-1', periods=200, freq='D')
df = df.set_index(['date'])
print(df.loc['2018-2-1':'2018-2-10'])
希望!这将有助于3个布尔条件
如果存在多个布尔掩码,则可以使用:
m1=df.index>“2018-01-01”
m2=df.index<'2018-01-10'
m3=df[“A列”]=“完成”
m4=df[‘B’列]<3.14
#皮尔斯菲尔德数据样本
df=df[np.logical_和.reduce([m1,m2,m3,m4])]
打印(df)
A列B列
2018-01-04完成2.533385
2018-01-06完成2.789072
2018-01-08完成2.230017
我在下面这样做是为了筛选两个数据帧是否具有相同的日期
corn_url = 'https://fred.stlouisfed.org/graph/fredgraph.csv?bgcolor=%23e1e9f0&chart_type=line&drp=0&fo=open%20sans&graph_bgcolor=%23ffffff&height=450&mode=fred&recession_bars=on&txtcolor=%23444444&ts=12&tts=12&width=1168&nt=0&thu=0&trc=0&show_legend=yes&show_axis_titles=yes&show_tooltip=yes&id=WPU012202&scale=left&cosd=1971-01-01&coed=2020-04-01&line_color=%234572a7&link_values=false&line_style=solid&mark_type=none&mw=3&lw=2&ost=-99999&oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&fgst=lin&fgsnd=2009-06-01&line_index=1&transformation=lin&vintage_date=2020-06-09&revision_date=2020-06-09&nd=1971-01-01'
wheat_url ='https://fred.stlouisfed.org/graph/fredgraph.csv?bgcolor=%23e1e9f0&chart_type=line&drp=0&fo=open%20sans&graph_bgcolor=%23ffffff&height=450&mode=fred&recession_bars=on&txtcolor=%23444444&ts=12&tts=12&width=1168&nt=0&thu=0&trc=0&show_legend=yes&show_axis_titles=yes&show_tooltip=yes&id=WPU0121&scale=left&cosd=1947-01-01&coed=2020-04-01&line_color=%234572a7&link_values=false&line_style=solid&mark_type=none&mw=3&lw=2&ost=-99999&oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&fgst=lin&fgsnd=2009-06-01&line_index=1&transformation=lin&vintage_date=2020-06-09&revision_date=2020-06-09&nd=1947-01-01'
corn = pd.read_csv(corn_url,index_col=0,parse_dates=True)
wheat = pd.read_csv(wheat_url,index_col=0, parse_dates=True)
corn.head()
PP Index 1982
DATE
1971-01-01 63.4
1971-02-01 63.6
1971-03-01 62.0
1971-04-01 60.8
1971-05-01 60.2
wheat.head()
PP Index 1982
DATE
1947-01-01 53.1
1947-02-01 56.5
1947-03-01 68.0
1947-04-01 66.0
1947-05-01 66.7
wheat = wheat[wheat.index > '1970-12-31']
wheat.head()
PP Index 1982
DATE
1971-01-01 42.6
1971-02-01 42.6
1971-03-01 41.4
1971-04-01 41.7
1971-05-01 41.8
发布一个sampledf及其预期dft,它显示如何按日期过滤,但不显示如何同时按其他列过滤。如果我想按日期范围内的行进行筛选,而a列中的值小于3.14,该怎么办?我可以这样做:df[(df.index>datetime(2018,1,1))&(df.index
np.random.seed([3, 1415])
df = pd.DataFrame({
'column A': ['Done', 'Not Done'] * 10,
'column B': np.random.randn(20) + np.pi
}, pd.date_range('2017-12-25', periods=20))
df
column A column B
2017-12-25 Done 1.011868
2017-12-26 Not Done 1.873127
2017-12-27 Done 1.171093
2017-12-28 Not Done 0.882538
2017-12-29 Done 2.792306
2017-12-30 Not Done 3.114638
2017-12-31 Done 3.457829
2018-01-01 Not Done 3.490375
2018-01-02 Done 3.856957
2018-01-03 Not Done 3.912356
2018-01-04 Done 2.533385
2018-01-05 Not Done 3.493983
2018-01-06 Done 2.789072
2018-01-07 Not Done 2.725724
2018-01-08 Done 2.230017
2018-01-09 Not Done 2.999055
2018-01-10 Done 3.888432
2018-01-11 Not Done 1.637436
2018-01-12 Done 3.752955
2018-01-13 Not Done 3.541812
m1 = df.index > '2018-01-01'
m2 = df.index < '2018-01-10'
m3 = df['column A'] == 'Done'
m4 = df['column B'] < 3.14
#piRSquared's data sample
df = df[np.logical_and.reduce([m1, m2, m3, m4])]
print (df)
column A column B
2018-01-04 Done 2.533385
2018-01-06 Done 2.789072
2018-01-08 Done 2.230017
corn_url = 'https://fred.stlouisfed.org/graph/fredgraph.csv?bgcolor=%23e1e9f0&chart_type=line&drp=0&fo=open%20sans&graph_bgcolor=%23ffffff&height=450&mode=fred&recession_bars=on&txtcolor=%23444444&ts=12&tts=12&width=1168&nt=0&thu=0&trc=0&show_legend=yes&show_axis_titles=yes&show_tooltip=yes&id=WPU012202&scale=left&cosd=1971-01-01&coed=2020-04-01&line_color=%234572a7&link_values=false&line_style=solid&mark_type=none&mw=3&lw=2&ost=-99999&oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&fgst=lin&fgsnd=2009-06-01&line_index=1&transformation=lin&vintage_date=2020-06-09&revision_date=2020-06-09&nd=1971-01-01'
wheat_url ='https://fred.stlouisfed.org/graph/fredgraph.csv?bgcolor=%23e1e9f0&chart_type=line&drp=0&fo=open%20sans&graph_bgcolor=%23ffffff&height=450&mode=fred&recession_bars=on&txtcolor=%23444444&ts=12&tts=12&width=1168&nt=0&thu=0&trc=0&show_legend=yes&show_axis_titles=yes&show_tooltip=yes&id=WPU0121&scale=left&cosd=1947-01-01&coed=2020-04-01&line_color=%234572a7&link_values=false&line_style=solid&mark_type=none&mw=3&lw=2&ost=-99999&oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&fgst=lin&fgsnd=2009-06-01&line_index=1&transformation=lin&vintage_date=2020-06-09&revision_date=2020-06-09&nd=1947-01-01'
corn = pd.read_csv(corn_url,index_col=0,parse_dates=True)
wheat = pd.read_csv(wheat_url,index_col=0, parse_dates=True)
corn.head()
PP Index 1982
DATE
1971-01-01 63.4
1971-02-01 63.6
1971-03-01 62.0
1971-04-01 60.8
1971-05-01 60.2
wheat.head()
PP Index 1982
DATE
1947-01-01 53.1
1947-02-01 56.5
1947-03-01 68.0
1947-04-01 66.0
1947-05-01 66.7
wheat = wheat[wheat.index > '1970-12-31']
wheat.head()
PP Index 1982
DATE
1971-01-01 42.6
1971-02-01 42.6
1971-03-01 41.4
1971-04-01 41.7
1971-05-01 41.8