Filter all rows with NaT in a column of a pandas DataFrame

python pandas dataframe

I have a df like this:

    a b           c
    1 NaT         w
    2 2014-02-01  g
    3 NaT         x   

    df=df[df.b=='2014-02-01']

gives me

    a  b          c
    2 2014-02-01  g

I want a DataFrame containing all rows where column b is NaT:

    df=df[df.b==None] #Doesn't work

I want this:

    a b           c
    1 NaT         w
    3 NaT         x

`isnull` and `notnull` work with `NaT`, so you can handle them the same way you handle `NaN`s:

    >>> df
       a          b  c
    0  1        NaT  w
    1  2 2014-02-01  g
    2  3        NaT  x

    >>> df.dtypes
    a             int64
    b    datetime64[ns]
    c            object

Just use `isnull` to select:

    df[df.b.isnull()]

       a   b  c
    0  1 NaT  w
    2  3 NaT  x
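As a minimal, self-contained sketch of the selection above: the frame is rebuilt here from the question's data, and the names `nat_rows` and `dated_rows` are only illustrative. `isnull` and `notnull` produce complementary boolean masks, so one keeps the NaT rows and the other keeps the rows with a real date.

```python
import pandas as pd

# Rebuild the question's frame; column b becomes datetime64[ns] with two NaT values
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [pd.NaT, pd.to_datetime("2014-02-01"), pd.NaT],
    "c": ["w", "g", "x"],
})

# Mask is True where column b is NaT
nat_rows = df[df["b"].isnull()]

# Complementary mask: True where column b holds an actual timestamp
dated_rows = df[df["b"].notnull()]

print(nat_rows)
print(dated_rows)
```

Together the two selections partition the frame, which is a quick sanity check that no row was lost.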

Using the example DataFrame:

    import pandas as pd

    df = pd.DataFrame({"a":[1,2,3], 
                       "b":[pd.NaT, pd.to_datetime("2014-02-01"), pd.NaT], 
                       "c":["w", "g", "x"]})

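Before v0.17, this did not work:

    df.query('b != b')

You had to do this instead:

    df.query('b == "NaT"') # yes, surprisingly, this works!

Since v0.17, however, both approaches work, although I only recommend the first one (`b != b`).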
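A runnable sketch of the recommended form, assuming pandas 0.17 or later. The `engine="python"` argument is a choice made for this example so that it does not depend on numexpr being installed; it is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [pd.NaT, pd.to_datetime("2014-02-01"), pd.NaT],
    "c": ["w", "g", "x"],
})

# NaT never compares equal to itself, so "b != b" is True exactly on the NaT rows
nat_rows = df.query("b != b", engine="python")

print(nat_rows)
```

The expression relies only on the self-inequality of missing values, so the same query string also selects NaN rows in a float column.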

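For those interested: in my case I wanted to drop the NaT values contained in the DataFrame's DatetimeIndex. I could not use the `notnull` construct directly as Karl D suggested. I first had to create a temporary column from the index, then apply the mask, and finally drop the temporary column again.

    df["TMP"] = df.index.values                # index is a DatetimeIndex
    df = df[df.TMP.notnull()]                  # remove all NaT values
    df.drop(["TMP"], axis=1, inplace=True)     # delete TMP again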
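A possible shortcut, sketched under the assumption that your pandas version provides `Index.notna()` (older releases may not, which could be why the temporary column was needed): the index can be masked directly. The frame and the `value` column below are made up for illustration.

```python
import pandas as pd

# Hypothetical frame whose DatetimeIndex contains one NaT
idx = pd.to_datetime(["2014-02-01", None, "2014-02-03"])
df = pd.DataFrame({"value": [10, 20, 30]}, index=idx)

# Index.notna() returns a boolean array that can be used directly as a row mask
cleaned = df[df.index.notna()]

print(cleaned)
```

This avoids mutating the frame with a helper column and leaves the original `df` untouched.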

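I feel @DSM's comment deserves an answer of its own, since it addresses the underlying question.

The misunderstanding comes from assuming that `pd.NaT` behaves like `None`. However, while `None == None` is `True`, `pd.NaT == pd.NaT` is `False`. Pandas `NaT` behaves like a floating-point `NaN`, which is not equal to itself.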

As the previous answers explain, you should use

    df[df.b.isnull()] # or notnull(), respectively
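The comparison behaviour described above can be checked directly. This is a small sketch; the Series `s` and the mask names are illustrative.

```python
import numpy as np
import pandas as pd

print(None == None)        # True
print(pd.NaT == pd.NaT)    # False
print(np.nan == np.nan)    # False

s = pd.Series([pd.NaT, pd.Timestamp("2014-02-01"), pd.NaT])

# Element-wise equality with NaT is False everywhere, so it selects nothing
eq_mask = s == pd.NaT

# isnull() is the reliable way to detect NaT
null_mask = s.isnull()

print(eq_mask.tolist())    # [False, False, False]
print(null_mask.tolist())  # [True, False, True]
```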

Hmm, `df[df.b==pd.NaT]`?

@acushner: But `pd.NaT != pd.NaT`, just like `nan != nan`.

Perhaps with recent versions of pandas it is better to slice the DataFrame with the `.loc` method, like this:

    df.loc[df.b.isnull()]