Python: How to find the first occurrence of NaT and NaN in the 192 columns (80000 values each) of a DataFrame


python pandas


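I have a DataFrame with 192 columns x 80000 values. However, some columns contain NaN (not a number) and NaT (not a time). How can I find where each of them first occurs? I tried the following: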

import pandas as pd
import matplotlib.pyplot as plt

#df has 192 columns and each column has 80000 values
for i,j in zip(df.columns[::2],df.columns[1::2]):
    print(df[(str(df[i])=='NaT')].index,df[(str(df[j])=='NaN')].index)

The output is:

KeyError: False

During handling of the above exception, another exception occurred:

.

With the following modified code:

print(df[(df[i]=='NaT')].index,df[(df[j]=='NaN')].index)

the output I get is:

Int64Index([], dtype='int64') Int64Index([], dtype='int64')
Int64Index([], dtype='int64') Int64Index([], dtype='int64')
Int64Index([], dtype='int64') Int64Index([], dtype='int64')
Int64Index([], dtype='int64') Int64Index([], dtype='int64')
Int64Index([], dtype='int64') Int64Index([], dtype='int64')
Int64Index([], dtype='int64') Int64Index([], dtype='int64')
Int64Index([], dtype='int64') Int64Index([], dtype='int64')
.
.

What is wrong? Why are no values found here?
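A minimal sketch of why neither attempt finds anything, assuming a toy frame whose column names `t` and `x` are made up for illustration. `str(df[i])` converts the entire column into one string, so the comparison produces a single `False`, and `df[False]` then looks for a column named `False`, which is consistent with the `KeyError: False` above. Missing values are also not the string `'NaN'`, so the element-wise comparison matches nothing.

```python
import numpy as np
import pandas as pd

# Toy data; the column names "t" and "x" are hypothetical.
df = pd.DataFrame({
    "t": pd.to_datetime(["2020-01-01", None, "2020-01-03"]),
    "x": [1.0, 2.0, np.nan],
})

# str() of a Series is one string describing the whole column,
# so this comparison yields a single bool, not a per-row mask.
whole_column_flag = str(df["t"]) == "NaT"

# Element-wise comparison with the string 'NaN' never matches a missing value.
string_matches = int((df["x"] == "NaN").sum())

# isna() flags both NaN (float columns) and NaT (datetime columns).
missing_t = df["t"].isna().tolist()
missing_x = df["x"].isna().tolist()
```

Here `isna()` is the only one of the three checks that yields a usable per-row mask.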

Following @ChrisA's answer:

print(df[i].isna().idxmax(),df[j].isna().idxmax())

The output is:

83912 83912
83451 83451
83681 83681
83697 83697
83873 83873
83660 83660
82975 82975
83847 83847
0 0
83537 83537
83762 83762
.

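Why do some columns in the middle return 0? Does that mean the first sample of both columns is null? But both of these columns actually have values.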
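One likely explanation, sketched under the assumption that the index starts at label 0: when a column has no missing values, `isna()` is `False` everywhere, and `idxmax()` on an all-`False` mask returns the first label. A result of 0 is therefore ambiguous on its own and needs a separate `any()` check.

```python
import numpy as np
import pandas as pd

complete = pd.Series([1.0, 2.0, 3.0])       # no missing values at all
starts_missing = pd.Series([np.nan, 2.0])   # missing value at label 0

# Both expressions return label 0, for different reasons.
first_complete = complete.isna().idxmax()
first_missing = starts_missing.isna().idxmax()

# any() tells the two cases apart.
complete_has_missing = bool(complete.isna().any())
starts_has_missing = bool(starts_missing.isna().any())
```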

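As in the marked duplicate, use pd.Series.isnull; no string conversion is needed. df.isna().idxmax() will give the index of the first missing value for each column.

@ChrisA Thanks, your approach does work. But did you see the output? Why do some columns in the middle return 0? Does that mean the first sample of both columns is null? But both of these columns actually have values.

@Msquare Good catch. You could try:

df.isna().idxmax() * np.where(df.isna().any(), 1, np.nan)

A 0 should now mean a NaN at index 0, and nan means the column has no NaN values.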
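The idea from the last comment can be sketched as follows. This variant swaps the multiplication by `np.where(...)` for `Series.where`, which also works when the index labels are not numeric. The frame and its column names `t`, `x`, and `y` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame: "t" has NaT from row 2, "x" has NaN at row 0, "y" is complete.
df = pd.DataFrame({
    "t": pd.to_datetime(["2020-01-01", "2020-01-02", None, None]),
    "x": [np.nan, 1.0, 2.0, 3.0],
    "y": [1.0, 2.0, 3.0, 4.0],
})

missing = df.isna()  # boolean mask covering both NaN and NaT

# idxmax() gives the first True label per column; where() blanks out
# the columns that have no missing value at all.
first_missing = missing.idxmax().where(missing.any())
```

In the result, a 0 means the column is missing at label 0, and NaN means the column has no missing values.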