Python: How to take the log of only the non-zero values in a DataFrame and replace 0's with NA's
How can I take the log of the non-zero values in a DataFrame and replace the zeros with NA? My DataFrame looks like this:
time y1 y2
0 2017-08-06 00:52:00 0 10
1 2017-08-06 00:52:10 1 20
2 2017-08-06 00:52:20 2 0
3 2017-08-06 00:52:30 3 0
4 2017-08-06 00:52:40 0 5
5 2017-08-06 00:52:50 4 6
6 2017-08-06 00:53:00 6 11
7 2017-08-06 00:53:10 7 12
8 2017-08-06 00:53:20 8 0
9 2017-08-06 00:53:30 0 13
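I want to take the log of every column except the first one, time. The log should be computed only for the non-zero values, and the zeros should be replaced with NA. How can I do this?
So I tried this: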
import numpy as np
cols = df.columns.difference(['time'])
# Replace 0's with NaN:
df[cols] = df[cols].mask(np.isclose(df[cols].values, 0), np.nan)
df[cols] = np.log(df[cols])  # but this will also try to take the log of the NaN's
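Please help.
The output should be a DataFrame with the same time column, with all zeros replaced by NA and, for every column except the first, the log of the remaining values.

If I understand correctly, you can simply replace the zeros with np.nan and then call np.log directly. np.log leaves NaN values as NaN, so they need no special handling.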
np.log(df[['y1', 'y2']].replace(0, np.nan))
Example
>>> df = pd.DataFrame({'time': pd.date_range('20170101', '20170110'),
'y1' : np.random.randint(0, 3, 10),
'y2': np.random.randint(0, 3, 10)})
>>> df
time y1 y2
0 2017-01-01 1 2
1 2017-01-02 0 1
2 2017-01-03 2 0
3 2017-01-04 0 1
4 2017-01-05 1 0
5 2017-01-06 1 1
6 2017-01-07 2 0
7 2017-01-08 1 0
8 2017-01-09 0 1
9 2017-01-10 2 1
>>> df[['log_y1', 'log_y2']] = np.log(df[['y1', 'y2']].replace(0, np.nan))
>>> df
time y1 y2 log_y1 log_y2
0 2017-01-01 1 2 0.000000 0.693147
1 2017-01-02 0 1 NaN 0.000000
2 2017-01-03 2 0 0.693147 NaN
3 2017-01-04 0 1 NaN 0.000000
4 2017-01-05 1 0 0.000000 NaN
5 2017-01-06 1 1 0.000000 0.000000
6 2017-01-07 2 0 0.693147 NaN
7 2017-01-08 1 0 0.000000 NaN
8 2017-01-09 0 1 NaN 0.000000
9 2017-01-10 2 1 0.693147 0.000000
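The same idea can be applied to the DataFrame from the question, overwriting every column except time in place rather than adding new log_ columns. This is only a sketch: the data is retyped from the question, and selecting columns with df.columns.difference(['time']) follows the asker's own attempt.

```python
import numpy as np
import pandas as pd

# Data retyped from the question: one row every 10 seconds.
df = pd.DataFrame({
    "time": pd.date_range("2017-08-06 00:52:00", periods=10,
                          freq=pd.Timedelta(seconds=10)),
    "y1": [0, 1, 2, 3, 0, 4, 6, 7, 8, 0],
    "y2": [10, 20, 0, 0, 5, 6, 11, 12, 0, 13],
})

# Every column except 'time', as in the asker's attempt.
cols = df.columns.difference(["time"])

# Zeros become NaN first; np.log then returns NaN for those cells
# and the natural log for everything else.
df[cols] = np.log(df[cols].replace(0, np.nan))

print(df)
```

Because the zeros are replaced before np.log is called, the log is never evaluated at 0, so no -inf values or divide-by-zero warnings appear.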
Could you post the output you expect?