Python: How to take the log of only the non-zero values in a DataFrame and replace 0's with NA's

How can I take the log of the non-zero values in a DataFrame and replace the 0's with NA?

My DataFrame looks like this:

     time                 y1  y2
0    2017-08-06 00:52:00   0   10
1    2017-08-06 00:52:10   1   20
2    2017-08-06 00:52:20   2   0
3    2017-08-06 00:52:30   3   0
4    2017-08-06 00:52:40   0   5
5    2017-08-06 00:52:50   4   6
6    2017-08-06 00:53:00   6   11
7    2017-08-06 00:53:10   7   12
8    2017-08-06 00:53:20   8   0
9    2017-08-06 00:53:30   0   13
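I want to take the log of every column except the first column, time. The log should be computed only for the non-zero values, and the zeros should be replaced with NA. How can I do this?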

So I tried this:

import numpy as np

cols = df.columns.difference(['time'])

# Replace 0's with NA's:
df[cols] = df[cols].mask(np.isclose(df[cols].values, 0), np.nan)

df[cols] = np.log(df[cols])  # but this will also try to take the log of the NA's
Please help.


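The output should be a DataFrame with the same time column, with all zeros replaced by NA and the remaining values in every column except the first replaced by their log.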

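If I understand correctly, you can just replace the zeros with np.nan and then call np.log directly. np.log does not fail on NaN; it simply returns NaN for those entries.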

np.log(df[['y1', 'y2']].replace(0, np.nan))
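To illustrate why the zeros are masked before the log is taken, here is a minimal sketch on a made-up Series (the values and the names `s`, `raw` and `masked` are assumptions for illustration, not taken from the question). It compares taking the log directly with taking it after the replacement:

```python
import numpy as np
import pandas as pd

# Hypothetical sample values containing zeros.
s = pd.Series([0, 1, 2, 0, 4], dtype="float64")

# Taking the log directly turns every 0 into -inf
# (NumPy would also emit a divide-by-zero RuntimeWarning, silenced here).
with np.errstate(divide="ignore"):
    raw = np.log(s)

# Replacing 0 with NaN first: NaN stays NaN, everything else is logged.
masked = np.log(s.replace(0, np.nan))
```

In `masked`, the positions that held 0 are NaN and the remaining positions hold the natural log of the original values.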
Example

>>> df = pd.DataFrame({'time': pd.date_range('20170101', '20170110'), 
                       'y1' : np.random.randint(0, 3, 10), 
                       'y2': np.random.randint(0, 3, 10)})

>>> df 
        time  y1  y2
0 2017-01-01   1   2
1 2017-01-02   0   1
2 2017-01-03   2   0
3 2017-01-04   0   1
4 2017-01-05   1   0
5 2017-01-06   1   1
6 2017-01-07   2   0
7 2017-01-08   1   0
8 2017-01-09   0   1
9 2017-01-10   2   1

>>> df[['log_y1', 'log_y2']] = np.log(df[['y1', 'y2']].replace(0, np.nan))

>>> df
        time  y1  y2    log_y1    log_y2
0 2017-01-01   1   2  0.000000  0.693147
1 2017-01-02   0   1       NaN  0.000000
2 2017-01-03   2   0  0.693147       NaN
3 2017-01-04   0   1       NaN  0.000000
4 2017-01-05   1   0  0.000000       NaN
5 2017-01-06   1   1  0.000000  0.000000
6 2017-01-07   2   0  0.693147       NaN
7 2017-01-08   1   0  0.000000       NaN
8 2017-01-09   0   1       NaN  0.000000
9 2017-01-10   2   1  0.693147  0.000000
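If the goal is to overwrite the original columns rather than add new ones, as the question seems to ask, the same idea could be applied to every column except time. This is only a sketch under that assumption: the data is retyped from the question's table, and the name `result` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Data retyped from the table in the question.
df = pd.DataFrame({
    "time": pd.date_range("2017-08-06 00:52:00", periods=10,
                          freq=pd.Timedelta(seconds=10)),
    "y1": [0, 1, 2, 3, 0, 4, 6, 7, 8, 0],
    "y2": [10, 20, 0, 0, 5, 6, 11, 12, 0, 13],
})

# Every column except 'time', as in the question's own attempt.
cols = df.columns.difference(["time"])

# Work on a copy so the original frame is left untouched.
result = df.copy()

# Zeros become NaN, all other values are replaced by their natural log.
result[cols] = np.log(df[cols].replace(0, np.nan))
```

The time column is carried over unchanged, and y1 and y2 become float columns because NaN cannot be stored in an integer column.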

Can you post what you expect as the output?