Python apply() returns NaN for the difference between two columns
I want to compute the absolute difference between two columns, I and Imean, with the following code:
import numpy as np

def diff(row):
    """Calculate the absolute difference of this row."""
    return np.abs(row['I'] - row['Imean'])

spectrum['diff'] = spectrum.apply(diff, axis=1)
Whenever spectrum['I'] is all zeros, spectrum['diff'] contains only NaN. What am I missing?
(If I check spectrum['I'] for the all-zero case and then set spectrum['diff'] = spectrum['Imean'], I can avoid this error. But still...)
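Additional information:
OK, I investigated my problem further. I normalize my data by the area under the curve and try to avoid dividing by zero, because I know the data can be all zeros: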
s = spectrum['I'].sum()
try:
    spectrum['I'] /= s
except ValueError:
    spectrum['I'] = 0.0
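I don't get a runtime warning when I run the script, but if I run the code in the IPython console I get "RuntimeWarning: invalid value encountered in true_divide", and spectrum['I'] is replaced by NaNs. The same happens if I catch ZeroDivisionError instead.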
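The behavior described above can be reproduced with a minimal sketch. It assumes a small all-zero Series standing in for spectrum['I'] and made-up values for Imean; the variable names are illustrative only. Dividing a pandas Series by a zero sum does not raise an exception, so neither except clause is ever reached, and the resulting NaNs then propagate into the difference:

```python
import pandas as pd

# Hypothetical stand-ins for spectrum['I'] and spectrum['Imean'].
intensity = pd.Series([0.0, 0.0, 0.0])
imean = pd.Series([1.0, 2.0, 3.0])

total = intensity.sum()  # 0.0 for an all-zero column

raised = False
try:
    # 0.0 / 0.0 is evaluated element-wise and yields NaN
    # (possibly with a RuntimeWarning) instead of raising.
    normalized = intensity / total
except (ValueError, ZeroDivisionError):
    raised = True

# NaN minus anything is NaN, and abs() keeps it NaN.
difference = (normalized - imean).abs()

print(raised)
print(normalized.isna().all())
print(difference.isna().all())
```

So the all-NaN diff column most likely comes from the normalization step rather than from apply() itself.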
So how do I correctly avoid dividing by zero here?

If I understand you correctly, you can do it this way:
In [6]: df = pd.DataFrame(np.random.randint(0, 20, (10,2)), columns=['I', 'Imean'])

In [7]: df['diff'] = (df['I'] - df['Imean']).abs()

In [8]: df
Out[8]:
    I  Imean  diff
0   2      9     7
1   9      1     8
2  18     11     7
3   6     19    13
4   5     12     7
5   4      8     4
6  13      3    10
7   1     19    18
8   6      5     1
9   7      0     7
With all zeros:
In [9]: df.I = 0

In [10]: df
Out[10]:
   I  Imean  diff
0  0      9     7
1  0      1     8
2  0     11     7
3  0     19    13
4  0     12     7
5  0      8     4
6  0      3    10
7  0     19    18
8  0      5     1
9  0      0     7

In [11]: df['diff'] = (df['I'] - df['Imean']).abs()

In [12]: df
Out[12]:
   I  Imean  diff
0  0      9     9
1  0      1     1
2  0     11    11
3  0     19    19
4  0     12    12
5  0      8     8
6  0      3     3
7  0     19    19
8  0      5     5
9  0      0     0
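For the normalization step from the question, one option is to test the sum explicitly before dividing instead of relying on an exception. The following is only a sketch under that assumption: normalize_by_sum is a hypothetical helper name, not part of pandas, and falling back to zeros mirrors what the question's except branch was trying to do.

```python
import pandas as pd

def normalize_by_sum(values: pd.Series) -> pd.Series:
    """Scale values so they sum to 1; return all zeros if the sum is 0."""
    total = values.sum()
    if total == 0:
        # Nothing to normalize: keep the column at zero instead of
        # letting 0 / 0 turn every entry into NaN.
        return pd.Series(0.0, index=values.index)
    return values / total

# Usage with made-up data:
print(normalize_by_sum(pd.Series([1.0, 3.0])).tolist())
print(normalize_by_sum(pd.Series([0.0, 0.0])).tolist())
```

With the column guarded this way, (df['I'] - df['Imean']).abs() should no longer pick up NaN from the normalization.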
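PS: As already mentioned, please always provide a reproducible sample and the desired data set when asking pandas questions. Many things may be missing here, but the first thing I noticed is that there is no sample data set. When asking a question, try to follow these standards: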