Python apply() returns NaN for the difference between two columns


I want to compute the absolute difference between two columns, `I` and `Imean`, with the following code:

    def diff(row):
        """ calculate absolute difference of this row """
        return np.abs(row['I'] - row['Imean'])

    spectrum['diff'] = spectrum.apply(diff, axis=1)
Whenever `spectrum['I']` is all zeros, `spectrum['diff']` contains only `nan`. What am I missing? (If I check for the all-zero case of `spectrum['I']` and then set `spectrum['diff'] = spectrum['Imean']`, I can avoid this error. But still…)

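Additional information:

OK, I investigated my problem further. I normalize the data by the area under the curve and try to avoid dividing by zero, because I know all-zero data can occur: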

    s = spectrum['I'].sum()
    try:
        spectrum['I'] /= s
    except ValueError:
        spectrum['I'] = 0.0
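I don't get a runtime warning from the script, but if I run the code in the IPython console I get `RuntimeWarning: invalid value encountered in true_divide`, and the values in `spectrum['I']` are replaced by `NaN`s. The same happens if I catch `ZeroDivisionError` instead.
So how do I properly avoid dividing by zero here?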

If I understand correctly, you can do it this way:

In [6]: df = pd.DataFrame(np.random.randint(0, 20, (10,2)), columns=['I', 'Imean'])

In [7]: df['diff'] = (df['I'] - df['Imean']).abs()

In [8]: df
Out[8]:
    I  Imean  diff
0   2      9     7
1   9      1     8
2  18     11     7
3   6     19    13
4   5     12     7
5   4      8     4
6  13      3    10
7   1     19    18
8   6      5     1
9   7      0     7
With all zeros:

In [9]: df.I=0

In [10]: df
Out[10]:
   I  Imean  diff
0  0      9     7
1  0      1     8
2  0     11     7
3  0     19    13
4  0     12     7
5  0      8     4
6  0      3    10
7  0     19    18
8  0      5     1
9  0      0     7

In [11]: df['diff'] = (df['I'] - df['Imean']).abs()

In [12]: df
Out[12]:
   I  Imean  diff
0  0      9     9
1  0      1     1
2  0     11    11
3  0     19    19
4  0     12    12
5  0      8     8
6  0      3     3
7  0     19    19
8  0      5     5
9  0      0     0
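
On the follow-up about division by zero: as far as I know, element-wise float division in NumPy/pandas does not raise an exception. `0.0 / 0.0` simply yields `NaN` (at most with a `RuntimeWarning`), so neither `except ValueError` nor `except ZeroDivisionError` is ever triggered. A minimal sketch of one way around it is to test the divisor yourself; the all-zero `spectrum` frame below is made-up sample data, not the asker's real data set:

```python
import numpy as np
import pandas as pd

# Hypothetical all-zero spectrum standing in for the asker's data.
spectrum = pd.DataFrame({"I": [0.0, 0.0, 0.0], "Imean": [1.0, 2.0, 3.0]})

s = spectrum["I"].sum()

# Unguarded division: no exception is raised, every value becomes NaN.
# errstate only silences a possible RuntimeWarning for this demonstration.
with np.errstate(divide="ignore", invalid="ignore"):
    unguarded = spectrum["I"] / s

# Guard on the divisor instead of relying on try/except.
if s != 0:
    spectrum["I"] = spectrum["I"] / s
else:
    spectrum["I"] = 0.0

# Vectorized absolute difference, as in the answer above.
spectrum["diff"] = (spectrum["I"] - spectrum["Imean"]).abs()
```

With the guard in place `I` stays at zero, so `diff` ends up equal to `Imean` instead of `NaN`.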

PS: as mentioned before, please always provide a reproducible sample and the desired data set when asking pandas questions.

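A lot of things could be missing, but the first thing I notice is the lack of a sample data set. When asking a question, try to follow these standards: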