Python: finding the difference between two columns with pandas
I want to find the difference between two int columns in a DataFrame. I am using Python 2.7. The columns look like this:
>>> df
INVOICED_QUANTITY QUANTITY_SHIPPED
0 15 NaN
1 20 NaN
2 7 NaN
3 7 NaN
4 7 NaN
Now I want to subtract the shipped quantity from the invoiced quantity, so I do the following:
>>> df['Diff'] = df['INVOICED_QUANTITY'] - df['QUANTITY_SHIPPED']
>>> df
INVOICED_QUANTITY QUANTITY_SHIPPED Diff
0 15 NaN NaN
1 20 NaN NaN
2 7 NaN NaN
3 7 NaN NaN
4 7 NaN NaN
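How do I handle the NaNs? I want NaN to be treated as 0 (zero) in the subtraction, so that Diff is a number rather than NaN.
I don't want to call df.fillna(0) on the whole frame. For the sum, I tried the following and it works, but the same does not happen for the difference:
>>> df['Sum'] = df[['INVOICED_QUANTITY', 'QUANTITY_SHIPPED']].sum(axis=1)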
>>> df
INVOICED_QUANTITY QUANTITY_SHIPPED Diff Sum
0 15 NaN NaN 15
1 20 NaN NaN 20
2 7 NaN NaN 7
3 7 NaN NaN 7
4 7 NaN NaN 7
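The asymmetry between the sum and the difference comes from how pandas treats NaN: reductions such as sum() skip missing values by default (skipna=True), while the binary - operator propagates them. A minimal sketch that rebuilds the frame above, assuming Python 3 and a recent pandas (the variable names row_sum, row_diff and strict_sum are only illustrative):

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "INVOICED_QUANTITY": [15, 20, 7, 7, 7],
    "QUANTITY_SHIPPED": [np.nan] * 5,
})
cols = ["INVOICED_QUANTITY", "QUANTITY_SHIPPED"]

# sum() skips NaN by default (skipna=True), so each row sums to the invoiced value.
row_sum = df[cols].sum(axis=1)

# The "-" operator has no skipna option: NaN in either operand gives NaN.
row_diff = df["INVOICED_QUANTITY"] - df["QUANTITY_SHIPPED"]

# With skipna=False, sum() propagates NaN just like the operator does.
strict_sum = df[cols].sum(axis=1, skipna=False)

print(row_sum.tolist())
print(row_diff.isna().all())
print(strict_sum.isna().all())
```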
I think simply filling the NaNs with 0 will solve your problem:
df['Diff'] = df['INVOICED_QUANTITY'] - df['QUANTITY_SHIPPED'].fillna(0)
Out[153]:
INVOICED_QUANTITY QUANTITY_SHIPPED Diff
0 15 NaN 15
1 20 NaN 20
2 7 NaN 7
3 7 NaN 7
4 7 NaN 7
You can use the sub method to perform the subtraction. Its fill_value argument lets NaN values be treated as a specified value:
df['Diff'] = df['INVOICED_QUANTITY'].sub(df['QUANTITY_SHIPPED'], fill_value=0)
This produces:
INVOICED_QUANTITY QUANTITY_SHIPPED Diff
0 15 NaN 15
1 20 NaN 20
2 7 NaN 7
3 7 NaN 7
4 7 NaN 7
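Another neat way is to fill in the missing values of the column with fillna(0), which creates a copy of the column, and then subtract as usual. The two methods are almost identical, although sub is slightly more efficient because it does not need to produce a copy of the column in advance; it fills the missing values "on the fly", as the timings at the end show.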
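@Jianxun Li: I don't want to do fillna(0). Is there any other option? I have edited my question, please take a look.
fillna() only returns a copy; it does not modify the underlying frame. I have modified the code to fit your needs.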
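The point about fillna() returning a copy can be checked directly. A minimal sketch, assuming Python 3 and a recent pandas, with the frame rebuilt from the question (the name filled is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "INVOICED_QUANTITY": [15, 20, 7, 7, 7],
    "QUANTITY_SHIPPED": [np.nan] * 5,
})

# fillna(0) returns a new Series; the column stored in df is not modified.
filled = df["QUANTITY_SHIPPED"].fillna(0)
df["Diff"] = df["INVOICED_QUANTITY"] - filled

print(df["QUANTITY_SHIPPED"].isna().all())  # the original column still holds NaN
print(df["Diff"].tolist())
```

Only the new Diff column is written back, so the NaNs in QUANTITY_SHIPPED remain available for later checks.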
In [46]: %timeit df['INVOICED_QUANTITY'] - df['QUANTITY_SHIPPED'].fillna(0)
10000 loops, best of 3: 144 µs per loop
In [47]: %timeit df['INVOICED_QUANTITY'].sub(df['QUANTITY_SHIPPED'], fill_value=0)
10000 loops, best of 3: 81.7 µs per loop
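The %timeit lines above come from an IPython session. Outside IPython, a comparable measurement can be sketched with the standard timeit module. The helper names with_fillna and with_sub and the loop count are assumptions for illustration, and absolute numbers will differ by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "INVOICED_QUANTITY": [15, 20, 7, 7, 7],
    "QUANTITY_SHIPPED": [np.nan] * 5,
})

def with_fillna():
    # Copy the column with NaN replaced by 0, then subtract.
    return df["INVOICED_QUANTITY"] - df["QUANTITY_SHIPPED"].fillna(0)

def with_sub():
    # Let sub() substitute 0 for the missing operand during the subtraction.
    return df["INVOICED_QUANTITY"].sub(df["QUANTITY_SHIPPED"], fill_value=0)

n = 1000  # arbitrary number of repetitions for this sketch
t_fillna = timeit.timeit(with_fillna, number=n)
t_sub = timeit.timeit(with_sub, number=n)

print("fillna: %.1f us per loop" % (t_fillna / n * 1e6))
print("sub:    %.1f us per loop" % (t_sub / n * 1e6))
```

On a five-row frame the measurement is dominated by fixed overhead, so it is worth repeating on data of realistic size before drawing conclusions.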