Python 3.x: Subtraction of two pandas Series
I have a DataFrame that was retrieved with xlwings. I want to produce two results in two columns, A.OUT-B.IN and A.OUT-C.IN, but they always show NaN. The rows come in groups of three that share the same manufacturing order (MO). Each MO has three different stops: A, B, and C. Every stop has an input amount and an output amount. For each MO, A.OUT-B.IN should be the output amount of stop A minus the input amount of stop B. A.OUT-C.IN is the same difference, but using the input amount of stop C. What should I do about the NaN values? I tried converting both Series to numeric, searched Google, and read the pandas documentation, but I still cannot find a solution.
Here is the sample code:
import pandas as pd
df = pd.DataFrame({'MO': ['510-20200701001', '510-20200701001', '510-20200701001', '510-20200701002', '510-20200701002', '510-20200701002', '510-20200701003', '510-20200701003', '510-20200701003', '510-20200701004', '510-20200701004', '510-20200701004', '510-20200701005', '510-20200701005', '510-20200701005', '510-20200701006', '510-20200701006', '510-20200701006'],
'Stop Name': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
'Amount of Input': [21000, 22112, 22476, 12000, 12609, 12775, 15000, 15595, 15844, 600, 775, 790, 1000, 1149, 1176, 6000, 6225, 6289],
'Amount of Output': [22400, 22057, 22330, 12800, 12586, 12685, 16000, 15587, 15718, 800, 775, 783, 1200, 1139, 1162, 6400 ,6225, 6278],
'A.OUT-B.IN':['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''], 'A.OUT-C.IN': ['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''] })
df.loc[df['Stop Name'] == 'B', 'A.OUT-B.IN'] = df.loc[df['Stop Name'] == 'A', 'Amount of Output'] - df.loc[df['Stop Name'] == 'B', 'Amount of Input']
df.loc[df['Stop Name'] == 'C', 'A.OUT-C.IN'] = df.loc[df['Stop Name'] == 'A', 'Amount of Output'] - df.loc[df['Stop Name'] == 'C', 'Amount of Input']
This is the actual output:
                 MO  Stop Name  Amount of Input  Amount of Output  A.OUT-B.IN  A.OUT-C.IN
0   510-20200701001          A            21000             22400
1   510-20200701001          B            22112             22057         NaN
2   510-20200701001          C            22476             22330                     NaN
3   510-20200701002          A            12000             12800
4   510-20200701002          B            12609             12586         NaN
5   510-20200701002          C            12775             12685                     NaN
6   510-20200701003          A            15000             16000
7   510-20200701003          B            15595             15587         NaN
8   510-20200701003          C            15844             15718                     NaN
9   510-20200701004          A              600               800
10  510-20200701004          B              775               775         NaN
11  510-20200701004          C              790               783                     NaN
12  510-20200701005          A             1000              1200
13  510-20200701005          B             1149              1139         NaN
14  510-20200701005          C             1176              1162                     NaN
15  510-20200701006          A             6000              6400
16  510-20200701006          B             6225              6225         NaN
17  510-20200701006          C             6289              6278                     NaN
This is the expected output:
                 MO  Stop Name  Amount of Input  Amount of Output  A.OUT-B.IN  A.OUT-C.IN
0   510-20200701001          A            21000             22400
1   510-20200701001          B            22112             22057         288
2   510-20200701001          C            22476             22330                     -76
3   510-20200701002          A            12000             12800
4   510-20200701002          B            12609             12586         191
5   510-20200701002          C            12775             12685                      25
6   510-20200701003          A            15000             16000
7   510-20200701003          B            15595             15587         405
8   510-20200701003          C            15844             15718                     156
9   510-20200701004          A              600               800
10  510-20200701004          B              775               775          25
11  510-20200701004          C              790               783                      10
12  510-20200701005          A             1000              1200
13  510-20200701005          B             1149              1139          51
14  510-20200701005          C             1176              1162                      24
15  510-20200701006          A             6000              6400
16  510-20200701006          B             6225              6225         175
17  510-20200701006          C             6289              6278                     111
If you know that the corresponding vectors all have matching lengths, you can simply append ".values" to each selection, like this:
import pandas as pd
df = pd.DataFrame({'MO': ['510-20200701001', '510-20200701001', '510-20200701001', '510-20200701002', '510-20200701002', '510-20200701002', '510-20200701003', '510-20200701003', '510-20200701003', '510-20200701004', '510-20200701004', '510-20200701004', '510-20200701005', '510-20200701005', '510-20200701005', '510-20200701006', '510-20200701006', '510-20200701006'],
'Stop Name': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
'Amount of Input': [21000, 22112, 22476, 12000, 12609, 12775, 15000, 15595, 15844, 600, 775, 790, 1000, 1149, 1176, 6000, 6225, 6289],
'Amount of Output': [22400, 22057, 22330, 12800, 12586, 12685, 16000, 15587, 15718, 800, 775, 783, 1200, 1139, 1162, 6400 ,6225, 6278],
'A.OUT-B.IN':['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''], 'A.OUT-C.IN': ['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''] })
df.loc[df['Stop Name'] == 'B', 'A.OUT-B.IN'] = df.loc[df['Stop Name'] == 'A', 'Amount of Output'].values - df.loc[df['Stop Name'] == 'B', 'Amount of Input'].values
df.loc[df['Stop Name'] == 'C', 'A.OUT-C.IN'] = df.loc[df['Stop Name'] == 'A', 'Amount of Output'].values - df.loc[df['Stop Name'] == 'C', 'Amount of Input'].values
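Why this works can be seen on a cut-down example. The sketch below is only an illustration on a two-row frame (the first A and B rows of the sample data, not the full table): subtracting two pandas Series aligns them on their index labels, and the A rows and B rows never share a label, so every result is NaN. Converting to NumPy arrays drops the index, so the subtraction becomes positional. The variable names a_out, b_in, aligned, and positional are made up for this illustration.

```python
import pandas as pd

# Two rows taken from the sample data: stop A and stop B of one MO.
df = pd.DataFrame({
    "Stop Name": ["A", "B"],
    "Amount of Input": [21000, 22112],
    "Amount of Output": [22400, 22057],
})

a_out = df.loc[df["Stop Name"] == "A", "Amount of Output"]  # index label 0
b_in = df.loc[df["Stop Name"] == "B", "Amount of Input"]    # index label 1

# Series arithmetic aligns on index labels. Labels 0 and 1 have no
# partner in the other Series, so both positions come out as NaN.
aligned = a_out - b_in

# NumPy arrays carry no index, so the subtraction is purely positional.
positional = a_out.to_numpy() - b_in.to_numpy()

print(aligned)
print(positional)
```

The positional result is only meaningful when both selections have the same length and the same row order, which is the condition stated at the top of this answer.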
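Comments:
Put a .to_numpy() on it, e.g. df.loc[df['Stop Name'] == 'A', 'Amount of Output'].to_numpy(), and do the same for the values being subtracted? Because the index labels differ, pandas cannot match up the keys for the subtraction.
@anky I tried that, but it shows TypeError: unsupported operand type(s) for -: 'int' and 'method'.
You are probably missing the parentheses. Check the answer below, that is what I meant. The only difference is that .values is now .to_numpy().
@anky, thanks. I checked it, thank you very much. This is exactly what I wanted.
@johann Glad it solved your problem.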
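If the row order or the group sizes cannot be guaranteed, a different technique from the one in the answer is to match rows by MO instead of by position. The following is a minimal sketch under the assumption that each MO has at most one stop A row. The names a_out_by_mo and diff are hypothetical, and only the first two MOs of the sample data are used.

```python
import pandas as pd

df = pd.DataFrame({
    "MO": ["510-20200701001"] * 3 + ["510-20200701002"] * 3,
    "Stop Name": ["A", "B", "C", "A", "B", "C"],
    "Amount of Input": [21000, 22112, 22476, 12000, 12609, 12775],
    "Amount of Output": [22400, 22057, 22330, 12800, 12586, 12685],
})

# Output amount of stop A, keyed by MO (assumes one A row per MO).
a_out_by_mo = df.loc[df["Stop Name"] == "A"].set_index("MO")["Amount of Output"]

# Look up A's output for every row of the same MO, then subtract row-wise.
diff = df["MO"].map(a_out_by_mo) - df["Amount of Input"]

# Keep the difference only on the matching stop; other rows become NaN.
df["A.OUT-B.IN"] = diff.where(df["Stop Name"] == "B")
df["A.OUT-C.IN"] = diff.where(df["Stop Name"] == "C")

print(df)
```

Because the lookup goes through the MO value, this version does not depend on the A, B, and C rows appearing in the same order or in equal numbers. The result columns are floating point, since rows without a value hold NaN.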