Python pandas: comparing rows in a DataFrame

python pandas

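I have the following DataFrame (represented by the dictionary below):

I am trying to understand what changed between the old and the new sequences. If, for a given Name, Sequence_old == Sequence_new, nothing changed. If Sequence_new is 'nan', the Name was removed. Can you help me implement this in pandas? What I have tried so far, without success: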

for i in range(0, len(Merge)):
    if Merge.iloc[i]['Sequence_x'] == Merge.iloc[i]['Sequence_y']:
        Merge.iloc[i]['New'] = 'N'
    else:
        Merge.iloc[i]['New'] = 'Y'

Thanks
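A likely reason the loop above has no visible effect: Merge.iloc[i] builds a temporary row Series, so assigning into it generally does not reach Merge itself. The sketch below is a row-wise version that writes through the frame. The three-row frame is a hypothetical stand-in for the merged data, not the asker's actual frame.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row stand-in for the merged frame from the question
Merge = pd.DataFrame({
    "Name": [204, 111145, 121131],
    "Sequence_x": [1.0, np.nan, 13.0],
    "Sequence_y": [1, 13, 14],
})

# Create the target column first, then write each cell through Merge.loc
# so the assignment lands in the frame rather than in a temporary row.
Merge["New"] = ""
for i in range(len(Merge)):
    same = Merge["Sequence_x"].iloc[i] == Merge["Sequence_y"].iloc[i]
    Merge.loc[Merge.index[i], "New"] = "N" if same else "Y"

print(Merge)
```

NaN never compares equal to anything, so a missing Sequence_x ends up as 'Y' here. Telling "removed" apart from "changed" needs an explicit missing-value check.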

You can use a double (nested) numpy.where with conditions:

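Regarding "If Sequence_new is 'nan', Name removed": can you explain more? What is the desired output?

Yes, the NaN one is removed. I was hoping to use a smarter boolean condition to find these names. Thank you.

Thank you. What do you mean by smarter? Can you add the desired output for this input?

If Sequence_new is NaN, it means the name was removed, and I want to mark it as removed. If both Sequence_old and Sequence_new exist, the name stays.

Does "removed" mean dropping all rows where Sequence_new is NaN, such as 12 111145 NaN 13 Y? Or setting all values to NaN?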
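The comments above distinguish between dropping the rows with a missing Sequence_new and merely marking them. A minimal sketch of both readings follows, using a hypothetical three-row excerpt of the data. The column name Removed is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical excerpt of the question's data
df = pd.DataFrame({
    "Name": [111288, 111145, 121131],
    "Sequence_new": [12.0, np.nan, 13.0],
    "Sequence_old": [12, 13, 14],
})

# Reading 1: drop every row whose Sequence_new is missing
dropped = df.dropna(subset=["Sequence_new"])

# Reading 2: keep every row and flag the missing ones in a new column
flagged = df.assign(Removed=df["Sequence_new"].isna())

print(dropped)
print(flagged)
```

The asker's clarification points to the second reading: the row stays and is only marked.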

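import numpy as np

# True where the sequence is unchanged between old and new
mask = df.Sequence_old == df.Sequence_new

# A missing Sequence_new means the name was removed; otherwise N (same) or Y (changed)
df['New'] = np.where(df.Sequence_new.isnull(), 'Removed',
            np.where(mask, 'N', 'Y'))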

print (df)
     Name  Sequence_new  Sequence_old      New
0      204           1.0             1        N
1   110838           2.0             2        N
2   110999           3.0             3        N
3   110998           4.0             4        N
4   111155           5.0             5        N
5   111710           6.0             6        N
6   111157           7.0             7        N
7   111156           8.0             8        N
8   111144           9.0             9        N
9   118972          10.0            10        N
10  111289          11.0            11        N
11  111288          12.0            12        N
12  111145           NaN            13  Removed
13  121131          13.0            14        Y
14  118990          14.0            15        Y
15  110653          15.0            16        Y
16  110693          16.0            17        Y
17  110694          17.0            18        Y
18  111577          18.0            19        Y
19  111702          19.0            20        Y
20  115424          20.0            21        Y
21  115127          21.0            22        Y
22  115178          22.0            23        Y
23  111578          23.0            24        Y
24  115409          24.0            25        Y
25  115468          25.0            26        Y
26  111711          26.0            27        Y
27  115163          27.0            28        Y
28  115149          28.0            29        Y
29  115251          29.0            30        Y
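For anyone who wants to reproduce this end to end, here is a self-contained sketch on a shortened excerpt of the data. It makes two assumptions that are not in the answer. First, the dictionary stores the missing entry as the text 'Nan', so it is coerced with pd.to_numeric before isna() can see it. Second, np.select is used as a flatter alternative to the nested np.where, not as the answer's own method.

```python
import numpy as np
import pandas as pd

# Shortened excerpt of the question's data (index 11 to 14)
names = {11: 111288, 12: 111145, 13: 121131, 14: 118990}
dic_new = {11: 12.0, 12: "Nan", 13: 13.0, 14: 14.0}
dic_old = {11: 12, 12: 13, 13: 14, 14: 15}

df = pd.DataFrame({"Name": names, "Sequence_new": dic_new, "Sequence_old": dic_old})

# The text 'Nan' is not a real missing value; coerce it to NaN
df["Sequence_new"] = pd.to_numeric(df["Sequence_new"], errors="coerce")

# Conditions are checked in order: missing first, then unchanged
conditions = [
    df["Sequence_new"].isna(),
    df["Sequence_new"] == df["Sequence_old"],
]
df["New"] = np.select(conditions, ["Removed", "N"], default="Y")

print(df)
```

With more than two or three labels, a list of conditions tends to stay easier to read than deeper nesting.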

dic_new = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0, 5: 6.0, 6: 7.0, 7: 8.0, 8: 9.0, 9: 10.0, 10: 11.0, 11: 12.0,
           12: 'Nan', 13: 13.0, 14: 14.0, 15: 15.0, 16: 16.0, 17: 17.0, 18: 18.0, 19: 19.0, 20: 20.0, 21: 21.0,
           22: 22.0, 23: 23.0, 24: 24.0, 25: 25.0, 26: 26.0, 27: 27.0, 28: 28.0, 29: 29.0}
dic_old = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10, 10: 11, 11: 12, 12: 13, 13: 14, 14: 15, 15: 16,
           16: 17, 17: 18, 18: 19, 19: 20, 20: 21, 21: 22, 22: 23, 23: 24, 24: 25, 25: 26, 26: 27, 27: 28, 28: 29,
           29: 30}

# Does the same thing as the code below
for a, b in zip(dic_new.items(), dic_old.items()):
    # The 'Nan' marker lives in dic_new; convert to str because the other values are numbers
    if str(a[1]).lower() != 'nan':
        # You can add whatever print statement you want here
        print(a[1] == b[1])

# Does the same thing as the code above
[print(a[1] == b[1]) for a, b in zip(dic_new.items(), dic_old.items()) if str(a[1]).lower() != 'nan']
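If labels are wanted rather than printed booleans, the same plain-Python idea can be sketched with a lookup by key. It uses a shortened excerpt of the dictionaries, and is_missing is a hypothetical helper introduced here for illustration.

```python
import math

def is_missing(value):
    """Hypothetical helper: True for float NaN or a text marker such as 'Nan'."""
    if isinstance(value, str):
        return value.strip().lower() == "nan"
    return isinstance(value, float) and math.isnan(value)

# Shortened excerpt of the question's dictionaries
dic_new = {11: 12.0, 12: "Nan", 13: 13.0}
dic_old = {11: 12, 12: 13, 13: 14}

labels = {}
for key, old in dic_old.items():
    # Look up by key, so the result does not depend on dictionary order
    new = dic_new.get(key, float("nan"))
    if is_missing(new):
        labels[key] = "Removed"
    elif new == old:
        labels[key] = "N"
    else:
        labels[key] = "Y"

print(labels)
```

Matching on the key rather than zipping the two dictionaries keeps the pairs aligned even if one dictionary has entries in a different order or is missing a key.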