Python conditional statement problem


Hi, I'm processing my data.

I want to extract data using conditional statements.

Here is my code:

# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import os

join_file = r'D:\handling data\complete data\조인\after_join.csv'
pwd = os.getcwd()
os.chdir(os.path.dirname(join_file))
join_data = pd.read_csv(os.path.basename(join_file), sep=',', encoding='utf-8')

print(join_data.head())

After the second print(join_data.head()) runs, I get an error like the one in the picture.

How can I fix it?
Thanks in advance.

It seems you omitted the parentheses around the individual conditions. It is also better to assign through loc:

Original:

join_data['cluster_z']
[((join_data['cluster_x'] == 3 | 
   join_data['cluster_x'] == 2 | 
   join_data['cluster_x'] == 4 ) &
  (join_data['cluster_y'] == 3 |
   join_data['cluster_y'] == 1))] = 1
Change to:

join_data.loc[
((join_data['cluster_x'] == 3) | 
 (join_data['cluster_x'] == 2) | 
 (join_data['cluster_x'] == 4) ) & 
((join_data['cluster_y'] == 3) | 
 (join_data['cluster_y'] == 1)), 'cluster_z'] = 1 
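
Why the parentheses matter: in Python, | and & bind more tightly than ==, so an expression like s == 3 | s == 2 is parsed as the chained comparison s == (3 | s) == 2. Evaluating that chain forces pandas to reduce a whole Series to a single True/False, which raises ValueError. A minimal sketch on a small made-up Series (s and its values are hypothetical, not the asker's data):

```python
import pandas as pd

# Hypothetical sample values standing in for join_data['cluster_x'].
s = pd.Series([3, 2, 5, 3])

try:
    # Parsed as: s == (3 | s) == 2, a chained comparison.
    bad = s == 3 | s == 2
    raised = False
except ValueError as err:
    # pandas refuses to treat a whole Series as one boolean.
    raised = True
    print("ValueError:", err)

# Each comparison wrapped in its own parentheses gives an element-wise mask.
good = (s == 3) | (s == 2)
print(good.tolist())  # [True, True, False, True]
```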
Or, better, use isin. All together:

join_data = pd.DataFrame({'cluster_x':[3,2,5,3],
                         'cluster_y':[3,0,1,2]})

print (join_data)
   cluster_x  cluster_y
0          3          3
1          2          0
2          5          1
3          3          2

join_data['cluster_z'] = 4

join_data.loc[
(join_data['cluster_x'].isin([3,2,4])) & 
(join_data['cluster_y'].isin([3,1])), 'cluster_z'] = 1 

join_data.loc[
(join_data['cluster_x'].isin([1,5])) & 
(join_data['cluster_y'].isin([3,1])), 'cluster_z'] = 2 

join_data.loc[
(join_data['cluster_x'].isin([3,2,4])) & 
(join_data['cluster_y'].isin([2,4])), 'cluster_z'] = 3

print (join_data)
   cluster_x  cluster_y  cluster_z
0          3          3          1
1          2          0          4
2          5          1          2
3          3          2          3
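
For reference, isin gives the same mask as the chain of == tests joined with |, without any precedence pitfalls. A small sketch to check that, assuming a made-up Series (s, chained and membership are illustrative names only):

```python
import pandas as pd

# Hypothetical sample values, not the asker's data.
s = pd.Series([3, 2, 5, 3, 4])

# Explicit equality tests, each in its own parentheses.
chained = (s == 3) | (s == 2) | (s == 4)

# The same membership test written with isin.
membership = s.isin([3, 2, 4])

print(chained.equals(membership))  # True
```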
Or, more readably:

mask1 = join_data['cluster_x'].isin([3,2,4])
mask2 = join_data['cluster_y'].isin([3,1])
mask3 = join_data['cluster_x'].isin([1,5])
mask4 = join_data['cluster_y'].isin([2,4])

join_data['cluster_z'] = 4
join_data.loc[mask1 & mask2 , 'cluster_z'] = 1 
join_data.loc[mask3 & mask2 , 'cluster_z'] = 2 
join_data.loc[mask1 & mask4 , 'cluster_z'] = 3 

print (join_data)
   cluster_x  cluster_y  cluster_z
0          3          3          1
1          2          0          4
2          5          1          2
3          3          2          3
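
If more rules get added later, the sequence of .loc assignments above could also be written with np.select, which takes a list of conditions, a matching list of values, and a default. This is only a sketch and not part of the original answer; the names conditions and choices are my own. np.select takes the first matching condition, whereas repeated .loc assignments let the last one win. Here the three conditions never overlap, so both give the same result.

```python
import numpy as np
import pandas as pd

join_data = pd.DataFrame({'cluster_x': [3, 2, 5, 3],
                          'cluster_y': [3, 0, 1, 2]})

mask1 = join_data['cluster_x'].isin([3, 2, 4])
mask2 = join_data['cluster_y'].isin([3, 1])
mask3 = join_data['cluster_x'].isin([1, 5])
mask4 = join_data['cluster_y'].isin([2, 4])

# One condition per rule, paired by position with the value to assign.
conditions = [mask1 & mask2, mask3 & mask2, mask1 & mask4]
choices = [1, 2, 3]

# Rows matching none of the conditions get the default value 4.
join_data['cluster_z'] = np.select(conditions, choices, default=4)

print(join_data)
```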
Solution with nested np.where:

mask1 = join_data['cluster_x'].isin([3,2,4])
mask2 = join_data['cluster_y'].isin([3,1])
mask3 = join_data['cluster_x'].isin([1,5])
mask4 = join_data['cluster_y'].isin([2,4])

join_data['cluster_z'] = np.where(mask1 & mask2, 1,
                         np.where(mask3 & mask2, 2,
                         np.where(mask1 & mask4, 3, 4)))

print (join_data)
   cluster_x  cluster_y  cluster_z
0          3          3          1
1          2          0          4
2          5          1          2
3          3          2          3

Thanks! You're amazing. There are so many ways to handle it, haha. How do you know so many approaches? Thanks, and have a nice day!