Python: create a column based on multiple column values
If my df looks like this:
ID | Car | Plane | Tank | Scooter | Misc | Day
4    Yes   No      Yes    No        32     Mon
2    No    No      No     No        22     Tues
1    Yes   No      No     No        11     Wed
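If any of the Car, Plane, Tank, or Scooter columns has the value Yes or No, how can I create a new column that indicates True or False? Thanks.
You can use .iloc to select the columns to check, then use .any(axis=1) to see whether any value in a row is 'Yes' or 'No'.
The code is below. I added a fourth row whose values are 'Maybe' to show a record that does not meet the 'Yes'/'No' condition.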
# Create the DataFrame with a few sample values
import pandas as pd
df = pd.DataFrame({'ID':[4,2,1,3],
                   'Car':['Yes','No','Yes','Maybe'],
                   'Plane':['No','No','No','Maybe'],
                   'Tank':['Yes','No','No','Maybe'],
                   'Scooter':['No','No','No','Maybe'],
                   'Misc':[32,22,11,44],
                   'Day':['Mon','Tues','Wed','Thu']})
# Print the full DataFrame to make sure the values are as expected
print(df)
# iloc can be used to select the columns you want to check
# Print the selection to see which columns are being used
print(df.iloc[:,1:-2])
# To check for 'Yes' or 'No', combine the two comparisons with |; if either matches, the result is True
# To check only for 'Yes', you don't need the second part
df['Check'] = ((df.iloc[:,1:-2] == 'Yes') | (df.iloc[:,1:-2] == 'No')).any(axis=1)
# The DataFrame now has the new column with True or False
print(df)
The results are as follows:
Initial DataFrame:
ID Car Plane Tank Scooter Misc Day
0 4 Yes No Yes No 32 Mon
1 2 No No No No 22 Tues
2 1 Yes No No No 11 Wed
3 3 Maybe Maybe Maybe Maybe 44 Thu
Columns selected from the DataFrame:
Car Plane Tank Scooter
0 Yes No Yes No
1 No No No No
2 Yes No No No
3 Maybe Maybe Maybe Maybe
The final result:
ID Car Plane Tank Scooter Misc Day Check
0 4 Yes No Yes No 32 Mon True
1 2 No No No No 22 Tues True
2 1 Yes No No No 11 Wed True
3 3 Maybe Maybe Maybe Maybe 44 Thu False
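As an aside, the two comparisons joined with | can probably be collapsed into a single isin call. This is only a minimal sketch on the same sample data, not part of the answer above; the variable name subset is just for illustration.

```python
import pandas as pd

# Same sample data as above
df = pd.DataFrame({'ID': [4, 2, 1, 3],
                   'Car': ['Yes', 'No', 'Yes', 'Maybe'],
                   'Plane': ['No', 'No', 'No', 'Maybe'],
                   'Tank': ['Yes', 'No', 'No', 'Maybe'],
                   'Scooter': ['No', 'No', 'No', 'Maybe'],
                   'Misc': [32, 22, 11, 44],
                   'Day': ['Mon', 'Tues', 'Wed', 'Thu']})

# Columns 1 through 4 (Car, Plane, Tank, Scooter)
subset = df.iloc[:, 1:-2]

# isin marks every cell that equals 'Yes' or 'No';
# any(axis=1) then reduces each row to a single True/False
df['Check'] = subset.isin(['Yes', 'No']).any(axis=1)
print(df)
```

If you later need to accept more values, you would only have to extend the list passed to isin.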
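If your condition changes to the following:
If any of the values in 'Car', 'Plane', 'Tank', 'Scooter' is 'Yes', set 'Check' to True. In all other cases, set 'Check' to False.
Then the previous code can be simplified as follows:
df['Check'] = (df.iloc[:,1:-2] == 'Yes').any(axis=1)
The output of this is as follows: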
ID Car Plane Tank Scooter Misc Day Check
0 4 Yes No Yes No 32 Mon True
1 2 No No No No 22 Tues False
2 1 Yes No No No 11 Wed True
3 3 Maybe Maybe Maybe Maybe 44 Thu False
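If it is unclear what any(axis=1) is reducing over, it may help to look at the intermediate boolean mask. The sketch below uses only the four checked columns from the sample data; the names mask, per_row and per_col are made up for illustration.

```python
import pandas as pd

# Only the four columns that are being checked
df = pd.DataFrame({'Car': ['Yes', 'No', 'Yes', 'Maybe'],
                   'Plane': ['No', 'No', 'No', 'Maybe'],
                   'Tank': ['Yes', 'No', 'No', 'Maybe'],
                   'Scooter': ['No', 'No', 'No', 'Maybe']})

# Element-wise comparison: a DataFrame of True/False with the same shape
mask = df == 'Yes'
print(mask)

# axis=1 collapses the columns: one True/False per row
per_row = mask.any(axis=1)
print(per_row)

# axis=0 collapses the rows: one True/False per column
per_col = mask.any(axis=0)
print(per_col)
```

For the new column you want one value per row, which is why axis=1 is the one to use here.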
If Car, Plane, Tank, and Scooter are not adjacent columns in your DataFrame, you can put their names in a list and use that list to select and check them.
For example, if your DataFrame looks like this:
df = pd.DataFrame({'ID':[4,2,1,3],
                   'Car':['Yes','No','Yes','Maybe'],
                   'Plane':['No','No','No','Maybe'],
                   'Misc':[32,22,11,44],
                   'Tank':['Yes','No','No','Maybe'],
                   'Day':['Mon','Tues','Wed','Thu'],
                   'Scooter':['No','No','No','Maybe']})
Then it will look like this:
ID Car Plane Misc Tank Day Scooter
0 4 Yes No 32 Yes Mon No
1 2 No No 22 No Tues No
2 1 Yes No 11 No Wed No
3 3 Maybe Maybe 44 Maybe Thu Maybe
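You will not be able to use .iloc[:,1:-2]. Instead, you can put the column names in a list and use that list as follows:
cols = ['Car','Plane','Tank','Scooter']
print(df[cols])
df['Check'] = (df[cols] == 'Yes').any(axis=1)
This gives you the same result as the iloc option discussed earlier.
The output will be: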
ID Car Plane Misc Tank Day Scooter Check
0 4 Yes No 32 Yes Mon No True
1 2 No No 22 No Tues No False
2 1 Yes No 11 No Wed No True
3 3 Maybe Maybe 44 Maybe Thu Maybe False
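If you find yourself repeating this for different column lists, one option is to wrap it in a small function. The helper below, add_check_column, is hypothetical (it is not a pandas function), and the up-front check for missing columns is my own assumption about what would be convenient.

```python
import pandas as pd

def add_check_column(df, cols, values=('Yes',), name='Check'):
    """Hypothetical helper: flag rows where any of `cols` holds one of `values`."""
    # Fail early with a readable message if a column name is misspelled
    missing = [c for c in cols if c not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    out = df.copy()  # leave the caller's DataFrame untouched
    out[name] = out[list(cols)].isin(list(values)).any(axis=1)
    return out

# The non-adjacent layout from above
df = pd.DataFrame({'ID': [4, 2, 1, 3],
                   'Car': ['Yes', 'No', 'Yes', 'Maybe'],
                   'Plane': ['No', 'No', 'No', 'Maybe'],
                   'Misc': [32, 22, 11, 44],
                   'Tank': ['Yes', 'No', 'No', 'Maybe'],
                   'Day': ['Mon', 'Tues', 'Wed', 'Thu'],
                   'Scooter': ['No', 'No', 'No', 'Maybe']})

result = add_check_column(df, ['Car', 'Plane', 'Tank', 'Scooter'])
print(result)
```

Passing values=('Yes', 'No') would reproduce the first variant, where either value counts as a match.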
The following code should give True for each row in which any of the columns has the value 'Yes':
df['new col'] = df[['Car', 'Plane', 'Tank', 'Scooter']].apply(lambda x: any(x == 'Yes'), axis=1)
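Note that apply with axis=1 calls the lambda once per row in Python, so on large DataFrames a column-wise comparison is usually faster. Here is a quick sketch to check that both approaches agree on the sample data; the names row_wise and vectorized are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ID': [4, 2, 1, 3],
                   'Car': ['Yes', 'No', 'Yes', 'Maybe'],
                   'Plane': ['No', 'No', 'No', 'Maybe'],
                   'Tank': ['Yes', 'No', 'No', 'Maybe'],
                   'Scooter': ['No', 'No', 'No', 'Maybe'],
                   'Misc': [32, 22, 11, 44],
                   'Day': ['Mon', 'Tues', 'Wed', 'Thu']})
cols = ['Car', 'Plane', 'Tank', 'Scooter']

# Row by row: the lambda receives each row as a Series
row_wise = df[cols].apply(lambda x: any(x == 'Yes'), axis=1)

# Whole-frame comparison, then reduce across the columns
vectorized = df[cols].eq('Yes').any(axis=1)

# Compare the two results element by element
print((row_wise == vectorized).all())
```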
df['new_column'] = df[['Car','Plane','Scooter']].eq('Yes').any(axis=1)