Python: create a column based on the values of multiple columns

Suppose my df looks like this:

ID | Car | Plane | Tank | Scooter | Misc  | Day
4    Yes    No      Yes    No        32     Mon
2    No     No      No     No        22     Tues
1    Yes    No      No     No        11     Wed

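How can I create a new column that says True or False depending on whether any of the Car, Plane, Tank, or Scooter columns contains Yes or No? Thanks.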

You can use .iloc to select the columns to check, then use .any(axis=1) to see whether any value in a row is 'Yes' or 'No'.

The code is below. I added a fourth row whose values are 'Maybe' to show a record that does not meet the 'Yes'/'No' condition.

import pandas as pd

# Create the DataFrame with a few sample values
df = pd.DataFrame({'ID':[4,2,1,3],
                   'Car':['Yes','No','Yes','Maybe'],
                   'Plane':['No','No','No','Maybe'],
                   'Tank':['Yes','No','No','Maybe'],
                   'Scooter':['No','No','No','Maybe'],
                   'Misc':[32,22,11,44],
                   'Day':['Mon','Tues','Wed','Thu']})

# Print the full DataFrame to make sure the values are as expected
print(df)

# iloc selects the columns to check by position (here: Car through Scooter)
# Print the selection to see which columns are being used
print(df.iloc[:,1:-2])

# To check for 'Yes' or 'No', combine both comparisons with |; a row is True if either matches
# To check only for 'Yes', drop the second comparison
# axis must be passed by keyword: pandas 2.x no longer accepts .any(1)
df['Check'] = ((df.iloc[:,1:-2] == 'Yes') | (df.iloc[:,1:-2] == 'No')).any(axis=1)

# The DataFrame now has the new column with True or False
print(df)
The results are as follows.

Initial DataFrame:

   ID    Car  Plane   Tank Scooter  Misc   Day
0   4    Yes     No    Yes      No    32   Mon
1   2     No     No     No      No    22  Tues
2   1    Yes     No     No      No    11   Wed
3   3  Maybe  Maybe  Maybe   Maybe    44   Thu
Columns selected from the DataFrame:

     Car  Plane   Tank Scooter
0    Yes     No    Yes      No
1     No     No     No      No
2    Yes     No     No      No
3  Maybe  Maybe  Maybe   Maybe
Final result for you to use:

   ID    Car  Plane   Tank Scooter  Misc   Day  Check
0   4    Yes     No    Yes      No    32   Mon   True
1   2     No     No     No      No    22  Tues   True
2   1    Yes     No     No      No    11   Wed   True
3   3  Maybe  Maybe  Maybe   Maybe    44   Thu  False
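As a variation on the two comparisons joined with | in the code above, the same 'Yes' or 'No' test can probably be written with DataFrame.isin. The following is only a sketch on the answer's sample data, not the original author's method; the variable names check_isin and check_or are made up for illustration.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({'ID': [4, 2, 1, 3],
                   'Car': ['Yes', 'No', 'Yes', 'Maybe'],
                   'Plane': ['No', 'No', 'No', 'Maybe'],
                   'Tank': ['Yes', 'No', 'No', 'Maybe'],
                   'Scooter': ['No', 'No', 'No', 'Maybe'],
                   'Misc': [32, 22, 11, 44],
                   'Day': ['Mon', 'Tues', 'Wed', 'Thu']})

# isin() tests every selected cell against the list of accepted values
check_isin = df.iloc[:, 1:-2].isin(['Yes', 'No']).any(axis=1)

# The two-comparison form from the answer, kept for comparison
check_or = ((df.iloc[:, 1:-2] == 'Yes') | (df.iloc[:, 1:-2] == 'No')).any(axis=1)

print(check_isin.tolist())          # [True, True, True, False]
print(check_isin.equals(check_or))  # True
```

The isin form tends to scale better when the list of accepted values grows, since adding a value means extending the list rather than chaining another comparison.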
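If your condition changes to the following:

If any of 'Car', 'Plane', 'Tank', 'Scooter' equals 'Yes', set 'Check' to True. In all other cases, set 'Check' to False.

Then the previous code can be simplified as follows:

df['Check'] = (df.iloc[:,1:-2] == 'Yes').any(axis=1)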
The output of this is:

   ID    Car  Plane   Tank Scooter  Misc   Day  Check
0   4    Yes     No    Yes      No    32   Mon   True
1   2     No     No     No      No    22  Tues  False
2   1    Yes     No     No      No    11   Wed   True
3   3  Maybe  Maybe  Maybe   Maybe    44   Thu  False
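If the Car, Plane, Tank, and Scooter columns are not adjacent in your DataFrame, you can put their names in a list and use that list for selecting and checking.

For example, if your DataFrame is built like this: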

df = pd.DataFrame({'ID':[4,2,1,3],
                   'Car':['Yes','No','Yes','Maybe'],
                   'Plane':['No','No','No','Maybe'],
                   'Misc':[32,22,11,44],
                   'Tank':['Yes','No','No','Maybe'],
                   'Day':['Mon','Tues','Wed','Thu'],
                   'Scooter':['No','No','No','Maybe']})
Then it looks like this:

   ID    Car  Plane  Misc   Tank   Day Scooter
0   4    Yes     No    32    Yes   Mon      No
1   2     No     No    22     No  Tues      No
2   1    Yes     No    11     No   Wed      No
3   3  Maybe  Maybe    44  Maybe   Thu   Maybe
You cannot use .iloc[:,1:-2] here. Instead, put the column names in a list and use that list as follows:

cols = ['Car','Plane','Tank','Scooter']

print(df[cols])

df['Check'] = (df[cols] == 'Yes').any(axis=1)
This gives you the same result as the iloc option discussed earlier.

The output will be:

   ID    Car  Plane  Misc   Tank   Day Scooter  Check
0   4    Yes     No    32    Yes   Mon      No   True
1   2     No     No    22     No  Tues      No  False
2   1    Yes     No    11     No   Wed      No   True
3   3  Maybe  Maybe    44  Maybe   Thu   Maybe  False
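The claim that selecting by a list of names matches the positional iloc selection can be checked with a short sketch. It assumes the same sample data as above; the names adjacent, scattered, by_position and by_name are introduced here only for illustration.

```python
import pandas as pd

# Layout with the four vehicle columns adjacent
adjacent = pd.DataFrame({'ID': [4, 2, 1, 3],
                         'Car': ['Yes', 'No', 'Yes', 'Maybe'],
                         'Plane': ['No', 'No', 'No', 'Maybe'],
                         'Tank': ['Yes', 'No', 'No', 'Maybe'],
                         'Scooter': ['No', 'No', 'No', 'Maybe'],
                         'Misc': [32, 22, 11, 44],
                         'Day': ['Mon', 'Tues', 'Wed', 'Thu']})

# Same data, with the columns reordered so the vehicle columns are scattered
scattered = adjacent[['ID', 'Car', 'Plane', 'Misc', 'Tank', 'Day', 'Scooter']]

cols = ['Car', 'Plane', 'Tank', 'Scooter']

# Positional selection only works on the adjacent layout
by_position = (adjacent.iloc[:, 1:-2] == 'Yes').any(axis=1)

# Selection by name works regardless of column order
by_name = (scattered[cols] == 'Yes').any(axis=1)

print(by_name.tolist())             # [True, False, True, False]
print(by_position.equals(by_name))  # True
```

Selecting by name is usually the safer default, because it keeps working if columns are later added or reordered.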

The following code should give True for every row in which any of the listed columns has the value 'Yes':

df['new col'] = df[['Car', 'Plane', 'Tank', 'Scooter']].apply(lambda x: any(x == 'Yes'), axis = 1)
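The row-wise apply above and a column-wise comparison should produce the same flags; the sketch below compares the two on the sample data used earlier in this thread. The names row_wise and vectorized are illustrative only, and no timing claims are made here.

```python
import pandas as pd

df = pd.DataFrame({'ID': [4, 2, 1, 3],
                   'Car': ['Yes', 'No', 'Yes', 'Maybe'],
                   'Plane': ['No', 'No', 'No', 'Maybe'],
                   'Tank': ['Yes', 'No', 'No', 'Maybe'],
                   'Scooter': ['No', 'No', 'No', 'Maybe'],
                   'Misc': [32, 22, 11, 44],
                   'Day': ['Mon', 'Tues', 'Wed', 'Thu']})

cols = ['Car', 'Plane', 'Tank', 'Scooter']

# Row-wise: the lambda is called once per row
row_wise = df[cols].apply(lambda row: any(row == 'Yes'), axis=1)

# Column-wise: compare the whole block at once, then reduce across each row
vectorized = df[cols].eq('Yes').any(axis=1)

print(row_wise.tolist())                         # [True, False, True, False]
print(row_wise.tolist() == vectorized.tolist())  # True
```

On large frames the column-wise form generally avoids the per-row Python call, which is the usual reason to prefer it.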

df['new_column'] = df[['Car', 'Plane', 'Tank', 'Scooter']].eq('Yes').any(axis=1)