Python: how to add a new column to a DataFrame that holds the names of the columns whose value is 1 in that row
My question concerns the "Location" column: how can I fill it with the names of the columns whose value in that row is 1? For example, for Japan/US, "Newyork, Osaka" should be printed under "Location"... Please suggest how to solve this in Python. Try the following:
import pandas as pd

DF = pd.DataFrame
S = pd.Series


def construct_df() -> DF:
    data = {
        "Countries": ["Japan/US", "Australia & NZ", "America & India"],
        "Portugal": [0, 0, 0],
        "Newyork": [1, 0, 1],
        "Delhi": [0, 0, 0],
        "Osaka": [1, 0, 1],
        "Bangalore": [0, 0, 0],
        "Sydney": [0, 0, 0],
        "Mexico": [0, 0, 0],
    }
    return pd.DataFrame(data)


def calc_locations(x: DF) -> S:
    # Treat every integer column as a location flag.
    x__location_cols_only = x.select_dtypes("integer")
    # Replace each 1 with its column name and each 0 with an empty string.
    x__ones_as_location_col_name = x__location_cols_only.apply(
        lambda ser: ser.replace({0: "", 1: ser.name})
    )
    location_cols = x__location_cols_only.columns.tolist()
    # Concatenate the columns pairwise, separated by commas.
    ret = x__ones_as_location_col_name[location_cols[0]]
    for colname in location_cols[1:]:
        col = x__ones_as_location_col_name[colname]
        ret = ret.str.cat(col, sep=",")
    # Collapse the comma runs left by the empty strings. regex=True is needed
    # because pandas 2.0+ treats the pattern as a literal string by default.
    ret = ret.str.replace(r",+", ",", regex=True).str.strip(",")
    return ret


df_final = construct_df().assign(Locations=calc_locations)
assert df_final["Locations"].tolist() == ["Newyork,Osaka", "", "Newyork,Osaka"]
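As a cross-check on the function above, here is a minimal sketch that computes the same strings row by row instead of column by column. It assumes that every numeric column is a 0/1 location flag; the helper name `locations_by_row` is hypothetical and the sample frame is a shortened copy of the data above.

```python
import pandas as pd


def locations_by_row(frame: pd.DataFrame) -> pd.Series:
    """Join the names of the numeric columns equal to 1 in each row."""
    # Assumption: every numeric column is a 0/1 location flag.
    flags = frame.select_dtypes(include="number")
    names = flags.columns
    # One comma-separated string per row; rows without a 1 yield "".
    joined = [
        ",".join(name for name, value in zip(names, row) if value == 1)
        for row in flags.to_numpy()
    ]
    return pd.Series(joined, index=frame.index)


df = pd.DataFrame(
    {
        "Countries": ["Japan/US", "Australia & NZ", "America & India"],
        "Portugal": [0, 0, 0],
        "Newyork": [1, 0, 1],
        "Delhi": [0, 0, 0],
        "Osaka": [1, 0, 1],
    }
)
df["Locations"] = locations_by_row(df)
print(df["Locations"].tolist())  # ['Newyork,Osaka', '', 'Newyork,Osaka']
```

Because the names are joined only where the value is 1, no comma clean-up step is needed afterwards.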
Suppose your data looks like this:
import pandas as pd
data = {'Countries': ['JP/US', 'Aus/NZ', 'America/India'],
'Portugal': [0, 0, 0],
'Newyork': [1, 0, 1],
'Delhi': [0, 0, 1],
'Osaka': [1, 0, 0],
'Sydney': [0, 0, 0],
'Mexico': [0, 0, 0],
}
data_df = pd.DataFrame(data)
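That gives the DF (please provide a dataset like the one above in your question, so that we can build the DF and give you a result).
If you then run the following statements: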
data_df = data_df.set_index('Countries')
data_df['Locations'] = data_df.apply(lambda x: ", ".join(x[x!=0].index.tolist()), axis=1)
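your output will look like this (the columns appear sorted alphabetically here; on Python 3.7+ with a current pandas they keep the dict's insertion order, so the last row would read "Newyork, Delhi"):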
               Delhi  Mexico  Newyork  Osaka  Portugal  Sydney       Locations
Countries
JP/US              0       0        1      1         0       0  Newyork, Osaka
Aus/NZ             0       0        0      0         0       0
America/India      1       0        1      0         0       0  Delhi, Newyork
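The `set_index` and `apply` statements from this answer can be put together into one runnable sketch. It reuses the answer's data; the one deliberate change, which is my assumption rather than the answer's code, is testing `row == 1` instead of `x != 0` so that it matches the question's wording.

```python
import pandas as pd

data = {
    "Countries": ["JP/US", "Aus/NZ", "America/India"],
    "Portugal": [0, 0, 0],
    "Newyork": [1, 0, 1],
    "Delhi": [0, 0, 1],
    "Osaka": [1, 0, 0],
    "Sydney": [0, 0, 0],
    "Mexico": [0, 0, 0],
}
# Move "Countries" into the index so that only the 0/1 columns remain.
data_df = pd.DataFrame(data).set_index("Countries")

# For each row, keep the labels of the cells equal to 1 and join them.
data_df["Locations"] = data_df.apply(
    lambda row: ", ".join(row[row == 1].index), axis=1
)
print(data_df["Locations"].tolist())
# ['Newyork, Osaka', '', 'Newyork, Delhi']
```

The names come out in column order, which on Python 3.7+ is the order of the keys in `data`.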
You can do:
data = {
"Countries": ["Japan/US", "Australia & NZ", "America & India"],
"Portugal": [0, 0, 0],
"Newyork": [1, 0, 1],
"Delhi": [0, 0, 0],
"Osaka": [1, 0, 1],
"Bangalore": [0, 0, 0],
"Sydney": [0, 0, 0],
"Mexico": [0, 0, 0],
}
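df = pd.DataFrame(data)
# Multiply each 0/1 flag by its column name: 1 keeps the name, 0 gives "".
df1 = df.iloc[:, 1:] * df.iloc[:, 1:].columns
# Each row becomes a list of column names and empty strings.
df['Location'] = df1.values.tolist()
# Drop the empty strings and join what is left with commas.
df['Location'] = df['Location'].apply(lambda x: ','.join([y for y in x if len(y) > 1]))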
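A related idiom replaces the element-wise multiplication with a matrix product through `DataFrame.dot`. This is a sketch of that alternative, not the answer's own code; it assumes every column after "Countries" holds only 0 or 1, and it uses a shortened copy of the data.

```python
import pandas as pd

data = {
    "Countries": ["Japan/US", "Australia & NZ", "America & India"],
    "Portugal": [0, 0, 0],
    "Newyork": [1, 0, 1],
    "Delhi": [0, 0, 0],
    "Osaka": [1, 0, 1],
}
df = pd.DataFrame(data)

# Assumption: every column after "Countries" is a 0/1 flag.
flags = df.iloc[:, 1:]
# 1 * "Newyork," keeps the name and 0 * "Delhi," gives "", so the row-wise
# dot product concatenates the names of the flagged columns.
df["Location"] = flags.dot(flags.columns + ",").str.rstrip(",")
print(df["Location"].tolist())
```

Appending the separator to each column name before the product means only one trailing comma has to be stripped per row.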
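Please add the data (not an image) to your question. As a new user you should read the site's guidance on asking questions, particularly where pandas is concerned. We often need to reproduce the problem, so we need copyable (text) data.
Thanks for the suggestion, Serge Ballesta.
Thanks for the solution, Ankit. I have two more questions. Q1: How can I automatically convert my existing data into the format shown here? data = {'Countries': ['JP/US', 'Aus/NZ', 'America/India'], 'Portugal': [0, 0, 0], 'Newyork': [1, 0, 1], 'Delhi': [0, 0, 1], 'Osaka': [1, 0, 0], 'Sydney': [0, 0, 0], 'Mexico': [0, 0, 0]} data_df = pd.DataFrame(data)
Thanks for the solution, Austin Ray.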