Python: updating a pandas column based on two other columns
I have a dataset of roughly 3000 rows and 10 columns, so this is a much simplified version:
df =
   asset  tail  more_info
0  x      a     this is a long text field that is right
1  x      b     this is a long text field that is almost right
2  y      a     this is right
3  y      b     this is probably not right
Desired result:
df =
   asset  tail  more_info
0  x      a     this is a long text field that is right
1  x      b     this is a long text field that is right
2  y      a     this is right
3  y      b     this is right
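So I'm trying to update my more_info field: for every row, it should take the more_info value from the row with the same asset whose tail equals 'a'.
In reality the dataset is more complex, so I need to do this programmatically, and that is where I'm drawing a blank on the logic: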
def my_func(x):
    if x.asset == x.asset and x.tail == 'b':
        '''
        this would be where I'd set it to x.more_info where tail = 'a' maybe numpy where ??
        '''
df['more_info'] = df['more_info'].apply(lambda x: my_func(x))
You can try the following:
# reduce the dataframe to contain only the 'asset' and 'tail' columns
df1 = df[['asset', 'tail']].copy()
# slice df to get only the ones you want to use for "more_info"
df2 = df[df['tail']=='a'][['asset', 'more_info']].copy()
df1.merge(df2, on=['asset'])
#   asset  tail  more_info
# 0  x      a     this is a long text field that is right
# 1  x      b     this is a long text field that is right
# 2  y      a     this is right
# 3  y      b     this is right
Or in a single line:
df[['asset', 'tail']].merge(df[df['tail']=='a'][['asset', 'more_info']], on=['asset'])
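For anyone who wants to run the merge end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question. The `how="left"` argument is an addition not present in the answer above: the default inner join silently drops any asset that has no 'a' row, whereas a left join keeps it with a missing `more_info`. The sketch assumes at most one 'a' row per asset; duplicates would multiply rows in the result.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "asset": ["x", "x", "y", "y"],
    "tail": ["a", "b", "a", "b"],
    "more_info": [
        "this is a long text field that is right",
        "this is a long text field that is almost right",
        "this is right",
        "this is probably not right",
    ],
})

# Rows with tail == 'a' carry the text we trust (assumed: one such row per asset).
source = df.loc[df["tail"] == "a", ["asset", "more_info"]]

# Left join so that assets without an 'a' row are kept (with NaN in more_info)
# instead of being dropped, which is what the default inner join would do.
result = df[["asset", "tail"]].merge(source, on="asset", how="left")
print(result)
```

Note that the merge returns a new frame; assign it back (for example `df = result`) if the original should be replaced.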
Filter out all the 'a' records, then join them back to the original data on 'asset'.
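The same filter-and-join idea can also be written without a merge, which may be convenient when the real frame has many other columns that should stay untouched. The sketch below is one possible variant, not the method from the answers above: it builds a lookup Series from the 'a' rows and maps it onto `asset`. The names `lookup` and `updated` are illustrative, and it assumes that when an asset has several 'a' rows the first one wins.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "asset": ["x", "x", "y", "y"],
    "tail": ["a", "b", "a", "b"],
    "more_info": [
        "this is a long text field that is right",
        "this is a long text field that is almost right",
        "this is right",
        "this is probably not right",
    ],
})

# Lookup table: asset -> more_info taken from the tail == 'a' row.
# drop_duplicates keeps the index unique (assumption: the first 'a' row wins).
lookup = (
    df.loc[df["tail"] == "a"]
    .drop_duplicates(subset="asset")
    .set_index("asset")["more_info"]
)

# Map every row's asset to the trusted text; where an asset has no 'a' row,
# fall back to the row's existing more_info value.
updated = df.copy()
updated["more_info"] = updated["asset"].map(lookup).fillna(updated["more_info"])
print(updated)
```

Because only the `more_info` column is reassigned, every other column and the original row order are left as they were.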