Python assignment efficiency
I have a Pandas DataFrame to which I add a new column (suggested). After adding the new column, I update it with new values based on the value of the QUERY column, using the following pattern:
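import numpy

QUERY = 'query'
SUGGESTED = 'suggested'
df[SUGGESTED] = numpy.nan  # create the new column, initially all NaN
s_query = 'de'
new_value = 'delaware'
# set the new value on the rows whose QUERY column matches
df.loc[(df[QUERY] == s_query), [SUGGESTED]] = new_value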
For example, before:
query suggested
al alabama
ca california
de NaN
After:
query suggested
al alabama
ca california
de delaware
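So far this seems to work, but I am not sure whether there is a more efficient way to do it in pandas.

I think you can first omit df[SUGGESTED] = numpy.nan in the loc and np.where solutions, because both of them add the new column themselves: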
QUERY = 'query'
SUGGESTED = 'suggested'
s_query = 'de'
new_value = 'delaware'
# if you need to update an existing column
df[SUGGESTED] = df[SUGGESTED].mask(df[QUERY] == s_query, new_value)
print (df)
query suggested
0 al alabama
1 ca california
2 de delaware
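In case you want to run the mask version end to end, here is a minimal self-contained sketch. The sample frame is an assumption, rebuilt from the "before" table in the question:

```python
import numpy as np
import pandas as pd

QUERY = 'query'
SUGGESTED = 'suggested'
s_query = 'de'
new_value = 'delaware'

# Assumed sample data, mirroring the "before" table in the question
df = pd.DataFrame({
    QUERY: ['al', 'ca', 'de'],
    SUGGESTED: ['alabama', 'california', np.nan],
})

# mask replaces values where the condition is True and keeps all others
df[SUGGESTED] = df[SUGGESTED].mask(df[QUERY] == s_query, new_value)
print(df)
```

Rows that do not match the query keep whatever value they already had.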
The loc solution can also be simplified: drop the () if there is only one condition, and drop the [] if there is only one column:
#for updating existing column
df.loc[df[QUERY] == s_query, SUGGESTED] = new_value
print (df)
query suggested
0 al alabama
1 ca california
2 de delaware
#same for creating new column
df.loc[df[QUERY] == s_query, SUGGESTED] = new_value
print (df)
query suggested
0 al NaN
1 ca NaN
2 de delaware
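Both loc cases can be reproduced with the sketch below. The two frames, existing and fresh, are hypothetical names with sample data assumed from the question:

```python
import numpy as np
import pandas as pd

QUERY = 'query'
SUGGESTED = 'suggested'
s_query = 'de'
new_value = 'delaware'

# Case 1 (assumed data): the column already exists and is partly filled
existing = pd.DataFrame({
    QUERY: ['al', 'ca', 'de'],
    SUGGESTED: ['alabama', 'california', np.nan],
})
existing.loc[existing[QUERY] == s_query, SUGGESTED] = new_value

# Case 2 (assumed data): the column does not exist yet; loc creates it
# and leaves NaN in the rows that do not match
fresh = pd.DataFrame({QUERY: ['al', 'ca', 'de']})
fresh.loc[fresh[QUERY] == s_query, SUGGESTED] = new_value

print(existing)
print(fresh)
```

The same statement covers both cases, which is why the explicit df[SUGGESTED] = numpy.nan initialisation is not needed.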
If you need NaN in the rows that do not match:
#same for creating and updating existing column
df[SUGGESTED] = np.where(df[QUERY] == s_query, new_value, np.nan)
print (df)
query suggested
0 al nan
1 ca nan
2 de delaware
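One caveat: the lowercase nan in the output above suggests that NumPy turned np.nan into the string 'nan' when combining it with a string value, which I believe is what happens here, so isna() would not flag those rows. If you need real missing values, one possible alternative (a sketch, not the only option) is Series.where on a column pre-filled with the new value. The sample frame is assumed:

```python
import pandas as pd

QUERY = 'query'
SUGGESTED = 'suggested'
s_query = 'de'
new_value = 'delaware'

# Assumed sample data: only the query column exists so far
df = pd.DataFrame({QUERY: ['al', 'ca', 'de']})
matches = df[QUERY] == s_query

# Series.where keeps values where the condition is True
# and puts a real missing value everywhere else
df[SUGGESTED] = pd.Series(new_value, index=df.index).where(matches)
print(df)
```

With this version df[SUGGESTED].isna() is True for the non-matching rows.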
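With np.where, is there an option to update the column with new_value for the rows that match the query, but do nothing for the rows that do not match? As you point out, the current example overwrites the other rows with np.nan.

I think you need df[SUGGESTED] = df[SUGGESTED].mask(df[QUERY] == s_query, new_value), as in the mask example in the answer above.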
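To sketch both options for this follow-up: mask leaves non-matching rows alone, and np.where can do the same if you pass the existing column as the third argument instead of np.nan. The sample frame and the names by_mask and by_where are assumptions for illustration:

```python
import numpy as np
import pandas as pd

QUERY = 'query'
SUGGESTED = 'suggested'
s_query = 'de'
new_value = 'delaware'

# Assumed sample data, mirroring the "before" table in the question
df = pd.DataFrame({
    QUERY: ['al', 'ca', 'de'],
    SUGGESTED: ['alabama', 'california', np.nan],
})
matches = df[QUERY] == s_query

# Option 1: mask replaces only the matching rows
by_mask = df[SUGGESTED].mask(matches, new_value)

# Option 2: np.where falls back to the existing column when there is no match
by_where = pd.Series(
    np.where(matches, new_value, df[SUGGESTED]),
    index=df.index,
)

print(by_mask.tolist())
print(by_where.tolist())
```

Either result can be assigned back to df[SUGGESTED]; mask has the advantage of returning a Series directly.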