Python pandas - appending strings in a DataFrame: ValueError: cannot reindex from a duplicate axis
I have a DataFrame, ff, with an OldCol column of color names and a NewCol column that still holds the string 'nan' (the real one is longer and has more color names). I would like the DataFrame to look like this:
ffnew = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['brown','beige','beige / brown','sand','brown','sand / brown']})
I tried the following:
ff.loc[ff['OldCol'].str.contains(r'brown|cognac',na=False) & ff['NewCol'].str.contains(r'nan'), 'NewCol'] = 'brown'
ff.loc[ff['OldCol'].str.contains(r'brown|cognac',na=False) & ~ff['NewCol'].str.contains(r'nan|brown'), 'NewCol'] = ff['NewCol']+'/ brown'
ff.loc[ff['OldCol'].str.contains(r'beige|sand',na=False) & ff['NewCol'].str.contains(r'nan'), 'NewCol'] = 'beige'
ff.loc[ff['OldCol'].str.contains(r'beige|sand',na=False) & ~ff['NewCol'].str.contains(r'nan|beige'), 'NewCol'] = ff['NewCol'] +'/ beige'
On the longer DataFrame, I usually get the following error:
ValueError: cannot reindex from a duplicate axis
Can anyone help?
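Thank you very much!

The problem is duplicate values in the index. You can replace all index values with a regular index (0, 1, 2, ..., len(df)-1) by calling reset_index. The old index values are discarded by passing drop=True: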
ff.reset_index(drop=True, inplace=True)
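Before resetting, it can help to confirm that the index really contains duplicates. The following is a minimal sketch on a hypothetical three-row frame named df (not the asker's data); it only uses Index.is_unique, Index.duplicated and reset_index.

```python
import pandas as pd

# Hypothetical small frame whose index label 0 appears twice
df = pd.DataFrame({'OldCol': ['darkbrown', 'lightbeige', 'beige']},
                  index=[0, 0, 2])

print(df.index.is_unique)           # False
print(df.index.duplicated().sum())  # 1 (one label repeats an earlier one)

# Replace the index with 0..len(df)-1 and discard the old labels
df = df.reset_index(drop=True)

print(df.index.is_unique)           # True
print(list(df.index))               # [0, 1, 2]
```

If the old index labels carry information you still need, calling reset_index() without drop=True keeps them as a regular column instead of discarding them.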
Test:
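import pandas as pd

ff = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['nan','nan','nan','nan','nan','nan']})
ffnew = pd.DataFrame({'OldCol':['darkbrown','lightbeige','lightbrown / beige','beige','brown','beige / cognac'], 'NewCol':['brown','beige','beige / brown','sand','brown','sand / brown']})
# Give the frame a duplicated index label (0 appears twice) to reproduce the problem
ff.index = [0,0,2,3,4,5]
# With this index, the .loc assignments from the question fail with:
#ValueError: cannot reindex from a duplicate axis
# Rebuild the index as 0..len(ff)-1 so that every label is unique
ff.reset_index(drop=True, inplace=True)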
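To round off the test, here is a sketch of the whole flow on the sample data: duplicate the index, reset it, then run the four assignments from the question unchanged. The variable names brown and beige for the masks are my own shorthand. Note that the result follows the question's rules as written, so it is not identical to ffnew (for example, the separator comes out as '/ ' without a leading space); the point is only that the assignments no longer raise once the index is unique.

```python
import pandas as pd

ff = pd.DataFrame({
    'OldCol': ['darkbrown', 'lightbeige', 'lightbrown / beige',
               'beige', 'brown', 'beige / cognac'],
    'NewCol': ['nan'] * 6,   # the string 'nan', not a missing value
})

# Reproduce the duplicated index, then repair it
ff.index = [0, 0, 2, 3, 4, 5]
ff.reset_index(drop=True, inplace=True)

# Masks on OldCol (names are shorthand introduced for this sketch)
brown = ff['OldCol'].str.contains(r'brown|cognac', na=False)
beige = ff['OldCol'].str.contains(r'beige|sand', na=False)

# The four assignments from the question, in the same order
ff.loc[brown & ff['NewCol'].str.contains(r'nan'), 'NewCol'] = 'brown'
ff.loc[brown & ~ff['NewCol'].str.contains(r'nan|brown'), 'NewCol'] = ff['NewCol'] + '/ brown'
ff.loc[beige & ff['NewCol'].str.contains(r'nan'), 'NewCol'] = 'beige'
ff.loc[beige & ~ff['NewCol'].str.contains(r'nan|beige'), 'NewCol'] = ff['NewCol'] + '/ beige'

print(ff['NewCol'].tolist())
# ['brown', 'beige', 'brown/ beige', 'beige', 'brown', 'brown/ beige']
```

The right-hand sides such as ff['NewCol'] + '/ brown' are whole Series, so pandas has to align them with the selected rows by index label. That alignment is the step that needs a unique index.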