Python: extracting multiple columns
Using pandas, I am adding new columns to a DataFrame:
df["Year"] = df["concat"].str.extract(r"(\d\d\d\d$)", expand=False)
df["Month"] = df["concat"].str.extract(r"(\d\d)\_\d\d\d\d$", expand=False)
df["Measure"] = df["concat"].str.extract(r"^(.*)\_\d\d\_\d\d\d\d$", expand=False)
This works, but it is slow. I am thinking of doing all three operations in a single step (hoping that this improves performance):
df["Measure", "Year", "Month"] = df["concat"].str.extract(r"^(?P<Measure>.*)\_(?P<Month>\d\d)\_(?P<Year>\d\d\d\d)$", expand=True)
But this does not work (ValueError: Wrong number of items passed 3, placement implies 1).
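How can I make this work, or how can I extract this information efficiently?

You are passing three separate values, "Measure", "Year" and "Month", into the df reference as df["Measure", "Year", "Month"], rather than a single list ["Measure", "Year", "Month"]. It should look like df[["Measure", "Year", "Month"]].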
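As a minimal sketch of the double-bracket form, assuming a small made-up "concat" column (the sample values are hypothetical, since the question does not show the real data): the extracted frame is assigned to a list of column names. In the pandas versions I am aware of, this assignment matches columns by position, so the list here follows the order of the capture groups in the pattern.

```python
import pandas as pd

# Hypothetical sample data; the real "concat" values are not shown in the question.
df = pd.DataFrame({"concat": ["sales_01_2017", "net_income_12_2016"]})

# One pattern with three named groups: Measure, Month, Year.
pattern = r"^(?P<Measure>.*)_(?P<Month>\d\d)_(?P<Year>\d\d\d\d)$"
extracted = df["concat"].str.extract(pattern, expand=True)

# Note the double brackets: the key is a single list of column names.
# The names are listed in the same order as the capture groups.
df[["Measure", "Month", "Year"]] = extracted

print(df)
```

The greedy `.*` still leaves the trailing `_MM_YYYY` part to the other two groups, so a measure name that itself contains underscores is kept whole.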
Alternatively, you can use the concat function:
df2 = df["concat"].str.extract(r"^(?P<Measure>.*)\_(?P<Month>\d\d)\_(?P<Year>\d\d\d\d)$", expand=True)
df = pd.concat([df, df2], axis=1)
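Whether the single-pass version is actually faster is best measured on your own data. The sketch below is one way to check, assuming made-up sample rows and two hypothetical helper functions (`three_passes` and `one_pass`, neither of which comes from the question). It confirms that both approaches yield the same three columns and then times each with `timeit`.

```python
import timeit

import pandas as pd

# Hypothetical sample data, repeated to get a column large enough to time.
df = pd.DataFrame({"concat": ["sales_01_2017", "net_income_12_2016"] * 5000})


def three_passes(frame):
    """Original approach: one str.extract call per new column."""
    out = frame.copy()
    out["Year"] = out["concat"].str.extract(r"(\d\d\d\d$)", expand=False)
    out["Month"] = out["concat"].str.extract(r"(\d\d)_\d\d\d\d$", expand=False)
    out["Measure"] = out["concat"].str.extract(r"^(.*)_\d\d_\d\d\d\d$", expand=False)
    return out


def one_pass(frame):
    """Single str.extract call with named groups, joined back with concat."""
    parts = frame["concat"].str.extract(
        r"^(?P<Measure>.*)_(?P<Month>\d\d)_(?P<Year>\d\d\d\d)$", expand=True
    )
    return pd.concat([frame, parts], axis=1)


a = three_passes(df)
b = one_pass(df)

# Compare the new columns in a fixed order, since the two helpers add them in different orders.
cols = ["Measure", "Month", "Year"]
same = a[cols].equals(b[cols])
print("same result:", same)

# Timings depend on the machine, the pandas version and the data.
print("three passes:", timeit.timeit(lambda: three_passes(df), number=5))
print("one pass:", timeit.timeit(lambda: one_pass(df), number=5))
```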