Regex: How do I split a DataFrame column using pandas?
I want to process a DataFrame, DF, whose Name column holds values such as "Hat, Richards" or "Adams". The DataFrame I expect should look like this:
Name        Last_name   City
Hat         Richards    Paris
            Adams       New york
Tim         Mathews     Sanfrancisco
chris       Moya De     Las Vegas
kate        Moris       Atlanta
            Grisham HA  Middleton
James, Tom  greval      Rome
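The split should happen on the last ", ". If there is no ",", the whole word or phrase should go into the Last_name column and the Name column should stay blank.

Use radd to add ", " in front of each value, then str.rsplit with n=1 (n=-1 is the default and splits on every separator; change it as needed), and finally str.lstrip. This is the approach used in the df[['first','last']] snippet further down.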
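Since the question is tagged regex, the same last-comma rule can also be sketched with a regular expression. This is an illustrative alternative, not one of the answers in the thread: the input frame is rebuilt from the Name and City values printed in this thread, and the names `pattern`, `parts` and `result` are assumptions made for the example.

```python
import pandas as pd

# Input rebuilt from the Name and City values shown in this thread.
df = pd.DataFrame({
    "Name": ["Hat, Richards", "Adams", "Tim, Mathews", "chris, Moya De",
             "kate, Moris", "Grisham HA", "James, Tom, greval"],
    "City": ["Paris", "New york", "Sanfrancisco", "Las Vegas",
             "Atlanta", "Middleton", "Rome"],
})

# An optional greedy prefix followed by ", ", then the remainder.
# Because the prefix is greedy, it stops at the LAST ", " in the value.
pattern = r"^(?:(?P<Name>.*), )?(?P<Last_name>.*)$"

# Rows without a separator leave the Name group unmatched (NaN),
# which is replaced by an empty string.
parts = df["Name"].str.extract(pattern).fillna("")

result = pd.concat([parts, df[["City"]]], axis=1)
print(result)
```

The named groups become the column names of the extracted frame, so no renaming step is needed afterwards.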
Quick and dirty.
Use pandas str.split and str[::-1] to reverse the order:
# Reverse each split so the last name comes first; assumes at most one ', ' per row.
df[['Last_name', 'Name']] = df.Name.str.split(', ').str[::-1].apply(pd.Series)
df
    Name          City   Last_name
0    Hat         Paris    Richards
1    NaN      New york       Adams
2    Tim  Sanfrancisco     Mathews
3  chris     Las Vegas     Moya De
4   kate       Atlanta       Moris
5    NaN     Middleton  Grisham HA
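The reversed-split idea produces three parts for a value with two separators, such as "James, Tom, greval", so it no longer fits two target columns. A minimal sketch of one way around that, assuming the last-comma rule from the question: split with str.rsplit and n=1 so every row has at most two parts, then build the frame from plain lists. The names `names`, `reversed_parts` and `split_df` are made up for this example.

```python
import pandas as pd

names = pd.Series(["Hat, Richards", "Adams", "James, Tom, greval"])

# rsplit with n=1 cuts only at the last ", ", so each row has 1 or 2 parts.
# Reversing puts the last name first in every list.
reversed_parts = names.str.rsplit(", ", n=1).str[::-1]

# Building the frame from plain lists avoids the per-row apply(pd.Series);
# rows with a single part are padded with a missing value, replaced by "".
split_df = pd.DataFrame(reversed_parts.tolist(), index=names.index,
                        columns=["Last_name", "Name"]).fillna("")
print(split_df)
```

Blank names come out as empty strings here rather than NaN, which matches the blank Name cells in the expected output of the question.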
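# Split on the first ', ' only; rows without a separator get None in column 1,
# which ffill(axis=1) then fills from column 0.
newdf = df.Name.str.split(', ', expand=True, n=1).ffill(axis=1)
# Where both columns are equal there was no separator, so blank the first name.
newdf.loc[newdf[0] == newdf[1], 0] = ''
newdf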
Out[923]:
       0          1
0    Hat   Richards
1             Adams
2    Tim    Mathews
3  chris     MoyaDe
4   kate      Moris
5         GrishamHA
df[['Name','LastName']]=newdf
df
Out[925]:
    Name          City   LastName
0    Hat         Paris   Richards
1              Newyork      Adams
2    Tim  Sanfrancisco    Mathews
3  chris      LasVegas     MoyaDe
4   kate       Atlanta      Moris
5            Middleton  GrishamHA
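# Prepend ', ' so every row contains a separator, then split on the last one.
df[['first','last']] = df['Name'].radd(', ').str.rsplit(', ', n=1, expand=True)
# Strip the leading ', ' characters left over in the first-name part.
df['first'] = df['first'].str.lstrip(', ')
print(df)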
                 Name          City       first        last
0       Hat, Richards         Paris         Hat    Richards
1               Adams      New york                   Adams
2        Tim, Mathews  Sanfrancisco         Tim     Mathews
3      chris, Moya De     Las Vegas       chris     Moya De
4         kate, Moris       Atlanta        kate       Moris
5          Grisham HA     Middleton              Grisham HA
6  James, Tom, greval          Rome  James, Tom      greval
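As a possible variation on the radd/rsplit/lstrip sequence, str.rpartition splits on the last occurrence of a separator and always returns three parts, so no separator has to be prepended first. This is a sketch of that alternative rather than something proposed in the thread; `names`, `pieces` and `out` are names chosen for the example.

```python
import pandas as pd

names = pd.Series(["Hat, Richards", "Adams", "James, Tom, greval"])

# Column 0 = text before the last ", ", column 1 = the separator itself,
# column 2 = text after it. When the separator is absent, columns 0 and 1
# are empty strings and column 2 holds the whole value.
pieces = names.str.rpartition(", ")

out = pd.DataFrame({"Name": pieces[0], "Last_name": pieces[2]})
print(out)
```

Because a missing separator already yields an empty first part, the blank Name cells fall out of the split itself with no clean-up step.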