Pandas: joining on multiple columns from another dataframe
I have two dataframes, df and ndf, that need to be joined on two columns. This is different from the usual 1:1 join.
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3
myst="""india / gujarat, 22905034 , 19:44
india / kerala, 1905094 , 19:33
india / madhya pradesh, 905154 , 21:56
"""
u_cols = ['country_state', 'index', 'current_tm']
import pandas as pd
df = pd.read_csv(StringIO(myst), sep=',', names=u_cols)
myst="""india , Gujrat, high
india , KERALA , high
india , madhya pradesh, low
india, bihar, low
"""
u_cols = ['country', 'state', 'progress']
ndf = pd.read_csv(StringIO(myst), sep=',', names=u_cols)
The expected result looks like this:
country  state           progress  index     current_tm
india    Gujrat          high      22905034  19:44
india    KERALA          high      1905094   19:33
india    madhya pradesh  low       905154    21:56
india    bihar           low
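This dataframe is supplied by the end user and may contain invalid formats such as india/abc/xyz.
Is there a way to join one column against multiple columns?
Update:
This is very close to what I am trying to achieve: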
df=df.join(df['branch_name'].str.split('/', expand=True))
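Is there any way to extend this so that it splits into only two columns? For example, if the string is a/b/c, a should end up in one column and b/c in the other.

Use the state part of country_state and a normalized ndf.state as the merge keys: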
In [232]: dfs = df.country_state.str.split(' / ').str[1]
In [233]: ndfs = ndf.state.str.lower().str.strip()
In [234]: pd.merge(df, ndf, left_on=dfs, right_on=ndfs,
                   how='right').drop(columns='country_state')
Out[234]:
       index current_tm country           state progress
0  1905094.0      19:33   india          KERALA     high
1   905154.0      21:56   india  madhya pradesh      low
2        NaN        NaN   india          Gujrat     high
3        NaN        NaN   india           bihar      low
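
For the two-column split asked about in the update, one possible approach is to pass n=1 to str.split, so that only the first '/' is used as a separator, and then merge on both the country and the state. The following is a minimal sketch under that assumption, not a definitive solution; the helper normalize and the columns country_key and state_key are names made up for this illustration. Because the sample data spells the state as "gujarat" in df and "Gujrat" in ndf, that row stays unmatched here, just as in the output above.

```python
from io import StringIO

import pandas as pd

left_csv = """india / gujarat, 22905034 , 19:44
india / kerala, 1905094 , 19:33
india / madhya pradesh, 905154 , 21:56
"""
right_csv = """india , Gujrat, high
india , KERALA , high
india , madhya pradesh, low
india, bihar, low
"""

df = pd.read_csv(StringIO(left_csv), names=['country_state', 'index', 'current_tm'])
ndf = pd.read_csv(StringIO(right_csv), names=['country', 'state', 'progress'])


def normalize(series):
    """Trim and lower-case a text column so it can serve as a join key (illustrative helper)."""
    return series.str.strip().str.lower()


# n=1 splits on the first '/' only, so 'a/b/c' yields 'a' and 'b/c'.
demo = pd.Series(['a/b/c']).str.split('/', n=1, expand=True)

# Split country_state into two parts. This assumes at least one row contains
# a '/', otherwise column 1 would not exist in the expanded result.
parts = df['country_state'].str.split('/', n=1, expand=True)
left = df.assign(country_key=normalize(parts[0]), state_key=normalize(parts[1]))
right = ndf.assign(country_key=normalize(ndf['country']),
                   state_key=normalize(ndf['state']))

# Start from ndf so that every one of its rows is kept, matched or not,
# and join on both keys at once.
merged = right.merge(left, on=['country_key', 'state_key'], how='left')
merged = merged[['country', 'state', 'progress', 'index', 'current_tm']]
print(merged)
```

Merging on a list of two keys means a row only matches when both the country and the state agree. A malformed value such as india/abc/xyz would produce the state key 'abc/xyz', which simply finds no partner in ndf instead of raising an error.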