Python: print combined string columns without creating a new column?
I'm using pandas 0.18. I have a DataFrame that looks like this:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'ods': {0: 'A86016', 1: 'L81042', 2: 'C84013', 3: 'G82228', 4: 'C81083'},
...                    'id': {0: np.nan, 1: 463061.0, 2: np.nan, 3: 462941.0, 4: np.nan},
...                    'provider': {0: 'emis', 1: np.nan, 2: 'tpp', 3: 'emis', 4: 'tpp'}})
>>> print(df)
         id     ods provider
0       NaN  A86016     emis
1  463061.0  L81042      NaN
2       NaN  C84013      tpp
3  462941.0  G82228     emis
4       NaN  C81083      tpp
I'd like to output a table formatted like this:
id (ods) provider
1 (A86016) emis
2 (L81042) NaN
3 (C84013) tpp
Is there a simple way to do this without creating a new column? I know I can do it like this:
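# id is a float column, so cast it with astype(str); the .str accessor cannot be concatenated
df['newcol'] = df['id'].astype(str) + " (" + df['ods'] + ")"
print(df[['newcol', 'provider']])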
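But I'm just wondering whether I can skip the intermediate step of creating a new column.

One possible solution is to create a new DataFrame from two Series: one that concatenates the id and ods columns, and another that is the provider column: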
print(pd.DataFrame({'id (ods)': df.id.astype(str) + " (" + df.ods + ")",
                    'provider': df.provider}))
            id (ods) provider
0       nan (A86016)     emis
1  463061.0 (L81042)      NaN
2       nan (C84013)      tpp
3  462941.0 (G82228)     emis
4       nan (C81083)      tpp
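For readers on Python 3, the same idea can be sketched as a small self-contained script. The helper name `combined_view` is hypothetical and is not part of pandas; it only wraps the DataFrame construction shown above, so the original `df` is left without an extra column.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'ods': ['A86016', 'L81042', 'C84013', 'G82228', 'C81083'],
    'id': [np.nan, 463061.0, np.nan, 462941.0, np.nan],
    'provider': ['emis', np.nan, 'tpp', 'emis', 'tpp'],
})


def combined_view(frame):
    """Hypothetical helper: build a throwaway frame for display only."""
    return pd.DataFrame({
        # Cast the float id column to str before concatenating with ods.
        'id (ods)': frame['id'].astype(str) + ' (' + frame['ods'] + ')',
        'provider': frame['provider'],
    })


view = combined_view(df)
print(view)
```

Because the combined Series is built on the fly and passed straight into a new DataFrame, nothing is written back to `df`.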
You can try this:
df = df.assign(id_obs=df['id'].astype(str) + ' (' + df['ods'] + ')').drop(['id','ods'], axis=1)
Timing on a 10K-row DF:
In [132]: %timeit pd.DataFrame({'id (ods)':df.id.astype(str) + " (" + df.ods + ")", 'provider': df.provider})
1 loop, best of 3: 734 ms per loop
In [133]: %timeit df.assign(id_obs=df['id'].astype(str) + ' (' + df['ods'] + ')').drop(['id','ods'], axis=1)
1 loop, best of 3: 758 ms per loop
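If the goal is only to print the table, the `assign`/`drop` chain does not have to be assigned back to `df`. The sketch below is one possible variation, not the answer's exact code: it assumes Python 3, passes the column name through `**{...}` so that it can contain a space and parentheses, and reorders the columns at the end. The variable name `shown` is purely illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'ods': ['A86016', 'L81042', 'C84013', 'G82228', 'C81083'],
    'id': [np.nan, 463061.0, np.nan, 462941.0, np.nan],
    'provider': ['emis', np.nan, 'tpp', 'emis', 'tpp'],
})

# assign() returns a new frame; df itself is never modified.
shown = (
    df.assign(**{'id (ods)': df['id'].astype(str) + ' (' + df['ods'] + ')'})
      .drop(['id', 'ods'], axis=1)
      [['id (ods)', 'provider']]  # put the combined column first
)
print(shown)
```

Keyword unpacking is used here only because `id (ods)` is not a valid Python identifier and so cannot be written as a plain keyword argument.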