Python: change names in a pandas column to start with uppercase letters


Background

I have a toy df:

import pandas as pd
df = pd.DataFrame({'Text': ['Jon J Mmith is Here',
                            'Mary Lisa Hder found here',
                            'Jane A Doe is also here',
                            'Tom T Tcker is here too'],
                   'P_ID': [1, 2, 3, 4],
                   'P_Name': ['MMITH, JON J', 'HDER, MARY LISA', 'DOE, JANE A', 'TCKER, TOM T'],
                   'N_ID': ['A1', 'A2', 'A3', 'A4']})

# rearrange columns
df = df[['Text', 'N_ID', 'P_ID', 'P_Name']]
df

                        Text N_ID  P_ID           P_Name
0        Jon J Mmith is Here   A1     1     MMITH, JON J
1  Mary Lisa Hder found here   A2     2  HDER, MARY LISA
2    Jane A Doe is also here   A3     3      DOE, JANE A
3    Tom T Tcker is here too   A4     4     TCKER, TOM T

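Goal

1) Change the P_Name column of df to the format shown in the desired output; that is, change the current format (e.g. MMITH, JON J) to one (e.g. Mmith, Jon J) in which the first name, last name, and middle initial each start with an uppercase letter

2) Store the result in a new column, P_Name_New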

Desired output

                        Text N_ID  P_ID           P_Name       P_Name_New
0        Jon J Mmith is Here   A1     1     MMITH, JON J     Mmith, Jon J
1  Mary Lisa Hder found here   A2     2  HDER, MARY LISA  Hder, Mary Lisa
2    Jane A Doe is also here   A3     3      DOE, JANE A      Doe, Jane A
3    Tom T Tcker is here too   A4     4     TCKER, TOM T     Tcker, Tom T

Question

How do I achieve this?

Just use the str.title() function:

In [98]: df['P_Name_New'] = df['P_Name'].str.title()                                                                            

In [99]: df                                                                                                                     
Out[99]: 
                         Text N_ID  P_ID            P_Name        P_Name_New
0         Jon J Smith is Here   A1     1      SMITH, JON J      Smith, Jon J
1  Mary Lisa Rider found here   A2     2  RIDER, MARY LISA  Rider, Mary Lisa
2     Jane A Doe is also here   A3     3       DOE, JANE A       Doe, Jane A
3    Tom T Tucker is here too   A4     4     TUCKER, TOM T     Tucker, Tom T
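The session output above shows different surnames from the toy df in the question, so here is a minimal, self-contained sketch that applies the same call to the question's own P_Name values. It assumes every entry is an ordinary "LAST, FIRST MIDDLE" string. Note that title-casing lowercases everything after the first letter of each word, so a name such as MCDONALD would come out as Mcdonald.

```python
import pandas as pd

# Same P_Name values as the toy df in the question.
df = pd.DataFrame({
    'P_ID': [1, 2, 3, 4],
    'P_Name': ['MMITH, JON J', 'HDER, MARY LISA', 'DOE, JANE A', 'TCKER, TOM T'],
})

# .str.title() upper-cases the first letter of each word and
# lower-cases the rest; the original column is left unchanged.
df['P_Name_New'] = df['P_Name'].str.title()

print(df[['P_Name', 'P_Name_New']])
```

Because the result is assigned to a new column, P_Name keeps its original all-caps values, which matches goal 2 in the question.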


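Is there any performance or other difference compared with doing something like apply(lambda x: x.title())? Thanks, @patrick.

%timeit df['P_Name'].str.title()
110 µs ± 1.99 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit df['P_Name'].apply(lambda x: x.title())
146 µs ± 483 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)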
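On the "other difference" part of the comment, one that may matter in practice is missing values. The sketch below is an illustration only, not something from the thread: the Series contents, the isinstance guard, and the variable names are all assumptions. It shows that .str.title() leaves a missing entry missing, while a bare lambda fails on it, and then times both calls with the standard timeit module on a larger made-up Series. Timings depend on the machine and pandas version, so no numbers are claimed here.

```python
import timeit

import pandas as pd

# Hypothetical data: one entry is missing.
names = pd.Series(['MMITH, JON J', None, 'DOE, JANE A'])

# The .str accessor skips the missing value and keeps it missing.
via_str = names.str.title()

# A bare lambda calls .title() on the missing value as well, which fails.
try:
    names.apply(lambda x: x.title())
    apply_failed = False
except AttributeError:
    apply_failed = True

# One possible guard (an assumption, not from the thread):
# only title-case real strings and pass everything else through.
via_apply = names.apply(lambda x: x.title() if isinstance(x, str) else x)

# Rough timing on a larger made-up Series.
big = pd.Series(['TCKER, TOM T'] * 10_000)
t_str = timeit.timeit(lambda: big.str.title(), number=20)
t_apply = timeit.timeit(lambda: big.apply(lambda x: x.title()), number=20)
print(f"str.title: {t_str:.4f}s  apply: {t_apply:.4f}s  apply failed on None: {apply_failed}")
```

If the column can contain missing values, the .str accessor is the less surprising choice, independent of which call turns out faster.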
