How to add certain characters to the beginning and end of a column's values in a Python DataFrame
Tags: python, regex, pandas, replace
I have a DataFrame that looks like this:
clients_x clients_y coords_x coords_y
7110001002 7100019838 -23.63013,-46.704887 -23.657433,-46.744095
7110001002 7100021875 -23.63013,-46.704887 -23.7729,-46.591366
7110001002 0700245857 -23.63013,-46.704887 -23.7074,-46.5698
[7110052941, 7110107795] 7100019838 -23.609,-46.6974 -23.657433,-46.744095
[7110052941, 7110107795] 7100021875 -23.609,-46.6974 -23.7729,-46.591366
[7110052941, 7110107795] 0700245857 -23.609,-46.6974 -23.7074,-46.569
What I want is for every value in the clients_x column to start with "[" and end with "]". My expected output is:
clients_x clients_y coords_x coords_y
[7110001002] 7100019838 -23.63013,-46.704887 -23.657433,-46.744095
[7110001002] 7100021875 -23.63013,-46.704887 -23.7729,-46.591366
[7110001002] 0700245857 -23.63013,-46.704887 -23.7074,-46.5698
[7110052941, 7110107795] 7100019838 -23.609,-46.6974 -23.657433,-46.744095
[7110052941, 7110107795] 7100021875 -23.609,-46.6974 -23.7729,-46.591366
[7110052941, 7110107795] 0700245857 -23.609,-46.6974 -23.7074,-46.569
First, I tried this:
df["clients_x"] = "[" + df["clients_x"] + "]"
This does add the brackets to every value, but rows that already have them end up with doubled brackets. The output looks like this:
clients_x clients_y coords_x coords_y
[7110001002] 7100019838 -23.63013,-46.704887 -23.657433,-46.744095
[7110001002] 7100021875 -23.63013,-46.704887 -23.7729,-46.591366
[7110001002] 0700245857 -23.63013,-46.704887 -23.7074,-46.5698
[[7110052941, 7110107795]] 7100019838 -23.609,-46.6974 -23.657433,-46.744095
[[7110052941, 7110107795]] 7100021875 -23.609,-46.6974 -23.7729,-46.591366
[[7110052941, 7110107795]] 0700245857 -23.609,-46.6974 -23.7074,-46.569
To avoid this, I tried the following code. The idea is to add "[" and "]" only to values in the clients_x column that start with a digit:
df['clients_x'] = df['clients_x'].mask(df['clients_x'].astype(str).str.startswith(r'^\d'), f'[{df.clients_x}]')
However, this line produces output identical to my original DataFrame. If anyone knows how to solve this, I would really appreciate the help.
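One plausible reason the attempt above changes nothing: Series.str.startswith compares against a literal prefix and does not accept regular expressions, so r'^\d' never matches, and the f-string formats the whole Series rather than each value. The following is a minimal sketch of the "wrap only values that start with a digit" idea, assuming clients_x holds strings; the two-row sample frame and the variable names are illustrative and not from the original post.

```python
import pandas as pd

# Hypothetical sample mirroring the clients_x column from the question
# (assumption: the values are stored as strings).
df = pd.DataFrame({
    "clients_x": ["7110001002", "[7110052941, 7110107795]"],
})

# str.startswith treats its argument as a literal prefix, so the
# regex-looking pattern r"^\d" matches nothing.
literal_check = df["clients_x"].str.startswith(r"^\d")

# str.match applies a regular expression anchored at the start of each value.
starts_with_digit = df["clients_x"].str.match(r"\d")

# Build the replacement element-wise instead of formatting the whole Series.
wrapped = "[" + df["clients_x"] + "]"

# mask replaces the values where the condition is True.
df["clients_x"] = df["clients_x"].mask(starts_with_digit, wrapped)
print(df["clients_x"].tolist())
```

With mask, the replacement is applied where the condition holds, so the condition here has to describe the rows that still need brackets.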
Use:
Output:
0 [7110001002]
1 [7110001002]
2 [7110001002]
3 [7110052941,7110107795]
4 [7110052941,7110107795]
5 [7110052941,7110107795]
Name: clients_x, dtype: object
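The code for this answer did not survive in this copy, so the following is only a sketch of one way to get output shaped like the listing above; it is not the answerer's original code. It assumes string values, strips any existing brackets and whitespace, and then wraps every value exactly once. The sample Series and the names stripped and normalized are illustrative.

```python
import pandas as pd

# Hypothetical sample values taken from the question's clients_x column.
s = pd.Series(
    ["7110001002", "[7110052941, 7110107795]"],
    name="clients_x",
)

# Remove existing brackets and whitespace so no value can end up
# with doubled brackets.
stripped = s.astype(str).str.replace(r"[\[\]\s]", "", regex=True)

# Wrap every value exactly once.
normalized = "[" + stripped + "]"
print(normalized)
```

Because every value is normalized first, this variant needs no condition, at the cost of also removing the space after each comma.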
You need to use where, not mask:
df["clients_x"] = df.clients_x.where(
    df.clients_x.astype(str).str.startswith("["),
    "[" + df.clients_x + "]"
)
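To make the difference between where and mask concrete, here is a small self-contained sketch, assuming clients_x holds strings. The sample frame and the names already_wrapped, via_where and via_mask are illustrative. where keeps values for which the condition is True and replaces the rest; mask does the opposite, so negating the condition gives the same result.

```python
import pandas as pd

# Hypothetical two-row sample based on the question's data.
df = pd.DataFrame({
    "clients_x": ["7110001002", "[7110052941, 7110107795]"],
    "clients_y": ["7100019838", "7100021875"],
})

# True for rows that already start with a bracket.
already_wrapped = df["clients_x"].astype(str).str.startswith("[")
wrapped = "[" + df["clients_x"] + "]"

# where: keep the value when the condition is True, otherwise use `wrapped`.
via_where = df["clients_x"].where(already_wrapped, wrapped)

# mask: replace the value when the condition is True, so negate the condition.
via_mask = df["clients_x"].mask(~already_wrapped, wrapped)

df["clients_x"] = via_where
print(df)
```

Either form works; choosing where lets the condition read as "rows that are already correct".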