Python: split a DataFrame column and remove the extra characters
Can you help me split a DataFrame like the one shown below?
metric
{"{'test'}.ab.ov.{'prd'}.gcp.{'392032'}_{rtw}":0.3232,"{'test'}.abv.ov.{'prd'}.gcp.{'392032'}_{rtw}.services.{'API'}":0.3243}
{"{'test'}.aew.dsc.{'prd'}.gcp.{'8329'}_{abc}":0.3232,"{'test'}.aew.dsc.{'prd'}.gcp.{'8329'}_{abc}.services.{'Instance'}":0.932}
{"{'test'}.acd.ard.{'prd'}.gcp.{'892'}_{dxj}":0.3232,"{'test'}.abv.ov.{'prd'}.gcp.{'392032'}_{dxj}.services.{'Big data'}":0.3254}
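First split each cell on the comma (,),
remove the {}, " and ' characters,
and then stack the two resulting columns into one: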
metric
test.ab.ov.prd.gcp.392032_rtw:0.3232
test.abv.ov.prd.gcp.392032_rtw.services.API:0.3243
test.aew.dsc.prd.gcp.8329_abc:0.3232
test.aew.dsc.prd.gcp.8329_abc.services.Instance:0.932
test.acd.ard.prd.gcp.892_dxj:0.3232
test.abv.ov.prd.gcp.392032_dxj.services.Big data:0.3254
Now split again on the colon (:):
metric cost
test.ab.ov.prd.gcp.392032_rtw 0.3232
test.abv.ov.prd.gcp.392032_rtw.services.API 0.3243
test.aew.dsc.prd.gcp.8329_abc 0.3232
test.aew.dsc.prd.gcp.8329_abc.services.Instance 0.932
test.acd.ard.prd.gcp.892_dxj 0.3232
test.abv.ov.prd.gcp.392032_dxj.services.Big data 0.3254
Using string methods:
import pandas as pd

metric = [{
"{'test'}.ab.ov.{'prd'}.gcp.{'392032'}_{rtw}": 0.3232,
"{'test'}.abv.ov.{'prd'}.gcp.{'392032'}_{rtw}.services.{'API'}": 0.3243
}, {
"{'test'}.aew.dsc.{'prd'}.gcp.{'8329'}_{abc}": 0.3232,
"{'test'}.aew.dsc.{'prd'}.gcp.{'8329'}_{abc}.services.{'Instance'}": 0.932
}, {
"{'test'}.acd.ard.{'prd'}.gcp.{'892'}_{dxj}": 0.3232,
"{'test'}.abv.ov.{'prd'}.gcp.{'392032'}_{dxj}.services.{'Big data'}": 0.3254
}]
# One row per key: transpose, fill the gaps with 0, and sum across the original rows
df = pd.DataFrame(metric).transpose().fillna(0).sum(axis=1).reset_index()
# Strip braces and single quotes; regex=True is required on recent pandas versions
df['index'] = df['index'].str.replace(r"[{}']", '', regex=True)
df = df.rename(columns={'index': 'metric', 0: 'cost'})
Output:
metric cost
0 test.ab.ov.prd.gcp.392032_rtw 0.3232
1 test.abv.ov.prd.gcp.392032_rtw.services.API 0.3243
2 test.aew.dsc.prd.gcp.8329_abc 0.3232
3 test.aew.dsc.prd.gcp.8329_abc.services.Instance 0.9320
4 test.acd.ard.prd.gcp.892_dxj 0.3232
5 test.abv.ov.prd.gcp.392032_dxj.services.Big data 0.3254
If metric is a column of a DataFrame, you can use:
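# Split each cell on the comma into two columns
df = df['metric'].str.split(',', expand=True)
# Stack both columns (Series.append was removed in pandas 2.0, so use pd.concat),
# strip braces and single quotes, then split on the colon
df = pd.concat([df[0], df[1]]).str.replace(r"[{}']", '', regex=True).str.split(':', expand=True).rename(columns={0: 'metric', 1: 'cost'})
df['metric'] = df['metric'].str.strip('"')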
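The split-on-comma, clean, split-on-colon steps can also be sketched end to end as below. This is only a minimal sketch: it assumes every cell of the metric column is a plain string in the format shown in the question, and the names raw and split_metric_strings are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: each cell is a string, as in the question.
raw = pd.DataFrame({"metric": [
    """{"{'test'}.ab.ov.{'prd'}.gcp.{'392032'}_{rtw}":0.3232,"{'test'}.abv.ov.{'prd'}.gcp.{'392032'}_{rtw}.services.{'API'}":0.3243}""",
    """{"{'test'}.aew.dsc.{'prd'}.gcp.{'8329'}_{abc}":0.3232,"{'test'}.aew.dsc.{'prd'}.gcp.{'8329'}_{abc}.services.{'Instance'}":0.932}""",
    """{"{'test'}.acd.ard.{'prd'}.gcp.{'892'}_{dxj}":0.3232,"{'test'}.abv.ov.{'prd'}.gcp.{'392032'}_{dxj}.services.{'Big data'}":0.3254}""",
]})


def split_metric_strings(frame):
    """Return one row per key/value pair with 'metric' and 'cost' columns."""
    # Split each cell on the comma and give every piece its own row.
    pairs = frame["metric"].str.split(",").explode()
    # Drop braces, single quotes and double quotes in one pass.
    cleaned = pairs.str.replace("[{}'\"]", "", regex=True)
    # Split on the last colon so the number ends up in its own column.
    parts = cleaned.str.rsplit(":", n=1, expand=True)
    parts.columns = ["metric", "cost"]
    parts["cost"] = parts["cost"].astype(float)
    return parts.reset_index(drop=True)


result = split_metric_strings(raw)
print(result)
```

Using explode keeps the pairs in their original row order, whereas stacking the two split columns puts all first entries before all second entries. Either order may be fine depending on what you need.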
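metric is a DataFrame. I tried converting it to a list with
metric=df.values.tolist()
and using that, but when I apply df=pd.DataFrame(metric).transpose().fillna(0).sum(axis=1).reset_index()
df shows index 0\n0.0. @Nk03, can you tell me why it doesn't work?
Oh, it's a DataFrame. Let me add an updated answer.
Thanks, but I have already tried splitting metric with df=df['metric'].str.split(',',expand=True)
and the output is a single column 0 with NaN in every row. I think it is not being split because the , sits between the {}, so it may not be able to split metric. Any idea why?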
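One possible cause of the all-NaN result, which the thread itself does not confirm, is that the cells of metric hold Python dict objects rather than strings; pandas string methods generally return NaN for non-string cells. Under that assumption, a rough sketch is to flatten the dicts directly instead of splitting text. The names raw and flatten_metric_dicts are hypothetical.

```python
import re

import pandas as pd

# Hypothetical sample data: each cell is a dict, not a string (an assumption).
raw = pd.DataFrame({"metric": [
    {"{'test'}.ab.ov.{'prd'}.gcp.{'392032'}_{rtw}": 0.3232,
     "{'test'}.abv.ov.{'prd'}.gcp.{'392032'}_{rtw}.services.{'API'}": 0.3243},
    {"{'test'}.aew.dsc.{'prd'}.gcp.{'8329'}_{abc}": 0.3232,
     "{'test'}.aew.dsc.{'prd'}.gcp.{'8329'}_{abc}.services.{'Instance'}": 0.932},
    {"{'test'}.acd.ard.{'prd'}.gcp.{'892'}_{dxj}": 0.3232,
     "{'test'}.abv.ov.{'prd'}.gcp.{'392032'}_{dxj}.services.{'Big data'}": 0.3254},
]})


def flatten_metric_dicts(frame):
    """Return one row per dict entry with cleaned 'metric' keys and 'cost' values."""
    rows = [
        # Strip braces and single quotes from each key; keep the value as-is.
        (re.sub(r"[{}']", "", key), value)
        for cell in frame["metric"]
        for key, value in cell.items()
    ]
    return pd.DataFrame(rows, columns=["metric", "cost"])


flat = flatten_metric_dicts(raw)
print(flat)
```

Checking type(df['metric'].iloc[0]) first shows whether the cells are str or dict, and therefore which approach applies.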