Python: exporting a DataFrame to XML with an attribute-list structure
My name is Pablo, and this is my first question in this group. After looking through other related posts, I decided to ask: is there a way to do the following? Suppose I have this DataFrame structure:
+----+---------+------------+------------+----------+
| | MRBTS | dest | gw | length |
|----+---------+------------+------------+----------|
| 0 | 13004 | 10.104.0.0 | 10.48.0.0 | 16 |
| 1 | 13004 | 10.107.0.0 | 10.45.0.0 | 16 |
| 2 | 13005 | 10.104.0.0 | 10.130.0.0 | 8 |
| 3 | 13005 | 10.102.0.0 | 10.130.0.0 | 8 |
| 4 | 13005 | 0.0.0.0 | 10.110.0.0 | 16 |
+----+---------+------------+------------+----------+
I would like to export this test DataFrame to XML as lists grouped by MRBTS, like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE raml SYSTEM 'raml20.dtd'>
<raml version="2.0" xmlns="raml20.xsd">
<cmData type="plan" scope="all" name="iprt" id="PlanConfiguration( 7152069 )">
<header>
<log dateTime="2020-06-19T07:38:16.000-03:00" action="created" appInfo="PlanExporter">InternalValues are used</log>
</header>
<managedObject distName="MRBTS-13004">
<list >
<item>
<p name="dest">10.104.0.0</p>
<p name="length">16</p>
<p name="gw">10.38.0.0</p>
</item>
<item>
<p name="dest">10.107.0.0</p>
<p name="length">16</p>
<p name="gw">10.45.0.0</p>
</item>
</list>
</managedObject>
<managedObject distName="MRBTS-13005">
<list >
<item>
<p name="dest">10.104.0.0</p>
<p name="length">8</p>
<p name="gw">10.130.8.0</p>
</item>
<item>
<p name="dest">10.102.0.0</p>
<p name="length">8</p>
<p name="gw">10.130.8.0</p>
</item>
<item>
<p name="dest">0.0.0.0</p>
<p name="length">16</p>
<p name="gw">10.110.0.0</p>
</item>
</list>
</managedObject>
</cmData>
</raml>
I took this code from another post, but I got stuck when trying to group by MRBTS:
import pandas as pd

df = pd.DataFrame({'MRBTS': ['13004', '13004', '13005', '13005', '13005'],
                   'dest': ['10.104.0.0', '10.107.0.0', '10.104.0.0', '10.102.0.0', '0.0.0.0'],
                   'gw': ['10.48.0.0', '10.45.0.0', '10.130.0.0', '10.130.0.0', '10.110.0.0'],
                   'length': ['16', '16', '8', '8', '16']})

def func(row):
    # Emit one <list> block per row, with one <field> per column
    xml = ['<list >']
    for field in row.index:
        xml.append(' <field name="{0}">{1}</field>'.format(field, row[field]))
    xml.append('</list>')
    return '\n'.join(xml)

print('\n'.join(df.apply(func, axis=1)))
This produces:
<list >
<field name="MRBTS">13004</field>
<field name="dest">10.104.0.0</field>
<field name="gw">10.48.0.0</field>
<field name="length">16</field>
</list>
<list >
<field name="MRBTS">13004</field>
<field name="dest">10.107.0.0</field>
<field name="gw">10.45.0.0</field>
<field name="length">16</field>
</list>
<list >
<field name="MRBTS">13005</field>
<field name="dest">10.104.0.0</field>
<field name="gw">10.130.0.0</field>
<field name="length">8</field>
</list>
<list >
<field name="MRBTS">13005</field>
<field name="dest">10.102.0.0</field>
<field name="gw">10.130.0.0</field>
<field name="length">8</field>
</list>
<list >
<field name="MRBTS">13005</field>
<field name="dest">0.0.0.0</field>
<field name="gw">10.110.0.0</field>
<field name="length">16</field>
</list>
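Could you help me solve this?

I think the key is to first structure the data so that it better matches the target XML representation. The tools mentioned for this are agg() and json2xml.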
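The code for this suggestion is not preserved here. As a rough sketch of what "structuring the data first" could look like, the snippet below groups the rows by MRBTS and nests each group's rows as plain dictionaries. It uses an ordinary groupby loop rather than the answer's actual agg() call, and the key names distName and item are assumptions chosen for illustration. The conversion step with json2xml is left out.

```python
import pandas as pd

df = pd.DataFrame({
    "MRBTS": ["13004", "13004", "13005", "13005", "13005"],
    "dest": ["10.104.0.0", "10.107.0.0", "10.104.0.0", "10.102.0.0", "0.0.0.0"],
    "gw": ["10.48.0.0", "10.45.0.0", "10.130.0.0", "10.130.0.0", "10.110.0.0"],
    "length": ["16", "16", "8", "8", "16"],
})

# One entry per MRBTS value; each entry holds that group's rows as dicts.
# "distName" and "item" are illustrative key names, not a fixed schema.
nested = [
    {"distName": mrbts, "item": group.drop(columns="MRBTS").to_dict("records")}
    for mrbts, group in df.groupby("MRBTS")
]

for entry in nested:
    print(entry["distName"], len(entry["item"]), "routes")
```

A nested list like this mirrors the managedObject / item hierarchy of the target file, so any dict-to-XML converter only has to walk it once.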
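Because XML documents are not plain text documents, avoid building XML with string concatenation. Instead, consider building a tree (the DOM approach) with the third-party lxml or, with slight modifications, the built-in etree module. For the data, iterate over subsets of the DataFrame by the MRBTS field: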
import lxml.etree as et
import pandas as pd

### STATIC PART OF XML
root = et.Element('raml', {"version": "2.0", "xmlns": "raml20.xsd"})
cmData = et.SubElement(root, "cmData",
                       {"type": "plan", "scope": "all", "name": "iprt",
                        "id": "PlanConfiguration( 7152069 )"})
header = et.SubElement(cmData, "header")
log = et.SubElement(header, "log",
                    {"dateTime": "2020-06-19T07:38:16.000-03:00",
                     "action": "created", "appInfo": "PlanExporter"})
log.text = "InternalValues are used"

### DYNAMIC PART OF XML
df = pd.DataFrame({'MRBTS': ['13004', '13004', '13005', '13005', '13005'],
                   'dest': ['10.104.0.0', '10.107.0.0', '10.104.0.0', '10.102.0.0', '0.0.0.0'],
                   'gw': ['10.48.0.0', '10.45.0.0', '10.130.0.0', '10.130.0.0', '10.110.0.0'],
                   'length': ['16', '16', '8', '8', '16']})

# SUBSET ITERATION
for i, g in df.groupby("MRBTS"):
    managedObject = et.SubElement(cmData, "managedObject", {"distName": f"MRBTS-{i}"})
    # Named list_el so the built-in list is not shadowed
    list_el = et.SubElement(managedObject, "list")

    # BUILD DICTIONARY OUT OF EACH ROW
    d = g.drop('MRBTS', axis='columns').to_dict('index')

    for ik, iv in d.items():
        item = et.SubElement(list_el, 'item')
        for k, v in iv.items():
            p = et.SubElement(item, 'p', {"name": k})
            p.text = str(v)

# OUTPUT TREE
tree = et.ElementTree(root)
tree.write("Output.xml",
           xml_declaration=True,
           encoding="UTF-8",
           pretty_print=True,
           doctype="<!DOCTYPE raml SYSTEM 'raml20.dtd'>")
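For the built-in module, the "slight modifications" are not spelled out in the answer. The following is one possible sketch, assuming the rows are available as plain dictionaries (the shape that df.to_dict("records") returns) and grouping them with itertools.groupby. The helper names build_raml and to_document are made up for this example. Since xml.etree.ElementTree.write has no pretty_print or doctype arguments, the declaration and DOCTYPE are prepended by hand here.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

# Plain-dict rows; with pandas, df.to_dict("records") yields this shape.
rows = [
    {"MRBTS": "13004", "dest": "10.104.0.0", "gw": "10.48.0.0", "length": "16"},
    {"MRBTS": "13004", "dest": "10.107.0.0", "gw": "10.45.0.0", "length": "16"},
    {"MRBTS": "13005", "dest": "10.104.0.0", "gw": "10.130.0.0", "length": "8"},
    {"MRBTS": "13005", "dest": "10.102.0.0", "gw": "10.130.0.0", "length": "8"},
    {"MRBTS": "13005", "dest": "0.0.0.0", "gw": "10.110.0.0", "length": "16"},
]


def build_raml(rows):
    """Build the raml tree with one managedObject per MRBTS value."""
    root = ET.Element("raml", {"version": "2.0", "xmlns": "raml20.xsd"})
    cm_data = ET.SubElement(root, "cmData", {
        "type": "plan", "scope": "all", "name": "iprt",
        "id": "PlanConfiguration( 7152069 )"})
    header = ET.SubElement(cm_data, "header")
    log = ET.SubElement(header, "log", {
        "dateTime": "2020-06-19T07:38:16.000-03:00",
        "action": "created", "appInfo": "PlanExporter"})
    log.text = "InternalValues are used"

    by_mrbts = itemgetter("MRBTS")
    # itertools.groupby only merges adjacent rows, so sort by the key first.
    for mrbts, group in groupby(sorted(rows, key=by_mrbts), key=by_mrbts):
        managed = ET.SubElement(cm_data, "managedObject",
                                {"distName": f"MRBTS-{mrbts}"})
        list_el = ET.SubElement(managed, "list")
        for row in group:
            item = ET.SubElement(list_el, "item")
            for name, value in row.items():
                if name == "MRBTS":
                    continue  # the group key lives in distName, not in <p>
                p = ET.SubElement(item, "p", {"name": name})
                p.text = str(value)
    return root


def to_document(root):
    """Serialize the tree and prepend the XML declaration and DOCTYPE."""
    if hasattr(ET, "indent"):  # ET.indent exists from Python 3.9 on
        ET.indent(root)
    body = ET.tostring(root, encoding="unicode")
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            "<!DOCTYPE raml SYSTEM 'raml20.dtd'>\n" + body + "\n")


document = to_document(build_raml(rows))
print(document)
```

Writing document to a file with open("Output.xml", "w", encoding="utf-8") should give a file equivalent to the lxml version, apart from whitespace and attribute order.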
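For comparison, the agg() / json2xml suggestion mentioned earlier gives output of this shape (only MRBTS 13004 is shown):
<?xml version="1.0" ?>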
<all>
<item>
<distName>13004</distName>
<item>
<item>
<dest>10.104.0.0</dest>
<length>16</length>
<gw>10.48.0.0</gw>
</item>
<item>
<dest>10.107.0.0</dest>
<length>16</length>
<gw>10.45.0.0</gw>
</item>
</item>
</item>
</all>
The lxml script writes the following Output.xml:
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE raml SYSTEM 'raml20.dtd'>
<raml version="2.0" xmlns="raml20.xsd">
<cmData id="PlanConfiguration( 7152069 )" name="iprt" scope="all" type="plan">
<header>
<log action="created" appInfo="PlanExporter" dateTime="2020-06-19T07:38:16.000-03:00">InternalValues are used</log>
</header>
<managedObject distName="MRBTS-13004">
<list>
<item>
<p name="dest">10.104.0.0</p>
<p name="gw">10.48.0.0</p>
<p name="length">16</p>
</item>
<item>
<p name="dest">10.107.0.0</p>
<p name="gw">10.45.0.0</p>
<p name="length">16</p>
</item>
</list>
</managedObject>
<managedObject distName="MRBTS-13005">
<list>
<item>
<p name="dest">10.104.0.0</p>
<p name="gw">10.130.0.0</p>
<p name="length">8</p>
</item>
<item>
<p name="dest">10.102.0.0</p>
<p name="gw">10.130.0.0</p>
<p name="length">8</p>
</item>
<item>
<p name="dest">0.0.0.0</p>
<p name="gw">10.110.0.0</p>
<p name="length">16</p>
</item>
</list>
</managedObject>
</cmData>
</raml>
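To illustrate why the answer advises against string concatenation, here is a small sketch using only the standard library. The cell value is hypothetical and chosen to contain XML special characters. String formatting copies it verbatim and yields markup that no parser accepts, while a tree builder escapes it on serialization.

```python
import xml.etree.ElementTree as ET

# Hypothetical cell value containing XML special characters.
value = "R&D <lab>"

# String formatting inserts the value as-is, so the markup is not well-formed.
naive = '<p name="dest">{0}</p>'.format(value)
try:
    ET.fromstring(naive)
    naive_ok = True
except ET.ParseError:
    naive_ok = False

# Building an element lets the serializer escape the text.
p = ET.Element("p", {"name": "dest"})
p.text = value
safe = ET.tostring(p, encoding="unicode")

# Parsing the safe version gives the original value back.
roundtrip = ET.fromstring(safe).text

print(naive_ok)
print(safe)
print(roundtrip == value)
```

The IP addresses and prefix lengths in this question happen to be safe, but the tree-based approach keeps working if a column later holds free text.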