Python XML parsing: iterating over line items and handling gaps
I am trying to parse this part of the XML, and I want the code to iterate on its own by working out how many times it needs to run.

In addition, a line item may or may not have a value for every column. If the tag/text for any column is missing, I am trying to fill the gap with "None" so that each value maps to the right column in a later CSV conversion.

My XML to parse (invoice line items highlighted in bold):
A bit late, but let's try this:
import lxml.html
import pandas as pd

items = """[your xml above]"""

# The lxml HTML parser lowercases tag names, so the keys are lowercase.
categories = [
    "invoicelinenum", "polinenum", "quantity", "uom", "unitprice",
    "lineamount", "salestaxpercent", "supplierpartnum", "shortdescription",
    "longdescription", "deliverychargecode",
]
columns = [
    'ILI Line Num', 'ILI PO Line', 'ILI QTY', 'ILI UOM', 'ILI Unit Price',
    'ILI Line Amt', 'ILI Sales Tax %', 'ILI Supply', 'ShortDesc', 'LongDesc',
    'ChargeCode',
]

doc = lxml.html.fromstring(items)
# One element per line item, however many the document contains.
invoices = doc.xpath('//InvoiceLineItems/LineItem'.lower())


def dict_to_list(d, keys):
    # Missing keys become None, so every row has the same columns.
    # credit: https://stackoverflow.com/a/58192327/9448090
    return [d.get(key, None) for key in keys]


fin_list = []
for invoice in invoices:
    # Map each child tag of the line item to its text.
    item_dict = {item.tag: item.text for item in invoice}
    fin_list.append(dict_to_list(item_dict, categories))

df = pd.DataFrame(fin_list, columns=columns)
df
This will give you the table you are looking for.

Which library are you using?

xml.etree.ElementTree as ET
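Since the asker mentions xml.etree.ElementTree, the same idea can be sketched with the standard library alone. This is only a minimal sketch: the sample XML, the tag capitalization, and the helper names `parse_line_items` and `rows_to_csv` are assumptions made for illustration, because the original XML is not shown in the post.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the real invoice XML is not shown in the post.
SAMPLE_XML = """
<Invoice>
  <InvoiceLineItems>
    <LineItem>
      <InvoiceLineNum>1</InvoiceLineNum>
      <Quantity>2</Quantity>
      <UOM>EA</UOM>
      <UnitPrice>10.00</UnitPrice>
    </LineItem>
    <LineItem>
      <InvoiceLineNum>2</InvoiceLineNum>
      <Quantity>5</Quantity>
    </LineItem>
  </InvoiceLineItems>
</Invoice>
"""

# Assumed tag names (a subset of the answer's categories; case is a guess).
FIELDS = ["InvoiceLineNum", "Quantity", "UOM", "UnitPrice"]


def parse_line_items(xml_text, fields=FIELDS):
    """Return one dict per LineItem; child tags that are absent map to None."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() visits every LineItem, so no count is needed up front.
    for line_item in root.iter("LineItem"):
        # findtext() returns None when the child tag does not exist.
        rows.append({name: line_item.findtext(name) for name in fields})
    return rows


def rows_to_csv(rows, fields=FIELDS):
    """Write the rows to CSV text; None values are written as empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_line_items(SAMPLE_XML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Because every row is built from the same field list, a missing tag leaves an empty cell in its own column instead of shifting the later values to the left.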