Parsing XML into a list of string lists in Python


Edit: the XML file

<corpus lang="en" id="subtask2-heterographic">


<text id="het_1">

  <word id="het_1_1">'</word>

  <word id="het_1_2">'</word>

  <word id="het_1_3">I</word>

  <word id="het_1_4">'</word>

  <word id="het_1_5">m</word>

  <word id="het_1_6">halfway</word>

  <word id="het_1_7">up</word>

  <word id="het_1_8">a</word>

  <word id="het_1_9">mountain</word>

  <word id="het_1_10">,</word>

  <word id="het_1_11">'</word>

  <word id="het_1_12">'</word>

  <word id="het_1_13">Tom</word>

  <word id="het_1_14">alleged</word>

  <word id="het_1_15">.</word>

</text>


<text id="het_2">

  <word id="het_2_1">I</word>

  <word id="het_2_2">'</word>

  <word id="het_2_3">d</word>

  <word id="het_2_4">like</word>

  <word id="het_2_5">to</word>

  <word id="het_2_6">be</word>

  <word id="het_2_7">a</word>

  <word id="het_2_8">Chinese</word>

  <word id="het_2_9">laborer</word>

  <word id="het_2_10">,</word>

  <word id="het_2_11">said</word>

  <word id="het_2_12">Tom</word>

  <word id="het_2_13">coolly</word>

  <word id="het_2_14">.</word>

 </text>
</corpus>
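This only gives me all the words in the XML file, without separating them into sentences. How can I put each sentence into a list as its own list of strings?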

Final expected output:

>> [["'", "'", 'I', "'", 'm', 'halfway', 'up', 'a', 'mountain', ',', "'", "'", 'Tom', 'alleged', '.'] , ['I', "'", 'd', 'like', 'to', 'be', 'a', 'Chinese', 'laborer', ',', 'said', 'Tom', 'coolly', '.'], ['Dentists', ...] ]

You have to create a new list for each sentence:

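import xml.etree.ElementTree as ET

# "corpus.xml" is a placeholder file name; use the path to your own file
root = ET.parse("corpus.xml").getroot()

sentences = []
for elem in root:          # each <text> element is one sentence
    sentence = []
    for w in elem:         # each <word> child holds one token
        sentence.append(w.text)
    sentences.append(sentence)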
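
The same idea can also be written as a nested list comprehension. The following is only a sketch: it assumes the file has the shape shown in the question (a `<corpus>` root with `<text>` children, each holding `<word>` children), and it parses a shortened inline sample instead of the real file so that it runs on its own.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the same shape as the corpus in the question.
# Assumption: the real file simply has more <text> and <word> elements.
XML = """
<corpus lang="en" id="subtask2-heterographic">
  <text id="het_1">
    <word id="het_1_1">I</word>
    <word id="het_1_2">'</word>
    <word id="het_1_3">m</word>
  </text>
  <text id="het_2">
    <word id="het_2_1">Tom</word>
    <word id="het_2_2">coolly</word>
    <word id="het_2_3">.</word>
  </text>
</corpus>
"""

root = ET.fromstring(XML.strip())

# One inner list per <text> element, one string per <word> child.
sentences = [
    [word.text for word in text.findall("word")]
    for text in root.findall("text")
]

print(sentences)
```

Using `findall("text")` and `findall("word")` instead of plain iteration means any other child elements would be skipped rather than mixed into the result.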

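Comments:
Post an XML fragment to start with.
@RomanPerekhrest Sorry, I have edited the question.
OK, we have the input. Now post the final expected output, please.
@RomanPerekhrest Done. Thanks.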