Arrange a table in Excel as it is displayed in HTML, using Python and bs4

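I have successfully scraped a table that I want to write to an .xlsx file.

I want it to appear in Excel the same way it is displayed in the browser.

It should be laid out like this:

A1 = 1

B1 = Prepare to assist with actions and activities associated with incident response

C1 = 1.1

D1 = Identify duty holders and WHS legislative requirements for incident response

A2 = blank

B2 = blank

C2 = 1.2

D2 = Identify workplace policies, procedures and processes concerning incident response planning and reporting

Below is my code, followed by the HTML I scraped.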

for i in Elements.findAll('tr'):
    columns = i.findAll('td')
    output_row = []
    for column in columns:
        sub_rows = column.findAll('p')
        for row in sub_rows:
            output_row.append(row.get_text(separator=' '))
    Element_rows.append(output_row)

-----------------------------------------------------------------

<table class="ait-table" width="943">
<tr>
<td style="border:1px solid ;;vertical-align: top;" width="299">
<p class="ait4"><strong class="ait24">ELEMENTS</strong></p>
</td>
<td style="border:1px solid ;;vertical-align: top;" width="766">
<p class="ait4"><strong class="ait24">PERFORMANCE CRITERIA</strong></p>
</td>
</tr>
<tr>
<td style="border:1px solid ;;vertical-align: top;" width="299">
<p class="ait4"><em class="ait7">Elements describe the essential outcomes.</em></p>
</td>
<td style="border:1px solid ;;vertical-align: top;" width="766">
<p class="ait4"><em class="ait7">Performance criteria describe the performance needed to demonstrate achievement of the element.</em></p>
</td>
</tr>
<tr>
<td style="border:1px solid #333333;;vertical-align: top;" width="299">
<p class="ait4">1. Prepare to assist with actions and activities associated with incident response</p>
</td>
<td style="border:1px solid #333333;;vertical-align: top;" width="766">
<p class="ait4">1.1 Identify duty holders and WHS legislative requirements for incident response</p>
<p class="ait4">1.2 Identify workplace policies, procedures and processes concerning incident response planning and reporting</p>
<p class="ait4">1.3 Communicate requirements for responding to incident to required personnel within scope of own role and work area</p>
<p class="ait4">1.4 Contribute to developing communication mechanisms to notify manager of incident</p>
</td>
</tr>
<tr>
<td style="border:1px solid #333333;;vertical-align: top;" width="299">
<p class="ait4">2. Assist with implementing response procedures during incident</p>
</td>
<td style="border:1px solid #333333;;vertical-align: top;" width="766">
<p class="ait4">2.1 Provide initial assistance to those involved in incident within scope of own role and expertise and according to organisational incident response policies and procedures</p>
<p class="ait4">2.2 Assist with documenting incident according to workplace procedures and processes</p>
<p class="ait4">2.3 Assist with meeting legislative requirements regarding incident, within scope of own role and expertise</p>
<p class="ait4">2.4 Assist with reporting incident to external authorities, according to legislative requirements and workplace procedures and processes </p>
</td>
</tr>
<tr>
<td style="border:1px solid #333333;;vertical-align: top;" width="299">
<p class="ait4">3. Contribute to collecting WHS information about incident</p>
</td>
<td style="border:1px solid #333333;;vertical-align: top;" width="766">
<p class="ait4">3.1 Assist with obtaining information and data from those involved about actions and events leading up to, during and after an incident, using appropriate data collection techniques</p>
<p class="ait4">3.2 Assist with identifying and accessing sources of additional information and data related to incident</p>
<p class="ait4">3.3 Compile and enter information according to record-keeping requirements</p>
</td>
</tr>
<tr>
<td style="border:1px solid #333333;;vertical-align: top;" width="299">
<p class="ait4">4. Assist with incident investigation</p>
</td>
<td style="border:1px solid #333333;;vertical-align: top;" width="766">
<p class="ait4">4.1 Assist with applying required incident investigation processes</p>
<p class="ait4">4.2 Use appropriate analysis techniques to interpret causes of incident and communicate with advisors when participating in workplace investigations</p>
<p class="ait4">4.3 Review incident reports according to organisational policies and procedures</p>
<p class="ait4">4.4 Contact responsible persons and relevant authorities as outlined in WHS laws, and organisational policies and procedures</p>
<p class="ait4">4.5 Contribute to communicating investigation outcomes to relevant stakeholders according to organisational policies and procedures</p>
</td>
</tr>
<tr>
<td style="border:1px solid #333333;;vertical-align: top;" width="299">
<p class="ait4">5. Contribute to developing and implementing recommended measures and actions arising from incident investigation</p>
</td>
<td style="border:1px solid #333333;;vertical-align: top;" width="766">
<p class="ait4">5.1 Contribute to developing incident investigation recommendations </p>
<p class="ait4">5.2 Assist with obtaining approval of developed recommendations from required stakeholders according to organisational policies and procedures</p>
<p class="ait4">5.3 Assist with communicating approved recommendations to required stakeholders according to organisational policies and procedures</p>
<p class="ait4">5.4 Contribute to implementing recommended measures and actions arising from incident investigation within scope of own role and according to WHS legislative requirements</p>
</td>
</tr>
</table>

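This example uses re and itertools.zip_longest to extract the required values and the csv module to write the file (html_data is the HTML snippet from your question). Running the code below produces the file data.csv, which I checked in LibreOffice Calc.

Sorry that it is a bit messy.
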
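import csv
import re
from itertools import zip_longest

from bs4 import BeautifulSoup

# html_data is the HTML snippet from the question.
soup = BeautifulSoup(html_data, 'html.parser')
tds = soup.select('td')

with open('data.csv', 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)

    # Pair each element cell (left column) with its criteria cell (right column).
    for td1, td2 in zip(tds[::2], tds[1::2]):
        # The dot is escaped so that it only matches a literal '.', as in "1. Prepare ...".
        cell_ab = re.findall(r'(\d+\.)\s*(.*)', td1.text)
        if not cell_ab:
            continue  # header rows contain no numbered element
        cell_cd = re.findall(r'(\d+\.\d+)\s*(.*)', td2.text)

        # Pad the shorter list so columns A and B stay blank on continuation rows.
        for (a, b), (c, d) in zip_longest(cell_ab, cell_cd, fillvalue=(None, None)):
            writer.writerow([a, b, c, d])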
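
To see the padding step on its own, here is a minimal sketch that uses only the standard library and works on plain strings instead of parsed HTML. The function name pair_rows and the two sample strings are hypothetical, made up for illustration; the strings mimic what td.text might return for one table row. Unlike the code above, the first pattern captures the element number without its trailing dot, which seems closer to the A1 = 1 layout asked for in the question.

```python
import csv
import io
import re
from itertools import zip_longest


def pair_rows(element_text, criteria_text):
    """Return rows [A, B, C, D]; A and B are None on every row after the first."""
    # ("1", "Prepare ...") for the element cell.
    cell_ab = re.findall(r'(\d+)\.\s*(.*)', element_text)
    # ("1.1", "Identify ..."), ("1.2", "Identify ...") for the criteria cell.
    cell_cd = re.findall(r'(\d+\.\d+)\s*(.*)', criteria_text)
    return [
        [a, b, c, d]
        for (a, b), (c, d) in zip_longest(cell_ab, cell_cd, fillvalue=(None, None))
    ]


# Hypothetical inputs: the text of one element cell and its criteria cell.
element_text = (
    "\n1. Prepare to assist with actions and activities "
    "associated with incident response\n"
)
criteria_text = (
    "\n1.1 Identify duty holders and WHS legislative requirements for incident response\n"
    "1.2 Identify workplace policies, procedures and processes "
    "concerning incident response planning and reporting\n"
)

rows = pair_rows(element_text, criteria_text)

# csv.writer writes None as an empty field, which a spreadsheet shows as a blank cell.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Because zip_longest fills the shorter list with (None, None), the second row starts with two empty fields, so 1.2 lands in column C under 1.1 while columns A and B stay blank.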