Web scraping: Creating a loop to parse table data in Scrapy


I am trying to loop over the table rows in the HTML below. I am using the following XPath selector, but it does not work:
//*[@id="employee-table"]/tbody/tr

<table id="employee-table" class="table table-striped table-bordered responsive-table dataTable no-footer" role="grid" aria-describedby="employee-table_info" style="width: 882px;">
<thead>
<tr role="row"><th class="sorting_asc" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-sort="ascending" aria-label=" Name : activate to sort column descending" style="width: 174px;"> Name </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Year : activate to sort column ascending" style="width: 36px;"> Year </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Title : activate to sort column ascending" style="width: 82px;"> Title </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Agency : activate to sort column ascending" style="width: 192px;"> Agency </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Location : activate to sort column ascending" style="width: 115px;"> Location </th><th class="sorting" tabindex="0" aria-controls="employee-table" rowspan="1" colspan="1" aria-label=" Salary : activate to sort column ascending" style="width: 50px;"> Salary </th></tr>
</thead>
<tbody>
<tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/bharatkumar-a-g">A G. Bharatkumar</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Wisconsin</td><td>$335,000</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/roure-a-rafael">A Rafael Roure</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Florida</td><td>$333,634</td></tr><tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/dumont-aaron-s">Aaron S. Dumont</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Louisiana</td><td>$330,302</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/andrews-aaron-t">Aaron T. Andrews</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Florida</td><td>$350,000</td></tr><tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/elmi-abdolali">Abdolali Elmi</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>West Virginia</td><td>$325,056</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/haleem-abdul-a">Abdul A. 
Haleem</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Missouri</td><td>$351,056</td></tr><tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/ward-abner-m">Abner M. Ward</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Hawaii</td><td>$337,756</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/cohen-adam-c">Adam C. Cohen</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Indiana</td><td>$340,000</td></tr><tr role="row" class="odd"><td class="sorting_1"><a href="/employees/veterans-health-administration/bakker-adam-j">Adam J. Bakker</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Minnesota</td><td>$325,980</td></tr><tr role="row" class="even"><td class="sorting_1"><a href="/employees/veterans-health-administration/bracha-adam-s">Adam S. Bracha</a></td><td>2015</td><td><a href="/employees/occupations/medical-officer">Medical Officer</a></td><td><a href="/employees/veterans-health-administration">Veterans Health Administration</a></td><td>Florida</td><td>$335,000</td></tr></tbody>
</table>

Try
//*[@id="employee-table"]/tr

Your XPath does not work because of the tbody step. Remove it and check whether you get the result you want.

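You can read about this in the Scrapy documentation:

Firefox, in particular, is known for adding tbody elements to tables. Scrapy, on the other hand, does not modify the original page HTML, so you won't be able to extract any data if you use tbody in your XPath expressions.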


Using lxml:
r = tree.xpath('//*[@id="employee-table"]/tbody/tr')
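For context, here is a rough, self-contained version of that lxml call. It assumes the HTML you actually download contains tbody, as the snippet in the question does. SAMPLE is a shortened copy of two rows from the question, and extract_rows is a hypothetical helper name.

```python
from lxml import html

# Two rows taken from the question's table, shortened for readability.
SAMPLE = """
<table id="employee-table">
  <thead><tr role="row"><th>Name</th><th>Location</th><th>Salary</th></tr></thead>
  <tbody>
    <tr role="row" class="odd"><td><a href="/x">Aaron S. Dumont</a></td><td>Louisiana</td><td>$330,302</td></tr>
    <tr role="row" class="even"><td><a href="/y">Aaron T. Andrews</a></td><td>Florida</td><td>$350,000</td></tr>
  </tbody>
</table>
"""


def extract_rows(source):
    """Return the cell texts of every body row as a list of lists."""
    tree = html.fromstring(source)
    # This path only matches when <tbody> exists in the parsed source;
    # lxml does not add the element on its own.
    rows = tree.xpath('//*[@id="employee-table"]/tbody/tr')
    return [
        [td.text_content().strip() for td in tr.xpath("./td")]
        for tr in rows
    ]


if __name__ == "__main__":
    for cells in extract_rows(SAMPLE):
        print(cells)
```

If the list comes back empty on the real page, the downloaded HTML probably has no tbody, and the path needs to drop that step.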

Try this selector:
//*[@id="employee-table"]/tbody/tr[@role="row"]