Python Scrapy XPath: selecting specific elements from file-extension.net
I'm having some trouble understanding XPath. I'm trying to scrape all the magic numbers from file-extension.net. Let's take this link as an example:
Part of the source code:
<table border=4 RULES=ROWS FRAME=HSIDES width=728>
<tr class="tabhead">
<td></td>
<td><b>Website</b></td>
<td><b> EXT </b></td>
<td><b> Filetype description</b> </td>
</tr>
<tr class="rre"><td> <img src="images/icon-filext.png" width="16" height="16"> </td><td><a href="http://filext.com/file-extension/C10">FILExt</a></td><td> <a class='fesl' href='file_extension_c10'>C10</a> </td><td> <a class='fesl' href='program_extension_irig'>IRIG</a> 106 <a class='fesl' href='program_extension_original'>Original</a> <a class='fesl' href='program_extension_recording'>Recording</a> <a class='fesl' href='program_extension_file'>File</a> (<a class='fesl' href='program_extension_range'>Range</a> <a class='fesl' href='program_extension_commanders'>Commanders</a> <a class='fesl' href='program_extension_council'>Council</a>)</td></tr>
<tr class="rro"><td> <img src="images/icon-fsorg.png" width="16" height="16"> </td><td><a href="http://www.file-extensions.org/c10-file-extension">File Extensions</a></td><td> <a class='fesl' href='file_extension_c10'>C10</a> </td><td> <a class='fesl' href='program_extension_irig'>IRIG</a> 106 <a class='fesl' href='program_extension_original'>original</a> <a class='fesl' href='program_extension_recording'>recording</a> <a class='fesl' href='program_extension_file'>file</a></td></tr>
<tr class="rre"><td> <img src="images/icon-dotwhat.png" width="16" height="16"> </td><td><a href="http://dotwhat.net/c10/9166/">DotWhat</a></td><td> <a class='fesl' href='file_extension_c10'>C10</a> </td><td> <a class='fesl' href='program_extension_split'>Split</a> <a class='fesl' href='program_extension_compressed'>Compressed</a> <a class='fesl' href='program_extension_archive'>Archive</a> <a class='fesl' href='program_extension_file'>File</a> <a class='fesl' href='program_extension_part'>Part</a> 10</td></tr>
<tr class="rro"><td> <img src="images/icon-fsorg.png" width="16" height="16"> </td><td><a href="http://www.file-extensions.org/c10-file-extension">File Extensions</a></td><td> <a class='fesl' href='file_extension_c10'>C10</a> </td><td> <a class='fesl' href='program_extension_split'>Split</a> <a class='fesl' href='program_extension_multi'>Multi</a>-<a class='fesl' href='program_extension_volume'>volume</a> ACE <a class='fesl' href='program_extension_compressed'>compressed</a> <a class='fesl' href='program_extension_file'>file</a> <a class='fesl' href='program_extension_archive'>archive</a></td></tr>
<tr class="rre"><td> <img src="images/icon-trid.png" width="16" height="16"> </td><td><a href="http://mark0.net/soft-trid-e.html">TrID</a></td><td> <a class='fesl' href='file_extension_c10'>C10</a> </td><td> <a class='fesl' href='program_extension_virtual'>Virtual</a> MC-10 <a class='fesl' href='program_extension_tape'>tape</a> <a class='fesl' href='program_extension_image'>image</a><br> <b><small>Header Hexdump</b>: <span class='hexdump'> 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 </span></small></td></tr>
<tr class="rro"><td> <img src="images/icon-filext.png" width="16" height="16"> </td><td><a href="http://filext.com/file-extension/C10">FILExt</a></td><td> <a class='fesl' href='file_extension_c10'>C10</a> </td><td> <a class='fesl' href='program_extension_winace'>WinAce</a> <a class='fesl' href='program_extension_compressed'>Compressed</a> <a class='fesl' href='program_extension_file'>File</a> <a class='fesl' href='program_extension_split'>Split</a> <a class='fesl' href='program_extension_portion'>Portion</a> of <a class='fesl' href='program_extension_compressed'>Compressed</a> <a class='fesl' href='program_extension_file'>File</a> (e-<a class='fesl' href='program_extension_merge'>merge</a> <a class='fesl' href='program_extension_gmbh'>GmbH</a>)</td></tr>
<tr class="rre"><td> <img src="images/icon-fileinfo.png" width="16" height="16"> </td><td><a href="http://www.fileinfo.com/extension/c10">FileInfo</a></td><td> <a class='fesl' href='file_extension_c10'>C10</a> </td><td> <a class='fesl' href='program_extension_winace'>WinAce</a> <a class='fesl' href='program_extension_split'>Split</a> <a class='fesl' href='program_extension_archive'>Archive</a> <a class='fesl' href='program_extension_part'>Part</a> 10</td></tr>
</table>
Of course, //a[text()="TrID"]/@href.a[@class="fesl" does not work.
But this is what I want:
If you find a link whose name contains "TrID", give me its file description.
Any ideas?
'//td[./a[contains(text(), "TrID")]]/following-sibling::td[2]//text()'
Just change TrID to the text of another link in the row you want.
You're welcome. If the answer helped, please remember to accept it.