Python Selenium: removing items that contain partial text


python selenium web-scraping


I want to extract a specific element from an HTML table. This is my current code:

>>> tabela = soup.find("div", {"class": "productDatatable"})
>>> tabela

<div class="productDatatable">\n<div>\r\n            Category:\r\n                        <span class="productDatatableValue">\n<a href="/en/market/mt5/utility">Utilities</a>\n</span>\n</div>\n<div title="Number of activations available for the buyers of this application. During the activation, software product is bound to the buyer's hardware, so that the copy of the application cannot work on another PC. The application should be re-activated and downloaded again in order to launch it on another computer. If the activation limit is exceeded, the buyer will have to purchase the product again.">\r\n            Activations:\r\n                        <span class="productDatatableValue">\r\n                            5\r\n                        </span>\n</div>\n<div style="padding:5px;"></div>\n<div>\r\n            Author:\r\n                        <span class="productDatatableValue">\n<span style="display: inline-block; vertical-align: middle; margin-top: -2px;"><span class="icoVerified small" title="Verified User"></span></span>\n<span title="Konstantin Chernov"><a class="author" href="/en/users/konstantin83" title="Konstantin83">Konstantin Chernov</a></span>\n</span>\n</div>\n<div>\r\n            Published:\r\n                        <span class="productDatatableValue">\r\n                            16 January 2013\r\n                        </span>\n</div>\n<div>\r\n            Current version:\r\n                        <span class="productDatatableValue">1.55</span>\n</div>\n<div>\r\n            Updated:\r\n                        <span class="productDatatableValue">\r\n                            23 March 2015\r\n                        </span>\n</div>\n</div>

How can I get the category from this HTML? I need the output to be

Utilities

Utilities is inside the anchor tag, not the span. Try the BeautifulSoup code below.

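from bs4 import BeautifulSoup
import requests

# Fetch the product page and parse it
response = requests.get("https://www.mql5.com/en/market/product/635").text
soup = BeautifulSoup(response, "html.parser")

# find() returns the first match; in the HTML above the first value span belongs to the Category row
tabela = soup.find("div", class_="productDatatable").find("span", class_="productDatatableValue").find("a")
print(tabela.text)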


If you want to use Selenium, use the following XPath, which locates the value by its Category label:

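# Selenium 4.3+ removed find_element_by_xpath; there, use browser.find_element(By.XPATH, "...") with By from selenium.webdriver.common.by
print(browser.find_element_by_xpath("//div[contains(.,'Category')]/span[@class='productDatatableValue']/a").text)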
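The same label-based idea as the XPath above can be sketched in BeautifulSoup, which avoids depending on Category being the first row. This is only a minimal sketch: `get_field` and `HTML` are hypothetical names introduced here, and the markup is a trimmed copy of the table from the question rather than the live page.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the productDatatable markup from the question (assumption:
# the live page keeps this structure)
HTML = """
<div class="productDatatable">
  <div>
    Category:
    <span class="productDatatableValue">
      <a href="/en/market/mt5/utility">Utilities</a>
    </span>
  </div>
  <div>
    Current version:
    <span class="productDatatableValue">1.55</span>
  </div>
</div>
"""


def get_field(soup, label):
    """Return the value shown next to `label` in the product table, or None."""
    table = soup.find("div", class_="productDatatable")
    if table is None:
        return None
    # Look only at the direct child rows of the table
    for row in table.find_all("div", recursive=False):
        value = row.find("span", class_="productDatatableValue")
        if value is None:
            continue  # spacer rows carry no value span
        # The row text starts with its label, e.g. "Category:"
        if row.get_text(strip=True).startswith(label):
            return value.get_text(strip=True)
    return None


soup = BeautifulSoup(HTML, "html.parser")
print(get_field(soup, "Category"))  # Utilities
```

Because the lookup is keyed on the label, the same helper can read other rows, for example `get_field(soup, "Current version")`.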

Please try this:

tabela.find_element_by_xpath("/html/body/div[1]/div[3]/div[2]/div[1]/div[2]/div[4]/div[1]/span/a").text

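Please give the exact HTML of tabela.

@ManaliKagathara Here is the full link, and here is my full Python code:

The category is not always Utilities; that is just an example.

The category changes. Based on your query, I have posted the code. If you have any further requirements, you need to post that code.

@LucianBlaga: Added an XPath for Selenium that references Category. Hope this helps.

Not working :( (Pdb) print(soup.find_element_by_xpath("//div[contains(.,'Category')]/span[@class='productDatatableValue']/a").text) *** TypeError: 'NoneType' object is not callable

@LucianBlaga: I have updated the BeautifulSoup code based on your code. Have you tried it? BeautifulSoup works fine for the given URL.

(Pdb) tabela.find_element_by_xpath("div[@class='productDatatable']/div/span/a").text *** TypeError: 'NoneType' object is not callable

@LucianBlaga Try this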
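The TypeError in the comments is what one would expect when a Selenium method is called on a BeautifulSoup object. BeautifulSoup treats an unknown attribute as a tag name, so the lookup returns None and the call then fails. A small sketch that reproduces this on a made-up one-line document (the markup here is an assumption for illustration, not the real page):

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    '<div class="productDatatable"><span><a>Utilities</a></span></div>',
    "html.parser",
)

# Unknown attributes are treated as tag names, so this behaves like
# soup.find("find_element_by_xpath") and yields None
missing = soup.find_element_by_xpath
print(missing)  # None

try:
    soup.find_element_by_xpath("//a")
except TypeError as exc:
    print(exc)  # 'NoneType' object is not callable

# BeautifulSoup has no XPath support; a CSS selector is the closest built-in option
print(soup.select_one("div.productDatatable a").text)  # Utilities
```

XPath calls such as `find_element_by_xpath` belong on the Selenium driver or its elements, while `find`, `find_all`, and `select_one` belong on the soup.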