Python Selenium web scraping with the Google Translate extension

python selenium

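I am trying to scrape multiple web pages from sites around the world. So I want to translate each site with the Google Translate extension and then scrape the page with Selenium.

I did some research and found out how to add an extension when running Selenium.

But I don't know how to trigger the extension automatically (by default, it does nothing).

Also, I found that the extension does not translate the raw HTML, so I may have to use a different method for scraping (perhaps sending ctrl-a, ctrl-c, ctrl-v to the element with tag name "body").

Can you give me some advice?

I found that it works better to scrape first and translate into English afterwards. If you have a similar problem, try using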

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Load the packed Google Translate extension into Chrome
option = webdriver.ChromeOptions()
option.add_extension('./translate.crx')
# Selenium 4 takes the driver path via Service and the options via `options`
driver = webdriver.Chrome(service=Service("./chromedriver"), options=option)
# The URL needs an explicit scheme
driver.get("https://naver.com")
WebDriverWait(driver, 3).until(EC.presence_of_element_located((By.TAG_NAME, "body")))

''' @@@@ Here I want something like@@@@
driver.execute_extension("translate this page")
'''

# Print the visible text of the page body
print(driver.find_element(By.TAG_NAME, "body").text)
driver.quit()
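
The "scrape first, translate afterwards" approach mentioned above can be sketched roughly as follows. This is only a minimal illustration using the standard library, not the poster's actual code: the names `VisibleTextParser`, `extract_visible_text`, and `translate_page` are made up for this example, and `translate` is a placeholder for whatever translation function or service you choose (none is assumed here).

```python
from html.parser import HTMLParser
from typing import Callable, List


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of script/style tags."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0          # > 0 while inside a skipped tag
        self.chunks: List[str] = []   # visible text fragments, in order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_visible_text(html: str) -> List[str]:
    """Return the non-empty text fragments found in the raw HTML."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.chunks


def translate_page(html: str, translate: Callable[[str], str]) -> List[str]:
    """Scrape first, then translate each fragment.

    `translate` is a hypothetical callable (source text -> English text)
    that the caller supplies.
    """
    return [translate(chunk) for chunk in extract_visible_text(html)]
```

With Selenium, the `html` argument could be `driver.page_source`, so the browser only has to load the page and no extension needs to be triggered.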