How do I extract content from a dynamic table using Python?
Tags: javascript, python, web, screen-scraping

I am trying to extract the RSI indicator under the "Oscillators" tab on this page.
URL:
I know I first have to use a tool such as Selenium to open the tab, but how do I access the "Oscillators" div? I need Selenium for that step, and then I can use Beautiful Soup to find the right tags and data, correct?

Edit:
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from time import sleep
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pandas as pd
# create object for chrome options
chrome_options = Options()
base_url = 'https://in.tradingview.com/markets/stocks-india/market-movers-active/'
# To disable the message, "Chrome is being controlled by automated test software"
chrome_options.add_argument("disable-infobars")
# Pass the argument 1 to allow and 2 to block
chrome_options.add_experimental_option("prefs", {
    "profile.default_content_setting_values.notifications": 2
})
# invoke the webdriver
browser = webdriver.Chrome(executable_path=r'/Users/judhjitganguli/Downloads/chromedriver',
                           options=chrome_options)
browser.get('chrome://settings/')
browser.execute_script('chrome.settingsPrivate.setDefaultZoom(0.5);')
browser.get(base_url)
delay = 5 #seconds
while True:
    try:
        # find tab/button
        osiButton = browser.find_element_by_css_selector('.tv-screener-toolbar__favorites div div div:nth-child(8)')
        print('button text: ' + osiButton.text)
        osiButton.click()
        WebDriverWait(browser, 9).until(EC.text_to_be_present_in_element((By.CSS_SELECTOR, 'th:nth-child(2) .js-head-title'), "OSCILLATORS RATING"))
        # table updated, get the data
        for row in browser.find_elements_by_css_selector(".tv-data-table__tbody tr"):
            print(row.text)
        #for cell in browser.find_elements_by_css_selector('td'):
        #    print(cell.text)
    except Exception as ex:
        print(ex)
# close the automated browser
browser.close()
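In the output I get the data I need, but the script runs as an infinite loop. How do I put the data into a pandas DataFrame?

Answer:
After clicking Oscillators, use WebDriverWait to wait until the text of the element
th:nth-child(2) .js-head-title
changes from Last to Oscillators Rating.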
# If running headless, make sure to add this argument,
# otherwise the Oscillators tab will not be visible or cannot be clicked
#chrome_options.add_argument("window-size=1980,960")
try:
    # find tab/button
    osiButton = driver.find_element_by_css_selector('.tv-screener-toolbar__favorites div div div:nth-child(8)')
    print('button text: ' + osiButton.text)
    osiButton.click()
    WebDriverWait(driver, 9).until(
        EC.text_to_be_present_in_element((By.CSS_SELECTOR, 'th:nth-child(2) .js-head-title'), "OSCILLATORS RATING"))
    # table updated, get the data
    for row in driver.find_elements_by_css_selector('.tv-data-table__tbody tr'):
        print(row.text)
    #for cell in driver.find_elements_by_css_selector('td'):
    #    print(cell.text)
except Exception as ex:
    print(ex)
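
The question also asks how to get the table into pandas. The following is a minimal sketch of one way to do it, not a tested recipe for this site: collect each row as a list of cell strings instead of printing it, then build a DataFrame. The helper name rows_to_dataframe, the column names, and the sample values are all made up for illustration; the real header names have to be read from the page.

```python
import io

import pandas as pd


def rows_to_dataframe(rows, columns):
    """Build a DataFrame from a list of rows, each a list of cell strings."""
    # Drop rows whose cell count does not match the header (e.g. spacer rows).
    clean = [r for r in rows if len(r) == len(columns)]
    return pd.DataFrame(clean, columns=columns)


# With Selenium, `rows` could be gathered roughly like this (not executed here):
#   rows = [[td.text for td in tr.find_elements(By.CSS_SELECTOR, "td")]
#           for tr in driver.find_elements(By.CSS_SELECTOR, ".tv-data-table__tbody tr")]

# Made-up sample values standing in for the scraped table.
sample_rows = [
    ["AAA", "Buy", "61.2"],
    ["BBB", "Sell", "38.7"],
    ["spacer"],  # malformed row, dropped by the helper
]
columns = ["Ticker", "Oscillators Rating", "RSI"]  # hypothetical header names

df = rows_to_dataframe(sample_rows, columns)
df["RSI"] = pd.to_numeric(df["RSI"])  # scraped text arrives as strings

# Write CSV to an in-memory buffer; pass a file path instead to save to disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(df)
```

Reading the individual td cells, rather than splitting row.text, avoids guessing how the cells are separated in the row's text.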
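Comments:
So if you inspect the element, it has a div id. My bad.
@uingtea Any idea how to proceed? I tried this, but it did not work.
What is the error message?
There is no error message; it just throws a timeout exception. I click Oscillators but nothing happens. I have updated my question with the code I am using.
Answer updated. It looks like you did not click Oscillators.
Thank you so much. After a few tweaks I was able to get the data I need. (Added the code to the answer.) But could you help me with two things? 1. Why is it an infinite loop? It should stop after parsing the whole page, right? 2. How can I convert it into a file?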
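
On the infinite loop: a while True: loop only ends on break, return, or an uncaught exception. The posted loop has no break, and except Exception swallows every error, so the scrape repeats forever and browser.close() is never reached. One possible fix is to drop the loop entirely; if retries are wanted, a bounded loop that breaks on success is a common pattern. The sketch below uses a stand-in function, scrape_once, which is hypothetical and takes the place of the click, wait, and read steps.

```python
def scrape_once(attempt):
    """Stand-in for the click / wait / read steps; fails on the first attempt."""
    if attempt == 0:
        raise TimeoutError("table not updated yet")
    return ["row 1", "row 2"]


max_attempts = 3  # arbitrary limit chosen for the example
rows = []
for attempt in range(max_attempts):
    try:
        rows = scrape_once(attempt)
        break  # success: leave the loop instead of repeating forever
    except TimeoutError as ex:
        print(f"attempt {attempt + 1} failed: {ex}")
else:
    # Runs only if the loop finished without hitting break.
    print("giving up after", max_attempts, "attempts")

print(rows)
```

With Selenium, the exception to catch would typically be TimeoutException from selenium.common.exceptions, which the question's code already imports.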