How do I extract content from a dynamic table using Python? [javascript, python, web, screen-scraping]


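I am trying to extract the RSI indicator under the "Oscillators" tab of this page.

URL:

I know I first have to use something like Selenium to reach the tab, but how do I access the "oscillators" div?

I need to use Selenium, and then I can use Beautiful Soup to find the right tags and data, correct?

Edit: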

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from time import sleep
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import pandas as pd

# create object for chrome options
chrome_options = Options()
base_url = 'https://in.tradingview.com/markets/stocks-india/market-movers-active/'


# To disable the message, "Chrome is being controlled by automated test software"
chrome_options.add_argument("disable-infobars")
# Pass the argument 1 to allow and 2 to block
chrome_options.add_experimental_option("prefs", { 
    "profile.default_content_setting_values.notifications": 2
    })
# invoke the webdriver
browser = webdriver.Chrome(executable_path = r'/Users/judhjitganguli/Downloads/chromedriver',
                          options = chrome_options)

browser.get('chrome://settings/')
browser.execute_script('chrome.settingsPrivate.setDefaultZoom(0.5);')
browser.get(base_url)

delay = 5 #seconds

while True:
    try:
        # find tab/button
        osiButton = browser.find_element_by_css_selector('.tv-screener-toolbar__favorites div div div:nth-child(8)')
        print('button text: ' + osiButton.text)
        osiButton.click()
        WebDriverWait(browser, 9).until(EC.text_to_be_present_in_element((By.CSS_SELECTOR, 'th:nth-child(2) .js-head-title'), "OSCILLATORS RATING"))
  
        # table updated, get the data
        for row in browser.find_elements_by_css_selector(".tv-data-table__tbody tr"):
            print(row.text)
           
        #for cell in browser.find_elements_by_css_selector('td'):
         #   print(cell.text)

        
        
    except Exception as ex:
        print(ex)
    

# close the automated browser
browser.close()

The output contains the data I need, but the script runs in an infinite loop. How can I put the data into a pandas DataFrame?

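After clicking Oscillators, wait with WebDriverWait and monitor the element th:nth-child(2) .js-head-title until its text changes from "Last" to "OSCILLATORS RATING".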
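To make the waiting step concrete, here is a minimal sketch of the polling idea behind an explicit wait: call a condition repeatedly until it holds or a timeout expires. This is not Selenium's implementation; `wait_until` and its parameters are hypothetical names chosen for illustration, and the condition here is a plain Python callable rather than a real page check.

```python
import time


def wait_until(condition, timeout=9.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the condition is still falsy after `timeout` seconds.
    (Hypothetical helper for illustration; Selenium's WebDriverWait plays this role.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Simulated header text that changes after the tab is clicked
# (stand-in values; a real check would read the element's text from the page).
header_texts = iter(["Last", "Last", "OSCILLATORS RATING"])
header_updated = wait_until(
    lambda: next(header_texts) == "OSCILLATORS RATING", timeout=2.0, poll=0.01
)
print("header updated:", header_updated)
```

The point of waiting on the header text rather than sleeping for a fixed time is that the table rows are only read once the page has actually switched to the Oscillators view.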

# `driver` below is the Selenium WebDriver instance (named `browser` in the question).
# If running headless, make sure to add this argument,
# or the Oscillators tab will not be visible and cannot be clicked.
#chrome_options.add_argument("window-size=1980,960")

try:
  # find tab/button
  osiButton = driver.find_element_by_css_selector('.tv-screener-toolbar__favorites div div div:nth-child(8)')
  print('button text: ' + osiButton.text)
  osiButton.click()
  WebDriverWait(driver, 9).until(
      EC.text_to_be_present_in_element((By.CSS_SELECTOR, 'th:nth-child(2) .js-head-title'), "OSCILLATORS RATING"))
  
  # table updated, get the data
  for row in driver.find_elements_by_css_selector('.tv-data-table__tbody tr'):
      print(row.text)
      #for cell in driver.find_elements_by_css_selector('td'):
         #print(cell.text)

except Exception as ex:
  print(ex)
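Note that this block runs once. The version in the question wraps the same try/except in `while True:` with no `break`, so it repeats forever even after the rows have been printed, and the final `browser.close()` is never reached. If retrying on failure is actually wanted, a bounded loop is one option. The sketch below assumes the scraping steps are wrapped in a function; `scrape_with_retries` and `scrape_once` are hypothetical names, not part of Selenium.

```python
def scrape_with_retries(scrape_once, max_attempts=3):
    """Run scrape_once() until it succeeds, giving up after max_attempts failures.

    scrape_once is any callable that performs the click, the wait and the row
    extraction, and returns the extracted data (hypothetical, for illustration).
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return scrape_once()  # success: returning ends the loop immediately
        except Exception as ex:
            last_error = ex
            print(f"attempt {attempt} failed: {ex}")
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Because the function returns on the first success and raises after the last failed attempt, the loop always terminates, and any cleanup such as closing the browser can run afterwards.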

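So if you inspect the element, it has a div id.

My bad. @uingtea, any idea how to proceed? I tried this, but it did not work.

What is the error message?

There is no error message; it just throws a timeout exception. I click Oscillators, but nothing happens. I have updated my question with the code I am using.

Answer updated. It looks like you were not clicking Oscillators.

Thank you so much. After a few tweaks I was able to get the required data (added the code in the answer). Could you help me with two more things? 1. Why is it an infinite loop? It should stop after parsing the whole page, right? 2. How can I convert it into a file?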
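On the second point, one possible approach is to collect the text of each cell per row, build a pandas DataFrame from those lists, and write it out with `to_csv`. The sketch below is a hedged illustration: `table_to_dataframe` is a hypothetical helper, the sample values are made-up placeholders rather than real market data, and the column names other than "OSCILLATORS RATING" are assumptions about the table.

```python
import pandas as pd


def table_to_dataframe(headers, rows):
    """Build a DataFrame from header texts and per-row lists of cell texts.

    Rows whose cell count does not match the header count are skipped.
    (Hypothetical helper for illustration.)
    """
    width = len(headers)
    complete = [cells for cells in rows if len(cells) == width]
    return pd.DataFrame(complete, columns=headers)


# With Selenium, the inputs might be gathered roughly like this. The header
# selector is an assumption and none of this was tested on the live page:
#   headers = [th.text for th in browser.find_elements(By.CSS_SELECTOR, "th .js-head-title")]
#   rows = [[td.text for td in tr.find_elements(By.CSS_SELECTOR, "td")]
#           for tr in browser.find_elements(By.CSS_SELECTOR, ".tv-data-table__tbody tr")]

# Placeholder data standing in for the scraped table (not real values).
headers = ["TICKER", "OSCILLATORS RATING", "RSI"]
rows = [["AAA", "Buy", "61.2"], ["BBB", "Sell", "38.5"], ["partial row"]]

df = table_to_dataframe(headers, rows)
df["RSI"] = pd.to_numeric(df["RSI"], errors="coerce")  # cell text -> numbers
csv_text = df.to_csv(index=False)  # pass a file path instead to write a file
print(df)
```

Calling `df.to_csv("oscillators.csv", index=False)` (the file name is an arbitrary example) would write the same content to disk.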