Python 3.x: Unable to fetch data using Selenium (Python)


I am trying to fetch historical data in table form from a website.

My code is below:

url = 'https://www1.nseindia.com/products/content/derivatives/currency/historical_contract_cd.htm'
driver.get(url)
driver.implicitly_wait(6)
inst = 'FUTCUR'
symbol = 'USDINR'
contYear = '2020'
expiry = '270520'
contract = symbol + ' ' + expiry
startDate = '12-04-2020'
endDate = '11-05-2020'

instSelect = Select(driver.find_element_by_id('instrument')).select_by_value(inst)
symbolSelect = Select(driver.find_element_by_id('symbol')).select_by_value(symbol)
yearSelect = Select(driver.find_element_by_id('contractYear')).select_by_value(contYear)
contractSelect = Select(driver.find_element_by_id('contract')).select_by_value(contract)

# optionTypeSelect = Select(driver.find_element_by_id('contract')).select_by_value(opType)
# strikeSelect = Select(driver.find_element_by_id('contract')).select_by_value(strike)
driver.find_element_by_xpath("//input[@class='textboxdata hasDatepicker' and @id='fromDt']").send_keys(startDate)
driver.find_element_by_xpath("//input[@class='textboxdata hasDatepicker' and @id='toDt']").send_keys(endDate + "\n")
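In the last line, I try to click the Get Data button by sending a newline, but I don't get any table.

Can anyone suggest how to do this?

Alternatively, can you suggest a better way to get this historical data? I am new to Selenium, so there may be a better way to get this data.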


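It looks like the application is detecting that the browser is a bot. However, if you open the Network tab after manually clicking Get Data, you will find the URL that the page requests; it is the one used in the code below. You can navigate to that link directly.

However, for any other search query, you can parameterize the URL.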
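As a minimal sketch of that parameterization, the query string can be assembled with the standard library's urllib.parse.urlencode rather than by string formatting. The helper name build_history_url is hypothetical; the endpoint and parameter names are copied from the URL in this answer's code, and treating time as a millisecond timestamp is an assumption based on the sample value.

```python
import time
from urllib.parse import urlencode, quote

BASE_URL = ("https://www1.nseindia.com/live_market/dynaContent/"
            "live_watch/fx_tracker/FxTradeHistoryNew.jsp")


def build_history_url(inst, symbol, expiry, start_date, end_date, timestamp_ms=None):
    """Hypothetical helper: assemble the query URL from the search fields."""
    if timestamp_ms is None:
        # Assumption: `time` is a millisecond timestamp (it looks like one).
        timestamp_ms = int(time.time() * 1000)
    params = {
        "contract": f"{symbol} {expiry}",
        "symbol": symbol,
        "fromDt": start_date,
        "toDt": end_date,
        "instrument": inst,
        "strikePrice": "select",
        "optionType": "select",
        "time": timestamp_ms,
    }
    # quote_via=quote encodes the space as %20, matching the observed URL.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = build_history_url("FUTCUR", "USDINR", "270520", "12-04-2020", "11-05-2020",
                        timestamp_ms=1589457333448)
print(url)
```

With the fixed timestamp shown, this should reproduce the URL used in the answer's code; changing symbol, expiry, or the dates gives the URL for another query.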

Code

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium import webdriver

inst = 'FUTCUR'
symbol = 'USDINR'
expiry = '270520'
startDate = '12-04-2020'
endDate = '11-05-2020'

url="https://www1.nseindia.com/live_market/dynaContent/live_watch/fx_tracker/FxTradeHistoryNew.jsp?contract={0}%20{1}&symbol={0}&fromDt={2}&toDt={3}&instrument={4}&strikePrice=select&optionType=select&time=1589457333448".format(symbol,expiry,startDate,endDate,inst)

driver=webdriver.Chrome()
driver.get(url)
rows=WebDriverWait(driver,10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,"table>tbody tr")))
for row in rows[2:]:
    print([td.text for td in row.find_elements(By.XPATH, ".//td")])
Output: the table cell data, one list per row

['11-May-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '75.7150', '75.9575', '75.6100', '75.9350', '75.9375', '22,73,052', '1442585', '10,93,914.24']
['08-May-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '75.7500', '75.8525', '75.3000', '75.7350', '75.7350', '23,32,272', '2082334', '15,75,416.91']
['06-May-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '75.9000', '76.1000', '75.7825', '76.0325', '76.0325', '22,55,414', '1610773', '12,23,210.60']
['05-May-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '75.7500', '76.0700', '75.6525', '75.9425', '75.9425', '23,31,177', '1640933', '12,44,886.32']
['04-May-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '75.7525', '76.0200', '75.7525', '75.8475', '75.8475', '22,96,509', '1676336', '12,72,472.35']
['30-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '75.5500', '75.5500', '75.0825', '75.2750', '75.2750', '23,18,407', '2543021', '19,14,672.14']
['29-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '76.1575', '76.3000', '75.8075', '75.9250', '75.9225', '25,65,081', '1882266', '14,30,376.21']
['28-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '76.4975', '76.7400', '76.0650', '76.1200', '76.1200', '26,39,745', '2084993', '15,94,244.06']
['27-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '76.6000', '76.7125', '76.4050', '76.5375', '76.5375', '25,90,069', '1356043', '10,37,735.82']
['24-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '76.6000', '76.8775', '76.4500', '76.6800', '76.6800', '21,89,351', '800447', '6,14,078.45']
['23-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '76.9500', '76.9500', '76.3125', '76.5125', '76.5125', '19,56,117', '698749', '5,34,773.33']
['22-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '77.4725', '77.4725', '76.9050', '76.9550', '76.9550', '19,09,133', '448019', '3,45,608.92']
['21-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '77.0975', '77.4950', '77.0100', '77.4625', '77.4600', '18,23,316', '517761', '4,00,170.48']
['20-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '76.7425', '77.1300', '76.7400', '77.0325', '77.0325', '16,47,582', '410349', '3,15,927.53']
['17-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '77.1375', '77.1950', '76.7400', '76.8950', '76.8950', '15,40,884', '395013', '3,03,905.97']
['16-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '77.2000', '77.5000', '76.9150', '77.4125', '77.4125', '14,75,982', '402968', '3,11,477.92']
['15-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '76.6525', '77.1225', '76.4975', '77.0950', '77.0950', '13,22,890', '425692', '3,26,801.09']
['13-Apr-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '77.0925', '77.2000', '76.7000', '76.8025', '76.8050', '11,73,618', '323331', '2,48,706.21']
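Every cell in the output above is a string. If typed values are needed, a small conversion step along the following lines might help. This is only a sketch: parse_cell and parse_row are hypothetical helper names, and it assumes that the first cell is always a date like 11-May-2020, that "-" means "no value", and that commas in numbers are digit-group separators.

```python
from datetime import datetime


def parse_cell(text):
    """Convert one cell: '-' -> None, numeric text (commas allowed) -> float."""
    if text == "-":
        return None
    try:
        return float(text.replace(",", ""))
    except ValueError:
        # Not numeric (e.g. 'FUTCUR'); keep the original string.
        return text


def parse_row(cells):
    """Parse the date in the first cell and convert the remaining cells."""
    # %b matches abbreviated month names in the current locale (English assumed).
    trade_date = datetime.strptime(cells[0], "%d-%b-%Y").date()
    return [trade_date] + [parse_cell(c) for c in cells[1:]]


# First row of the output above.
row = ['11-May-2020', 'FUTCUR', 'USDINR 270520', '-', '-', '75.7150', '75.9575',
       '75.6100', '75.9350', '75.9375', '22,73,052', '1442585', '10,93,914.24']
parsed = parse_row(row)
print(parsed)
```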

To import it into pandas, you can first get the HTML and then use pd.read_html() to load it into a DataFrame.

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium import webdriver
import pandas as pd
inst = 'FUTCUR'
symbol = 'USDINR'
expiry = '270520'
startDate = '12-04-2020'
endDate = '11-05-2020'

url="https://www1.nseindia.com/live_market/dynaContent/live_watch/fx_tracker/FxTradeHistoryNew.jsp?contract={0}%20{1}&symbol={0}&fromDt={2}&toDt={3}&instrument={4}&strikePrice=select&optionType=select&time=1589457333448".format(symbol,expiry,startDate,endDate,inst)

driver=webdriver.Chrome()
driver.get(url)
WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"#csvContentDiv+table")))
htmldata=driver.page_source
df=pd.read_html((str(htmldata)))[0]
print(df)
# Export to CSV
df.to_csv("testdata.csv",index=False)

Thanks a lot, but I need it in the form of a DataFrame, with all the column names etc. I am working on it; please help with suggestions if you have any.