Python: looping over URLs with Selenium WebDriver
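The request below finds the contest ids for the current day. I am trying to pass each id, as a str, into the driver.get URL so that it visits each contest URL and downloads each contest's CSV. I imagine this needs a loop, but I am not sure what that would look like with WebDriver.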
import time
from selenium import webdriver
import requests
import datetime

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA')
data = req.json()
for ids in data:
    contest = ids['id']

driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby')
time.sleep(2)  # Let DK Load!
search_box = driver.find_element_by_name('username')
search_box.send_keys('username')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('password')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2)  # Let Page Load, If not it will go to Account!
driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '')
Does this help?

for ids in data:
    contest = ids['id']
    driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest))
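Maybe it is time to break this down a little.
Create a few separate functions:
0. (optional) Authorize against the target URL.
1. Collect all the ids you need (the first part of your code).
2. Export the CSV for one specific id (the second part of your code).
3. Loop over the list of ids and call function #2 for each id.
Pass the shared chromedriver to each function as an input argument, so that the driver state and the authentication cookies are preserved.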
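The breakdown above might look like the following minimal sketch. The function names (collect_contest_ids, export_url, export_csv, export_all) are hypothetical, the optional login step is left out, and driver is assumed to be any object with a get(url) method, such as an already logged-in webdriver.Chrome().

```python
EXPORT_URL = 'https://www.draftkings.com/contest/exportfullstandingscsv/'


def collect_contest_ids(data):
    """Step 1: pull the id out of every contest dict in the lobby JSON."""
    return [str(item['id']) for item in data]


def export_url(contest_id):
    """Build the CSV export URL for one contest id."""
    return EXPORT_URL + str(contest_id)


def export_csv(driver, contest_id):
    """Step 2: point the shared driver at the export URL for one id."""
    driver.get(export_url(contest_id))


def export_all(driver, contest_ids):
    """Step 3: reuse the same driver for every id so its cookies persist."""
    for contest_id in contest_ids:
        export_csv(driver, contest_id)
```

With a real browser session this would be called as export_all(driver, collect_contest_ids(data)), where data is the parsed JSON from the lobby request and driver has already logged in.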
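This works well and keeps the code clear and readable.

I think you can set the contest URL as the href of an <a> element on the landing page and then click it. Then repeat the step with the other ids.
See the code below.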
import time

import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA')
data = req.json()
contests = []
for ids in data:
    contests.append(ids['id'])

driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby')
time.sleep(2)  # Let DK Load!
search_box = driver.find_element(By.NAME, 'username')
search_box.send_keys('username')
search_box2 = driver.find_element(By.NAME, 'password')
search_box2.send_keys('password')
submit_button = driver.find_element(By.XPATH, '//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2)  # Let Page Load, If not it will go to Account!

for contest_id in contests:
    # Take the first link on the page, rewrite its target, then click it.
    element = driver.find_element(By.CSS_SELECTOR, 'a')
    script1 = "arguments[0].setAttribute('download',arguments[1]);"
    driver.execute_script(script1, element, str(contest_id) + '.csv')  # the export is a CSV file
    script2 = "arguments[0].setAttribute('href',arguments[1]);"
    driver.execute_script(script2, element, 'https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest_id))
    time.sleep(1)
    element.click()
    time.sleep(3)
Please try it in this order:
import time
import datetime

import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA')
data = req.json()

# Log in once, before the loop.
driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby')
time.sleep(2)  # Let DK Load!
search_box = driver.find_element(By.NAME, 'username')
search_box.send_keys('Pr0c3ss')
search_box2 = driver.find_element(By.NAME, 'password')
search_box2.send_keys('generic1!')
submit_button = driver.find_element(By.XPATH, '//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2)  # Let Page Load, If not it will go to Account!

# Visit the export URL of every contest with the same logged-in driver.
for ids in data:
    contest = ids['id']
    driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest))
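You do not need to load Selenium x times to download x files. Requests and Selenium can share cookies. That means you can log in to the site with Selenium, retrieve the login details, and share them with requests or any other application. Take a moment to look at httpie as well; it lets you control the session manually, much as requests does.
For requests, see:
For selenium, see:
Look at the Webdriver block: there you can add a proxy and load the browser either headless or live. Just comment out the headless line and the browser loads live, which makes debugging easy and makes it easier to follow movements and changes in the site's API/HTML.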
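As a minimal sketch of the cookie hand-off: Selenium's driver.get_cookies() returns a list of dicts that carry at least a name and a value key, and requests accepts a plain name-to-value dict through its cookies argument. The helper name cookies_for_requests and the sample cookie values are made up for illustration.

```python
def cookies_for_requests(selenium_cookies):
    """Flatten Selenium's list of cookie dicts into a {name: value} dict."""
    return {cookie['name']: cookie['value'] for cookie in selenium_cookies}


# Sample data shaped like the output of driver.get_cookies(); the values are invented.
sample = [
    {'name': 'session', 'value': 'abc123', 'domain': '.example.com', 'path': '/'},
    {'name': 'lang', 'value': 'en', 'domain': '.example.com', 'path': '/'},
]
jar = cookies_for_requests(sample)
print(jar)
```

The resulting dict can then be passed along with a request, for example requests.get(url, cookies=jar), so the download runs with the login that the browser obtained.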
import shutil

import requests
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

LOGIN = 'https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby'
BASE_URL = 'https://www.draftkings.com/contest/exportfullstandingscsv/'
USER = ''
PASS = ''

try:
    data = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA').json()
except (requests.RequestException, ValueError) as e:
    print(e)
    raise SystemExit(1)

ids = [str(item['id']) for item in data]

# Webdriver block
options = webdriver.ChromeOptions()
options.add_argument('--headless')  # comment this line out to watch the browser live
options.add_argument('--window-size=800,600')
# options.add_argument('--proxy-server=IP:PORT')
# options.add_argument('--user-agent=' + USER_AGENT)
driver = webdriver.Chrome(options=options)

try:
    driver.get(LOGIN)
    driver.implicitly_wait(2)  # element lookups now retry for up to 2 s; this is not a sleep
except WebDriverException:
    driver.quit()
    raise SystemExit(1)


def login(user, password):
    '''
    Login to draftkings.
    Retrieve authentication/authorization.
    http://selenium-python.readthedocs.io/waits.html#implicit-waits
    http://selenium-python.readthedocs.io/api.html#module-selenium.common.exceptions
    '''
    search_box = driver.find_element(By.NAME, 'username')
    search_box.send_keys(user)
    search_box2 = driver.find_element(By.NAME, 'password')
    search_box2.send_keys(password)
    submit_button = driver.find_element(By.XPATH, '//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
    submit_button.click()
    driver.implicitly_wait(2)
    cookies = driver.get_cookies()
    return cookies


site_cookies = login(USER, PASS)


def get_csv_files(contest_id):
    '''
    Download the CSV file for one contest id.
    '''
    session = requests.Session()
    # Copy the browser's cookies into the requests session.
    for cookie in site_cookies:
        session.cookies.set(cookie['name'], cookie['value'])
    try:
        response = session.get(BASE_URL + contest_id, stream=True)
        response.raw.decode_content = True  # undo gzip/deflate before copying the raw stream
        with open(contest_id + '.csv', 'wb') as f:
            shutil.copyfileobj(response.raw, f)
    except (requests.RequestException, OSError):
        return


# map() is lazy in Python 3, so loop explicitly to trigger the downloads.
for contest_id in ids:
    get_csv_files(contest_id)
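What happens?
You can use a dataprovider in the TestNG framework.
So are these IDs returned in the URL? Is that the sport=NBA bit?
It goes through and grabs all the contest IDs that are currently live. With those ids, I want it to go through id by id, enter each contest, and export the CSV.
@MichaelTJohnson can you show the HTML of the landing page after login? I have a solution, but I need the HTML.
I can't really copy all of it. You can log in and take a look if you like. Username Pr0c3ss, pw: generic1!
Looks like it works; I will have to check once the contests start. Nice work!
Don't use static sleeps.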
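On that last point: Selenium provides explicit waits (WebDriverWait in selenium.webdriver.support.ui) that poll for a condition instead of sleeping for a fixed time. The polling idea itself can be sketched in plain Python. The wait_until helper below is hypothetical and is not part of Selenium.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %.1f s' % timeout)
        time.sleep(poll)
```

With a driver this could replace time.sleep(2) by something like wait_until(lambda: driver.find_elements(By.NAME, 'username')), which returns as soon as the login form is present rather than always waiting the full two seconds.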