Python: looping over URLs with Selenium WebDriver

Tags: python, selenium-webdriver

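The request below finds the contest ids for the current day. I am trying to pass that str into the driver.get URL so that the driver goes to each contest URL and downloads each contest's CSV. I assume this needs a loop, but I'm not sure what that would look like with the webdriver.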

import time
from selenium import webdriver
import requests
import datetime

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA') 
data = req.json()

for ids in data:
    contest = ids['id']

driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby');
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('username')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('password')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!


driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '') 
Does this help?

for ids in data:
    contest = ids['id']
    driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '') 
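The point of the snippet above is that driver.get has to sit inside the loop; a loop that only assigns contest leaves just the last id behind once it finishes. A minimal sketch of the URL-building step on its own, assuming the lobby response is a list of dicts with an 'id' key as in the question (the helper name contest_urls and the sample ids are made up for illustration):

```python
BASE_URL = 'https://www.draftkings.com/contest/exportfullstandingscsv/'


def contest_urls(data):
    """Return one CSV export URL per contest entry in the lobby JSON."""
    return [BASE_URL + str(item['id']) for item in data]


# Hypothetical sample, shaped like the list the question iterates over.
sample = [{'id': 101}, {'id': 202}]

for url in contest_urls(sample):
    # With a logged-in Selenium driver this line would be driver.get(url).
    print(url)
```

Building the list first also makes it easy to check how many contests were found before the browser is started.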

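Maybe it's time to break it down a bit.
Create a few separate functions:
0. (Optional) Authorize against the target URL.
1. Collect all the needed ids (the first part of your code).
2. Export the CSV for a specific id (the second part of your code).
3. Loop over the list of ids and call function #2 for each id.

Pass the same chromedriver instance into each function as an argument, so the driver state and the authentication cookies are preserved.

This works well and keeps the code clear and readable.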
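The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the function names are invented here, and RecordingDriver is a stand-in so the sketch runs without a browser. In real use the driver argument would be the logged-in webdriver.Chrome() instance.

```python
BASE_URL = 'https://www.draftkings.com/contest/exportfullstandingscsv/'


def collect_ids(data):
    """Step 1: pull every contest id out of the lobby JSON."""
    return [item['id'] for item in data]


def export_csv(driver, contest_id):
    """Step 2: open the export URL for one contest in the shared driver."""
    driver.get(BASE_URL + str(contest_id))


def export_all(driver, contest_ids):
    """Step 3: call step 2 once per id, reusing the same driver each time."""
    for contest_id in contest_ids:
        export_csv(driver, contest_id)


class RecordingDriver:
    """Stand-in for a Selenium driver that only records visited URLs."""

    def __init__(self):
        self.visited = []

    def get(self, url):
        self.visited.append(url)


driver = RecordingDriver()
export_all(driver, collect_ids([{'id': 101}, {'id': 202}]))
print(driver.visited)
```

Because every function receives the driver instead of creating its own, the login performed in step 0 carries over to every export.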

I think you can set the contest URL on an a element on the landing page and then click it. Then repeat that step with the other ids.

See the code below.

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA') 
data = req.json()
contests = []

for ids in data:
    contests.append(ids['id'])

driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby');
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('username')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('password')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!

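for contest_id in contests:
    # Reuse the first link on the page as a download anchor.
    element = driver.find_element_by_css_selector('a')
    # Name the downloaded file after the contest id.
    script1 = "arguments[0].setAttribute('download',arguments[1]);"
    driver.execute_script(script1, element, str(contest_id) + '.csv')
    # Point the link at the CSV export URL for this contest.
    script2 = "arguments[0].setAttribute('href',arguments[1]);"
    driver.execute_script(script2, element, 'https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest_id))
    time.sleep(1)
    element.click()
    time.sleep(3)  # Give the download time to start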

Try it in this order:

import time
from selenium import webdriver
import requests
import datetime

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA')
data = req.json()



driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby')
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('Pr0c3ss')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('generic1!')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!

for ids in data:
    contest = ids['id']
    driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '')

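You don't need to load a page in Selenium x times to download x files. Requests and Selenium can share cookies. This means you can log in to the site with Selenium, retrieve the login details, and share them with requests or any other application. Take a moment to look at httpie as well, which lets you control sessions manually much as requests does.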
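A minimal sketch of the hand-over, assuming the cookies come from driver.get_cookies(), which returns a list of dicts with 'name' and 'value' keys. The helper name cookies_to_dict and the sample values are made up for illustration:

```python
def cookies_to_dict(selenium_cookies):
    """Flatten Selenium's list of cookie dicts into a {name: value} mapping."""
    return {cookie['name']: cookie['value'] for cookie in selenium_cookies}


# Hypothetical sample in the shape driver.get_cookies() returns.
sample = [
    {'name': 'session', 'value': 'abc123', 'domain': '.example.com'},
    {'name': 'lang', 'value': 'en', 'domain': '.example.com'},
]

cookie_dict = cookies_to_dict(sample)
print(cookie_dict)
```

With requests installed, the resulting mapping can be loaded into a session with session.cookies.update(cookie_dict), after which session.get(...) sends the same cookies the browser received at login.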


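Look at the Webdriver block: you can add a proxy and load the browser either headless or live. Just comment out the headless line and the browser loads live, which makes debugging easier and makes it simpler to follow navigation and changes to the site's API/HTML.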

import time
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
import requests
import datetime
import shutil



LOGIN = 'https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby'
BASE_URL = 'https://www.draftkings.com/contest/exportfullstandingscsv/'
USER = ''
PASS = ''

try:
    data = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA').json()
except BaseException as e:
    print(e)
    exit()


ids = [str(item['id']) for item in data]

# Webdriver block
options = webdriver.ChromeOptions()
options.add_argument('headless')  # comment this line out to watch the browser live
options.add_argument('window-size=800,600')
# options.add_argument('--proxy-server= IP:PORT')
# options.add_argument('--user-agent=' + USER_AGENT)
driver = webdriver.Chrome(options=options)

try:
    driver.get(LOGIN)
    driver.implicitly_wait(2)
except WebDriverException:
    exit()

def login(USER, PASS):
    '''
    Login to draftkings.
    Retrieve authentication/authorization.

    http://selenium-python.readthedocs.io/waits.html#implicit-waits
    http://selenium-python.readthedocs.io/api.html#module-selenium.common.exceptions

    '''

    search_box = driver.find_element_by_name('username')
    search_box.send_keys(USER)

    search_box2 = driver.find_element_by_name('password')
    search_box2.send_keys(PASS)

    submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
    submit_button.click()

    driver.implicitly_wait(2)

    cookies = driver.get_cookies()
    return cookies


site_cookies = login(USER, PASS)

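def get_csv_files(id):
    '''
    get each id and download the file.
    '''

    session = requests.Session()

    # Copy the Selenium login cookies into the requests session.
    for cookie in site_cookies:
        session.cookies.set(cookie['name'], cookie['value'])

    try:
        # stream=True keeps the raw body available for copyfileobj.
        _data = session.get(BASE_URL + id, stream=True)
        _data.raw.decode_content = True  # undo gzip/deflate content encoding
        with open(id + '.csv', 'wb') as f:
            shutil.copyfileobj(_data.raw, f)
    except BaseException:
        return


# map() is lazy in Python 3, so loop explicitly to trigger the downloads.
for contest_id in ids:
    get_csv_files(contest_id)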

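Comments:

What happens? You could use a dataprovider in the testNG framework. So are these ids returned in the URL? Is that the sport=nba bit?

It goes through and gets all the contest ids that are currently in progress. With those ids, I want it to go to each contest by id and export the CSV.

@MichaelTJohnson can you show the HTML of the landing page after login? I have a solution, but I need the HTML.

I can't really copy all of it. You can log in and take a look if you like. Username Pr0c3ss, pw: generic1!

It looks like it works; I'll have to check once the contests start. Good job!

Don't use static sleeps.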
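On the last comment: the usual alternative to a fixed time.sleep is to poll for a condition and continue as soon as it holds. Selenium ships WebDriverWait (in selenium.webdriver.support.ui) for this purpose. The sketch below is not that class but a library-free illustration of the same idea; the names wait_until and ready are assumptions made up for the example.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %.1f s' % timeout)
        time.sleep(poll)


# Hypothetical condition that becomes true on the third check,
# standing in for "the login form has finished loading".
attempts = []


def ready():
    attempts.append(1)
    return len(attempts) >= 3


wait_until(ready, timeout=5, poll=0.01)
print(len(attempts))
```

Unlike a static sleep, this returns as soon as the page is ready and fails loudly when it never becomes ready.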