Python Selenium - Downloading Files

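I am trying to run a script that goes to the NASDAQ website and downloads the last 18 months of stock data for a list of companies. After running the script below, Firefox only opens the page with the company information and the download button, but nothing gets downloaded.

Why?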

import os
import time

import pandas as pd
from selenium import webdriver


def pull_nasdaq_data(tickers, save_path, rm_path):
    # Written against the Selenium 3 API (FirefoxProfile, find_element_by_*).
    # rm_path is assumed to be the full path of the downloaded
    # HistoricalQuotes.csv inside the folder set in browser.download.dir.

    # To prevent the download dialog box in Selenium
    profile = webdriver.FirefoxProfile()
    profile.set_preference('browser.download.folderList', 2)  # custom location
    profile.set_preference('browser.download.manager.showWhenStarting', False)
    profile.set_preference('browser.download.dir', r'C:\Users\Filippo Sebastio\Desktop\Stock')
    profile.set_preference('browser.helperApps.neverAsk.saveToDisk', "text/plain, application/vnd.ms-excel, text/csv, application/csv, text/comma-separated-values, application/download, application/octet-stream, binary/octet-stream, application/binary, application/x-unknown")

    # Set up the WebDriver. The profile has to be passed in here,
    # otherwise the download preferences above are never applied.
    driver = webdriver.Firefox(firefox_profile=profile,
                               executable_path=r'C:\Users\Filippo Sebastio\Desktop\geckodriver.exe')

    popup = True  # Will there be a popup?

    for ticker in tickers:
        # Get the stock's page
        site = 'http://www.nasdaq.com/symbol/' + ticker + '/historical'
        driver.get(site)
        # Choose 18 months of data from the drop-down
        data_range = driver.find_element_by_name('ddlTimeFrame')
        for option in data_range.find_elements_by_tag_name('option'):
            if option.text == '18 Months':
                option.click()
                break
        time.sleep(10)

        # Click to download the data
        driver.find_element_by_id('lnkDownLoad').click()

        # Open the downloaded file
        time.sleep(25)  # Wait for the file to download
        data = pd.read_csv(rm_path)

        # Rename and save the file in the desired location
        file_loc = os.path.join(save_path, ticker + '.csv')
        data.to_csv(file_loc, index=False)

        # Delete the downloaded file (the original called os.remove on an
        # undefined name, removal_path, which stops the loop after the first ticker)
        os.remove(rm_path)

        print("Downloaded:  ", ticker)

        # Wait before loading the next page
        time.sleep(20)

    driver.quit()


tickers = ['tesla', 'mmm']
save_path = r'<my path to where I want the documents downloaded>'
rm_path = r'<my Download path>'

pull_nasdaq_data(tickers, save_path, rm_path)

In my experience, Selenium does not download files straight into your standard user Downloads folder (I am guessing you are on Windows). The way I work around this is to handle the file directly. For example, when a site offers a CSV download, I read the CSV directly into Pandas. For .TXT files, I read them directly by setting the file path to the remote location.

OK, thanks! Could you point me to some documentation or example scripts?

Great! Where exactly are you stuck?

Basically, Selenium opens the correct Firefox page, with the company information and the time range I want to download, but it does not download the data. Also, it only opens the page for the first company in the list, not the second.
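
The fixed `time.sleep(25)` pause in the script either wastes time or, on a slow connection, is too short. One possible alternative is to poll the download folder until the file is there. The sketch below is only an illustration: `wait_for_download` and its parameters are hypothetical names, not part of Selenium, and it assumes Firefox keeps a `<name>.part` file next to the target while the download is still in progress.

```python
import os
import time


def wait_for_download(path, timeout=60.0, poll_interval=0.5):
    """Wait until `path` exists and no `<path>.part` file is left.

    Returns True if the finished file shows up within `timeout` seconds,
    otherwise False. (Hypothetical helper, not a Selenium API.)
    """
    deadline = time.monotonic() + timeout
    while True:
        # Assumption: an in-progress Firefox download leaves "<name>.part"
        # beside the target file and removes it once the download is done.
        if os.path.isfile(path) and not os.path.exists(path + ".part"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

In the loop, the `time.sleep(25)` line could then become something like `wait_for_download(path_to_csv, timeout=60)`, where `path_to_csv` stands for the full path of the expected HistoricalQuotes.csv, and the script could skip the ticker or raise an error when the call returns False.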