Python: pulling Yahoo Finance data with BeautifulSoup inside a loop

python web-scraping

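I'm learning to scrape websites with BeautifulSoup and am trying to fetch data from Yahoo Finance. The code successfully fetches what I want when I'm not in a for loop (searching for one specific ticker), but as soon as I use a csv file to search for multiple tickers, the .find() method raises an error instead of returning the tag I'm looking for.

Here is the code when it runs fine: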

    ```
    import requests
    import csv
    from bs4 import BeautifulSoup

    # ------ FOR LOOP THAT MESSES THINGS UP ------
    # with open('s&p500_tickers.csv', 'r') as tickers:
    #     for ticker in tickers:


    ticker = 'AAPL'  # ------ TEMPORARY TICKER TO TEST CODE

    web = requests.get(f'https://ca.finance.yahoo.com/quote/{ticker}/financials?p={ticker}').text
    soup = BeautifulSoup(web, 'lxml')
    section = soup.find('section', class_='smartphone_Px(20px) Mb(30px)')
    tbl = section.find('div', class_='M(0) Whs(n) BdEnd Bdc($seperatorColor) D(itb)')
    headerRow = tbl.find("div", class_="D(tbr) C($primaryColor)")

    # ------ CODE I USED TO VISUALIZE THE RESULT ------
    breakdownHead = headerRow.text[0:9]
    ttmHead = headerRow.text[9:12]
    lastYear = headerRow.text[12:22]
    twoYears = headerRow.text[22:32]
    threeYears = headerRow.text[32:42]
    fourYears = headerRow.text[42:52]

    print(breakdownHead, ttmHead, lastYear, twoYears, threeYears, fourYears)

    ```

Here is the code that does not work:

    ```
    import requests
    import csv
    from bs4 import BeautifulSoup

    with open('s&p500_tickers.csv', 'r') as tickers:
        for ticker in tickers:

            web = requests.get(f'https://ca.finance.yahoo.com/quote/{ticker}/financials?p={ticker}').text
            soup = BeautifulSoup(web, 'lxml')
            section = soup.find('section', class_='smartphone_Px(20px) Mb(30px)')
            tbl = section.find('div', class_='M(0) Whs(n) BdEnd Bdc($seperatorColor) D(itb)')
            headerRow = tbl.find("div", class_="D(tbr) C($primaryColor)")

            breakdownHead = headerRow.text[0:9]
            ttmHead = headerRow.text[9:12]
            lastYear = headerRow.text[12:22]
            twoYears = headerRow.text[22:32]
            threeYears = headerRow.text[32:42]
            fourYears = headerRow.text[42:52]

            print(breakdownHead, ttmHead, lastYear, twoYears, threeYears, fourYears)
    ```

I welcome any feedback on my code, as I'm always trying to get better.

Thanks a lot.

So I have solved the problem.

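I realized that the `csv` module's `.writerow()` method had added a `'\n'` to the end of each string (for example: `'MMM\n'`). Somehow that newline kept the `.find()` method from working inside the for loop. (I still don't know why.)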
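A likely explanation: iterating over a file yields each line with its line ending still attached, so the `'\n'` ends up inside the URL passed to `requests.get`. The page that comes back presumably does not contain the expected section, `soup.find(...)` then returns `None`, and calling `.find()` on `None` raises an `AttributeError`. The sketch below only shows the first half of that chain, without touching the network; the `io.StringIO` object and its contents are a hypothetical stand-in for the real ticker file, and `build_url` is a helper name introduced here for illustration.

```python
import io

# Hypothetical stand-in for the opened ticker file: one symbol per line.
fake_file = io.StringIO("MMM\nAAPL\n")


def build_url(ticker):
    # Same URL pattern as in the question.
    return f"https://ca.finance.yahoo.com/quote/{ticker}/financials?p={ticker}"


# Iterating a file object keeps the trailing newline on every line.
raw_lines = list(fake_file)
print(repr(raw_lines[0]))  # 'MMM\n'

bad_url = build_url(raw_lines[0])           # newline is embedded in the URL
good_url = build_url(raw_lines[0].strip())  # strip() removes it

print(repr(bad_url))
print(repr(good_url))
```

Printing with `repr()` makes the stray `\n` visible, which is a quick way to check what a loop variable really contains.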

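After that, it worked for the first line, but because the file also contained blank lines, I had to make Python skip them with an if statement.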
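One possible source of those blank lines, assuming the ticker file was produced with `csv.writer` (the code that wrote it is not shown): the writer ends each row with `'\r\n'` by default, and the `csv` documentation recommends opening the file with `newline=''` so that no extra `'\r'` is added on platforms that translate line endings. The sketch below uses in-memory buffers with made-up content, and `read_tickers` is a hypothetical helper; it reads the symbols back with `csv.reader`, which returns an empty list for a blank line, so empty rows are easy to skip.

```python
import csv
import io

# In-memory stand-in for the ticker file (hypothetical content).
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
for symbol in ["MMM", "AAPL"]:
    writer.writerow([symbol])

written = buffer.getvalue()
print(repr(written))  # each row ends with the writer's default '\r\n'


def read_tickers(text):
    """Return the first column of every non-empty row."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return [row[0].strip() for row in reader if row and row[0].strip()]


# A file with a blank line between the two symbols.
tickers = read_tickers("MMM\r\n\r\nAAPL\r\n")
print(tickers)
```

With a real file, the equivalent would be `open(path, newline='')` for both writing and reading.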

I replaced the `'\n'` with an empty string and it worked. Here's what it looks like:

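    ```
    import requests
    from bs4 import BeautifulSoup

    with open('s&p500_tickers.csv', 'r') as tickers:
        for ticker in tickers.readlines():
            # Drop the trailing newline so it does not end up in the URL
            ticker = ticker.replace('\n', '')
            if ticker == '':
                # Skip blank lines in the file
                pass
            else:
                web = requests.get(f'https://ca.finance.yahoo.com/quote/{ticker}/financials?p={ticker}').text
                soup = BeautifulSoup(web, 'lxml')
                headerRow = soup.find("div", class_="D(tbr) C($primaryColor)")
    ```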
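Two things in this approach may still be fragile: `soup.find()` returns `None` whenever the tag is missing, and slicing `headerRow.text` at fixed character positions breaks if a column label changes length. A minimal sketch of a more defensive version follows. The sample HTML is invented for illustration and only reuses the class string from the code above; the real Yahoo markup may differ or change. `extract_header_cells` is a hypothetical helper, and the built-in `html.parser` is used here only so the example runs without `lxml`.

```python
from bs4 import BeautifulSoup

# Class string taken from the code above; the live page may use something else.
HEADER_CLASS = "D(tbr) C($primaryColor)"


def extract_header_cells(html):
    """Return the header row's cell texts, or None if the row is missing."""
    soup = BeautifulSoup(html, "html.parser")
    header_row = soup.find("div", class_=HEADER_CLASS)
    if header_row is None:
        # find() found nothing; let the caller decide what to do.
        return None
    # One string per cell, whitespace stripped, no fixed offsets needed.
    return list(header_row.stripped_strings)


# Invented sample markup, not copied from Yahoo Finance.
sample_html = """
<div class="D(tbr) C($primaryColor)">
  <div><span>Breakdown</span></div>
  <div><span>ttm</span></div>
  <div><span>2019-09-30</span></div>
</div>
"""

print(extract_header_cells(sample_html))
print(extract_header_cells("<html><body></body></html>"))
```

In the loop, a `None` result could be logged together with the ticker and skipped with `continue`, so one bad symbol does not stop the whole run.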

If any of you can find a better way, I would love to get some feedback.

I'm new to programming and would really like to know what I did wrong.