Parsing Yahoo Finance in Python: HTTPError 502

python parsing

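I have the following code that parses Yahoo Finance to support my work with the S&P 500. It stops with the error "HTTP Error 502: Server Hangup" after only about 20 stocks. Does anyone know a better way to parse Yahoo Finance, or a way to work around this problem?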

try:
    for stock in sp500:

        save_path = location+'\\_KeyStats\\'+stock
        name_of_file = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        completeName = os.path.join(save_path, name_of_file+".html")         
        file1 = open(completeName, "w")
        keyStat = urllib2.urlopen('https://au.finance.yahoo.com/q/ks?s='+stock).read()
        file1.write(keyStat)
        file1.close()

        #income Statement 
        save_path = location+'\\_AnnualEarnings\\'+stock
        name_of_file = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        completeName = os.path.join(save_path, name_of_file+".html")         
        file1 = open(completeName, "w")
        incomeState = urllib2.urlopen('https://au.finance.yahoo.com/q/is?s='+stock+'&annual').read()
        file1.write(incomeState)
        file1.close()

        save_path = location+'\\_QuarterlyEarnings\\'+stock
        name_of_file = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        completeName = os.path.join(save_path, name_of_file+".html")         
        file1 = open(completeName, "w")
        incomeState2 = urllib2.urlopen('https://au.finance.yahoo.com/q/is?s='+stock).read()
        file1.write(incomeState2)
        file1.close()

        #Balance Sheet 
        save_path = location+'\\_AnnaulBS\\'+stock
        name_of_file = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        completeName = os.path.join(save_path, name_of_file+".html")         
        file1 = open(completeName, "w")
        blanceSheet = urllib2.urlopen('https://au.finance.yahoo.com/q/bs?s='+stock+'&annual').read()
        file1.write(blanceSheet)
        file1.close()

        save_path = location+'\\_QuarterlyBS\\'+stock
        name_of_file = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        completeName = os.path.join(save_path, name_of_file+".html")         
        file1 = open(completeName, "w")
        blanceSheet2 = urllib2.urlopen('https://au.finance.yahoo.com/q/bs?s='+stock).read()
        file1.write(blanceSheet2)
        file1.close()

        #Cash Flow
        save_path = location+'\\_AnnaulCF\\'+stock
        name_of_file = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        completeName = os.path.join(save_path, name_of_file+".html")         
        file1 = open(completeName, "w")
        cashFlow = urllib2.urlopen('https://au.finance.yahoo.com/q/cf?s='+stock+'&annual').read()
        file1.write(cashFlow)
        file1.close()

        save_path = location+'\\_QuarterlyCF\\'+stock
        name_of_file = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        completeName = os.path.join(save_path, name_of_file+".html")         
        file1 = open(completeName, "w")
        cashFlow2 = urllib2.urlopen('https://au.finance.yahoo.com/q/cf?s='+stock).read()
        file1.write(cashFlow2)
        file1.close()
        print stock

except Exception, e:
    print 'failed main loop', str(e)
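One detail of the loop above is worth spelling out: the try wraps the whole for loop, so the first failed request ends the run for every remaining stock. The following is a minimal sketch, written for Python 3 rather than the Python 2 used in the question, of the same seven downloads driven by a table, with errors caught per page. The names PAGES, page_urls, http_get and download_all are hypothetical, the directories are created on demand (the original code assumes they already exist), and the URL pattern is simply copied from the question, so it may no longer be served.

```python
import os
import time
import urllib.request

# URL prefix copied from the question; assumed, and possibly no longer valid.
BASE = 'https://au.finance.yahoo.com/q/'

# (sub-folder, page code, extra query string) for the seven pages in the question.
# Folder names keep the question's spelling so existing directories still match.
PAGES = [
    ('_KeyStats', 'ks', ''),
    ('_AnnualEarnings', 'is', '&annual'),
    ('_QuarterlyEarnings', 'is', ''),
    ('_AnnaulBS', 'bs', '&annual'),
    ('_QuarterlyBS', 'bs', ''),
    ('_AnnaulCF', 'cf', '&annual'),
    ('_QuarterlyCF', 'cf', ''),
]


def page_urls(stock):
    """Return (sub-folder, url) pairs for one ticker symbol."""
    return [(folder, BASE + code + '?s=' + stock + extra)
            for folder, code, extra in PAGES]


def http_get(url):
    """Download one page and return its body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_all(stocks, location, fetch=http_get):
    """Save every page for every stock; return the requests that failed."""
    failed = []
    stamp = time.strftime('%Y%m%d%H%M%S')
    for stock in stocks:
        for folder, url in page_urls(stock):
            try:
                body = fetch(url)
            except Exception as exc:  # includes urllib.error.HTTPError
                # Record the failure and carry on with the next page.
                failed.append((stock, url, str(exc)))
                continue
            save_dir = os.path.join(location, folder, stock)
            os.makedirs(save_dir, exist_ok=True)
            with open(os.path.join(save_dir, stamp + '.html'), 'wb') as out:
                out.write(body)
    return failed
```

Calling download_all(sp500, location) would return a list of (stock, url, error) tuples, which can be retried later instead of losing the whole run to a single 502.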

You should use pandas for this. Suppose you have a file containing all the stock symbols:

sp500.txt

AAPL
GLD
SPX
MCD

Now you can do the following:

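# pandas.io.data and Panel belong to older pandas releases; the remote-data
# readers later moved to the separate pandas-datareader package.
from pandas.io.data import DataReader
from pandas import Panel, DataFrame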
import datetime

start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2013, 1, 27)

with open('sp500.txt') as f:
    symbols = f.read().splitlines()  # ['AAPL', 'GLD', 'SPX', 'MCD']

data = dict((symbol, DataReader(symbol, "yahoo", start, end, pause=1)) for symbol in symbols)
panel = Panel(data).swapaxes('items', 'minor')
closing = panel['Close'].dropna()
print closing.head()

Output:

            AAPL    GLD     MCD     SPX
Date                
2010-01-04  214.01  109.80  62.78   0.03
2010-01-05  214.38  109.70  62.30   0.03
2010-01-06  210.97  111.51  61.45   0.03
2010-01-07  210.58  110.82  61.90   0.03
2010-01-08  211.98  111.37  61.84   0.04

Note the pause=1 in the DataReader call, which helps avoid hitting the API limit. If you want to save the results to a file, you can use:

closing.to_csv('output.csv')
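For code that fetches pages directly with urllib, as in the question, the same idea of pausing between requests can be combined with a retry when the server answers 502. The following is a minimal sketch in Python 3, not part of pandas or of the original answer; fetch_with_retry and its parameters (opener, retries, backoff, sleep) are hypothetical names chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, opener=urllib.request.urlopen, retries=3,
                     backoff=1.0, sleep=time.sleep):
    """Return the response body, retrying on HTTP 5xx with a growing delay."""
    for attempt in range(retries + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Client errors (4xx) will not improve on retry, and after the
            # last attempt there is nothing left to do but re-raise.
            if err.code < 500 or attempt == retries:
                raise
            # Wait backoff, 2*backoff, 4*backoff, ... seconds before retrying.
            sleep(backoff * 2 ** attempt)
```

With the defaults, a page that keeps returning 502 is tried four times in total, with waits of 1, 2 and 4 seconds in between, before the error is raised to the caller. The opener and sleep parameters exist only so the function can be exercised without network access.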

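Good answer, but just in case you didn't know: you shouldn't add a tag to a question because it is covered in your answer. Tags are for the question only. There is a (rather difficult, I'd say) way to do this without pandas, and the OP wasn't using pandas, so it shouldn't be added to the question.

@MattDMo, oh. Thanks for clarifying that. I had my doubts, but I thought it would help with searching and didn't see any serious objection. Do you have a link to that policy, or is this your interpretation?

See: a tag is a word or phrase that describes the topic of the question. (Emphasis mine.) I'm sure I've seen this discussed on Meta, but I can't find anything at the moment. If you like, you could ask a question there. It might get closed, but you'll get the information you want.

This is the question I just asked on Meta.

Great! Thanks for taking the time to ask there.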