
Python: how to fetch an API correctly by retrying after a rate limit

I am trying to fetch the arXiv API with the following code:

import urllib.request
import pandas as pd
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
ARXIV = "{http://arxiv.org/OAI/arXiv/}"

def harvest(arxiv):
    df = pd.DataFrame(columns=("title", "abstract", "categories"))
    base_url = "http://export.arxiv.org/oai2?verb=ListRecords&"
    url = (base_url +
           "from=2012-01-01&until=2019-01-01&" +
           "metadataPrefix=arXiv&set=%s"%arxiv)

    # Keep requesting pages until the server stops returning a resumptionToken
    while True:
        print("fetching", url)

        response = urllib.request.urlopen(url)

        xml = response.read()

        root = ET.fromstring(xml)

        for record in root.find(OAI+'ListRecords').findall(OAI+"record"):
            meta = record.find(OAI+'metadata')
            info = meta.find(ARXIV+"arXiv")
            categories = info.find(ARXIV+"categories").text

            contents = {'title': info.find(ARXIV+"title").text,
                        'abstract': info.find(ARXIV+"abstract").text.strip(),
                        'categories': categories.split()}

            df = df.append(contents, ignore_index=True)

        # arXiv serves records in batches; the resumptionToken points at the next batch
        token = root.find(OAI+'ListRecords').find(OAI+"resumptionToken")
        if token is None or token.text is None:
            break
        else:
            url = base_url + "resumptionToken=%s" % token.text

    return df

df_hep_th = harvest("physics:hep-th")

df_hep_th.to_csv('df_hep_th.csv', sep=',', encoding='utf-8')
I get 1000 papers per request and append them to a dataframe, then save the dataframe as a CSV.
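As an aside on the code above: DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so on newer pandas the usual replacement is to collect one dict per record in a plain list and build the frame once at the end. A minimal sketch (the sample record is made up):

import pandas as pd

rows = []
# inside the record loop you would do: rows.append(contents)
rows.append({"title": "An example title",
             "abstract": "An example abstract.",
             "categories": ["hep-th"]})
df = pd.DataFrame(rows, columns=["title", "abstract", "categories"])

This is also much faster, since every df.append() call copies the entire frame.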

I had used this code successfully before, for example to fetch results for the "physics:hep-ex" category. At first it worked fine, but now I run into this error and cannot get past it:

  File "C:\Users\my_user\Anaconda3\lib\urllib\request.py", line 649, in 
       http_error_default
  raise HTTPError(req.full_url, code, msg, hdrs, fp)

  HTTPError: Retry after specified interval

The problem occurs because you are issuing too many requests in a short interval. Have a look at the HTTP documentation: you should read the wait interval t from the error response and, when the error occurs, use a delay function to wait t seconds before continuing the loop.
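Applied to the loop in the question, that pattern might look like the sketch below. fetch_with_retry is a hypothetical helper name; the 503 status and Retry-After header are what arXiv sends when it rate-limits (as the comment below confirms), and the 30-second fallback is an assumption for when the header is absent.

import time
import urllib.error
import urllib.request

def fetch_with_retry(url):
    # Fetch url, sleeping and retrying whenever the server answers 503
    while True:
        try:
            return urllib.request.urlopen(url).read()
        except urllib.error.HTTPError as e:
            if e.code == 503:
                # The server announces the wait time (in seconds) in the
                # Retry-After header; fall back to 30 s if it is missing
                wait = int(e.headers.get("Retry-After", 30))
                print("Got 503, retrying in {} seconds".format(wait))
                time.sleep(wait)
            else:
                raise

Inside harvest(), the pair response = urllib.request.urlopen(url) / xml = response.read() then collapses to xml = fetch_with_retry(url).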

Thank you very much! The problem was solved once I wrapped the request inside the loop like this:

try:
    response = urllib.request.urlopen(url)
except urllib.error.HTTPError as e:
    if e.code == 503:
        to = int(e.hdrs.get("Retry-After", 30))
        print("Got 503. Retrying after {0:d} seconds.".format(to))
        time.sleep(to)
        continue
    else:
        raise