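Python: How to download an Excel file from a page

Go to this URL (username = TrickyBen | password = TrickyBen123). Note that there is a red "Download Excel" button. I want to download the Excel file and convert it into a DataFrame, and I want to do it programmatically (that is, from a script rather than by clicking through the site manually). How can I do this?

Tags: python, excel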


This code logs you in as TrickyBen and makes a request to the site's API:

import requests
from lxml import html
from requests import Session
import pandas as pd
import shutil

raceSession = Session()

LoginDetails = {'login': 'TrickyBen', 'password': 'TrickyBen123'}

LoginUrl = 'https://www.horseracebase.com/horse-racing-results.php?year=2005&month=3&day=15/horsebase1.php'
LoginPost = raceSession.post(LoginUrl, data=LoginDetails)

RaceUrl = 'https://www.horseracebase.com/excelresults.php'
RaceDataDetails =  {"user": "41495", "racedate": "2005-3-15", "downloadbutton": "Excel"}

PostHeaders = {"Content-Type": "application/x-www-form-urlencoded"}
Response = raceSession.post(RaceUrl, data=RaceDataDetails, headers=PostHeaders)

Table = pd.read_table(Response.text)

Table.to_csv('blahblah.csv')

If you inspect the element, you will see that the relevant markup looks like this:

<form action="excelresults.php" method="post">
    <input type="hidden" name="user" value="41495">
    <input type="hidden" name="racedate" value="2005-3-15">
    <input type="submit" class="downloadbutton" value="Excel">
</form>

I get this error message:

Traceback (most recent call last):
  File "/Users/Alex/Desktop/DateTest/hrpull.py", line 20, in <module>
    Table = pd.read_table(Response.text)
  File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 562, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 315, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 645, in __init__
    self._make_engine(self.engine)
  File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 799, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 1213, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 358, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3427)
  File "pandas/parser.pyx", line 628, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:6861)
IOError: File race_date race_time   track   race_name       race_restrictions_age   race_class  major   race_distance   prize_money     going_description   number_of_runners   place   distbt  horse_name  stall       trainer horse_age   jockey_name jockeys_claim   pounds  odds    fav     official_rating comptime    TotalDstBt  MedianOR    Dist_Furlongs       placing_numerical   RCode   BFSP    BFSP_Place  PlcsPaid    BFPlcsPaid      Yards   RailMove    RaceType    
"2005-03-15"    "14:00:00"  "Cheltenham"    "Letheby & Christopher Supreme Novices Hurdle " "4yo+"  "Class 1"   "Grade 1"   "2m˝f " "58000" "Good"  "20"    "1st"       "Arcalis"   "0" "Johnson, J Howard" "5" "Lee, G"    "0" "161"   "21"        "136"   "3 mins 53.00s"     "121.5" "16.5"  "1" "National Hunt" "0" "0" "3" "0" "0" "0" "Novices Hurdle"
"2005-03-15"    "14:00:00"  "Cheltenham"    "Letheby & Christopher Supreme Novices Hurdle " "4yo+"  "Class 1"   "Grade 1"   "2m˝f " "58000" "Good"  "20"    "2nd"   "6" "Wild Passion (GER)"    "0" "Meade, Noel"   "5" "Carberry, P"   "0" "161"   "11"        "0" "3 mins 53.00s" "6" "121.5" "16.5"  "2" "National Hunt" "0" "0" "3" "0" "0" "0" "Novices Hurdle"

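I think you can see the data you want to download on another page, for example by clicking "My Systems (v4)". If so, you can download that page with urllib.request.urlretrieve, then parse the data with html.parser.HTMLParser and process it however you like.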
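
The parsing half of that suggestion can be sketched with the standard library alone. This is a minimal illustration, not the site's actual markup: `TableParser` and `sample_html` are hypothetical names, and the inline sample stands in for a page you would first fetch (for instance with urllib.request.urlretrieve). Since the site requires a login, a plain urlretrieve call would probably not be enough by itself, because it does not carry a login cookie.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows
        self._row = None    # row currently being built
        self._cell = None   # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical stand-in for a downloaded results page.
sample_html = """
<table>
  <tr><th>horse_name</th><th>place</th></tr>
  <tr><td>Arcalis</td><td>1st</td></tr>
  <tr><td>Wild Passion (GER)</td><td>2nd</td></tr>
</table>
"""

parser = TableParser()
parser.feed(sample_html)
parser.close()
print(parser.rows)
```

The resulting list of rows (header first) could then be handed to pandas or written out with the csv module.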


If you look at the API called in the form's action, you will see that you have to make a POST request to this URL:

https://www.horseracebase.com/excelresults.php

with the following parameters:

data = {
    "user": "41495", # looks like this varies with login, so update in case you change your login id
    "racedate": "2005-3-15",
    "downloadbutton": "Excel"
}
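
You can do it like this:

response = raceSession.post(reqUrl, data=data)  # data= sends the dict form-encoded; json= would send a JSON body instead

If that doesn't work, try adding headers to the request, as in headers=postHeaders. In this case, since you are sending form-encoded data, you should set the Content-Type header:

headers = {"Content-Type": "application/x-www-form-urlencoded"}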
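
To make "form-encoded" concrete, here is a small sketch that builds the same POST with the standard library and inspects it without sending anything. The field values are the ones from this answer; using urllib instead of requests is a choice made only so the snippet runs with no third-party packages. As far as I know, requests produces an equivalent body when you pass a dict via data=.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Form fields taken from the parameters listed in this answer.
data = {
    "user": "41495",
    "racedate": "2005-3-15",
    "downloadbutton": "Excel",
}

# urlencode builds the application/x-www-form-urlencoded body.
body = urlencode(data)
print(body)

# Build (but do not send) the POST request so it can be inspected.
request = Request(
    "https://www.horseracebase.com/excelresults.php",
    data=body.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)
print(request.get_method(), request.get_header("Content-type"))
```

A JSON body would instead look like `{"user": "41495", ...}`, which a PHP form handler reading ordinary form fields would most likely not pick up.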
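
For details on how to save the Excel data to a file, read [link].

Judging by the response to this request in Postman, you don't seem to need any headers other than Content-Type.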

Edit

This is what you need to do:

import pandas as pd
from requests import Session
from io import StringIO  # Python 3.x
# from StringIO import StringIO  # Python 2.x

raceSession = Session()

RaceUrl = 'https://www.horseracebase.com/excelresults.php'
RaceDataDetails = {"user": "41495", "racedate": "2005-3-15", "downloadbutton": "Excel"}

PostHeaders = {"Content-Type": "application/x-www-form-urlencoded"}
Response = raceSession.post(RaceUrl, data=RaceDataDetails, headers=PostHeaders)

# read_table treats a plain string as a file path, so wrap the text in a file-like object
Table = pd.read_table(StringIO(Response.text))
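
The reason the StringIO wrapper matters: the traceback in the question shows the parser trying to open the response text as if it were a file name. Wrapping the text gives a file-like object instead. The sketch below shows the same idea with the standard library's csv module so it runs without pandas; `response_text` is a hypothetical two-column excerpt, and it assumes the real response is tab-separated with quoted values, which is what the output in the question suggests.

```python
import csv
from io import StringIO

# Hypothetical excerpt in the same tab-separated, quoted layout as the
# response shown in the traceback.
response_text = (
    'race_date\thorse_name\n'
    '"2005-03-15"\t"Arcalis"\n'
    '"2005-03-15"\t"Wild Passion (GER)"\n'
)

# StringIO turns the in-memory text into a file-like object.
buffer = StringIO(response_text)

# Parse it exactly as if it had been read from a file on disk.
rows = list(csv.DictReader(buffer, delimiter="\t"))
print(rows[0]["horse_name"])
```

A buffer like this is the kind of object that pd.read_table accepts in place of a path.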


Comments:

So you want to download the information without downloading the file?

Hey - I want to download the file programmatically. That is, not manually download the file to my desktop and then use read_csv to read it into a DataFrame. Sure, read_csv will get the file into a DataFrame, but I want to access the file from a script. I hope that makes sense?

Hey guys, thanks. I've made some progress, but my code doesn't work. Can you see what I'm doing wrong? raceSession=Session() LoginDetails={'login':'TrickyBen','password':'TrickyBen123'} LoginUrl='LoginPost=raceSession.post(LoginUrl,data=LoginDetails) RaceUrl='Rac