Downloading a file with WinHttp/VBA is only partially successful
I have almost succeeded in automating a download process. For now I am using Fiddler to capture the values and hard-coding them until it works. The site is secured, and I am getting the cookies just fine. However, when I test the download I get a 239K file that will not open; Windows tells me it may be corrupted. The file should be about 250K, so I know I am not getting all of it. I tried running Fiddler to capture the traffic from my code, but interestingly it only captures what I do in the browser.

Here is what I tried:
strFullCookie = "a cookie in here"
strDocLink = "https://gateway.frontlineinsurance.com/pc/service/edge/document/gpa/document/pc:somenumberhere?token=withatokenhere&portalRoute=/AgentDocumentError"
Set WinHttpReq2 = CreateObject("WINHTTP.WinHTTPRequest.5.1")
WinHttpReq2.Open "GET", Trim(strDocLink), False
WinHttpReq2.Option(6) = False
WinHttpReq2.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
WinHttpReq2.setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
WinHttpReq2.setRequestHeader "Accept-Language", "en-US,en;q=0.9"
WinHttpReq2.setRequestHeader "Connection", "keep-alive"
WinHttpReq2.setRequestHeader "Host", "gateway.frontlineinsurance.com"
WinHttpReq2.setRequestHeader "Referer", "https://gateway.frontlineinsurance.com/"
WinHttpReq2.setRequestHeader "Upgrade-Insecure-Requests", "1"
WinHttpReq2.setRequestHeader "Accept-Encoding", "gzip, deflate"
WinHttpReq2.setRequestHeader "Sec-Fetch-User", "?1"
WinHttpReq2.setRequestHeader "Sec-Fetch-Site", "same-origin"
WinHttpReq2.setRequestHeader "Sec-Fetch-Mode", "Navigate"
WinHttpReq2.setRequestHeader "Cache-Control", "no-cache"
WinHttpReq2.setRequestHeader "Content-Type", "application/pdf"
WinHttpReq2.setRequestHeader "Cookie", strFullCookie
WinHttpReq2.Send
'MsgBox WinHttpReq2.responseBody
'MsgBox WinHttpReq2.responseText
Debug.Print WinHttpReq2.Status
strHeaders = WinHttpReq2.getAllResponseHeaders()
Debug.Print strHeaders
Sleep 2000
If WinHttpReq2.Status = 200 Then
    Set oStream = CreateObject("ADODB.Stream")
    oStream.Open
    oStream.Type = 1 ' 1 = binary
    oStream.Write WinHttpReq2.responseBody
    oStream.SaveToFile "C:\Users\JCarney\Desktop\DownloadedMail\FPITest.pdf", 2 ' 1 = no overwrite, 2 = overwrite
    oStream.Close
End If
I have also tried:
'Set WinHttpReq2 = CreateObject("MSXML2.serverXMLHttp")
'Set WinHttpReq2 = CreateObject("Microsoft.XMLHTTP")
ServerXMLHTTP returned the same result as WinHttp. The other one was rejected by the server.
The Fiddler request looks like:
GET https://gateway.frontlineinsurance.com/pc/service/edge/document/gpa/document/pc:somenumberhere?token=sometokehere&portalRoute=/AgentDocumentError HTTP/1.1
Accept: text/html, application/xhtml+xml, image/jxr, */*
Referer: https://gateway.frontlineinsurance.com/
Accept-Language: en-US
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: gateway.frontlineinsurance.com
Connection: Keep-Alive
Cookie: _cookies here
The Fiddler response looks like:
HTTP/1.1 200 OK
Date: Thu, 20 Feb 2020 18:45:05 GMT
Content-Type: application/pdf
Connection: keep-alive
Set-Cookie: a cookie
Set-Cookie: a cookie
Server: Apache/2.4.6 (Red Hat Enterprise Linux)
Content-Disposition: filename=Declarations Page.pdf
Vary: Accept-Encoding,User-Agent
Set-Cookie: a cookie
Set-Cookie: a cookie
Cache-Control: max-age=0, must-revalidate
Content-Length: 255177
When I print the headers returned to my code, I get:
Cache-Control: max-age=0, must-revalidate
Connection: keep-alive
Date: Thu, 20 Feb 2020 18:49:50 GMT
Transfer-Encoding: chunked
Content-Type: application/pdf
Content-Encoding: gzip
Server: Apache/2.4.6 (Red Hat Enterprise Linux)
Set-Cookie: a good cookie
Set-Cookie: a good cookie
Set-Cookie: a good cookie
Set-Cookie:a good cookie
Vary: Accept-Encoding,User-Agent
Content-Disposition: filename=Declarations Page.pdf
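I get the 4 cookies I expect, but my file (a PDF) comes up short. I am not sure what else needs to change. I tried opening the file as txt or html; it is gibberish, but a lot of gibberish. It is just not a valid PDF. Thanks in advance for any ideas on what I can adjust.

Have you checked the file contents (opened it in a text editor)? Does it start with %PDF (make non-printable characters visible, e.g. to look for a BOM)? Or is it garbage? If so, it may be compressed; restrict the Accept-Encoding header to deflate.

It seems you solved it; if so, share the answer! Have you tried WGet to simplify the request, as the documentation suggests?

Thanks for the reply. I will check the file in a text editor on Monday or Tuesday and post back. As for the 302 error in my other post: I did not. My guess is that it is an issue with the site and that I need to send a request to the preceding page and work forward from there; the server seems to be smart and needs those requests for authentication. It is a very request-intensive site, so I am moving on for now.

That totally worked! Thank you so much! The file size is right. In a text editor the earlier file was just gibberish. I only changed this line: WinHttpReq2.setRequestHeader "Accept-Encoding", "gzip, deflate" to WinHttpReq2.setRequestHeader "Accept-Encoding", "deflate" and I was good to go!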
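The headers printed by the code include Content-Encoding: gzip, which suggests the saved bytes were the compressed body rather than the PDF itself. That would explain the 239K file against a Content-Length of 255177. A minimal sketch of the fix follows, with a guard that refuses to save a body that is still encoded. The function name DownloadBinary and its parameters are hypothetical, and the code assumes (as this thread indicates) that WinHttpRequest returns responseBody without decompressing it.

```vb
' Sketch only: DownloadBinary, url, cookie and savePath are illustrative names.
Function DownloadBinary(ByVal url As String, ByVal cookie As String, _
                        ByVal savePath As String) As Boolean
    Dim req As Object
    Dim stm As Object
    Dim encoding As String

    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    req.Open "GET", url, False
    ' Do not advertise gzip, since the body is written to disk as received.
    req.setRequestHeader "Accept-Encoding", "deflate"
    req.setRequestHeader "Cookie", cookie
    req.Send

    If req.Status <> 200 Then Exit Function

    ' GetResponseHeader raises an error when the header is absent,
    ' so a missing Content-Encoding leaves encoding empty.
    On Error Resume Next
    encoding = req.GetResponseHeader("Content-Encoding")
    On Error GoTo 0

    If Len(encoding) > 0 Then
        Debug.Print "Body is still encoded as: " & encoding
        Exit Function
    End If

    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 1                    ' 1 = binary
    stm.Open
    stm.Write req.responseBody
    stm.SaveToFile savePath, 2      ' 2 = overwrite
    stm.Close
    DownloadBinary = True
End Function
```

Calling DownloadBinary(strDocLink, strFullCookie, "C:\Users\JCarney\Desktop\DownloadedMail\FPITest.pdf") would return False and log the encoding if the server compressed the response anyway, instead of silently writing an unreadable file.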