Downloading a file with WinHttp/VBA is only partially successful

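I have almost succeeded in automating a download process. At the moment I am using Fiddler to capture values and hard-coding them until it works. The site is secure and I get the cookies fine. However, when I test the download I get a 239K file that will not open; Windows tells me it may be corrupted. The file should be about 250K, so I know I am not getting all of it. I was going to run Fiddler to capture the traffic from my code, but interestingly it only captures what I do in the browser.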

Here is what I tried:

strFullCookie = "a cookie in here"
strDocLink = "https://gateway.frontlineinsurance.com/pc/service/edge/document/gpa/document/pc:somenumberhere?token=withatokenhere&portalRoute=/AgentDocumentError"

Set WinHttpReq2 = CreateObject("WINHTTP.WinHTTPRequest.5.1")

WinHttpReq2.Open "GET", Trim(strDocLink), False
WinHttpReq2.Option(6) = False          
WinHttpReq2.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
WinHttpReq2.setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
WinHttpReq2.setRequestHeader "Accept-Language", "en-US,en;q=0.9"
WinHttpReq2.setRequestHeader "Connection", "keep-alive"
WinHttpReq2.setRequestHeader "Host", "gateway.frontlineinsurance.com"
WinHttpReq2.setRequestHeader "Referer", "https://gateway.frontlineinsurance.com/"
WinHttpReq2.setRequestHeader "Upgrade-Insecure-Requests", "1"
WinHttpReq2.setRequestHeader "Accept-Encoding", "gzip, deflate"
WinHttpReq2.setRequestHeader "Sec-Fetch-User", "?1"
WinHttpReq2.setRequestHeader "Sec-Fetch-Site", "same-origin"
WinHttpReq2.setRequestHeader "Sec-Fetch-Mode", "Navigate"
WinHttpReq2.setRequestHeader "Cache-Control", "no-cache"
WinHttpReq2.setRequestHeader "Content-Type", "application/pdf"
WinHttpReq2.setRequestHeader "Cookie", strFullCookie

WinHttpReq2.Send

'MsgBox WinHttpReq2.responseBody
'MsgBox WinHttpReq2.responseText

Debug.Print WinHttpReq2.Status
strHeaders = WinHttpReq2.getAllResponseHeaders()
Debug.Print strHeaders

Sleep 2000

If WinHttpReq2.Status = 200 Then
    Set oStream = CreateObject("ADODB.Stream")
    oStream.Open
    oStream.Type = 1    ' 1 = binary
    oStream.Write WinHttpReq2.responseBody
    oStream.SaveToFile "C:\Users\JCarney\Desktop\DownloadedMail\FPITest.pdf", 2    ' 1 = no overwrite, 2 = overwrite
    oStream.Close
End If
I also tried:

'Set WinHttpReq2 = CreateObject("MSXML2.serverXMLHttp")
'Set WinHttpReq2 = CreateObject("Microsoft.XMLHTTP")
ServerXMLHTTP returned the same result as WinHttp. The other one was refused by the server.

The Fiddler request looks like:

GET https://gateway.frontlineinsurance.com/pc/service/edge/document/gpa/document/pc:somenumberhere?token=sometokehere&portalRoute=/AgentDocumentError HTTP/1.1
Accept: text/html, application/xhtml+xml, image/jxr, */*
Referer: https://gateway.frontlineinsurance.com/
Accept-Language: en-US
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: gateway.frontlineinsurance.com
Connection: Keep-Alive
Cookie: _cookies here

The Fiddler response looks like:

HTTP/1.1 200 OK
Date: Thu, 20 Feb 2020 18:45:05 GMT
Content-Type: application/pdf
Connection: keep-alive
Set-Cookie: a cookie
Set-Cookie: a cookie
Server: Apache/2.4.6 (Red Hat Enterprise Linux)
Content-Disposition: filename=Declarations Page.pdf
Vary: Accept-Encoding,User-Agent
Set-Cookie: a cookie
Set-Cookie: a cookie
Cache-Control: max-age=0, must-revalidate
Content-Length: 255177
When I print the headers returned to my code, I get:

Cache-Control: max-age=0, must-revalidate
Connection: keep-alive
Date: Thu, 20 Feb 2020 18:49:50 GMT
Transfer-Encoding: chunked
Content-Type: application/pdf
Content-Encoding: gzip
Server: Apache/2.4.6 (Red Hat Enterprise Linux)
Set-Cookie: a good cookie
Set-Cookie: a good cookie
Set-Cookie: a good cookie
Set-Cookie:a good cookie
Vary: Accept-Encoding,User-Agent
Content-Disposition: filename=Declarations Page.pdf

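I get the 4 cookies I expect, but my file (a PDF) comes up short. I'm not sure what else to change. I tried opening the file as txt or html and it is gibberish, though a lot of gibberish; it just isn't a valid PDF. Thanks in advance for any ideas on what I can tweak.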

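Have you checked the file content (open it in a text editor)? Does it start with %PDF (make non-printable characters visible, e.g. to look for a BOM)? Or is it gibberish? In that case it is probably compressed; limit the encoding to deflate.

It seems you solved it; if so, please share the answer! Did you try WGet, as the docs suggest, to simplify the request?

Thanks for the reply. I will check in a text editor on Monday or Tuesday and post back. About the 302 error in the other thread: I did not. I guess it is a problem with the site and I need to send a request to the previous page and work forward from there; I guess the server is smart and needs those requests for authentication. It is a very request-intensive site, so I have moved on for now.

This totally worked! Thank you so much! The file size is right. In a text editor it was just gibberish. I only changed this line: WinHttpReq2.setRequestHeader "Accept-Encoding", "gzip, deflate" to WinHttpReq2.setRequestHeader "Accept-Encoding", "deflate" and I'm good to go!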
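
For later readers, the fix and the %PDF check suggested in the comments can be combined into one routine. The following is only a minimal sketch, not the asker's final code: the names DownloadPdfSketch and StartsWithPdfMagic are hypothetical, the header set is trimmed to the two that seemed to matter here, and it assumes the server returns an uncompressed body when gzip is not listed in Accept-Encoding (which is what the asker observed).

```vba
Option Explicit

' Hypothetical helper: True when the response body starts with the bytes "%PDF".
Private Function StartsWithPdfMagic(ByVal body As Variant) As Boolean
    If Not IsArray(body) Then Exit Function
    If UBound(body) - LBound(body) < 3 Then Exit Function
    StartsWithPdfMagic = (body(LBound(body)) = &H25 And _
                          body(LBound(body) + 1) = &H50 And _
                          body(LBound(body) + 2) = &H44 And _
                          body(LBound(body) + 3) = &H46)
End Function

' Hypothetical wrapper around the request from the question.
Public Sub DownloadPdfSketch(ByVal url As String, ByVal cookie As String, _
                             ByVal targetPath As String)
    Dim req As Object
    Dim body As Variant
    Dim stream As Object

    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    req.Open "GET", url, False
    ' The change reported in the thread: do not offer gzip, because the
    ' raw responseBody was being saved while still compressed.
    req.SetRequestHeader "Accept-Encoding", "deflate"
    req.SetRequestHeader "Cookie", cookie
    req.Send

    If req.Status <> 200 Then
        Debug.Print "Unexpected status: " & req.Status
        Exit Sub
    End If

    body = req.ResponseBody
    If Not StartsWithPdfMagic(body) Then
        ' Likely still encoded, or an error page; show the headers and stop.
        Debug.Print "Body does not start with %PDF. Response headers:"
        Debug.Print req.GetAllResponseHeaders
        Exit Sub
    End If

    Set stream = CreateObject("ADODB.Stream")
    stream.Type = 1                    ' 1 = binary
    stream.Open
    stream.Write body
    stream.SaveToFile targetPath, 2    ' 2 = overwrite
    stream.Close
End Sub
```

It could be called as DownloadPdfSketch strDocLink, strFullCookie, "C:\Temp\FPITest.pdf", where the target path is a placeholder. A body that begins with the bytes 1F 8B instead of %PDF is a gzip stream, which matches the Content-Encoding: gzip header the code received in the question.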