PowerShell: download or save the full source code of an IE page


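I have this PowerShell script, which logs in to a website and then navigates to another page.

I want to save the full source code of that page, but for some reason parts of the source are not captured.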

$username = "myuser" 
$password = "mypass"
$ie = New-Object -com InternetExplorer.Application
$ie.visible=$true
$ie.navigate("http://www.example.com/login.shtml")
while($ie.ReadyState -ne 4) {start-sleep -m 100}
$ie.document.getElementById("username").value = "$username"
$ie.document.getElementById("pass").value = "$password"
$ie.document.getElementById("frmLogin").submit()
start-sleep 5
$ie.navigate("http://www.example.com/thislink.shtml")
$ie.Document.body.outerHTML | Out-File -FilePath c:\sourcecode.txt

Here is a pastebin of the code that is not being captured.

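After navigating, check the ready state again instead of using a sleep. The same loop you already use after the first navigate will work here too.

With a fixed sleep, if the site loads slowly the delay may not be long enough, and the document is read before it has finished loading.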

while($ie.ReadyState -ne 4) {start-sleep -m 100}
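A bare loop like the one above never exits if the page fails to reach ready state 4. As a minimal sketch, assuming a timeout is acceptable, the wait can be wrapped in a small helper. The name Wait-Until and its parameters are hypothetical and not part of the original script. The extra check on $ie.Busy is also an assumption that may help with pages that navigate more than once.

```powershell
# Hypothetical helper: poll a condition until it is true or a timeout expires.
# Returns $true if the condition became true, $false if the timeout was hit.
function Wait-Until {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds = 30,
        [int] $PollMilliseconds = 100
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while (-not (& $Condition)) {
        if ((Get-Date) -gt $deadline) { return $false }
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    return $true
}

# Intended use with the $ie object from the question (needs Windows and IE):
# $ie.navigate("http://www.example.com/thislink.shtml")
# if (Wait-Until -Condition { $ie.ReadyState -eq 4 -and -not $ie.Busy }) {
#     $ie.Document.body.outerHTML | Out-File -FilePath c:\sourcecode.txt
# }
```

Returning a Boolean instead of throwing lets the caller decide whether a timeout should abort the script or only be logged.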

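It looks like there is another post about this, where someone wrote a function that you can use to clean up the output. Once the function is declared in your code, the call would look like this: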

htmlWithCDATASectionsToHtmlWithout($ie.Document.body.outerHTML) | Out-File -FilePath c:\sourcecode.txt

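I agree with @tkrn about using a while loop to wait until the IE document is ready. For that, I suggest sleeping for at least 2 seconds inside the loop: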

while($ie.ReadyState -ne 4) {start-sleep -s 2}

That said, I found a simpler way to get the entire HTML source of the page at a URL. Here it is:

$ie.Document.parentWindow.execScript("var JSIEVariable = new XMLSerializer().serializeToString(document);", "javascript")
$obj = $ie.Document.parentWindow.GetType().InvokeMember("JSIEVariable", 4096, $null, $ie.Document.parentWindow, $null)
$HTMLDoc = $obj.ToString()

$HTMLDoc now holds the entire HTML source of the page, and you can save it as an HTML file.
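To unpack the snippet above: execScript defines a global JavaScript variable named JSIEVariable on the page's window, and InvokeMember reads it back, where 4096 is the numeric value of the .NET flag BindingFlags.GetProperty. The sketch below shows that flag and one way to write the resulting string to disk. Save-HtmlSource is a hypothetical helper name, and choosing UTF-8 without a byte order mark is an assumption (Out-File in Windows PowerShell defaults to UTF-16).

```powershell
# 4096 in the InvokeMember call is the numeric value of BindingFlags.GetProperty.
$getProperty = [System.Reflection.BindingFlags]::GetProperty

# Hypothetical helper: write an HTML string to disk as UTF-8 without a BOM.
# Pass an absolute path; .NET resolves relative paths against the process
# working directory, which can differ from the PowerShell location.
function Save-HtmlSource {
    param(
        [Parameter(Mandatory)] [string] $Html,
        [Parameter(Mandatory)] [string] $Path
    )
    $utf8NoBom = New-Object System.Text.UTF8Encoding $false
    [System.IO.File]::WriteAllText($Path, $Html, $utf8NoBom)
}

# Intended use with the variable from the answer:
# Save-HtmlSource -Html $HTMLDoc -Path 'C:\sourcecode.html'
```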

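The paste appears to be private.

Sorry, fixed. Please check it now. I did some more research: the problem is that everything after // is ignored, including the code that follows the page load.

Thank you very much. However, I still get an error when using the function you posted: At C:\Users\mmmm\Desktop\new.ps1:4 char:5 + var ATTRS = "(?:[^>\"\']|\"[^\"]*\"|\'[^\']*\')*",

Marked as solved, but the error in that code still needs fixing.

What is your explanation of what "JSIEVariable" holds? This works, but I would like to know why, because I do not understand at all what is happening here.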