.net: DOM traversal in a web page saved as .mht (tags: .net, powershell, dom, mhtml)

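Is it possible to traverse the DOM of a web page saved as .mht, or simply as .htm/.html? Preferably in PowerShell or .NET. The goal is to be able to do something like getElementsByTagName('div'). If so, how?

A solution was found using HtmlAgilityPack. Documentation is available at [link missing in source], and the relevant feature is covered in [link missing in source].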

Sample code:

# Choose a source: keep exactly one of these assignments
# (if both are active, the second overwrites the first)
$Source = 'C:\temp\myFile.mht'
# $Source = 'http://www.google.com'

# Get online or mht content
$IE = New-Object -ComObject InternetExplorer.Application

# Don't show the browser
$IE.Visible = $false

# Browse to your webpage/file
$IE.Navigate($Source)

# Wait for the page to load (ReadyState 4 = READYSTATE_COMPLETE)
while ($IE.Busy -or $IE.ReadyState -ne 4) { Start-Sleep -Milliseconds 50 }

# Get the html from that page
$Html = $IE.Document.body.parentElement.outerHTML

# Decode HTML-encoded characters such as &amp;
# (System.Web is not loaded by default in PowerShell, so load it first)
Add-Type -AssemblyName System.Web
$Html = [System.Web.HttpUtility]::HtmlDecode($Html)

# Close the browser
$IE.Quit()


# Use HtmlAgilityPack (must be installed first)
Add-Type -Path (Join-Path $Env:userprofile '.nuget\packages\htmlagilitypack\1.4.9.5\lib\Net40\HtmlAgilityPack.dll')
$Hap = New-Object HtmlAgilityPack.HtmlDocument

# Load the HTML into HtmlAgilityPack to get a DOM
# ($Html rather than $global:Html, so this also works inside a function or script scope)
$Hap.LoadHtml($Html)

# Retrieve the data from the DOM (read a node)
[string]$partData = $Hap.DocumentNode.SelectSingleNode("//div[@class='formatted_content']/ul").InnerText
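
The question asked for something like getElementsByTagName('div'). A minimal sketch of that with HtmlAgilityPack follows. It assumes the same DLL path as the sample above, and `$SampleHtml`, `$Doc`, `$Divs` and `$DivInfo` are hypothetical names used only for illustration; the inline markup stands in for the HTML retrieved from the saved page.

```powershell
# Assumption: same HtmlAgilityPack DLL path as in the sample above; adjust to your install.
Add-Type -Path (Join-Path $Env:userprofile '.nuget\packages\htmlagilitypack\1.4.9.5\lib\Net40\HtmlAgilityPack.dll')

# Hypothetical inline markup standing in for the saved page's HTML
$SampleHtml = '<html><body><div id="a">first</div><p>skip</p><div id="b">second</div></body></html>'

$Doc = New-Object HtmlAgilityPack.HtmlDocument
$Doc.LoadHtml($SampleHtml)

# The XPath '//div' matches every div in the document,
# which is roughly what getElementsByTagName('div') would return
$Divs = $Doc.DocumentNode.SelectNodes('//div')

# SelectNodes returns $null when nothing matches, so guard before iterating
$DivInfo = @()
if ($null -ne $Divs) {
    $DivInfo = @(foreach ($Div in $Divs) {
        [pscustomobject]@{
            Id   = $Div.GetAttributeValue('id', '')
            Text = $Div.InnerText
        }
    })
}

$DivInfo
```

Swapping `'//div'` for another XPath expression (for example the `//div[@class='formatted_content']/ul` used above) narrows the selection in the same way.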