PHP: Extracting text with a DOM parser

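I have just started learning to use a DOM parser.

Suppose the page has 4 lines like the one below, and I am trying to extract their content as text. All I need is LPPR 051600Z 35010KT CAVOK 27/14 Q1020, so that I can send it as a JSON payload to an incoming webhook.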

<FONT FACE="Monospace,Courier">LPPR 051600Z 35010KT CAVOK 27/14 Q1020</FONT><BR>

LPPR 051600Z 35010KT CAVOK 27/14 Q1020

In this example, how do I use $html = str_get_html and $html->find?

I managed to send the full HTML content, but that is not what I want.

<?php

include_once('simple_html_dom.php');
$html = file_get_html('http://test.com')->plaintext;


// The data to send to the API

$postData = array('text' => $html);


// Setup cURL
$ch = curl_init('https://uri.com/test');
curl_setopt_array($ch, array(
    CURLOPT_POST => TRUE,
    CURLOPT_RETURNTRANSFER => TRUE,
    CURLOPT_HTTPHEADER => array(
        'Authorization: '.$authToken,
        'Content-Type: application/json'
    ),
    CURLOPT_POSTFIELDS => json_encode($postData)
));

// Send the request
$response = curl_exec($ch);

// Check for errors
if($response === FALSE){
    die(curl_error($ch));
}

// Decode the response
$responseData = json_decode($response, TRUE);

// Print the date from the response
echo $responseData['published'];
?>

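Thank you very much.

You can use PHP's DOM extension (DOMDocument) in place of simple_html_dom.

The example below fetches a page with cURL and prints the text of every <font> element.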

<?php
# Use the Curl extension to query Google and get back a page of results
$url = "http://www.google.com";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$html = curl_exec($ch);
curl_close($ch);

# Create a DOM parser object
$dom = new DOMDocument();

# Parse the HTML from Google.
# The @ before the method call suppresses any warnings that
# loadHTML might throw because of invalid HTML in the page.
@$dom->loadHTML($html);

# Iterate over all the <font> tags
foreach ($dom->getElementsByTagName('font') as $link) {
        # Show the text inside the <font> element
        echo $link->textContent;
        echo "<br />";
}
?>
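
To connect this answer back to the question, here is a minimal sketch that runs DOMDocument over the single line quoted in the question and wraps the extracted text in a JSON body. The inline $html string stands in for the fetched page, and the 'text' key is simply copied from the question's $postData; what the webhook actually expects is an assumption. It uses libxml_use_internal_errors() rather than @ to keep parser warnings quiet.

```php
<?php
// Sample input: the single line shown in the question (stands in for the fetched page).
$html = '<FONT FACE="Monospace,Courier">LPPR 051600Z 35010KT CAVOK 27/14 Q1020</FONT><BR>';

// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Gather the trimmed text of every <font> element.
$lines = array();
foreach ($dom->getElementsByTagName('font') as $font) {
    $lines[] = trim($font->textContent);
}

// Build the JSON body; the 'text' key follows the question's $postData (assumed format).
$payload = json_encode(array('text' => implode("\n", $lines)));
echo $payload, "\n";
```

The resulting $payload string could then be passed as CURLOPT_POSTFIELDS in the question's cURL setup, in place of json_encode($postData).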

If you are sure the lines look exactly like this one, you can split on the <br> tags:
$line = explode('<br>', $response);
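
One caveat with the split approach: explode() is case-sensitive, so '<br>' will not match the upper-case <BR> in the question's markup, and the pieces still contain the <FONT> tags. A rough sketch of a more tolerant variant, assuming the raw HTML is held in a variable named $html (a name chosen here for illustration), might use preg_split() and strip_tags():

```php
<?php
// Sample input: the line from the question (stands in for the raw HTML).
$html = '<FONT FACE="Monospace,Courier">LPPR 051600Z 35010KT CAVOK 27/14 Q1020</FONT><BR>';

// Split on <br>, <BR>, <br/> or <br />, regardless of case.
$parts = preg_split('/<br\s*\/?>/i', $html);

// Strip the remaining tags, trim whitespace, and drop empty pieces.
$lines = array();
foreach ($parts as $part) {
    $text = trim(strip_tags($part));
    if ($text !== '') {
        $lines[] = $text;
    }
}

print_r($lines);
```

This only works while the markup stays this simple; for anything more nested, a real parser such as DOMDocument is the safer choice.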

Thank you very much! If I replace the tag with 'font', how should the line echo $link->getAttribute('href') change so that it shows $link->innertext?
I just changed the loop to this: # Iterate over all the tags foreach ($dom->getElementsByTagName('font') as $link) { # Show the content echo $link->textContent; echo "<br />"; }
Great, let me change that in my answer.
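
For readers following the comments: innertext is a simple_html_dom property, while DOMDocument nodes expose textContent instead. A small sketch of the difference between reading an attribute and reading the text, using the question's <FONT> line as sample input:

```php
<?php
// Sample input: the line from the question.
$html = '<FONT FACE="Monospace,Courier">LPPR 051600Z 35010KT CAVOK 27/14 Q1020</FONT><BR>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// First <font> element in the parsed document.
$font = $dom->getElementsByTagName('font')->item(0);

// getAttribute() reads an attribute of the tag itself.
$face = $font->getAttribute('face');

// textContent is the text between the opening and closing tags.
$text = $font->textContent;

echo $face, "\n", $text, "\n";
```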