How to get the HTML of a specific tag using PHP DOMDocument


First, I have it find the element:

$dom= new \DOMDocument();
$dom->loadHTML($html_string);
$only_p = $dom->getElementsByTagName('p')->item(1);

If I try something like

$only_p->textContent; /* <-- it will only return the text inside the paragraph and even remove all the tags inside of it */

$only_p = $dom->getElementsByTagName('p')->item(1);
$only_p->outerHTML;

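What I am after is something like the outerHTML line above, which would return the paragraph together with its surrounding HTML, such as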

<p class="something"><a href="link"> this is text </a></p>

rather than just the string "this is text".

This is how I solved it:

/**
 * Returns the outer HTML of a certain tag.
 * You can decide whether the result should be returned as string or inside an array.
 *
 * @param $html_string
 * @param $tagName
 * @param string $return_as
 * @return array|string
 */
public function getOuterHTMLFromTagName($html_string,$tagName,$return_as='array')
{
    //create a new DomDocument based on the first parameter's html
    $dom_doc= new \DOMDocument();
    $dom_doc->loadHTML($html_string);

    //set variables for the result type
    $html_results_as_array = array();
    $html_results_as_string = "";

    // get tags from DOMDocument
    $elements_in_tag = $dom_doc->getElementsByTagName($tagName);

    // loop through found tags (the node list is queried once and iterated directly)
    foreach ($elements_in_tag as $element_in_tag)
    {
        //create a new DomDocument that only contains the tag's HTML
        $element_doc = new \DOMDocument();
        $element_doc->appendChild($element_doc->importNode($element_in_tag,true));

        //serialize once and save the element's HTML in both result variables
        $element_html = $element_doc->saveHTML();
        $html_results_as_string .= $element_html;
        $html_results_as_array[] = $element_html;
    }

    //return either as array or string
    if($return_as == 'array')
    {
        return $html_results_as_array;
    }
    else
    {
        return $html_results_as_string;
    }
}
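
Since the function above is declared public, it has to live inside a class before it can be called. The following is only a usage sketch under that assumption: the class name HtmlHelper and the sample markup are made up for illustration, and the method body is a condensed copy of the one above so that the snippet runs on its own.

```php
<?php
// Hypothetical wrapper class; the name HtmlHelper is an assumption for this sketch.
class HtmlHelper
{
    // Condensed copy of the method above so the example is self-contained.
    public function getOuterHTMLFromTagName($html_string, $tagName, $return_as = 'array')
    {
        $dom_doc = new \DOMDocument();
        $dom_doc->loadHTML($html_string);

        $results = array();
        foreach ($dom_doc->getElementsByTagName($tagName) as $element) {
            // Copy the element into an empty document and serialize that document
            $element_doc = new \DOMDocument();
            $element_doc->appendChild($element_doc->importNode($element, true));
            $results[] = $element_doc->saveHTML();
        }

        return $return_as == 'array' ? $results : implode('', $results);
    }
}

// Sample markup (an assumption, modelled on the paragraph from the question)
$html = '<div><p class="something"><a href="link"> this is text </a></p><p>second</p></div>';
$helper = new HtmlHelper();

// Default: one array entry per matching tag
$as_array = $helper->getOuterHTMLFromTagName($html, 'p');

// Any other value for the third argument: all matches concatenated into one string
$as_string = $helper->getOuterHTMLFromTagName($html, 'p', 'string');

echo trim($as_array[0]), "\n";
echo $as_string;
```

Each array entry holds one matching tag including its own markup, while the string form is simply those entries joined together.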

You can try:

$only_p->ownerDocument->saveXML($only_p);
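
To make the suggestion above concrete, here is a minimal sketch of serializing a single node. It assumes sample markup with two paragraphs, which is not from the original post, and it also shows DOMDocument::saveHTML() with a node argument as an alternative when HTML rather than XML serialization is wanted (for example, saveXML() writes empty elements as self-closing tags).

```php
<?php
// Sample markup is an assumption; the second <p> mirrors the one in the question.
$html_string = '<div><p>first</p><p class="something"><a href="link"> this is text </a></p></div>';

$dom = new \DOMDocument();
$dom->loadHTML($html_string);

// item(1) is the second <p>, as in the question
$only_p = $dom->getElementsByTagName('p')->item(1);

// Serialize just this node, including its own tag and attributes
$outer_xml  = $only_p->ownerDocument->saveXML($only_p);
$outer_html = $dom->saveHTML($only_p);

// The text-only view, for comparison: inner tags are dropped
$text = $only_p->textContent;

echo $outer_xml, "\n";
echo $outer_html, "\n";
echo $text, "\n";
```

Both calls work on the node in place, so no second DOMDocument and no importNode() step are needed.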