PHP regexp for a span tag does not work with file_get_contents


Below is my regexp. It works fine when I assign the HTML content to it directly, but not when the content comes from file_get_contents().

Try parsing the HTML with DOMDocument and XPath rather than a regexp.

Example for the anchor tag and b tag:

$urlcontent = '<td valign="top" align="left" class="SearchResultItemHeader">
                    <a class="thickbox" title="Dishwasher Tube/Spray Arm Kit" href="ItemDetailsPopup.aspx?itemcode=WHI%20675808&amp;keepThis=true&amp;TB_iframe=true&amp;height=500&amp;width=640"><b>Dishwasher Tube/Spray Arm Kit</b></a>
                      </td>';

// Parse the HTML fragment into a DOM tree.
$doc = new DOMDocument();
$doc->loadHTML($urlcontent);

// Select the <b> inside the <a> inside the matching <td>.
$xpath = new DOMXPath($doc);
$elements = $xpath->query("//td[@class='SearchResultItemHeader']/a/b")->item(0)->nodeValue;

echo $elements;

// Output: Dishwasher Tube/Spray Arm Kit
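
Since the question is about a span whose content arrives through file_get_contents(), the same DOM approach can be sketched for that case. This is only an illustration under assumptions: the sample markup and the temporary file are hypothetical stand-ins for the real page, and only the span id is taken from the question. Parser warnings are collected quietly with libxml_use_internal_errors() because real pages are often not well formed.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$sampleHtml = <<<'HTML'
<html><body>
<table><tr><td>
  <span id="ContentPlaceHolder1_Repeater1_lblLongDesc_0">
    *WAS W10224675 M BASKT-WARE WAS W10171734
  </span>
</td></tr></table>
</body></html>
HTML;

// Write it to a temporary file so file_get_contents() has something to read.
// In real use this would be the page URL or a saved copy of the page.
$path = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($path, $sampleHtml);
$urlcontent = file_get_contents($path);
unlink($path);

// Collect HTML parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($urlcontent);
libxml_clear_errors();

// Look the span up by its id attribute; line breaks in the markup do not matter.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query("//span[@id='ContentPlaceHolder1_Repeater1_lblLongDesc_0']");

// query() returns a node list; check it before dereferencing item(0).
$description = ($nodes !== false && $nodes->length > 0)
    ? trim($nodes->item(0)->nodeValue)
    : null;

echo $description, PHP_EOL;
```

Checking the node list before calling item(0) avoids an error when the span is missing from the fetched page.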

Change the regexp like this:

preg_match_all('%<span.*id=\"ContentPlaceHolder1_Repeater1_lblLongDesc_0\"(.*)\/span>%', $urlcontent, $desc);

Output:

Array
(
    [0] => Array
        (
            [0] => <span id="ContentPlaceHolder1_Repeater1_lblLongDesc_0">*WAS W10224675 M BASKT-WARE WAS W10171734</span>
        )

    [1] => Array
        (
            [0] => *WAS W10224675 M BASKT-WARE WAS W10171734
        )

)
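
One possible reason a pattern like this matches a hand-written string but not fetched content is that `.` does not match a newline by default in PCRE, so `.*` stops at the first line break in a multi-line page. Whether that is the cause here is an assumption, since the fetched page is not shown. A minimal sketch of a more tolerant variant, with hypothetical multi-line sample markup: `[^>]*` keeps the match inside the opening tag, `(.*?)` is non-greedy, and the `s` modifier lets `.` cross line breaks.

```php
<?php
// Hypothetical multi-line markup, similar to what a fetched page may contain.
$urlcontent = <<<'HTML'
<div>
<span class="x"
      id="ContentPlaceHolder1_Repeater1_lblLongDesc_0">*WAS W10224675 M BASKT-WARE
WAS W10171734</span>
<span>other</span>
</div>
HTML;

// [^>]* stays inside the opening tag, (.*?) stops at the first </span>,
// and the trailing "s" modifier makes "." match newlines as well.
$pattern = '%<span\b[^>]*\bid="ContentPlaceHolder1_Repeater1_lblLongDesc_0"[^>]*>(.*?)</span>%s';
preg_match_all($pattern, $urlcontent, $desc);

// Strip any nested tags and collapse runs of whitespace to single spaces.
$description = isset($desc[1][0])
    ? trim(preg_replace('/\s+/', ' ', strip_tags($desc[1][0])))
    : null;

echo $description, PHP_EOL;
```

Because the capture group starts after the closing `>` of the opening tag, it holds only the text between the tags.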

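Hi Chetan, the same approach does not work for the a href. I think it only works when the HTML content is passed in directly. Please check the updated question.

On the site, the class and href attributes appear in a different order. I think in that case you can use XPath to scrape the data.

How does that work? Can you give an example?

Please see the updated answer. You can get other tags with XPath in the same way.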
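
The attribute-order point from the comments can be illustrated with a short sketch. XPath selects elements by structure, so it does not matter whether class comes before or after href. The second table cell below is hypothetical and exists only to show the reversed order; the first is a shortened copy of the markup in the answer.

```php
<?php
// First cell: shortened markup from the answer.
// Second cell: hypothetical, with the attributes in a different order.
$urlcontent = <<<'HTML'
<table><tr>
<td class="SearchResultItemHeader">
  <a class="thickbox" title="Dishwasher Tube/Spray Arm Kit" href="ItemDetailsPopup.aspx?itemcode=WHI%20675808&amp;keepThis=true"><b>Dishwasher Tube/Spray Arm Kit</b></a>
</td>
<td class="SearchResultItemHeader">
  <a href="ItemDetailsPopup.aspx?itemcode=EXAMPLE" title="Example Part" class="thickbox"><b>Example Part</b></a>
</td>
</tr></table>
HTML;

// Collect HTML parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($urlcontent);
libxml_clear_errors();

// Collect the href and the visible text of every matching anchor.
$xpath = new DOMXPath($doc);
$items = [];
foreach ($xpath->query("//td[@class='SearchResultItemHeader']/a") as $a) {
    $items[] = [
        'href'  => $a->getAttribute('href'), // entities such as &amp; are decoded
        'title' => trim($a->textContent),
    ];
}

print_r($items);
```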
preg_match_all('%<span.*id=\"ContentPlaceHolder1_Repeater1_lblLongDesc_0\"(.*)\/span>%', $urlcontent, $desc);
$description = strip_tags($desc[1][0]);