PHP: How to extract data from this page using simplehtmldom


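I am trying to use simplehtmldom to extract information from this page: https://benthamopen.com/browse-by-title/B/1/

Specifically, I want to access the following part of the page: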


ISSN:1874-1207
I have the following code:

$html = file_get_html('https://benthamopen.com/browse-by-title/B/1/');
foreach ($html->find('div[style=padding:10px;]') as $ele) {
    echo("
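I'm not sure how to proceed from here. I want to extract:

  • The ISSN text (which does not show up in the echo statement, and I'm not sure why) [1874-1207, in the example above]. It is element zero of [nodes]
  • The "data-url" [https://benthamopen.com/TOBEJ/home/, in the example above]
  • The "data-title" [The Open Biomedical Engineering Journal, in the example above]

Perhaps my understanding of PHP objects and arrays is not as good as it should be, and I don't know why the ISSN does not show up in the echo statement.

I have tried many different approaches but am struggling to extract the data from the element.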

I'm not familiar with simplehtmldom, except to avoid it. So I'll offer a solution that uses PHP's built-in DOM classes:

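<?php
// fetch the page (URL taken from the question)
$html = file_get_contents('https://benthamopen.com/browse-by-title/B/1/');
// collect parser warnings about imperfect HTML instead of printing them
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
// create an XPath object and query it
$xpath = new DOMXPath($dom);
$elements = $xpath->query("//div[@style='padding:10px;']");
// loop through the matches
foreach ($elements as $el) {
    // skip elements without an ISSN
    $text = trim($el->textContent);
    if (strpos($text, "ISSN") !== 0) {
        continue;
    }
    // get the first div inside this element
    $div = $el->getElementsByTagName("div")[0];
    // print the ISSN, title and URL
    printf("%s %s %s<br/>\n", trim(str_replace("ISSN:", "", $text)), $div->getAttribute("data-title"), $div->getAttribute("data-url"));
}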
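XPath can look a little overwhelming, but for a simple search like this one it is not much different from a CSS selector. Hopefully the comments explain everything; if not, let me know.

Output: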

1874-1207 The Open Biomedical Engineering Journal https://benthamopen.com/TOBEJ/home/<br/>
1874-1967 The Open Biology Journal https://benthamopen.com/TOBIOJ/home/<br/>
1874-091X The Open Biochemistry Journal https://benthamopen.com/TOBIOCJ/home/<br/>
1875-0362 The Open Bioinformatics Journal https://benthamopen.com/TOBIOIJ/home/<br/>
1875-3183 The Open Biomarkers Journal https://benthamopen.com/TOBIOMJ/home/<br/>
2665-9956 The Open Biomaterials Science Journal https://benthamopen.com/TOBMSJ/home/<br/>
1874-0707 The Open Biotechnology Journal https://benthamopen.com/TOBIOTJ/home/<br/>
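
To illustrate how closely the XPath query tracks the CSS selector from the question (`div[style=padding:10px;]` versus `//div[@style='padding:10px;']`), here is a minimal sketch that also moves the ISSN check and the attribute lookups into XPath. The inline markup is an assumption modelled on the question's example, not a copy of the live page, and the second `div` is a made-up block without an ISSN.

```php
<?php
// Assumed sample markup: one journal block plus one block without an ISSN.
$html = '<div style="padding:10px;"><strong>ISSN: </strong>1874-1207<br>'
      . '<div data-url="https://benthamopen.com/TOBEJ/home/" '
      . 'data-title="The Open Biomedical Engineering Journal"></div></div>'
      . '<div style="padding:10px;">No identifier here</div>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$rows = [];
// The second predicate keeps only blocks whose <strong> label starts with "ISSN".
$query = "//div[@style='padding:10px;'][strong[starts-with(., 'ISSN')]]";
foreach ($xpath->query($query) as $el) {
    $rows[] = [
        // first non-blank text node that is a direct child of the block
        'issn'  => trim($xpath->evaluate('string(text()[normalize-space()][1])', $el)),
        // attributes of the nested div, read relative to the current block
        'title' => $xpath->evaluate('string(div/@data-title)', $el),
        'url'   => $xpath->evaluate('string(div/@data-url)', $el),
    ];
}
print_r($rows);
```

Whether this reads better than the `strpos()` check is a matter of taste; it mainly shows that the filtering can live in the query itself.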

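Thanks again. I haven't "converted" this to Simple HTML DOM, but the solution does what I need and I'm very grateful. If anyone can show me how to do this in Simple HTML DOM I would still be interested, as it may be useful to others and to me in future.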

print_r output for one matched element, as referred to in the question:

simplehtmldom\HtmlNode Object
(
    [nodetype] => HDOM_TYPE_ELEMENT (1)
    [tag] => div
    [attributes] => Array
        (
            [style] => padding:10px;
        )

    [nodes] => Array
        (
            [0] => simplehtmldom\HtmlNode Object
                (
                    [nodetype] => HDOM_TYPE_ELEMENT (1)
                    [tag] => strong
                    [attributes] => none
                    [nodes] => none
                )

            [1] => simplehtmldom\HtmlNode Object
                (
                    [nodetype] => HDOM_TYPE_TEXT (3)
                    [tag] => text
                    [attributes] => none
                    [nodes] => none
                )

            [2] => simplehtmldom\HtmlNode Object
                (
                    [nodetype] => HDOM_TYPE_ELEMENT (1)
                    [tag] => br
                    [attributes] => none
                    [nodes] => none
                )

            [3] => simplehtmldom\HtmlNode Object
                (
                    [nodetype] => HDOM_TYPE_ELEMENT (1)
                    [tag] => div
                    [attributes] => Array
                        (
                            [class] => sharethis-inline-share-buttons
                            [style] => padding-top:10px;
                            [data-url] => https://benthamopen.com/TOBEJ/home/
                            [data-title] => The Open Biomedical Engineering Journal
                        )

                    [nodes] => none
                )

        )

)
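
The dump above may explain why the ISSN seems to be missing: the number appears to be a bare text node (index 1 in `[nodes]`, after the `strong` element at index 0), and this debug output lists the node without printing its content. As a rough sketch using PHP's built-in DOM classes rather than simplehtmldom, the text node can be picked out by walking the child nodes. The markup below is an assumption reconstructed from the dump; in particular, the "ISSN: " label inside `strong` is inferred.

```php
<?php
// Assumed markup, reconstructed from the node dump: <strong>, text, <br>, <div>.
$html = '<div style="padding:10px;"><strong>ISSN: </strong>1874-1207<br>'
      . '<div class="sharethis-inline-share-buttons" style="padding-top:10px;" '
      . 'data-url="https://benthamopen.com/TOBEJ/home/" '
      . 'data-title="The Open Biomedical Engineering Journal"></div></div>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

// The outer div is the first div in document order.
$outer = $dom->getElementsByTagName('div')->item(0);

$issn = null;
foreach ($outer->childNodes as $child) {
    // Keep the first direct child that is a non-blank text node.
    if ($child->nodeType === XML_TEXT_NODE && trim($child->nodeValue) !== '') {
        $issn = trim($child->nodeValue);
        break;
    }
}
echo $issn, "\n";
```

The "ISSN: " label is not picked up here because it is a child of `strong`, not a direct child of the outer `div`.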