PHP DOM error: ID 'someAnchor' already defined in Entity, line X


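If I try to load an HTML document into the PHP DOM, I get an error like this:

Error DOMDocument::loadHTML() [domdocument.loadhtml]: ID someAnchor already defined in Entity, line: 9

I don't understand why. Below is some code that loads HTML strings into the DOM.

The first string contains no anchor tag; the second one does. The second document produces the error.

You should be able to cut and paste this into a script and run it to see the same output: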

<?php
ini_set('display_errors', 1);
error_reporting(E_ALL);


$stringWithNoAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<h1>Hello</h1>
</body>
</html>
EOT;

$stringWithAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<h1>Hello</h1>
<a name="someAnchor" id="someAnchor"></a>
</body>
</html>
EOT;

class domGrabber
    {
    public $_FileErrorStr = '';

    /**
    *@desc DOM object factory does the work of loading the DOM object
    */
    public function getLoadAsDOMObj($htmlString)
        {
        $this->_FileErrorStr =''; //reset error container
        $xmlDoc = new DOMDocument();
        set_error_handler(array($this, '_FileErrorHandler')); // Warnings and errors are suppressed
        $xmlDoc->loadHTML($htmlString);
        restore_error_handler();
        return $xmlDoc;
        }

    /**
    *@desc public so that it can catch errors from outside this class
    */
    public function _FileErrorHandler($errno, $errstr, $errfile, $errline)
        {
        if ($this->_FileErrorStr === null)
            {
            $this->_FileErrorStr = $errstr;
            }
        else    {
            $this->_FileErrorStr .= (PHP_EOL . $errstr);
            }
        }
    }

$domGrabber = new  domGrabber();
$xmlDoc = $domGrabber->getLoadAsDOMObj($stringWithNoAnchor );

echo 'PHP Version: '. phpversion() .'<br />'."\n";

echo '<pre>';
print $xmlDoc->saveXML();
echo '</pre>'."\n";
if ($domGrabber->_FileErrorStr)
    {
    echo 'Error'. $domGrabber->_FileErrorStr;
    }



$xmlDoc = $domGrabber->getLoadAsDOMObj($stringWithAnchor);
echo '<pre>';
print $xmlDoc->saveXML();
echo '</pre>'."\n";
if ($domGrabber->_FileErrorStr)
    {
    echo 'Error'. $domGrabber->_FileErrorStr;
    }

If you are loading an XML file (and XHTML is XML in this case), you should use DOMDocument::loadXML() rather than DOMDocument::loadHTML().
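As a minimal sketch of that suggestion, the snippet below parses a shortened version of the question's XHTML with loadXML(). It assumes the markup is well-formed XML and contains no named entities; the variable names are illustrative only. Without a loaded DTD, the XML parser does not treat name or id as ID-typed attributes, so both can carry the same value.

```php
<?php
// Shortened version of the question's document; the nowdoc keeps it verbatim.
$xhtml = <<<'EOT'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>My document</title></head>
<body>
<a name="someAnchor" id="someAnchor"></a>
</body>
</html>
EOT;

$doc = new DOMDocument();

// loadXML() uses the XML parser; the external DTD is not fetched by default.
$loaded = $doc->loadXML($xhtml);

// Read both attributes back from the single anchor element.
$anchor = $doc->getElementsByTagName('a')->item(0);
echo $loaded ? 'loaded' : 'failed', PHP_EOL;
echo $anchor->getAttribute('name'), ' / ', $anchor->getAttribute('id'), PHP_EOL;
```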

In HTML, both name and id introduce an ID. You are therefore defining the ID "someAnchor" twice, which is what triggers the error.
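To inspect exactly what the parser reports, one option is to buffer libxml's messages rather than catching PHP warnings with set_error_handler(). The sketch below is only an illustration: collectHtmlParseMessages() is a hypothetical helper name, and whether the duplicate-ID message appears at all may depend on the libxml2 version PHP is built against.

```php
<?php
// Hypothetical helper: parse HTML and return the document plus any libxml messages.
function collectHtmlParseMessages(string $html): array
{
    // Buffer libxml errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Turn each LibXMLError into a short "line N: message" string.
    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Leave the global libxml error state as we found it.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$doc, $messages];
}

[$doc, $messages] = collectHtmlParseMessages(
    '<html><body><a name="someAnchor" id="someAnchor"></a></body></html>'
);
foreach ($messages as $message) {
    echo $message, PHP_EOL;
}
echo $doc->getElementsByTagName('a')->length, ' anchor(s) parsed', PHP_EOL;
```

Note that the anchor is still present in the resulting DOM: the message is a recoverable parser complaint, not a failed load.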

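However, the W3C validator accepts a repeated ID of this form (the same value in both name and id on one element). In the libxml2 bug tracker, a user proposed a patch that limits which name attributes are treated as IDs: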

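According to the HTML and XHTML specs, only the a element's name attribute shares a name space with id attributes. For some of the elements it can be argued that multiple instances with the same name don't make sense, but they should nevertheless not be considered to be in the same namespace as other elements' id attributes.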

At a glance: the elements that take a name attribute, and their semantics.


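Update: I found that HTML Tidy was putting the id attributes into my XHTML document. I am now looking into a new error, tidy::parseFile() [tidy.parseFile]: Unknown Tidy Configuration Option 'anchor-as-name', but that is a separate (though related) problem. Just for your interest.

Another note: loadXML() does not work if the document contains entities such as the one for £ (I wish I could declare them, but that defeats the point of XHTML). So I found that loadHTML() has to be used, together with anchors that do not have id and name both set to the same value.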
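One way to arrive at such anchors before calling loadHTML() is to drop the name attribute wherever it merely repeats the id. The following is a rough sketch under stated assumptions: dropDuplicateAnchorNames() is a hypothetical helper, it expects quoted attribute values, and regex-based rewriting of markup is fragile, so treat it as an illustration rather than a robust fix.

```php
<?php
// Hypothetical helper: remove name="x" from <a> tags that also carry id="x".
function dropDuplicateAnchorNames(string $html): string
{
    $result = preg_replace_callback('/<a\b[^>]*>/i', function (array $m): string {
        $tag = $m[0];
        // Extract the quoted id and name values from this opening tag, if present.
        if (preg_match('/\sid\s*=\s*(["\'])(.*?)\1/i', $tag, $id)
            && preg_match('/\sname\s*=\s*(["\'])(.*?)\1/i', $tag, $name)
            && $id[2] === $name[2]) {
            // Remove only the redundant name attribute; keep the id.
            return str_replace($name[0], '', $tag);
        }
        return $tag;
    }, $html);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

echo dropDuplicateAnchorNames('<a name="someAnchor" id="someAnchor"></a>'), PHP_EOL;
// prints: <a id="someAnchor"></a>
```

Anchors whose name and id differ, or that have only one of the two attributes, are left unchanged.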
Output of the script in the question:

PHP Version: 5.2.9<br />
<pre><?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/xhtml"><head><title>My document</title><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /></head><body>
<h1>Hello</h1>
</body></html>
</pre>
<pre><?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/xhtml"><head><title>My document</title><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /></head><body>
<h1>Hello</h1>
<a name="someAnchor" id="someAnchor"></a>

</body></html>
</pre>
Error
DOMDocument::loadHTML() [<a href='domdocument.loadhtml'>domdocument.loadhtml</a>]: ID someAnchor already defined in Entity, line: 9

Updated test script, which runs each document through both loadHTML() and loadXML():

<?php
ini_set('display_errors', 1);
error_reporting(E_ALL);


$stringWithNoAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithNoAnchor</p>
</body>
</html>
EOT;

$stringWithAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithAnchor</p>
<a  name="someAnchor" id="someAnchor" ></a>
</body>
</html>
EOT;

$stringWithAnchorButOnlyIdAtt = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithAnchorButOnlyIdAtt</p>
<a id="someAnchor"></a>
</body>
</html>
EOT;

class domGrabber
    {
    public $_FileErrorStr = '';
    public $useHTMLMethod = TRUE;

    /**
    *@desc DOM object factory does the work of loading the DOM object
    */
    public function loadDOMObjAndWriteOut($htmlString)
        {
        $this->_FileErrorStr ='';

        $xmlDoc = new DOMDocument();
        set_error_handler(array($this, '_FileErrorHandler')); // Warnings and errors are suppressed


        if ($this->useHTMLMethod)
            {
            $xmlDoc->loadHTML($htmlString);
            }
        else    {
            $xmlDoc->loadXML($htmlString);
            }


        restore_error_handler();

        echo "<h1>";
        echo ($this->useHTMLMethod) ? 'using xmlDoc->loadHTML() ' : 'using $xmlDoc->loadXML()';
        echo "</h1>";
        echo '<pre>';
        print $xmlDoc->saveXML();
        echo '</pre>'."\n";
        if ($this->_FileErrorStr)
            {
            echo 'Error'. $this->_FileErrorStr;
            }
        }

    /**
    *@desc public so that it can catch errors from outside this class
    */
    public function _FileErrorHandler($errno, $errstr, $errfile, $errline)
        {
        if ($this->_FileErrorStr === null)
            {
            $this->_FileErrorStr = $errstr;
            }
        else    {
            $this->_FileErrorStr .= (PHP_EOL . $errstr);
            }
        }
    }

$domGrabber = new  domGrabber();

echo 'PHP Version: '. phpversion() .'<br />'."\n";

$domGrabber->useHTMLMethod = TRUE; //DOM->loadHTML
$domGrabber->loadDOMObjAndWriteOut($stringWithNoAnchor);
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchor );
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchorButOnlyIdAtt);

$domGrabber->useHTMLMethod = FALSE; //use DOM->loadXML
$domGrabber->loadDOMObjAndWriteOut($stringWithNoAnchor);
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchor );
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchorButOnlyIdAtt);