PHP DOMDocument saveHTML() adds extra quotes and some URL-encoded characters

I have been using PHP's DOMDocument extension to find image tags that have no alt attribute or an empty one. Here is the HTML I am using for testing:

<span style="font-weight:bold;">Blender</span> is an Open Source 3D modelling and animation software. 
This is a very popular software among hobbyists.<i>Blender</i> has a vast list of features which include bones and meshing, textures, particle physics etc.
<u>Blender</u> was originally a proprietary software which was eventually made opensource. 
Blender is known to be difficult to learn because its interface is very intimiding to a newbie. 
But on the other hand, <a href="http://www.blender.org">Blender</a> is so much customizable that you can actually modify your workspace according to your personal preference. 
Also blender interface has been developed in the OpenGL graphics library, so blender looks all the same on all platforms whether you use Windows, Linux, BSD or even Mac. 
3D is a very interesting field to work with but 3D is somewhat tough to start with. You can <a href="http://www.google.com"" target="_blank">Google</a> for numerous tutorials on Blender. 
There are quite some awesome websites dedicated to blender development, such as BlenderGuru.com. <img src="http://www.cochinsquare.com/wp-content/uploads/2010/08/Blender.jpg">
Here is the DOMDocument code I use to find the IMG tags and add an alt attribute to them:

$dom = new DOMDocument();
$dom->loadHTML($content);
$dom->formatOutput = true;
$imgs = $dom->getElementsByTagName('img');
foreach ($imgs as $img) {
    // getAttribute() returns '' when the attribute is missing,
    // so this covers both a missing and an empty alt.
    $alt = $img->getAttribute('alt');
    $k_alt = ($alt == '') ? $this->keyword : $alt;
    $img->setAttribute('alt', $k_alt);
}
// Strip the doctype and the html/body wrappers that loadHTML() adds.
$html_mod = preg_replace('/^<!DOCTYPE.+?>/', '', str_replace(array('<html>', '</html>', '<body>', '</body>'), '', $dom->saveHTML()));
return $html_mod;
This is the HTML I get back:

<span style='"font-weight:bold;"'>Blender</span> is an Open Source 3D modelling and animation software. 
This is a very popular software among hobbyists.<i>Blender</i> has a vast list of features which include bones and meshing, textures, particle physics etc.
<u>Blender</u> was originally a proprietary software which was eventually made opensource. 
Blender is known to be difficult to learn because its interface is very intimiding to a newbie. 
But on the other hand, <a href=""http://www.blender.org"">Blender</a> is so much customizable that you can actually modify your workspace according to your personal preference. 
Also blender interface has been developed in the OpenGL graphics library, so blender looks all the same on all platforms whether you use Windows, Linux, BSD or even Mac. 
3D is a very interesting field to work with but 3D is somewhat tough to start with. You can <a href=""http://www.google.com""" target='"_blank"'>Google</a> for numerous tutorials on Blender. 
There are quite some awesome websites dedicated to blender development, such as BlenderGuru.com. 
<img src=""http://www.cochinsquare.com/wp-content/uploads/2010/08/Blender.jpg"" alt="Blender">
Note the stray single and double quotes in the img src, in the anchor tags' href, and in the span's style attribute.

Please help! I want the HTML returned intact, with only the new alt attribute added.
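For illustration, here is a minimal sketch of one way to get the fragment back without stripping tags by string replacement. It is not a confirmed fix for the quoting problem described above. It assumes PHP 5.3.6 or later (where saveHTML() accepts a node argument), which is newer than the 5.3.2 mentioned in this thread. The function name addMissingAlt, the $keyword parameter, and the wrapper id fragment-root are hypothetical names introduced only for this example.

```php
<?php
// Hypothetical helper: $keyword stands in for the asker's $this->keyword.
function addMissingAlt($content, $keyword)
{
    $dom = new DOMDocument();

    // Collect parser warnings (such as the one for the malformed href)
    // instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so that all of it sits under one known element.
    $dom->loadHTML('<div id="fragment-root">' . $content . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // getAttribute() returns '' for a missing attribute.
        if ($img->getAttribute('alt') === '') {
            $img->setAttribute('alt', $keyword);
        }
    }

    // Serialize only the wrapper's children, so there is no doctype,
    // html or body tag to strip afterwards.
    $root = $dom->getElementsByTagName('div')->item(0);
    $html = '';
    foreach ($root->childNodes as $child) {
        $html .= $dom->saveHTML($child);
    }
    return $html;
}
```

A call would look like addMissingAlt($content, 'Blender'). Note that the parser still normalizes the markup, so malformed input such as the doubled quote in the Google link will not come back byte for byte.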


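Also, I should mention that I am using PHP 5.3.2 with the Suhosin patch on Ubuntu 10.04.

I cannot reproduce the problem on PHP/5.3.6 for Windows. I get a warning from loadHTML because the href for http://www.google.com is invalid HTML (it has a stray extra double quote), but the output is correct. Is it possible that the bug is somewhere else? Your code appears to have been stripped out of a class :-?

Never mind, I went the regex route. You are right, I am using the code inside a class. By the way, I am also on a 64-bit Ubuntu system, which may be related to this problem.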
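The thread does not show the regex the asker ended up with. As a rough sketch of what a regex route could look like, the following adds an alt attribute to img tags that have none or an empty one and leaves every other byte of the input alone. The function name addMissingAltRegex and the $keyword parameter are hypothetical, and the pattern assumes that no attribute value contains a literal ">" character.

```php
<?php
// Hypothetical helper illustrating a regex approach; not the asker's code.
function addMissingAltRegex($content, $keyword)
{
    $safeAlt = htmlspecialchars($keyword, ENT_QUOTES);

    return preg_replace_callback('/<img\b[^>]*>/i', function ($m) use ($safeAlt) {
        $tag = $m[0];

        // A non-empty alt is already present: leave the tag untouched.
        if (preg_match('/\salt\s*=\s*("[^"]+"|\'[^\']+\'|[^\s"\'>]+)/i', $tag)) {
            return $tag;
        }

        // Drop an empty or valueless alt so that it is not duplicated.
        $tag = preg_replace('/\salt(?:\s*=\s*(?:""|\'\'))?(?=[\s\/>])/i', '', $tag);

        // Re-attach the original closing style ("/>" or ">").
        $selfClosing = (substr($tag, -2) === '/>');
        $body = rtrim(substr($tag, 0, $selfClosing ? -2 : -1));
        return $body . ' alt="' . $safeAlt . '"' . ($selfClosing ? ' />' : '>');
    }, $content);
}
```

Because nothing is parsed and re-serialized, the quoting of the span, anchor, and src attributes cannot change. The trade-off is that a regex is fragile on unusual markup, so this is best kept to input whose shape is known.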