PHP: quoting unquoted attributes in invalid HTML


I have the following HTML:

<td width=140 style='width:105.0pt;padding:0cm 0cm 0cm 0cm'>
    <p class=MsoNormal><span style='font-size:9.0pt;font-family:"Arial","sans-serif";
       mso-fareast-font-family:"Times New Roman";color:#666666'>OCCUPANCY
       TAX:</span></p>
</td>


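Some of the HTML attributes are not quoted, for example width=140 and class=MsoNormal.

Is there a PHP function for this sort of thing? If not, what would be a smart way to fix it in the HTML?


Thanks.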

I think you can do this with a regexp:

/\s([\w]{1,}=)((?!")[\w]{1,}(?!"))/g


\s match any white space character [\r\n\t\f ]
1st Capturing group ([\w]{1,}=)
    [\w]{1,} match a single character present in the list below
        Quantifier: {1,} Between 1 and unlimited times, as many times as possible, giving back as needed [greedy]
    \w match any word character [a-zA-Z0-9_]
    = matches the character = literally
2nd Capturing group ((?!")[\w]{1,}(?!"))
    (?!") Negative Lookahead - Assert that it is impossible to match the regex below
    " matches the characters " literally
    [\w]{1,} match a single character present in the list below
        Quantifier: {1,} Between 1 and unlimited times, as many times as possible, giving back as needed [greedy]
    \w match any word character [a-zA-Z0-9_]
    (?!") Negative Lookahead - Assert that it is impossible to match the regex below
    " matches the characters " literally
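g modifier: global. All matches (don't return on first match). This is regex tester notation; PHP has no g flag, and preg_replace_callback already replaces every match.

It can be implemented like this: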

echo preg_replace_callback('/\s([\w]{1,}=)((?!")[\w]{1,}(?!"))/', function($matches){
    return ' '.$matches[1].'"'.$matches[2].'"';
}, $str);

which results in:

 <td width="140" style='width:105.0pt;padding:0cm 0cm 0cm 0cm'>
   <p class="MsoNormal"><span style='font-size:9.0pt;font-family:"Arial","sans-serif";
     mso-fareast-font-family:"Times New Roman";color:#666666'>OCCUPANCY
      TAX:</span></p>
 </td>



Note that this is a rough example and can certainly be cleaned up.
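
One possible cleanup, offered as a sketch rather than a definitive fix: restrict the replacement to the inside of start tags, and let the pattern consume already-quoted values whole so that text inside them is never touched. The function name quoteAttributes is hypothetical, and the code assumes no attribute value contains a literal < or >.

```php
<?php
// Hypothetical helper: wrap bare attribute values in double quotes.
// Assumption: attribute values never contain '<' or '>'.
function quoteAttributes(string $html): string
{
    // Outer pass: visit each start tag, so text between tags is left alone.
    return preg_replace_callback('/<[a-zA-Z][^<>]*>/', function (array $tag): string {
        // Inner pass: name=value, where value is double-quoted, single-quoted, or bare.
        return preg_replace_callback(
            '/(\s[\w:-]+=)("[^"]*"|\'[^\']*\'|[^\s"\'>]+)/',
            function (array $m): string {
                $value = $m[2];
                if ($value[0] === '"' || $value[0] === "'") {
                    return $m[0]; // already quoted: keep unchanged
                }
                return $m[1] . '"' . $value . '"'; // bare value: add quotes
            },
            $tag[0]
        ) ?? $tag[0]; // fall back to the original tag on a regex error
    }, $html) ?? $html;
}

echo quoteAttributes("<td width=140 style='width:105.0pt'><p class=MsoNormal>");
// prints: <td width="140" style='width:105.0pt'><p class="MsoNormal">
```

Because a quoted value is matched as a single unit, something like title="a b=c" passes through unchanged instead of having b=c rewritten.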

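There is no native PHP function for this, and it's cleaned up already. Quotes are only really required when the value contains special characters or spaces. Given that, I think your best bet is to clean up the file yourself in a text editor such as Sublime.

I have to solve this programmatically. The unquoted width=140 is causing me trouble because I'm using the quoted_printable_decode() function, and when it finds =140 it converts it into some other character. With class='140' (quoted), however, it works fine. But I would like some smart way to quote all the attributes in the whole file. Maybe?

I suggest you don't use inline styles. Separate your styles from your markup and you will save yourself a lot of trouble. Trust me.

@Nuno Aruda This is HTML I was given, not HTML I wrote. I have to work with it.

Obligatory "you can't parse HTML with regex".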
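
For readers who would rather avoid regex entirely, here is a minimal sketch of the parser route using PHP's bundled DOM extension: load the sloppy markup with DOMDocument::loadHTML and serialize it again, which writes the attribute values back out quoted. The helper name normalizeHtmlFragment is hypothetical, character-encoding handling is left out, and the serializer may change other details (whitespace, entity escaping, which quote character it picks for values that themselves contain quotes).

```php
<?php
// Hypothetical helper: round-trip an HTML fragment through DOMDocument.
// Assumption: the input is plain ASCII; encoding handling is omitted.
function normalizeHtmlFragment(string $html): string
{
    // Collect parser warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html); // the parser wraps the fragment in <html><body>

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize only the children of <body>, i.e. the original fragment.
    $out = '';
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
    }
    return $out;
}

echo normalizeHtmlFragment(
    "<table><tr><td width=140><p class=MsoNormal>OCCUPANCY TAX:</p></td></tr></table>"
);
```

The result should contain width="140" and class="MsoNormal". If quoted_printable_decode() is applied afterwards, those values are no longer at risk, since = followed by a quote character is not a valid escape sequence.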