Alternatives to PHP Tidy?

I use PHP Tidy to process HTML input from the database:

$fragment = tidy_repair_string($dom->saveHTML(), array('output-xhtml'=>1,'show-body-only'=>1));
I enabled php_tidy on my local server, but my live server does not support tidy:

Fatal error: Call to undefined function tidy_repair_string() in /customers/0/5/a/mysite.com/httpd.www/models/functions.php on line 587

Is there any way to solve this?

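HTML Purifier can rewrite HTML to be standards-compliant. If you need to filter that input for XSS prevention and the like, it does that as well.

It is all PHP, so you should be able to use it on any server.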

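If you are on a RedHat/CentOS/Fedora Linux box and have root access to the server, you can run

yum install php-tidy
as root. Then restart Apache, and you should be able to carry on working.

You may get errors about missing dependencies that need to be added, but usually the command above is all you need.

Other distributions use slightly different commands but should offer an equivalent package.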
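After installing the package and restarting Apache, it can help to confirm that PHP actually sees the extension. The snippet below is only a minimal check, not part of the original answer; the variable name $hasTidy is made up for illustration. Keep in mind that the command-line PHP and the web server can load different php.ini files, so run it through the same environment that serves your site.

```php
<?php
// True only when the tidy extension is loaded and its procedural API exists.
$hasTidy = extension_loaded('tidy') && function_exists('tidy_repair_string');

echo $hasTidy ? "tidy is available\n" : "tidy is NOT available\n";
```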


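On Windows, you need to install it manually. Instructions can be found here.

I found it to be very fast. I came across it while looking for an alternative to HTMLPurifier, which is very slow.

Or just use a DOMDocument object: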

$dirty = "<xml>some content</xml>";
$x = new DOMDocument;
$x->loadHTML($dirty);
$clean = $x->saveXML();
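For the situation in the question (tidy available locally but not on the live host), one option is to call tidy when it exists and fall back to DOMDocument otherwise. The following is a rough sketch under a few assumptions: repair_fragment() is a hypothetical helper name, the input is assumed to be UTF-8, and the leading XML declaration is a commonly used workaround for making libxml read the markup as UTF-8. The fallback output will not match tidy's byte for byte.

```php
<?php
// Hypothetical helper (not from any library): prefer tidy, else use DOMDocument.
function repair_fragment(string $html): string
{
    if (function_exists('tidy_repair_string')) {
        $config = ['output-xhtml' => true, 'show-body-only' => true];
        return (string) tidy_repair_string($html, $config, 'utf8');
    }

    // Keep warnings about malformed markup from surfacing while parsing.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    // Assumption: input is UTF-8; the XML declaration hints that to libxml.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // loadHTML() wraps fragments in <html><body>; emit only the body's children.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveXML($child);
    }
    return $out;
}

echo repair_fragment('<p>Hello <b>world</p>'), "\n";
```

With this in place, the call in the question could become $fragment = repair_fragment($dom->saveHTML()); and behave the same way on both servers, give or take whitespace differences.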

PHP SuperTidy?

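I got tired of PHP Tidy doing a poor job, so I started writing this. It should also tidy up all the JavaScript. It has not been thoroughly tested, so you may run into some unexpected cases that still need to be handled. It would be great to see other talented developers build on it. I know this is an old thread, but I wanted to share it somewhere.

Here is a start. Enjoy.

SuperTidy usage: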

$Tidy = new SuperTidy($html);
$Tidy->SetIndentSize(4);
$Tidy->SetOffset(0);
echo $Tidy->BeautifiedHTML();
SuperTidy class:

<?php
    class SuperTidy
    {
        /*
            Name: PHP SuperTidy
            Author: Paul Ishak
            Copyright: 2020
        */
        private $usedJSNames = [];
        private $indentSize = 4;
        private $sourceHtml = "";
        private $offset = -4;
        public function SetIndentSize($size)
        {
            $this->indentSize = $size;
        }
        public function __construct($html)
        {
            $this->sourceHtml = $html;
        }
        public function OriginalSource()
        {
            return $this->sourceHtml;
        }
        public function UpdateSource($html)
        {
            $this->sourceHtml = $html;          
        }
        public function SetOffset($offset)
        {
            $this->offset = $offset;
        }
        function BeautifiedHTML()
        {
            $this->usedJSNames = [];
            $buffer = $this->sourceHtml;
            $spacesPerIndent = $this->indentSize;
            $JSPlaceHolders = [];
            $out = str_replace("\r","\n",$buffer);
            $out = str_replace("\n\n","\n",$out);
            $out = str_replace("<script", "\n<script",$out);
            $out = str_replace("</script>", "\n</script>\n",$out);
            $lines = explode("\n",$out);
            $javascript = "";
            $outLines = [];
            for($i = 0; $i < count($lines); $i++)
            {
                $line = $lines[$i];
                $line = trim($line);
                if($line == "</script>") continue;
                if(strlen($line) >= strlen("<script"))
                {
                    if(strtolower(substr($line,0,7)) == "<script")
                    {
                        if(strpos(strtolower($line),"</script>"))
                        {
                            $outLines[] = $line;
                        }
                        else
                        {
                            $counter = $i + 1;
                            $jsLine = $lines[$counter];
                            $javascript = "";
                            $lineCount = 0;
                            while(strtolower(trim($jsLine)) !== "</script>")
                            {
                                $lineCount++;
                                $javascript.=$jsLine."\n";
                                $counter++;
                                if($counter > count($lines) - 1) break;
                                $jsLine = $lines[$counter];
                            }
                            $i+=$lineCount;
                            if(trim($javascript) == "")
                            {
                                $i++;
                                $line2 = $lines[$i];
                                $thisLine = $line.$line2;
                                if(strpos($thisLine,"src="))
                                {
                                    $outLines[] = $thisLine;
                                }
                                else
                                {
                                    $chars = str_split($thisLine);
                                    
                                    $stO = strpos(strtolower($thisLine),"<script");
                                    $enO = strpos(strtolower($thisLine),">",$stO)+1;
                                    $tagO = substr($thisLine,$stO,$enO);
                                    
                                    $stC = strpos(strtolower($thisLine),"</script");
                                    $enC = strpos(strtolower($thisLine),">",$stC)+1;
                                    $tagC = substr($thisLine,$stC,$enC);
                                    $javascript = substr($thisLine,$enO,$stC - $enO);
                                    $outLines[] = "<script type='application/javascript'>".$javascript."</script>";             
                                }
                            }
                            else
                            {
                                $unique = $this->GetUniqueJSPlaceHolder($out);
                                $JSPlaceHolders[$unique] = ['javascript'=>$javascript];
                                $outLines[] = "<$unique type='application/javascript'></$unique>";
                            }
                        }
                    }
                    else
                    {
                        $outLines[] = $line;
                    }
                }
                else
                {
                    $outLines[] = $line;
                }
            }
            $modHTML = "";
            foreach($outLines as $line)
            {
                $modHTML .= $line."\n";
            }
            $modHTML = str_replace("\n","",$modHTML);
            $modHTML = str_replace(">",">\n",$modHTML);
            $modHTML = str_replace("<","\n<",$modHTML);
            $modHTML = str_replace("\n\n","\n",$modHTML);
            $lines = explode("\n",$modHTML);
            $outLines = [];
            $indentLevel = -$spacesPerIndent + $this->offset;
            $openTags = [];
            foreach($lines as $line)
            {
                $line = trim($line);
                if($line !== "") $outLines[] = $line;
            }
            $modHTML = "";
            for($j = 0; $j < count($outLines); $j++)
            {
                $line = $outLines[$j];
                $isCloseTag = false;
                $firstChar = substr($line,0,1);
                $isMetaTag = substr(strtolower($line),1, 4) == "meta" ? true: false;
                $isDocType = substr(strtolower($line),2, 7) == "doctype" ? true: false;
                $isSelfClosing = substr($line, strlen($line)-2,1) == "/" ? true : false;
                $beginComment = substr($line, 0,4) == "<!--" ? true : false;
                $applyIndent = ($firstChar == "<") ? true : false;
                $applyIndent = $isMetaTag     ? false : $applyIndent;
                $applyIndent = $isDocType     ? false : $applyIndent;
                $applyIndent = $isSelfClosing ? false : $applyIndent;
                $applyIndent = $beginComment ? false : $applyIndent;
                $contentIndent = $applyIndent ? false : true;
                $tag = "";
                if($applyIndent) 
                { //This is a tag only
                    $tagInner = substr($line,1,-1);
                    $tag = "";
                    for($i = 0; $i < strlen($tagInner); $i++)
                    {
                        $char = substr($tagInner,$i,1);
                        if($char == " ") break;
                        if($char == ">") break; 
                        $tag .=$char;
                    }
                    $isCloseTag = substr($tag,0,1) == "/" ? true: false;
                    
                    if($isCloseTag)
                    {
                        $indentLevel -= $spacesPerIndent;   
                    }
                    else
                    {
                        $indentLevel += $spacesPerIndent;
                        $findTag = "</$tag>";
                        $line2 = $outLines[$j+1] ?? ""; // avoid reading past the last line
                        if(strtolower($line2) == strtolower($findTag))
                        {
                            $line = $line.$line2;
                            $j+=1;
                            $indentLevel -= $spacesPerIndent;
                            $isCloseTag = true;
                        }
                    }
                }
                $spaces = $indentLevel;
                $spaces += $contentIndent ? $spacesPerIndent : 0;
                $spaces += $isCloseTag    ? $spacesPerIndent : 0;
                $prependSpace = str_repeat(" ", $spaces);
                $line = $prependSpace.$line;
                if($tag !== "")
                {
                    $keys = array_keys($JSPlaceHolders);
                    if(in_array($tag,$keys))
                    {
                        $JSPlaceHolders[$tag]['indent'] = $indentLevel;
                    }
                }           
                $modHTML .= $line."\n";
            }
            $keys = array_keys($JSPlaceHolders);
            foreach($keys as $key)
            {
                $javascript = $JSPlaceHolders[$key]['javascript'];
                $indentOffset = $JSPlaceHolders[$key]['indent']+1;
                $javascript = $this->JSTidy($javascript, $indentOffset + ($spacesPerIndent*2), $spacesPerIndent);
                $otStart = strpos($modHTML,"<$key");
                $otEnd   = strpos($modHTML,">", $otStart)+1;
                $ot = substr($modHTML,$otStart, ($otEnd - $otStart));
                $otOut = str_replace($key, "script",$ot);
                $ctStart = strpos($modHTML,"</$key", $otEnd);
                $ctEnd = strpos($modHTML,">", $ctStart)+1;
                $ct = substr($modHTML,$ctStart, ($ctEnd - $ctStart));
                $ctOut = str_repeat(" ",$indentOffset+$spacesPerIndent-1).str_replace($key, "script",$ct);
                $otOut .= "\n".$javascript."\n";
                $modHTML = str_replace($ot,$otOut,$modHTML);
                $modHTML = str_replace($ct,$ctOut,$modHTML);
            }
            return $modHTML;
        }
        function JSTidy($javascript, $indentOffset, $spacesPerIndent)
        {
            $javascript = str_replace("{", "\n{",$javascript);
            $javascript = str_replace("}", "\n}",$javascript);
            $minJs = preg_replace(array("/\s+\n/", "/\n\s+/", "/ +/"), array("\n", "\n ", " "), $javascript);
            $jsLines = explode("\n",$minJs);
            $jsOut = "";
            $indent = $indentOffset;
            $count = count($jsLines);
            for($j = 0; $j < $count;$j++)
            {
                $line = trim($jsLines[$j]);
                if($line == "") continue;
                $c = substr($line,0,1);
                if($c == "}") $indent = $indent - $spacesPerIndent;
                $i = 0;
                $outLine = "";
                while(++$i < $indent)
                {
                    $outLine .=" ";
                }
                $outLine .=$line;
                $jsOut .=$outLine;
                if($j < $count - 2)
                {
                    $jsOut .="\n";
                }
                if($c == "{") $indent = $indent + $spacesPerIndent;             
            }
            return $jsOut;
        }
        function GetUniqueJSPlaceHolder($targetHTML)
        {
            $str = rand(); 
            $unique = "JS".strtoupper(hash("sha256", $str));
            // strpos() returns 0 for a match at the start, so compare against false explicitly
            while(strpos($targetHTML,$unique) !== false || in_array($unique, $this->usedJSNames))
            {
                $str = rand(); 
                $unique = "JS".strtoupper(hash("sha256", $str));
            }
            $this->usedJSNames[] = $unique;
            return $unique;
        }
    }
?>


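HTML Purifier depends on PHP Tidy, AFAIK.
No, it seems it doesn't: "You don't need tidy installed on your PHP to use these features!"
Maybe only the OO way works: $tidy = new tidy(); $clean = $tidy->repairString($dom->saveHTML(), …)
No... but I have found another solution, using regex. Thanks.
Thank you, this saved me.
FYI: libxml_use_internal_errors(true) will suppress the PHP warnings generated by bad HTML.
Works great! Thank you.
@visualex I haven't tried it, but it usually depends on the "encoding" attribute you set on the root of the XML file. You can also provide the encoding programmatically.
This was the winner for me. I use it in a Moodle site because I was getting the same "undefined function tidy_repair_string()" error.
It is probably shared hosting; otherwise teelou would have done that themselves.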
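To illustrate the libxml_use_internal_errors(true) tip from the comments: the sketch below turns on internal error handling, parses some deliberately sloppy markup, and then reads back whatever libxml collected instead of letting it surface as PHP warnings. The sample markup and the $messages variable are made up for illustration, and which errors libxml reports can vary between libxml versions.

```php
<?php
// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML('<div><p>Unclosed paragraph<span>text</div>');

// Each entry is a LibXMLError object with level, code, line and message.
$errors = libxml_get_errors();
$messages = [];
foreach ($errors as $error) {
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}

// Empty the error buffer and restore the earlier setting for other code.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Serialize the repaired tree; unclosed elements now have closing tags.
$clean = $doc->saveXML($doc->documentElement);
echo $clean, "\n";
```

Restoring the previous setting matters on a shared code base such as a Moodle plugin, where other code may rely on the default warning behaviour.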