PHP htmlentities() without converting HTML tags


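I've found a few posts about this problem, but none of them solves it completely.

I need a function that outputs content with all special characters converted the way htmlentities() does, but leaves all HTML tags intact.

I've tried many different approaches but, as mentioned above, none of them works as expected.

I was wondering whether there is a way to do this with the PHP class DOMDocument.

I've tried the following: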

$objDom = new DOMDocument('1.0', 'utf-8');
$objDom->loadHTML($content);
return $objDom->saveHTML();

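This works, but it also adds the whole page structure, i.e. <head>, <body>, etc.

I only need to convert the contents of the $content variable and be done with it.

One more thing worth mentioning: $content may already contain some characters converted to their XHTML-compliant entities, because it comes from a WYSIWYG editor. Those existing entities should be preserved as well.

Does anyone know how to do this with DOMDocument? Perhaps I should be using a different save method.

OK, I've come up with the following. It's not pretty, but it does the job: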

$objDom = new DOMDocument('1.0', 'UTF-8');
$objDom->loadHTML($string);
$output = $objDom->saveXML($objDom->documentElement);
$output = str_replace('<html><body>', '', $output);
$output = str_replace('</body></html>', '', $output);
$output = str_replace('&#13;', '', $output);
return $output;

If you have any better ideas, they would be much appreciated.
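
For what it's worth, the string replacements might be avoidable on newer PHP versions: saveHTML() accepts a node argument as of PHP 5.3.6, so each child of <body> can be serialized on its own. The following is only a sketch under that assumption. The helper name inner_body_html is made up for illustration, the type declarations assume PHP 7 or later, and how non-ASCII characters come out depends on the input encoding and the libxml version.

```php
<?php
// Hypothetical helper (not from the thread): serialize only the children
// of <body>, so the <html>/<body> wrapper never reaches the output.
function inner_body_html(string $html): string
{
    // loadHTML() rejects an empty string, so handle that case up front.
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument('1.0', 'UTF-8');

    // Collect parser warnings (e.g. for a bare "&") instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // saveHTML($node) serializes a single node (PHP >= 5.3.6).
    $output = '';
    foreach ($body->childNodes as $child) {
        $output .= $dom->saveHTML($child);
    }
    return $output;
}
```

Calling inner_body_html('<p>Tom & Jerry</p>') should give back the paragraph with the ampersand encoded and without any <html> or <body> wrapper.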

You can use get_html_translation_table() and remove the < and > entries:

$trans = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES);
unset($trans['<'], $trans['>']);
$output = strtr($input, $trans);

get_html_translation_table(HTML_ENTITIES) gives you the translation table used by htmlentities() as an array. You can remove ", < and > from the array like this:

<?php
$trans = get_html_translation_table(HTML_ENTITIES);
unset($trans["\""], $trans["<"], $trans[">"]);
$str = "Hallo <strong>& Frau</strong> & Krämer";
$encoded = strtr($str, $trans);

echo $encoded;
?>
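
The translation-table approach above can be wrapped in a small function that passes the encoding explicitly, which may help when the input is UTF-8. This is a sketch only: the function name encode_keeping_tags is hypothetical, and the type declarations and the "\u{...}" escape in the usage note assume PHP 7 or later. Note that & is still in the table, so entities that are already encoded (such as &amp;) would be encoded a second time.

```php
<?php
// Hypothetical wrapper (not from the thread) around the translation-table idea.
function encode_keeping_tags(string $input, string $encoding = 'UTF-8'): string
{
    // ENT_NOQUOTES leaves both quote characters out of the table, so
    // attribute values such as href="..." are not touched.
    $table = get_html_translation_table(HTML_ENTITIES, ENT_NOQUOTES, $encoding);

    // Drop the angle brackets so that tags survive the translation.
    unset($table['<'], $table['>']);

    return strtr($input, $table);
}
```

For example, encode_keeping_tags("<b>Kr\u{00E4}mer & Co</b>") returns <b>Kr&auml;mer &amp; Co</b>.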

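First, I'd like to say that, in my opinion, what you are trying to do is fundamentally wrong. What if someone wants to type a less-than sign? Personally, I see htmlentities() as a way to make sure users can't enter their own HTML code.

If you need users to be able to style their text, there are plenty of ready-made solutions for that.

If you must allow users to enter HTML tags, and you must assume they don't know how to use entities, here is a simple function that works: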

function my_htmlentities ($str)
{
  // We'll append everything to this.
  $result = '';

  // Continue while there are HTML tags.
  while (($lt = strpos($str, '<')) !== false)
  {
    // Run `htmlentities` on everything before the tag, and pop it 
    // off the original string.
    $result .= htmlentities(substr($str, 0, $lt));
    $str = substr($str, $lt);

    // We want to continue until we reach the end of the tag. I know 
    // these loops are bad form. Sorry. I still think in F77 :p
    while (true)
    {
      // Find the closing tag as well as quotes.
      $gt = strpos($str, '>');
      $quot = strpos($str, '"');

      // If there is no closing bracket, append the rest of the tag 
      // as plaintext and exit.
      if ($gt === false)
        return $result . $str;

      // If there is a quote before the closing bracket, take care 
      // of it.
      if ($quot !== false && $quot < $gt)
      {
        // Grab everything before the quote.
        $result .= substr($str, 0, $quot+1);
        $str = substr($str, $quot+1);

        // Find the closing quote (if there is none, append and 
        // exit).
        if (($quot = strpos($str, '"')) === false)
          return $result . $str;

        // Grab the inside of the quote.
        $result .= substr($str, 0, $quot+1);
        $str = substr($str, $quot+1);

        // Start over as if we were at the beginning of the tag.
        continue;
      }

      // We just have the closing bracket to deal with. Deal.
      $result .= substr($str, 0, $gt+1);
      $str = substr($str, $gt+1);
      break;
    }
  }

  // There are no more tags, so we can run `htmlentities()` on the 
  // rest of the string.
  return $result . htmlentities($str);

  // Alternatively, if you want users to be able to enter their own
  // entities as well, use this line instead of the return above:
  // return str_replace('&amp;', '&', $result . htmlentities($str));
}
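
The same idea (encode only the text between tags) could also be sketched more compactly with preg_split() instead of a hand-written scanner. This is a different technique from the loop above, not a rewrite of it. The function name encode_text_between_tags is hypothetical, the type declarations assume PHP 7 or later, and the pattern assumes no attribute value contains a literal > character. Passing false as the fourth argument of htmlentities() (double_encode, available since PHP 5.2.3) leaves entities that are already present untouched.

```php
<?php
// Hypothetical helper (not from the thread): split on tags, encode the rest.
function encode_text_between_tags(string $html, string $encoding = 'UTF-8'): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates between
    // text (even indexes) and captured tags (odd indexes).
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        // On a regex failure, fall back to encoding everything.
        return htmlentities($html, ENT_NOQUOTES, $encoding, false);
    }

    $output = '';
    foreach ($parts as $index => $part) {
        if ($index % 2 === 1) {
            // A tag: copy it through unchanged.
            $output .= $part;
        } else {
            // Plain text: encode it, but do not re-encode existing entities.
            $output .= htmlentities($part, ENT_NOQUOTES, $encoding, false);
        }
    }
    return $output;
}
```

A lone < with no closing > is treated as text here, so encode_text_between_tags('1 < 2') returns 1 &lt; 2. The security caveat about letting users submit raw tags applies to this variant just as much.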

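But let me reiterate: this is very insecure! I'll give you the benefit of the doubt that you know what you want, but I don't think you (or anyone) should want this.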

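OK, after a lot of research I've come up with the final option, which seems to be exactly what I need.

I used HTML Purifier and filtered my content with the following: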

require_once('HTMLPurifier/HTMLPurifier.auto.php');
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
$objPurifier = new HTMLPurifier($config);
return $objPurifier->purify($string);

I hope someone else finds it useful.

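I don't get it. You want to get the "content" with special characters converted, but without converting special characters?

I don't want to get the content. I have the content and I want to convert it, so that & becomes &amp; and so on, but the function must not convert the HTML tags, so they stay unchanged.

What version of PHP are you using? If it is >= 5.3.6, you can pass a parameter to saveHTML to specify which node to save.

I'm using 5.2, but the system will eventually run the latest version of PHP. Would you mind including that example?

Note that any HTML tag with attributes won't be converted correctly unless you also unset the double quote (").

Yes, that's not ideal, because I'd like double quotes to be converted as well when they are found in the content and are not part of an HTML tag.

@user398341: Why? They don't need to be encoded except inside double-quoted attribute values.

When validating the HTML, this document returns the following error: Sorry, I am unable to validate this document because on line 179 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication. The error was: utf8 "\x80" does not map to Unicode

@user398341: You may need to adjust the charset_hint parameter of get_html_translation_table.

In this case the output is only ever modified by administrators, so there is no external input. For visitors I would use a completely different approach. The content this method outputs comes from CKEditor, so it has already been converted to the relevant symbols, but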