PHP htmlentities() does not convert the right-pointing triangle (▶) to an HTML entity

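According to this list, the special character ▶ has the HTML entity &#9654;. I therefore expected the PHP function htmlentities() to convert ▶ to &#9654;. However, when I run a string containing that character through the function and store it in a MySQL database, this is what shows up: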

∗–∗

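I have already set the HTTP header on the page that sends the string, and I even tried adding this to the PHP file that processes the string: header('Content-Type: text/html; charset=utf-8'), but it did not help. What am I doing wrong?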


Thanks in advance.

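When working with UTF-8 characters, the key is that every step in the chain must use UTF-8. If any step does not, the data ends up being treated as ISO-8859-1.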
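To illustrate what such a mismatch looks like, here is a minimal sketch. It assumes the UTF-8 bytes are read back as Windows-1252, which is what MySQL's latin1 is based on. The helper name misread_as_cp1252() is hypothetical, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical helper: treat the raw bytes of a UTF-8 string as if they
// were Windows-1252 text, then re-encode that misreading as UTF-8.
function misread_as_cp1252(string $utf8Bytes): string
{
    return mb_convert_encoding($utf8Bytes, 'UTF-8', 'Windows-1252');
}

$triangle = "\u{25B6}"; // the character ▶

// ▶ is three bytes in UTF-8.
echo bin2hex($triangle), "\n";            // e296b6

// Read one byte at a time as Windows-1252, those bytes become three
// unrelated characters: â (0xE2), – (0x96) and ¶ (0xB6).
echo misread_as_cp1252($triangle), "\n";  // â–¶
```

No data is lost at this point; the bytes are merely interpreted with the wrong character set, which is why fixing the declared encodings matters more than transforming the string.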

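Make sure to check:

  • The collation of the table columns in the database.
  • If the value is hard-coded in a PHP file, make sure the file is saved as UTF-8.
  • If the data comes from the browser, make sure the PHP Content-Type header specifies UTF-8. You can usually omit the corresponding charset declaration in the HTML,
    because the browser uses the HTTP header if it receives one.
  • The connection to the database must also specify the encoding.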
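The snippet that originally accompanied the last point is not preserved here, so the following is only a sketch of one common way to do it, assuming PDO with the MySQL driver. The helper build_utf8_dsn() and the host, database, and credential values are hypothetical placeholders.

```php
<?php
// Hypothetical helper: build a PDO MySQL DSN that declares the connection
// character set. utf8mb4 is MySQL's name for full UTF-8.
function build_utf8_dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

$dsn = build_utf8_dsn('localhost', 'example_db');
echo $dsn, "\n";

// Connecting is left commented out because it needs a real server and
// real credentials (the values below are placeholders):
// $pdo = new PDO($dsn, 'example_user', 'example_password');
```

With mysqli, calling set_charset('utf8mb4') on the connection object serves the same purpose.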

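Edit:

I think this may be a little misleading:

htmlentities: Convert all applicable characters to HTML entities

I think it should say "convert all applicable characters that are available in the translation table to HTML entities". Not every character is in the translation table, and any character that is not in it will not be converted to its HTML entity. To see which characters the table contains, see get_html_translation_table().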

For example, running:

print_r( get_html_translation_table(HTML_ENTITIES));
outputs:

Array
(
    ["] => &quot;
    [&] => &amp;
    [<] => &lt;
    [>] => &gt;
    [ ] => &nbsp;
    [¡] => &iexcl;
    [¢] => &cent;
    [£] => &pound;
    [¤] => &curren;
    [¥] => &yen;
    [¦] => &brvbar;
    [§] => &sect;
    [¨] => &uml;
    [©] => &copy;
    [ª] => &ordf;
    [«] => &laquo;
    [¬] => &not;
    [­] => &shy;
    [®] => &reg;
    [¯] => &macr;
    [°] => &deg;
    [±] => &plusmn;
    [²] => &sup2;
    [³] => &sup3;
    [´] => &acute;
    [µ] => &micro;
    [¶] => &para;
    [·] => &middot;
    [¸] => &cedil;
    [¹] => &sup1;
    [º] => &ordm;
    [»] => &raquo;
    [¼] => &frac14;
    [½] => &frac12;
    [¾] => &frac34;
    [¿] => &iquest;
    [À] => &Agrave;
    [Á] => &Aacute;
    [Â] => &Acirc;
    [Ã] => &Atilde;
    [Ä] => &Auml;
    [Å] => &Aring;
    [Æ] => &AElig;
    [Ç] => &Ccedil;
    [È] => &Egrave;
    [É] => &Eacute;
    [Ê] => &Ecirc;
    [Ë] => &Euml;
    [Ì] => &Igrave;
    [Í] => &Iacute;
    [Î] => &Icirc;
    [Ï] => &Iuml;
    [Ð] => &ETH;
    [Ñ] => &Ntilde;
    [Ò] => &Ograve;
    [Ó] => &Oacute;
    [Ô] => &Ocirc;
    [Õ] => &Otilde;
    [Ö] => &Ouml;
    [×] => &times;
    [Ø] => &Oslash;
    [Ù] => &Ugrave;
    [Ú] => &Uacute;
    [Û] => &Ucirc;
    [Ü] => &Uuml;
    [Ý] => &Yacute;
    [Þ] => &THORN;
    [ß] => &szlig;
    [à] => &agrave;
    [á] => &aacute;
    [â] => &acirc;
    [ã] => &atilde;
    [ä] => &auml;
    [å] => &aring;
    [æ] => &aelig;
    [ç] => &ccedil;
    [è] => &egrave;
    [é] => &eacute;
    [ê] => &ecirc;
    [ë] => &euml;
    [ì] => &igrave;
    [í] => &iacute;
    [î] => &icirc;
    [ï] => &iuml;
    [ð] => &eth;
    [ñ] => &ntilde;
    [ò] => &ograve;
    [ó] => &oacute;
    [ô] => &ocirc;
    [õ] => &otilde;
    [ö] => &ouml;
    [÷] => &divide;
    [ø] => &oslash;
    [ù] => &ugrave;
    [ú] => &uacute;
    [û] => &ucirc;
    [ü] => &uuml;
    [ý] => &yacute;
    [þ] => &thorn;
    [ÿ] => &yuml;
    [Œ] => &OElig;
    [œ] => &oelig;
    [Š] => &Scaron;
    [š] => &scaron;
    [Ÿ] => &Yuml;
    [ƒ] => &fnof;
    [ˆ] => &circ;
    [˜] => &tilde;
    [Α] => &Alpha;
    [Β] => &Beta;
    [Γ] => &Gamma;
    [Δ] => &Delta;
    [Ε] => &Epsilon;
    [Ζ] => &Zeta;
    [Η] => &Eta;
    [Θ] => &Theta;
    [Ι] => &Iota;
    [Κ] => &Kappa;
    [Λ] => &Lambda;
    [Μ] => &Mu;
    [Ν] => &Nu;
    [Ξ] => &Xi;
    [Ο] => &Omicron;
    [Π] => &Pi;
    [Ρ] => &Rho;
    [Σ] => &Sigma;
    [Τ] => &Tau;
    [Υ] => &Upsilon;
    [Φ] => &Phi;
    [Χ] => &Chi;
    [Ψ] => &Psi;
    [Ω] => &Omega;
    [α] => &alpha;
    [β] => &beta;
    [γ] => &gamma;
    [δ] => &delta;
    [ε] => &epsilon;
    [ζ] => &zeta;
    [η] => &eta;
    [θ] => &theta;
    [ι] => &iota;
    [κ] => &kappa;
    [λ] => &lambda;
    [μ] => &mu;
    [ν] => &nu;
    [ξ] => &xi;
    [ο] => &omicron;
    [π] => &pi;
    [ρ] => &rho;
    [ς] => &sigmaf;
    [σ] => &sigma;
    [τ] => &tau;
    [υ] => &upsilon;
    [φ] => &phi;
    [χ] => &chi;
    [ψ] => &psi;
    [ω] => &omega;
    [ϑ] => &thetasym;
    [ϒ] => &upsih;
    [ϖ] => &piv;
    [ ] => &ensp;
    [ ] => &emsp;
    [ ] => &thinsp;
    [‌] => &zwnj;
    [‍] => &zwj;
    [‎] => &lrm;
    [‏] => &rlm;
    [–] => &ndash;
    [—] => &mdash;
    [‘] => &lsquo;
    [’] => &rsquo;
    [‚] => &sbquo;
    [“] => &ldquo;
    [”] => &rdquo;
    [„] => &bdquo;
    [†] => &dagger;
    [‡] => &Dagger;
    [•] => &bull;
    […] => &hellip;
    [‰] => &permil;
    [′] => &prime;
    [″] => &Prime;
    [‹] => &lsaquo;
    [›] => &rsaquo;
    [‾] => &oline;
    [⁄] => &frasl;
    [€] => &euro;
    [ℑ] => &image;
    [℘] => &weierp;
    [ℜ] => &real;
    [™] => &trade;
    [ℵ] => &alefsym;
    [←] => &larr;
    [↑] => &uarr;
    [→] => &rarr;
    [↓] => &darr;
    [↔] => &harr;
    [↵] => &crarr;
    [⇐] => &lArr;
    [⇑] => &uArr;
    [⇒] => &rArr;
    [⇓] => &dArr;
    [⇔] => &hArr;
    [∀] => &forall;
    [∂] => &part;
    [∃] => &exist;
    [∅] => &empty;
    [∇] => &nabla;
    [∈] => &isin;
    [∉] => &notin;
    [∋] => &ni;
    [∏] => &prod;
    [∑] => &sum;
    [−] => &minus;
    [∗] => &lowast;
    [√] => &radic;
    [∝] => &prop;
    [∞] => &infin;
    [∠] => &ang;
    [∧] => &and;
    [∨] => &or;
    [∩] => &cap;
    [∪] => &cup;
    [∫] => &int;
    [∴] => &there4;
    [∼] => &sim;
    [≅] => &cong;
    [≈] => &asymp;
    [≠] => &ne;
    [≡] => &equiv;
    [≤] => &le;
    [≥] => &ge;
    [⊂] => &sub;
    [⊃] => &sup;
    [⊄] => &nsub;
    [⊆] => &sube;
    [⊇] => &supe;
    [⊕] => &oplus;
    [⊗] => &otimes;
    [⊥] => &perp;
    [⋅] => &sdot;
    [⌈] => &lceil;
    [⌉] => &rceil;
    [⌊] => &lfloor;
    [⌋] => &rfloor;
    [〈] => &lang;
    [〉] => &rang;
    [◊] => &loz;
    [♠] => &spades;
    [♣] => &clubs;
    [♥] => &hearts;
    [♦] => &diams;
)
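The same table can be queried directly rather than read by eye. The sketch below assumes the HTML 4.01 entity set (ENT_HTML401) and UTF-8 input, and writes the characters as Unicode escapes so the result does not depend on how the file is saved.

```php
<?php
$flags = ENT_QUOTES | ENT_HTML401;
$table = get_html_translation_table(HTML_ENTITIES, $flags, 'UTF-8');

$pound    = "\u{00A3}"; // £
$triangle = "\u{25B6}"; // ▶

// £ has a named entity in the table, ▶ does not.
$hasPound    = isset($table[$pound]);
$hasTriangle = isset($table[$triangle]);
var_dump($hasPound);    // bool(true)
var_dump($hasTriangle); // bool(false)

// htmlentities() therefore converts £ and leaves ▶ untouched.
$converted = htmlentities($pound . ' ' . $triangle, $flags, 'UTF-8');
echo $converted, "\n";  // &pound; ▶
```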
So, using the example superentities() function, you can do the following:

var_dump(superentities('▶')); // outputs string(7) "&#9654;"

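That said, I would still recommend storing everything in the database unencoded. It is generally better to encode appropriately just before output to the browser. That way, if you ever need to change how the data is encoded, you do not have to decode it and re-encode it some other way. For this to work, you must make sure every encoding is correctly set to UTF-8, as described in my original answer.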
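A minimal sketch of that approach, assuming the stored value is raw UTF-8 and the page is served as UTF-8. The function name escape_for_html() is hypothetical.

```php
<?php
// Hypothetical helper: escape a raw UTF-8 string at output time.
// Only the characters that are special in HTML are replaced; ▶ is sent
// as-is, which is fine when the page itself is declared as UTF-8.
function escape_for_html(string $raw): string
{
    return htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Pretend this value was read unmodified from the database.
$stored = "\u{25B6} Play <b>now</b>";

echo escape_for_html($stored), "\n"; // ▶ Play &lt;b&gt;now&lt;/b&gt;
```

Because the database keeps the original text, switching to a different output format later only means changing this one function.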

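What is the collation of your database table?

@Mike latin1_swedish_ci

That is your problem. Change it to utf8_general_ci.

@Mike - hmm.. I changed the collation to utf8_unicode_ci, but I am still getting
–¨
in the database. Is something else needed to make it work?

Did you insert the value again? If not, you need to. You inserted a UTF-8 value into an ISO-8859-1 table. There is no way to convert it back to UTF-8 and have the extra characters display the way you want. I am still confused about one thing, though. I do not know why the htmlentities function is not working. It should convert the character to &#9654;, as you stated in your question. If it had converted it, you would not have this problem. So perhaps my answer is not actually correct, because I assumed you were storing the raw character rather than the entity. Please correct me if I am wrong.

Yes, that is right. I had a look and, as you mentioned, the database contains the garbled characters rather than the entity code. That is really odd because, as I mentioned earlier, the function works fine on some strings, such as quotes, £ and so on. Is it possible that htmlentities() does not affect ▶? For example, that special character is not in this list, which claims to be a complete list of HTML entities. Does ▶ work for you?

I get the same result as you. It cannot convert it. However, the link above is definitely not a complete list of UTF-8 characters. Not even close.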
// Unicode-proof htmlentities.
// Returns 'normal' chars as chars and weirdos as numeric HTML entities.
function superentities( $str ){
    // decode existing entities first, otherwise they would be double-escaped
    $str = html_entity_decode(stripslashes($str), ENT_QUOTES, 'UTF-8');
    // split into an array of individual (possibly multi-byte) characters
    $ar = preg_split('/(?<!^)(?!$)/u', $str);
    $str2 = ''; // initialise the result so that .= below never hits an undefined variable
    foreach ($ar as $c){
        $o = ord($c); // value of the character's first byte
        if ( (strlen($c) > 1) || /* multi-byte [unicode] */
            ($o < 32 || $o > 126) || /* <- control / latin weirdos -> */
            ($o > 33 && $o < 40) || /* quotes + ampersand */
            ($o > 59 && $o < 63) /* html */
        ) {
            // convert to numeric entity (this conversion map covers U+0000 to U+FFFF)
            $c = mb_encode_numericentity($c, array(0x0, 0xffff, 0, 0xffff), 'UTF-8');
        }
        $str2 .= $c;
    }
    return $str2;
}
var_dump(superentities('▶')); // outputs string(7) "&#9654;"
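If all that is needed is numeric entities for non-ASCII characters, a shorter variant is possible. This is only a sketch under the assumption that the input is raw UTF-8 text and that the mbstring extension is available; numeric_entities() is a hypothetical name, not part of the answer above.

```php
<?php
// Hypothetical helper: escape HTML special characters first, then turn
// every character from U+0080 up to U+10FFFF into a numeric entity.
function numeric_entities(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // Conversion map: start code point, end code point, offset, mask.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($escaped, $convmap, 'UTF-8');
}

echo numeric_entities("\u{25B6} <b>"), "\n"; // &#9654; &lt;b&gt;
```

Unlike superentities(), this version does not decode existing entities first, so input that already contains entities would be escaped a second time.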