PHP htmlentities() does not convert the right-pointing triangle (▶) to an HTML entity
Tags: php, utf-8, special-characters, html-entities

According to this list, the special character ▶ has the HTML entity &#9654;. I therefore assumed that the PHP function htmlentities() would convert ▶ to &#9654;. However, when I run a string containing that character through the function and store it in a MySQL database, the following shows up:
∗–∗
I have already set the HTTP header on the page the string is sent to, and I even tried adding it to the PHP file that processes the string: header('Content-Type: text/html; charset=utf-8'), but it didn't help. What am I doing wrong?
Thanks in advance.

When dealing with UTF-8 characters, the key is that every layer has to use UTF-8; otherwise the data is treated as ISO-8859-1 somewhere along the way.
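Make sure to check:
- The collation of the table column in the database.
- If the value is hard-coded in a PHP file, that the file itself is saved as UTF-8.
- If the data comes from the browser, that the PHP Content-Type header specifies UTF-8. You can usually omit the equivalent declaration in the HTML, because the browser will use the HTTP header if it receives one.
- The connection to the database, which must also specify the encoding.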
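The last point can be sketched as follows. This is only an illustration, assuming PDO with the MySQL driver; the helper names build_utf8_dsn() and connect_utf8(), the host and the database name are hypothetical and not from the original answer. With mysqli, the equivalent would be calling set_charset() on the connection.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that pins the connection charset.
function build_utf8_dsn(string $host, string $dbname): string
{
    // utf8mb4 is MySQL's charset that covers all of UTF-8.
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

// Hypothetical helper: opens the connection. It is not called here because
// it needs a real server and credentials.
function connect_utf8(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(build_utf8_dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

echo build_utf8_dsn('localhost', 'example_db'), "\n";
```

Putting the charset in the DSN means every query on that connection is sent and received as UTF-8, regardless of the server default.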
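Edit:
I think this description may be a bit misleading:
htmlentities - Convert all applicable characters to HTML entities
I think it should say "Convert all applicable characters that are available in the translation table to HTML entities". Not every character is in the translation table, and any character that is not in it will not be converted to an HTML entity. To see which characters the translation table contains, see get_html_translation_table().
For example, running:
print_r( get_html_translation_table(HTML_ENTITIES));
outputs: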
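Array
(
["] => &quot;
[&] => &amp;
[<] => &lt;
[>] => &gt;
[ ] => &nbsp;
[¡] => &iexcl;
[¢] => &cent;
[£] => &pound;
[¤] => &curren;
[¥] => &yen;
[¦] => &brvbar;
[§] => &sect;
[¨] => &uml;
[©] => &copy;
[ª] => &ordf;
[«] => &laquo;
[¬] => &not;
[­] => &shy;
[®] => &reg;
[¯] => &macr;
[°] => &deg;
[±] => &plusmn;
[²] => &sup2;
[³] => &sup3;
[´] => &acute;
[µ] => &micro;
[¶] => &para;
[·] => &middot;
[¸] => &cedil;
[¹] => &sup1;
[º] => &ordm;
[»] => &raquo;
[¼] => &frac14;
[½] => &frac12;
[¾] => &frac34;
[¿] => &iquest;
[À] => &Agrave;
[Á] => &Aacute;
[Â] => &Acirc;
[Ã] => &Atilde;
[Ä] => &Auml;
[Å] => &Aring;
[Æ] => &AElig;
[Ç] => &Ccedil;
[È] => &Egrave;
[É] => &Eacute;
[Ê] => &Ecirc;
[Ë] => &Euml;
[Ì] => &Igrave;
[Í] => &Iacute;
[Î] => &Icirc;
[Ï] => &Iuml;
[Ð] => &ETH;
[Ñ] => &Ntilde;
[Ò] => &Ograve;
[Ó] => &Oacute;
[Ô] => &Ocirc;
[Õ] => &Otilde;
[Ö] => &Ouml;
[×] => &times;
[Ø] => &Oslash;
[Ù] => &Ugrave;
[Ú] => &Uacute;
[Û] => &Ucirc;
[Ü] => &Uuml;
[Ý] => &Yacute;
[Þ] => &THORN;
[ß] => &szlig;
[à] => &agrave;
[á] => &aacute;
[â] => &acirc;
[ã] => &atilde;
[ä] => &auml;
[å] => &aring;
[æ] => &aelig;
[ç] => &ccedil;
[è] => &egrave;
[é] => &eacute;
[ê] => &ecirc;
[ë] => &euml;
[ì] => &igrave;
[í] => &iacute;
[î] => &icirc;
[ï] => &iuml;
[ð] => &eth;
[ñ] => &ntilde;
[ò] => &ograve;
[ó] => &oacute;
[ô] => &ocirc;
[õ] => &otilde;
[ö] => &ouml;
[÷] => &divide;
[ø] => &oslash;
[ù] => &ugrave;
[ú] => &uacute;
[û] => &ucirc;
[ü] => &uuml;
[ý] => &yacute;
[þ] => &thorn;
[ÿ] => &yuml;
[Œ] => &OElig;
[œ] => &oelig;
[Š] => &Scaron;
[š] => &scaron;
[Ÿ] => &Yuml;
[ƒ] => &fnof;
[ˆ] => &circ;
[˜] => &tilde;
[Α] => &Alpha;
[Β] => &Beta;
[Γ] => &Gamma;
[Δ] => &Delta;
[Ε] => &Epsilon;
[Ζ] => &Zeta;
[Η] => &Eta;
[Θ] => &Theta;
[Ι] => &Iota;
[Κ] => &Kappa;
[Λ] => &Lambda;
[Μ] => &Mu;
[Ν] => &Nu;
[Ξ] => &Xi;
[Ο] => &Omicron;
[Π] => &Pi;
[Ρ] => &Rho;
[Σ] => &Sigma;
[Τ] => &Tau;
[Υ] => &Upsilon;
[Φ] => &Phi;
[Χ] => &Chi;
[Ψ] => &Psi;
[Ω] => &Omega;
[α] => &alpha;
[β] => &beta;
[γ] => &gamma;
[δ] => &delta;
[ε] => &epsilon;
[ζ] => &zeta;
[η] => &eta;
[θ] => &theta;
[ι] => &iota;
[κ] => &kappa;
[λ] => &lambda;
[μ] => &mu;
[ν] => &nu;
[ξ] => &xi;
[ο] => &omicron;
[π] => &pi;
[ρ] => &rho;
[ς] => &sigmaf;
[σ] => &sigma;
[τ] => &tau;
[υ] => &upsilon;
[φ] => &phi;
[χ] => &chi;
[ψ] => &psi;
[ω] => &omega;
[ϑ] => &thetasym;
[ϒ] => &upsih;
[ϖ] => &piv;
[ ] => &ensp;
[ ] => &emsp;
[ ] => &thinsp;
[‌] => &zwnj;
[‍] => &zwj;
[‎] => &lrm;
[‏] => &rlm;
[–] => &ndash;
[—] => &mdash;
[‘] => &lsquo;
[’] => &rsquo;
[‚] => &sbquo;
[“] => &ldquo;
[”] => &rdquo;
[„] => &bdquo;
[†] => &dagger;
[‡] => &Dagger;
[•] => &bull;
[…] => &hellip;
[‰] => &permil;
[′] => &prime;
[″] => &Prime;
[‹] => &lsaquo;
[›] => &rsaquo;
[‾] => &oline;
[⁄] => &frasl;
[€] => &euro;
[ℑ] => &image;
[℘] => &weierp;
[ℜ] => &real;
[™] => &trade;
[ℵ] => &alefsym;
[←] => &larr;
[↑] => &uarr;
[→] => &rarr;
[↓] => &darr;
[↔] => &harr;
[↵] => &crarr;
[⇐] => &lArr;
[⇑] => &uArr;
[⇒] => &rArr;
[⇓] => &dArr;
[⇔] => &hArr;
[∀] => &forall;
[∂] => &part;
[∃] => &exist;
[∅] => &empty;
[∇] => &nabla;
[∈] => &isin;
[∉] => &notin;
[∋] => &ni;
[∏] => &prod;
[∑] => &sum;
[−] => &minus;
[∗] => &lowast;
[√] => &radic;
[∝] => &prop;
[∞] => &infin;
[∠] => &ang;
[∧] => &and;
[∨] => &or;
[∩] => &cap;
[∪] => &cup;
[∫] => &int;
[∴] => &there4;
[∼] => &sim;
[≅] => &cong;
[≈] => &asymp;
[≠] => &ne;
[≡] => &equiv;
[≤] => &le;
[≥] => &ge;
[⊂] => &sub;
[⊃] => &sup;
[⊄] => &nsub;
[⊆] => &sube;
[⊇] => &supe;
[⊕] => &oplus;
[⊗] => &otimes;
[⊥] => &perp;
[⋅] => &sdot;
[⌈] => &lceil;
[⌉] => &rceil;
[⌊] => &lfloor;
[⌋] => &rfloor;
[〈] => &lang;
[〉] => &rang;
[◊] => &loz;
[♠] => &spades;
[♣] => &clubs;
[♥] => &hearts;
[♦] => &diams;
)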
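Rather than scanning the dump by eye, you can test a single character against the table. The snippet below is a small sketch, assuming the source file is saved as UTF-8; has_named_entity() is a hypothetical helper name, and the HTML 4.01 table is requested explicitly so the result does not depend on PHP's default flags.

```php
<?php
// HTML 4.01 entity table, keyed by UTF-8 character.
$table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'UTF-8');

// Hypothetical helper: true if the table has a named entity for $char.
function has_named_entity(array $table, string $char): bool
{
    return array_key_exists($char, $table);
}

var_dump(has_named_entity($table, '→')); // bool(true): maps to &rarr;
var_dump(has_named_entity($table, '▶')); // bool(false): not in the table

// Characters missing from the table pass through htmlentities() unchanged.
var_dump(htmlentities('▶', ENT_QUOTES | ENT_HTML401, 'UTF-8') === '▶'); // bool(true)
```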
So, using your example of ▶, you can do the following with the superentities() function defined below:
var_dump(superentities('▶')); // outputs string(7) "&#9654;"
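Having said all that, I would still recommend storing everything in the database unencoded. It is generally best to encode appropriately right before output to the browser. That way, if you ever need to change how the data is encoded, you don't have to decode it and re-encode it some other way. For this to work, you have to make sure every encoding is correctly set to UTF-8, as described in my original answer.

Comments:

What is the collation of your database table?

@Mike latin1_swedish_ci

That's your problem. Change it to utf8_general_ci.

@Mike Hmm. I changed the collation to utf8_unicode_ci, but I'm still getting –¨ in the database. Is anything else needed to make it work?

Did you insert the value again? If not, you need to. You inserted a UTF-8 value into an ISO-8859-1 table. There is no way to convert it back to UTF-8 and have the extra characters show up the way you want.

I'm still confused about one thing, though. I don't know why the htmlentities function isn't working. It should convert the character to &#9654;, as you stated in your question. If it did convert it, you wouldn't have this problem. So maybe my answer isn't actually correct, because I assumed you were storing it as ▶ in the database rather than &#9654;. Correct me if I'm wrong.

Yes, that's right. I had a look and, as you mentioned, there is ▶ in the database rather than the entity code. It really is strange because, as I mentioned earlier, the function works fine on some strings, such as quotes, £ and so on. Is it possible that htmlentities() simply doesn't affect ▶? For example, that special character is not in this list, which claims to be a complete list of HTML entities.

What does ▶ give you?

I get the same result as you. It doesn't convert it. However, the link above is definitely not a complete list of UTF-8 characters. Not even close.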
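To see why a UTF-8 value ends up as several junk characters in a latin1 column, it helps to look at the bytes. The following is only an illustrative sketch, assuming the mbstring extension is available and the source file is saved as UTF-8; the variable names are hypothetical.

```php
<?php
$char = '▶';

// In UTF-8 the triangle is one character stored as three bytes.
echo bin2hex($char), "\n";            // e296b6
echo strlen($char), "\n";             // 3 (bytes)
echo mb_strlen($char, 'UTF-8'), "\n"; // 1 (character)

// Reading those same bytes as ISO-8859-1 treats each byte as its own
// character, which is what a latin1 column or connection does.
$misread = mb_convert_encoding($char, 'UTF-8', 'ISO-8859-1');
echo mb_strlen($misread, 'UTF-8'), "\n"; // 3 (characters)
```

Once the bytes have been stored under the wrong interpretation, changing the collation afterwards does not repair rows that are already there, which is why the value has to be inserted again.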
// Unicode-proof htmlentities.
// Returns 'normal' chars as chars and weirdos as numeric HTML entities.
// Requires the mbstring extension.
function superentities( $str ){
    // Decode existing entities first, otherwise they would be double-escaped.
    // stripslashes() is kept from the original and assumes slash-escaped input.
    $str = html_entity_decode(stripslashes($str), ENT_QUOTES, 'UTF-8');
    // Split into an array of (possibly multi-byte) characters.
    $ar = preg_split('/(?<!^)(?!$)/u', $str);
    $str2 = ''; // initialise the result to avoid an undefined-variable warning
    foreach ($ar as $c){
        $o = ord($c);
        if ( (strlen($c) > 1) || /* multi-byte [unicode] */
            ($o < 32 || $o > 126) || /* <- control / latin weirdos -> */
            ($o > 33 && $o < 40) || /* quotes + ampersand */
            ($o > 59 && $o < 63) /* html */
        ) {
            // Convert to a numeric entity (the map covers all Unicode code points).
            $c = mb_encode_numericentity($c, array(0x0, 0x10ffff, 0, 0x1fffff), 'UTF-8');
        }
        $str2 .= $c;
    }
    return $str2;
}
var_dump(superentities('▶')); // outputs string(7) "&#9654;"
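If you follow the advice above and encode only at output time, a shorter variant is possible. This is a sketch rather than a drop-in replacement for superentities(): numeric_entities() is a hypothetical name, it assumes the mbstring extension, and unlike the function above it does not decode existing entities first.

```php
<?php
// Hypothetical helper: escape the HTML special characters, then turn every
// non-ASCII character into a decimal numeric entity.
function numeric_entities(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    // Map: code points 0x80..0x10FFFF, no offset, mask keeps all bits.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($escaped, $map, 'UTF-8');
}

echo numeric_entities('Play ▶ <now>'), "\n"; // Play &#9654; &lt;now&gt;
```

Because the map starts at 0x80, the ampersands that htmlspecialchars() produced are left alone, so nothing is escaped twice.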