PHP: Convert from UTF-8 to Unicode hex

php unicode encoding character-encoding

Is there an easy way to convert a UTF-8 string to its Unicode code points?

What I basically want to do is convert 'è' to '00E8'.

You can use json_encode to do this:

$str = "è";
$str = json_encode($str);
print $str;

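This prints "\u00e8" (json_encode wraps the result in double quotes). If needed, you can use str_replace to remove the \u. If you want an uppercase E instead of e, you can use strtoupper.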
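The steps described in this answer can be sketched as one helper. The function name jsonHex is hypothetical, and the sketch assumes the input is valid UTF-8. Note that json_encode leaves plain ASCII characters unescaped, and that characters outside the Basic Multilingual Plane come out as two UTF-16 surrogate escapes, so this only suits single non-ASCII BMP characters such as è.

```php
<?php
// Hypothetical helper illustrating the json_encode approach from the answer.
function jsonHex(string $s): string
{
    $json = json_encode($s);               // "\u00e8", including the double quotes
    $json = substr($json, 1, -1);          // drop the surrounding quotes
    $json = str_replace('\u', '', $json);  // remove the \u prefix
    return strtoupper($json);              // 00e8 becomes 00E8
}

echo jsonHex("\u{E8}"), "\n"; // 00E8
echo jsonHex("H"), "\n";      // H (ASCII is not escaped by json_encode)
```

The second call shows why this approach alone does not turn H into 0048.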

Here is a little something I modified from a debug class:

/**
 * Display utf && non-printable characters as hex
 *
 * @param string  $str     string containing binary
 * @param boolean $htmlout add html markup?
 *
 * @return string
 */
public function strInspect($str, $htmlout = false)
{
    $this->htmlout = $htmlout;
    $regex = <<<EOD
/
( [\x01-\x7F] )                 # single-byte sequences   0xxxxxxx  (ascii 1 - 127)
| (
  (?: [\xC0-\xDF][\x80-\xBF]    # double-byte sequences   110xxxxx 10xxxxxx
    | [\xE0-\xEF][\x80-\xBF]{2} # triple-byte sequences   1110xxxx 10xxxxxx * 2
    | [\xF0-\xF7][\x80-\xBF]{3} # quadruple-byte sequence 11110xxx 10xxxxxx * 3
  ){1,100}                      # ...one or more times
)
| ( [\x80-\xBF] )               # invalid byte in range 10000000 - 10111111   128 - 191
| ( [\xC0-\xFF] )               # invalid byte in range 11000000 - 11111111   192 - 255
| (.)                           # null (including x00 in the regex = fail)
/x
EOD;
    // Pass the method as a callable; a bare function name does not resolve to a class method.
    $str = preg_replace_callback($regex, array($this, 'strInspectCallback'), $str);
    return $str;
}

/**
 * Callback used by strInspect's preg_replace_callback
 *
 * @param array $matches matches
 *
 * @return string
 */
protected function strInspectCallback($matches)
{
    $showHex = false;
    if ($matches[1] !== '') {
        // single byte sequence (may contain control char)
        $str = $matches[1];
        if (ord($str) < 32 || ord($str) == 127) {
            $showHex = true;
            if (in_array($str, array("\t","\n","\r"))) {
                $showHex = false;
            }
        }
    } elseif ($matches[2] !== '') {
        // Valid multi-byte run. Returned unmodified unless it is exactly one of the sequences below.
        $str = $matches[2];
        $sequences = array(
            "\xef\xbb\xbf", // BOM
            "\xc2\xa0",     // no-break space
            // "\xE2\x80\x89", // thin space
            // "\xE2\x80\xAF", // narrow no-break space
            "\xEF\xBF\xBD",     // "Replacement Character"
        );
        foreach ($sequences as $seq) {
            if ($str === $seq) {
                $showHex = true;
                break;
            }
        }
    } elseif ($matches[3] !== '' || $matches[4] !== '') {
        // Invalid byte
        $str = $matches[3] != ''
            ? $matches[3]
            : $matches[4];
        $showHex = true;
    } else {
        // null char
        $str = $matches[5];
        $showHex = true;
    }
    if ($showHex) {
        $chars = str_split($str);
        foreach ($chars as $i => $c) {
            $chars[$i] = '\x'.bin2hex($c);
        }
        $str = implode('', $chars);
    }
    return $str;
}
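
To try the control-character branch of the callback above in isolation, here is a minimal standalone sketch. The function name hexEscapeControl is hypothetical and not part of the original debug class. It only handles ASCII control bytes (keeping tab, LF and CR) and does not attempt the invalid-byte detection that the full regex performs.

```php
<?php
// Hypothetical standalone sketch: show ASCII control bytes as \xNN.
function hexEscapeControl(string $s): string
{
    // The single-quoted pattern lets PCRE interpret \x00 itself,
    // so no raw NUL byte ends up inside the regex string.
    return preg_replace_callback(
        '/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/',
        function (array $m): string {
            return '\x' . bin2hex($m[0]);
        },
        $s
    );
}

echo hexEscapeControl("a\x00b\x1Bc"), "\n"; // a\x00b\x1bc
```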

I tried that, but it does not work for ASCII characters. I am basically looking for something that converts, for example, H to 0048, è to 00E8, and so on.
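
One way to get the output asked for here, including for ASCII characters, is to walk the string one character at a time and format each code point as hex. This is a minimal sketch, not taken from any of the answers: the function name utf8ToCodePoints is hypothetical, and it assumes valid UTF-8 input, the mbstring extension, and PHP 7.4 or later (for mb_str_split).

```php
<?php
// Hypothetical helper: return the code point of each character as
// uppercase hex, zero-padded to at least four digits.
function utf8ToCodePoints(string $s): array
{
    $out = [];
    foreach (mb_str_split($s, 1, 'UTF-8') as $ch) {
        $out[] = sprintf('%04X', mb_ord($ch, 'UTF-8'));
    }
    return $out;
}

echo implode(' ', utf8ToCodePoints("H\u{E8}")), "\n"; // 0048 00E8
```

Code points above U+FFFF simply come out with more than four digits (for example 1F600) rather than as surrogate pairs.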