Turkish character problem in PHP and MySQL


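I am trying to count the occurrences of every letter of the Turkish alphabet in a MySQL database.

When I count the letter "a" like this, I get the correct result: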

while($nt=mysql_fetch_array($rt))
{
    $mystring = $nt["word"];

    for($i = 0; $i < strlen($mystring) ; $i++)
    {
        if($mystring[$i] == 'a')
        {
            $a++;
        }
    }
}

How can I fix the code so that it works for Turkish characters? Thanks.

In UTF-8, ç is encoded as two bytes (c3a7), so a byte-by-byte comparison does not work. Consider substr_count:

$s = "abçdeç";
print substr_count($s, 'ç'); // 2

Or use a Unicode-aware function like this:

function utf8_char_count($s) {
    $count = [];
    preg_match_all('~.~u', $s, $m);
    foreach($m[0] as $c)
        $count[$c] = isset($count[$c]) ? $count[$c] + 1 : 1;
    return $count;
}

print_r(utf8_char_count('çAüθç')); // [ç] => 2 [A] => 1 [ü] => 1 [θ] => 1

This assumes your strings are actually UTF-8. If that is not the case (hint: var_dump(rawurlencode($str))), check your database and connection settings (see the linked thread).
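The rawurlencode() hint can be made concrete with a small sketch. It is only an illustration under stated assumptions: the mbstring extension is available, and the two sample strings are hand-written byte sequences standing in for what a database might return (ç as UTF-8 versus ç as ISO-8859-1).

```php
<?php
// Sample byte sequences written out explicitly, so the result does not
// depend on the encoding of this source file. Both are illustrative.
$utf8   = "\xC3\xA7"; // ç encoded as UTF-8 (two bytes)
$latin1 = "\xE7";     // ç encoded as ISO-8859-1 (one byte)

// rawurlencode() percent-encodes every non-ASCII byte, which exposes
// the raw bytes that actually arrived from the database.
$utf8Encoded   = rawurlencode($utf8);   // "%C3%A7"
$latin1Encoded = rawurlencode($latin1); // "%E7"

// mb_check_encoding() reports whether the bytes form valid UTF-8.
$utf8Valid   = mb_check_encoding($utf8, 'UTF-8');   // true
$latin1Valid = mb_check_encoding($latin1, 'UTF-8'); // false

var_dump($utf8Encoded, $latin1Encoded, $utf8Valid, $latin1Valid);
```

If a stored ç comes back as %E7 rather than %C3%A7, the data is not arriving as UTF-8, and the connection character set is a likely place to look (with mysqli, that is normally set through set_charset()).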

Possible duplicate. What are you using in your database?

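strlen() works on bytes, not characters, and ç takes more than one byte. Use mb_strlen() instead. Likewise, you cannot step through the string with [], because it addresses individual bytes.

"ç needs more than 1 byte in UTF-8" is wrong. There is U+00E7, which is clearly a single-byte symbol. A combining cedilla would use multiple bytes, though.

@mudasobwa: no, U+00E7 is a code point, but encoded in UTF-8 it is two bytes: c3a7.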
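As a rough sketch of the mb_strlen() suggestion, the loop from the question could be rewritten to walk characters instead of bytes. The helper name count_letter and the sample words are hypothetical, the input is assumed to be valid UTF-8, and the mbstring extension is assumed to be installed.

```php
<?php
// ç as UTF-8: two bytes, but one character.
$byteLength = strlen("\xC3\xA7");             // 2
$charLength = mb_strlen("\xC3\xA7", 'UTF-8'); // 1

/**
 * Count how often $letter occurs across $words, character by character.
 * Hypothetical helper; assumes UTF-8 input.
 */
function count_letter(array $words, string $letter): int
{
    $total = 0;
    foreach ($words as $word) {
        $length = mb_strlen($word, 'UTF-8');
        for ($i = 0; $i < $length; $i++) {
            // mb_substr() returns one whole character, not one byte.
            if (mb_substr($word, $i, 1, 'UTF-8') === $letter) {
                $total++;
            }
        }
    }
    return $total;
}

// Sample data standing in for the "word" column: "çiçek" and "ağaç",
// written as explicit UTF-8 bytes.
$words  = ["\xC3\xA7i\xC3\xA7ek", "a\xC4\x9Fa\xC3\xA7"];
$cCount = count_letter($words, "\xC3\xA7"); // occurrences of ç
$aCount = count_letter($words, 'a');        // occurrences of a

var_dump($byteLength, $charLength, $cCount, $aCount);
```

Calling mb_substr() once per character is slow on long strings, so for large tables the preg_match_all approach from the other answer is probably the better fit. This version is mainly useful because it stays close to the original loop.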
