PHP: multibyte-safe counting of distinct characters in a string

php string utf-8

I'd like to find a clever and efficient way to count how many distinct alphabetic characters a string contains. For example:

$str = "APPLE";
echo char_count($str); // should return 4, because APPLE has 4 different chars 'A', 'P', 'L' and 'E'

$str = "BOB AND BOB"; // should return 5 ('B', 'O', 'A', 'N', 'D'). 

$str = 'PLÁTANO'; // should return 7 ('P', 'L', 'Á', 'T', 'A', 'N', 'O')

It should support UTF-8 strings.

Just use count_chars().

The array returned by count_chars() will also tell you how many times each character occurs in the string.
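As a rough sketch of that suggestion: the helper name `distinct_byte_count` and its `$ignoreSpaces` parameter are made up for illustration. It relies on mode 1 of `count_chars()`, which lists only the byte values that actually occur. Note that this counts bytes, so it is only correct for single-byte (ASCII) input.

```php
<?php
// Hypothetical helper: counts distinct BYTES, not distinct characters.
function distinct_byte_count(string $str, bool $ignoreSpaces = true): int
{
    // Mode 1 returns only the byte values that occur, keyed by byte value (0-255).
    $counts = count_chars($str, 1);
    if ($ignoreSpaces) {
        unset($counts[ord(' ')]);
    }
    return count($counts);
}

echo distinct_byte_count('APPLE'), "\n";       // 4
echo distinct_byte_count('BOB AND BOB'), "\n"; // 5
```

For a UTF-8 string such as 'PLÁTANO' this would report 8 rather than 7, because 'Á' is stored as two bytes.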

Here is a function that does this using the magic of associative arrays. It works in linear time, O(n).

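count_chars returns a map of all ASCII characters, telling you how many times each one occurs in the string. Here is a starting point for your own implementation: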

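function countchars($str, $ignoreSpaces = false) {
  // Maps each byte of $str to its number of occurrences.
  // Byte-based (strlen and string offsets), so it is not UTF-8 safe.
  $map = array();
  $len = strlen($str);
  for ($i = 0; $i < $len; $i++) {
    // $str[$i] replaces $str{$i}; the curly-brace offset syntax was removed in PHP 8.
    if (!isset($map[$str[$i]])) {
      $map[$str[$i]] = 1;
    } else {
      $map[$str[$i]]++;
    }
  }

  if ($ignoreSpaces) {
    unset($map[' ']);
  }

  return $map;
}

print_r(countchars('Hello World'));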

My take on it:

$chars = array_count_values(str_split($input));

This gives you an associative array with the unique letters as keys and their number of occurrences as values.

If you're not interested in the number of occurrences:

$chars = array_unique(str_split($input));
$numChars = count($chars);

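If you're dealing with UTF-8 (which you really should consider, IMHO), none of the posted solutions (using strlen, str_split, or count_chars) will work, because they all treat one byte as one character (which is not true for UTF-8).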
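To make the byte-versus-character difference concrete, here is a small sketch using the question's 'PLÁTANO' example. The variable names are illustrative, and the 'Á' is written as its two UTF-8 bytes so the result does not depend on how the source file is saved.

```php
<?php
// 'PLÁTANO' with Á spelled out as its UTF-8 byte sequence 0xC3 0x81.
$str = "PL\xC3\x81TANO";

// Byte-oriented functions see 8 units, because Á occupies two bytes.
$byteCount = strlen($str);
$distinctBytes = count(array_unique(str_split($str)));

// With the /u modifier, PCRE splits on whole UTF-8 characters instead.
$chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
$charCount = count($chars);
$distinctChars = count(array_unique($chars));

echo "$byteCount bytes, $distinctBytes distinct bytes\n";   // 8 bytes, 8 distinct bytes
echo "$charCount chars, $distinctChars distinct chars\n";   // 7 chars, 7 distinct chars
```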

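Consider converting it to an array of characters (possibly throwing out the spaces) and then running a "unique" over the array. Unless there are performance requirements (and profiling shows they are not being met), that is reasonably clever and efficient.

What character encoding will the input be in? UTF-8?

Yes, UTF-8 characters. Forgot to add that. I've edited my original post.

+1 However, it's worth noting that spaces (and punctuation?) should be removed first, per the OP's example.

@PrimozRome That is, it returns an ASCII map (256 entries) containing the count for each character. Try the one I suggested.

@PrimozRome: Sorry, I hadn't filtered the zero values out of the array; updated in the answer above.

@JuanMendes: I've fixed the answer above; it's probably best as a one-liner. Credit to you, though, +1.

@JuanMendes Right, I see, yes. Your function works, but not on UTF-8 strings. Still close to the result. Thanks.

Correct, I'm dealing with UTF-8! I've edited my original question.

This solution works, even with UTF-8 strings. Thank you, Rodney!

I just added another example that ignores all non-word characters (punctuation etc.), which you probably don't want to count either.
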

<?php

$treat_spaces_as_chars = true;
// contains hälöwrd and a space, being 8 distinct characters (7 without the space)
$string = "hällö wörld"; 
// remove spaces if we don't want to count them
if (!$treat_spaces_as_chars) {
  $string = preg_replace('/\s+/u', '', $string);
}
// split into characters (not bytes, like explode() or str_split() would)
$characters = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
// throw out the duplicates
$unique_characters = array_unique($characters);
// count what's left
$number_of_characters = count($unique_characters);

<?php

$ignore_non_word_characters = true;
// contains hälöwrd and PI (π), as this is treated as a word character (Greek letter)
$string = "h,ä*+l•π‘°’lö wörld"; 
// remove non-word characters (punctuation, symbols, whitespace) if we don't want to count them
if ($ignore_non_word_characters) {
  $string = preg_replace('/\W+/u', '', $string);
}
// split into characters (not bytes, like explode() or str_split() would)
$characters = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
// throw out the duplicates
$unique_characters = array_unique($characters);
// count what's left
$number_of_characters = count($unique_characters);

var_dump($characters, $unique_characters, $number_of_characters);
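
The same steps can be wrapped into a single function that matches the three examples from the question. This is only a sketch: the name `distinct_letter_count` is made up, and it assumes that "alphabetic" means any Unicode letter (`\p{L}`), that upper and lower case count as different characters, and that the input is valid UTF-8 in a consistent normalization form.

```php
<?php
// Hypothetical helper: number of distinct letters in a UTF-8 string.
function distinct_letter_count(string $str): int
{
    // Strip everything that is not a Unicode letter (spaces, digits, punctuation).
    $letters = preg_replace('/\P{L}+/u', '', $str);
    if ($letters === null) {
        // preg_replace() returns null on error, e.g. malformed UTF-8 input.
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    // Split into characters rather than bytes, then drop the duplicates.
    $chars = preg_split('//u', $letters, -1, PREG_SPLIT_NO_EMPTY);

    return count(array_unique($chars));
}

echo distinct_letter_count('APPLE'), "\n";            // 4
echo distinct_letter_count('BOB AND BOB'), "\n";      // 5
echo distinct_letter_count("PL\xC3\x81TANO"), "\n";   // 7 (PLÁTANO)
```

If case should be ignored, the string could be passed through `mb_strtolower()` first, which requires the mbstring extension.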