PHP: How to correctly count the leading spaces in a multibyte/UTF-8 string


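I have UTF-8 strings like the following:

21st century

Other languages

General collections

Ancient languages

Medieval languages

Several authors (Two or more languages)

As you can see, the strings contain alphanumeric characters along with leading and trailing whitespace.

I want to use PHP to get the number of leading whitespace characters (not trailing ones) in each string. Note that these may not be standard ASCII spaces. I tried: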

var_dump(mb_ord($space_char, "UTF-8"));

where $space_char contains a sample whitespace character copied from one of the strings above, and I get 160 rather than 32.
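Code point 160 is U+00A0, the no-break space, which UTF-8 encodes as the two bytes 0xC2 0xA0. The following is a minimal sketch for checking which characters a string actually starts with. The helper name `leading_code_points` and the sample string are hypothetical, and the code assumes PHP 7.4 or later (for `mb_str_split` and arrow functions).

```php
<?php
// Hypothetical helper: return the code points of the first $n characters.
function leading_code_points(string $str, int $n): array {
    $chars = mb_str_split(mb_substr($str, 0, $n, 'UTF-8'), 1, 'UTF-8');
    return array_map(fn($c) => mb_ord($c, 'UTF-8'), $chars);
}

// Made-up sample: two no-break spaces, one ASCII space, then text.
$sample = "\u{00A0}\u{00A0} 21st century";

print_r(leading_code_points($sample, 4)); // 160, 160, 32, 50
echo bin2hex("\u{00A0}"), "\n";           // c2a0 (the UTF-8 bytes of U+00A0)
```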

I also tried:

strspn($string,$cmask); // $cmask contains a string with two space characters with 160 and 32 as their Unicode code points.

but I get a very unpredictable value.
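A likely explanation, assuming the source file and the strings are both UTF-8: `strspn()` works on bytes, not characters. It treats the mask as a set of single bytes and returns a byte count, so each no-break space contributes 2 to the result. The sketch below uses a made-up sample string to show the difference.

```php
<?php
$nbsp = "\u{00A0}";                    // U+00A0, bytes 0xC2 0xA0 in UTF-8
$mask = $nbsp . ' ';                   // mask as described: code points 160 and 32
$str  = str_repeat($nbsp, 3) . ' abc'; // 3 no-break spaces + 1 ASCII space = 4 characters

// strspn() sees the mask as the byte set {0xC2, 0xA0, 0x20} and counts bytes.
$byteCount = strspn($str, $mask);      // 3 * 2 + 1 = 7

// Counting the characters in that byte prefix gives the intended number here.
$charCount = mb_strlen(substr($str, 0, $byteCount), 'UTF-8'); // 4

printf("bytes: %d, characters: %d\n", $byteCount, $charCount);
```

This is meant as a diagnostic rather than a fix: a byte mask can also match the first byte of an unrelated character (anything whose UTF-8 encoding begins with 0xC2), so the prefix may end in the middle of a character.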

The values should be:

(1) 12
(2) 6
(3) 9
(4) 9
(5) 9
(6) 12

Could someone tell me what I am doing wrong?


I would go with a regular expression:

<?php
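function count_leading_spaces($str) {
    // \p{Zs} is the Unicode "space separator" category: it covers the
    // ASCII space (U+0020), the no-break space (U+00A0) and similar
    // characters, but not tabs or newlines.
    if (mb_ereg('^\p{Zs}+', $str, $regs) === false) {
        return 0; // no leading space separators
    }
    // Count characters, not bytes, in the matched prefix.
    return mb_strlen($regs[0]);
}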

$samples = [
'            21st century ',
'      Other languages ',
'         General collections ',
'         Ancient languages ',
'         Medieval languages ',
'            Several authors (Two or more languages) ',
];

foreach ($samples as $i => $sample) {
    printf("(%d) %d\n", $i + 1, count_leading_spaces($sample));
}

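The question asks for the number of leading spaces, not trailing ones. Your code seems to work, but the variable and function names are confusing.

OK, I'm not a native English speaker, so please forgive me. :) Fixed. :)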

(1) 12
(2) 6
(3) 9
(4) 9
(5) 9
(6) 12
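The same idea can also be sketched with PCRE instead of `mb_ereg`. This is an alternative, not the code from the answer above. The function name `count_leading_spaces_pcre` and the sample strings are hypothetical, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical PCRE-based variant of the regular-expression approach.
function count_leading_spaces_pcre(string $str): int {
    // The /u modifier makes PCRE treat both pattern and subject as UTF-8.
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match('/^\p{Zs}+/u', $str, $m) !== 1) {
        return 0;
    }
    // Count characters, not bytes, in the matched prefix.
    return mb_strlen($m[0], 'UTF-8');
}

echo count_leading_spaces_pcre("\u{00A0}\u{00A0} 21st century "), "\n"; // 3
echo count_leading_spaces_pcre("no leading space"), "\n";               // 0
```

With `/u`, invalid UTF-8 input makes `preg_match()` return false, which this sketch reports as 0. A stricter version could check `preg_last_error()` instead.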