Replace excess whitespace and line breaks with PHP?


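I read the PHP documentation and followed the preg_replace tutorial, but this code:

$string = "My    text       has so    much   whitespace    




Plenty of    spaces  and            tabs";

echo preg_replace("/\s\s+/", " ", $string);

produces:

My text has so much whitespace Plenty of spaces and tabs

How can I turn it into:

My text has so much whitespace
Plenty of spaces and tabs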
Try:

$string = "My    text       has so    much   whitespace    




Plenty of    spaces  and            tabs";
// Remove duplicate newlines
$string = preg_replace("/[\n]+/", "\n", $string);
// Preserve newlines while replacing runs of other whitespace with a single space
echo preg_replace("/[ \t]+/", " ", $string);

Alternative approach:

echo preg_replace_callback("/\s+/", function ($match) {
    $result = array();
    $prev = null;
    foreach (str_split($match[0], 1) as $char) {
        if ($prev === null || $char != $prev) {
            $result[] = $char;
        }

        $prev = $char;
    }

    return implode('', $result);
}, $string);

Output:

My text has so much whitespace
Plenty of spaces and tabs

Edit: re-added this because it is a different approach. It is probably not what was asked for, but at least it does not merge groups of different whitespace characters (for example, space, tab, tab, space, nl, nl, space, space becomes space, tab, space, nl, space).
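The difference described in the edit note can be seen by running both approaches on the same input. This is only a sketch: the helper name dedupeWhitespaceRuns is made up for illustration, and it swaps the character loop for a backreference pattern, which should collapse repeats of the same character in the same way.

```php
<?php
// Hypothetical helper name, not part of the original answer.
function dedupeWhitespaceRuns(string $text): string
{
    return preg_replace_callback('/\s+/', function (array $match): string {
        // Inside one whitespace run, collapse only repeats of the SAME character.
        // The "s" flag lets "." match newlines as well.
        return preg_replace('/(.)\1+/s', '$1', $match[0]);
    }, $text);
}

// space, tab, tab, space, nl, nl, space, space
$input = "a \t\t \n\n  b";

// Keeps one character per group: space, tab, space, nl, space
var_dump(dedupeWhitespaceRuns($input) === "a \t \n b"); // bool(true)

// A plain \s+ replacement merges the whole run into a single space
var_dump(preg_replace('/\s+/', ' ', $input) === "a b"); // bool(true)
```

Whether keeping the mixed groups is desirable depends on the input; for the question as asked, the two-step replacement is probably the simpler choice.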

First, I'd like to point out that a newline can be \r, \n, or \r\n, depending on the operating system.

My solution:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/[\r\n]+/', "\n", $string));

If necessary, this can be split into two lines:

$string = preg_replace('/[\r\n]+/', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

Update

A better solution is:

$string = preg_replace('/\s*$^\s*/m', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

I changed the regular expression that collapses multiple line breaks into one. It uses the "m" modifier (which makes ^ and $ match at the start and end of each line) and removes any \s characters (space, tab, new line, line break) that sit at the end of one line and the beginning of the next. This addresses the problem of blank lines that contain nothing but spaces. In my previous example, a line filled with spaces would leave an extra line break behind.
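To compare the two versions side by side, here is a small sketch. The wrapper names collapseV1 and collapseV2 are made up for illustration, and the input is a simplified stand-in for the question's string: trailing spaces followed by several empty lines.

```php
<?php
// Hypothetical wrapper around the first version: collapse \r and \n runs, then spaces and tabs.
function collapseV1(string $s): string
{
    return preg_replace('/[ \t]+/', ' ', preg_replace('/[\r\n]+/', "\n", $s));
}

// Hypothetical wrapper around the updated version using the "m" modifier.
function collapseV2(string $s): string
{
    return preg_replace('/[ \t]+/', ' ', preg_replace('/\s*$^\s*/m', "\n", $s));
}

$input = "a  \n\n\nb   c";

var_dump(collapseV1($input)); // "a \nb c": the trailing space before the newline survives
var_dump(collapseV2($input)); // "a\nb c": whitespace around the empty lines is removed too
```

In this input the only visible difference is the trailing space at the end of the first line; other inputs may behave differently, so it is worth testing with real data.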
Had the same problem when passing echoed data from PHP to JavaScript (formatted as JSON). The string was peppered with useless \r\n and \t characters that are neither required nor displayed on the page.

The solution I ended up using is another way of echoing. That saves a lot of server resources compared to preg_replace (as suggested by other people here).

Here are the before and after in comparison:

Before:

echo '
<div>

    Example
    Example

</div>
';

Output:

<div>\r\n\r\n\tExample\r\n\tExample\r\n\r\n</div>

After:

echo 
'<div>',

    'Example',
    'Example',

'</div>';

Output:

<div>ExampleExample</div>

(Yes, you can join echo arguments not only with dots but also with commas.)
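The comma form of echo can be checked against ordinary concatenation by capturing the output. This is a minimal sketch using output buffering; the variable names are made up for illustration.

```php
<?php
// Capture what the comma-separated echo prints.
ob_start();
echo
'<div>',

    'Example',
    'Example',

'</div>';
$withCommas = ob_get_clean();

// Capture the same pieces joined with the dot operator.
ob_start();
echo '<div>' . 'Example' . 'Example' . '</div>';
$withDots = ob_get_clean();

// Whitespace between the arguments is source formatting only and never reaches the output.
var_dump($withCommas === $withDots);                    // bool(true)
var_dump($withCommas === '<div>ExampleExample</div>');  // bool(true)
```

This only avoids whitespace that comes from the source layout; whitespace inside the string literals themselves is still printed as written.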
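Why are you doing this? HTML displays only a single space even if you use several.

For example:

test content 1       2 3 4     5

The output will be:

test content 1 2 3 4 5

If you need more than one space in HTML, you have to use &nbsp;.

Try it :)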
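If the spacing does need to survive in HTML, one possible approach is to turn runs of spaces into a mix of regular and non-breaking spaces before output. The helper name preserveSpaces is hypothetical, and this is a sketch rather than a complete solution (tabs and newlines are left alone).

```php
<?php
// Hypothetical helper: keeps runs of spaces visible when the text is rendered as HTML.
function preserveSpaces(string $text): string
{
    // Escape first so the text is safe to place inside markup.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Each pair of spaces becomes "space + &nbsp;", so the browser cannot collapse the pair.
    return str_replace('  ', ' &nbsp;', $escaped);
}

echo preserveSpaces("test content 1   2"), "\n"; // test content 1 &nbsp; 2
```

Single spaces are left untouched, so normal line wrapping in the browser should still work.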

This will completely minify an entire string (such as a large blog post) while preserving all HTML tags:

$text = preg_replace("/[\r\n]+/", "\n", $text);
$text = preg_replace("/\s+/", ' ', $text);
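As a rough illustration of the two calls above, here they are wrapped in a function and applied to a small HTML fragment. The function name minifyWhitespace and the sample markup are made up for this sketch.

```php
<?php
// Hypothetical wrapper around the two replacements shown above.
function minifyWhitespace(string $text): string
{
    // Collapse runs of \r and \n into a single \n ...
    $text = preg_replace("/[\r\n]+/", "\n", $text);

    // ... then collapse every whitespace run (including that \n) into one space.
    return preg_replace("/\s+/", ' ', $text);
}

$html = "<p>Hello</p>\r\n\r\n   <p>big   world</p>";

echo minifyWhitespace($html), "\n"; // <p>Hello</p> <p>big world</p>
```

Since \s also matches \r and \n, the second call on its own gives the same result for this input. Note that whitespace inside elements such as pre is collapsed as well, which may or may not be wanted.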

Edited the right answer. From PHP 5.2.4 or later, the following code can be used:

echo preg_replace('/\v(?:[\v\h]+)/', '', $string);

// Newline and tab to single space
$from_mysql = str_replace(array("\r\n", "\r", "\n", "\t"), ' ', $from_mysql);

// Multiple spaces to single space (using a regular expression)
// {2,} matches a run of two or more spaces
$from_mysql = preg_replace('/ {2,}/', ' ', $from_mysql);

Not sure if this will be useful, nor am I absolutely positive it works as it should, but it seems to be working for me.

A function that cleans up multiple spaces and anything else you do or do not want, and produces either a single-line or a multi-line string (depending on the arguments/options passed). It can also remove or keep characters from other languages, and convert newlines and tabs to spaces.
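/** ¯\_(ツ)_/¯ Hope it's useful to someone. **/
// If $multiLine is null this removes spaces too. <options>'[:emoji:]' with $l = true allows only known emoji.
// <options>'[:print:]' with $l = true allows all utf8 printable chars (including emoji).
// **** TODO: If a unicode emoji or language char is used in $options while $l = false; we get an odd � symbol replacement for any non-matching char. $options char seems to get through, regardless of $l = false ? (bug (?)interesting)
function alphaNumericMagic($value, $options = '', $l = false, $multiLine = false, $tabSpaces = "    ") {
    $utf8Emojis = '';
    $patterns = [];
    $replacements = [];
    if ($l && preg_match("~(\[\:emoji\:\])~", $options)) {
        $utf8Emojis = [
            '\x{1F600}-\x{1F64F}', /* Emoticons */
            '\x{1F9D0}-\x{1F9E6}',
            '\x{1F300}-\x{1F5FF}', /* Misc Characters */ // \x{1F9D0}-\x{1F9E6}
            '\x{1F680}-\x{1F6FF}', /* Transport and Map */
            '\x{1F1E0}-\x{1F1FF}' /* Flags (iOS) */
        ];
        $utf8Emojis = implode('', $utf8Emojis);
    }
    $options = str_replace("[:emoji:]", $utf8Emojis, $options);
    if (!preg_match("~(\[\:graph\:\]|\[\:print\:\]|\[\:punct\:\]|\\\-)~", $options)) {
        $value = str_replace("-", ' ', $value);
    }
    if ($l) {
        $l = 'u';
        $options = $options . '\p{L}\p{N}\p{Pd}';
    } else { $l = ''; }
    if (preg_match("~(\[\:print\:\])~", $options)) {
        $patterns[] = "/[ ]+/m";
        $replacements[] = " ";
    }
    if ($multiLine) {
        $patterns[] = "/(?<!^)(?:[^\r\na-z0-9][\t]+)/m";
        $patterns[] = "/[ ]+(?![a-z0-9$options])|[^a-z0-9$options\s]/im$l";
        $patterns[] = "/\t/m";
        $patterns[] = "/(?<!^)$tabSpaces/m";
        $replacements[] = " ";
        $replacements[] = "";
        $replacements[] = $tabSpaces;
        $replacements[] = " ";
    } else if ($multiLine === null) {
        $patterns[] = "/[\r\n\t]+/m";
        $patterns[] = "/[^a-z0-9$options]/im$l";
        $replacements = "";
    } else {
        $patterns[] = "/[\r\n\t]+/m";
        $patterns[] = "/[ ]+(?![a-z0-9$options\t])|[^a-z0-9$options ]/im$l";
        $replacements[] = " ";
        $replacements[] = "";
    }
    echo "\n";
    print_r($patterns);
    echo "\n";
    echo $l;
    echo "\n";
    return preg_replace($patterns, $replacements, $value);
}
header('Content-Type: text/html; charset=utf-8', true);
$string = "fjl!sj\nfl _  sfjs-lkjf\r\n\tskj 婦女與環境健康 fsl \tklkj\thl jhj ⚧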



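What happens if the whitespace comes in an order like "\n\n\t\t\t"? Won't the newlines be replaced by a tab here?
@Tudor Constantin - I was about to write the same thing.
@Tudor Constantin @Francois Deschenes Mhm, you're probably right. It is still too early in the morning. I'll revise my answer.
This is overcomplicated. It makes an iterated callback call for every instance of one or more whitespace characters. I would not use this technique. Also, the default glue for implode is an empty string, so you can safely omit that parameter.
You forgot to replace tabs.
@Tudor Constantin - His example did not have any tabs (or at least, as far as I can tell, I had not included them), but I have updated my answer to include them. Thanks!
@Francois Deschenes I deleted my answer, since this one pretty much covers it. But let me ask: won't your second replace produce the same problem my code had? (Merging spaces and tabs into a single space)
@Yoshi - Yes, but it does it in two steps. First it takes care of the \r\n, then the spaces and tabs, rather than all at once. My latest example is somewhat similar to yours as I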