Performance of str_replace in PHP
Here are two ways to use str_replace to replace strings in a given phrase:
// Method 1
$phrase = "You should eat fruits, vegetables, and fiber every day.";
$healthy = array("fruits", "vegetables", "fiber");
$yummy = array("pizza", "beer", "ice cream");
$phrase = str_replace($healthy, $yummy, $phrase);
// Method 2
$phrase = "You should eat fruits, vegetables, and fiber every day.";
$phrase = str_replace("fruits", "pizza", $phrase);
$phrase = str_replace("vegetables", "beer", $phrase);
$phrase = str_replace("fiber", "ice cream", $phrase);
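Which method is more efficient, in terms of execution time and resources used?
Assume the real phrase is much longer (e.g. 50,000 characters) and that there are many more pairs of words to replace.
My thinking is that Method 2 calls str_replace 3 times, which costs more function calls; on the other hand, Method 1 creates 2 arrays, and str_replace has to parse those 2 arrays at runtime.

I prefer Method 1 because it is cleaner and better organised. Method 1 also makes it possible to use pairs from another source, e.g. a bad-words table in a database. Method 2 would need an additional loop.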
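The point about taking pairs from another source can be sketched roughly as follows. The `$pairs` map is a hypothetical stand-in for rows loaded from a database table; only `str_replace`, `array_keys` and `array_values` are real PHP functions here.

```php
<?php
// Hypothetical pair list; in practice this might be loaded from a database table.
$pairs = [
    "fruits"     => "pizza",
    "vegetables" => "beer",
    "fiber"      => "ice cream",
];

$phrase = "You should eat fruits, vegetables, and fiber every day.";

// str_replace expects two parallel arrays, so split the map into keys and values.
$result = str_replace(array_keys($pairs), array_values($pairs), $phrase);

echo $result, PHP_EOL;
```

Keeping the pairs in one map means a new pair is a data change rather than another line of code, which is the organisational advantage Method 1 has over Method 2.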
<?php
$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){
// Method 1
$phrase = "You should eat fruits, vegetables, and fiber every day.";
$healthy = array("fruits", "vegetables", "fiber");
$yummy = array("pizza", "beer", "ice cream");
$phrase = str_replace($healthy, $yummy, $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 1 in ($time seconds)\n<br />";
$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){
// Method2
$phrase = "You should eat fruits, vegetables, and fiber every day.";
$phrase = str_replace("fruits", "pizza", $phrase);
$phrase = str_replace("vegetables", "beer", $phrase);
$phrase = str_replace("fiber", "ice cream", $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 2 in ($time seconds)\n";
?>
Did Test 1 in (3.6321988105774 seconds)
Did Test 2 in (2.8234610557556 seconds)
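Edit: on further testing, with the string repeated 50,000 times, fewer iterations, and the advice from ajreal applied, the difference is minuscule.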
<?php
$phrase = str_repeat("You should eat fruits, vegetables, and fiber every day.",50000);
$healthy = array("fruits", "vegetables", "fiber");
$yummy = array("pizza", "beer", "ice cream");
$time_start = microtime(true);
for($i=0;$i<=10;$i++){
// Method 1
$phrase = str_replace($healthy, $yummy, $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 1 in ($time seconds)\n<br />";
$time_start = microtime(true);
for($i=0;$i<=10;$i++){
// Method2
$phrase = str_replace("fruits", "pizza", $phrase);
$phrase = str_replace("vegetables", "beer", $phrase);
$phrase = str_replace("fiber", "ice cream", $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 2 in ($time seconds)\n";
?>
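Did Test 1 in (1.1450328826904 seconds)
Did Test 2 in (1.3119208812714 seconds)

Even if old, this benchmark is incorrect. Thanks to an anonymous user:
"This test is wrong, because when Test 3 starts, $phrase holds the result of Test 2, in which there is nothing left to replace. When I add $phrase = "You should eat fruits, vegetables, and fiber every day."; before Test 3, the results are:
Did Test 1 in (4.3436799049377 seconds)
Did Test 2 in (5.7581660747528 seconds)
Did Test 3 in (7.5069718360901 seconds)"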
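One way to avoid the carry-over problem described above is to hand every run an untouched copy of the input. The following is only a sketch: `time_replacements` is a hypothetical helper, not something from the answers, and it assumes PHP 7.3 or later for `hrtime`.

```php
<?php
// Hypothetical helper: runs $fn $iterations times, always on the original $input.
function time_replacements(callable $fn, string $input, int $iterations): float
{
    $start = hrtime(true);              // monotonic clock, in nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);                    // $input is passed by value and never modified
    }
    return (hrtime(true) - $start) / 1e9; // elapsed seconds
}

$input   = str_repeat("You should eat fruits, vegetables, and fiber every day.", 1000);
$healthy = ["fruits", "vegetables", "fiber"];
$yummy   = ["pizza", "beer", "ice cream"];

// Method 1: one call with arrays.
$method1 = function (string $s) use ($healthy, $yummy): string {
    return str_replace($healthy, $yummy, $s);
};

// Method 2: three separate calls.
$method2 = function (string $s): string {
    $s = str_replace("fruits", "pizza", $s);
    $s = str_replace("vegetables", "beer", $s);
    return str_replace("fiber", "ice cream", $s);
};

printf("Method 1: %.4f s\n", time_replacements($method1, $input, 100));
printf("Method 2: %.4f s\n", time_replacements($method2, $input, 100));
```

Because each closure receives the same original string, no test can run faster merely because an earlier one already consumed all the matches.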
Did Test 1 in (3.5785729885101 seconds)
Did Test 2 in (3.8501658439636 seconds)
Did Test 3 in (0.138443946838 seconds)

@djot, there is an error in:
<?php
foreach ($healthy as $k => $v) {
if (strpos($phrase, $healthy[$k]) === FALSE)
unset($healthy[$k], $yummy[$k]);
}
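Although it is not asked directly in the question, the OP does state:
"Assume the real phrase is much longer (e.g. 50,000 characters), and the words to replace have a lot more pairs."
In that case, if you don't need (or don't want) replacements within replacements, a preg_replace_callback solution may well be more efficient, because the whole string is processed only once rather than once per replacement pair.

Here is a generic function that, in my case with a 1.5 MB string and ~20,000 replacement pairs, was about 10 times faster. Because of "regular expression is too large" errors it has to split the replacements into chunks, so replacements within replacements may then occur indeterminately (in my particular case that was not possible, however).

In my particular case I was able to optimise it further, to roughly a 100-fold performance gain, because my search strings all followed a particular pattern. (PHP version 7.1.11 on Windows 7, 32-bit.)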
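function str_replace_bulk($search, $replace, $subject, &$count = null) {
    // Assumes $search and $replace are equal sized arrays
    $lookup = array_combine($search, $replace);
    $result = preg_replace_callback(
        '/' .
            implode('|', array_map(
                function($s) {
                    return preg_quote($s, '/');
                },
                $search
            )) .
        '/',
        function($matches) use($lookup) {
            return $lookup[$matches[0]];
        },
        $subject,
        -1,
        $count
    );
    if (
        $result !== null ||
        count($search) < 2 // avoid infinite recursion on error
    ) {
        return $result;
    }
    // With a large number of replacements (> ~2500?),
    // PHP bails because the regular expression is too large.
    // Split the search and replacements in half and process each separately.
    // NOTE: replacements within replacements may now occur, indeterminately.
    $split = (int)(count($search) / 2);
    error_log("Splitting into 2 parts with ~$split replacements");
    $result = str_replace_bulk(
        array_slice($search, $split),
        array_slice($replace, $split),
        str_replace_bulk(
            array_slice($search, 0, $split),
            array_slice($replace, 0, $split),
            $subject,
            $count1
        ),
        $count2
    );
    $count = $count1 + $count2;
    return $result;
}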
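To make "replacements within replacements" concrete, here is a tiny comparison. It deliberately uses a different technique from the function above: PHP's built-in `strtr` with an array argument, which also works in a single pass and does not touch text it has already substituted. The two-letter data is made up purely for illustration.

```php
<?php
$search  = ["a", "b"];
$replace = ["b", "c"];
$subject = "ab";

// str_replace applies the pairs one after another, so the "b" produced
// by the first pair is replaced again by the second pair.
$sequential = str_replace($search, $replace, $subject);

// strtr with an array scans the subject once and never rescans its own output.
$singlePass = strtr($subject, array_combine($search, $replace));

echo $sequential, PHP_EOL; // cc
echo $singlePass, PHP_EOL; // bc
```

Which result is "right" depends on the use case; the point is only that the two approaches are not interchangeable once a replacement value can itself contain a search term.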
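Comments:

Neither is a good choice. If you have a long string and need str_replace repeatedly, why not save the result after the str_replace?

If you create the arrays $healthy and $yummy again and again inside the loop, it is slower than if you put them outside.

You spent 10 times longer asking this question than the difference it would make over 1 million loops. Your time is more valuable than these pointless optimisations ;)

@landons Not true. I am working to a strict Key Performance Indicator (KPI), and every millisecond counts.

Why do this question and its answers have so many -1 votes?

Yes, but for the sake of better coding and scalability I would sacrifice 0.9 of a second over 1 million iterations.

May I suggest you put the array declarations outside the loop?

This performance difference is about what I would expect. Method 1 should be faster. I would also expect the difference to grow with the number of replacements; if you increase the number of items being replaced to 10 or 20, you might see something. Also, $phrase can itself be an array if you need to perform the same replacements on multiple strings. That could also make a big difference (I expect).

Great! This is exactly what I was looking for! I tested the code and it gives O(kn) performance (k = string length, n = number of replacements), as opposed to the O(kn²) performance of the str_replace option.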
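The comment about passing an array as $phrase can be sketched as follows. The `$phrases` list is invented for illustration; the fourth argument of `str_replace` receives the total number of replacements made.

```php
<?php
$healthy = ["fruits", "vegetables", "fiber"];
$yummy   = ["pizza", "beer", "ice cream"];

// Hypothetical list of subjects that all need the same replacements.
$phrases = [
    "Eat fruits every day.",
    "Eat vegetables and fiber.",
    "Drink water.",
];

// With an array subject, str_replace returns an array with the same keys.
$replaced = str_replace($healthy, $yummy, $phrases, $count);

print_r($replaced);
echo "Replacements made: $count", PHP_EOL; // 3
```

This replaces a userland loop over the strings with a single call, although whether that yields a measurable gain would have to be benchmarked for the data at hand.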
<?php
$time_start = microtime(true);
$healthy = array("fruits", "vegetables", "fiber");
$yummy = array("pizza", "beer", "ice cream");
for($i=0;$i<=1000000;$i++){
// Method 1
$phrase = "You should eat fruits, vegetables, and fiber every day.";
$phrase = str_replace($healthy, $yummy, $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 1 in ($time seconds)". PHP_EOL. PHP_EOL;
$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){
// Method2
$phrase = "You should eat fruits, vegetables, and fiber every day.";
$phrase = str_replace("fruits", "pizza", $phrase);
$phrase = str_replace("vegetables", "beer", $phrase);
$phrase = str_replace("fiber", "ice cream", $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 2 in ($time seconds)" . PHP_EOL. PHP_EOL;
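// Reset $phrase: after Test 2 it no longer contains any of the search terms,
// so Tests 3 and 4 would otherwise have nothing to find or replace.
$phrase = "You should eat fruits, vegetables, and fiber every day.";
$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){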
$a = $healthy;
$b = $yummy;
foreach ($healthy as $k => $v) {
if (strpos($phrase, $healthy[$k]) === FALSE)
unset($a[$k], $b[$k]);
}
if ($a) $new_str = str_replace($a, $b, $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 3 in ($time seconds)". PHP_EOL. PHP_EOL;
$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){
$ree = false;
foreach ($healthy as $k) {
if (strpos($phrase, $k) !== FALSE) { //something to replace
$ree = true;
break;
}
}
if ($ree === true) {
$new_str = str_replace($healthy, $yummy, $phrase);
}
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 4 in ($time seconds)". PHP_EOL. PHP_EOL;