Performance of str_replace in PHP


Here are two ways to use str_replace to replace strings in a given phrase:

// Method 1
$phrase  = "You should eat fruits, vegetables, and fiber every day.";
$healthy = array("fruits", "vegetables", "fiber");
$yummy   = array("pizza", "beer", "ice cream");
$phrase = str_replace($healthy, $yummy, $phrase);

// Method 2
$phrase  = "You should eat fruits, vegetables, and fiber every day.";
$phrase = str_replace("fruits", "pizza", $phrase);
$phrase = str_replace("vegetables", "beer", $phrase);
$phrase = str_replace("fiber", "ice cream", $phrase);

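Which method is more efficient (in terms of execution time and resources used)?

Assume the real phrase is much longer (e.g. 50,000 characters) and there are many more pairs of words to replace.

My thinking is that Method 2 calls str_replace 3 times, which costs more function calls; on the other hand, Method 1 creates 2 arrays, and str_replace has to parse both arrays at run time.

I prefer Method 1 because it is cleaner and more organized. Method 1 also offers the chance to use pairs from another source, e.g. a bad-words table in a database. Method 2 would need a loop of some sort.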
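As a rough illustration of the "pairs from another source" point, the sketch below keeps the pairs in one associative array and feeds it to both methods. The `$pairs` array is a hypothetical stand-in for rows loaded from a database; it is not part of the original question.

```php
<?php
// Hypothetical word list; in practice this might be loaded from a database table.
$pairs = [
    "fruits"     => "pizza",
    "vegetables" => "beer",
    "fiber"      => "ice cream",
];
$phrase = "You should eat fruits, vegetables, and fiber every day.";

// Method 1: a single call with parallel search/replace arrays.
$method1 = str_replace(array_keys($pairs), array_values($pairs), $phrase);

// Method 2: one call per pair, which needs an explicit loop.
$method2 = $phrase;
foreach ($pairs as $search => $replace) {
    $method2 = str_replace($search, $replace, $method2);
}

echo $method1, PHP_EOL;            // You should eat pizza, beer, and ice cream every day.
var_dump($method1 === $method2);   // bool(true)
```

Both forms apply the pairs in the same order, so for this input they produce the same string; the choice between them is about readability and call overhead rather than results.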

<?php
$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){
    // Method 1
    $phrase  = "You should eat fruits, vegetables, and fiber every day.";
    $healthy = array("fruits", "vegetables", "fiber");
    $yummy   = array("pizza", "beer", "ice cream");
    $phrase = str_replace($healthy, $yummy, $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 1 in ($time seconds)\n<br />";



$time_start = microtime(true);
for($i=0;$i<=1000000;$i++){
    // Method2
    $phrase  = "You should eat fruits, vegetables, and fiber every day.";
    $phrase = str_replace("fruits", "pizza", $phrase);
    $phrase = str_replace("vegetables", "beer", $phrase);
    $phrase = str_replace("fiber", "ice cream", $phrase);

}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 2 in ($time seconds)\n";
?>

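Did Test 1 in (3.6321988105774 seconds)

Did Test 2 in (2.8234610557556 seconds)

Edit: On further testing, with the string repeated 50k times, fewer iterations, and the advice from ajreal, the difference is minuscule.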

<?php
$phrase  = str_repeat("You should eat fruits, vegetables, and fiber every day.",50000);
$healthy = array("fruits", "vegetables", "fiber");
$yummy   = array("pizza", "beer", "ice cream");

$time_start = microtime(true);
for($i=0;$i<=10;$i++){
    // Method 1
    $phrase = str_replace($healthy, $yummy, $phrase);
}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 1 in ($time seconds)\n<br />";



$time_start = microtime(true);
for($i=0;$i<=10;$i++){
    // Method2
    $phrase = str_replace("fruits", "pizza", $phrase);
    $phrase = str_replace("vegetables", "beer", $phrase);
    $phrase = str_replace("fiber", "ice cream", $phrase);

}
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Did Test 2 in ($time seconds)\n";
?>

Did Test 1 in (1.1450328826904 seconds)

Did Test 2 in (1.3119208812714 seconds)

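Even though it is old, this benchmark is incorrect.

Thanks to an anonymous user:

"This test is wrong, because when Test 3 starts, $phrase holds the result of Test 2, in which there is nothing left to replace.

When I add $phrase = "You should eat fruits, vegetables, and fiber every day."; before Test 3, the results are: Did Test 1 in (4.3436799049377 seconds), Did Test 2 in (5.7581660747528 seconds), Did Test 3 in (7.5069718360901 seconds)"

Did Test 1 in (3.5785729885101 seconds)

Did Test 2 in (3.8501658439636 seconds)

Did Test 3 in (0.138443946838 seconds)

@djot, there is an error in: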

<?php
     foreach ($healthy as $k => $v) {
        if (strpos($phrase, $healthy[$k]) === FALSE)  
             unset($healthy[$k], $yummy[$k]);
        }
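The answer does not spell out the error. One possible reading is that the snippet unsets entries from $healthy and $yummy themselves, so inside a benchmark loop the pairs are gone after the first pass. A minimal sketch that filters copies instead, assuming a hypothetical helper named `filter_present_pairs` (not in the original):

```php
<?php
// Hypothetical helper: return only the pairs whose search term occurs in $subject.
// PHP passes arrays by value, so the caller's arrays are left untouched.
function filter_present_pairs(array $search, array $replace, string $subject): array
{
    foreach ($search as $k => $needle) {
        if (strpos($subject, $needle) === false) {
            unset($search[$k], $replace[$k]);
        }
    }
    return [$search, $replace];
}

$healthy = ["fruits", "vegetables", "fiber"];
$yummy   = ["pizza", "beer", "ice cream"];
$phrase  = "You should eat fruits and fiber every day.";

[$a, $b] = filter_present_pairs($healthy, $yummy, $phrase);
$result  = $a ? str_replace($a, $b, $phrase) : $phrase;

echo $result, PHP_EOL;          // You should eat pizza and ice cream every day.
echo count($healthy), PHP_EOL;  // 3 (the original list is unchanged)
```

Whether the extra strpos pass pays off depends on how many search terms are usually absent, so it is worth measuring on real data.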

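Although it is not asked directly in the question, the OP does state:

Assume the real phrase is much longer (e.g. 50,000 characters) and there are many more pairs of words to replace.

In that case, if you don't need (or don't want) replacements within replacements, a preg_replace_callback solution can be more efficient, because the whole string is processed only once rather than once per replacement pair.

The general-purpose function str_replace_bulk (listed in full at the end of this page) was about 10x faster in my case, with a 1.5 MB string and ~20,000 replacement pairs. Because of the "regular expression is too large" error it has to split the replacements into chunks, so replacements within replacements may then occur indeterminately (in my particular case, however, that was not possible).

In my particular case I was able to optimize it further, to roughly a 100x performance gain, because my search strings all followed a particular pattern. (PHP version 7.1.11 on Windows 7, 32-bit.)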
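Neither is a good choice. If you have a long string and repeatedly need str_replace, why not save the result after str_replace?

If you create the arrays healthy and yummy again and again inside the loop, it is no wonder that it is slower than when you put them outside.

You spent 10 times longer asking this question than the difference it would make over 1 million loops. Your time is more valuable than these pointless optimizations ;)

@landons Not true. I am working against a strict key performance indicator (KPI) where every millisecond counts.

Why do this question and its answers have so many -1 scores?

Yes, but I sacrifice 0.9 of a second over 1 million iterations for the sake of better coding and scalability.

May I suggest that you put the array declarations outside the loop?

This performance difference is about what I would expect. Method 1 should be faster. I would also expect the difference to grow with the number of replacements. If you increase the number of items being replaced to 10 or 20, you will probably see something. Also, $phrase can be an array too if you need to apply the same replacements to multiple strings. That could also make a big difference (I expect).

Great! That is exactly what I was looking for! I tested the code and it gives O(kn) performance (k = string length, n = number of replacements) instead of the O(kn²) performance of the str_replace options.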
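The comment about $phrase being an array refers to a documented feature of str_replace: the subject may be an array of strings, and an optional fourth argument receives the total replacement count. A small sketch, with made-up sample phrases:

```php
<?php
$healthy = ["fruits", "vegetables", "fiber"];
$yummy   = ["pizza", "beer", "ice cream"];

// Made-up sample subjects; str_replace is applied to every element.
$phrases = [
    "You should eat fruits every day.",
    "Some vegetables and fiber are fine too.",
];

// $count receives the total number of replacements across all subjects.
$replaced = str_replace($healthy, $yummy, $phrases, $count);

print_r($replaced);
echo $count, PHP_EOL; // 3
```

This keeps the loop over subjects inside the built-in function; whether that is measurably faster than a PHP-level loop was not tested here.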
<?php 
 $time_start = microtime(true);

        $healthy = array("fruits", "vegetables", "fiber");
        $yummy   = array("pizza", "beer", "ice cream");

        for($i=0;$i<=1000000;$i++){
            // Method 1
            $phrase  = "You should eat fruits, vegetables, and fiber every day.";
            $phrase = str_replace($healthy, $yummy, $phrase);
        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 1 in ($time seconds)". PHP_EOL. PHP_EOL;



        $time_start = microtime(true);
        for($i=0;$i<=1000000;$i++){
            // Method2
            $phrase  = "You should eat fruits, vegetables, and fiber every day.";
            $phrase = str_replace("fruits", "pizza", $phrase);
            $phrase = str_replace("vegetables", "beer", $phrase);
            $phrase = str_replace("fiber", "ice cream", $phrase);

        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 2 in ($time seconds)" . PHP_EOL. PHP_EOL;




        $time_start = microtime(true);
        for($i=0;$i<=1000000;$i++){
            $a = $healthy;
            $b = $yummy;
                foreach ($healthy as $k => $v) {
                  if (strpos($phrase, $healthy[$k]) === FALSE)  
                  unset($a[$k], $b[$k]);
                }                                          
                if ($a) $new_str = str_replace($a, $b, $phrase);

        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 3 in ($time seconds)". PHP_EOL. PHP_EOL;



        $time_start = microtime(true);
        for($i=0;$i<=1000000;$i++){
            $ree = false;
            foreach ($healthy as $k) {
              if (strpos($phrase, $k) !== FALSE)  { //something to replace
                  $ree = true;
                  break;
              }
            }                                          
            if ($ree === true) {
                $new_str = str_replace($healthy, $yummy, $phrase);
            }
        }
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "Did Test 4 in ($time seconds)". PHP_EOL. PHP_EOL;
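In the listing above, Tests 3 and 4 read whatever $phrase holds after Test 2, which is the pitfall the anonymous user described. One way to avoid it is to hand every run a fresh copy of the subject. The sketch below assumes a hypothetical helper named `bench` and a smaller iteration count; it is not the original authors' harness.

```php
<?php
// Hypothetical helper: time $iterations calls of $fn, each on the untouched $subject.
function bench(callable $fn, string $subject, int $iterations): array
{
    $result = null;
    $start  = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $result = $fn($subject); // $subject itself is never overwritten
    }
    return [microtime(true) - $start, $result];
}

$phrase  = "You should eat fruits, vegetables, and fiber every day.";
$healthy = ["fruits", "vegetables", "fiber"];
$yummy   = ["pizza", "beer", "ice cream"];

// Method 1: one call with arrays declared outside the timed loop.
[$t1, $r1] = bench(function ($s) use ($healthy, $yummy) {
    return str_replace($healthy, $yummy, $s);
}, $phrase, 100000);

// Method 2: three separate calls.
[$t2, $r2] = bench(function ($s) {
    $s = str_replace("fruits", "pizza", $s);
    $s = str_replace("vegetables", "beer", $s);
    return str_replace("fiber", "ice cream", $s);
}, $phrase, 100000);

printf("Method 1: %.4f s\nMethod 2: %.4f s\n", $t1, $t2);
```

The closure call adds overhead to both methods alike, so the numbers are only useful for comparing the two against each other on the same machine.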

function str_replace_bulk($search, $replace, $subject, &$count = null) {
  // Assumes $search and $replace are equal sized arrays
  $lookup = array_combine($search, $replace);
  $result = preg_replace_callback(
    '/' .
      implode('|', array_map(
        function($s) {
          return preg_quote($s, '/');
        },
        $search
      )) .
    '/',
    function($matches) use($lookup) {
      return $lookup[$matches[0]];
    },
    $subject,
    -1,
    $count
  );
  if (
    $result !== null ||
    count($search) < 2 // avoid infinite recursion on error
  ) {
    return $result;
  }
  // With a large number of replacements (> ~2500?), 
  // PHP bails because the regular expression is too large.
  // Split the search and replacements in half and process each separately.
  // NOTE: replacements within replacements may now occur, indeterminately.
  $split = (int)(count($search) / 2);
  error_log("Splitting into 2 parts with ~$split replacements");
  $result = str_replace_bulk(
    array_slice($search, $split),
    array_slice($replace, $split),
    str_replace_bulk(
      array_slice($search, 0, $split),
      array_slice($replace, 0, $split),
      $subject,
      $count1
    ),
    $count2
  );
  $count = $count1 + $count2;
  return $result;
}
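To make "replacements within replacements" concrete, the following minimal sketch compares str_replace with a single-pass replacement built the same way as the function above. The sample strings are made up for illustration, and strtr with an array argument is included as a built-in single-pass alternative that the original answer does not mention.

```php
<?php
$search  = ['a', 'b'];
$replace = ['b', 'c'];
$subject = 'ab';

// str_replace applies the pairs one after another, so the 'b' produced
// by the first pair is replaced again by the second pair.
$sequential = str_replace($search, $replace, $subject);

// Single pass: each position of the subject is matched at most once.
$lookup  = array_combine($search, $replace);
$pattern = '/' . implode('|', array_map(function ($s) {
    return preg_quote($s, '/');
}, $search)) . '/';
$singlePass = preg_replace_callback($pattern, function ($m) use ($lookup) {
    return $lookup[$m[0]];
}, $subject);

// strtr with an array also never re-scans text it has already replaced.
$viaStrtr = strtr($subject, $lookup);

var_dump($sequential); // string(2) "cc"
var_dump($singlePass); // string(2) "bc"
var_dump($viaStrtr);   // string(2) "bc"
```

If the two behaviours give different results on your data, the single-pass variants are not drop-in replacements for str_replace, regardless of speed.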