在PHP中,在排序数组中插入元素的更好方法是什么
我最近把我的简历发给了一家招聘PHP开发人员的公司。如果我有足够的经验,他们会送我一个任务去解决,去测量 任务是这样的:在PHP中,在排序数组中插入元素的更好方法是什么,php,arrays,performance,insert,sorted,Php,Arrays,Performance,Insert,Sorted,我最近把我的简历发给了一家招聘PHP开发人员的公司。如果我有足够的经验,他们会送我一个任务去解决,去测量 任务是这样的: ================== Functions time summary ====================== insert1() => 0.5983521938 insert2() => 0.2605950832 insert3() => 0.3288729191
================== Functions time summary ======================
insert1() => 0.5983521938
insert2() => 0.2605950832
insert3() => 0.3288729191
insert4() => 0.3288729191
SplMaxHeap::insert() => 0.0000801086
您有一个包含10k个唯一元素的数组,已排序的子体。编写生成此数组的函数,然后编写三个不同的函数,将新元素插入到数组中,插入后数组仍将按子代排序。编写一些代码来测量这些函数的速度。不能使用PHP排序函数
所以我编写了生成数组的函数和向数组插入新元素的四个函数
/********** Generating array (because use of range() was to simple :)): *************/
function generateSortedArray($start = 300000, $elementsNum = 10000, $dev = 30){
$arr = array();
for($i = 1; $i <= $elementsNum; $i++){
$rand = mt_rand(1, $dev);
$start -= $rand;
$arr[] = $start;
}
return $arr;
}
/********************** Four insert functions: **************************/
// for loop, and array copying
function insert1(&$arr, $elem){
if(empty($arr)){
$arr[] = $elem;
return true;
}
$c = count($arr);
$lastIndex = $c - 1;
$tmp = array();
$inserted = false;
for($i = 0; $i < $c; $i++){
if(!$inserted && $arr[$i] <= $elem){
$tmp[] = $elem;
$inserted = true;
}
$tmp[] = $arr[$i];
if($lastIndex == $i && !$inserted) $tmp[] = $elem;
}
$arr = $tmp;
return true;
}
// new element inserted at the end of array
// and moved up until correct place
function insert2(&$arr, $elem){
$c = count($arr);
array_push($arr, $elem);
for($i = $c; $i > 0; $i--){
if($arr[$i - 1] >= $arr[$i]) break;
$tmp = $arr[$i - 1];
$arr[$i - 1] = $arr[$i];
$arr[$i] = $tmp;
}
return true;
}
// binary search for correct place + array_splice() to insert element
function insert3(&$arr, $elem){
$startIndex = 0;
$stopIndex = count($arr) - 1;
$middle = 0;
while($startIndex < $stopIndex){
$middle = ceil(($stopIndex + $startIndex) / 2);
if($elem > $arr[$middle]){
$stopIndex = $middle - 1;
}else if($elem <= $arr[$middle]){
$startIndex = $middle;
}
}
$offset = $elem >= $arr[$startIndex] ? $startIndex : $startIndex + 1;
array_splice($arr, $offset, 0, array($elem));
}
// for loop to find correct place + array_splice() to insert
function insert4(&$arr, $elem){
$c = count($arr);
$inserted = false;
for($i = 0; $i < $c; $i++){
if($elem >= $arr[$i]){
array_splice($arr, $i, 0, array($elem));
$inserted = true;
break;
}
}
if(!$inserted) $arr[] = $elem;
return true;
}
/*********************** Speed tests: *************************/
// check if array is sorted descending
function checkIfArrayCorrect($arr, $expectedCount = null){
$c = count($arr);
if(isset($expectedCount) && $c != $expectedCount) return false;
$correct = true;
for($i = 0; $i < $c - 1; $i++){
if(!isset($arr[$i + 1]) || $arr[$i] < $arr[$i + 1]){
$correct = false;
break;
}
}
return $correct;
}
// claculates microtimetime diff
function timeDiff($startTime){
$diff = microtime(true) - $startTime;
return $diff;
}
// prints formatted execution time info
function showTime($func, $time){
printf("Execution time of %s(): %01.7f s\n", $func, $time);
}
// generated elements num
$elementsNum = 10000;
// generate starting point
$start = 300000;
// generated elements random range 1 - $dev
$dev = 50;
echo "Generating array with descending order, $elementsNum elements, begining from $start\n";
$startTime = microtime(true);
$arr = generateSortedArray($start, $elementsNum, $dev);
showTime('generateSortedArray', timeDiff($startTime));
$step = 2;
echo "Generating second array using range range(), $elementsNum elements, begining from $start, step $step\n";
$startTime = microtime(true);
$arr2 = range($start, $start - $elementsNum * $step, $step);
showTime('range', timeDiff($startTime));
echo "Checking if array is correct\n";
$startTime = microtime(true);
$sorted = checkIfArrayCorrect($arr, $elementsNum);
showTime('checkIfArrayCorrect', timeDiff($startTime));
if(!$sorted) die("Array is not in descending order!\n");
echo "Array OK\n";
$toInsert = array();
// number of elements to insert from every range
$randElementNum = 20;
// some ranges of elements to insert near begining, middle and end of generated array
// start value => end value
$ranges = array(
300000 => 280000,
160000 => 140000,
30000 => 0,
);
foreach($ranges as $from => $to){
$values = array();
echo "Generating $randElementNum random elements from range [$from - $to] to insert\n";
while(count($values) < $randElementNum){
$values[mt_rand($from, $to)] = 1;
}
$toInsert = array_merge($toInsert, array_keys($values));
}
// some elements to insert on begining and end of array
array_push($toInsert, 310000);
array_push($toInsert, -1000);
echo "Generated elements: \n";
for($i = 0; $i < count($toInsert); $i++){
if($i > 0 && $i % 5 == 0) echo "\n";
printf("%8d, ", $toInsert[$i]);
if($i == count($toInsert) - 1) echo "\n";
}
// functions to test
$toTest = array('insert1' => null, 'insert2' => null, 'insert3' => null, 'insert4' => null);
foreach($toTest as $func => &$time){
echo "\n\n================== Testing speed of $func() ======================\n\n";
$tmpArr = $arr;
$startTime = microtime(true);
for($i = 0; $i < count($toInsert); $i++){
$func($tmpArr, $toInsert[$i]);
}
$time = timeDiff($startTime, 'checkIfArraySorted');
showTime($func, $time);
echo "Checking if after using $func() array is still correct: \n";
if(!checkIfArrayCorrect($tmpArr, count($arr) + count($toInsert))){
echo "Array INCORRECT!\n\n";
}else{
echo "Array OK!\n\n";
}
echo "Few elements from begining of array:\n";
print_r(array_slice($tmpArr, 0, 5));
echo "Few elements from end of array:\n";
print_r(array_slice($tmpArr, -5));
//echo "\n================== Finished testing $func() ======================\n\n";
}
echo "\n\n================== Functions time summary ======================\n\n";
print_r($toTest);
也许只是我,但也许他们也在寻求可读性和可维护性 我的意思是,您正在命名变量
$arr
,$c
和$middle
,甚至都不需要放置适当的文档
例如:
/**
* generateSortedArray() Function to generate a descending sorted array
*
* @param int $start Beginning with this number
* @param int $elementsNum Number of elements in array
* @param int $dev Maximum difference between elements
* @return array Sorted descending array.
*/
function generateSortedArray($start = 300000, $elementsNum = 10000, $dev = 30) {
$arr = array(); #Variable definition
for ($i = 1; $i <= $elementsNum; $i++) {
$rand = mt_rand(1, $dev); #Generate a random number
$start -= $rand; #Substract from initial value
$arr[] = $start; #Push to array
}
return $arr;
}
/**
*函数生成降序排序数组
*
*@param int$以此数字开头
*@param int$elementsNum数组中的元素数
*@param int$dev元素之间的最大差异
*@return数组按降序数组排序。
*/
函数generateSortedArray($start=300000,$elementsUM=10000,$dev=30){
$arr=array();#变量定义
对于($i=1;$i),您可以使用二进制搜索查找插入新元素的位置。之后,只需使用for
将元素与一个临时变量(将存储当前项)进行移位。在最坏情况下为O(N),内存最小(1个数字变量)overhead@zerkms我已经在insert3()中使用了二进制搜索
。由于阵列拼接()
似乎没有得到太多优化。这可能是最有效的方法。我想你失去了你的分数是因为在php7中有ds
数据结构扩展。下面是一篇文章如何使用它和一些基准:为什么二进制搜索插入方法不如其他方法快?!?!(除了SplMaxHeap)代码在我看来没有那么复杂。有些代码用于带有数组拼接()
、二进制搜索等的循环。$arr
我认为是非常明显的:)。我在给他们的电子邮件中写道,如果他们需要更好的文档,我可以在我的代码中添加更多的注释,并向他们发送新版本:)但也许我应该在一开始就这样做……我甚至会说该函数太混乱,无法仅生成数据样本。range()
就足够了。@zerkms我也向他们发送了信息(在对该函数的评论中),我可以简单地使用range()(它比range()快2个TIM),但值不会是随机的(而且也不有趣:)。现在当我想到它时,它们是否随机并不重要:)如果随机性不是一个问题,最简单的解决方案是range()
然后是array\u reverse()
。(编辑,因为随机性是不相关的。)@zerkms我同意最简单的解决方案是最好的。如果他们不要求我的函数生成数组,我会使用range()
。可能是我在某种程度上误解了他们。