在PHP中,在排序数组中插入元素的更好方法是什么

在PHP中,在排序数组中插入元素的更好方法是什么,php,arrays,performance,insert,sorted,Php,Arrays,Performance,Insert,Sorted,我最近把我的简历发给了一家招聘PHP开发人员的公司。如果我有足够的经验,他们会送我一个任务去解决,去测量 任务是这样的: ================== Functions time summary ====================== insert1() => 0.5983521938 insert2() => 0.2605950832 insert3() => 0.3288729191

我最近把我的简历发给了一家招聘PHP开发人员的公司。如果我有足够的经验,他们会送我一个任务去解决,去测量

任务是这样的:

================== Functions time summary  ======================

           insert1() => 0.5983521938
           insert2() => 0.2605950832
           insert3() => 0.3288729191
           insert4() => 0.3288729191
SplMaxHeap::insert() => 0.0000801086
您有一个包含10k个唯一元素的数组,已排序的子体。编写生成此数组的函数,然后编写三个不同的函数,将新元素插入到数组中,插入后数组仍将按子代排序。编写一些代码来测量这些函数的速度。不能使用PHP排序函数

所以我编写了生成数组的函数和向数组插入新元素的四个函数

/********** Generating array (because use of range() was to simple :)): *************/

function generateSortedArray($start = 300000, $elementsNum = 10000, $dev = 30){
    $arr = array();
    for($i = 1; $i <= $elementsNum; $i++){
        $rand = mt_rand(1, $dev);
        $start -= $rand;
        $arr[] = $start; 
    }
    return $arr;
}

/********************** Four insert functions: **************************/

// for loop, and array copying
function insert1(&$arr, $elem){
    if(empty($arr)){
        $arr[] = $elem;
        return true;
    }
    $c = count($arr);
    $lastIndex = $c - 1;
    $tmp = array();
    $inserted = false;
    for($i = 0; $i < $c; $i++){
        if(!$inserted && $arr[$i] <= $elem){
            $tmp[] = $elem;
            $inserted = true;
        }
        $tmp[] = $arr[$i];
        if($lastIndex == $i && !$inserted) $tmp[] = $elem;
    }
    $arr = $tmp;
    return true;
}

// new element inserted at the end of array 
// and moved up until correct place
function insert2(&$arr, $elem){
    $c = count($arr);
    array_push($arr, $elem);
    for($i = $c; $i > 0; $i--){
        if($arr[$i - 1] >= $arr[$i]) break;
        $tmp = $arr[$i - 1];
        $arr[$i - 1] = $arr[$i];
        $arr[$i] = $tmp;
    }
    return true;
}

// binary search for correct place + array_splice() to insert element
function insert3(&$arr, $elem){
    $startIndex = 0;
    $stopIndex = count($arr) - 1;
    $middle = 0;
    while($startIndex < $stopIndex){
        $middle = ceil(($stopIndex + $startIndex) / 2);
        if($elem > $arr[$middle]){
            $stopIndex = $middle - 1;
        }else if($elem <= $arr[$middle]){
            $startIndex = $middle;
        }
    }
    $offset = $elem >= $arr[$startIndex] ? $startIndex : $startIndex + 1; 
    array_splice($arr, $offset, 0, array($elem));
}

// for loop to find correct place + array_splice() to insert
function insert4(&$arr, $elem){
    $c = count($arr);
    $inserted = false;
    for($i = 0; $i < $c; $i++){
        if($elem >= $arr[$i]){
            array_splice($arr, $i, 0, array($elem));
            $inserted = true;
            break;
        }
    }
    if(!$inserted) $arr[] = $elem;
    return true;
}

/*********************** Speed tests: *************************/

// check if array is sorted descending
function checkIfArrayCorrect($arr, $expectedCount = null){
    $c = count($arr);
    if(isset($expectedCount) && $c != $expectedCount) return false;
    $correct = true;
    for($i = 0; $i < $c - 1; $i++){
        if(!isset($arr[$i + 1]) || $arr[$i] < $arr[$i + 1]){
            $correct = false;
            break;
        }
    }
    return $correct;
}
// claculates microtimetime diff
function timeDiff($startTime){
    $diff = microtime(true) - $startTime;
    return $diff;
}
// prints formatted execution time info
function showTime($func, $time){
    printf("Execution time of %s(): %01.7f s\n", $func, $time);
}
// generated elements num
$elementsNum = 10000;
// generate starting point
$start = 300000;
// generated elements random range    1 - $dev
$dev = 50;


echo "Generating array with descending order, $elementsNum elements, begining from $start\n";
$startTime = microtime(true);
$arr = generateSortedArray($start, $elementsNum, $dev);
showTime('generateSortedArray', timeDiff($startTime));

$step = 2;
echo "Generating second array using range range(), $elementsNum elements, begining from $start, step $step\n";
$startTime = microtime(true);
$arr2 = range($start, $start - $elementsNum * $step, $step);
showTime('range', timeDiff($startTime));

echo "Checking if array is correct\n";
$startTime = microtime(true);
$sorted = checkIfArrayCorrect($arr, $elementsNum);
showTime('checkIfArrayCorrect', timeDiff($startTime));

if(!$sorted) die("Array is not in descending order!\n");
echo "Array OK\n";

$toInsert = array();

// number of elements to insert from every range
$randElementNum = 20;

// some ranges of elements to insert near begining, middle and end of generated array
// start value => end value
$ranges = array(
    300000 => 280000,
    160000 => 140000,
    30000 => 0,
);
foreach($ranges as $from => $to){
    $values = array();
    echo "Generating $randElementNum random elements from range [$from - $to] to insert\n";
    while(count($values) < $randElementNum){
        $values[mt_rand($from, $to)] = 1;
    }
    $toInsert = array_merge($toInsert, array_keys($values));
}
// some elements to insert on begining and end of array
array_push($toInsert, 310000);
array_push($toInsert, -1000);

echo "Generated elements: \n";
for($i = 0; $i < count($toInsert); $i++){
    if($i > 0 && $i % 5 == 0) echo "\n";
    printf("%8d, ", $toInsert[$i]);
    if($i == count($toInsert) - 1) echo "\n";
}
// functions to test
$toTest = array('insert1' => null, 'insert2' => null, 'insert3' => null, 'insert4' => null);
foreach($toTest as $func => &$time){
    echo "\n\n================== Testing speed of $func() ======================\n\n";
    $tmpArr = $arr;
    $startTime = microtime(true);
    for($i = 0; $i < count($toInsert); $i++){
        $func($tmpArr, $toInsert[$i]);
    }
    $time = timeDiff($startTime, 'checkIfArraySorted');
    showTime($func, $time);
    echo "Checking if after using $func() array is still correct: \n";
    if(!checkIfArrayCorrect($tmpArr, count($arr) + count($toInsert))){
        echo "Array INCORRECT!\n\n";
    }else{
        echo "Array OK!\n\n";
    }
    echo "Few elements from begining of array:\n";
    print_r(array_slice($tmpArr, 0, 5));

    echo "Few elements from end of array:\n";
    print_r(array_slice($tmpArr, -5));

    //echo "\n================== Finished testing $func() ======================\n\n";
}
echo "\n\n================== Functions time summary    ======================\n\n";
print_r($toTest);

也许只是我,但也许他们也在寻求可读性和可维护性

我的意思是,您正在命名变量
$arr
$c
$middle
,甚至都不需要放置适当的文档

例如:

/**
 * generateSortedArray()    Function to generate a descending sorted array
 *
 * @param int $start        Beginning with this number
 * @param int $elementsNum  Number of elements in array
 * @param int $dev          Maximum difference between elements
 * @return array            Sorted descending array.
 */
function generateSortedArray($start = 300000, $elementsNum = 10000, $dev = 30) {
    $arr = array();                             #Variable definition
    for ($i = 1; $i <= $elementsNum; $i++) {    
        $rand = mt_rand(1, $dev);               #Generate a random number
        $start -= $rand;                        #Substract from initial value
        $arr[] = $start;                        #Push to array
    }
    return $arr;
}
/**
*函数生成降序排序数组
*
*@param int$以此数字开头
*@param int$elementsNum数组中的元素数
*@param int$dev元素之间的最大差异
*@return数组按降序数组排序。
*/
函数generateSortedArray($start=300000,$elementsUM=10000,$dev=30){
$arr=array();#变量定义

对于($i=1;$i),您可以使用二进制搜索查找插入新元素的位置。之后,只需使用
for
将元素与一个临时变量(将存储当前项)进行移位。在最坏情况下为O(N),内存最小(1个数字变量)overhead@zerkms我已经在
insert3()中使用了二进制搜索
。由于
阵列拼接()
似乎没有得到太多优化。这可能是最有效的方法。我想你失去了你的分数是因为在php7中有
ds
数据结构扩展。下面是一篇文章如何使用它和一些基准:为什么二进制搜索插入方法不如其他方法快?!?!(除了SplMaxHeap)代码在我看来没有那么复杂。有些代码用于带有数组拼接()、二进制搜索等的循环。$arr我认为是非常明显的:)。我在给他们的电子邮件中写道,如果他们需要更好的文档,我可以在我的代码中添加更多的注释,并向他们发送新版本:)但也许我应该在一开始就这样做……我甚至会说该函数太混乱,无法仅生成数据样本。
range()
就足够了。@zerkms我也向他们发送了信息(在对该函数的评论中),我可以简单地使用range()(它比range()快2个TIM),但值不会是随机的(而且也不有趣:)。现在当我想到它时,它们是否随机并不重要:)如果随机性不是一个问题,最简单的解决方案是
range()
然后是
array\u reverse()
。(编辑,因为随机性是不相关的。)@zerkms我同意最简单的解决方案是最好的。如果他们不要求我的函数生成数组,我会使用
range()
。可能是我在某种程度上误解了他们。