Php 查找人口最多的年份(最有效的解决方案)

Php 查找人口最多的年份(最有效的解决方案),php,arrays,algorithm,language-agnostic,Php,Arrays,Algorithm,Language Agnostic,给定两个数组$BARTHIS包含出生年份列表,表示某人出生的时间,$DEATIONS包含死亡年份列表,表示某人死亡的时间,我们如何找到人口最多的年份 例如,给定以下数组: $births = [1984, 1981, 1984, 1991, 1996]; $deaths = [1991, 1984]; 人口最多的年份应该是1996,因为3这一年的人口数量是所有年份中最高的 以下是关于这方面的运行数学: | Birth | Death | Population | |-------|------

给定两个数组
$BARTHIS
包含出生年份列表,表示某人出生的时间,
$DEATIONS
包含死亡年份列表,表示某人死亡的时间,我们如何找到人口最多的年份

例如,给定以下数组:

$births = [1984, 1981, 1984, 1991, 1996];
$deaths = [1991, 1984];
人口最多的年份应该是
1996
,因为
3
这一年的人口数量是所有年份中最高的

以下是关于这方面的运行数学:

| Birth | Death | Population | |-------|-------|------------| | 1981 | | 1 | | 1984 | | 2 | | 1984 | 1984 | 2 | | 1991 | 1991 | 2 | | 1996 | | 3 |
此外,包括解决方案的大O也会有所帮助


我在这方面的最佳尝试如下:

function highestPopulationYear(Array $births, Array $deaths): Int {

    sort($births);
    sort($deaths);

    $nextBirthYear = reset($births);
    $nextDeathYear = reset($deaths);

    $years = [];
    if ($nextBirthYear) {
        $years[] = $nextBirthYear;
    }
    if ($nextDeathYear) {
        $years[] = $nextDeathYear;
    }

    if ($years) {
        $currentYear = max(0, ...$years);
    } else {
        $currentYear = 0;
    }

    $maxYear = $maxPopulation = $currentPopulation = 0;

    while(current($births) !== false || current($deaths) !== false || $years) {

        while($currentYear === $nextBirthYear) {
            $currentPopulation++;
            $nextBirthYear = next($births);
        }

        while($currentYear === $nextDeathYear) {
            $currentPopulation--;
            $nextDeathYear = next($deaths);
        }

        if ($currentPopulation >= $maxPopulation) {
            $maxPopulation = $currentPopulation;
            $maxYear = $currentYear;
        }

        $years = [];

        if ($nextBirthYear) {
            $years[] = $nextBirthYear;
        }
        if ($nextDeathYear) {
            $years[] = $nextDeathYear;
        }
        if ($years) {
            $currentYear = min($years);
        } else {
            $currentYear = 0;
        }
    }

    return $maxYear;
}
上述算法应该在多项式时间内工作,因为最坏情况下
O(((n log n)*2)+k)
其中
n
是从每个数组中排序的元素数,
k
是出生年数(因为我们知道
k
总是
k>=y
),其中
y
是死亡年数。但是,我不确定是否有更有效的解决方案

我的兴趣纯粹在于改进现有算法的计算复杂性。内存的复杂性不重要。运行时优化也是如此。至少这不是主要问题。任何次要/主要的运行时优化都是受欢迎的,但不是关键因素


我认为我们可以通过首先排序,然后在迭代过程中保持当前的总体和全局最大值,从而使
O(nlogn)
时间具有
O(1)
额外的空间。我试图用今年作为参考点,但逻辑似乎仍然有点棘手,所以我不确定它是否完全可行。希望它能给出一个方法的想法

JavaScript代码(反例/bug欢迎使用)

功能f(出生、死亡){
出生.排序((a,b)=>a-b);
死亡。排序((a,b)=>a-b);
log(JSON.stringify(JSON));
log(JSON.stringify(death));
设i=0;
设j=0;
让年份=出生率[i];
设curr=0;
设max=curr;
而(死亡[j]<出生[0])
j++;
而(i<出生.长度| | j<死亡.长度){
而(年份==出生[i]){
电流=电流+1;
i=i+1;
}
如果(j==死亡数。长度| |年<死亡数[j]){
max=数学最大值(max,curr);
log(`year:${year},max:${max},curr:${curr}`);
}否则,如果(jyear&(i==出生.length | |死亡[j]控制台日志(f(出生、死亡))我们可以使用桶排序在线性时间内解决此问题。假设输入的大小是n,年的范围是m

O(n): Find the min and max year across births and deaths.
O(m): Create an array of size max_yr - min_yr + 1, ints initialized to zero. 
      Treat the first cell of the array as min_yr, the next as min_yr+1, etc...
O(n): Parse the births array, incrementing the appropriate index of the array. 
      arr[birth_yr - min_yr] += 1
O(n): Ditto for deaths, decrementing the appropriate index of the array.
      arr[death_yr - min_yr] -= 1
O(m): Parse your array, keeping track of the cumulative sum and its max value.
最大的累积最大值就是你的答案

运行时间为O(n+m),需要的额外空间为O(m)

如果m是O(n),这是n中的线性解;i、 例如,如果年份范围的增长速度不超过出生和死亡人数的增长速度。对于真实世界的数据来说,这几乎肯定是正确的

$births = [1984, 1981, 1984, 1991, 1996];
$deaths = [1991, 1984];
$years = array_unique(array_merge($births, $deaths));
sort($years);

$increaseByYear = array_count_values($births);
$decreaseByYear = array_count_values($deaths);
$populationByYear = array();

foreach ($years as $year) {
    $increase = $increaseByYear[$year] ?? 0;
    $decrease = $decreaseByYear[$year] ?? 0;
    $previousPopulationTally = end($populationByYear);
    $populationByYear[$year] = $previousPopulationTally + $increase - $decrease;
}

$maxPopulation = max($populationByYear);
$maxPopulationYears = array_keys($populationByYear, $maxPopulation);

$maxPopulationByYear = array_fill_keys($maxPopulationYears, $maxPopulation);
print_r($maxPopulationByYear);

这将考虑到一个固定年份的可能性,以及如果某人的死亡年份与某人的出生年份不一致。

首先将出生和死亡汇总到一张地图中(
年=>人口变化
),按键排序,并计算该地图上的流动人口

这应该大约是
O(2n+n log n)
,其中
n
是出生数

$births = [1984, 1981, 1984, 1991, 1996];
$deaths = [1991, 1984];

function highestPopulationYear(array $births, array $deaths): ?int
{
    $indexed = [];

    foreach ($births as $birth) {
        $indexed[$birth] = ($indexed[$birth] ?? 0) + 1;
    }

    foreach ($deaths as $death) {
        $indexed[$death] = ($indexed[$death] ?? 0) - 1;
    }

    ksort($indexed);

    $maxYear = null;
    $max = $current = 0;

    foreach ($indexed as $year => $change) {
        $current += $change;
        if ($current >= $max) {
            $max = $current;
            $maxYear = $year;
        }
    }

    return $maxYear;
}

var_dump(highestPopulationYear($births, $deaths));

我用
O(n+m)
[在最坏的情况下,最好的情况下
O(n)
]的内存需求解决了这个问题

并且,
O(nlogn)
的时间复杂度

这里,
n&m
出生
死亡
数组的长度

我不懂PHP或javascript。我已经用Java实现了它,逻辑非常简单。但我相信我的想法也可以用这些语言实现

技术细节:

我使用java
TreeMap
结构来存储出生和死亡记录

TreeMap
将排序的数据作为(键,值)对插入(基于键的),这里键是年份,值是出生和死亡的累计总和(死亡为负值)

我们不需要
插入在最高出生年份之后发生的死亡值

一旦TreeMap中填充了出生和死亡记录,所有的累计总和都将更新,并随着年份的发展存储最大人口

样本输入和输出:1

Births: [1909, 1919, 1904, 1911, 1908, 1908, 1903, 1901, 1914, 1911, 1900, 1919, 1900, 1908, 1906]

Deaths: [1910, 1911, 1912, 1911, 1914, 1914, 1913, 1915, 1914, 1915]

Year counts Births: {1900=2, 1901=1, 1903=1, 1904=1, 1906=1, 1908=3, 1909=1, 1911=2, 1914=1, 1919=2}

Year counts Birth-Deaths combined: {1900=2, 1901=1, 1903=1, 1904=1, 1906=1, 1908=3, 1909=1, 1910=-1, 1911=0, 1912=-1, 1913=-1, 1914=-2, 1915=-2, 1919=2}

Yearwise population: {1900=2, 1901=3, 1903=4, 1904=5, 1906=6, 1908=9, 1909=10, 1910=9, 1911=9, 1912=8, 1913=7, 1914=5, 1915=3, 1919=5}

maxPopulation: 10
yearOfMaxPopulation: 1909
样本输入和输出:2

Births: [1906, 1901, 1911, 1902, 1905, 1911, 1902, 1905, 1910, 1912, 1900, 1900, 1904, 1913, 1904]

Deaths: [1917, 1908, 1918, 1915, 1907, 1907, 1917, 1917, 1912, 1913, 1905, 1914]

Year counts Births: {1900=2, 1901=1, 1902=2, 1904=2, 1905=2, 1906=1, 1910=1, 1911=2, 1912=1, 1913=1}

Year counts Birth-Deaths combined: {1900=2, 1901=1, 1902=2, 1904=2, 1905=1, 1906=1, 1907=-2, 1908=-1, 1910=1, 1911=2, 1912=0, 1913=0}

Yearwise population: {1900=2, 1901=3, 1902=5, 1904=7, 1905=8, 1906=9, 1907=7, 1908=6, 1910=7, 1911=9, 1912=9, 1913=9}

maxPopulation: 9
yearOfMaxPopulation: 1906
在这里,死亡发生(
1914年及以后
)在上一个出生年份
1913年
)之后,根本不计算在内,从而避免了不必要的计算

对于总共1000万数据(出生和死亡的总和)和1000年以上的数据,该程序花了大约3秒完成

如果相同大小的数据在
100年范围内
,则需要
1.3秒


所有的输入都是随机抽取的。

内存方面,保持
当前填充<
Births: [1906, 1901, 1911, 1902, 1905, 1911, 1902, 1905, 1910, 1912, 1900, 1900, 1904, 1913, 1904]

Deaths: [1917, 1908, 1918, 1915, 1907, 1907, 1917, 1917, 1912, 1913, 1905, 1914]

Year counts Births: {1900=2, 1901=1, 1902=2, 1904=2, 1905=2, 1906=1, 1910=1, 1911=2, 1912=1, 1913=1}

Year counts Birth-Deaths combined: {1900=2, 1901=1, 1902=2, 1904=2, 1905=1, 1906=1, 1907=-2, 1908=-1, 1910=1, 1911=2, 1912=0, 1913=0}

Yearwise population: {1900=2, 1901=3, 1902=5, 1904=7, 1905=8, 1906=9, 1907=7, 1908=6, 1910=7, 1911=9, 1912=9, 1913=9}

maxPopulation: 9
yearOfMaxPopulation: 1906
<?php
function getHighestPopulation($births, $deaths){
    $max = [];
    $currentMax = 0;
    $tmpArray = [];

    foreach($deaths as $key => $death){
        if(!isset($tmpArray[$death])){
            $tmpArray[$death] = 0;    
        }
        $tmpArray[$death]--;
    }
    foreach($births as $k => $birth){
        if(!isset($tmpArray[$birth])){
            $tmpArray[$birth] = 0;
        }
        $tmpArray[$birth]++;
        if($tmpArray[$birth] > $currentMax){
            $max = [$birth];
            $currentMax = $tmpArray[$birth];
        } else if ($tmpArray[$birth] == $currentMax) {
            $max[] = $birth;
        }
    }

    return [$currentMax, $max];
}

$births = [1997, 1997, 1997, 1998, 1999];
$deaths = [1998, 1999];

print_r (getHighestPopulation($births, $deaths));
?>
$births = [1909, 1919, 1904, 1911, 1908, 1908, 1903, 1901, 1914, 1911, 1900, 1919, 1900, 1908, 1906];
$deaths = [1910, 1911, 1912, 1911, 1914, 1914, 1913, 1915, 1914, 1915];

/* for generating 1 million records

for($i=1;$i<=1000000;$i++) {
    $births[] = rand(1900, 2020);
    $deaths[] = rand(1900, 2020);
}
*/

function highestPopulationYear(Array $births, Array $deaths): Int {
    $start_time = microtime(true); 
    $population = array_count_values($births);
    $deaths = array_count_values($deaths);

    foreach ($deaths as $year => $death) {
        $population[$year] = ($population[$year] ?? 0) - $death;
    }
    ksort($population, SORT_NUMERIC);
    $cumulativeSum = $maxPopulation = $maxYear = 0;
    foreach ($population as $year => &$number) {
        $cumulativeSum += $number;
        if($maxPopulation < $cumulativeSum) {
            $maxPopulation = $cumulativeSum;
            $maxYear = $year;
        }
    }
    print " Execution time of function = ".((microtime(true) - $start_time)*1000)." milliseconds"; 
    return $maxYear;
}

print highestPopulationYear($births, $deaths);
1909
O(m + log(n))