PHP: Deleting a large amount of data with a primary index


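I am trying to delete a large number of rows (more than 10 million, about 1/3 of all records in the table) from an InnoDB MySQL table with a primary/clustered index. The field id is the primary/clustered index; it is sequential, with no gaps. At least it should be, since I never delete records in the middle. It is possible, though, that some INSERT queries failed and InnoDB allocated IDs that were never used (I am not sure whether that is true). I only delete old records that are no longer needed. The table contains varchar columns, so the row size is not fixed.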
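
Whether the ids really are gap-free can be checked by comparing the id span with the row count. InnoDB does not reuse auto-increment values consumed by a failed or rolled-back insert, so gaps are possible. The following is a minimal sketch: the function name `countIdGaps` is hypothetical, and its three inputs are assumed to come from a query such as SELECT MIN(id), MAX(id), COUNT(*) FROM `table`.

```php
<?php
// Hypothetical helper: number of unused ids between the smallest and largest id.
// Inputs are assumed to come from: SELECT MIN(id), MAX(id), COUNT(*) FROM `table`
function countIdGaps(int $minId, int $maxId, int $rowCount): int
{
    if ($rowCount === 0) {
        return 0; // empty table: nothing to compare
    }
    // A gap-free range [minId, maxId] holds exactly (maxId - minId + 1) rows.
    return ($maxId - $minId + 1) - $rowCount;
}
```

A result of 0 means the range is dense; any positive value is the number of ids that were allocated but never stored.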

First, I tried:

DELETE FROM `table` WHERE id<=10000000
This variant has stable performance.

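The reason for the poor performance is that the database engine has to physically process all the record data in the affected leaves. That is my understanding; if you know more, you are welcome to add a detailed description of what actually happens. Perhaps it will lead to some new ideas.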

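Question: is it possible to compute the distribution of rows across leaves and delete entire leaves, or even branches, so that the database engine does not have to touch the row data?
You may also have other ideas for optimizing performance in this situation.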

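I have faced this several times. Usually I create a partition (or several partitions first), because this reduces the I/O InnoDB needs for large DELETE queries without rebuilding the entire index tree, and then I delete in chunks of 1000-1500 rows at a time.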

This is also good practice:

  • Set autocommit to 1
  • Delete the data in chunks of around 1500 rows at a time
  • Make sure innodb_log_file_size is large enough
Try:

DELETE FROM `table` WHERE id BETWEEN 1 AND 10000000
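
The chunking advice and the BETWEEN form can be combined by splitting the id range into fixed-size slices. This is only a sketch under stated assumptions: `buildChunkedDeletes` is a hypothetical helper, the default chunk size of 1500 is taken from the suggestion above, and each generated statement is assumed to run as its own transaction (autocommit = 1).

```php
<?php
// Hypothetical helper: splits the id range [$fromId, $toId] into consecutive
// chunks of at most $chunkSize ids and returns one DELETE statement per chunk.
function buildChunkedDeletes(string $table, int $fromId, int $toId, int $chunkSize = 1500): array
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunkSize must be at least 1');
    }
    $statements = [];
    for ($start = $fromId; $start <= $toId; $start += $chunkSize) {
        // The last chunk is clipped so it never goes past $toId.
        $end = min($start + $chunkSize - 1, $toId);
        $statements[] = sprintf('DELETE FROM `%s` WHERE id BETWEEN %d AND %d', $table, $start, $end);
    }
    return $statements;
}

// Print the statements instead of executing them; running them is left to the caller.
foreach (buildChunkedDeletes('table', 1, 4000) as $sql) {
    echo $sql, PHP_EOL;
}
```

Bounding each statement on both sides keeps every DELETE small and short-lived, at the cost of issuing many more queries.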

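Maybe try dropping the indexes before deleting and then redefining them.
Is there a difference between autocommit and issuing a COMMIT query after every query? Chunks with a fixed number of rows are not optimal. I ran a test: the optimal row count depends heavily on the server's hardware and current load. In most cases, on a live server, query execution time increases. So it is better to compute the number of rows to delete from the query execution time, to adjust the load to the server's current state.
innodb_log_file_size is fine now; that was my first mistake.
It should be much slower than id<10000000.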
protected function deleteById($table, $id) {
    $MinId          = $this->getMinFromTable($table, 'id');
    $PackDeleteCount= $this->PackDeleteCount;
    $timerTotal     = new Timer();
    $delCountTotal  = 0;
    $delCountReport = 0;
    $delInfo        = array();
    $PackMinTime    = round($this->PackDeleteTime - $this->PackDeleteTime*$this->PackDeleteDiv, 3);
    $PackMaxTime    = round($this->PackDeleteTime + $this->PackDeleteTime*$this->PackDeleteDiv, 3);
    $this->LogString(sprintf('Del `%s`, PackMinTime: %s; PackMaxTime: %s', $table, $PackMinTime, $PackMaxTime));
    while ($MinId < $id) {
        $MinId          += $PackDeleteCount;
        $delCountReport += $PackDeleteCount;
        if ($MinId > $id) {
            $MinId = $id;
        }
        $timer          = new Timer();
        $sql            = sprintf('DELETE FROM `%s` WHERE id<=%s', $table, $MinId);
        $this->s->Query($sql, __FILE__, __LINE__);
        $delCount       = $this->s->AffectedRows();
        $this->s->CommitT();
        $RoundTime      = round($timer->end(), 3);
        $delInfo[]      = array(
            'time'  => $RoundTime,
            'rows'  => $PackDeleteCount,
        );
        $delCountTotal  += $delCount;
        if ($delCountReport >= $this->PackDeleteReport) {
            $delCountReport = 0;
            $delSqlCount    = count($delInfo);
            $EvTime         = 0;
            $PackTime       = 0;
            $EvCount        = 0;
            $PackCount      = 0;
            foreach ($delInfo as $v) {
                $PackTime   += $v['time'];
                $PackCount  += $v['rows'];
            }
            $EvTime         = round($PackTime/$delSqlCount, 2);
            $PackTime       = round($PackTime, 2);
            $EvCount        = round($PackCount/$delSqlCount);
            $TotalTime      = $this->readableTime(intval($timerTotal->end()));
            $this->LogString(sprintf('Del `%s`, Sql query count: %d; Time: %s; Count: %d; Average time: %s; Average count per delete: %d; Del total: %s; Del total time: %s; id <= %s', $table, $delSqlCount, $PackTime, $PackCount, $EvTime, $EvCount, $delCountTotal, $TotalTime, $MinId));
            $delInfo        = array();
        }

        $PackDeleteCountOld = $PackDeleteCount;
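        if ($RoundTime < $PackMinTime) {
            // Query finished faster than the target window: grow the chunk.
            $PackDeleteCount    = intval($PackDeleteCount + $PackDeleteCount*(1 - $RoundTime/$this->PackDeleteTime));
        } elseif ($RoundTime > $PackMaxTime) {
            // Query finished slower than the target window: shrink the chunk.
            $PackDeleteCount    = intval($PackDeleteCount - $PackDeleteCount*(1 - $this->PackDeleteTime/$RoundTime));
        }
        // Keep the chunk size at 1 or more; at 0, $MinId would stop advancing and the loop would never end.
        $PackDeleteCount        = max(1, $PackDeleteCount);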
        //$this->LogString(sprintf('Del `%s`, round time: %s; row count old: %d; row count new: %d', $table, $RoundTime, $PackDeleteCountOld, $PackDeleteCount));
    }
    $this->LogString(sprintf('Finished del `%s`: time: %s', $table, round($timerTotal->end(), 2)));
}
$table - target table from which rows are deleted
$id - all records up to and including this id are deleted
$MinId - minimal id in the target table
$this->PackDeleteCount - initial number of records per query. The row count is then recalculated for each new query.
$this->PackDeleteTime - desired average query execution time, in seconds. I used 0.5
$this->PackDeleteDiv - acceptable deviation from $this->PackDeleteTime, as a fraction. I used 0.3
$this->PackDeleteReport - statistics about the deletion are logged every N records
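
The chunk-size adjustment used in the loop can be isolated as a pure function, which makes its behaviour easier to check without a database. This is a sketch only: the name `nextChunkSize` is hypothetical, the defaults 0.5 and 0.3 are the values mentioned in the parameter list, and the lower bound of 1 is a safeguard added here rather than part of the posted method.

```php
<?php
// Hypothetical helper mirroring the adjustment formulas of deleteById().
// $current  - rows requested in the query that just finished
// $elapsed  - how long that query took, in seconds
// $target   - desired query time (PackDeleteTime)
// $deviation - tolerated fraction around the target (PackDeleteDiv)
function nextChunkSize(int $current, float $elapsed, float $target = 0.5, float $deviation = 0.3): int
{
    $minTime = $target - $target * $deviation;
    $maxTime = $target + $target * $deviation;
    if ($elapsed < $minTime) {
        // Faster than the window: grow in proportion to the time left unused.
        $current = intval($current + $current * (1 - $elapsed / $target));
    } elseif ($elapsed > $maxTime) {
        // Slower than the window: scale down by the ratio target/elapsed.
        $current = intval($current - $current * (1 - $target / $elapsed));
    }
    // Assumption: never return less than 1 row, so the caller always makes progress.
    return max(1, $current);
}
```

With the defaults, a 1000-row chunk that took 0.25 s grows to 1500 rows, one that took 1.0 s shrinks to 500 rows, and one that took 0.5 s stays at 1000 because it falls inside the 0.35 to 0.65 s window.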