Batch/bulk updates with Spring Data JPA/Hibernate on MySQL


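I am using MySQL with Spring Data JPA. In my use case I have a single table, e.g. CUSTOMER (ID, FIRST_NAME, LAST_NAME). What I am trying to achieve is batch/bulk updating, where the update statements are sent as a group, as in the example log entry shown in this question, to reduce database round trips.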

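I have set all of these properties:

hibernate.order_inserts: true
hibernate.order_updates: true
hibernate.jdbc.batch_versioned_data: true

But the result, as seen in the MySQL general log, is that the update statements are not grouped.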

Expected result (updates grouped into a single query, thereby reducing database round trips):

2018-10-28T03:18:32.545233Z 1711 Query update CUSTOMER set FIRST_NAME='499997', LAST_NAME='499998' where id=499996; update CUSTOMER set FIRST_NAME='499998', LAST_NAME='499999' where id=499997; update CUSTOMER set FIRST_NAME='499999', LAST_NAME='500000' where id=499998;

My application needs to perform more than 100 million updates, and I think this may be the fastest way to do it.

I suggest you also set the hibernate.jdbc.batch_size property. Here is a small example that I tried:

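int entityCount = 50;
int batchSize = 25; // ideally the same value as hibernate.jdbc.batch_size

// entityManagerFactory() is a helper that returns the configured EntityManagerFactory
EntityManager entityManager = entityManagerFactory()
    .createEntityManager();

EntityTransaction entityTransaction = entityManager
    .getTransaction();

try {
    entityTransaction.begin();

    for (int i = 0; i < entityCount; i++) {
        // At every batch boundary: commit (which flushes), start a new
        // transaction, and detach the entities that were already written
        if (i > 0 && i % batchSize == 0) {
            entityTransaction.commit();
            entityTransaction.begin();

            entityManager.clear();
        }

        Post post = new Post(
            String.format("Post %d", i + 1)
        );

        entityManager.persist(post);
    }

    // Commit the last (possibly partial) batch
    entityTransaction.commit();
} catch (RuntimeException e) {
    // Roll back only if a transaction is still open
    if (entityTransaction.isActive()) {
        entityTransaction.rollback();
    }
    throw e;
} finally {
    entityManager.close();
}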

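Every time the iteration counter (e.g. i) reaches a multiple of the batchSize threshold, we can flush the EntityManager and commit the database transaction (in the code above, the commit triggers the flush). By committing the database transaction after every batch execution, we gain the following advantages:

- We avoid long-running transactions, which are detrimental to MVCC relational database systems.
- We make sure that, if a failure occurs, we do not lose the work done by the batch jobs that previously executed successfully.

The EntityManager is cleared after every batch execution so that we do not keep accumulating managed entities, which can cause several problems:

- If the number of entities to persist is huge, we risk running out of memory.
- The more entities we accumulate in the Persistence Context, the slower the flush becomes. So it is good practice to keep the Persistence Context as slim as possible.

If an exception is thrown, we must make sure to roll back the currently running database transaction. Failing to do so can cause many problems, since the database might still consider the transaction open, and locks might be held until the transaction is ended by a timeout or by the DBA.

In the end, we need to close the EntityManager so that the context is cleared and Session-level resources are released.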
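To make the batch boundaries in the loop concrete, here is a small database-free sketch that only traces when the commit-and-clear step would fire. The class name BatchBoundaryTrace and the event strings are made up for illustration; the loop condition is the same one used in the example, and the counter stands in for the calls to entityManager.persist.

```java
import java.util.ArrayList;
import java.util.List;

// Database-free trace of the batching loop: it records when a commit would happen
// and how many entities that commit would cover. Names here are illustrative only.
class BatchBoundaryTrace {

    /** Returns one event per commit that the loop would perform. */
    static List<String> trace(int entityCount, int batchSize) {
        List<String> events = new ArrayList<>();
        int pending = 0; // entities "persisted" since the last commit

        for (int i = 0; i < entityCount; i++) {
            // Same boundary condition as in the example loop
            if (i > 0 && i % batchSize == 0) {
                events.add("commit(" + pending + ")+clear before i=" + i);
                pending = 0;
            }
            pending++; // stands in for entityManager.persist(post)
        }

        // The commit after the loop covers the last (possibly partial) batch
        events.add("final commit(" + pending + ")");
        return events;
    }

    public static void main(String[] args) {
        trace(50, 25).forEach(System.out::println);
    }
}
```

With entityCount = 50 and batchSize = 25 this yields two commits of 25 entities each: one inside the loop at i = 25 and one after the loop.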

I think this is the closest you can get with Hibernate/JPA.
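As a rough sketch of the configuration side, the settings from the question plus the suggested hibernate.jdbc.batch_size can be collected in a plain map, which could then be handed to something like Persistence.createEntityManagerFactory(unitName, settings) or, in a Spring Boot setup, expressed under the spring.jpa.properties prefix. The class and method names below are hypothetical. The rewriteBatchedStatements URL parameter is an addition that is not part of the original answer: as far as I know, MySQL Connector/J only combines a JDBC batch into fewer statements on the wire when it is enabled, so it is probably worth testing if the general log still shows one update per query.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical names): gathers the batching-related settings in one place.
class BatchingSettings {

    /** Hibernate settings from the question plus the batch size suggested in the answer. */
    static Map<String, String> hibernateSettings(int batchSize) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("hibernate.jdbc.batch_size", String.valueOf(batchSize));
        settings.put("hibernate.order_inserts", "true");
        settings.put("hibernate.order_updates", "true");
        settings.put("hibernate.jdbc.batch_versioned_data", "true");
        return settings;
    }

    /**
     * Appends rewriteBatchedStatements=true to a MySQL JDBC URL.
     * Assumption: the MySQL Connector/J driver is in use; this is not from the original answer.
     */
    static String withRewriteBatchedStatements(String jdbcUrl) {
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "rewriteBatchedStatements=true";
    }

    public static void main(String[] args) {
        hibernateSettings(25).forEach((key, value) -> System.out.println(key + "=" + value));
        // Hypothetical host and schema name, for illustration only
        System.out.println(withRewriteBatchedStatements("jdbc:mysql://localhost:3306/demo"));
    }
}
```

Keeping the batch size in a single place makes it easier to keep the loop's batchSize and hibernate.jdbc.batch_size in sync.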
