Why is Spring's jdbcTemplate.batchUpdate() so slow?

java mysql spring spring-batch

I'm trying to find the fastest way to do batch inserts.

I tried inserting several batches with jdbcTemplate.update(String sql), where sql was built by a StringBuilder and looked like this:

INSERT INTO TABLE(x, y, i) VALUES(1,2,3), (1,2,3), ... , (1,2,3)

The batch size was exactly 1000. I inserted nearly 100 batches. I measured the time with a StopWatch and found these insert times:

min[38ms], avg[50ms], max[190ms] per batch

I was happy with that, but I wanted to make my code better.

After that, I tried to use jdbcTemplate.batchUpdate like this:

    jdbcTemplate.batchUpdate(sql, new BatchPreparedStatementSetter() {
        @Override
        public void setValues(PreparedStatement ps, int i) throws SQLException {
                       // ...
        }
        @Override
        public int getBatchSize() {
            return 1000;
        }
    });

where sql looked like

INSERT INTO TABLE(x, y, i) VALUES(1,2,3);

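I was disappointed! jdbcTemplate executed every single insert of the 1000-row batch separately. I looked at the mysql_log and found a thousand inserts there. I measured the time with a StopWatch and found these insert times:

min[900ms], avg[1100ms], max[2000ms] per batch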

So, can anybody explain to me why jdbcTemplate does separate inserts in this method? Why is the method named batchUpdate?

Or maybe I am using this method in the wrong way?

Change your sql insert to

INSERT INTO TABLE(x, y, i) VALUES(1,2,3)

The framework creates a loop for you. For example:

public void insertBatch(final List<Customer> customers){

  String sql = "INSERT INTO CUSTOMER " +
    "(CUST_ID, NAME, AGE) VALUES (?, ?, ?)";

  getJdbcTemplate().batchUpdate(sql, new BatchPreparedStatementSetter() {

    @Override
    public void setValues(PreparedStatement ps, int i) throws SQLException {
        Customer customer = customers.get(i);
        ps.setLong(1, customer.getCustId());
        ps.setString(2, customer.getName());
        ps.setInt(3, customer.getAge() );
    }

    @Override
    public int getBatchSize() {
        return customers.size();
    }
  });
}

If you have something like this, Spring will do something like:

for(int i = 0; i < getBatchSize(); i++){
   execute the prepared statement with the parameters for the current iteration
}

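The framework first creates a PreparedStatement from the query (the sql variable), then calls the setValues method and executes the statement. That is repeated as many times as you specify in the getBatchSize() method. So the right way to write the insert statement is with only one VALUES clause.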
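A rough sketch of the loop described above, for illustration only; it is not Spring's actual source. As far as I know, when the driver reports batch support JdbcTemplate queues each row with addBatch() and calls executeBatch() once at the end, and it is then up to the driver how the queued rows are sent to the server. RowSetter and recordingStatement are hypothetical names; the fake statement just records which methods are called so the sketch runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class BatchLoopSketch {

    /** Hypothetical stand-in for Spring's BatchPreparedStatementSetter. */
    interface RowSetter {
        void setValues(PreparedStatement ps, int i) throws SQLException;
        int getBatchSize();
    }

    /** Binds each row, queues it with addBatch(), then sends the queue once. */
    static int[] runBatch(PreparedStatement ps, RowSetter setter) throws SQLException {
        for (int i = 0; i < setter.getBatchSize(); i++) {
            setter.setValues(ps, i);
            ps.addBatch();
        }
        return ps.executeBatch();
    }

    /** Fake PreparedStatement that only records the names of the methods called on it. */
    static PreparedStatement recordingStatement(List<String> calls) {
        return (PreparedStatement) Proxy.newProxyInstance(
                BatchLoopSketch.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                (proxy, method, args) -> {
                    calls.add(method.getName());
                    // executeBatch() must return an int[]; the void methods used here return null.
                    return method.getName().equals("executeBatch") ? new int[0] : null;
                });
    }

    /** Runs a three-row batch against the fake statement and returns the recorded calls. */
    static List<String> demo() {
        List<String> calls = new ArrayList<>();
        try {
            runBatch(recordingStatement(calls), new RowSetter() {
                @Override
                public void setValues(PreparedStatement ps, int i) throws SQLException {
                    ps.setInt(1, i);
                }

                @Override
                public int getBatchSize() {
                    return 3;
                }
            });
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The recorded sequence is three setInt/addBatch pairs followed by a single executeBatch. Whether that single call reaches MySQL as one statement or as many is decided by the driver, which would explain seeing a thousand separate inserts in the server log.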
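I don't know if this will work for you, but here is a Spring-free way that I ended up using. It was significantly faster than the various Spring methods I tried. I even tried the JDBC template batch update method the other answer describes, but even that was slower than I wanted. I'm not sure what the deal was, and the Internet didn't have many answers either. I suspect it had to do with how commits were being handled.
This approach is just straight JDBC, using the java.sql package and PreparedStatement's batch interface. It was the fastest way I found to get 24M records into a MySQL database.
I more or less just built up collections of "record" objects and then called the code below in a method that batch inserted all the records. The loop that built the collections was responsible for managing the batch size.
I was trying to insert 24M records into a MySQL database, and it was going at about 200 records per second using Spring batch. When I switched to this method, it went up to about 2500 records per second. So my 24M record load went from a theoretical 1.5 days down to about 2.5 hours.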
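To make the arithmetic behind those estimates explicit, here is a small sketch. It assumes a perfectly steady insert rate, which a real load will not have; LoadTimeEstimate is a hypothetical helper name.

```java
final class LoadTimeEstimate {

    /** Hours needed to insert `records` rows at a steady `rowsPerSecond`. */
    static double hours(long records, long rowsPerSecond) {
        return (double) records / rowsPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        long records = 24_000_000L;
        System.out.printf("at 200 rows/s:  %.1f hours%n", hours(records, 200));
        System.out.printf("at 2500 rows/s: %.1f hours%n", hours(records, 2500));
    }
}
```

At 200 rows per second the load takes roughly 33 hours (a bit under a day and a half), and at 2500 rows per second roughly 2.7 hours, which matches the rounded figures above.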
First, create a connection:
Connection conn = null;
try{
    Class.forName("com.mysql.jdbc.Driver");
    conn = DriverManager.getConnection(connectionUrl, username, password);
}catch(SQLException e){}catch(ClassNotFoundException e){}

Then create a prepared statement, load it with batches of values for the insert, and execute it as a single batch insert:
PreparedStatement ps = null;
try{
    conn.setAutoCommit(false);
    ps = conn.prepareStatement(sql); // e.g. INSERT INTO TABLE(x, y, i) VALUES(?, ?, ?)
    for(MyRecord record : records){
        try{
            ps.setString(1, record.getX());
            ps.setString(2, record.getY());
            ps.setString(3, record.getI());

            ps.addBatch();
        } catch (Exception e){
            ps.clearParameters();
            logger.warn("Skipping record...", e);
        }
    }

    ps.executeBatch();
    conn.commit();
} catch (SQLException e){
} finally {
    if(null != ps){
        try {ps.close();} catch (SQLException e){}
    }
}

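Obviously I've removed the error handling, and the query and record object are notional and so on.
Edit:
Since your original question was comparing the insert into foobar values (?,?,?), (?,?,?)...(?,?,?) method with Spring batch, here is a more direct response to that:
It looks like your original method is likely the fastest way to bulk load data into MySQL without using something like the "LOAD DATA INFILE" approach. A quote from the MySQL docs:
If you are inserting many rows from the same client at the same time,
use INSERT statements with multiple VALUES lists to insert several
rows at a time. This is considerably faster (many times faster in some
cases) than using separate single-row INSERT statements.
You could modify the Spring JDBC template batchUpdate method to do an insert with multiple VALUES lists per "setValues" call, but you would have to keep track of the index values manually as you iterate over the set of things being inserted. You would also run into a nasty edge case at the end, when the total number of things being inserted is not a multiple of the number of VALUES lists in your prepared statement.
If you use the approach I outline, you can do the same thing (use a prepared statement with multiple VALUES lists), and when you get to that edge case at the end it is a little easier to deal with, because you can build and execute one last statement with exactly the right number of VALUES lists. It's a bit hacky, but most optimized things are.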
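One way the multiple-VALUES-lists idea and its final edge case could be handled is sketched below. This is only an illustration under my own assumptions: MultiRowInsertSketch, buildInsert and chunks are hypothetical names, and the table and column names are concatenated into the SQL, so they must come from trusted code and never from user input.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class MultiRowInsertSketch {

    /** Builds e.g. "INSERT INTO t (a, b) VALUES (?, ?), (?, ?)" with `rows` VALUES lists. */
    static String buildInsert(String table, List<String> columns, int rows) {
        if (rows < 1 || columns.isEmpty()) {
            throw new IllegalArgumentException("need at least one row and one column");
        }
        // One "(?, ?, ...)" group per row, one "?" per column.
        String group = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + table + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rows, group));
    }

    /** Splits items into consecutive chunks of at most chunkSize; the last chunk may be shorter. */
    static <T> List<List<T>> chunks(List<T> items, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            out.add(items.subList(from, to));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> columns = Collections.unmodifiableList(
                new ArrayList<>(java.util.Arrays.asList("name", "city", "phone")));
        System.out.println(buildInsert("employee", columns, 2));
    }
}
```

For each chunk you would prepare buildInsert(table, columns, chunk.size()) and bind the value for row r and column c (both zero-based) at parameter index r * columns.size() + c + 1. Because only the last chunk can be shorter, at most two distinct SQL strings are needed for the whole load.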
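I also faced the same issue with the Spring JDBC template. Probably, with Spring Batch the statement was executed and committed on every insert or on every chunk, which slowed things down.
I have replaced the jdbcTemplate.batchUpdate() code with plain JDBC batch insertion code and found a major performance improvement.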
DataSource ds = jdbcTemplate.getDataSource();
String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
final int batchSize = 1000;

// try-with-resources closes the statement and the connection even if an insert fails
try (Connection connection = ds.getConnection();
     PreparedStatement ps = connection.prepareStatement(sql)) {
    connection.setAutoCommit(false);
    int count = 0;

    for (Employee employee : employees) {

        ps.setString(1, employee.getName());
        ps.setString(2, employee.getCity());
        ps.setString(3, employee.getPhone());
        ps.addBatch();

        ++count;

        // send every full batch, plus the final partial one
        if (count % batchSize == 0 || count == employees.size()) {
            ps.executeBatch();
            ps.clearBatch();
        }
    }

    // one commit for the whole load
    connection.commit();
}

These parameters in the JDBC connection URL can make a big difference in the speed of batched statements; in my experience, they speed things up:
?useServerPrepStmts=false&rewriteBatchedStatements=true
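As a small sketch of how those two parameters could be appended to an existing connection URL: the host, port and database name below are placeholders, and BatchFriendlyUrl is a hypothetical helper. As I understand the Connector/J documentation, rewriteBatchedStatements=true lets the driver rewrite a batch of single-row INSERTs into multi-row form before sending it.

```java
final class BatchFriendlyUrl {

    /** Appends the two batch-related Connector/J properties to a MySQL JDBC URL. */
    static String withBatchRewrite(String baseUrl) {
        // Use "&" when the URL already carries a query string, "?" otherwise.
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "useServerPrepStmts=false&rewriteBatchedStatements=true";
    }

    public static void main(String[] args) {
        // localhost:3306/mydb is a placeholder, not a value from the original post.
        System.out.println(withBatchRewrite("jdbc:mysql://localhost:3306/mydb"));
    }
}
```

The resulting string is what you would pass to DriverManager.getConnection or set as the URL of your DataSource.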
Simply use a transaction. Add @Transactional on the method.
Be sure to declare the correct TX manager if you use several datasources: @Transactional("dsTxManager"). I had a case, inse...
int[] argTypes = new int[35];
argTypes[0] = Types.VARCHAR;
argTypes[1] = Types.VARCHAR;
argTypes[2] = Types.VARCHAR;
argTypes[3] = Types.DECIMAL;
argTypes[4] = Types.TIMESTAMP;
.....
