Java: Why does converting Protobuf records to a Parquet file take so much longer in version 1.11.0 of org.apache.parquet parquet-protobuf than in 1.8.1?

Tags: java, hadoop, protocol-buffers, parquet, protobuf-java

I have a simple protobuf schema, shown below:

protobuf:

option java_outer_classname = "SimpleRecords";
message Record {
    required int64 number = 1;
}
I use the code below to generate a Parquet file from the record above:

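import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import org.apache.parquet.proto.ProtoParquetWriter;

int pageSize = 4 * 1024 * 1024; // 4 MiB page size

// LongGenerator is a helper class from my project (definition not shown).
LongGenerator longGenerator = new LongGenerator(500_000_000L);
Path filePath = new Path("benchmark/numbers.parquet");

long startTime = System.nanoTime();
// Constructor arguments: file, message class, codec, block (row group) size, page size.
try (ParquetWriter<SimpleRecords.Record> writer = new ProtoParquetWriter<>(filePath, SimpleRecords.Record.class, CompressionCodecName.SNAPPY, 32 * pageSize, pageSize)) {
    // The builder is reused across iterations; build() produces a new message each time.
    SimpleRecords.Record.Builder recordBuilder = SimpleRecords.Record.newBuilder();
    for (Long i : longGenerator) {
        recordBuilder.setNumber(i);
        writer.write(recordBuilder.build());
    }
} catch (IOException e) {
    e.printStackTrace();
}
long endTime = System.nanoTime();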
I measured the time taken to generate the Parquet file: version 1.8.1 takes 103 seconds, while version 1.11.0 takes 2167 seconds.
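
For reference, the figures above work out to a slowdown of roughly 21 times. The following is a minimal sketch of how the two `System.nanoTime()` readings can be turned into seconds and how the two runs compare. The class name `TimingSketch` and its helper methods are hypothetical and only for illustration; they are not part of the original benchmark.

```java
public class TimingSketch {
    // Converts two System.nanoTime() readings into elapsed seconds.
    public static double elapsedSeconds(long startNanos, long endNanos) {
        return (endNanos - startNanos) / 1_000_000_000.0;
    }

    // How many times longer the candidate run took than the baseline run.
    public static double slowdown(double baselineSeconds, double candidateSeconds) {
        return candidateSeconds / baselineSeconds;
    }

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        // ... the Parquet write loop from the question would run here ...
        long endTime = System.nanoTime();

        System.out.println("elapsed seconds: " + elapsedSeconds(startTime, endTime));
        // Timings reported in the question: 103 s (1.8.1) and 2167 s (1.11.0).
        System.out.println("slowdown factor: " + slowdown(103, 2167));
    }
}
```

A single wall-clock measurement like this includes JVM warm-up and I/O variance, so it is best read as an order-of-magnitude comparison rather than a precise benchmark.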