Reading a decimal field from a Parquet file in Java


I read a Parquet file as follows:

Builder<GenericRecord> builder = AvroParquetReader.builder(path);
ParquetReader<GenericRecord> reader = builder.build();

GenericRecord record = null;
while((record = reader.read()) != null) {
  System.out.println(record.toString());
}
Any cast to a byte array that I attempt on the value, for example

(byte[]) record.get("var3")
throws

java.lang.ClassCastException: org.apache.avro.generic.GenericData$Fixed cannot be cast to [B
How can I convert this GenericData value back to a decimal?

Parquet file schema:

-bash-4.1$ parquet-tools schema my-parquet-file.gz.parquet
message spark_schema {
optional binary var1 (UTF8);
optional int64 var2;
optional fixed_len_byte_array(16) var3 (DECIMAL(38,8));
}
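
For context, a DECIMAL column stored as fixed_len_byte_array holds the unscaled integer in big-endian two's-complement form, and the scale comes from the schema (8 for var3 here). The following is a minimal sketch of that layout using only the JDK; the class and method names are hypothetical and are not part of Parquet or Avro.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper illustrating the byte layout; not a Parquet or Avro class.
class ParquetDecimalLayout {

    // Decode: interpret the bytes as a big-endian two's-complement integer,
    // then apply the scale declared in the schema.
    static BigDecimal decode(byte[] fixed, int scale) {
        return new BigDecimal(new BigInteger(fixed), scale);
    }

    // Encode: write the unscaled value into a fixed-width array,
    // sign-extending on the left so negative values survive the round trip.
    static byte[] encode(BigDecimal value, int scale, int width) {
        // setScale(int) throws ArithmeticException if rounding would be needed
        byte[] minimal = value.setScale(scale).unscaledValue().toByteArray();
        if (minimal.length > width) {
            throw new ArithmeticException("value does not fit in " + width + " bytes");
        }
        byte[] out = new byte[width];
        byte pad = (byte) (value.signum() < 0 ? 0xFF : 0x00);
        Arrays.fill(out, 0, width - minimal.length, pad);
        System.arraycopy(minimal, 0, out, width - minimal.length, minimal.length);
        return out;
    }
}
```

With this sketch, `decode(encode(new BigDecimal("-123.45"), 8, 16), 8)` gives back the value with eight fractional digits.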

Use a newer version of Avro.


This is exactly what I needed.

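Using the Avro GenericData API:

// simpleGroup is a Group from the Parquet example API (e.g. SimpleGroup); i is the field index
Binary binary = simpleGroup.getBinary(i, 0);
Conversions.DecimalConversion decimalConversions = new Conversions.DecimalConversion();
// The column is fixed_len_byte_array(16), so describe it as a 16-byte FIXED schema rather than DOUBLE
Schema fixedSchema = Schema.createFixed("var3", null, null, 16);
BigDecimal bigDecimal = decimalConversions.fromFixed(
   new GenericData.Fixed(fixedSchema, binary.getBytes()),
   fixedSchema,
   // precision and scale must match the column's DECIMAL(38,8) declaration
   LogicalTypes.decimal(38, 8));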
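
One detail worth checking with this approach is that the scale passed in matches the one declared in the file schema, since the bytes only carry the unscaled integer. Below is a small JDK-only sketch of what a mismatch does, assuming a DECIMAL(38,8) column; the class name is hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical demo class; shows the effect of decoding with the wrong scale.
class ScaleMismatch {

    // Same conversion as used in this thread: bytes -> unscaled integer -> apply scale.
    static BigDecimal decode(byte[] bytes, int scale) {
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    public static void main(String[] args) {
        // Unscaled value as it would be written for 123.45678900 in a DECIMAL(38,8) column
        byte[] stored = new BigInteger("12345678900").toByteArray();

        System.out.println(decode(stored, 8));   // prints 123.45678900 (scale from the schema)
        System.out.println(decode(stored, 10));  // prints 1.2345678900 (wrong scale, off by a factor of 100)
    }
}
```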
I prefer the "Spark way":

Binary binary = simpleGroup.getBinary(i, 0);
// scale is the scale declared in the schema, e.g. 8 for DECIMAL(38,8)
BigDecimal bigDecimal = new BigDecimal(new BigInteger(binary.getBytes()), scale);

Besides the API documentation, could you provide sample code showing how you actually used it to solve the problem?

For reference, the signature of Conversions.DecimalConversion.fromFixed in the Avro API:

public BigDecimal fromFixed(GenericFixed value,
               Schema schema,
               LogicalType type)
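
To tie this back to the AvroParquetReader loop in the question: the ClassCastException shows that record.get("var3") returns a GenericData.Fixed, so one option is to cast to GenericFixed and read its bytes instead of casting to byte[]. This is only a sketch; the class and the `toDecimal` helper are hypothetical names, the Avro library is assumed to be on the classpath, and the scale is assumed to be taken from the schema (8 for var3).

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import org.apache.avro.generic.GenericFixed;

// Hypothetical helper; not part of Avro or Parquet.
class FixedDecimals {

    // Converts the object returned by GenericRecord.get(...) for a
    // fixed-length DECIMAL column into a BigDecimal.
    static BigDecimal toDecimal(Object value, int scale) {
        if (value == null) {
            return null; // the column is declared optional
        }
        // GenericData.Fixed implements GenericFixed, which exposes the raw bytes
        byte[] unscaled = ((GenericFixed) value).bytes();
        return new BigDecimal(new BigInteger(unscaled), scale);
    }
}
```

Inside the question's while loop this would be called as `BigDecimal var3 = FixedDecimals.toDecimal(record.get("var3"), 8);`.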