Writing Parquet files in Apache Crunch (tags: mapreduce, hadoop2, parquet, apache-crunch)


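I am new to Apache Crunch and am looking for a way to read and write Parquet files with it. I have gone through the documentation and the API, but could not find a direct way to do this. Here is what I have tried: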

PCollection<String> pipeLine = MemPipeline.collectionOf("Pineapple", "Banana", "Orange");

PCollection<Integer> b = pipeLine.parallelDo(new DoFn<String, Integer>() {
    private static final long serialVersionUID = 1L;

    @Override
    public void process(String input, Emitter<Integer> emitter) {
        emitter.emit(input.length());
    }
}, ints());

b.write(new AvroParquetFileTarget("D:\\Tutorials\\CCP_WorkSpace\\Crunch\\resources\\output"));


Thanks in advance.

If you have an Avro schema, and a class compiled from that schema whose structure matches your Parquet data, you can read the data like this:

AvroParquetFileSource<MyClassCompiled> avroParquetFileSource =
    new AvroParquetFileSource<MyClassCompiled>(
        new Path(input), Avros.records(MyClassCompiled.class));
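If no compiled class is at hand, a generic Avro record may work as well. The following is only a sketch under assumptions: the class name ParquetToText, the helper RecordToString, and the three command-line arguments are hypothetical, and it assumes that an .avsc schema file matching the Parquet data is available locally.

```java
import java.io.File;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.crunch.MapFn;
import org.apache.crunch.PCollection;
import org.apache.crunch.Pipeline;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.io.parquet.AvroParquetFileSource;
import org.apache.crunch.types.avro.Avros;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Hypothetical example class: reads Parquet as generic Avro records
// and dumps them as text so the result is easy to inspect.
public class ParquetToText {

  // Renders each record using Avro's own toString().
  static class RecordToString extends MapFn<GenericData.Record, String> {
    private static final long serialVersionUID = 1L;

    @Override
    public String map(GenericData.Record record) {
      return record.toString();
    }
  }

  public static void main(String[] args) throws Exception {
    // Assumed arguments: args[0] = local .avsc file, args[1] = Parquet input, args[2] = text output
    Schema schema = new Schema.Parser().parse(new File(args[0]));

    Pipeline pipeline = new MRPipeline(ParquetToText.class, new Configuration());

    // Same source type as in the answer above, but typed with a generic record
    AvroParquetFileSource<GenericData.Record> source =
        new AvroParquetFileSource<GenericData.Record>(new Path(args[1]), Avros.generics(schema));

    PCollection<GenericData.Record> records = pipeline.read(source);
    pipeline.writeTextFile(records.parallelDo(new RecordToString(), Avros.strings()), args[2]);
    pipeline.done();
  }
}
```

The source only describes where and how to read; nothing is actually read until the pipeline runs, which here happens in pipeline.done().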

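Comments:

Please edit your question to include what you have tried and a link to the documentation you followed. Also, paste the code that is not working. :)

PCollection<String> pipeLine = MemPipeline.collectionOf("Pineapple", "Banana", "Orange"); PCollection<Integer> b = pipeLine.parallelDo(new DoFn<String, Integer>() { private static final long serialVersionUID = 1L; @Override public void process(String input, Emitter<Integer> emitter) { emitter.emit(input.length()); } }, ints()); b.write(new AvroParquetFileTarget("D:\\Tutorials\\WorkSpace\\Crunch\\resources\\output")); Thanks for the reply @SagarKulkarni, the above is the code I am trying.

Please edit your question and add the code there, with proper indentation. :)

@SagarKulkarni I have added the code snippet to the question. Sorry for the inconvenience. :)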

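// To write a PCollection of Avro records out as Parquet:
Target parquetFileTarget = new AvroParquetFileTarget(outputPath);
mypcollection.write(parquetFileTarget);  // pass the target here, not the source used for reading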
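As far as I can tell, AvroParquetFileTarget expects a collection typed as Avro records, so a PCollection<Integer> built with ints() is unlikely to be accepted as it stands. Below is a minimal sketch of one way around that, wrapping each value in a generic record before writing. The WordLength schema, the class names, and the use of MRPipeline with a text input are all assumptions made for illustration, not something taken from the question.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.crunch.MapFn;
import org.apache.crunch.PCollection;
import org.apache.crunch.Pipeline;
import org.apache.crunch.PipelineResult;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.io.parquet.AvroParquetFileTarget;
import org.apache.crunch.types.avro.Avros;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Hypothetical example class: turns lines of text into (word, length) records
// and writes them as Parquet.
public class LengthsToParquet {

  // Hypothetical record schema; Parquet output needs a record, not a bare int.
  static final String SCHEMA_JSON =
      "{\"type\":\"record\",\"name\":\"WordLength\",\"namespace\":\"com.example\","
      + "\"fields\":[{\"name\":\"word\",\"type\":\"string\"},"
      + "{\"name\":\"length\",\"type\":\"int\"}]}";

  // Wraps each input string and its length in a generic Avro record.
  static class ToRecord extends MapFn<String, GenericData.Record> {
    private static final long serialVersionUID = 1L;
    private transient Schema schema;

    @Override
    public void initialize() {
      // Parsed here because the function object is serialized and shipped to the tasks.
      schema = new Schema.Parser().parse(SCHEMA_JSON);
    }

    @Override
    public GenericData.Record map(String input) {
      GenericData.Record record = new GenericData.Record(schema);
      record.put("word", input);
      record.put("length", input.length());
      return record;
    }
  }

  public static void main(String[] args) {
    // Assumed arguments: args[0] = text input path, args[1] = Parquet output directory
    Pipeline pipeline = new MRPipeline(LengthsToParquet.class, new Configuration());
    PCollection<String> words = pipeline.readTextFile(args[0]);

    Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
    PCollection<GenericData.Record> records =
        words.parallelDo(new ToRecord(), Avros.generics(schema));

    records.write(new AvroParquetFileTarget(new Path(args[1])));

    PipelineResult result = pipeline.done();
    System.exit(result.succeeded() ? 0 : 1);
  }
}
```

With a compiled Avro class, Avros.records(MyClassCompiled.class) would take the place of Avros.generics(schema), and the map function would build instances of that class instead.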