How do I extract part of an array from an Avro file in Scala?

This is the format of the Avro file I am parsing:

var ttime: Long = 0;
var eTime: Long = 0;
var tids: String = "";
var tlevel: Integer = 0;
var tboot: Long = 0;
var rNo: Integer = 0;
var varType: String = "";
var uids: List[TRUEntry] = Nil;
List[TRUEntry] is the array I am parsing. This is how I do it:

    this.uids = Nil
    row.getAs[Seq[Row]]("uids")
    .foreach((objRow: Row) => 
      this.uids ::= (new TRUEntry(objRow))
    )
This is how I parse the uids:

    this.uids
      .foreach((obj: TRUEntry) => {
        uInfo += obj.uId + " , " + obj.initM.toString() + " , "
      })
How can I extract obj.uId from the array above and pass it along in the following code:

 val avroParsed = avroRow
    .map(x => new TRParser(x))
    .map((obj: TRParser) => ((obj.tids, **obj.uId**),1))

This can be done with the following code:

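val avroParsed = avroRow
    .map(x => new TRParser(x))
    .map((obj: TRParser) => {
      val tId = obj.source.trim
      // Append one "tId,uId" pair per entry, each followed by ":"
      var retVal: String = ""
      obj.uids
        .foreach((entry: TRUEntry) => {
          retVal += tId + "," + entry.uId.trim + ":"
        })
      // Drop the trailing ":" (a no-op when uids is empty)
      retVal.dropRight(1)
    })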

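The tId and uId values are joined into one string of tId,uId pairs separated by ':', which can then be split and processed in a 'for' loop.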
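
If the goal is the ((tids, uId), 1) shape from the question rather than a delimited string, flatMap may be a better fit, since it emits one record per array entry. The following is only a minimal sketch on plain Scala collections: the TRUEntry and TRParser case classes here are hypothetical stand-ins for the question's Row-backed classes, and UidPairs is a name made up for illustration.

```scala
// Hypothetical stand-ins for the question's classes; the real TRUEntry
// and TRParser are constructed from Spark Rows.
final case class TRUEntry(uId: String, initM: Long)
final case class TRParser(tids: String, uids: List[TRUEntry])

object UidPairs {
  // Emit one ((tids, uId), 1) record for every entry in each parser's uids.
  // A parser with an empty uids list contributes no records.
  def pairs(records: Seq[TRParser]): Seq[((String, String), Int)] =
    records.flatMap { rec =>
      val tId = rec.tids.trim
      rec.uids.map(entry => ((tId, entry.uId.trim), 1))
    }
}
```

On an RDD the same function body should work inside avroRow.map(x => new TRParser(x)).flatMap { ... }, because RDD.flatMap also accepts a function that returns a collection. That is untested against the original classes, so treat it as an assumption.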