How to dynamically instantiate an ArrayBuffer[Type] in Scala

scala apache-spark

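I want to create an ArrayBuffer in Scala without fixing its element type at the point of declaration. I want to check a condition first and then pass the type to it dynamically. See the code below: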

def rowGen(startNumber:Int,tableIdentifier:String,NumRows:Int)={
var tmpArrayBuffer:collection.mutable.ArrayBuffer[_]=null  // I tried [T] here. That didn't work either.
tableIdentifier match {
case value if value==baseTable => tmpArrayBuffer= new collection.mutable.ArrayBuffer[(String,String,String,String)]()
case value if value==batchTable => tmpArrayBuffer= new collection.mutable.ArrayBuffer[(String,String)]()
}
for (currentNum <- startNumber to startNumber+NumRows)
tableIdentifier match {
case value if value==baseTable => tmpArrayBuffer+=(s"col1-${currentNum}",s"col2-${currentNum}",s"col3-${currentNum}",s"col4-${currentNum}")
case value if value==batchTable => tmpArrayBuffer+=(s"col1-${currentNum}",s"col2-${currentNum}")
}
tableIdentifier match {
case value if value==baseTable => tmpArrayBuffer.toSeq.toDF("col1","col2","col3","col4")
case value if value==batchTable => tmpArrayBuffer.toSeq.toDF("col1","col2")
}
}

Please help. Depending on the condition, I want to instantiate either ArrayBuffer[(String, String)] or ArrayBuffer[(String, String, String, String)].

I would simply define the array buffer inside the match:

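import org.apache.spark.sql.DataFrame
// toDF needs spark.implicits._ in scope; spark-shell imports it automatically.

val baseTable = "baseTable"
val batchTable = "batchTable"

def rowGen(startNumber: Int, tableIdentifier: String, NumRows: Int): DataFrame = {
    // Backticks match against the existing vals instead of binding new names.
    // Any other identifier throws a MatchError.
    tableIdentifier match {
        case `baseTable` => {
            // The buffer is typed inside the branch, so no dynamic type is needed.
            val tmpArrayBuffer = new collection.mutable.ArrayBuffer[(String, String, String, String)]
            // `to` is inclusive, so this loop appends NumRows + 1 rows.
            for (currentNum <- startNumber to startNumber + NumRows) {
                // Double parentheses: the argument is a single tuple.
                tmpArrayBuffer += ((s"col1-${currentNum}", s"col2-${currentNum}", s"col3-${currentNum}", s"col4-${currentNum}"))
            }
            tmpArrayBuffer.toSeq.toDF("col1", "col2", "col3", "col4")
        }
        case `batchTable` => {
            val tmpArrayBuffer = new collection.mutable.ArrayBuffer[(String, String)]
            for (currentNum <- startNumber to startNumber + NumRows) {
                tmpArrayBuffer += ((s"col1-${currentNum}", s"col2-${currentNum}"))
            }
            tmpArrayBuffer.toSeq.toDF("col1", "col2")
        }
    }
}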

scala> rowGen(1, "batchTable", 5).show
+------+------+
|  col1|  col2|
+------+------+
|col1-1|col2-1|
|col1-2|col2-2|
|col1-3|col2-3|
|col1-4|col2-4|
|col1-5|col2-5|
|col1-6|col2-6|
+------+------+

scala> rowGen(1, "baseTable", 5).show
+------+------+------+------+
|  col1|  col2|  col3|  col4|
+------+------+------+------+
|col1-1|col2-1|col3-1|col4-1|
|col1-2|col2-2|col3-2|col4-2|
|col1-3|col2-3|col3-3|col4-3|
|col1-4|col2-4|col3-4|col4-4|
|col1-5|col2-5|col3-5|col4-5|
|col1-6|col2-6|col3-6|col4-6|
+------+------+------+------+
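
A side note on the output above: both calls ask for 5 rows but print 6, because the loop uses `to`, which includes the upper bound. If exactly NumRows rows are wanted, `until` may be the better fit. A minimal sketch without Spark; the object name `RangeDemo` is made up for illustration:

```scala
object RangeDemo {
  val startNumber = 1
  val numRows = 5

  // `to` includes the upper bound, so this range has numRows + 1 elements.
  val inclusive: List[Int] = (startNumber to startNumber + numRows).toList

  // `until` excludes the upper bound, so this range has exactly numRows elements.
  val exclusive: List[Int] = (startNumber until startNumber + numRows).toList
}
```

Whether the extra row is intended depends on what the caller expects NumRows to mean; the question's own loop uses the same inclusive range.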

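Just use Seq.newBuilder directly; I don't see any benefit to ArrayBuffer here.

Thanks @cchantep and mck for the help. I had assumed we could reuse the same variable in different places, but Seq.newBuilder works fine too.

@Raptor0009 You can, but I don't see the point of dynamically instantiating it beforehand. I simply refactored the code to avoid the need for it, as I did in my answer.

@mck If it can be done, could you give me sample code that passes the data type to ArrayBuffer dynamically? I'm just curious.

@Raptor0009 It's not clear to me how to do that... you would probably need some fancy polymorphism to achieve it.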
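
For what it's worth, one way the Seq.newBuilder suggestion and the "polymorphism" idea might fit together is a small generic helper whose type parameter is chosen by the caller. This is only a sketch, not something posted in the thread: `RowBuilderSketch`, `buildRows`, and `mkRow` are hypothetical names, and Spark is left out so the snippet stands on its own.

```scala
object RowBuilderSketch {
  // Hypothetical helper: builds one row per number in the inclusive range
  // startNumber to startNumber + numRows (same range as the answer's loop).
  // The row type T is fixed at each call site, not at runtime.
  def buildRows[T](startNumber: Int, numRows: Int)(mkRow: Int => T): Seq[T] = {
    val builder = Seq.newBuilder[T]
    for (n <- startNumber to startNumber + numRows) builder += mkRow(n)
    builder.result()
  }

  // Two-column rows, as in the batchTable branch.
  val batchRows: Seq[(String, String)] =
    buildRows(1, 5)(n => (s"col1-$n", s"col2-$n"))

  // Four-column rows, as in the baseTable branch.
  val baseRows: Seq[(String, String, String, String)] =
    buildRows(1, 5)(n => (s"col1-$n", s"col2-$n", s"col3-$n", s"col4-$n"))
}
```

The element type is still decided at compile time in each branch; only the loop is shared. In a Spark session with spark.implicits._ in scope, each typed Seq could presumably then be converted with toDF, as in the answer.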
