How to pass complex external variables, such as a map's value, to a UDF from the driver program in Spark with Java?

Tags: java, scala, apache-spark

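I ran into a big problem when I needed to pass a Java HashMap to a UDF that is defined as a separate class, rather than as an inline lambda function that can access variables of the enclosing scope defined as broadcast variables. I opened an earlier question here for the same purpose: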

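It did not receive a satisfactory answer, because people only offered answers with simple UDFs that can be written as small lambdas and can therefore access broadcast variables from the driver program.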
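
The difference between the two styles can be shown in plain Java, without Spark. This is only a minimal sketch: java.util.function.Function stands in for Spark's UDF interface, and the names CaptureVsClass and LookupFunction are hypothetical. It shows why a lambda can see the map directly while a separately defined class has to be handed the map explicitly (here through a constructor argument, which is an illustration and not the approach taken in the answer below).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CaptureVsClass {

    // Hypothetical stand-in for a UDF written as its own class. It has no
    // enclosing scope to capture from, so the map must be passed in explicitly.
    static class LookupFunction implements Function<String, String> {
        private final Map<String, String> lookup;

        LookupFunction(Map<String, String> lookup) {
            this.lookup = new HashMap<>(lookup); // defensive copy
        }

        @Override
        public String apply(String key) {
            return lookup.getOrDefault(key, "unknown");
        }
    }

    public static void main(String[] args) {
        Map<String, String> testMap = new HashMap<>();
        testMap.put("1", "One");

        // Inline lambda: testMap is captured from the enclosing scope.
        Function<String, String> inline = key -> testMap.getOrDefault(key, "unknown");

        // Separate class: the same map has to be handed over.
        Function<String, String> separate = new LookupFunction(testMap);

        System.out.println(inline.apply("1"));
        System.out.println(separate.apply("1"));
        System.out.println(separate.apply("2"));
    }
}
```

In a real Spark job the function object is also shipped to the executors, which this sketch does not model.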


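As I detailed in the other question, I started looking into typedLit, which seems to me to be the way forward, but there is hardly any documentation for this method in Java, although examples and tutorials exist for Scala. So my question is: how can typedLit be used to pass the value of a complex variable to a UDF?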

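I found the answer to this question by a long and hard route, and I am posting it here to help others who may face the same problem.

The official Spark Javadocs give the typedLit method definition as follows: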

typedLit(T literal, scala.reflect.api.TypeTags.TypeTag<T> evidence$1)

To use this object (the Scala TypeDefs object referred to below) in my Java Maven project, I followed the structure given in this blog:

The dependency and plugin I had to include in the pom are as follows:

<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.11.7</version>
</dependency>
<plugin>
    <groupId>net.alchim31.maven</groupId>
    <artifactId>scala-maven-plugin</artifactId>
    <version>3.2.2</version>
    <executions>
        <execution>
            <id>compile</id>
            <goals>
                <goal>compile</goal>
            </goals>
            <phase>compile</phase>
        </execution>
        <execution>
            <id>test-compile</id>
            <goals>
                <goal>testCompile</goal>
            </goals>
            <phase>test-compile</phase>
        </execution>
        <execution>
            <phase>process-resources</phase>
            <goals>
                <goal>compile</goal>
            </goals>
        </execution>
    </executions>
</plugin>

I defined a dummy map to send to my UDF:

Map<String, String> testMap = new HashMap<>();
testMap.put("1", "One");

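I could not send the MapString val to the UDF directly, because the compiler kept complaining that it has private access in TypeDefs. Through the link, I found out that from Java a Scala val is accessed through a method call (like a getter), not directly through the val itself.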
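
The compiler error can be reproduced in plain Java. The sketch below is only an analogy: TypeDefsLike is a hypothetical hand-written class that roughly imitates what Java sees for a Scala val (a private field plus a public method of the same name), and the String value is a placeholder for the real TypeTag.

```java
// Hypothetical Java stand-in for a Scala object that holds a val.
// The backing field is private; a method with the same name exposes it.
final class TypeDefsLike {
    private static final String MapString = "tag for Map[String, String]";

    private TypeDefsLike() {
    }

    static String MapString() {
        return MapString;
    }
}

public class ValAccessSketch {
    public static void main(String[] args) {
        // String tag = TypeDefsLike.MapString;  // does not compile: private access
        String tag = TypeDefsLike.MapString();    // compiles: goes through the accessor
        System.out.println(tag);
    }
}
```

Under that assumption, the Java side would call the accessor with parentheses instead of referring to the field name.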

I defined TestUDF as follows:

import java.util.Map;

import org.apache.spark.sql.api.java.UDF1;

import scala.collection.JavaConverters;

public class TestUDF implements UDF1<scala.collection.immutable.Map<String, String>, String> {

    @Override
    public String call(scala.collection.immutable.Map<String, String> t1) throws Exception {
        System.out.println(t1);
        // Convert the Scala immutable Map passed in by Spark to a java.util.Map.
        Map<String, String> javaMap = JavaConverters.mapAsJavaMapConverter(t1).asJava();
        System.out.println("Value of 1: " + javaMap.get("1"));
        return null; // placeholder result; this test UDF only prints
    }
}


This finally worked, and I could access the map from my UDF.

This helped me a lot, thanks. However, I ran into a problem caused by a Scala version incompatibility: the UDF received an immutable map, but the scalaMap I had defined was a mutable one. I am noting this scenario here in case people face the same problem in the future.