Java: How do I make Apache Spark ignore dots in a query?


Given the following JSON file:

[{"dog*woof":"bad dog 1","dog.woof":"bad dog 32"}]
Why does this Java code fail:

DataFrame df = sqlContext.read().json("dogfile.json");
df.groupBy("dog.woof").count().show();
while this does not:

DataFrame df = sqlContext.read().json("dogfile.json");
df.groupBy("dog*woof").count().show();
Here is a snippet of the failure:

 Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'dog.woof' given input columns: [dog*woof, dog.woof];
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:60)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
...

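It fails because a dot is used to access the fields of a struct column, so "dog.woof" is read as the field woof of a column named dog, which does not exist. You can escape the column name with backticks: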

val df = sqlContext.read.json(sc.parallelize(Seq(
   """{"dog*woof":"bad dog 1","dog.woof":"bad dog 32"}"""
)))

df.groupBy("`dog.woof`").count.show
// +----------+-----+
// |  dog.woof|count|
// +----------+-----+
// |bad dog 32|    1|
// +----------+-----+
However, using special characters in names is not good practice, and in general such names are hard to work with.
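
Since the question is in Java, the same backtick escaping can be sketched there as well. The helper below is hypothetical (the class name ColumnNames and the method quoteColumn are not part of Spark); it only builds the escaped string, and it assumes the name itself contains no backtick, because how an embedded backtick should be escaped may depend on the Spark version.

```java
public class ColumnNames {
    // Hypothetical helper, not a Spark API: wraps a column name in backticks
    // so that a dot is treated as part of the name instead of struct field access.
    public static String quoteColumn(String name) {
        if (name.contains("`")) {
            // Assumption: embedded backticks are out of scope for this sketch.
            throw new IllegalArgumentException(
                "column name contains a backtick: " + name);
        }
        return "`" + name + "`";
    }

    public static void main(String[] args) {
        // Prints the escaped form of the column name from the question.
        System.out.println(quoteColumn("dog.woof"));
        // With the DataFrame from the question, the call would then read:
        // df.groupBy(quoteColumn("dog.woof")).count().show();
    }
}
```

For a single column, writing the literal directly, as in df.groupBy("`dog.woof`"), is equivalent; a helper like this is mainly useful when column names come from the data rather than from the source code.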