Scala Spark: error: missing parameter type for `map`
Spark 2.1, Scala: I am converting GDELT data into GraphX format. However, the example below fails when creating hash values with MurmurHash3. I don't know enough about Scala's type system to diagnose the error message.
val eventsFromTo = gdelt.select("Actor1Name","Actor2Name").where("actor1Name is not null and actor2name is not null")
eventsFromTo.show(5)
+-------------+----------+
| Actor1Name|Actor2Name|
+-------------+----------+
| SENATE| RUSSIAN|
| MEXICO| TEXAS|
| RUSSIAN| SENATE|
| VERMONT| CANADA|
|UNITED STATES| POLICE|
+-------------+----------+
only showing top 5 rows
val eventActors = gdelt.select("Actor1Name","Actor2Name").where("actor1Name is not null and actor2name is not null").flatMap(x => Iterable(x(0).toString,x(1).toString))
eventActors.show(5)
+-------+
| value|
+-------+
| SENATE|
|RUSSIAN|
| MEXICO|
| TEXAS|
|RUSSIAN|
+-------+
I then tried to convert this to GraphX vertices:
val eventVertices: RDD[(VertexId, String)] = eventActors.distinct().map(x => (MurmurHash3.stringHash((x),x)))
<console>:265: error: missing parameter type
val eventVertices: RDD[(VertexId, String)] = eventActors.distinct().map(x => (MurmurHash3.stringHash((x),x)))
If I add the type, I get the following error instead:
<console>:265: error: type mismatch;
found : String
required: Int
val eventVertices: RDD[(VertexId, String)] = eventActors.distinct().map((x:String) => (MurmurHash3.stringHash((x),x)))
I was missing `.rdd` to convert them into RDDs before calling map():
val eventsFromTo = gdelt.select("Actor1Name","Actor2Name").where("actor1Name is not null and actor2name is not null").rdd
val eventActors = gdelt.select("Actor1Name","Actor2Name").where("actor1Name is not null and actor2name is not null").flatMap(x => Iterable(x(0).toString,x(1).toString)).rdd
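Beyond the missing `.rdd`, the original map also had its arguments in the wrong place: `MurmurHash3.stringHash(str, seed)` takes an Int seed as its second parameter, which is why passing the string again produced `found: String / required: Int`. A minimal sketch of the vertex construction, assuming `eventActors` is the `RDD[String]` built above:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.graphx.VertexId
import scala.util.hashing.MurmurHash3

// Build (VertexId, name) pairs: the hash is the vertex id, the name is the payload.
// stringHash returns an Int; VertexId is an alias for Long, so widen explicitly.
val eventVertices: RDD[(VertexId, String)] =
  eventActors.distinct().map(x => (MurmurHash3.stringHash(x).toLong, x))
```

Keeping the hash as the first tuple element (rather than a second argument to `stringHash`) is what satisfies the `(VertexId, String)` element type that GraphX expects for a vertex RDD.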