Apache Spark, ADAM and Zeppelin
I am trying to do genome analysis with ADAM and Zeppelin. I'm not sure I'm doing it right, but I ran into the following problem:
%dep
z.reset()
z.addRepo("Spark Packages Repo").url("http://dl.bintray.com/spark-packages/maven")
z.load("com.databricks:spark-csv_2.10:1.2.0")
z.load("mysql:mysql-connector-java:5.1.35")
z.load("org.bdgenomics.adam:adam-core_2.10:0.20.0")
z.load("org.bdgenomics.adam:adam-cli_2.10:0.20.0")
z.load("org.bdgenomics.adam:adam-apis_2.10:0.20.0")
%spark
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.rdd.ADAMContext
import org.bdgenomics.adam.projections.{ AlignmentRecordField, Projection }
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.bdgenomics.adam.rdd.ADAMContext
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.projections.Projection
import org.bdgenomics.adam.projections.AlignmentRecordField
import scala.io.Source
import org.apache.spark.rdd.RDD
import org.bdgenomics.formats.avro.Genotype
import scala.collection.JavaConverters._
import org.bdgenomics.formats.avro._
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.linalg.{ Vector => MLVector, Vectors }
import org.apache.spark.mllib.clustering.{ KMeans, KMeansModel }
val ac = new ADAMContext(sc)
I get the following output, ending in an error:
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.rdd.ADAMContext
import org.bdgenomics.adam.projections.{AlignmentRecordField, Projection}
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.bdgenomics.adam.rdd.ADAMContext
import org.bdgenomics.adam.rdd.ADAMContext._
import org.bdgenomics.adam.projections.Projection
import org.bdgenomics.adam.projections.AlignmentRecordField
import scala.io.Source
import org.apache.spark.rdd.RDD
import org.bdgenomics.formats.avro.Genotype
import scala.collection.JavaConverters._
import org.bdgenomics.formats.avro._
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.linalg.{Vector=>MLVector, Vectors}
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
res7: org.apache.spark.SparkContext = org.apache.spark.SparkContext@62ec8142
<console>:188: error: constructor ADAMContext in class ADAMContext cannot be accessed in class $iwC
new ADAMContext(sc)
                    ^

Any idea where to look? Am I missing a dependency?
According to the ADAM source for the version you are using, the constructor is private:
class ADAMContext private (@transient val sc: SparkContext)
extends Serializable with Logging {
...
}
You can use it like this:
import org.bdgenomics.adam.rdd.ADAMContext._
val adamContext: ADAMContext = z.sc
It will use the implicit conversion defined in the ADAMContext companion object:
object ADAMContext {
implicit def sparkContextToADAMContext(sc: SparkContext): ADAMContext =
new ADAMContext(sc)
}
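The mechanism here is a standard Scala pattern: a private constructor that callers can only reach through an implicit conversion defined in the companion object. A minimal self-contained sketch of the same pattern, with no Spark or ADAM dependencies (the names `Wrapped` and `stringToWrapped` are made up purely for illustration):

```scala
import scala.language.implicitConversions

// A class with a private constructor, like ADAMContext:
// callers cannot write `new Wrapped(...)` themselves.
class Wrapped private (val underlying: String) {
  def describe: String = s"wrapped(${underlying})"
}

object Wrapped {
  // The companion object can still call the private constructor,
  // and exposes it via an implicit conversion -- the same trick
  // ADAMContext uses for SparkContext => ADAMContext.
  implicit def stringToWrapped(s: String): Wrapped = new Wrapped(s)
}

// Because the implicit lives in the companion object of the target
// type, no import is needed: a type ascription alone triggers it.
val w: Wrapped = "hello"
```

This is why `val adamContext: ADAMContext = sc` works even though `new ADAMContext(sc)` does not: the compiler searches the target type's companion object for an implicit conversion from `SparkContext`.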
It worked without using the z reference:
val ac:ADAMContext = sc
val genotypes: RDD[Genotype] = ac.loadGenotypes("/tmp/ADAM2").rdd
Output:
ac: org.bdgenomics.adam.rdd.ADAMContext = org.bdgenomics.adam.rdd.ADAMContext@2c60ef7e
genotypes:
org.apache.spark.rdd.RDD[org.bdgenomics.formats.avro.Genotype] = MapPartitionsRDD[3] at map at ADAMContext.scala:207
I had tried this at the adam-shell prompt earlier, and I don't recall having to use the implicit conversion there. That was with ADAM version 0.19, though.
I tried it, but the object appears to be null:

%spark
val ac: ADAMContext = sc

ac: org.bdgenomics.adam.rdd.ADAMContext = null