Setting up Scala and Spark in IntelliJ


I have IntelliJ IDEA Community Edition and need help configuring Apache Spark for it. I want to pull data from Google Analytics via Scala, using the crealytics connector.

My build.sbt:

name := "scala-project-test"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1",
  "org.apache.spark" %% "spark-sql" % "1.6.1",
  "com.crealytics" % "spark-google-analytics_2.11" % "0.8.1"
)
and my Scala code:

import org.apache.spark.sql.SQLContext

object analytics {


  val sqlContext = new SQLContext(sc)
  val df = sqlContext.read
    .format("com.crealytics.google.analytics")
    .option("serviceAccountId", "xxxxxxxxx@developer.gserviceaccount.com")
    .option("keyFileLocation", "/Users/userABC/IdeaProjects/scala-project-test/xxxx.p12")
    .option("ids", "ga:xxxxxx")
    .option("startDate", "7daysAgo")
    .option("endDate", "yesterday")
    .option("dimensions", "date,browser,city")
    .option("queryIndividualDays", "true")
    .load()

  df.select("browser", "users").show()
}
When I run the analytics object, I get this error: not found: value sc

I think there is something wrong with the Spark configuration, because sc should be the SparkContext, but I don't know where it is supposed to come from.

Any hints?

Short answer:

  val conf = createSparkConf
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)

Usually I use a SparkContextFactory:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SparkContextFactory {

  val conf = createSparkConf
  val sc = new SparkContext(conf)

  def createSparkSQLContext: SQLContext = {
    // A second context is created here, which is why
    // spark.driver.allowMultipleContexts is set below
    val sc = new SparkContext(conf)
    val sQLContext = new SQLContext(sc)
    sQLContext
  }

  def stopSparkContext() = sc.stop()

  private def createSparkConf: SparkConf = {
    // ConfigLoader is a helper that loads the application config
    val conf = ConfigLoader.conf

    val masterPath = conf.getString("spark.master.path")

    new SparkConf()
      .setMaster(masterPath).setAppName("SparkContext")
      .set( "spark.driver.allowMultipleContexts" , "true")
  }
}
In the config file you need to set the master URL; this is what it looks like for a local box:

spark.master.path="local[*]"
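ConfigLoader itself isn't shown in this answer; a minimal sketch, assuming it is just a thin wrapper around Typesafe Config (so that -D system properties can still override the file), could look like this:

import com.typesafe.config.{Config, ConfigFactory}

// Hypothetical helper, not part of the original answer: loads
// application.conf from the classpath; system properties such as
// -Dspark.master.path=... take precedence over the file.
object ConfigLoader {
  val conf: Config = ConfigFactory.load()
}

With that in place, conf.getString("spark.master.path") in createSparkConf resolves to local[*].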
Also,

set( "spark.driver.allowMultipleContexts" , "true")

is only there for local testing. Using the factory then looks like this:

val sqlc = SparkContextFactory.createSparkSQLContext
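Tying this back to the question, the analytics object could then be written roughly as follows (the service account, key file, and view ID stay as the question's placeholders):

import org.apache.spark.sql.SQLContext

object analytics {

  def main(args: Array[String]): Unit = {
    // Get an SQLContext from the factory instead of the undefined sc
    val sqlContext: SQLContext = SparkContextFactory.createSparkSQLContext

    val df = sqlContext.read
      .format("com.crealytics.google.analytics")
      .option("serviceAccountId", "xxxxxxxxx@developer.gserviceaccount.com")
      .option("keyFileLocation", "/Users/userABC/IdeaProjects/scala-project-test/xxxx.p12")
      .option("ids", "ga:xxxxxx")
      .option("startDate", "7daysAgo")
      .option("endDate", "yesterday")
      .option("dimensions", "date,browser,city")
      .option("queryIndividualDays", "true")
      .load()

    df.select("browser", "users").show()
  }
}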

Also, if you don't want to pass a VM argument or a config-file entry like spark.master.path="local[*]", you can specify the configuration directly when creating the SparkConf: val conf = new SparkConf().setAppName("App Name").setMaster("local").
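A self-contained local run without any config file might then look like this sketch (the app name and the local[*] master are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object LocalSparkExample {
  def main(args: Array[String]): Unit = {
    // Everything configured in code: no VM argument, no config file
    val conf = new SparkConf().setAppName("App Name").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // ... use sqlContext.read exactly as in the question ...

    sc.stop()
  }
}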