Setting up Scala and Spark in IntelliJ
I have IntelliJ IDEA Community Edition and need help configuring Apache Spark for IntelliJ. I want to fetch data from Google Analytics via Scala, using the crealytics connector. My build.sbt:
name := "scala-project-test"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.6.1",
"org.apache.spark" %% "spark-sql" % "1.6.1",
"com.crealytics" % "spark-google-analytics_2.11" % "0.8.1"
)
and my Scala code:
import org.apache.spark.sql.SQLContext

object analytics {
  val sqlContext = new SQLContext(sc)

  val df = sqlContext.read
    .format("com.crealytics.google.analytics")
    .option("serviceAccountId", "xxxxxxxxx@developer.gserviceaccount.com")
    .option("keyFileLocation", "/Users/userABC/IdeaProjects/scala-project-test/xxxx.p12")
    .option("ids", "ga:xxxxxx")
    .option("startDate", "7daysAgo")
    .option("endDate", "yesterday")
    .option("dimensions", "date,browser,city")
    .option("queryIndividualDays", "true")
    .load()

  df.select("browser", "users").show()
}
When I run the analytics object, I get the error:
not found: value sc
I think the problem is in the Spark configuration, because sc is the SparkContext, but I don't know where it is supposed to come from. Any hints?
Short answer:
val conf = createSparkConf
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
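Applied to the question's code, a minimal sketch might look like the following (the crealytics options are the placeholders from the question; extending App and using the master URL "local[*]" are assumptions for running it directly from IntelliJ):
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object analytics extends App {
  // Create the context explicitly instead of relying on a pre-existing `sc`
  val conf = new SparkConf().setAppName("scala-project-test").setMaster("local[*]")
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)

  val df = sqlContext.read
    .format("com.crealytics.google.analytics")
    .option("serviceAccountId", "xxxxxxxxx@developer.gserviceaccount.com")
    .option("keyFileLocation", "/Users/userABC/IdeaProjects/scala-project-test/xxxx.p12")
    .option("ids", "ga:xxxxxx")
    .option("startDate", "7daysAgo")
    .option("endDate", "yesterday")
    .option("dimensions", "date,browser,city")
    .option("queryIndividualDays", "true")
    .load()

  df.select("browser", "users").show()

  // Release the local context when done
  sc.stop()
}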
Usually I use a SparkContextFactory like this:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SparkContextFactory {

  val conf = createSparkConf
  val sc = new SparkContext(conf)

  def createSparkSQLContext: SQLContext = {
    val sc = new SparkContext(conf)
    val sQLContext = new SQLContext(sc)
    sQLContext
  }

  def stopSparkContext() = sc.stop()

  private def createSparkConf: SparkConf = {
    // ConfigLoader is my own wrapper around the application config
    val conf = ConfigLoader.conf
    val masterPath = conf.getString("spark.master.path")
    new SparkConf()
      .setMaster(masterPath)
      .setAppName("SparkContext")
      .set("spark.driver.allowMultipleContexts", "true")
  }
}
In the config file you need to set a master URL; for a local box it looks like this:
spark.master.path="local[*]"
And the
set("spark.driver.allowMultipleContexts", "true")
setting is only there for local testing. I use the factory like this:
val sqlc = SparkContextFactory.createSparkSQLContext
Also, if you don't want to supply a VM argument or a config file entry such as spark.master.path="local[*]", you can specify the configuration directly when creating the SparkConf: val conf = new SparkConf().setAppName("App Name").setMaster("local").
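Put together, a minimal sketch of that all-in-code setup (the app name and master value are just the placeholders from the sentence above):
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Everything is configured in code, so no VM arguments or external config file are needed
val conf = new SparkConf().setAppName("App Name").setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)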