What configuration does Java Spark need to fetch data from Object Storage over Swift?

Tags: java, apache-spark, openstack-swift, object-storage, stocator

I have read through the documentation carefully, but how to actually fetch data from Swift is still very confusing to me.

I have Swift configured on one of my Linux machines. Using the command below, I can get the list of containers:

swift -A <auth-url> -U <username> -K <key> list

I have read many Bluemix blog posts () and wrote the following code:

sc.textFile("swift://container.myacct/file.xml")
I want to integrate this into Java Spark, where the Object Storage credentials need to be configured in Java code. Is there any sample code or blog post for this?

That notebook illustrates several ways to load the data using the Scala language. Scala runs on the JVM, and Java and Scala classes can be freely mixed, whether they live in different projects or in the same one. Understanding the mechanics of how the Scala code interacts with OpenStack Swift object storage will help guide you in designing a Java equivalent.

From the notebook above, here are the steps showing how to configure and extract data from an OpenStack Swift Object Storage instance using Scala. The swift URL decomposes as:

swift2d :// container . myacct / filename.extension
  ^            ^          ^            ^
stocator     name of   namespace    object storage
protocol     container               filename
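Since the goal is a Java port, the same URL anatomy can be checked with a minimal Java sketch using `java.net.URI`. This is illustrative only; the container name `notebooks` and service name `sparksql` are the ones used in the notebook code further down.

```java
import java.net.URI;

public class SwiftUrlAnatomy {
    public static void main(String[] args) {
        // swift2d://<container>.<service-name>/<object>
        URI uri = URI.create("swift2d://notebooks.sparksql/file.xml");

        String authority = uri.getHost();                                    // "notebooks.sparksql"
        String container = authority.substring(0, authority.indexOf('.'));   // "notebooks"
        String service   = authority.substring(authority.indexOf('.') + 1);  // "sparksql"
        String object    = uri.getPath().substring(1);                       // "file.xml"

        System.out.println(container + " / " + service + " / " + object);
    }
}
```

The part after the dot (`sparksql` here) is the datasource name that ties the URL back to the `fs.swift2d.service.<name>.*` configuration keys set below.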
Imports | Sample credentials | Helper method | Load data
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import scala.util.control.NonFatal
import play.api.libs.json.Json

val sqlctx = new SQLContext(sc)
val scplain = sqlctx.sparkContext
// @hidden_cell
var credentials = scala.collection.mutable.HashMap[String, String](
  "auth_url"->"https://identity.open.softlayer.com",
  "project"->"object_storage_3xxxxxx3_xxxx_xxxx_xxxx_xxxxxxxxxxxx",
  "project_id"->"6xxxxxxxxxx04fxxxxxxxxxx6xxxxxx7",
  "region"->"dallas",
  "user_id"->"cxxxxxxxxxxaxxxxxxxxxx1xxxxxxxxx",
  "domain_id"->"cxxxxxxxxxxaxxyyyyyyxx1xxxxxxxxx",
  "domain_name"->"853255",
  "username"->"Admin_cxxxxxxxxxxaxxxxxxxxxx1xxxxxxxxx",
  "password"->"""&M7372!FAKE""",
  "container"->"notebooks",
  "tenantId"->"undefined",
  "filename"->"file.xml"
)
def setRemoteObjectStorageConfig(name:String, sc: SparkContext, dsConfiguration:String) : Boolean = {
    try {
        val result = scala.util.parsing.json.JSON.parseFull(dsConfiguration)
        result match {
            case Some(e:Map[String,String]) => {
                val prefix = "fs.swift2d.service." + name
                val hconf = sc.hadoopConfiguration
                hconf.set("fs.swift2d.impl","com.ibm.stocator.fs.ObjectStoreFileSystem")
                hconf.set(prefix + ".auth.url", e("auth_url") + "/v3/auth/tokens")
                hconf.set(prefix + ".tenant", e("project_id"))
                hconf.set(prefix + ".username", e("user_id"))
                hconf.set(prefix + ".password", e("password"))
                hconf.set(prefix + ".auth.method", "keystoneV3")
                hconf.set(prefix + ".region", e("region"))
                hconf.setBoolean(prefix + ".public", true)
                println("Successfully modified sparkcontext object with remote Object Storage Credentials using datasource name " + name)
                println("")
                return true
            }
            case None => println("Failed.")
                return false
        }
    }
    catch {
       case NonFatal(exc) => println(exc)
           return false
    }
}
val setObjStor = setRemoteObjectStorageConfig("sparksql", scplain, Json.toJson(credentials.toMap).toString)
val data_rdd = scplain.textFile("swift2d://notebooks.sparksql/" + credentials("filename"))
data_rdd.take(5)
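A Java equivalent of the helper above can be sketched as follows. This is a minimal sketch, not the notebook's own code: it assembles the same Stocator/Hadoop property names into a plain `Map` so the key layout is visible without a Spark dependency. In a real Java Spark job you would copy each entry into `sc.hadoopConfiguration()` with `set(key, value)` and then call `sc.textFile("swift2d://notebooks.sparksql/file.xml")`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Java counterpart of setRemoteObjectStorageConfig.
// It only builds the Stocator configuration entries; in a Spark job,
// write each one into JavaSparkContext.hadoopConfiguration() with set().
public class StocatorConf {
    static Map<String, String> build(String name, Map<String, String> creds) {
        Map<String, String> conf = new HashMap<>();
        String prefix = "fs.swift2d.service." + name;
        conf.put("fs.swift2d.impl", "com.ibm.stocator.fs.ObjectStoreFileSystem");
        conf.put(prefix + ".auth.url", creds.get("auth_url") + "/v3/auth/tokens");
        conf.put(prefix + ".tenant", creds.get("project_id"));
        conf.put(prefix + ".username", creds.get("user_id"));
        conf.put(prefix + ".password", creds.get("password"));
        conf.put(prefix + ".auth.method", "keystoneV3");
        conf.put(prefix + ".region", creds.get("region"));
        conf.put(prefix + ".public", "true");
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> creds = new HashMap<>();
        creds.put("auth_url", "https://identity.open.softlayer.com");
        creds.put("project_id", "my-project-id");
        creds.put("user_id", "my-user-id");
        creds.put("password", "my-password");
        creds.put("region", "dallas");

        Map<String, String> conf = build("sparksql", creds);
        System.out.println(conf.get("fs.swift2d.service.sparksql.auth.url"));
    }
}
```

Wiring it into a Java Spark job is then a one-liner, e.g. `StocatorConf.build("sparksql", creds).forEach((k, v) -> jsc.hadoopConfiguration().set(k, v));` on a `JavaSparkContext jsc`, after which the `swift2d://` URL scheme resolves exactly as in the Scala notebook.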