Python: Installing com.databricks.spark.xml on an EMR cluster


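Does anyone know how to install the com.databricks.spark.xml package on an EMR cluster?

I connected to the EMR master node successfully, but I don't know how to install packages on the EMR cluster.

Code: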

sc.install_pypi_package("com.databricks.spark.xml")
On the EMR master node:

cd /usr/lib/spark/jars
sudo wget https://repo1.maven.org/maven2/com/databricks/spark-xml_2.11/0.9.0/spark-xml_2.11-0.9.0.jar
Make sure you choose the correct JAR for your Spark version, following the guidelines provided for the package.
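The URL in the wget command above follows Maven Central's usual layout, so the Scala binary version and the package version can be substituted mechanically. A minimal sketch, assuming that layout holds for the version you need; the function name `spark_xml_jar_url` is hypothetical, and the versions you pass in must actually be published and must match your cluster's Spark build:

```python
# Base of the Maven Central repository, taken from the wget command above.
MAVEN_BASE = "https://repo1.maven.org/maven2"


def spark_xml_jar_url(scala_version: str, package_version: str) -> str:
    """Build the download URL for a spark-xml JAR (hypothetical helper).

    scala_version   -- Scala binary version of the Spark build, e.g. "2.11"
    package_version -- spark-xml release, e.g. "0.9.0"
    """
    artifact = f"spark-xml_{scala_version}"
    return (
        f"{MAVEN_BASE}/com/databricks/{artifact}/{package_version}/"
        f"{artifact}-{package_version}.jar"
    )


if __name__ == "__main__":
    # Reproduces the URL used in the wget command above.
    print(spark_xml_jar_url("2.11", "0.9.0"))
```

In a running session, `spark.version` reports the Spark version, which helps when checking which JAR applies.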

Then start a Jupyter notebook, and you should be able to run the following:

df = spark.read.format('com.databricks.spark.xml').options(rootTag='objects').options(rowTag='object').load("s3://bucket-name/sample.xml")
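If you read several XML files this way, the call above can be wrapped in a small helper. This is only a sketch: the name `read_xml` and its parameters are hypothetical, it expects an existing SparkSession to be passed in, and it simply mirrors the options used in the line above.

```python
# Data source name used in the read call above.
XML_FORMAT = "com.databricks.spark.xml"


def read_xml(spark, path, root_tag, row_tag):
    """Load an XML file into a DataFrame (hypothetical helper).

    spark    -- an existing SparkSession (e.g. the one in the notebook)
    path     -- location of the XML file, e.g. an s3:// URI
    root_tag -- passed through as the rootTag option
    row_tag  -- passed through as the rowTag option (one row per element)
    """
    return (
        spark.read.format(XML_FORMAT)
        .options(rootTag=root_tag, rowTag=row_tag)
        .load(path)
    )
```

With the JAR in place, `df = read_xml(spark, "s3://bucket-name/sample.xml", "objects", "object")` should be equivalent to the one-liner above.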