Problem running an AWS Glue job locally with PySpark
I am trying to run a Glue job locally, but when I run the script the following exception is raised:
py4j.protocol.Py4JJavaError: An error occurred while calling o47.getDynamicFrame.
: java.lang.IllegalAccessError: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.<init>(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation
It would be great if anyone could help. I have set up Glue locally using Docker, following a guide. This is the script I am running:
from pyspark.sql import SparkSession
from awsglue.context import GlueContext

# Build a local Spark session with the Glue ETL jar on the classpath
spark = SparkSession \
    .builder \
    .appName("GlueSparkJobExample") \
    .config("spark.jars", "AWSGlueETLPython-1.0.0-jar-with-dependencies.jar") \
    .config("spark.local.dir", "/tmp") \
    .getOrCreate()

sc = spark.sparkContext
glueContext = GlueContext(sc)

# Read a table from the Glue Data Catalog into a DynamicFrame
db = "database"
table = "table"
my_df = glueContext.create_dynamic_frame.from_catalog(
    database=db, table_name=table)
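For context, an `IllegalAccessError` on `MutableCounterLong.<init>` thrown from `S3AInstrumentation` usually means the `hadoop-aws` jar on the classpath was compiled against a different `hadoop-common` version than the one the local Spark distribution bundles. One possible direction is to pin `hadoop-aws` to the exact Hadoop version Spark ships with. This is a minimal sketch, not a confirmed fix; the version number below is an assumption and must be replaced with the Hadoop version found in your own `$SPARK_HOME/jars`:

```python
from pyspark.sql import SparkSession

# Sketch: align hadoop-aws with the Hadoop jars bundled by the local Spark.
# Check the bundled version first, e.g.:  ls $SPARK_HOME/jars | grep hadoop
# "3.3.4" below is a placeholder assumption, not a known-good value.
spark = SparkSession \
    .builder \
    .appName("GlueSparkJobExample") \
    .config("spark.jars", "AWSGlueETLPython-1.0.0-jar-with-dependencies.jar") \
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.3.4") \
    .config("spark.local.dir", "/tmp") \
    .getOrCreate()
```

If the two versions disagree (for example, an older `hadoop-aws` pulled in by the Glue dependency jar shadowing Spark's newer `hadoop-common`), the S3A filesystem can fail exactly as in the traceback above.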