I ran into a problem using pyspark in Jupyter


The problem occurs when loading the file:

from pyspark.ml.classification import NaiveBayes  
from pyspark.ml.evaluation import MulticlassClassificationEvaluator 
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder \
         .appName("Mytest") \
         .config("spark.some.config.option","some-value") \
         .getOrCreate()
# Load training data  
data = spark.read.format("libsvm").load("/test/test.txt")  

Py4JJavaError                             Traceback (most recent call last)
      6 spark = SparkSession.builder.appName("Mytest").config("spark.some.config.option","some-value").getOrCreate()
      7 # Load training data
----> 8 data = spark.read.format("libsvm").load("/test/test.txt")
      9 # Split the data into train and test
     10 splits = data.randomSplit([0.6, 0.4], 1234)

Py4JJavaError: An error occurred while calling o323.load.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 3.0 failed 1 times, most recent failure: Lost task 2.0 in stage 3.0 (TID 14, localhost, executor driver): java.lang.NumberFormatException: For input string:
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
    at java.lang.Double.parseDouble(Double.java:538)
    at scala.collection.immutable.StringLike$class.toDouble(StringLike.scala:284)

Comment: Please add the code from your comment to your post.
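
The java.lang.NumberFormatException raised from toDouble suggests that some line of /test/test.txt is not in the "label index:value index:value ..." form that the libsvm reader expects. One way to narrow this down is to scan the file in plain Python before handing it to Spark. The sketch below is only an approximation of the reader's rules (space-separated tokens, numeric label, one-based ascending indices, numeric values); the function name find_bad_libsvm_lines and the sample lines are hypothetical, and Python's float() is somewhat more lenient than Java's Double.parseDouble, so a clean result here does not guarantee that Spark will accept the file.

```python
def find_bad_libsvm_lines(lines):
    """Return (line_number, reason) pairs for lines that do not look like
    'label index:value index:value ...' (hypothetical helper, not a Spark API)."""
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Blank lines and '#' comment lines are skipped rather than reported.
        if not line or line.startswith("#"):
            continue
        # Split on single spaces only, so tab-separated fields show up as errors.
        tokens = line.split(" ")
        try:
            float(tokens[0])
        except ValueError:
            problems.append((lineno, f"label is not a number: {tokens[0]!r}"))
            continue
        previous_index = 0
        for token in tokens[1:]:
            if not token:  # ignore empty tokens produced by repeated spaces
                continue
            index_text, sep, value_text = token.partition(":")
            try:
                if not sep:
                    raise ValueError("missing ':'")
                index = int(index_text)
                float(value_text)
            except ValueError:
                problems.append((lineno, f"bad feature token: {token!r}"))
                break
            # Indices are assumed to be one-based and strictly ascending.
            if index <= previous_index:
                problems.append(
                    (lineno, f"index not one-based and ascending: {token!r}")
                )
                break
            previous_index = index
    return problems


# Made-up sample lines; for a real file, pass an open file object instead,
# e.g. find_bad_libsvm_lines(open(path)) with path set to your data file.
sample = ["1 1:0.5 3:1.2", "0 2:abc", "x 1:1.0", "1 3:1.0 2:2.0"]
for lineno, reason in find_bad_libsvm_lines(sample):
    print(lineno, reason)
```

If the scan reports lines, fixing or removing them (or converting the file to libsvm format if it is actually CSV or tab-separated) is a reasonable next step before calling spark.read.format("libsvm").load(...) again.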