How to set the optimizer for logistic regression in Apache Spark MLlib using Python
I have just started running some tests on Apache Spark MLlib:
import numpy
from pyspark.mllib.classification import LogisticRegressionWithSGD

def mapper(line):
    # Move the label from the last column to the first, then convert every field to float.
    feats = line.strip().split(',')
    label = feats[len(feats)-1]
    feats = feats[:len(feats)-1]
    feats.insert(0,label)
    return numpy.array([float(feature) for feature in feats])

def test3():
    # sc is assumed to be an existing SparkContext (for example, the one the pyspark shell provides).
    data = sc.textFile('/home/helxsz/Dropbox/exercise/spark/data_banknote_authentication.txt')
    parsed = data.map(mapper)
    logistic = LogisticRegressionWithSGD()
    logistic.optimizer.setNumIterations(200).setMiniBatchFraction(0.1)
    model = logistic.run(parsed)
    labelsAndPreds = parsed.map(lambda points: (int(points[0]), model.predict(points[1:len(points)])))
    trainErr = labelsAndPreds.filter(lambda vp: vp[0] != vp[1]).count() / float(parsed.count())
    print('training error = ' + str(trainErr))
But when I use LogisticRegressionWithSGD as follows:
logistic = LogisticRegressionWithSGD()
logistic.optimizer.setNumIterations(200).setMiniBatchFraction(0.1)
it gives an error: AttributeError: 'LogisticRegressionWithSGD' object has no attribute 'optimizer'
Here is the API documentation.
In the Python API, you can set these parameters when calling 'train':
model = LogisticRegressionWithSGD.train(parsed, iterations=200, miniBatchFraction=0.1)
The only documentation I could find on this is
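To make the two arguments passed to 'train' more concrete, here is a minimal pure-Python sketch of mini-batch SGD for logistic regression. It is not MLlib's implementation: the names train_logistic_sgd, mini_batch_fraction and predict are hypothetical, the batch is drawn as a fixed-size sample, and the step / sqrt(i) decay is an assumption made for this illustration. It only shows that the iteration count controls how many updates are made and that the mini-batch fraction controls how much of the data each update sees.

```python
import math
import random


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_sgd(data, iterations=100, step=1.0,
                       mini_batch_fraction=1.0, seed=0):
    """Hypothetical helper: data is a list of (label, features), label in {0, 1}.

    Returns a weight list with one entry per feature (no intercept term).
    """
    rng = random.Random(seed)
    n_features = len(data[0][1])
    weights = [0.0] * n_features
    # The fraction decides how many points each update looks at.
    batch_size = max(1, int(round(mini_batch_fraction * len(data))))
    for i in range(1, iterations + 1):
        batch = rng.sample(data, batch_size)
        grad = [0.0] * n_features
        for label, feats in batch:
            margin = sum(w * x for w, x in zip(weights, feats))
            err = sigmoid(margin) - label
            for j, x in enumerate(feats):
                grad[j] += err * x
        # Decaying step size; this schedule is an assumption of the sketch.
        lr = step / math.sqrt(i)
        weights = [w - lr * g / batch_size for w, g in zip(weights, grad)]
    return weights


def predict(weights, feats):
    # Class 1 when the linear score is positive, else class 0.
    return 1 if sum(w * x for w, x in zip(weights, feats)) > 0 else 0


if __name__ == "__main__":
    toy = [(1, [1.0]), (1, [2.0]), (0, [-1.0]), (0, [-2.0])]
    w = train_logistic_sgd(toy, iterations=200, mini_batch_fraction=0.5)
    print(w, predict(w, [1.5]), predict(w, [-1.5]))
```

With iterations=200 and mini_batch_fraction=0.5, the sketch performs 200 weight updates, each computed from half of the toy data, which mirrors the role of iterations=200 and miniBatchFraction=0.1 in the 'train' call.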