Python 2.7: How to get records as strings from socketTextStream


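I am trying to get each record from a socket stream. I want every record (one line of input) to have the string data type. How can I write this in Python? Thanks.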

model = pipeline.PipelineModel.read().load(model_path)

sc = spark.sparkContext
ssc = StreamingContext(sc, 1)

lines = ssc.socketTextStream(sys.argv[1], int(sys.argv[2]))

if (lines is not None):
    lines.foreachRDD(lambda rdd: rdd.foreach(processRecord))

def processRecord(record):
    print("test")
    ...

Thanks.

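The record is not a string type.

I added more code there. Please check what is wrong with my code. Thanks.

The function processRecord is called before it is defined.
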
from __future__ import print_function
import sys
from pyspark import SparkContext
from pyspark.streaming import StreamingContext


if __name__ == "__main__":
    sc = SparkContext(appName="Demo")
    ssc = StreamingContext(sc, 1)

    #record = ssc.socketTextStream("localhost", 9999)
    record = ssc.socketTextStream(sys.argv[1], int(sys.argv[2]))
    # split each line into words and print the first few of every batch
    record.flatMap(lambda line: line.split(" ")).pprint()

    # start streaming
    ssc.start()
    # block until the streaming context is stopped or fails
    ssc.awaitTermination()
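
If the goal is to handle each record one at a time rather than print words, a minimal sketch along these lines may help. The names `process_record`, `handle_batch` and `run_stream` are hypothetical and not from the posts above. The sketch assumes that on Python 2 the records arrive as `unicode` objects (which would explain the "not a string type" comment), and that batches are small enough to collect on the driver. The per-record function is defined before it is registered with `foreachRDD`.

```python
from __future__ import print_function


def process_record(record):
    """Return one stream record as a native str (hypothetical helper)."""
    # Assumption: on Python 2 the text stream yields `unicode` objects,
    # so encode them; on Python 3 the record is already a str.
    if isinstance(record, str):
        return record
    return record.encode("utf-8")


def handle_batch(rdd):
    """Process every record of one micro-batch on the driver."""
    # collect() pulls the whole batch to the driver, so the printed output
    # shows up in the driver console. Only suitable for small batches.
    for record in rdd.collect():
        print(process_record(record))


def run_stream(host, port, batch_seconds=1):
    """Listen on host:port and pass each batch to handle_batch."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="SocketRecords")
    ssc = StreamingContext(sc, batch_seconds)

    lines = ssc.socketTextStream(host, port)
    # handle_batch is already defined at this point.
    lines.foreachRDD(handle_batch)

    ssc.start()
    ssc.awaitTermination()
```

Calling `run_stream("localhost", 9999)` while a text source is writing lines to that port should print each line as a plain string. Using `rdd.foreach(process_record)` instead of `collect()` would run the function on the executors, where anything it prints goes to the executor logs and not to the driver console.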