C# .NET for Apache Spark: Object reference not set to an instance of an object
I have been trying to register and run a UDF with .NET for Apache Spark. I am using Microsoft.Spark 0.10.0 on macOS. This is what I have been trying:
var options = new Dictionary<string, string>
{
    {"delimiter", "|" }
};
var schema = "Username STRING, Machine STRING, Date STRING";
var df = spark
    .Read()
    .Format("csv")
    .Options(options)
    .Schema(schema)
    .Load(staff);
df.PrintSchema();
df.Show();
spark.Udf().Register<string, string>("MyUDF", randomFunc);
df.CreateOrReplaceTempView("AllLogs");
DataFrame dateDf = spark.Sql("SELECT *, MyUDF(alllogs.Username) FROM AllLogs");
dateDf.Collect();
I keep getting the same error. I have tried different ways of creating the UDF, but none of them seems to work.
This is the error:
[Error] [TaskRunner] [1] ProcessStream() failed with exception: System.NullReferenceException: Object reference not set to an instance of an object.
at Microsoft.Spark.Utils.UdfSerDe.Deserialize(UdfData udfData) in /_/src/csharp/Microsoft.Spark/Utils/UdfSerDe.cs:line 168
at Microsoft.Spark.Utils.CommandSerDe.DeserializeUdfs[T](UdfWrapperData data, Int32& nodeIndex, Int32& udfIndex) in /_/src/csharp/Microsoft.Spark/Utils/CommandSerDe.cs:line 267
at Microsoft.Spark.Utils.CommandSerDe.Deserialize[T](Stream stream, SerializedMode& serializerMode, SerializedMode& deserializerMode, String& runMode) in /_/src/csharp/Microsoft.Spark/Utils/CommandSerDe.cs:line 243
at Microsoft.Spark.Worker.Processor.CommandProcessor.ReadSqlCommands(PythonEvalType evalType, Stream stream) in D:\a\1\s\src\csharp\Microsoft.Spark.Worker\Processor\CommandProcessor.cs:line 190
at Microsoft.Spark.Worker.Processor.CommandProcessor.ReadSqlCommands(PythonEvalType evalType, Stream stream, Version version) in D:\a\1\s\src\csharp\Microsoft.Spark.Worker\Processor\CommandProcessor.cs:line 117
at Microsoft.Spark.Worker.Processor.CommandProcessor.Process(Stream stream) in D:\a\1\s\src\csharp\Microsoft.Spark.Worker\Processor\CommandProcessor.cs:line 62
at Microsoft.Spark.Worker.Processor.PayloadProcessor.Process(Stream stream) in D:\a\1\s\src\csharp\Microsoft.Spark.Worker\Processor\PayloadProcessor.cs:line 74
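Which line in the source program is generating this error? What version of the Apache Spark library are you using?
This is the full stack trace; it does not lead back into the program.
Can you share the full stack trace and the full program?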
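Since the comments ask for a complete program, here is a minimal sketch of what one might look like, built from the snippet in the question. The original post does not show `randomFunc`, so the body below is a hypothetical stand-in. The namespace, the app name `udf-sketch`, the `Result` alias, and reading the file path from `args[0]` are assumptions made for illustration only. This is a reproduction scaffold, not a confirmed fix for the exception.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Spark.Sql;

namespace UdfSketch
{
    class Program
    {
        static void Main(string[] args)
        {
            // Assumption: the path to the pipe-delimited file is the first argument.
            string staff = args[0];

            SparkSession spark = SparkSession
                .Builder()
                .AppName("udf-sketch") // hypothetical app name
                .GetOrCreate();

            var options = new Dictionary<string, string> { { "delimiter", "|" } };
            DataFrame df = spark
                .Read()
                .Format("csv")
                .Options(options)
                .Schema("Username STRING, Machine STRING, Date STRING")
                .Load(staff);

            // Hypothetical stand-in for randomFunc, which the question does not show.
            // The null-conditional operator guards against null column values.
            Func<string, string> randomFunc = name => name?.ToUpperInvariant();

            spark.Udf().Register<string, string>("MyUDF", randomFunc);
            df.CreateOrReplaceTempView("AllLogs");

            // Same query shape as in the question, with an alias for the UDF column.
            spark.Sql("SELECT *, MyUDF(Username) AS Result FROM AllLogs").Show();
        }
    }
}
```

Running it requires a Spark installation and the Microsoft.Spark.Worker process, as with the original snippet. If the same exception appears with a trivial UDF like this one, that would suggest the problem lies outside the body of `randomFunc`.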