Error in the JSON of the Apache Spark HistoryServer REST API
Has anyone run into this problem before? I am trying to extract the task list for a job through the History Server REST API. However, I only get 20 rows back, whereas the Spark web UI shows all of the tasks (more than 100). I have attached screenshots and the History Server log.

In the screenshot you can see that the UI shows 121 tasks (for lack of space I am not attaching a screenshot of all 121), but when I query the REST API I get only 20 rows, no matter which tool I use.

Here is the History Server log:
16/04/15 09:23:00 INFO history.HistoryServer: Registered signal handlers for [TERM, HUP, INT]
16/04/15 09:23:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/15 09:23:01 INFO spark.SecurityManager: Changing view acls to: abrandon
16/04/15 09:23:01 INFO spark.SecurityManager: Changing modify acls to: abrandon
16/04/15 09:23:01 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(abrandon); users with modify permissions: Set(abrandon)
16/04/15 09:23:01 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0004
16/04/15 09:23:01 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/04/15 09:23:01 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:18080
16/04/15 09:23:01 INFO util.Utils: Successfully started service on port 18080.
16/04/15 09:23:01 INFO history.HistoryServer: Started HistoryServer at http://172.16.100.1:18080
16/04/15 09:23:02 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0005
16/04/15 09:23:03 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0002
16/04/15 09:23:03 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0008
16/04/15 09:23:03 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0001
16/04/15 09:23:03 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0006
16/04/15 09:23:03 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0007
16/04/15 09:23:03 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0003
16/04/15 09:23:22 INFO spark.SecurityManager: Changing view acls to: abrandon
16/04/15 09:23:22 INFO spark.SecurityManager: Changing modify acls to: abrandon
16/04/15 09:23:22 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(abrandon); users with modify permissions: Set(abrandon)
16/04/15 09:23:22 INFO history.FsHistoryProvider: Replaying log path: file:/tmp/spark-events/application_1460638681315_0007
16/04/15 09:23:22 INFO spark.SecurityManager: Changing acls enabled to: false
16/04/15 09:23:22 INFO spark.SecurityManager: Changing admin acls to:
16/04/15 09:23:22 INFO spark.SecurityManager: Changing view acls to: abrandon
16/04/15 09:26:44 INFO core.PackagesResourceConfig: Scanning for root resource and provider classes in the packages:
org.apache.spark.status.api.v1
16/04/15 09:26:48 INFO core.ScanningResourceConfig: Root resource classes found:
class org.apache.spark.status.api.v1.ApiRootResource
16/04/15 09:26:48 INFO core.ScanningResourceConfig: Provider classes found:
class org.apache.spark.status.api.v1.JacksonMessageWriter
16/04/15 09:26:48 INFO application.WebApplicationImpl: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
16/04/15 09:26:49 WARN inject.Errors: The following warnings have been detected with resource and/or provider classes:
WARNING: A sub-resource method, public scala.collection.Seq org.apache.spark.status.api.v1.OneStageResource.stageData(int), with URI template, "", is treated as a resource method
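This seems to be missing from the documentation, but the taskList endpoint uses pagination to prevent overly large responses. The default page size is 20, so a request that does not set the length query parameter returns at most 20 tasks, as seen here: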
@GET
@Path("/{stageAttemptId: \\d+}/taskList")
def taskList(
    @PathParam("stageId") stageId: Int,
    @PathParam("stageAttemptId") stageAttemptId: Int,
    @DefaultValue("0") @QueryParam("offset") offset: Int,
    @DefaultValue("20") @QueryParam("length") length: Int,
    @DefaultValue("ID") @QueryParam("sortBy") sortBy: TaskSorting): Seq[TaskData] = {
  withStageAttempt(stageId, stageAttemptId) { stage =>
    val tasks = stage.ui.taskData.values.map{AllStagesResource.convertTaskData}.toIndexedSeq
      .sorted(OneStageResource.ordering(sortBy))
    tasks.slice(offset, offset + length) // <--- here!
  }
}
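
Since the endpoint slices the sorted task list with offset and length, a client can either request one large page (for example length=200) or walk the pages until a short one comes back. Below is a minimal sketch of the second approach. The object and method names (TaskListPaging, taskListUrl, fetchAll) are made up for illustration, and the URL layout assumes the usual /api/v1/applications/[app-id]/stages/[stage-id]/[stage-attempt-id]/taskList path. The HTTP call and JSON parsing are deliberately left to a caller-supplied function, because which client library to use is not something the question specifies.

```scala
import scala.annotation.tailrec

// Hypothetical helper, not part of Spark: pages through a taskList-style endpoint.
object TaskListPaging {

  // Builds the request URL for one page. baseUrl, appId, stageId and attemptId
  // are placeholders for your own history server and application.
  def taskListUrl(baseUrl: String, appId: String, stageId: Int, attemptId: Int,
                  offset: Int, length: Int): String =
    s"$baseUrl/api/v1/applications/$appId/stages/$stageId/$attemptId/taskList" +
      s"?offset=$offset&length=$length"

  // Requests consecutive pages of pageSize rows and concatenates them.
  // fetchPage(offset, length) is supplied by the caller, so the loop itself
  // does not depend on any HTTP or JSON library.
  def fetchAll[A](pageSize: Int)(fetchPage: (Int, Int) => Seq[A]): Vector[A] = {
    require(pageSize > 0, "pageSize must be positive")

    @tailrec
    def loop(offset: Int, acc: Vector[A]): Vector[A] = {
      val page = fetchPage(offset, pageSize)
      // A page shorter than pageSize (possibly empty) means the list is exhausted.
      if (page.size < pageSize) acc ++ page
      else loop(offset + pageSize, acc ++ page)
    }

    loop(0, Vector.empty)
  }
}
```

In real use, fetchPage would issue a GET against taskListUrl(...) and decode the JSON array into task records. For a stage with 121 tasks and a page size of 20, the loop makes seven requests, and the last one returns the final single row.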