GridGain/Scala - Spawning jobs from within an existing job
As a proof of concept, I'm building this extremely simple Twitter friends crawler. Here is what it will do:

1. Execute a crawl job for the Twitter account "twitter-user-1"
2. Find all friends of "twitter-user-1"
3. Execute crawl jobs for all friends of "twitter-user-1"

This is what my code looks like so far:
import java.util.{ArrayList, Collection}
// GridGain and Scalar imports are omitted here, as in the original post.

def main(args: Array[String]) {
  scalar {
    // Start the crawl with a single task for the root account.
    grid.execute(classOf[CrawlTask], "twitter-user-1").get
  }
}

class CrawlTask extends GridTaskNoReduceSplitAdapter[String] {
  // Split the task into exactly one CrawlJob for the given Twitter id.
  def split(gridSize: Int, arg: String): Collection[GridJob] = {
    val jobs: Collection[GridJob] = new ArrayList[GridJob]()
    val initialCrawlJob = new CrawlJob()
    initialCrawlJob.twitterId = arg
    jobs.add(initialCrawlJob)
    jobs
  }
}

class CrawlJob extends GridJob {
  var twitterId: String = ""

  def cancel() = {
    println("cancel - " + twitterId)
  }

  def execute(): Object = {
    println("fetch friends for - " + twitterId)
    // Fetch and execute CrawlJobs for all friends
    null
  }
}
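I have Java services ready for all Twitter interactions. I need some examples of how to create new jobs from within an existing job and associate them with the original task.

Thanks,
Srirangan

How I solved it:

Conceptually unify GridTasks and GridJobs: a MySpecialGridTask can have only one MySpecialGridJob. It is then easy to execute new GridTasks from within a task or a job.

In the example above: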
class CrawlJob extends GridJob {
  var twitterId: String = ""

  def cancel() = {
    println("cancel - " + twitterId)
  }

  def execute(): Object = {
    println("fetch friends for - " + twitterId)
    // Fetch and execute CrawlJobs for all friends
    // Execute Job Here
    grid.execute(classOf[CrawlTask], "twitter-user-2").get
    grid.execute(classOf[CrawlTask], "twitter-user-3").get
    null
  }
}
A detailed explanation of my solution is at ...
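
The answer hard-codes "twitter-user-2" and "twitter-user-3". Extending it to all friends probably needs two guards, because friendship graphs contain cycles: a record of accounts already crawled, and some depth limit. The following is only a minimal sketch of that control flow in plain Scala. It does not call GridGain at all. `FriendService`, `LocalCrawler`, and `maxDepth` are hypothetical names made up for illustration, and `FriendService` stands in for the Java Twitter service mentioned in the question. In a real job, each recursive `crawl` call would presumably be replaced by a `grid.execute(classOf[CrawlTask], friend)` call like the ones in the answer.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the Java Twitter service from the question.
trait FriendService {
  def friendsOf(twitterId: String): Seq[String]
}

// Hypothetical local model of "one task per account, one job per task".
// Everything runs in the calling thread; no grid is involved.
class LocalCrawler(service: FriendService, maxDepth: Int) {
  // Insertion-ordered set of accounts that have already been crawled.
  private val visited = mutable.LinkedHashSet.empty[String]

  def crawl(twitterId: String, depth: Int = 0): Unit = {
    // Proceed only within the depth limit and only for unseen accounts.
    if (depth <= maxDepth && visited.add(twitterId)) {
      // Spawn one nested crawl per friend of the current account.
      service.friendsOf(twitterId).foreach(friend => crawl(friend, depth + 1))
    }
  }

  def crawled: List[String] = visited.toList
}

object CrawlSketch {
  def main(args: Array[String]): Unit = {
    // Made-up friendship graph containing a cycle (user-1 <-> user-2).
    val graph = Map(
      "twitter-user-1" -> Seq("twitter-user-2", "twitter-user-3"),
      "twitter-user-2" -> Seq("twitter-user-1", "twitter-user-3"),
      "twitter-user-3" -> Seq("twitter-user-4")
    )
    val service = new FriendService {
      def friendsOf(twitterId: String): Seq[String] =
        graph.getOrElse(twitterId, Seq.empty)
    }
    val crawler = new LocalCrawler(service, maxDepth = 1)
    crawler.crawl("twitter-user-1")
    // prints "twitter-user-1, twitter-user-2, twitter-user-3"
    println(crawler.crawled.mkString(", "))
  }
}
```

With `maxDepth = 1` the crawl covers the root account and its direct friends, which matches the three steps listed in the question. On a real grid the visited set would have to be shared between nodes somehow; this sketch does not attempt that.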