How to pass parameters to the shouldVisit() method in crawler4j?


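I want to pass parameters to the shouldVisit() method in crawler4j. I saw an example on the project's GitHub documentation page that uses a factory, but I could not understand it. Please provide an example.

Variant 1: Inject additional parameters as constructor arguments

shouldVisit(...) only receives its fixed method parameters, so any additional values have to be passed into each WebCrawler instance as constructor arguments. You can do this with a factory class.

Give MyWebCrawler a constructor that takes the two custom arguments (customArgument1 and customArgument2) and stores them in fields that shouldVisit(...) can read.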
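A minimal sketch of such a crawler class follows. So that it compiles without the crawler4j dependency, it uses a hypothetical stand-in base class, WebCrawlerStandIn, with a simplified shouldVisit(String) signature; in a real project the class would extend crawler4j's WebCrawler and override shouldVisit(Page referringPage, WebURL url) instead. The meaning given to the two arguments here (an allowed URL prefix and an excluded file suffix) is also an assumption made purely for illustration.

```java
// Hypothetical stand-in for crawler4j's WebCrawler, so the sketch compiles on its own.
// With the real library, extend edu.uci.ics.crawler4j.crawler.WebCrawler and
// override shouldVisit(Page referringPage, WebURL url) instead.
abstract class WebCrawlerStandIn {
    abstract boolean shouldVisit(String url);
}

class MyWebCrawler extends WebCrawlerStandIn {

    // Assumed meaning: only URLs starting with this (lower-case) prefix are visited.
    private final String customArgument1;
    // Assumed meaning: URLs ending with this (lower-case) suffix are skipped.
    private final String customArgument2;

    MyWebCrawler(String customArgument1, String customArgument2) {
        this.customArgument1 = customArgument1;
        this.customArgument2 = customArgument2;
    }

    @Override
    boolean shouldVisit(String url) {
        // The constructor arguments are ordinary fields, so they are available here.
        String href = url.toLowerCase(java.util.Locale.ROOT);
        return href.startsWith(customArgument1) && !href.endsWith(customArgument2);
    }
}
```

Because the values arrive through the constructor, each crawler instance carries its own copy and shouldVisit needs no extra method parameters.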

For this to work, the factory should look like this:

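import edu.uci.ics.crawler4j.crawler.CrawlController;

public class MyCrawlerFactory implements CrawlController.WebCrawlerFactory<MyWebCrawler> {

    // The controller calls this for every crawler it creates;
    // the custom arguments are handed to the crawler's constructor here.
    @Override
    public MyWebCrawler newInstance() throws Exception {
        return new MyWebCrawler("some argument", "some other argument");
    }
}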

controller.start(new MyCrawlerFactory(), numberOfCrawlers);

A similar working example is also available.

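Variant 2: Using CrawlController#getCustomData() (deprecated)

You can use customData on your CrawlController object to inject additional data into your web crawler objects. However, this approach is deprecated and may be removed in a later version of crawler4j.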
