
Ant: Nutch agent


I am trying to use the rotate agent feature in nutch-1.10. Here is my configuration in nutch-site.xml:

<property>
    <name>http.agent.rotate</name>
    <value>true</value>
    <description>
        If true, instead of http.agent.name, alternating agent names are
        chosen from a list provided via http.agent.rotate.file.
    </description>
</property>

<property>
    <name>http.agent.rotate.file</name>
    <value>agents.txt</value>
    <description>
        File containing alternative user agent names to be used instead of
        http.agent.name on a rotating basis if http.agent.rotate is true.
        Each line of the file should contain exactly one agent
        specification including name, version, description, URL, etc.
    </description>
</property>
Also, here are the contents of my agents.txt file:

NutchCVS/0.7 Nutch Nutch-agent@lucene.apache.org


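I have tried various ways of setting up agents.txt, but when I grep for "agent" in hadoop.log, the agent is still the one I set in http.agent.name. After making the changes I also ran "ant runtime" to rebuild the project. Please help me figure out what is going wrong. I suspect the agents.txt file, but I don't know the correct format for an agent entry.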

You could try setting up a local server, crawling localhost, and checking the server logs to see whether the agent actually changes.

I ran into a similar problem, but when I checked the server logs, the agent was in fact rotating.
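
For the local-server check suggested above, a throwaway server that records the User-Agent header of every request may be enough. The following is a minimal sketch in Python, not part of Nutch: the names `AgentLoggingHandler`, `seen_agents`, `make_server`, and `serve`, as well as port 8000, are illustrative assumptions. It answers `/robots.txt` with 404 so that nothing is disallowed, and serves one small HTML page for every other path.

```python
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer

# Tally of User-Agent header values seen so far (hypothetical helper state).
seen_agents = Counter()


class AgentLoggingHandler(BaseHTTPRequestHandler):
    """Records and prints the User-Agent of each GET request."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "<none>")
        seen_agents[agent] += 1
        print(f"{self.path} <- {agent}", flush=True)

        if self.path == "/robots.txt":
            # No robots rules: reply 404 so the crawler is not restricted.
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return

        # A tiny page with one link so the crawler has something to follow.
        body = b"<html><body><a href='/page2'>next</a></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default access log; the print above is the only output.
        pass


def make_server(port=8000):
    """Create (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), AgentLoggingHandler)


def serve(port=8000):
    """Run until Ctrl-C, then print how often each agent string was seen."""
    server = make_server(port)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
        for agent, count in seen_agents.items():
            print(f"{count:4d}  {agent}")
```

Calling `serve()` and then crawling a seed URL such as `http://127.0.0.1:8000/` should print one line per request. If more than one distinct agent string appears in the final tally, rotation is taking effect even though hadoop.log still reports the value of http.agent.name.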