What do the plus signs mean in robots.txt?


For one site, I want to crawl pages under the `/telecomandes` path. This is its robots.txt:

User-agent: * 
Disallow: *telecommande++*

My questions are:

What do the plus signs mean in this case?
Is it OK to crawl the following URL under this robots.txt file?
```
/telecomandes box decodeur.html
```

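According to the original robots.txt specification, `+` has no special meaning in a `Disallow` value, and neither does `*`.

So crawling `/telecomandes box decodeur.html` is allowed.

What would be disallowed is, for example, `/*telecommande++*.html` (taken literally).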
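The literal reading can be sketched in a few lines of Python. This is only an illustration, not how any particular crawler is implemented: the function name `literal_disallows` is made up, and the sketch assumes a crawler treats a `Disallow` value without a leading `/` as if it had one (URL paths always start with `/`). That normalization is an assumption, not something the robots.txt above guarantees.

```python
RULE = "*telecommande++*"  # the Disallow value from the robots.txt above


def literal_disallows(path: str, rule: str) -> bool:
    """Original-spec style check: the rule is a plain path prefix.

    '+' and '*' are compared as ordinary characters.
    """
    if not rule:
        # An empty Disallow value blocks nothing.
        return False
    # Assumption (hypothetical normalization): add the leading slash
    # that the rule in this robots.txt is missing.
    prefix = rule if rule.startswith("/") else "/" + rule
    return path.startswith(prefix)


if __name__ == "__main__":
    for path in (
        "/telecomandes box decodeur.html",
        "/*telecommande++*.html",
        "/foo/telecommande++bar.html",
    ):
        print(path, "->", "disallowed" if literal_disallows(path, RULE) else "allowed")
```

Under this reading only a path that literally begins with `/*telecommande++*` is blocked, so the URL from the question stays crawlable.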

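If you want to be polite, you could consider honoring the "proprietary" robots.txt extensions, such as the ones from Google and other search engines. Many site authors may not realize that these are not part of the official specification, and expect them to work for other crawlers too.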

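According to Google's robots.txt specification, `+` has no special meaning, but `*` does: it matches any sequence of characters.

So crawling `/telecomandes box decodeur.html` is still allowed.

What would be disallowed is, for example, `/foo/telecommande++bar.html` (and still `/*telecommande++*.html`).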
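To make the wildcard reading concrete, here is a minimal Python sketch that turns a `Disallow` value into a regular expression in which only `*` is special. It is a simplification under stated assumptions: the function name `wildcard_disallows` is made up, the `$` end anchor that Google also defines is left out, and `Allow` rules and rule precedence are ignored.

```python
import re

RULE = "*telecommande++*"  # the Disallow value from the robots.txt above


def wildcard_disallows(path: str, rule: str) -> bool:
    """Google-style check: '*' matches any sequence of characters.

    Every other character, including '+', is matched literally.
    """
    if not rule:
        # An empty Disallow value blocks nothing.
        return False
    # re.escape makes '+' literal; the pieces between '*' are then
    # rejoined with '.*' so that only '*' acts as a wildcard.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    # re.match anchors at the start of the path, like a prefix rule.
    return re.match(pattern, path) is not None


if __name__ == "__main__":
    for path in (
        "/telecomandes box decodeur.html",
        "/foo/telecommande++bar.html",
        "/*telecommande++*.html",
    ):
        print(path, "->", "disallowed" if wildcard_disallows(path, RULE) else "allowed")
```

The URL from the question never contains the literal text `telecommande++`, so it is allowed under this interpretation as well.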