wget: want to download all files/PDFs from one site; the directory gets created but no file is downloaded


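I am obviously missing something. I have a website with all the slides in PDF and PPT format, and I would like to download all the linked files at once. So far wget creates the directory, but it stays empty.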

I tried:

wget -r -A.pdf,.ppt http://some.uni.edu/~name/slides.html
wget -e robots=off -A.pdf,.ppt -r -l1 http://some.uni.edu/~name/slides.html
wget -nd -l -r -e robots=off http://some.uni.edu/~name/slides.html 
wget -r -np -R "slides.html" http://some.uni.edu/~name/slides.html  
wget -r -np -R "slides.html" http://some.uni.edu/~name/

For example:

$ wget -r https://web.cs.ucla.edu/~kaoru/
--2018-10-29 21:38:50--  https://web.cs.ucla.edu/~kaoru/
Resolving web.cs.ucla.edu (web.cs.ucla.edu)... 131.179.128.29
Connecting to web.cs.ucla.edu     (web.cs.ucla.edu)|131.179.128.29|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 623 [text/html]
Saving to: ‘web.cs.ucla.edu/~kaoru/index.html’

web.cs.ucla.edu/~ka 100%[===================>]     623  --.-KB/s    in 0s      

2018-10-29 21:38:51 (19.1 MB/s) -     ‘web.cs.ucla.edu/~kaoru/index.html’ saved [623/623]

Loading robots.txt; please ignore errors.
--2018-10-29 21:38:51--  https://web.cs.ucla.edu/robots.txt
Reusing existing connection to web.cs.ucla.edu:443.
HTTP request sent, awaiting response... 200 OK
Length: 95 [text/plain]
Saving to: ‘web.cs.ucla.edu/robots.txt’

web.cs.ucla.edu/rob 100%[===================>]      95  --.-KB/s        in 0s      

2018-10-29 21:38:51 (3.10 MB/s) - ‘web.cs.ucla.edu/robots.txt’ saved [95/95]

--2018-10-29 21:38:51--  https://web.cs.ucla.edu/~kaoru/paper11.gif
Reusing existing connection to web.cs.ucla.edu:443.
HTTP request sent, awaiting response... 200 OK
Length: 10230 (10.0K) [image/gif]
Saving to: ‘web.cs.ucla.edu/~kaoru/paper11.gif’

web.cs.ucla.edu/~ka 100%[===================>]   9.99K  --.-KB/s    in 0.001s  

2018-10-29 21:38:51 (12.3 MB/s) -     ‘web.cs.ucla.edu/~kaoru/paper11.gif’ saved [10230/10230]

FINISHED --2018-10-29 21:38:51--
Total wall clock time: 0.9s
Downloaded: 3 files, 11K in 0.001s (12.2 MB/s)

It still does not download any of the files:

$ ls 
$ index.html  paper11.gif

Your examples

wget -r -A.pdf,.ppt http://some.uni.edu/~name/slides.html
wget -e robots=off -A.pdf,.ppt -r -l1 http://some.uni.edu/~name/slides.html
wget -nd -l -r -e robots=off http://some.uni.edu/~name/slides.html 
wget -r -np -R "slides.html" http://some.uni.edu/~name/slides.html

will not work the way you want, because you are specifically targeting a single HTML file, namely slides.html. You should target the directory instead.

However, I think your last example comes closest.

Since @Kingsley's example works for you, you should try this first, and only then start filtering files with -R and -A.
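A minimal sketch of that progression, assuming the slides are linked from the directory index at the placeholder URL used in the question. The function name fetch_slides is hypothetical; the options are standard wget options, but whether they are sufficient depends on how the server exposes the files.

```shell
#!/bin/sh
# Hypothetical helper: recursively fetch only PDF/PPT files linked from a directory URL.
fetch_slides() {
    url=$1
    # -r -l1        : recurse, but only one level below the start page
    # -np           : never ascend to the parent directory
    # -nd           : save files flat instead of recreating the directory tree
    # -A 'pdf,ppt'  : keep only these suffixes (HTML pages are still fetched
    #                 to discover links, then deleted)
    # -e robots=off : ignore robots.txt exclusions (use only where permitted)
    wget -r -l1 -np -nd -A 'pdf,ppt' -e robots=off "$url"
}

# Usage (placeholder URL from the question; note the trailing slash):
# fetch_slides https://some.uni.edu/~name/
```

Running plain wget -r against the directory first, and adding the filter options only once that returns files, makes it easier to see which option is responsible if the result is empty again.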

Maybe it should be https.

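In any case, if the server does not allow directory listing, then wget cannot fetch all the files recursively. It can only fetch specific files whose names you know.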
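If recursion keeps coming back empty, one possible workaround is to read the file names out of a saved copy of the page and hand them to wget explicitly. This is only a sketch: list_slide_urls is a hypothetical helper, it assumes the page uses plain href="..." attributes with file names relative to the directory URL, and it is not a real HTML parser.

```shell
#!/bin/sh
# Hypothetical helper: print absolute URLs for the .pdf/.ppt links in an HTML file.
# Assumption: links look like href="slides001.pdf" and are relative to $base.
list_slide_urls() {
    base=$1   # directory URL, ending in /
    page=$2   # local copy of the HTML page
    # Pull out every href that ends in .pdf or .ppt (case-insensitive),
    # strip the href="..." wrapper, then prefix the base URL.
    grep -oiE 'href="[^"]+\.(pdf|ppt)"' "$page" |
        sed -E 's/^[^"]*"//; s/"$//' |
        while IFS= read -r name; do
            printf '%s%s\n' "$base" "$name"
        done
}

# Usage (placeholder URLs from the question):
# wget -O slides.html https://some.uni.edu/~name/slides.html
# list_slide_urls https://some.uni.edu/~name/ slides.html > urls.txt
# wget -i urls.txt
```

Since fetching a single file by its direct URL is the case wget can always handle, this reduces the job to producing the list of names once rather than typing each one.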

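Have you tried --ignore-case? Otherwise -A.pdf does not match .PDF.

Thanks! Just tried it, without success.

Are the target PDFs on the same server? Try -H to allow wget to span to other hosts. Obviously, it is hard to answer without access to the page in question.

I understand. Yes, all the files are located at, e.g., …, etc.

OK, then that should work. It is unlikely, but the server might be blocking based on user agent or similar. Can you fetch a direct file URL? E.g.: wget https://some.uni.edu/~name/slides002.ppt

Thanks! That example actually works in part: I can get one file from a direct URL. I would still have to fetch all the PDF files manually.

@marc111011 Which file was downloaded with that example?

When I use wget https://some.uni.edu/~name/slides002.ppt I do receive slides002.ppt. But I would have to do that 50 times, and clicking is faster than changing the names :)

@marc111011 Then I have two questions. 1. Are you using http or https? 2. What do you get when you run wget -r https://some.uni.edu/~name/

@marc111011 You still have not shown the output of wget -r https://some.uni.edu/~name/. Instead, in your example you are again targeting one specific file, namely ppt.html. You should target the directory. That is, the URL should end with / (and not with a specific file name).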

wget -r http://some.uni.edu/~name/