Crawling pages to a local drive with Jsoup

html search download web-crawler

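I'm working on a search engine, and I want to use a jsoup web crawler to fetch pages from a website and store them on my local hard drive, for example in C:\tmp. Can you help me?

Thanks,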

You can try this with jsoup:

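    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    try {
        // Fetch the page and parse it into a jsoup Document
        Document doc = Jsoup.connect("http://en.wikipedia.org/wiki/Main_Page").get();
        String html = doc.html();
        // try-with-resources closes the writer even if write() fails
        try (BufferedWriter out = Files.newBufferedWriter(
                Paths.get("c:/tmp/wiki.html"), doc.outputSettings().charset())) {
            out.write(html);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }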

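This creates a file named wiki.html in c:/tmp/ containing the Wikipedia main page.

Why use Jsoup at all? Why not just download the page as an HTML file and store it wherever you want?

Yes, but I don't want to do that manually. I want jsoup to download the pages of a website in one go.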
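
For the "whole website in one go" case, one possible approach is a small breadth-first crawl: fetch a page, save it, collect its `a[href]` links, and queue those that stay on the same host. The sketch below is only an illustration under some assumptions. The class name `SiteCrawler`, the helpers `fileNameFor` and `sameHost`, the flat file-naming scheme, and the `MAX_PAGES` cap are all made up for this example and are not part of jsoup; only `Jsoup.connect(...).get()`, `select`, `absUrl`, `outerHtml` and `outputSettings().charset()` are jsoup calls.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SiteCrawler {

    // Hypothetical cap so the sketch cannot crawl without bound.
    static final int MAX_PAGES = 50;

    // Turn a URL into a flat, filesystem-safe file name (illustrative scheme only).
    static String fileNameFor(String url) {
        String name = url.replaceFirst("^https?://", "")
                         .replaceAll("[^A-Za-z0-9._-]", "_");
        return name + ".html";
    }

    // True when both URLs have the same host, so the crawl stays on one site.
    static boolean sameHost(String a, String b) {
        try {
            String hostA = URI.create(a).getHost();
            return hostA != null && hostA.equalsIgnoreCase(URI.create(b).getHost());
        } catch (IllegalArgumentException e) {
            return false; // malformed link: treat it as off-site and skip it
        }
    }

    static void crawl(String startUrl, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        queue.add(startUrl);
        seen.add(startUrl);
        int saved = 0;

        while (!queue.isEmpty() && saved < MAX_PAGES) {
            String url = queue.poll();
            Document doc;
            try {
                doc = Jsoup.connect(url).get();
            } catch (IOException e) {
                // HTTP errors and non-HTML content end up here; skip the page
                System.err.println("Skipping " + url + ": " + e.getMessage());
                continue;
            }
            // Write the page using the charset jsoup will declare in its output
            Files.write(outDir.resolve(fileNameFor(url)),
                        doc.outerHtml().getBytes(doc.outputSettings().charset()));
            saved++;

            for (Element link : doc.select("a[href]")) {
                String next = link.absUrl("href"); // "" if it cannot be resolved
                int hash = next.indexOf('#');
                if (hash >= 0) {
                    next = next.substring(0, hash); // drop the fragment
                }
                if (next.startsWith("http") && sameHost(startUrl, next) && seen.add(next)) {
                    queue.add(next);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        crawl("http://en.wikipedia.org/wiki/Main_Page", Paths.get("c:/tmp"));
    }
}
```

With this naming scheme the start page would land in `c:/tmp/en.wikipedia.org_wiki_Main_Page.html`. Very long URLs can exceed the file-name limit of the file system, so a real crawler would probably hash or truncate the names, and it would likely also add a delay between requests.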