Java web search

I have this code for a Google search:

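    import java.net.URLDecoder;
    import java.net.URLEncoder;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    int num_risultati = 15; // number of results to request
    // File-type filter, already URL-encoded; the leading "+" joins it to the search terms.
    String only = "+filetype%3Ahtml+OR+filetype%3Ahtm+OR+filetype%3Axhtm+OR+filetype%3Axhtml";

    String google = "http://www.google.com/search?lr=lang_en&num=" + num_risultati + "&q=";
    String search = "\"Java\" \"C\"";
    String charset = "UTF-8";
    String userAgent = "ExampleBot 1.0 (+http://example.com/bot)";

    // Search terms go first, then the filter, so the two are separated by "+".
    Elements links = Jsoup.connect(google + URLEncoder.encode(search, charset) + only)
            .userAgent(userAgent).get().select("li.g>h3>a");

    for (Element link : links) {
        String title = link.text();
        // Google returns URLs in the format "http://www.google.com/url?q=<url>&sa=U&ei=<someKey>".
        String url = link.absUrl("href");
        // Extract and decode the q parameter; fall back to the end of the string if there is no '&'.
        int end = url.indexOf('&');
        url = URLDecoder.decode(url.substring(url.indexOf('=') + 1, end < 0 ? url.length() : end), "UTF-8");

        if (!url.startsWith("http")) {
            continue; // Ads/news/etc.
        }

        System.out.println("Title: " + title);
        System.out.println("URL: " + url);
        System.out.println();
    }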

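You can restrict a Google search to a specific domain (such as en.wikipedia.org) with the following operator:

site:en.wikipedia.org

Try this instead of something like as_lq=en.wikipedia.org. Also, you probably do not need the last OR operator before the site filter.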

    String only = "+filetype%3Ahtml+OR+filetype%3Ahtm+OR+filetype%3Axhtm+OR+filetype%3Axhtml+OR+as_lq=en.wikipedia.org";
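
Putting the suggestion above together, the query could be assembled roughly as in the sketch below. It builds the plain-text query first (search terms, the file-type filter, then `site:` with no OR in front of it) and URL-encodes it in one step. The class name `SiteFilterQuery` and the method `buildSearchUrl` are hypothetical names used only for illustration; the base URL and parameters are the ones from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SiteFilterQuery {

    // Hypothetical helper: returns a Google search URL limited to HTML file types
    // and to a single site, using the "site:" operator instead of as_lq.
    static String buildSearchUrl(String search, String site, int numResults)
            throws UnsupportedEncodingException {
        String fileTypes = "filetype:html OR filetype:htm OR filetype:xhtm OR filetype:xhtml";
        // No OR before "site:", so the site filter applies to every result.
        String query = search + " " + fileTypes + " site:" + site;
        return "http://www.google.com/search?lr=lang_en&num=" + numResults
                + "&q=" + URLEncoder.encode(query, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        System.out.println(buildSearchUrl("\"Java\" \"C\"", "en.wikipedia.org", 15));
    }
}
```

The resulting string can be passed to `Jsoup.connect(...)` in place of the concatenated URL in the question. Encoding the whole query at once avoids having to hand-encode characters such as `:` in the filter.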