Java: How do I get the HTML code of a website when accessing it with Spring Boot, and store the entire HTML in a String variable?

Tags: java, html, spring-boot

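I have been trying to find material on how to get the HTML data of a website when accessing it with Spring Boot, but I could not find a good example. Can anyone help me with a solution?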

You can use an HTML parser such as JSoup.

Demo:

import java.io.IOException;
import org.jsoup.Jsoup;

public class JSoupDemo {
    public static void main(String[] args) throws IOException {
        String webPage = "http://www.example.com";
        // Fetch the page, parse it into a Document, and serialize it back to an HTML string
        String html = Jsoup.connect(webPage).get().html();
        System.out.println(html);
    }
}

Output:

<!doctype html>
<html>
 <head> 
  <title>Example Domain</title> 
  <meta charset="utf-8"> 
  <meta http-equiv="Content-type" content="text/html; charset=utf-8"> 
  <meta name="viewport" content="width=device-width, initial-scale=1"> 
  <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;

    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style> 
 </head> 
 <body> 
  <div> 
   <h1>Example Domain</h1> 
   <p>This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.</p> 
   <p><a href="https://www.iana.org/domains/example">More information...</a></p> 
  </div>   
 </body>
</html>
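
Note that JSoup parses the page and re-serializes it, so the string can differ slightly from what the server actually sent. If only the raw HTML is needed and no parsing, a minimal sketch using the JDK's built-in `java.net.http.HttpClient` (assuming Java 11 or later) might look like the following. The class name `HtmlFetcher` and the method `fetchHtml` are hypothetical names chosen for illustration, and the status check is a design choice, not something the question requires.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class; not part of Spring Boot or JSoup.
public class HtmlFetcher {

    // Downloads the page at the given URL and returns the response body as a String.
    public static String fetchHtml(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL) // follow redirects, except HTTPS -> HTTP
                .build();

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .GET()
                .build();

        // BodyHandlers.ofString() reads the whole response body into a String
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // Assumption: anything other than 200 OK is treated as a failure
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status: " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String html = fetchHtml("http://www.example.com");
        System.out.println(html);
    }
}
```

In a Spring Boot application, a method like this could be called from a service or controller; the returned `String` holds the entire HTML document.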

What you are looking for is a program called a web crawler; there are probably libraries that can do this.
AJITHAN SEHIVAKANAN - any update? Did this solution work for you?