How do I implement a Next button click in Java code?


java pagination


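I am writing Java code to retrieve and parse the source code of a web page. The website I am trying to access is:

The source code I get covers only that one page, although there are 11 pages in total. To reach the source of the next page, I have to click the "Next" button, which reloads the page with the new source. I need to implement this in my code so that it retrieves the source of every page.

I have read that PhantomJS or CasperJS might be used for this, but I don't know how to implement either of them.

My code is as follows: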

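import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Scraper class takes a URL string as input and retrieves the source code of the website. It also picks out the needed data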
public class Scraper { 

  private static String url; // the input website to be scraped

  public static String sourcetext; //The source code that has been scraped


  //constructor which allows for the input of a URL
  public Scraper(String url) {
    this.url = url;
  }

  //scrapeWebsite scrapes the input URL and stores the result in sourcetext to be parsed.
  public static void scrapeWebsite() throws IOException {

    URL urlconnect = new URL(url); //creates the url from the variable
    URLConnection connection = urlconnect.openConnection(); // connects to the created URL
    BufferedReader in = new BufferedReader(new InputStreamReader( 
                                                                 connection.getInputStream(), "UTF-8")); // wraps the response stream in a buffered UTF-8 reader
    String inputLine; //creates a new variable of string
    StringBuilder sourcecode = new StringBuilder(); // creates a stringbuilder which contains the sourcecode

    //loop appends to the string builder as long as there is information
    while ((inputLine = in.readLine()) != null)
      sourcecode.append(inputLine);// appends the source code to the string builder
    in.close();
    sourcetext = sourcecode.toString(); // Takes the text in stringbuilder and converts it to a string
    sourcetext = sourcetext.replace('"','*'); //replaces the quotes (") with * so the text can be parsed
  }

  //This method parses through the data and adds the necessary information to a specified CSV file
  public static void getPlaintiff() throws IOException {

    PrintWriter docketFile = new PrintWriter("tester.csv", "UTF-8"); // creates the csv file. (name must be changed, override deletes file)

    int i = 0;

    //While loop runs through all the data in the source code. There are 14 entries per page.
    while(i<14) {
      String plaintiffAtty = "PlaintiffAtty_"+i+"*>"; //creates the search string for the plaintiffatty
      Pattern plaintiffPattern = Pattern.compile("(?<="+Pattern.quote(plaintiffAtty)+").*?(?=</span>)");//creates the pattern for the atty
      Matcher plaintiffMatcher = plaintiffPattern.matcher(sourcetext); // looks for a match for the atty

      while (plaintiffMatcher.find()) {
        docketFile.write(plaintiffMatcher.group().toString()+", "); //writes the found atty to the file
      }

      String appraisedValue = "Appraised_"+i+"*>"; //creates the search string for the appraised value
      Pattern appraisedPattern = Pattern.compile("(?<="+Pattern.quote(appraisedValue)+").*?(?=</span>)");//creates the pattern for the value
      Matcher appraisedMatcher = appraisedPattern.matcher(sourcetext); //looks for a match to the appraised value

      while (appraisedMatcher.find()) {
        docketFile.write(appraisedMatcher.group().toString()+"\n"); //writes the found value to the file

      }
      i++;
    }
    docketFile.close(); //closes the file
  }
}

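Here is your new code (the full listing follows the notes below), mostly reformatted, restyled and modified; it is now actually understandable, so you can solve your own problem. (However, if you are using Java 1.6 or earlier, you may want to revert the try-with-resources parts, since they were only added in 1.7.)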

(Note that new bugs may have been introduced; just fix them, no big deal.)
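For readers on Java 1.6 or earlier, the try-with-resources caveat in the answer can be handled with a plain try/finally. The following is only a sketch of that older idiom, not part of the original answer; the class name `LegacyRead` and the method `readAll` are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper: reads everything from a Reader without try-with-resources.
final class LegacyRead {

    /** Reads all lines from the reader and closes it in a finally block. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        BufferedReader in = new BufferedReader(reader);
        try {
            String line;
            // readLine() strips line terminators, so lines are joined directly
            while ((line = in.readLine()) != null) {
                text.append(line);
            }
        } finally {
            in.close(); // runs whether or not readLine() threw
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new StringReader("line one\nline two")));
    }
}
```

In the scraper, the `Reader` passed in would be the `InputStreamReader` built from `connection.getInputStream()`.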
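EDIT: Realized that the matchers do have to be created inside the loop, because the index is needed to build the regular expression; also replaced the docketFile.write statement with a simpler docketFile.println.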
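To illustrate why the pattern has to be built per index, here is a small sketch that runs the same lookbehind/lookahead expression against a made-up fragment. The sample markup and the names `IndexedPatternDemo` and `extract` are assumptions for illustration; the real page's markup may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of the per-index pattern used in the answer.
final class IndexedPatternDemo {

    /** Returns the text of every span whose id is idPrefix + index (quotes already replaced by *). */
    static List<String> extract(CharSequence source, String idPrefix, int index) {
        // The index is part of the literal being searched for, so the pattern
        // cannot be compiled once outside the loop.
        Pattern pattern = Pattern.compile(
                "(?<=" + Pattern.quote(idPrefix + index + "*>") + ").*?(?=</span>)");
        Matcher matcher = pattern.matcher(source);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Invented sample, in the form the scraper produces after replacing " with *
        String sample = "<span id=*PlaintiffAtty_0*>Smith & Co</span>"
                + "<span id=*Appraised_0*>$90,000</span>"
                + "<span id=*PlaintiffAtty_1*>Jones LLP</span>";
        for (int i = 0; i < 2; i++) {
            System.out.println(i + ": " + extract(sample, "PlaintiffAtty_", i));
        }
    }
}
```

Because the `*>` that follows the index is part of the literal, index 1 does not accidentally match an id such as `PlaintiffAtty_10`.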
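Have you considered converting the class and method comments into Javadoc comments? That would serve your code much better than keeping them as line comments in front of the class or method.

You need to figure out what the Next button does: which URL it calls and which parameters it passes to retrieve the next page. If you can get that information, you are all set.

To add to ns47731's comment, I would look into a tool called M[…], which will show you the next URL you need to call. If you have to do anything more (even calling the right JavaScript function), consider looking at […] to see how they call JavaScript, etc. Good luck. Finally, check out HTTPclipse for making your HTTP requests.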
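import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * This class contains methods for picking
 * out needed data from the source of a website.
 */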
public class Scraper { 

    /**
     * This method scrapes the input URL.
     * @param url The address of the page to scrape.
     * @return A string containing the data from the webpage.
     * @throws IOException if there was a problem with accessing the website.
     */
    public static String scrapeWebsite(String url) throws IOException {

        String inputLine;
        StringBuilder sourcetext = new StringBuilder();

        URL urlconnect = new URL(url);
        URLConnection connection = urlconnect.openConnection();

        try(BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), "UTF-8"))){

            while ((inputLine = in.readLine()) != null)
                sourcetext.append(inputLine);
        }
        return sourcetext.toString().replace('"','*');
    }

    /**
     * This method parses through the data and adds the necessary information to
     * a specified .CSV file.
     * @param source The datasource, for example that returned by
     *               {@link #scrapeWebsite(String)}.
     * @param targetFile The file path for the destination .csv file.
     * @throws IOException if there was a problem with accessing the file.
     */
    public static void getPlaintiff(CharSequence source, String targetFile)
            throws IOException{

        try(PrintWriter docketFile = new PrintWriter(targetFile, "UTF-8")){

            for(int i = 0; i < 14; i++) {
                Matcher plaintiffMatcher = Pattern.compile(
                        "(?<=PlaintiffAtty_" + i + "\\*>).*?(?=</span>)")
                        .matcher(source);

                while (plaintiffMatcher.find())
                    docketFile.println(plaintiffMatcher.group());

                Matcher appraisedMatcher = Pattern.compile(
                        "(?<=Appraised_" + i + "\\*>).*?(?=</span>)")
                        .matcher(source);

                while (appraisedMatcher.find())
                    docketFile.println(appraisedMatcher.group());
            }
        }
    }
}
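
As the comments suggest, the remaining step is to find out which request the Next button sends. If it turns out to be a plain GET with a page-number query parameter, fetching all 11 pages could be sketched as below. The base URL `http://example.com/listings`, the parameter name `page`, and the class `PageUrls` are placeholders, not values taken from the actual site.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds one URL per page for a site that
// (by assumption) accepts a page-number query parameter.
final class PageUrls {

    static List<String> build(String baseUrl, String pageParam, int pageCount) {
        List<String> urls = new ArrayList<>();
        // Append with '&' if the base URL already carries a query string
        char separator = baseUrl.contains("?") ? '&' : '?';
        for (int page = 1; page <= pageCount; page++) {
            urls.add(baseUrl + separator + pageParam + "=" + page);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Placeholder address and parameter name; replace with whatever
        // the Next button is observed to request.
        for (String url : build("http://example.com/listings", "page", 11)) {
            System.out.println(url);
            // Each URL could then be passed to Scraper.scrapeWebsite(url)
        }
    }
}
```

If the button instead submits a form through JavaScript, a plain GET like this will not work and the POST it sends would have to be reproduced. Note also that `getPlaintiff` opens its target file with a fresh `PrintWriter` on every call, which truncates the file, so each page needs its own file name or the writer has to be opened once for all pages.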