Java: splitting a string in half


I have this code:

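import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class Paginaweb2 {

    public void descargarURL() {
        try {
            URL url = new URL("https://www.amazon.es/MSI-Titan-GT73EVR-7RD-1027XES-Ordenador/dp/B078ZYX4R5/ref=sr_1_1?ie=UTF8&qid=1524239679&sr=8-1");
            BufferedReader lectura = new BufferedReader(new InputStreamReader(url.openStream()));
            File archivo = new File("descarga2.txt");
            BufferedWriter escritura = new BufferedWriter(new FileWriter(archivo));
            // Opened for the extracted values; nothing is written to it yet.
            BufferedWriter ficheroNuevo = new BufferedWriter(new FileWriter("nuevoFichero.txt"));
            String texto;

            // Copy the downloaded page to descarga2.txt line by line.
            while ((texto = lectura.readLine()) != null) {
                escritura.write(texto);
                escritura.newLine(); // readLine() strips the line terminator, so restore it
            }
            lectura.close();
            escritura.close();
            ficheroNuevo.close();
            System.out.println("Archivo creado!");
        }
        catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    public static void main(String[] args) throws FileNotFoundException, IOException {
        Paginaweb2 pg = new Paginaweb2();
        pg.descargarURL();
    }
}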

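I want to take the reference part, B078ZYX4R5, out of the URL, along with its / separator.

Once the HTML has been saved to the text file, part of it contains a price attribute of the form price="...", and from there I only want to get the value 1479.00.

I don't want to use external libraries. I know it can be done with split, indexOf and substring.

Thanks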

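You can solve both tasks with regular expressions. For the second task (extracting the price from the HTML), however, there are approaches better suited to pulling content out of HTML.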

Here are some possible regex-based solutions for the two tasks:

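1. Changing the URL
This is just a replacement. It uses a regular expression with a positive lookahead, (?=/ref), so that only the path segment immediately before /ref is matched and removed.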
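The replacement code itself did not survive on this page, so the following is only a minimal sketch of how such a lookahead-based replacement might look. The class name ReferenceRemover, the method removeReference and the exact pattern /[^/]+(?=/ref) are assumptions made for illustration, not the answer's original code.

```java
import java.util.regex.Pattern;

final class ReferenceRemover {
    // Matches one path segment (a slash followed by non-slash characters),
    // but only when "/ref" comes right after it. The lookahead consumes
    // nothing, so "/ref..." itself stays in the result.
    private static final Pattern REFERENCE_SEGMENT = Pattern.compile("/[^/]+(?=/ref)");

    // Hypothetical helper: returns the URL with the segment before "/ref" removed.
    static String removeReference(String url) {
        return REFERENCE_SEGMENT.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        String url = "https://www.amazon.es/MSI-Titan-GT73EVR-7RD-1027XES-Ordenador/dp/B078ZYX4R5/ref=sr_1_1?ie=UTF8&qid=1524239679&sr=8-1";
        System.out.println(removeReference(url));
    }
}
```

A URL that contains no /ref segment is returned unchanged, because the lookahead never succeeds.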

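2. Extracting the price
Running a simple test case for both tasks against the sample URL and HTML prints this output on the console: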

https://www.amazon.es/MSI-Titan-GT73EVR-7RD-1027XES-Ordenador/dp/ref=sr_1_1?ie=UTF8&qid=1524239679&sr=8-1
1479.00
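Update: If you want to extract the reference from the URL instead, you can use code similar to the code that extracts the price. Here is a method that extracts a specific named group from a pattern: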

private static Optional<String> extractNamedGroup(String str, Pattern pat, String reference) {
    Matcher m = pat.matcher(str);
    if (m.find()) {
        return Optional.of(m.group(reference));
    }
    return Optional.empty();
}
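Update 2: Using URL
If you want the java.net.URL class to help you narrow the search scope, you can do that, but the class alone cannot do the whole extraction. Since the token to extract sits in the URL path, you can get the path with getPath() and then apply the regular expression explained above to it.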

Here is sample code that narrows the search scope this way:

public static void main(String[] args) throws IOException {
    String str = "https://www.amazon.es/MSI-Titan-GT73EVR-7RD-1027XES-Ordenador/dp/B078ZYX4R5/ref=sr_1_1?ie=UTF8&qid=1524239679&sr=8-1";
    URL url = new URL(str); 
    extractReference(url.getPath() /* narrowing the search scope here */).ifPresent(System.out::println);
    String html = "<div id =\" cerberus-data-metrics \"style =\" display: none; \"data-asin =\" B078ZYX4R5 \"data-as-price = \"1479.00\" data-asin-shipping = \"0\" data-asin-currency-code = \"EUR\" data-substitute-count = \"0\" data-device-type = \"WEB\" data-display-code = \"Asin is not eligible because it has a retail offer \"> </ div>";
    extractPrice(html).ifPresent(System.out::println);
}
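Since the question mentions split, here is a hedged sketch that skips the regular expression after getPath() and walks the path segments instead. The class name PathReference and the helper referenceBeforeRef are hypothetical, and the sketch assumes the reference is always the segment directly before the one that starts with "ref", mirroring the (?=/ref) lookahead.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Optional;

final class PathReference {
    // Hypothetical helper: returns the path segment that precedes the
    // segment starting with "ref", or empty if there is none.
    static Optional<String> referenceBeforeRef(String address) {
        String path;
        try {
            // getPath() drops the scheme, host and query string,
            // e.g. "/MSI-.../dp/B078ZYX4R5/ref=sr_1_1".
            path = new URL(address).getPath();
        } catch (MalformedURLException ex) {
            return Optional.empty();
        }
        // A leading slash yields an empty first element, so start at index 1.
        String[] segments = path.split("/");
        for (int i = 1; i < segments.length; i++) {
            if (segments[i].startsWith("ref") && !segments[i - 1].isEmpty()) {
                return Optional.of(segments[i - 1]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String str = "https://www.amazon.es/MSI-Titan-GT73EVR-7RD-1027XES-Ordenador/dp/B078ZYX4R5/ref=sr_1_1?ie=UTF8&qid=1524239679&sr=8-1";
        referenceBeforeRef(str).ifPresent(System.out::println); // prints B078ZYX4R5
    }
}
```

Catching MalformedURLException inside the helper keeps the caller free of checked exceptions. Whether a malformed address should instead be reported to the caller is a design choice left open here.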

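Comments:

Great, thank you very much. But is there another way to pass the URL to a String without setting it as a default value? Also, in the console output the price is fine, but from the URL I only want the reference, which is B078ZYX4R5. Thanks a lot.

@Ashe Please check the updated answer, which includes a method for extracting the reference.

But is there another way to pass the URL to the String without setting it as a default value?

@Ashe I actually do not understand 100% what "another way to pass the URL to the String without setting it as a default value" means. I assume you want to use the methods of the java.net.URL class to perform some kind of extraction instead of working on a plain input string. Please confirm whether my assumption is correct.

Exactly, I want to use the methods of the URL class, but it does not let me when I run the function. Any suggestions?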
The two extraction helpers used by the examples, both built on extractNamedGroup:

private static Optional<String> extractReference(String str) {
    // Capture the path segment that is immediately followed by "/ref".
    Pattern pat = Pattern.compile("/(?<reference>[^/]+)(?=/ref)");
    return extractNamedGroup(str, pat, "reference");
}

private static Optional<String> extractPrice(String html) {
    // Capture the quoted value of the data-as-price attribute, allowing
    // optional whitespace around "=" and either quote style.
    Pattern pat = Pattern.compile("data-as-price\\s*=\\s*[\"'](?<price>.+?)[\"']", Pattern.MULTILINE);
    return extractNamedGroup(html, pat, "price");
}

A test case that applies both helpers to the plain URL string and the sample HTML:

public static void main(String[] args) throws IOException {
    String str = "https://www.amazon.es/MSI-Titan-GT73EVR-7RD-1027XES-Ordenador/dp/B078ZYX4R5/ref=sr_1_1?ie=UTF8&qid=1524239679&sr=8-1";
    extractReference(str).ifPresent(System.out::println);
    String html = "<div id =\" cerberus-data-metrics \"style =\" display: none; \"data-asin =\" B078ZYX4R5 \"data-as-price = \"1479.00\" data-asin-shipping = \"0\" data-asin-currency-code = \"EUR\" data-substitute-count = \"0\" data-device-type = \"WEB\" data-display-code = \"Asin is not eligible because it has a retail offer \"> </ div>";
    extractPrice(html).ifPresent(System.out::println);
}

Console output:

B078ZYX4R5
1479.00
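
For comparison, the question asks for a solution based on indexOf and substring rather than regular expressions. The following is a minimal sketch of that route for the price. The class name PriceBySubstring is hypothetical, and the sketch assumes the attribute is named data-as-price and that its value is wrapped in double quotes, as in the sample HTML.

```java
import java.util.Optional;

final class PriceBySubstring {
    // Hypothetical helper: finds the data-as-price attribute and returns the
    // text between the first pair of double quotes that follows it.
    static Optional<String> extractPrice(String html) {
        int attribute = html.indexOf("data-as-price");
        if (attribute < 0) {
            return Optional.empty();
        }
        int open = html.indexOf('"', attribute);
        if (open < 0) {
            return Optional.empty();
        }
        int close = html.indexOf('"', open + 1);
        if (close < 0) {
            return Optional.empty();
        }
        // substring excludes both quote characters; trim drops stray padding.
        return Optional.of(html.substring(open + 1, close).trim());
    }

    public static void main(String[] args) {
        String html = "<div data-asin=\"B078ZYX4R5\" data-as-price = \"1479.00\" data-asin-shipping = \"0\"></div>";
        extractPrice(html).ifPresent(System.out::println); // prints 1479.00
    }
}
```

This version is more brittle than the regular expression: it does not handle single-quoted values, and it would pick up the wrong text if the attribute name appeared earlier inside another attribute's value.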