Java: trying to extract a substring between specific markers while reading from a BufferedReader

java android

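I'm using a BufferedReader to read 5 web pages, each separated by a space, and I want to use substring to extract each page's URL, HTML, source and date. I need some guidance on how to use substring correctly to achieve this. Cheers.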

public static List<WebPage> readRawTextFile(Context ctx, int resId) {   

    InputStream inputStream = ctx.getResources().openRawResource(
            R.raw.pages);

    InputStreamReader inputreader = new InputStreamReader(inputStream);
    BufferedReader buffreader = new BufferedReader(inputreader);
    String line;
    StringBuilder text = new StringBuilder();

    try {
        while ((line = buffreader.readLine()) != null) {
            if (line.length() == 0) {
                // ignore for now
                // will be used when a blank line is encountered
            }

            if (line.length() != 0) {
                // here I want the substring to pull out the correct strings
                int sURL = line.indexOf("<!--");
                int eURL = line.indexOf("-->");
                line.substring(sURL, eURL);
                // Problem is here
            }
        }
    } catch (IOException e) {
        return null;

    }
    return null;
}

I think this is what you want:

public class Test {
    public static void main(String[] args) {
        String text = "<!--Address:google.co.uk.html-->";
        String converted1 = text.replaceAll("\\<!--", "");
        String converted2 = converted1.replaceAll("\\-->", "");
        System.out.println(converted2);
    }

}

Output: Address:google.co.uk.html

In the catch block, don't return null; call e.printStackTrace() instead. It will help you find out whether something went wrong.

String str1 = "<!--Address:google.co.uk.html-->";
// Approach 1
int st = str1.indexOf("<!--"); // index of the '<' that starts the opening marker
int en = str1.indexOf("-->");  // index of the first '-' of the closing marker
str1 = str1.substring(st + 4, en); // skip the 4 characters of "<!--"
System.out.println(str1);

// Approach 2
String str2 = "<!--Address:google.co.uk.html-->";
str2 = str2.replaceAll("[<>!-]", ""); // removes every '<', '>', '!' and '-' character
System.out.println(str2);

Note: be aware that using a regex with replaceAll replaces every part of the string that matches the regex argument, not just the markers.
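
To illustrate the caveat about replaceAll, here is a small sketch comparing the character-class approach with a capture group. The class name CommentMarkerDemo, the helper extractFirst and the hyphenated sample address are assumptions made up for this example; they do not come from the original thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentMarkerDemo {
    // Matches one <!-- ... --> marker and captures the text between the delimiters.
    private static final Pattern MARKER = Pattern.compile("<!--(.*?)-->");

    /** Returns the text inside the first marker on the line, or null if there is none. */
    public static String extractFirst(String line) {
        Matcher m = MARKER.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical address that contains a hyphen of its own.
        String line = "<!--Address:my-site.co.uk.html-->";

        // The character class strips every '<', '>', '!' and '-',
        // so the hyphen inside the address is lost as well.
        System.out.println(line.replaceAll("[<>!-]", ""));

        // The capture group only drops the delimiters and keeps the inner text intact.
        System.out.println(extractFirst(line));
    }
}
```

If the addresses never contain any of those four characters, the character-class version behaves the same; the capture group is simply the more defensive choice.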
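For the address, this is the text I want it to extract.
You want to remove the markers, so why do a substring operation? Just use String.replace().
Thanks. I'll see whether I can adapt it so that it saves the 5 URLs.
Since you are using replaceAll(), why the two conversions? You can achieve the same thing with a single regex.
+1 Thanks anyway. I need to be able to extract the addresses from the BufferedReader, so it goes through the text file, finds each address, strips the markers and returns the addresses.
@rob12243 I don't understand. In any case, you can use whatever logic gets you to your goal.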
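
The goal described in the last comments (walk through the reader, find each marked address, strip the markers, return the results) could be sketched roughly as follows. This is only a sketch under assumptions: the class and method names are hypothetical, each marker is assumed to sit on a single line, and a StringReader with made-up sample data stands in for the Android raw resource so the example runs on a plain JVM.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class MarkerReader {
    private static final String OPEN = "<!--";
    private static final String CLOSE = "-->";

    /** Collects the text between OPEN and CLOSE from every line that contains both. */
    public static List<String> readMarkedValues(BufferedReader reader) {
        List<String> values = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                int start = line.indexOf(OPEN);
                if (start < 0) {
                    continue; // blank line or line without a marker
                }
                int end = line.indexOf(CLOSE, start + OPEN.length());
                if (end < 0) {
                    continue; // opening marker without a closing one on this line
                }
                // substring returns a new String, so the result has to be stored.
                values.add(line.substring(start + OPEN.length(), end));
            }
        } catch (IOException e) {
            // Surface the failure instead of silently returning null.
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample input: two marked lines separated by a blank line.
        String sample = "<!--Address:google.co.uk.html-->\n\n<!--Address:example.org.html-->\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(readMarkedValues(reader));
        }
    }
}
```

On Android, the same method could presumably be fed a BufferedReader wrapped around the InputStream returned by openRawResource, as in the question's code.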