Java: how to continue processing a file when an empty string is reached


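I am trying to read in a file that contains DNA sequences. In my program I want to read every length-4 subsequence of the DNA and store it in a hashmap to count how often each subsequence occurs. For example, if I have the sequence
ccaccac
and I want every subsequence of length 4, then the first 3 subsequences would be:
CCAC, CACA, ACAC, and so on.
To do this I have to iterate over the string several times. Here is my implementation: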

try
    {
        String file = sc.nextLine();
        BufferedReader reader = new BufferedReader(new FileReader(file + ".fasta")); 

        Map<String, Integer> frequency = new HashMap<>(); 

        String line = reader.readLine();

        while(line != null)
        {
            System.out.println("Processing Line: " + line);
            String [] kmer = line.split("");

            for(String nucleotide : kmer)
            {
                System.out.print(nucleotide);
                int sequence = nucleotide.length(); 
                for(int i = 0; i < sequence; i++)
                {
                    String subsequence = nucleotide.substring(i, i+5); 
                    if(frequency.containsKey(subsequence))
                    {
                        frequency.put(subsequence, frequency.get(subsequence) +1);
                    }
                    else
                    {
                        frequency.put(subsequence, 1);
                    }
                }
            }
            System.out.println();
            line = reader.readLine();
        }
        System.out.println(frequency);            
    }
    catch(StringIndexOutOfBoundsException e)
    {
        System.out.println();
    }

I run into a problem when I reach the end of the string: because of the error, processing cannot continue. How should I handle this?

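The problem description is not clear at all, but my guess is that your input file ends with an empty line.

Try removing the last newline from the input file, or check for an empty line in the while loop: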

while (line != null && !line.isEmpty())

Going by the title of the post... try changing the condition of the while loop. Instead of the current:

String line = reader.readLine();
while(line != null) {
    // ...... your code .....
}
use this code:

String line;
while((line = reader.readLine()) != null) {
    // If file line is blank then skip to next file line.
    if (line.trim().equals("")) {
        continue;
    }
    // ...... your code .....
}
This also takes care of blank lines in the file.
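
To see the skip-and-continue loop in isolation, here is a minimal sketch that runs it over an in-memory string instead of a file. The class name `BlankLineSkipper` and the helper `readNonBlankLines` are made up for illustration; only the loop shape comes from the answer above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class BlankLineSkipper {
    // Hypothetical helper: collects every non-blank line from the reader.
    static List<String> readNonBlankLines(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // A blank line is skipped; reading continues with the next line.
            if (line.trim().isEmpty()) {
                continue;
            }
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Sample input with an empty line and a whitespace-only line.
        String input = "CCAC\n\nACAC\n   \nGGTT\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            System.out.println(readNonBlankLines(reader)); // prints [CCAC, ACAC, GGTT]
        }
    }
}
```

Unlike a loop condition such as `line != null && !line.isEmpty()`, which ends the loop at the first empty line, this version keeps reading until the end of the input.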


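Now for the StringIndexOutOfBoundsException you are getting. I believe that by now you basically know why you receive this exception, so you need to decide what you want to do about it. If you want to split a string into chunks of a specific length, and the number of characters on a given file line is not a multiple of that length, there are obviously a few options available:

  • Ignore the remaining characters at the end of the file line. Although this is a simple solution, it is not very practical because it produces incomplete data. I know nothing about DNA, but I am sure this is not the way I would go.
  • Add the remaining DNA sequence (even if it is short) to the map. Again, I know nothing about DNA and I am not sure whether this is a viable solution. Maybe it is, I just don't know.
  • Prepend the remaining short DNA sequence to the next input file line and continue splitting that line into 4-character chunks. Keep doing this until the end of the file is reached; if the final DNA sequence turns out to be short, add it to the map (or don't).
Of course there may be other options; whatever they are, the decision is yours. To help you along, here is code for the three options I mentioned: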

The three listings follow in the order of the list above. Ignoring the remaining characters:

Map<String, Integer> frequency = new HashMap<>();
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
    while ((line = reader.readLine()) != null) {
        // If file line is blank then skip to next file line.
        if (line.trim().equals("")) {
            continue;
        }

        for (int i = 0; i < line.length(); i += 4) {
            // Stop once fewer than 4 characters remain; the tail is ignored
            if ((i + 4) > line.length()) {
                break;
            }

            subsequence = line.substring(i, i + 4);
            if (frequency.containsKey(subsequence)) {
                frequency.put(subsequence, frequency.get(subsequence) + 1);
            }
            else {
                frequency.put(subsequence, 1);
            }
        }
    }
}
catch (IOException ex) {
    ex.printStackTrace();
}

Adding the remaining DNA sequence (even if it is short) to the map:

Map<String, Integer> frequency = new HashMap<>();
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
    while ((line = reader.readLine()) != null) {
        // If file line is blank then skip to next file line.
        if (line.trim().equals("")) {
            continue;
        }

        String lineRemaining = "";

        for (int i = 0; i < line.length(); i += 4) {
            // Fewer than 4 characters remain: keep the tail and stop
            if ((i + 4) > line.length()) {
                lineRemaining = line.substring(i);
                break;
            }

            subsequence = line.substring(i, i + 4);
            if (frequency.containsKey(subsequence)) {
                frequency.put(subsequence, frequency.get(subsequence) + 1);
            }
            else {
                frequency.put(subsequence, 1);
            }
        }
        if (lineRemaining.length() > 0) {
            subsequence = lineRemaining;
            if (frequency.containsKey(subsequence)) {
                frequency.put(subsequence, frequency.get(subsequence) + 1);
            }
            else {
                frequency.put(subsequence, 1);
            }
        }
    }
}
catch (IOException ex) {
    ex.printStackTrace();
}

Prepending the remaining short sequence to the next input file line:

Map<String, Integer> frequency = new HashMap<>();
String lineRemaining = "";
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
    while ((line = reader.readLine()) != null) {
        // If file line is blank then skip to next file line.
        if (line.trim().equals("")) {
            continue;
        }
        // Add remaining portion of last line to new line.
        if (lineRemaining.length() > 0) {
            line = lineRemaining + line;
            lineRemaining = "";
        }

        for (int i = 0; i < line.length(); i += 4) {
            // Fewer than 4 characters remain: carry the tail over to the next line
            if ((i + 4) > line.length()) {
                lineRemaining = line.substring(i);
                break;
            }

            subsequence = line.substring(i, i + 4);
            if (frequency.containsKey(subsequence)) {
                frequency.put(subsequence, frequency.get(subsequence) + 1);
            }
            else {
                frequency.put(subsequence, 1);
            }
        }
    }
    // If any characters remain at the end of the file, count them
    // in the map without overwriting an existing entry
    if (lineRemaining.length() > 0) {
        frequency.merge(lineRemaining, 1, Integer::sum);
    }
}
catch (IOException ex) {
    ex.printStackTrace();
}
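
Another suggestion: bound the inner loop so that substring(i, i+4) never reads past the end of the string:

for(int i = 0; i <= sequence - 4; i++)
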
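Alternatively, put the extraction in a try block so that a short tail does not crash the loop:

// assumes reader is a java.util.Scanner over the input file
while(reader.hasNextLine())
{
    line = reader.nextLine();
    for(int i = 0; i < line.length(); i++)
    {
        String subsequence = "";
        // put the extract operation in a try block
        // to avoid crashing
        try
        {
            subsequence = line.substring(i, i+4);
        }
        catch(StringIndexOutOfBoundsException e)
        {
            // fewer than 4 characters are left on this line,
            // so stop here instead of counting an empty string
            break;
        }

        if(frequency.containsKey(subsequence))
        {
            frequency.put(subsequence, frequency.get(subsequence) +1);
        }
        else
        {
            frequency.put(subsequence, 1);
        }
    }
}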
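
The question asks for every overlapping subsequence of length 4 (CCAC, CACA, ACAC, ...), which means the window moves one character at a time. Below is a minimal sketch of that counting step for a single line. The class `KmerCounter`, the method `countKmers`, and the sample sequence are invented for illustration, and a `TreeMap` is used only so the printed order is predictable.

```java
import java.util.Map;
import java.util.TreeMap;

class KmerCounter {
    // Hypothetical helper: counts every overlapping substring of length k.
    static Map<String, Integer> countKmers(String sequence, int k) {
        Map<String, Integer> frequency = new TreeMap<>();
        // The last valid start index is length - k, so substring never overruns.
        for (int i = 0; i + k <= sequence.length(); i++) {
            // merge inserts 1 for a new key, otherwise adds 1 to the old count.
            frequency.merge(sequence.substring(i, i + k), 1, Integer::sum);
        }
        return frequency;
    }

    public static void main(String[] args) {
        // prints {ACAC=1, ACCA=1, CACA=1, CACC=1, CCAC=2}
        System.out.println(countKmers("CCACACCAC", 4));
    }
}
```

Because the loop condition is `i + k <= sequence.length()`, no exception handling is needed, and a line shorter than `k` simply produces an empty map. This sketch treats each line on its own; subsequences that span two lines of the file are not counted unless the lines are joined first.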