Java: how to keep processing a file after reaching an empty string

I am trying to read in a file that contains DNA sequences. In my program, I want to read every subsequence of length 4 of the DNA and store it in a HashMap to count the occurrences of each subsequence. For example, if I have the sequence ccaccac and I want every subsequence of length 4, then the first 3 subsequences would be: CCAC, CACA, ACAC, and so on.

To do this, I have to iterate over the string several times. Here is my implementation:
try
{
    String file = sc.nextLine();
    BufferedReader reader = new BufferedReader(new FileReader(file + ".fasta"));
    Map<String, Integer> frequency = new HashMap<>();
    String line = reader.readLine();
    while(line != null)
    {
        System.out.println("Processing Line: " + line);
        String [] kmer = line.split("");
        for(String nucleotide : kmer)
        {
            System.out.print(nucleotide);
            int sequence = nucleotide.length();
            for(int i = 0; i < sequence; i++)
            {
                String subsequence = nucleotide.substring(i, i+5);
                if(frequency.containsKey(subsequence))
                {
                    frequency.put(subsequence, frequency.get(subsequence) +1);
                }
                else
                {
                    frequency.put(subsequence, 1);
                }
            }
        }
        System.out.println();
        line = reader.readLine();
    }
    System.out.println(frequency);
}
catch(StringIndexOutOfBoundsException e)
{
    System.out.println();
}
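I run into a problem when I reach the end of the string: because of the error, the program does not continue processing. How should I handle this?

The problem description is not clear at all, but my guess is that your input file ends with an empty line.
Try removing the last newline from the input file, or check for an empty line in the while loop (note that this condition stops reading at the first empty line):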
while (line != null && !line.isEmpty())
Going by the title of the post, try changing the condition of the while loop. Instead of the current:
String line = reader.readLine();
while(line != null) {
    // ...... your code .....
}
use this code:
String line;
while((line = reader.readLine()) != null) {
    // If file line is blank then skip to next file line.
    if (line.trim().equals("")) {
        continue;
    }
    // ...... your code .....
}
This takes care of blank lines in the file.
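As a minimal, self-contained sketch of the loop above, the following reads from an in-memory string instead of a file so that it can be run as is. The class name `BlankLineSkipper` and the helper `readNonBlankLines` are hypothetical names introduced only for illustration; they are not part of the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class BlankLineSkipper {

    // Hypothetical helper: collects every non-blank line from the reader.
    static List<String> readNonBlankLines(BufferedReader reader) {
        List<String> lines = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                // A blank line is skipped, but reading continues with the next line.
                if (line.trim().isEmpty()) {
                    continue;
                }
                lines.add(line);
            }
        } catch (IOException ex) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(ex);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed sample input: two sequence lines separated by a blank line.
        String text = "CCAC\n\nACAC\n";
        BufferedReader reader = new BufferedReader(new StringReader(text));
        System.out.println(readNonBlankLines(reader)); // prints [CCAC, ACAC]
    }
}
```

The key difference from a condition such as `line != null && !line.isEmpty()` is that `continue` skips only the blank line, whereas the combined condition ends the loop at the first blank line.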
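Now, about the StringIndexOutOfBoundsException you are getting. I believe that by now you basically know why you receive this exception, so you need to decide what you want to do about it. If you want to split a string into chunks of a specific length, and the number of characters in a given file line is not evenly divisible by that length, there are a few obvious options:
- Ignore the remaining characters at the end of the file line. Although this is an easy solution, it is not very viable because it produces incomplete data. I know nothing about DNA, but I am sure this is not the way I would go.
- Add the remaining DNA sequence (even if it is short) to the map. Again, I know nothing about DNA, and I am not sure whether this is a viable solution. Maybe it is, I just don't know.
- Add the remaining short DNA sequence to the beginning of the next file line and continue splitting that line into 4-character chunks. Keep doing this until the end of the file is reached. If the final DNA sequence turns out to be short, add it to the map (or don't).
Of course, there may be other options, and whichever they are, the decision is yours. To help you out, here is code for the three options I mentioned: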
Ignore the remaining characters:
Map<String, Integer> frequency = new HashMap<>();
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
    while ((line = reader.readLine()) != null) {
        // If file line is blank then skip to next file line.
        if (line.trim().equals("")) {
            continue;
        }
        for (int i = 0; i < line.length(); i += 4) {
            // Get out of loop - Don't want to deal with remaining Chars.
            // Compare against line.length() (not length() - 1) so that a
            // final full 4-character chunk is still counted.
            if ((i + 4) > line.length()) {
                break;
            }
            subsequence = line.substring(i, i + 4);
            if (frequency.containsKey(subsequence)) {
                frequency.put(subsequence, frequency.get(subsequence) + 1);
            }
            else {
                frequency.put(subsequence, 1);
            }
        }
    }
}
catch (IOException ex) {
    ex.printStackTrace();
}
Add the remaining DNA sequence (even if it is short) to the map:
Map<String, Integer> frequency = new HashMap<>();
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
    while ((line = reader.readLine()) != null) {
        // If file line is blank then skip to next file line.
        if (line.trim().equals("")) {
            continue;
        }
        String lineRemaining = "";
        for (int i = 0; i < line.length(); i += 4) {
            // Fewer than 4 Chars left - keep them as the remainder.
            // Compare against line.length() (not length() - 1) so that a
            // final full 4-character chunk is still counted as a chunk.
            if ((i + 4) > line.length()) {
                lineRemaining = line.substring(i);
                break;
            }
            subsequence = line.substring(i, i + 4);
            if (frequency.containsKey(subsequence)) {
                frequency.put(subsequence, frequency.get(subsequence) + 1);
            }
            else {
                frequency.put(subsequence, 1);
            }
        }
        // Count the short remainder of this line as its own entry.
        if (lineRemaining.length() > 0) {
            subsequence = lineRemaining;
            if (frequency.containsKey(subsequence)) {
                frequency.put(subsequence, frequency.get(subsequence) + 1);
            }
            else {
                frequency.put(subsequence, 1);
            }
        }
    }
}
catch (IOException ex) {
    ex.printStackTrace();
}
Add the remaining short DNA sequence to the beginning of the next file line:
Map<String, Integer> frequency = new HashMap<>();
String lineRemaining = "";
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
    while ((line = reader.readLine()) != null) {
        // If file line is blank then skip to next file line.
        if (line.trim().equals("")) {
            continue;
        }
        // Add remaining portion of last line to new line.
        if (lineRemaining.length() > 0) {
            line = lineRemaining + line;
            lineRemaining = "";
        }
        for (int i = 0; i < line.length(); i += 4) {
            // Fewer than 4 Chars left - carry them over to the next line.
            // Compare against line.length() (not length() - 1) so that a
            // final full 4-character chunk is still counted as a chunk.
            if ((i + 4) > line.length()) {
                lineRemaining = line.substring(i);
                break;
            }
            subsequence = line.substring(i, i + 4);
            if (frequency.containsKey(subsequence)) {
                frequency.put(subsequence, frequency.get(subsequence) + 1);
            }
            else {
                frequency.put(subsequence, 1);
            }
        }
    }
    // If any Chars remaining at end of file then
    // add to MAP
    if (lineRemaining.length() > 0) {
        frequency.put(lineRemaining, 1);
    }
}
catch (IOException ex) {
    ex.printStackTrace();
}
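The carry-over option can also be written more compactly. The sketch below is one possible variant, not the answer's original code: it takes the lines as a `List<String>` rather than reading `DNA.txt`, and it replaces the `containsKey`/`put` pair with `Map.merge`. The names `ChunkCounter` and `countChunks` are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ChunkCounter {

    // Hypothetical helper: splits the concatenated non-blank lines into
    // fixed-size chunks and counts them. Leftover characters at the end of a
    // line are carried into the next line; a short leftover at the very end
    // is counted as its own entry.
    static Map<String, Integer> countChunks(List<String> lines, int size) {
        Map<String, Integer> frequency = new HashMap<>();
        String carry = "";
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines, keep the carry for the next line
            }
            line = carry + line;
            int i = 0;
            // i + size <= length means a full chunk still fits.
            for (; i + size <= line.length(); i += size) {
                // merge inserts 1 for a new key, otherwise adds 1 to the old count.
                frequency.merge(line.substring(i, i + size), 1, Integer::sum);
            }
            carry = line.substring(i); // empty when the line divided evenly
        }
        if (!carry.isEmpty()) {
            frequency.merge(carry, 1, Integer::sum);
        }
        return frequency;
    }

    public static void main(String[] args) {
        // Assumed sample input with a blank line in the middle.
        List<String> lines = Arrays.asList("CCACAC", "", "ACGT");
        System.out.println(countChunks(lines, 4));
    }
}
```

For the sample input, the chunks are `CCAC`, then `ACAC` (the carried `AC` plus the first two characters of the next line), and finally the short leftover `GT`.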
Alternatively, bound the inner loop so that the substring never runs past the end of the string:
for(int i = 0; i < sequence - 4; i++)
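Since the question asks for every overlapping subsequence of length 4, a sliding window with a bounded loop may be closer to what is wanted than fixed chunks. The sketch below is an illustration under that assumption; `KmerCounter` and `countKmers` are hypothetical names. The bound is written as `i + k <= sequence.length()`, which for a window of length `k` keeps every `substring(i, i + k)` call in range, including the last full window.

```java
import java.util.HashMap;
import java.util.Map;

class KmerCounter {

    // Hypothetical helper: counts every overlapping window of length k.
    // A sequence shorter than k (including the empty string) yields an empty map.
    static Map<String, Integer> countKmers(String sequence, int k) {
        Map<String, Integer> frequency = new HashMap<>();
        for (int i = 0; i + k <= sequence.length(); i++) {
            // The window advances by one character, so consecutive windows overlap.
            frequency.merge(sequence.substring(i, i + k), 1, Integer::sum);
        }
        return frequency;
    }

    public static void main(String[] args) {
        // Assumed sample sequence; its windows are CCAC, CACA, ACAC, CACA.
        System.out.println(countKmers("CCACACA", 4));
        // An empty line produces no windows and no exception.
        System.out.println(countKmers("", 4).isEmpty()); // prints true
    }
}
```

With the loop bounded this way, blank or short lines need no special handling and no `StringIndexOutOfBoundsException` has to be caught.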
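Another approach is to put the extraction in a try block:
// Assumes reader is a java.util.Scanner and that line and frequency
// are declared as in the question.
while(reader.hasNextLine())
{
    line = reader.nextLine();
    for(int i = 0; i < line.length(); i++)
    {
        String subsequence = "";
        // put the extract operation in a try block
        // to avoid crashing
        try
        {
            subsequence = line.substring(i, i+4);
        }
        catch(StringIndexOutOfBoundsException e)
        {
            // fewer than 4 characters remain on this line, so stop
            // scanning it instead of counting an empty subsequence
            break;
        }
        if(frequency.containsKey(subsequence))
        {
            frequency.put(subsequence, frequency.get(subsequence) +1);
        }
        else
        {
            frequency.put(subsequence, 1);
        }
    }
}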