Printing the sum of a string count across multiple lines in Python


This is a simple problem, but I just can't solve it. I want to count the number of A's for each sequence. See the example below:

This is my input:

>sca200
ACACGTGYNNNN
ACGTCCCGWCNN
NNNNNNNNNA
>scaf500
AAAAAAAAAAAA
TTTTTTTTTTTT
WCWCWNNNN
>scaf201
AACACACACACC
GTGTGTGTGTGT
WWRRRYNNNNNN
NNNNNN
Code:

The output is:

sca200
2
3
4
scaf500
12
12
12
scaf201
6
6
6
6
How can I make it report only the final number for each sequence? That is:

sca200
4
scaf500
12
scaf201
6

Try going through the input line by line and printing the total only when a line starts with ">". This will solve your problem:

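#!/usr/bin/python
from __future__ import print_function
import sys

total_A = None  # None means no header has been seen yet
with open(sys.argv[1], "r") as fasta:
    for line in fasta:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, so print its total first
            if total_A is not None:
                print(total_A)
            total_A = 0
            print(line[1:])
        elif total_A is not None:
            total_A += line.count('A')

# The last record has no following header, so print its total after the loop
if total_A is not None:
    print(total_A)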
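You only need to print the total count of A when a new FASTA header starts, and once more after the loop for the last record.

Note: edited to address the comments raised by @Lafexlos.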
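The same accumulate-and-flush idea can also be wrapped in a function that returns the totals instead of printing them, which makes it easier to check. This is only a minimal sketch: count_base_per_record is a hypothetical helper name, not part of the answer above, and the sample lines are copied from the question.

```python
def count_base_per_record(lines, base="A"):
    """Return a list of (record_name, count) pairs, one per '>' header.

    Hypothetical helper for illustration; not part of the original answer.
    """
    totals = []
    name, count = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if name is not None:
                totals.append((name, count))
            name, count = line[1:], 0
        elif name is not None:
            count += line.count(base)
    # Flush the last record, which has no header after it
    if name is not None:
        totals.append((name, count))
    return totals


sample = """>sca200
ACACGTGYNNNN
ACGTCCCGWCNN
NNNNNNNNNA
>scaf500
AAAAAAAAAAAA
TTTTTTTTTTTT
WCWCWNNNN
>scaf201
AACACACACACC
GTGTGTGTGTGT
WWRRRYNNNNNN
NNNNNN
""".splitlines()

for record_name, total in count_base_per_record(sample):
    print(record_name)
    print(total)
```

Sequence lines that appear before any header are ignored here; that is a choice made for this sketch, not something the question specifies.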

Here is a one-liner solution:

from __future__ import print_function

import sys
import re

with open(sys.argv[1], 'r') as f:
    data = f.read()

"""
1.  find all blocks of text and split it into two groups: (block_name, corresponding_TEXT)
2.  loop through blocks
3.  print 'block_name' and the length of list containing all 'A's from the corresponding_TEXT
"""

[   print('{0}\n{1}'.format(name, len(re.findall(r'A', txt, re.M)))) 
    for name, txt in re.findall(r'>(sca[^\n]*)([^>]*)', data, re.M)
]
Output:

sca200
4
scaf500
12
scaf201
6
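To see what the outer re.findall in this answer yields, the same pattern can be run on a short in-memory string. The sketch below assumes a shortened copy of the question's input held in a variable named data, and uses str.count in place of the inner re.findall, which gives the same number for a single letter.

```python
import re

# Shortened copy of the question's input (first two records only)
data = ">sca200\nACACGTGYNNNN\nACGTCCCGWCNN\nNNNNNNNNNA\n>scaf500\nAAAAAAAAAAAA\n"

# With two capture groups, re.findall returns a list of (name, text) tuples:
# group 1 is the header after '>', group 2 is everything up to the next '>'
blocks = re.findall(r'>(sca[^\n]*)([^>]*)', data)

# Count the letter A in the text belonging to each header
counts = [(name, txt.count('A')) for name, txt in blocks]

print(blocks[0])
print(counts)
```

Note that the pattern only matches headers that begin with "sca", as in the question's data; headers with other prefixes would be skipped.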

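This will raise an error, because total_A is not even defined when you try to print it. Also, if you move the print below total_A = 0, it will print 0.

After your first edit: the last value is now not printed, so you should add an extra print total_A after the for loop.

That's true! Good catch!

I don't think file objects have a split attribute; at least I couldn't get that to work.

Oh, sorry, I thought fasta was a string (I didn't notice the open :) ). In that case, read the file contents into a string named fasta.

@m.antkowicz, I've added comments to the code. I hope it's easier to read now… ;)

Please feel free to accept and/or upvote the answers that helped.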