Generator functions in Python

Tags: python, function, iterator, generator, string-matching

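I'm currently working on a problem set from MIT OpenCourseWare in which the task is to find matching substrings in DNA sequences.

I'm struggling to write a function that returns the subsequences of length k. I can get it to work with strings, but the problem is set up to use iterators, and when I use an iterator the function seems to reset every time instead of resuming from where it left off.

Here is a correct function I wrote that works with strings: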

def subs(seq, k):
    subseq = ''
    pos = 0
    while pos < len(seq):
        while len(subseq) < k:
            subseq += seq[pos]
            pos += 1
        yield subseq, pos - k
        subseq = subseq[1:] 
My current solution is:

def subsequenceHashes(seq, k):
    subseq = ''
    pos = 0
    print 'Start of subseqHashes'
    try:
        while True:
            while len(subseq) < k:
                subseq += seq.next()
                pos += 1
            print subseq, pos - k
            yield hash(subseq), pos - k
            subseq = subseq[1:]
    except StopIteration:
        return
Here is what happens when I run the tests:

Start of subseqHashes


yab 0
Start of subseqHashes


xxa 0
starting
iterate
Start of subseqHashes


cab 0
0
iterate
Start of subseqHashes


cab 0
0
iterate
Start of subseqHashes


F..
======================================================================
FAIL: test_one (__main__.TestExactSubmatches)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Alex\Desktop\Pythonwork\6.006\ps4\dist\test_dnaseq.py", line 32, in test_one
    self.assertTrue(len(matches) == len(correct))
AssertionError: False is not true

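What seems to be going wrong is that subsequenceHashes is reset every time I call .next(), because it has an iterator in its body, instead of staying in the loop as it does when I use a string.

As @jornsharpe pointed out, my mistake was calling the generator function multiple times instead of actually iterating over it.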
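
A minimal illustration of that mistake, written for Python 3 (so `next(gen)` replaces the Python 2 `gen.next()` used in the question). The generator `count_up` is a made-up toy example, not part of the problem set:

```python
def count_up(limit):
    """Toy generator: yields 0, 1, ..., limit - 1."""
    print('Start of count_up')
    for i in range(limit):
        yield i

# Mistake: each call builds a brand-new generator object, so every
# next() runs the body from the top and returns the first value again.
restarted = [next(count_up(3)) for _ in range(3)]

# Fix: call the generator function once, then keep using that object.
gen = count_up(3)
resumed = [next(gen) for _ in range(3)]

print(restarted)  # [0, 0, 0]
print(resumed)    # [0, 1, 2]
```

The "Start of count_up" line is printed three times for the first list and only once for the second, which mirrors the repeated "Start of subseqHashes" lines in the test output.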

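Comments:

Every time you call it, e.g. subsequenceHashes(b, k), it starts over again. You should create it once at the start of the function.

The DNA sequences I'll be comparing are tens of millions of nucleotides long, and the problem set suggests writing generator functions.

Yes, but you should call the generator function only once. After that you just want to iterate over it, not keep restarting it. Start with gen_a = subsequenceHashes(a, k) and go from there.

Note that a string of only tens of millions of characters fits into memory easily. You should try that simple solution first and switch to generators/iterators only if you really do run into memory problems.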
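
The simple in-memory approach from the last comment could look roughly like the sketch below. It assumes Python 3 and that both sequences are ordinary strings. The name `exact_submatches` is hypothetical, and a plain dictionary of lists stands in for whatever data structure the problem set provides:

```python
from collections import defaultdict

def exact_submatches(a, b, k):
    """Yield (apos, bpos) for every length-k substring shared by a and b."""
    # Map each length-k substring of a to all of its start positions.
    positions = defaultdict(list)
    for i in range(len(a) - k + 1):
        positions[a[i:i + k]].append(i)
    # Look up each length-k substring of b in that table.
    for j in range(len(b) - k + 1):
        for i in positions.get(b[j:j + k], ()):
            yield i, j

print(sorted(exact_submatches('abcab', 'cabx', 2)))
# [(0, 1), (2, 0), (3, 1)]
```

The table keeps one key per distinct substring of `a`, so its memory use grows with the number of distinct substrings. Whether that is acceptable for the real inputs is something to measure rather than assume.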
The calling function from the question, which requests a new generator on every step:

def getExactSubmatches(a, b, k, m): 
    # a and b are the strings compared, k is the length of substring, parameter m is unused, need it for later on in the problem set
    ahash, apos = subsequenceHashes(a, k).next()
    bhash, bpos = subsequenceHashes(b, k).next()
    multidict = Multidict()
    print 'starting'
    while ahash:
        print 'iterate'
        multidict.put(ahash, ('a', apos))
        ahash, apos = subsequenceHashes(a, k).next()
        print apos
    while bhash:
        multidict.put(bhash, ('b', bpos))
        bhash, bpos = subsequenceHashes(b, k).next()
    for key in multidict.mydict:
        if len(multidict.get(key)) > 1:
            for t in multidict.get(key):
                if t[0] == 'a':
                    for s in multidict.get(key):
                        if s[0] == 'b':
                            if a[apos:apos+k] == b[bpos:bpos+k]:
                                print apos, bpos
                                yield apos, bpos
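
One way to restructure this so that each generator is created once and then iterated, as the comments suggest. This is a sketch under several assumptions: Python 3, `a` and `b` are strings, a `defaultdict` of lists stands in for the problem set's `Multidict` (whose interface is not shown here), and the unused parameter `m` is dropped. The names `subsequence_hashes` and `get_exact_submatches` are renamed variants for illustration only:

```python
from collections import defaultdict

def subsequence_hashes(seq, k):
    """Yield (hash, start) for each length-k window of the iterable seq."""
    window = ''
    pos = 0
    for ch in seq:
        window += ch
        pos += 1
        if len(window) == k:
            yield hash(window), pos - k
            window = window[1:]  # slide the window forward by one

def get_exact_submatches(a, b, k):
    """Yield (apos, bpos) where a[apos:apos+k] == b[bpos:bpos+k]."""
    table = defaultdict(list)  # assumed stand-in for Multidict
    # Each generator is created once and consumed by a single for loop.
    for ahash, apos in subsequence_hashes(iter(a), k):
        table[ahash].append(apos)
    for bhash, bpos in subsequence_hashes(iter(b), k):
        for apos in table.get(bhash, ()):
            # Equal hashes do not guarantee equal substrings, so verify.
            if a[apos:apos + k] == b[bpos:bpos + k]:
                yield apos, bpos

print(sorted(get_exact_submatches('abcab', 'cabx', 2)))
# [(0, 1), (2, 0), (3, 1)]
```

Besides iterating instead of restarting, the sketch compares the positions stored in the table rather than whatever `apos` and `bpos` happened to hold last, and it uses `for` loops so that a hash value of 0 cannot end the loop early the way `while ahash:` would.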