Python 2.7: As a fledgling Pythonista, I don't understand why I'm getting an infinite loop

python-2.7

This code always ends up in an infinite loop:

pos1 = 0
pos2 = 0
url_string = '''<h1>Daily News </h1><p>This is the daily news.</p><p>end</p>'''
i = int(len(url_string))
#print i  # debug
while i > 0:
    pos1 = int(url_string.find('>'))
    #print pos1 # debug
    pos2 = int(url_string.find('<', pos1))
    #print pos2  # debug
    url_string = url_string[pos2:]
    #print url_string  # debug
    print int(len(url_string))  # debug
    i =  int(len(url_string))

Once find no longer finds a '<', it returns -1, so the slice becomes url_string[-1:], which is a slice consisting of the last character of url_string. From that point on Python keeps looping without finding anything and, as @user2357112 pointed out, you never get past the end of the string.
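To make the stuck state visible, here is a minimal sketch that replays the loop from the question with an iteration cap. The function name trace and the max_steps parameter are made up for this illustration; the body is otherwise the same three steps as in the question, written so that it also runs on Python 3.

```python
def trace(url_string, max_steps=10):
    """Replay the question's loop, but give up after max_steps iterations."""
    steps = []
    for _ in range(max_steps):
        if len(url_string) == 0:
            break
        pos1 = url_string.find('>')
        pos2 = url_string.find('<', pos1)   # -1 once no '<' is left
        url_string = url_string[pos2:]      # [-1:] keeps the last character
        steps.append((pos1, pos2, url_string))
    return steps


html = '<h1>Daily News </h1><p>This is the daily news.</p><p>end</p>'
for pos1, pos2, rest in trace(html):
    print((pos1, pos2, len(rest), rest))
```

From the seventh iteration on, every step is (0, -1, '>'): the string never shrinks below one character, so the length test in the original while condition can never fail.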
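There are several solutions, but a simple one (given that I don't know what you are trying to achieve) is to include pos1 and pos2 in the loop condition:
while i > 0 and pos1 >= 0 and pos2 >= 0:

The loop then stops as soon as either of the characters you are looking for is not found.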
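A variant of the same idea, sketched here as an assumption rather than as the answer's exact code, is to test each find result right away and leave the loop before slicing with -1. The function name strip_through_tags is hypothetical.

```python
def strip_through_tags(url_string):
    """Trim the string as in the question, stopping once find() fails."""
    remainders = []
    while len(url_string) > 0:
        pos1 = url_string.find('>')
        if pos1 < 0:
            break                       # no '>' left
        pos2 = url_string.find('<', pos1)
        if pos2 < 0:
            break                       # no '<' after the last '>'
        url_string = url_string[pos2:]
        remainders.append(url_string)
    return remainders


html = '<h1>Daily News </h1><p>This is the daily news.</p><p>end</p>'
for rest in strip_through_tags(html):
    print((len(rest), rest))
```

Checking inside the loop avoids the one extra slice that the while-condition version still performs before it notices the -1.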
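It is easier to split the string and count the letters like this:
map(len, url_string.split('<')) # This equals [0, 14, 4, 25, 3, 5, 3]

This works for single-character splits.
Edit
As was pointed out, the requirement is to extract only the things that are not part of a tag. Then a one-liner: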
''.join( map(lambda x: x.split('>')[-1] ,  url_string.split('<')) )
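For readers on Python 3, where map returns an iterator instead of a list, the same two steps can be sketched with comprehensions. The variable names parts, lengths and text are only for illustration.

```python
html = '<h1>Daily News </h1><p>This is the daily news.</p><p>end</p>'

# Everything between one '<' and the next; the first piece is empty
# because the string starts with a tag.
parts = html.split('<')

# Length of each piece, as in the map(len, ...) call.
lengths = [len(p) for p in parts]

# Keep only what follows the closing '>' of each piece, i.e. the text
# outside the tags.
text = ''.join(p.split('>')[-1] for p in parts)

print(lengths)
print(text)
```

Note that the pieces of text are joined without any separator, so "news." and "end" run together in the result.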

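It looks like you are trying to parse HTML to get data out of elements (e.g. I want the data inside the h1 tag, such as "Daily News"). If that is the case, I would suggest using another library called BeautifulSoup4 for this.
That said, since I'm not sure exactly what this program is supposed to do, I broke your code down so that you can hopefully see more easily what is happening to your variables (for now, with the while loop removed). This lets you see exactly what your code does without the infinite loop.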
# Setup Variables
pos1 = 0
pos2 = 0
url_string = '''<h1>Daily News </h1><p>This is the daily news.</p><p>end</p>'''
i = int(len(url_string)) # the url_string length is 60 characters
print "Setting up Variables with string at ", i, " characters"
print "String is: ", url_string

"""string.find(s, sub[, start[, end]])
Return the lowest index in s where the substring sub is found such that sub is 
wholly contained in s[start:end]. Return -1 on failure. Defaults for start and 
end and interpretation of negative values is the same as for slices.

Source: http://docs.python.org/2/library/string.html
"""

print "Running through program first time"
pos1 = int(url_string.find('>'))
# This finds the first occurrence of '>', which is at position 3

pos2 = int(url_string.find('<', pos1))
# This finds the first occurrence of '<' after position 3 ('>'),
# which is at position 15
print "Pos1 is at:", pos1, " and pos2 is at:", pos2

url_string = url_string[pos2:] # trimming string down?
print "The string is now: ", url_string
# </h1><p>This is the daily news.</p><p>end</p>

print "The string length is now: ", int(len(url_string)) # string length now 45
i = int(len(url_string)) # updating the length var to the new length
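If the goal really is to pull the text out of the elements, a parser avoids the index bookkeeping altogether. The sketch below swaps in the standard library's HTMLParser instead of BeautifulSoup4 so that it needs no installation; it uses the Python 3 import (in Python 2.7 the module is named HTMLParser), and the class name TextCollector is made up for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text found between tags, ignoring the tags themselves."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.chunks = []

    def handle_data(self, data):
        # Called by the parser for each run of text outside a tag.
        self.chunks.append(data)


html = '<h1>Daily News </h1><p>This is the daily news.</p><p>end</p>'
collector = TextCollector()
collector.feed(html)
collector.close()
print(collector.chunks)
```

Each entry in chunks corresponds to the text of one element, which is usually easier to work with than one joined string.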

Code from the split-based answer, giving the positions of the tags:

import numpy as np
lens = np.cumsum( map(len, url_string.split('<')) )
lens = lens + np.arange(len(lens))

What is your debug output? It should be a big hint: print url_string

Note: the int casts are not needed, and HTML code is not a "url".

For a separator of more than one character, you need to modify the last line to lens = lens + len(sep) * np.arange(len(lens)), where sep stands for the separator string.

Doesn't this also include all the tags? I believe the idea is to output everything that is not a tag.

These are the positions of the tags. I think I misunderstood the question. Give me a moment.

I'll look at the code again... In that case, a one-liner: ''.join( map(lambda x: x.split('>')[-1] ,  url_string.split('<')) )
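The position arithmetic can also be sketched without NumPy, which makes the adjustment for multi-character separators explicit. The function name separator_positions is hypothetical, and unlike the cumsum version this sketch does not append the end of the string as a final entry.

```python
def separator_positions(text, sep):
    """Return the start index of every occurrence of sep found by split()."""
    positions = []
    offset = 0
    # The last piece has no separator after it, so it is skipped.
    for part in text.split(sep)[:-1]:
        offset += len(part)
        positions.append(offset)
        offset += len(sep)      # step over the separator itself
    return positions


html = '<h1>Daily News </h1><p>This is the daily news.</p><p>end</p>'
print(separator_positions(html, '<'))
print(separator_positions(html, '</p>'))
```

Adding len(sep) after every piece is the same correction as multiplying the running index by the separator length in the NumPy one-liner.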
