Parsing between HTML tags with re.findall in Python causes an EOF error


I want to use re.findall to extract all the link extensions from the following text and store the results in a list.

my_text = <td class="1stclass"> <div class="2ndclass"> <div class="2ndclass__img"><a href="link_extension_1.php"><div class="3rdclass"><img alt="hello" border="0" class="image" height="42" src="https://yoyo.jpg"/></div></a></div> <div class="2ndclass__content"><p><a href="link_extension_1.php"></a> </p> </div> <div class="2ndclass__compare"><label for="comparer2" style="font-size:11px;"><input class="js__media__compare__input" id="comparer2" name="comparer" type="checkbox" value="89453"/> Comparer</label></div> </div></td>
<td class="1stclass"> <div class="2ndclass"> <div class="2ndclass__img"><a href="link_extension_2.php"><div class="3rdclass"><img alt="hello" border="0" class="image" height="42" src="https://yoyo.jpg"/></div></a></div> <div class="2ndclass__content"><p><a href="link_extension_2.php"></a> </p> </div> <div class="2ndclass__compare"><label for="comparer2" style="font-size:11px;"><input class="js__media__compare__input" id="comparer2" name="comparer" type="checkbox" value="89453"/> Comparer</label></div> </div></td>
<td class="1stclass"> <div class="2ndclass"> <div class="2ndclass__img"><a href="link_extension_3.php"><div class="3rdclass"><img alt="hello" border="0" class="image" height="42" src="https://yoyo.jpg"/></div></a></div> <div class="2ndclass__content"><p><a href="link_extension_3.php"></a> </p> </div> <div class="2ndclass__compare"><label for="comparer2" style="font-size:11px;"><input class="js__media__compare__input" id="comparer2" name="comparer" type="checkbox" value="89453"/> Comparer</label></div> </div></td>
I tried:

re.findall(r'\<div class="2ndclass__img"><a href="(.*?)\"><div', my_text)
But I get an error:

SyntaxError: unexpected EOF while parsing

Thank you, Max


Your regex works fine for me:

>>> re.findall(r'\<div class="2ndclass__img"><a href="(.*?)\"><div', my_text)
['link_extension_1.php', 'link_extension_2.php', 'link_extension_3.php']
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(my_text, "html.parser")
>>> [div.find('a').get('href') for div in soup.find_all('div', {'class': "2ndclass__img"})]
['link_extension_1.php', 'link_extension_2.php', 'link_extension_3.php']
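
Since the pattern itself runs without problems, the SyntaxError most likely comes from the surrounding code rather than from the regex. "unexpected EOF while parsing" generally signals an incomplete statement, such as an unclosed parenthesis or bracket. The question does not show how my_text was actually assigned, so the following is only a minimal sketch under assumptions: the markup is a shortened, hypothetical version of the question's HTML held in a triple-quoted string, and the class name ImgLinkParser is invented for illustration. It runs the same regex (without the unnecessary backslashes) and, as a dependency-free stand-in for the BeautifulSoup approach above, collects the same links with the standard library's html.parser.

```python
import re
from html.parser import HTMLParser

# Hypothetical, shortened version of the question's markup. A triple-quoted
# string lets the embedded double quotes and newlines stay unescaped.
my_text = """
<td class="1stclass"> <div class="2ndclass"> <div class="2ndclass__img"><a href="link_extension_1.php"><div class="3rdclass"><img alt="hello" src="https://yoyo.jpg"/></div></a></div> </div></td>
<td class="1stclass"> <div class="2ndclass"> <div class="2ndclass__img"><a href="link_extension_2.php"><div class="3rdclass"><img alt="hello" src="https://yoyo.jpg"/></div></a></div> </div></td>
"""

# Same pattern as in the question; "<" and '"' need no backslash in a regex.
pattern = r'<div class="2ndclass__img"><a href="(.*?)"><div'
regex_links = re.findall(pattern, my_text)


class ImgLinkParser(HTMLParser):
    """Collect the href of the first <a> following each div.2ndclass__img."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._expect_link = False  # True right after a matching <div> opens

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "2ndclass__img":
            self._expect_link = True
        elif tag == "a" and self._expect_link:
            self.links.append(attrs.get("href"))
            self._expect_link = False


parser = ImgLinkParser()
parser.feed(my_text)
parser_links = parser.links

print(regex_links)
print(parser_links)
```

Both lists should contain the two href values in document order. The parser-based version tolerates changes in whitespace or attribute order that would break the literal regex, which is the usual argument for preferring an HTML parser for this kind of task.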