Python: retrieving a string between whitespace


python regex


I have a string stored in the variable tbody, as shown below:

tbody = 
'...
</td>
<td class="Details clearfix">
<div>
<b>

9. I want this text and number

            </b>
</div>
</td>
<td class="flux">
...'

>print type(tbody)
<type 'str'>

This is the result I get:

[...'<td class="Details clearfix">', '<div>', '<b>',
'\\', '9. I want this text and number', '\\', '                </b>', '</div>',
'</td>', '<td class="flux>'...]


I don't think splitting is necessary; you can get the expected output by searching with Beautiful Soup:

from bs4 import BeautifulSoup

tbody = """
<td class="Details clearfix">
<div>
<b>
9. I want this text and number
</b>
</div>
</td>
"""
soup = BeautifulSoup(tbody, "html.parser")
# class_ (trailing underscore) is required because class is a Python keyword
for item in soup.find_all('td', class_="Details clearfix"):
    print(item.div.b.text.strip())
# Output: 9. I want this text and number

You can do this with Python's re module by using the DOTALL modifier.

>>> import re
>>> m = re.search(r'<td.*?>.*?<b>\s*([^\n]*).*<\/b>.*?<\/td>', tbody, re.DOTALL)
>>> m.group(1)
'9. I want this text and number'
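
To see why the DOTALL flag matters here, the following is a minimal, self-contained sketch. It assumes a shortened sample string modelled on the one in the question; the names `pattern`, `no_flag` and `text` are illustrative only and do not come from the original answer.

```python
import re

# Shortened sample modelled on the question's string (surrounding rows omitted).
tbody = """</td>
<td class="Details clearfix">
<div>
<b>

9. I want this text and number

            </b>
</div>
</td>
<td class="flux">"""

pattern = r'<td.*?>.*?<b>\s*([^\n]*).*</b>.*?</td>'

# Without re.DOTALL, "." does not match "\n", so the pattern cannot
# cross the line breaks between the tags and the search finds nothing.
no_flag = re.search(pattern, tbody)

# With re.DOTALL, "." also matches "\n". The \s* skips the blank lines
# after <b>, and ([^\n]*) captures the rest of that line.
m = re.search(pattern, tbody, re.DOTALL)
text = m.group(1).strip() if m else None
print(text)
```

Because the group is `([^\n]*)`, only the first non-blank line inside `<b>` is captured; text spread over several lines would need a different group.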
