Python + Regex: AttributeError: 'NoneType' object has no attribute 'groups'

python regex

I have a string from which I want to extract a subset. This is part of a larger Python script.

This is the string:

import re

htmlString = '</dd><dt> Fine, thank you.&#160;</dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)'

Because Result.groups() does not work, the extractions I want to make (i.e. Result.group(5) and Result.group(7)) do not work either.

But I don't understand why I get this error. The regular expression works in TextWrangler, so why doesn't it work in Python? I'm a beginner in Python.

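You are getting an AttributeError because you are calling groups on None, which has no such method.

re.search returning None means that the regex could not find anything matching the pattern in the supplied string.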

When using regular expressions, it is good practice to check whether a match was actually found:

Result = re.search(SearchStr, htmlString)

if Result:
    print Result.groups()
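The check above can be wrapped in a small helper. The following is only a sketch, written for Python 3 (the thread's own snippets are Python 2); the function name safe_groups and the sample patterns are hypothetical and not part of the original answer. It shows that a failed search returns None, and that calling groups() on that None raises the AttributeError from the title.

```python
import re


def safe_groups(pattern, text, flags=0):
    """Return the captured groups as a tuple, or None when nothing matches."""
    match = re.search(pattern, text, flags)
    if match is None:
        # No match: there is no match object to call .groups() on.
        return None
    return match.groups()


# A search that succeeds yields the captured groups.
print(safe_groups(r"(\d+)-(\d+)", "range 10-20"))

# A search that fails yields None instead of raising.
print(safe_groups(r"(\d+)-(\d+)", "no digits here"))

# Without the guard, the failed search raises AttributeError.
try:
    re.search(r"(\d+)", "no digits here").groups()
except AttributeError as exc:
    print("AttributeError:", exc)
```

Returning None from the helper keeps the "no match" case explicit, so the caller still has to decide what to do when the pattern does not apply.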

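import re

htmlString = '</dd><dt> Fine, thank you.&#160;</dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)'

SearchStr = '(\<\/dd\>\<dt\>)+ ([\w+\,\.\s]+)([\&\#\d\;]+)(\<\/dt\>\<dd\>)+ ([\w\,\s\w\s\w\?\!\.]+) (\(\<i\>)([\w\s\,\-]+)(\<\/i\>\))'

Result = re.search(SearchStr.decode('utf-8'), htmlString.decode('utf-8'), re.I | re.U)

print Result.groups()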

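This is how it works. The expression has to match non-Latin characters, so it usually fails. You have to decode to Unicode and use the re.U (Unicode) flag.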
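The effect of Unicode-aware matching can be reproduced in Python 3 as well. This sketch assumes Python 3, where str is already Unicode and \w matches accented letters by default; it uses the re.ASCII flag to imitate the ASCII-only matching that caused the failure. The pattern and string are the ones from the question, only split across lines and written as raw strings; the variable names are illustrative.

```python
import re

html_string = ('</dd><dt> Fine, thank you.&#160;</dt><dd> '
               'Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)')

# The pattern from the question, as a raw string so that the backslashes
# reach the regex engine unchanged.
search_str = (r'(\<\/dd\>\<dt\>)+ ([\w+\,\.\s]+)([\&\#\d\;]+)(\<\/dt\>\<dd\>)+ '
              r'([\w\,\s\w\s\w\?\!\.]+) (\(\<i\>)([\w\s\,\-]+)(\<\/i\>\))')

# ASCII-only matching: \w does not match 'é' or 'à', so the search fails.
ascii_result = re.search(search_str, html_string, re.I | re.ASCII)
print(ascii_result)

# Unicode matching (the Python 3 default for str patterns): the search succeeds.
unicode_result = re.search(search_str, html_string, re.I)
if unicode_result:
    print(unicode_result.group(5))
    print(unicode_result.group(7))
```

Under these assumptions the first search prints None, while the second yields the Catalan phrase in group 5 and its pronunciation in group 7.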

I'm a beginner too, and I have run into this problem several times myself.

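Try decoding your htmlString into Unicode, to avoid problems with the () in (mohl behh, GRAH-syuhs). I have tried '(' and '\(' but neither seems to work.

Wow, I didn't realise that all I needed was if Result: ... why does that statement work? Reading it, it feels as though the rest of the if statement is missing.

Because it is effectively checking object != None, I think. If the statement is true, the if part is entered.

@Fumbles - it is worth reading up on; there are a lot of neat tricks you can do with if statements. For example, to test whether a list is non-empty, just write if my_list: and the body will only run when the list is not empty. Empty lists, empty strings, the integer 0 or, in this case, a failed regex match (None) are all considered "falsy", so they fail the truth test in the condition.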
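The truth-value behaviour described in these comments can be checked directly. A small sketch, assuming Python 3; the labels and sample values are made up for illustration.

```python
import re

# Sample values: the first five are falsy, the last two are truthy.
candidates = {
    "empty list": [],
    "empty string": "",
    "zero": 0,
    "None": None,
    "failed search": re.search(r"\d+", "no digits"),
    "non-empty list": [1],
    "successful search": re.search(r"\d+", "abc 42"),
}

# bool() applies the same truth test that an if statement uses.
truthiness = {label: bool(value) for label, value in candidates.items()}

for label, is_truthy in truthiness.items():
    print(label, "->", "truthy" if is_truthy else "falsy")
```

A match object is always truthy, and a failed search is None, which is falsy. That is why if Result: is enough to guard the call to Result.groups().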