Why does Python return NoneType?


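I am trying to scrape information from Wikipedia with the function below, but I run into an AttributeError because a function call returns None. Can anyone explain why None is returned in this case?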

import wikipedia as wp
import string

def add_section_info(search):
    HTML = wp.page(search).html().encode("UTF-8") #gets HTML source from Wikipedia

    with open("temp.xml",'w') as t: #write HTML to xml format
        t.write(HTML)

    table_of_contents = []
    dict_of_section_info = {}

    #This extracts the info in the table of contents
    with open("temp.xml",'r') as r:
        for line in r:
            if "toclevel" in line: 
                new_string = line.partition("#")[2]
                content_title = new_string.partition("\"")[0]
                tbl = string.maketrans("_"," ")
                content_title = content_title.translate(tbl)
                table_of_contents.append(content_title)

    print wp.page(search).section("Aortic rupture") #this is None, but shouldn't be

    for item in table_of_contents:
        section = wp.page(search).section(item).encode("UTF-8")
        print section
        if section == "":
            continue
        else:
            dict_of_section_info[item] = section

    with open("Section_Info.txt",'a') as sect:
        sect.write(search)
        sect.write("------------------------------------------\n")
        for item in dict_of_section_info:
            sect.write(item)
            sect.write("\n\n")
            sect.write(dict_of_section_info[item])
        sect.write("####################################\n\n")

add_section_info("Abdominal aortic aneurysm")

What I don't understand is that if I run, for example, add_section_info("HIV"), it works perfectly.

The source code of the imported wikipedia module is publicly available.

My output from the code above is as follows:

Abdominal aortic aneurysm

Signs and symptoms
Traceback (most recent call last):
  File "/home/pharoslabsllc/Documents/wikitest.py", line 79, in <module>
    add_section_info(line)
  File "/home/pharoslabsllc/Documents/wikitest.py", line 30, in add_section_info
    section = wp.page(search).section(item).encode("UTF-8")
AttributeError: 'NoneType' object has no attribute 'encode'


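wp.page(search).section(item) cannot find the section you are looking for and returns None. You do not check for that and instead try to handle the value as a string; that is what raises the AttributeError.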
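A minimal sketch of the missing check, in Python 3 syntax. The helper name collect_sections and the FakePage stand-in are hypothetical and exist only so the example runs without network access; the only library behavior assumed is the one described above, namely that section() returns None for a title it cannot find.

```python
def collect_sections(page, titles):
    """Map each title to its section text, skipping titles the page cannot resolve."""
    found = {}
    for title in titles:
        text = page.section(title)
        if text is None or text == "":
            continue  # unknown or empty section: nothing to encode or store
        found[title] = text
    return found


class FakePage:
    """Hypothetical stand-in for a wikipedia page object (illustration only)."""

    def __init__(self, sections):
        self._sections = sections

    def section(self, title):
        # Mirrors the behavior described above: None for an unknown title.
        return self._sections.get(title)


page = FakePage({"Signs and symptoms": "placeholder text"})
result = collect_sections(page, ["Signs and symptoms", "Aortic rupture"])
```

With a check like this in the loop, an unresolved title is skipped instead of reaching .encode() on None.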

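The page method never returns None (you can easily check this in the source code), but the section method does return None if the title cannot be found. See the documentation:

section(section_title)

Get the plain text content of a section from self.sections. Returns None if section_title isn't found, otherwise returns a whitespace stripped string.

So the answer is that, as far as the library is concerned, the Wikipedia page you are referring to has no section titled "Aortic rupture".

Looking at Wikipedia itself, the page does seem to have such a section.

Note that if you check the value of wp.page(search).sections, you get []. That is, the library does not seem to parse the sections correctly.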

In the library's source code you can see this test:

section = u"== {} ==".format(section_title)
try:
    index = self.content.index(section) + len(section)
except ValueError:
    return None

However:

In [14]: p.content.find('Aortic')
Out[14]: 3223

In [15]: p.content[3220:3220+50]
Out[15]: '== Aortic ruptureEdit ===\n\nThe signs and symptoms '

In [16]: p.section('Aortic ruptureEdit')
Out[16]: "The signs and symptoms of a ruptured AAA may includes severe pain in the lower back, flank, abdomen or groin. A mass that pulses with the heart beat may also be felt. The bleeding can leads to a hypovolemic shock with low blood pressure and a fast heart rate. This may lead to brief passing out.\nThe mortality of AAA rupture is up to 90%. 65–75% of patients die before they arrive at hospital and up to 90% die before they reach the operating room. The bleeding can be retroperitoneal or into the abdominal cavity. Rupture can also create a connection between the aorta and intestine or inferior vena cava. Flank ecchymosis (appearance of a bruise) is a sign of retroperitoneal bleeding, and is also called Grey Turner's sign.\nAortic aneurysm rupture may be mistaken for the pain of kidney stones, muscle related back pain."
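Note the Edit == at the end of the heading. In other words, the library has a bug: it does not account for the edit link that ends up in the heading text.
The same code works for the HIV page because on that page the headings have no edit link next to them. I don't know why that is. Either way, it looks like a bug or a shortcoming of the library, so you should open a ticket on its issue tracker.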
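The effect of the stray Edit suffix can be reproduced on a plain string. The sketch below, in Python 3 syntax, re-creates only the exact-match test quoted from the library source, not the library's full section method; the function name lookup and the sample content string are assumptions made for illustration, with the sample shaped like the p.content excerpt above.

```python
def lookup(content, section_title):
    """Re-creation of the exact-match heading test, applied to a plain string."""
    marker = "== {} ==".format(section_title)
    try:
        # Position just past the heading marker, as in the quoted test.
        index = content.index(marker) + len(marker)
    except ValueError:
        return None
    return index


# Sample text shaped like the p.content excerpt shown above.
content = "intro\n\n=== Aortic ruptureEdit ===\n\nThe signs and symptoms "

# "== Aortic rupture ==" never occurs, because "Edit" sits before the closing "==".
plain = lookup(content, "Aortic rupture")

# "== Aortic ruptureEdit ==" does occur, so the lookup succeeds.
with_edit = lookup(content, "Aortic ruptureEdit")
```

Because the test is an exact substring match, any extra text inside the heading makes the lookup fail, which is why appending Edit to the title works.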
In the meantime, you can use a simple workaround such as:

def find_section(page, title):
    res = page.section(title)
    if res is None:
        res = page.section(title + 'Edit')
    return res

and call this function instead of the .section method. However, this can only be a temporary fix.
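Can you tell us where this error occurs? Just add the traceback to the question.

Try print(repr(item)) in the loop where it fails.

You have hard-coded a value there. What happens if you print wp.page(search).section(item) instead of wp.page(search).section("Aortic rupture")?

If I print wp.page(search).section(item) inside the for loop, I get None. That is the part I don't understand: it should be text. Do you know why running add_section_info("HIV") works fine?

Even with "HIV", calling wp.page(search).sections returns [], which is why I came up with this workaround.

@MIT_noob This is due to the Edit links, see my last edit. If you look at the HIV page, most headings lack that link, so the library works fine. But I am not familiar with Wikipedia and how it displays content. I suggest you open a ticket in the library's issue tracker, since this is either a bug or an undocumented missing feature.

Thanks so much! Quick question: can you explain what p.content[3220:3220+50] does?

@MIT p.content is a string containing the text of the page. There I simply checked at which index Aortic appears, which happened to be 3223, so I used slicing to look at the content around that index. [3220:3220+50] simply takes the characters from index 3220 up to, but not including, index 3220+50.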
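The same find-then-slice inspection can be tried on any string. This is a small sketch in Python 3 syntax; the sample content string and the variable names are made up for illustration and are not the real page text.

```python
# Made-up sample text; the real p.content is the full page text.
content = "Overview text.\n\n=== Aortic ruptureEdit ===\n\nThe signs and symptoms of a ruptured AAA"

hit = content.find("Aortic")         # index of the first match, or -1 if absent
start = max(hit - 3, 0)              # step back a little to see what precedes the match
window = content[start:start + 50]   # at most 50 characters; the end index is exclusive

print(repr(window))                  # repr() makes newlines visible as \n
```

Printing with repr() is what makes the trailing Edit and the surrounding = characters easy to spot.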