Python: trying to replace the <em> tag with <a>

Tags: python, tags, beautifulsoup, replacewith

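From what I understand, the goal is to replace <em> with its text, not with a new <a> tag. In other words, an <a> element that contains an <em> child should keep the text but lose the <em> markup. The code from the question: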

import requests
import string
from bs4 import BeautifulSoup, Tag
[...]
def disease_spider(maxpages):
    i = 0
    while i <= maxpages:
        url = 'http://www.cdc.gov/DiseasesConditions/az/' + alpha[i] + '.html'
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text)
        for l in soup.findAll('a', {'class': 'noLinking'}):
            x = l.find("em")
            if x is not None:
                # Problem line: x is already the <em> tag, so x.em is None here
                return x.em.replaceWith(Tag('a'))

        i += 1
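
A minimal sketch of why the last line inside the question's loop fails, assuming a small hand-written HTML snippet (the snippet and the names link, nested, error and new_a are illustrative and not from the question). Since x is already the <em> tag, x.em searches inside it and returns None, so calling replaceWith on that result raises AttributeError. New tags are normally created with soup.new_tag() rather than by calling Tag directly.

```python
from bs4 import BeautifulSoup

html = '<a class="noLinking" href="#">Hib Infection (<em>Haemophilus influenzae</em>)</a>'
soup = BeautifulSoup(html, "html.parser")

link = soup.find("a", {"class": "noLinking"})
x = link.find("em")   # x is the <em> tag itself
nested = x.em         # searches for an <em> inside x and finds none
print(nested)         # None

error = None
try:
    x.em.replaceWith("text")   # same shape as the call in the question
except AttributeError as exc:
    error = exc
print(type(error).__name__)    # AttributeError

# Literal "replace <em> with <a>": build the tag via the soup, then swap it in.
new_a = soup.new_tag("a")
new_a.string = x.text
x.replace_with(new_a)
print(link.find("em"))         # None: the <em> is gone
```

Whether a nested <a> is actually wanted is a separate question; the sketch only shows the mechanics of the swap.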

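As a side note, the replacement may not be needed at all, because the .text of an <a> tag gives you the full text of the node, including the text of its children. If you do want to change the tree, select each <em> directly under an <a> with class noLinking and replace it with its own text using replace_with():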

for em in soup.select('a.noLinking > em'):
    em.replace_with(em.text)
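
The two lines above can be tried end to end as follows. This is a sketch that assumes the sample <a> element from this thread as input and the built-in "html.parser" backend; the variable names html and link are illustrative.

```python
from bs4 import BeautifulSoup

html = """
<a class="noLinking" href="http://www.cdc.gov/hi-disease/index.html">
    including Hib Infection (<em>Haemophilus influenzae</em> Infection)
</a>
"""
soup = BeautifulSoup(html, "html.parser")

# Replace every <em> that is a direct child of a.noLinking with its text.
for em in soup.select("a.noLinking > em"):
    em.replace_with(em.text)

link = soup.find("a", class_="noLinking")
print(link.find("em"))                     # None: no <em> is left
print(" ".join(link.get_text().split()))   # text with whitespace collapsed
```

The href and class attributes of the <a> are untouched; only the <em> wrapper is removed.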

Comments:

Is it possible to find all such tags inside list tags?

@ks4929 Yes. For example, replace 'a.noLinking > em' with 'li a.noLinking > em'.
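
A short sketch of the selector change suggested in the comment, assuming a made-up page with one link inside a list item and one outside (the markup and the name remaining are illustrative). The descendant selector limits the replacement to links that sit somewhere under an <li>.

```python
from bs4 import BeautifulSoup

html = """
<ul><li><a class="noLinking" href="#in-list">In list (<em>one</em>)</a></li></ul>
<p><a class="noLinking" href="#outside">Outside (<em>two</em>)</a></p>
"""
soup = BeautifulSoup(html, "html.parser")

# Only <em> tags whose parent a.noLinking is inside an <li> are matched.
for em in soup.select("li a.noLinking > em"):
    em.replace_with(em.text)

remaining = [em.text for em in soup.find_all("em")]
print(remaining)  # ['two']
```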
After the replacement, the <a> element from the example looks like this:

<a class="noLinking" href="http://www.cdc.gov/hi-disease/index.html">
    including Hib Infection (Haemophilus influenzae Infection)
</a>

IPython session showing that .text already returns the full text without any replacement:

In [1]: from bs4 import BeautifulSoup

In [2]: data = """
   ...:     <a class="noLinking" href="http://www.cdc.gov/hi-disease/index.html">
   ...:         including Hib Infection (<em>Haemophilus influenzae</em> Infection)   
   ...:     </a>
   ...: """

In [3]: soup = BeautifulSoup(data)

In [4]: print soup.a.text

        including Hib Infection (Haemophilus influenzae Infection)
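
The session above uses the Python 2 print statement. A rough Python 3 equivalent, assuming the built-in "html.parser" backend and collapsing the surrounding whitespace for readability (the names text and clean are illustrative):

```python
from bs4 import BeautifulSoup

data = """
<a class="noLinking" href="http://www.cdc.gov/hi-disease/index.html">
    including Hib Infection (<em>Haemophilus influenzae</em> Infection)
</a>
"""
soup = BeautifulSoup(data, "html.parser")

text = soup.a.text             # includes the text of the <em> child
clean = " ".join(text.split()) # collapse newlines and indentation
print(clean)
```

The tree itself is left unchanged here: the <em> tag is still present in soup, only the extracted string omits the markup.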