Python .不要在漂亮的汤里工作_Python_Replace_Beautifulsoup

Python .不要在漂亮的汤里工作

python replace

Python .不要在漂亮的汤里工作,python,replace,beautifulsoup,Python,Replace,Beautifulsoup,这只是代码的一部分。它不会替换div和a href的值。这是一个漂亮的汤类标签 soup = BeautifulSoup(ourUrl) dem = soup.findAll('p') for i in range(0,len(dem)-1): dk = dem[i] if ('<div') in dk: print "it here" dk =dk.re

这只是代码的一部分。它不会替换div和a href的值。这是一个漂亮的汤类标签

soup = BeautifulSoup(ourUrl)
dem = soup.findAll('p')
for i in range(0,len(dem)-1):
              dk = dem[i]


              if ('<div') in dk:
                   print "it here"
                   dk =dk.replace('<div','<!--')
                   dk =dk.replace('</div>','--->')
                   dem[i] = dk
for i in range(0,len(dem)-1):
              dk = dem[i]
              if ('<a href') in dk:
                   print "it here"
                   dk =dk.replace('<a href','<!--')
                   dk =dk.replace('</a>','--->')
                   dem[i] = dk

soup=BeautifulSoup（我们的URL）
dem=soup.findAll（'p'）
对于范围（0，长度（dem）-1）内的i：
dk=dem[i]
如果（“”）
dem[i]=dk
对于范围（0，长度（dem）-1）内的i：
dk=dem[i]
如果（'
如果要使用包含替换标记的注释替换元素，请使用新的bs4.comment（）
对象替换对象：
from bs4 import Comment

for para in soup.find_all('p'):
    for div in para.find_all('div'):
        div.replace_with(Comment(unicode(div)))
    for link in para.find_all('a', href=True):
        link.replace_with(Comment(unicode(link)))

在Python中，与使用range（）
的循环不同，只需直接在序列上循环；在上面的代码中，我直接在上循环。find_all（）
结果
BeautifulSoup元素可能会打印为HTML文本，但实际上它们不是字符串，而是字符串。不要试图将它们视为字符串。
没有.replace（）
方法。BeautifulSoup元素不是字符串；您想注释掉
元素还是完全删除它们？请注意，段落中的任何地方都没有div
标记，因此部分代码在任何情况下都没有可替换的内容。这很好，但有一些段落带有div即使对于那些它不工作的人，你确定它的find_都是因为它返回了一个错误。只有在更改为findall时，它才会进入循环。对于replacewith@user2707082：我在这里假设为BeautifulSoup 4；如果您仍在使用BeautifulSoup版本3，则这些方法被称为.findAll（）
和.replaceWith（）
，导入将是来自BeautifulSoup import Comment的。我再次测试了代码，它的工作方式与广告中的一样。而且它肯定是正确的，因此其他方面出现了问题。
from bs4 import Comment

for para in soup.find_all('p'):
    for div in para.find_all('div'):
        div.replace_with(Comment(unicode(div)))
    for link in para.find_all('a', href=True):
        link.replace_with(Comment(unicode(link)))