Python: I am trying to use BeautifulSoup to decompose the HTML part of my code


Here is the input HTML snippet:

<p>both before<img src="it is free"style="width:590px;height:228px;"> and after img tag</p>

But I get the following error, even though I am quite sure the tag has contents:

     raise ValueError("Tag.index: element not in tag")
     ValueError: Tag.index: element not in tag

Use 'html5lib' as the parser for BeautifulSoup, for example:

doc = BeautifulSoup(open(input_doc), 'html5lib')

Is there any way to get rid of this error?

Thanks in advance.

Here is a working version of the code:

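import copy
from bs4 import NavigableString

# doc is the BeautifulSoup object; p_tag is the <p> tag to split up
p_tag_contents = p_tag.contents
current = p_tag
for i in range(len(p_tag_contents)):
    # work on a copy so the original stays inside p_tag.contents
    item = copy.copy(p_tag_contents[i])
    if isinstance(item, NavigableString):
        # wrap bare text in its own <p>
        new_element = doc.new_tag('p')
        new_element.string = item
        current.insert_after(new_element)
        current = current.next_sibling
    else:
        # non-text children (e.g. <img>) are inserted as they are
        current.insert_after(item)
        current = current.next_sibling
# remove the original <p> together with its original children
p_tag.decompose()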
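
For reference, the loop above can be wrapped into a self-contained script. This is only a sketch under a few assumptions: the helper name split_paragraph is hypothetical, the built-in 'html.parser' is used instead of 'html5lib' so that nothing beyond bs4 has to be installed, the snippet is wrapped in a <div> for illustration, and a space is added between the two img attributes. It iterates over a snapshot of the contents rather than indexing by position, which is a small variation on the loop above.

```python
import copy

from bs4 import BeautifulSoup, NavigableString

# Assumption: lightly normalized version of the snippet from the question.
html = (
    '<div><p>both before'
    '<img src="it is free" style="width:590px;height:228px;">'
    ' and after img tag</p></div>'
)


def split_paragraph(doc, p_tag):
    """Hypothetical helper: replace p_tag with its children as siblings.

    Bare text is wrapped in a new <p>; other children are copied as-is.
    """
    current = p_tag
    # list(...) takes a snapshot, so the loop does not depend on the
    # live p_tag.contents list.
    for child in list(p_tag.contents):
        if isinstance(child, NavigableString):
            new_element = doc.new_tag('p')
            new_element.string = str(child)
        else:
            new_element = copy.copy(child)
        current.insert_after(new_element)
        current = new_element
    # Drop the original <p> together with its original children.
    p_tag.decompose()


doc = BeautifulSoup(html, 'html.parser')
split_paragraph(doc, doc.find('p'))
result = str(doc)
print(result)
```

After the call, the <div> should contain a <p>, the <img>, and a second <p>, in that order.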

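What we need to do is first collect all the contents of the given tag, then insert them after the given tag, and finally destroy the given tag. I use copy here because, without copying the item in each iteration, inserting a child elsewhere in the tree removes it from p_tag.contents, so

len(p_tag.contents)

shrinks while the loop is still running over the original range, which leads to an error.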
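
The effect described above can be observed in isolation. The following is a minimal sketch, assuming the built-in 'html.parser' and a made-up two-tag document; it only shows that inserting an existing child somewhere else moves it, so the parent's contents list gets shorter.

```python
from bs4 import BeautifulSoup

# Assumption: a tiny made-up document, parsed with the built-in parser.
doc = BeautifulSoup('<div><p>a<b>x</b>c</p></div>', 'html.parser')
p_tag = doc.find('p')

before = len(p_tag.contents)   # text, <b>, text
first = p_tag.contents[0]

# Inserting an element that already has a parent moves it, no copy is made.
p_tag.insert_after(first)

after = len(p_tag.contents)    # one child fewer than before
print(before, after, first.parent.name)
```

A loop built on range(before) would therefore run past the end of the shortened list, which is why the working version copies each item first.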
