Python: find HTML tags by class name and replace them with non-HTML markup (Python, BeautifulSoup, Hugo, Hugo shortcode)


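I want to replace all div tags with the class name "figure"

<div class="figure">
Some content
</div>

with a non-HTML tag (in my case a Hugo shortcode)

{{% row %}}
Some content
{{% /row %}}

Replacing one HTML tag with another is easy, but I don't know how to do it when non-HTML markup is involved.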

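If you use Notepad or any other text editor with a search-and-replace function, you can directly replace <div class="figure"> with '{{% row %}}' and </div> with '{{% /row %}}'.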

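I don't see a "simple" solution, because a shortcode contains characters such as { and %, so it cannot be represented as a tag in the document tree. One solution is to replace each <div class="figure"> with a custom tag and, at the end, replace these custom tags with your shortcode: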
from bs4 import BeautifulSoup

txt = '''
<div>
    <div class="figure">
        <p>Some content.</p>
    </div>
</div>

<div class="figure">
    <p>Some other content.</p>
</div>
'''

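# Parse the markup with Python's built-in HTML parser
soup = BeautifulSoup(txt, 'html.parser')

# Swap every <div class="figure"> for a placeholder <xxx-row> tag
for div in soup.select('div.figure'):
    t = soup.new_tag('xxx-row')
    t.contents = div.contents
    div.replace_with(t)

# Serialize the tree, then turn the placeholder tags into the shortcode
s = str(soup).replace('<xxx-row>', '{{% row %}}')
s = s.replace('</xxx-row>', '{{% /row %}}')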

print(s)


Prints:
<div>
{{% row %}}
<p>Some content.</p>
{{% /row %}}
</div>
{{% row %}}
<p>Some other content.</p>
{{% /row %}}
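
The placeholder-tag approach can also be wrapped in a small reusable function. The following is only a sketch, not part of the original answer: the function name divs_to_shortcode and its parameters (css_class, shortcode, placeholder) are hypothetical, and it renames each matching div in place instead of building a new tag. It assumes the placeholder name does not already occur in the document.

```python
from bs4 import BeautifulSoup


def divs_to_shortcode(html, css_class="figure", shortcode="row",
                      placeholder="xxx-row"):
    """Replace every <div class=css_class> with a paired Hugo shortcode.

    All names here are illustrative; `placeholder` must be a tag name
    that does not otherwise appear in `html`.
    """
    soup = BeautifulSoup(html, "html.parser")

    # select() returns a list, so renaming tags while looping is safe.
    for div in soup.select(f"div.{css_class}"):
        div.name = placeholder  # rename the tag in place
        div.attrs = {}          # drop attributes so it serializes as a bare tag

    out = str(soup)
    out = out.replace(f"<{placeholder}>", "{{% " + shortcode + " %}}")
    out = out.replace(f"</{placeholder}>", "{{% /" + shortcode + " %}}")
    return out


result = divs_to_shortcode('<div><div class="figure"><p>A</p></div></div>')
print(result)
```

Because the parser pairs each opening div with its own closing tag, only the closing tags that belong to a "figure" div become {{% /row %}}; the outer div is left alone.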


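A plain text search-and-replace does not work here, because there may be other closing </div> tags :) The page certainly contains more divs with different classes, so it is not possible: how would you tell the </div> of one class apart from the </div> of another?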
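
The objection to plain search-and-replace can be illustrated with a short sketch. The sample markup and the "wrapper" class name below are made up for the demonstration; the point is only that a text-level replace cannot know which closing </div> belongs to the "figure" div.

```python
# Hypothetical input: a "figure" div nested inside another div.
html = (
    '<div class="wrapper">'
    '<div class="figure"><p>Some content.</p></div>'
    '</div>'
)

# Naive text replacement, as a text editor would do it.
naive = html.replace('<div class="figure">', '{{% row %}}')
naive = naive.replace('</div>', '{{% /row %}}')

print(naive)
# The outer </div> was replaced as well, so the result has one opening
# shortcode but two closing ones, and the wrapper div is never closed.
```

This is why the answer parses the document first: the parser matches every closing tag to its opening tag before anything is replaced.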