How to count the number of start and end tags in HTML with Python

How can I count the number of start and end tags in HTML?

ya.html

<div class="side-article txt-article">
<p>
    <strong>
    </strong> 
    <a href="http://batam.tribunnews.com/tag/polres/" title="Polres">
    </a> 
    <a href="http://batam.tribunnews.com/tag/bintan/" title="Bintan">
    </a>
</p>
<p>
    <br>
</p>
<p>
    <a href="http://batam.tribunnews.com/tag/polres/" title="Polres">
    </a>
</p>
<p>
    <a href="http://batam.tribunnews.com/tag/polres/" title="Polres">
    </a> 
    <a href="http://batam.tribunnews.com/tag/bintan/" title="Bintan">
    </a>
</p>
<br>
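
Output

13
But this is not what I want: my code produces a single count, whereas I want to count the start tags and the end tags separately.

How can I count the number of start and end tags in HTML? The output would then be

23 

Thanks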

I suggest using an HTML parser for this:

from html.parser import HTMLParser  # Python 3; on Python 2 use: from HTMLParser import HTMLParser

number_of_starttags = 0
number_of_endtags = 0

# create a subclass and override the handler methods
class MyHTMLParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        # called once for every start tag, e.g. <p> or <br>
        global number_of_starttags
        number_of_starttags += 1

    def handle_endtag(self, tag):
        # called once for every end tag, e.g. </p>
        global number_of_endtags
        number_of_endtags += 1

# instantiate the parser and feed it some HTML
parser = MyHTMLParser()
parser.feed('<html><head><title>Test</title></head><body><h1>Parse me!</h1></body></html>')

print(number_of_starttags, number_of_endtags)  # prints: 5 5
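
If you would rather not use module-level globals, here is a minimal sketch of the same idea that keeps the counters on the parser instance. The names TagCounter and count_tags are made up for illustration, and the sample is a shortened, closed version of the ya.html fragment from the question, so the numbers are not the ones for the full file. It assumes Python 3.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Hypothetical helper: counts start and end tags on the instance."""

    def __init__(self):
        super().__init__()
        self.start_tags = 0
        self.end_tags = 0

    def handle_starttag(self, tag, attrs):
        # Fires for every start tag, including void elements such as <br>.
        self.start_tags += 1

    def handle_endtag(self, tag):
        # Fires for every end tag such as </p>.
        self.end_tags += 1


def count_tags(html):
    """Return (number of start tags, number of end tags) for an HTML string."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.start_tags, parser.end_tags


# Shortened sample based on the question's markup (assumption: div is closed).
sample = """
<div class="side-article txt-article">
<p>
    <strong>
    </strong>
    <a href="http://batam.tribunnews.com/tag/polres/" title="Polres">
    </a>
</p>
<p>
    <br>
</p>
</div>
"""

print(count_tags(sample))  # prints: (6, 5)
```

The two numbers differ because <br> has a start tag but no end tag. Each call to count_tags creates a fresh parser, so the counts do not accumulate between documents the way the global counters do.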

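This doesn't work for me; I get UnboundLocalError: local variable 'number_of_starttags' referenced before assignment.
Right, that is because of the class. Just declare the variables as global inside the methods and it works fine.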