Python: counting couplets in a list with \r\n line breaks


I want to count the couplets in a set of lyrics. Suppose the lyrics are:

I saw a little hermit crab
His coloring was oh so drab

It’s hard to see the butterfly
Because he flies across the sky
etc etc...

Once upon a time
She made a little rhyme
Of course, of course

Before we say again
The pain the pain
A horse, a horse

Lightening, thunder, all around
Soon the rain falls on the ground

I tire of writing poems and rhyme
They are stored in the database as a single string, delimited by u'\r\n', and after string.splitlines(True) the object stores them like this:

>>> lyrics[6].track_lyrics['lyrics']
[u'I saw a little hermit crab\r\n', u'His coloring was oh so drab\r\n', u'\r\n', u'It\u2019s hard to see the butterfly\r\n', u'Because he flies across the sky\r\n', u'\r\n',  u'\r\n', u'Before we say again\r\n', u'The pain the pain\r\n', u'A horse, a horse\r\n', u'\r\n', u'Lightening, thunder, all around\r\n', u'Soon the rain falls on the ground\r\n', u'\r\n', u'I tire of writing poems and rhyme\r\n']
>>> if len(lyric_string) > 1:
...     for k, v in enumerate(lyric_string):
...             if k == 0 and lyric_string[k+2] == "\r\n":
...                     print(v)
...             elif lyric_string[k-1] == "\r\n" and lyric_string[k+2] == "\r\n":
...                     print(v)
... 
I saw a little hermit crab

It’s hard to see the butterfly

Hear the honking of the goose

His red sports car is just a dream

The children like the ocean shore

I made the cookies one by one

My cat, she likes to chase a mouse,

Lightening, thunder, all around

Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
IndexError: list index out of range
I can get close with:

len([i for i in lyrics if i != "\r\n"]) / 2
But that also counts blocks of one, three, or more lines toward the couplet total.

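My reasoning goes like this: if there is a line break one position before and a line break two positions after, we are at the start of a couplet: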

>>> for k,v in enumerate(lyric_list):
...     if lyric_list[k+2] == "\r\n" and lyric_list[k-1] == "\r\n":
...             print(v)
... 
It’s hard to see the butterfly

Hear the honking of the goose


Lightening, thunder, all around
But, of course:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
IndexError: list index out of range
I've thought about doing something like this, which is uglier and doesn't work either! (It only gets the first and last lines):


But I bet there is a more eloquent, even Pythonic, way.

A simpler approach: join the whole array with "" and count the occurrences of blank lines (two consecutive line breaks).

>>> s = """Once upon a time
... She made a little rhyme
... Of course, of course
...
... Before we say again
... The pain the pain
... A horse, a horse
...
... Lightening, thunder, all around
... Soon the rain falls on the ground
...
... I tire of writing poems and rhyme"""
Then do:

>>> s.strip().count("\n\n") + 1
4
To get s in the code above, you need an extra join. For example:

s = "".join(lyrics[6].track_lyrics['lyrics'])

I am using \n on my system; you may have to use \r\n on yours.
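A minimal sketch of the join-and-count idea that works with either line-ending style, assuming the lines still carry their endings as in the question. The function name `count_blocks` and the shortened sample list are made up for illustration. It counts blank-line-separated blocks of any size, not couplets specifically, and it assumes exactly one blank line between blocks.

```python
def count_blocks(lines):
    """Count blank-line-separated blocks in a list of lines that keep their endings."""
    # Join the pieces back into one string, then normalise Windows line
    # endings so the same "\n\n" test works whichever style was stored.
    text = "".join(lines).replace("\r\n", "\n").strip()
    if not text:
        return 0
    # One blank line between blocks means one "\n\n" per gap.
    return text.count("\n\n") + 1


# Shortened, illustrative sample in the same shape as the stored lyrics.
sample = [
    "Once upon a time\r\n",
    "She made a little rhyme\r\n",
    "Of course, of course\r\n",
    "\r\n",
    "Lightening, thunder, all around\r\n",
    "Soon the rain falls on the ground\r\n",
    "\r\n",
    "I tire of writing poems and rhyme\r\n",
]
print(count_blocks(sample))
```

For this sample the function reports three blocks, although only one of them is a two-line couplet.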

I am assuming a couplet is a block of exactly two lines.

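You can do this by splitting the text into blocks and then counting the lines in each block. In this example, I count the line breaks inside a block (a couplet should have exactly 1).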
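To make the block idea concrete, here is a small sketch that reports how many lines each block contains and then counts the blocks of size two. The helper name `block_sizes` and the shortened sample text are assumptions for illustration, not part of the original answer.

```python
def block_sizes(text, sep="\r\n\r\n"):
    """Return the number of lines in each blank-line-separated block of text."""
    # strip() first, so a trailing line break is not attached to the last block.
    return [len(block.splitlines()) for block in text.strip().split(sep)]


# Shortened, illustrative sample using \r\n line endings.
text = (
    "I saw a little hermit crab\r\n"
    "His coloring was oh so drab\r\n"
    "\r\n"
    "Once upon a time\r\n"
    "She made a little rhyme\r\n"
    "Of course, of course\r\n"
    "\r\n"
    "I tire of writing poems and rhyme\r\n"
)
sizes = block_sizes(text)
couplets = sizes.count(2)  # a couplet is a block of exactly two lines
print(sizes, couplets)
```

Here the block sizes are 2, 3 and 1, so only the first block counts as a couplet.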

This also assumes there are exactly two line breaks between blocks. To handle two or more line breaks, you can use:

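import re

# Split on runs of two or more line breaks; strip() keeps a trailing '\r\n' out of the last block.
len([block for block in re.split(r'(?:\r\n){2,}', text.strip()) if block.count('\r\n') == 1])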
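A runnable sketch of the regex variant, under the assumption that blank lines are truly empty (no stray spaces). The names `BLANK_RUN` and `count_couplets` are made up here. The pattern is widened to `\r?\n` so the same code accepts both `\n` and `\r\n` endings, which goes slightly beyond the snippet above.

```python
import re

# Two or more consecutive line breaks, in either "\n" or "\r\n" style.
BLANK_RUN = re.compile(r"(?:\r?\n){2,}")


def count_couplets(text):
    """Count blocks of exactly two lines, allowing several blank lines between blocks."""
    # strip() removes leading/trailing line breaks so they cannot pad a block.
    blocks = BLANK_RUN.split(text.strip())
    return sum(1 for block in blocks if len(block.splitlines()) == 2)


# Illustrative input: two blank lines after the first couplet.
windows_text = "a\r\nb\r\n\r\n\r\nc\r\nd\r\ne\r\n\r\nf\r\ng\r\n"
unix_text = "a\nb\n\n\nc\n"
print(count_couplets(windows_text), count_couplets(unix_text))
```

Counting with `splitlines()` rather than counting `'\r\n'` occurrences keeps the check independent of the line-ending style.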

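Thanks for s = "".join(a_list), but I don't think s.strip().count("\n\n") + 1 works, because a couplet is exactly two lines together (I'll edit the question to specify this), so the couplet count in s is actually 1.

Updated the regex, based on http://stackoverflow.com/questions/27928630/find-2-or-more-newlines/27928748#27928748

The full example for the block-splitting approach: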
>>> text = """I saw a little hermit crab
... His coloring was oh so drab
... 
... It’s hard to see the butterfly
... Because he flies across the sky
... 
... etc etc...
... 
... Once upon a time
... She made a little rhyme
... Of course, of course
... 
... Before we say again
... The pain the pain
... A horse, a horse
... 
... Lightening, thunder, all around
... Soon the rain falls on the ground
... 
... I tire of writing poems and rhyme
... """.replace('\n', '\r\n')
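>>> # strip() first: otherwise the trailing '\r\n' makes the final one-line block look like a couplet
>>> len([block for block in text.strip().split('\r\n\r\n') if block.count('\r\n') == 1])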
3
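Another possible approach, sketched here as an alternative rather than taken from the answers, works directly on the list from the question without joining it and without any index arithmetic, so it cannot raise IndexError. It uses `itertools.groupby` to collect runs of consecutive non-blank lines. The function name `count_couplets_in_list` is hypothetical.

```python
from itertools import groupby


def count_couplets_in_list(lines):
    """Count runs of exactly two consecutive non-blank lines."""
    count = 0
    # Group neighbouring entries by whether they are blank once stripped.
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if not is_blank and len(list(group)) == 2:
            count += 1
    return count


# The list shown in the question (u'' prefixes dropped for Python 3).
lyric_list = [
    'I saw a little hermit crab\r\n', 'His coloring was oh so drab\r\n', '\r\n',
    'It\u2019s hard to see the butterfly\r\n', 'Because he flies across the sky\r\n',
    '\r\n', '\r\n',
    'Before we say again\r\n', 'The pain the pain\r\n', 'A horse, a horse\r\n', '\r\n',
    'Lightening, thunder, all around\r\n', 'Soon the rain falls on the ground\r\n', '\r\n',
    'I tire of writing poems and rhyme\r\n',
]
print(count_couplets_in_list(lyric_list))
```

Because consecutive blank entries fall into a single group, repeated blank lines between stanzas need no special handling.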