How to use regular expressions in Python with \r\n


I have text like this:

1
00:00:01,860 --> 00:00:31,210
Affil of fifth at fat at all the social ball and said, with all this little in the

2
00:00:31,210 --> 00:01:03,060
mid limited and will cost a lot, for want of a lot of it is I never do this or below are the innocent of fat in the annual own none will bit less often were a little the earth the oven for the area of some of them some of the atom in the long will recall the law, will cost you the ball a little less of Odessa and coal rule the Vikings in at a loss

3
00:01:03,980 --> 00:01:33,150
of our lady of one of the will of the wall routing visiting little sign of the limited use of a lot of wind up with a loss of 14 and uncivil will find a site to lop off call them into solid, a London, can we stop go to work as a gay sailor kissing a lot of that scene of the law that on them in this case

4
00:01:33,950 --> 00:02:03,190
will almost a kind wilkinson's, and that a settlement, or the fog collared of the unknown, some would call and all of this was a little, some of us up a lot of letters, union would quit them or not will be or will lend money to zoning and will open the door to that of the novel opens in

5
00:02:04,240 --> 00:02:24,180
it and solidity can cut later with boats can die to only see not open only to six and 0:50 and world go back a at the fat of that at that
I only want to extract the sentences from the text, for example "Affil of fifth at fat at all the social ball and said, ... for want of a ..."

So the raw text looks like this:

  "1\r\n00:00:01,860 --> 00:00:31,210\r\nAffil of fifth at fat at all the social ball and said, with all this little in the\r\n\r\n2\r\n00:00:31,210 --> 00:01:03,060\r\nmid limited and will cost a lot, for want of a lot of it is I never do this or below are the innocent of fat in the annual own none will bit less often were a little the earth the oven for the area of some of them some of the atom in the long will recall the law, will cost you the ball a little less of Odessa and coal rule the Vikings in at a loss\r\n\r\n3\r\n00:01:03,980 --> 00:01:33,150\r\nof our lady of one of the will of the wall routing visiting little sign of the limited use of a lot of wind up with a loss of 14 and uncivil will find a site to lop off call them into solid, a London, can we stop go to work as a gay sailor kissing a lot of that scene of the law that on them in this case\r\n\r\n4\r\n00:01:33,950 --> 00:02:03,190\r\nwill almost a kind wilkinson's, and that a settlement, or the fog collared of the unknown, some would call and all of this was a little, some of us up a lot of letters, union would quit them or not will be or will lend money to zoning and will open the door to that of the novel opens in\r\n\r\n5\r\n00:02:04,240 --> 00:02:24,180\r\nit and solidity can cut later with boats can die to only see not open only to six and 0:50 and world go back a at the fat of that at that\r\n\r\n"

From inspecting the raw text, it looks like we could split the text on "\r\n", but I don't know how to write the regular expression.

Why not simply take every fourth line, starting from the third? Then you can join them with a space.

text = '''1
00:00:01,860 --> 00:00:31,210
Affil of fifth at fat at all the social ball and said, with all this little in the

2
00:00:31,210 --> 00:01:03,060
mid limited and will cost a lot, for want of a lot of it is I never do this or below are the innocent of fat in the annual own none will bit less often were a little the earth the oven for the area of some of them some of the atom in the long will recall the law, will cost you the ball a little less of Odessa and coal rule the Vikings in at a loss

3
00:01:03,980 --> 00:01:33,150
of our lady of one of the will of the wall routing visiting little sign of the limited use of a lot of wind up with a loss of 14 and uncivil will find a site to lop off call them into solid, a London, can we stop go to work as a gay sailor kissing a lot of that scene of the law that on them in this case

4
00:01:33,950 --> 00:02:03,190
will almost a kind wilkinson's, and that a settlement, or the fog collared of the unknown, some would call and all of this was a little, some of us up a lot of letters, union would quit them or not will be or will lend money to zoning and will open the door to that of the novel opens in

5
00:02:04,240 --> 00:02:24,180
it and solidity can cut later with boats can die to only see not open only to six and 0:50 and world go back a at the fat of that at that'''
t = ' '.join(text.splitlines()[2::4])
Result:

>>> import textwrap
>>> for line in textwrap.wrap(t, width=50):
...     print(line)
...
Affil of fifth at fat at all the social ball and
said, with all this little in the mid limited and
will cost a lot, for want of a lot of it is I
never do this or below are the innocent of fat in
the annual own none will bit less often were a
little the earth the oven for the area of some of
them some of the atom in the long will recall the
law, will cost you the ball a little less of
Odessa and coal rule the Vikings in at a loss of
our lady of one of the will of the wall routing
visiting little sign of the limited use of a lot
of wind up with a loss of 14 and uncivil will find
a site to lop off call them into solid, a London,
can we stop go to work as a gay sailor kissing a
lot of that scene of the law that on them in this
case will almost a kind wilkinson's, and that a
settlement, or the fog collared of the unknown,
some would call and all of this was a little, some
of us up a lot of letters, union would quit them
or not will be or will lend money to zoning and
will open the door to that of the novel opens in
it and solidity can cut later with boats can die
to only see not open only to six and 0:50 and
world go back a at the fat of that at that
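
The slice above relies on every cue occupying exactly four lines. As a rough sketch of a more defensive variant, the raw text can be split into blank-line-separated blocks instead, which also tolerates "\r\n" line endings and captions that span several lines. The function name caption_lines and the shortened sample string are made up for illustration; they do not come from the thread.

```python
import re

def caption_lines(srt_text):
    """Return the caption text of each cue in an SRT-style string."""
    # Normalize Windows (\r\n) and bare \r line endings to \n.
    normalized = srt_text.replace("\r\n", "\n").replace("\r", "\n")
    captions = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized.strip()):
        lines = block.split("\n")
        # lines[0] is the cue number and lines[1] the timestamp range;
        # anything after that is caption text (possibly several lines).
        if len(lines) >= 3:
            captions.append(" ".join(line.strip() for line in lines[2:]))
    return captions

# Shortened, hypothetical sample in the same shape as the raw text.
raw = (
    "1\r\n00:00:01,860 --> 00:00:31,210\r\nAffil of fifth at fat\r\n\r\n"
    "2\r\n00:00:31,210 --> 00:01:03,060\r\nmid limited and\r\nwill cost a lot\r\n\r\n"
)
print(caption_lines(raw))
```

Keeping the captions as a list leaves the choice of joining them, or not, to the caller.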
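text.split('\n').strip()? Actually, text.splitlines()[2::4] looks more like it.

@vks - What exactly makes you think this text was produced by a human? It is a subtitle file, almost guaranteed to have been generated automatically via OCR.

Can I ask you a question? How can I split the text sentence by sentence? The transcript has no real punctuation.

@dd90p Just don't do the join.

@dd90p - The original before ' '.join() has each line, but without that data there is no way to recover the actual sentence structure.

There is no need for a regular expression here at all.

@TigerhawkT3 Was it necessary to accompany that with a downvote? I find this expression more stable than yours.

This particular expression is also fragile, because it breaks if a relevant line starts with a digit, and it will remain a fragile hack for some time to come. In its current form it is so fragile that I would almost expect it to break... not loudly, but by silently dropping any relevant line that doesn't start with one of the hard-coded characters. It is entirely the wrong approach.

@TigerhawkT3 Sorry, but I will take it over your solution.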
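
To make the fragility concern in the comments concrete, here is a small sketch. The regular expression below is a hypothetical stand-in (the expression actually discussed is not reproduced on this page), and the sample text is invented so that the second caption begins with a digit.

```python
import re

# Invented sample: the second caption starts with a digit.
sample = (
    "1\n00:00:01,860 --> 00:00:31,210\nAffil of fifth at fat\n\n"
    "2\n00:00:31,210 --> 00:01:03,060\n14 and uncivil will find a site\n"
)

# Hypothetical pattern (an assumption, not the one from the thread):
# keep only the lines that start with an ASCII letter.
letter_lines = re.findall(r"^[A-Za-z].*$", sample, flags=re.MULTILINE)

# Position-based selection, as in the answer: every fourth line from the third.
sliced = sample.splitlines()[2::4]

print(letter_lines)  # the caption starting with "14" is silently missing
print(sliced)        # both captions are kept
```

Under these assumptions the pattern drops a caption without raising any error, while the slice keeps it, as long as each cue really is four lines long.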