How do I extract text from subtitles (in Python)?


I want to convert this:

1
00:00:01,710 --> 00:00:03,830
Now react came out in 2013.

2
00:00:03,840 --> 00:00:07,890
But what do we have before then before we act.

3
00:00:07,890 --> 00:00:15,040
Well the front fronting landscape was very different initially back in the 90s and early 2000s.

into something like this:

thisdict = {
  "1": "Now react came out in 2013.",
  "1time": '00:00:01,710 --> 00:00:03,830'
}

Can anyone help?

Do you mean something like this?

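# Assumes every entry is exactly four lines:
# number, timing, one line of text, blank separator.
with open('subtitle.srt') as file:
    subtitle = file.readlines()

# Group the lines into chunks of four, one chunk per subtitle entry.
sub_list = [subtitle[i : i+4] for i in range(0, len(subtitle), 4)]

this_dict = {}

for item in sub_list:
    number = item[0].strip('\n')                      # entry number, e.g. "1"
    this_dict[number] = item[2].strip('\n')           # subtitle text
    this_dict[f"{number}time"] = item[1].strip('\n')  # timing line

print(this_dict)
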
Output:

{'1': 'Now react came out in 2013.', '1time': '00:00:01,710 --> 00:00:03,830', '2': 'But what do we have before then before we act.', '2time': '00:00:03,840 --> 00:00:07,890', '3': 'Well the front fronting landscape was very different initially back in the 90s and early 2000s.', '3time': '00:00:07,890 --> 00:00:15,040'}
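
The chunk-of-four approach above only holds when every subtitle has exactly one line of text. If some entries wrap onto two or more lines, one possible variation is to split on blank lines instead. This is only a sketch: `parse_srt` is a hypothetical helper name, and joining wrapped lines with a space is an assumption about the output you want.

```python
import re

def parse_srt(text):
    """Return {"1": text, "1time": timing, ...} from SRT-formatted text."""
    result = {}
    # Entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip malformed entries (no text line)
        number = lines[0].strip()
        # Assumption: wrapped subtitle text is joined with a single space.
        result[number] = " ".join(line.strip() for line in lines[2:])
        result[f"{number}time"] = lines[1].strip()
    return result

sample = """1
00:00:01,710 --> 00:00:03,830
Now react came out in 2013.

2
00:00:03,840 --> 00:00:07,890
But what do we have before then
before we act.
"""

subtitles = parse_srt(sample)
print(subtitles["2"])
```

To use it on a real file, read the whole file into one string first, for example `parse_srt(open('subtitle.srt', encoding='utf-8-sig').read())`. The `utf-8-sig` encoding drops a leading byte-order mark if the file has one, which would otherwise end up inside the first key.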

What have you tried so far? This looks like a simple case of "read the file line by line and do something with each line."