Regex VBScript to remove duplicate numbers from a subtitle file by renumbering

I have a subtitle (.srt) file that looks like this:

2
00:04:22,504 --> 00:04:23,520
Hello?

3
00:04:27,860 --> 00:04:29,112
Hey wait!
Hello!

3
00:06:18,860 --> 00:06:21,112
Uhh!

3
00:06:29,860 --> 00:06:32,112
Ah!

4
00:07:19,232 --> 00:07:21,284
What are you doing here?

5
00:07:21,608 --> 00:07:22,708
Tell me!

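...

As you can see, the number 3 is repeated three times in this file. I want to fix that by renumbering the whole subtitle file (I think that is the only option, since this duplication occurs in several places in the file).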

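I wrote the script below to select the file and tried to replace the repeated numbers with newly generated (incrementing) numbers, but it doesn't work: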

Dim strFile, objFS

strFile = SelectFile( )
If strFile = "" Then
    WScript.Echo "No file selected."
End If


Function SelectFile( )
    Dim objExec, strMSHTA, wshShell

    SelectFile = ""

    strMSHTA = "mshta.exe ""about:" & "<" & "input type=file id=FILE>" _
             & "<" & "script>FILE.click();new ActiveXObject('Scripting.FileSystemObject')" _
             & ".GetStandardStream(1).WriteLine(FILE.value);close();resizeTo(0,0);" & "<" & "/script>"""

    Set wshShell = CreateObject( "WScript.Shell" )
    Set objExec = wshShell.Exec( strMSHTA )

    SelectFile = objExec.StdOut.ReadLine( )

    Set objExec = Nothing
    Set wshShell = Nothing
End Function

Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFile = objFS.OpenTextFile(strFile)
Set objFile2 = objFS.OpenTextFile(strFile, 8, True)
x = 0
Do Until objFile.AtEndOfStream
    strLine = objFile.ReadLine
    Set objRegEx = CreateObject("VBScript.RegExp")
    objRegEx.Global = True
    objRegEx.Pattern = "^\d+$"
    Set colMatches = objRegEx.Execute(strLine)
    If colMatches.Count > 0 Then
        x = x + 1
        strLine = x
        strNewLine = Replace(strLine,strLine,x)
        objFile2.WriteLine strLine
    End If
Loop

Can anyone help me figure out how to do this?

If you have a Unix box or a Unix VM, or can otherwise emulate a Unix environment that provides awk, this is a one-liner:

Command:

awk 'BEGIN{c=1} $0~/^[0-9]+$/ {print c++} $0~/[a-zA-Z,:\-!?]|^$/{print}' input_sub.txt > output_sub.txt

Tested on:

2
00:04:22,504 --> 00:04:23,520
Hello?

3
00:04:27,860 --> 00:04:29,112
Hey wait!
Hello!

3
00:06:18,860 --> 00:06:21,112
Uhh!

3
00:06:29,860 --> 00:06:32,112
Ah!

4
00:07:19,232 --> 00:07:21,284
What are you doing here?

5
00:07:21,608 --> 00:07:22,708
Tell me!

Output:

1
00:04:22,504 --> 00:04:23,520
Hello?

2
00:04:27,860 --> 00:04:29,112
Hey wait!
Hello!

3
00:06:18,860 --> 00:06:21,112
Uhh!

4
00:06:29,860 --> 00:06:32,112
Ah!

5
00:07:19,232 --> 00:07:21,284
What are you doing here?

6
00:07:21,608 --> 00:07:22,708
Tell me!

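
The one-liner above classifies each line by its content alone, so a subtitle text line that consists only of digits would be renumbered as well. A possible variation, sketched here under the assumption that the file uses LF line endings and that cues are separated by blank lines, tracks position instead: only a digits-only line at the start of the file or right after a blank line is treated as a cue index. The function name renumber_srt is made up for this illustration.

```shell
# Hypothetical helper: reads SRT text on stdin, writes the renumbered text to stdout.
# Assumes LF line endings and blank-line-separated cues.
renumber_srt() {
    awk '
        BEGIN { n = 0; cue_start = 1 }           # the first line of the file starts a cue
        /^$/  { cue_start = 1; print; next }     # a blank line ends the current cue
        cue_start && /^[0-9]+$/ {                # index line: replace it with the counter
            n++; print n; cue_start = 0; next
        }
        { cue_start = 0; print }                 # timestamps and text pass through unchanged
    '
}
```

It could be used the same way as the command above, for example: renumber_srt < input_sub.txt > output_sub.txt
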
In VBScript, use a regular expression replacement with a replacement function and a global counter:

f = "C:\path\to\your.srt"
n = 1  'global counter

Function Renumber(m, g1, g2, pos, src)
  Renumber = g1 & n & g2
  n = n + 1  'increment global counter after current value was used
End Function

Set re = New RegExp
re.Pattern = "(^|\r\n\r\n)\d+(\r\n)"
re.Global = True

Set fso = CreateObject("Scripting.FileSystemObject")
txt = fso.OpenTextFile(f).ReadAll
txt = re.Replace(txt, GetRef("Renumber"))
fso.OpenTextFile(f, 2).Write txt

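Can this be done with something other than VBScript? Since I have already written a VBScript, I would prefer a VBScript-based answer, but if you have a working solution, please add it in the comments or link to your script.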