Java: no viable alternative at character '<EOF>' in my grammar
I'm trying to create a grammar for the SRT format. Here is an example of an SRT file:
1
00:00:02,218 --> 00:00:04,209
[SHELDON SPEAKING IN MANDARIN]

2
00:00:04,721 --> 00:00:05,745
No, it's:

3
00:00:05,922 --> 00:00:07,913
[SPEAKING IN MANDARIN]

4
00:00:09,392 --> 00:00:11,383
[SPEAKING IN MANDARIN]

5
00:00:13,430 --> 00:00:15,193
What's this?

6
00:00:16,266 --> 00:00:18,029
That's what you did.

7
00:00:18,201 --> 00:00:22,467
I assumed, as in a number of languages,
that the gesture was part of the phrase.

8
00:00:22,639 --> 00:00:25,233
- Well, it's not.
- Why am I supposed to know that?

9
00:00:25,408 --> 00:00:28,900
As teacher, it's your obligation
to separate your personal idiosyncrasies...

10
00:00:29,079 --> 00:00:30,512
...from the subject matter.

11
00:00:31,081 --> 00:00:33,845
- I'm glad you decided to learn Mandarin.
- Why?

326
00:18:56,818 --> 00:19:00,720
Actually, I've heard
far too much about Schrödinger's cat.

327
00:19:01,623 --> 00:19:03,022
Good.

328
00:19:09,131 --> 00:19:11,895
All right, the cat's alive.
Let's go to dinner.

329
00:19:12,000 --> 00:19:15,072
Download Movie Subtitles Searcher from www.OpenSubtitles.org
Here is my ANTLR grammar (v3.4):
My simple code:
String input = IOUtils.toString(Test.class.getResourceAsStream("/subtitles.srt"));
ExpLexer lexer = new ExpLexer(new ANTLRStringStream(input));
CommonTokenStream stream = new CommonTokenStream(lexer);
ExpParser parser = new ExpParser(stream);
parser.parse();
Almost everything works perfectly if the file ends with two newlines. If it doesn't, I get this error:
line 1484:0 no viable alternative at character '<EOF>'
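Any suggestions for making my grammar more flexible, so that it accepts one, two or more newlines at the end of the file?
The reason is that TEXT requires two newlines at the end. You could try removing one trailing NL from TEXT and using it as the separator between subtitles instead, like this:
parse
: SUBTITLE (NL SUBTITLE)*
;
By the way, can TEXT only have one or two lines?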
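As a stopgap until the grammar itself is more tolerant, the observation in the question (the parse succeeds when the file ends with two newlines) suggests normalizing the input before it reaches the lexer. The following is only a minimal sketch under that assumption, and it also assumes the grammar's newline rule accepts a bare '\n'. `SrtInputNormalizer` and `withTwoTrailingNewlines` are hypothetical names introduced here for illustration; they are not part of ANTLR.

```java
/** Hypothetical helper (illustrative only, not part of ANTLR). */
final class SrtInputNormalizer {

    private SrtInputNormalizer() {
    }

    /**
     * Returns the input with all trailing whitespace (spaces, tabs, '\r', '\n')
     * removed and exactly two '\n' characters appended.
     * Assumption: the grammar's newline rule accepts a bare '\n'.
     */
    static String withTwoTrailingNewlines(String input) {
        int end = input.length();
        // Walk backwards over any trailing whitespace, including line breaks.
        while (end > 0 && Character.isWhitespace(input.charAt(end - 1))) {
            end--;
        }
        return input.substring(0, end) + "\n\n";
    }
}
```

With the driver code from the question, the call would be wrapped around the input string: new ANTLRStringStream(SrtInputNormalizer.withTwoTrailingNewlines(input)). This hides the symptom rather than fixing the grammar, so it is probably best treated as a temporary measure.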
You are using too many lexer rules. Try something like this:
grammar T;
options {
output=AST;
}
tokens {
BLOCKS;
BLOCK;
TIME_RANGE;
LINES;
LINE;
WORD;
}
parse
: LineBreak* blocks LineBreak* EOF -> blocks
;
blocks
: block (LineBreak LineBreak+ block)* -> ^(BLOCKS block+)
;
block
: Number Spaces? LineBreak time_range LineBreak text_lines -> ^(BLOCK Number time_range text_lines)
;
time_range
: Time Spaces? Arrow Spaces? Time Spaces? -> ^(TIME_RANGE Time Time)
;
text_lines
: line (LineBreak line)* -> ^(LINES line+)
;
line
: Spaces? word (Spaces word)* Spaces? -> ^(LINE word+)
;
word
: (Other | Number | Dashes | Arrow)+ -> WORD[$text]
;
Time : Number ':' Number ':' Number ',' Number;
Arrow : '-->';
Dashes : '-'+;
Number : '0'..'9'+;
LineBreak : '\r'? '\n' | '\r';
Spaces : (' ' | '\t')+;
Other : . ;
This will parse the input:
1
00:00:02,218 --> 00:00:04,209
[A B C]

2
00:00:04,721 --> 00:00:05,745
-- Line 1
-- Line 2

3
00:00:05,922 --> 00:00:07,913
mu --> MU
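To cross-check what the grammar should accept, the same block structure (a number line, a time-range line, then one or more text lines, with blocks separated by two or more line breaks) can be mirrored in plain Java. This is a hand-written sketch using regular expressions, not ANTLR-generated code, and it is not the answer's method. `SrtBlockSketch` and `Block` are hypothetical names. Unlike the `parse` rule, it returns an empty list for empty input instead of reporting an error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hand-written illustration of the block structure; not ANTLR output. */
final class SrtBlockSketch {

    /** One subtitle block: its counter, start/end timestamps and text lines. */
    static final class Block {
        final int number;
        final String start;
        final String end;
        final List<String> lines;

        Block(int number, String start, String end, List<String> lines) {
            this.number = number;
            this.start = start;
            this.end = end;
            this.lines = lines;
        }
    }

    // Mirrors the Time and Arrow tokens: digits:digits:digits,digits --> (same again).
    private static final Pattern TIME_RANGE = Pattern.compile(
            "\\s*(\\d+:\\d+:\\d+,\\d+)\\s*-->\\s*(\\d+:\\d+:\\d+,\\d+)\\s*");

    static List<Block> parse(String input) {
        List<Block> blocks = new ArrayList<>();
        // Like the LineBreak token, treat "\r\n", "\n" and a lone "\r" alike.
        String text = input.replace("\r\n", "\n").replace('\r', '\n').trim();
        if (text.isEmpty()) {
            return blocks; // deviation from the grammar, which requires one block
        }
        // Two or more consecutive line breaks separate blocks. Because of the
        // trim() above, any number of trailing newlines is tolerated.
        for (String chunk : text.split("\n{2,}")) {
            String[] rows = chunk.split("\n");
            if (rows.length < 3) {
                throw new IllegalArgumentException("Incomplete block: " + chunk);
            }
            Matcher m = TIME_RANGE.matcher(rows[1]);
            if (!m.matches()) {
                throw new IllegalArgumentException("Bad time range: " + rows[1]);
            }
            List<String> lines = new ArrayList<>();
            for (int i = 2; i < rows.length; i++) {
                lines.add(rows[i].trim());
            }
            blocks.add(new Block(Integer.parseInt(rows[0].trim()),
                    m.group(1), m.group(2), lines));
        }
        return blocks;
    }
}
```

Because only the second row of each block is matched against the time-range pattern, text such as "mu --> MU" stays ordinary text here. The sketch makes no claim about how the ANTLR lexer tokenizes such lines.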
Hi Bart, thanks for your reply. I have some problems when the text contains digits and colons, for example "Season 1 Episode 15:" or "I'll call you at 11:00. Victoria." I tried to modify your example, but without success.
@Chris, no problem, check my edit.